tustvold commented on code in PR #9077:
URL: https://github.com/apache/arrow-rs/pull/9077#discussion_r2662980779


##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -819,7 +821,15 @@ impl ArrowColumnWriter {
     pub fn write(&mut self, col: &ArrowLeafColumn) -> Result<()> {
         match &mut self.writer {
             ArrowColumnWriterImpl::Column(c) => {
-                write_leaf(c, &col.0)?;
+                let leaf = col.0.array();
+                match leaf.as_any_dictionary_opt() {
+                    Some(dictionary) => {
+                        let materialized =
+                            arrow_select::take::take(dictionary.values(), dictionary.keys(), None)?;

Review Comment:
   There is a separate specialized encoder for ByteArray and for DictionaryArray of ByteArray: https://github.com/apache/arrow-rs/pull/2221
   
   So this will only materialize dictionaries for primitives, which tbh is probably the most efficient thing to do.
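
   For readers unfamiliar with what "materialize" means here, a minimal std-only sketch follows. It models a dictionary-encoded primitive column and expands it into a plain column, which is the idea behind calling `take` with the dictionary values and keys. `DictColumn` and `materialize` are hypothetical names for illustration, not arrow-rs types or functions, and this is not how arrow-rs implements `take`.

```rust
/// Hypothetical std-only model of a dictionary-encoded primitive column.
/// `keys[i]` is `None` for a null slot, otherwise an index into `values`.
struct DictColumn {
    keys: Vec<Option<usize>>,
    values: Vec<i32>,
}

/// Expand the dictionary into a plain column with one entry per key.
/// Null keys stay null; an out-of-range key is reported as an error.
fn materialize(col: &DictColumn) -> Result<Vec<Option<i32>>, String> {
    col.keys
        .iter()
        .map(|key| match key {
            // A null key produces a null output slot.
            None => Ok(None),
            // A valid key copies the referenced dictionary value.
            Some(k) => col
                .values
                .get(*k)
                .copied()
                .map(Some)
                .ok_or_else(|| format!("key {k} out of bounds for {} values", col.values.len())),
        })
        .collect()
}
```

   With keys `[1, null, 0, 1]` and values `[10, 20]`, the materialized column is `[20, null, 10, 20]`. For fixed-width primitives this expansion is a simple indexed copy, which fits the point above that materializing is probably the cheapest option for them.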



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
