tustvold commented on code in PR #9077:
URL: https://github.com/apache/arrow-rs/pull/9077#discussion_r2662937623


##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -819,7 +821,15 @@ impl ArrowColumnWriter {
     pub fn write(&mut self, col: &ArrowLeafColumn) -> Result<()> {
         match &mut self.writer {
             ArrowColumnWriterImpl::Column(c) => {
-                write_leaf(c, &col.0)?;
+                let leaf = col.0.array();
+                match leaf.as_any_dictionary_opt() {
+                    Some(dictionary) => {
+                        let materialized =
+                            arrow_select::take::take(dictionary.values(), 
dictionary.keys(), None)?;

Review Comment:
   That is correct, there is likely a fair amount of low-hanging fruit to 
optimising the parquet writer
   
   Edit: Although it does appear there is something to handle dictionary array 
of bytearray, let me check I didn't break that



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to