alamb commented on code in PR #8162:
URL: https://github.com/apache/arrow-rs/pull/8162#discussion_r2360358117


##########
parquet/src/arrow/arrow_writer/mod.rs:
##########
@@ -906,6 +918,12 @@ impl ArrowRowGroupWriterFactory {
         let writers = get_column_writers(&self.schema, &self.props, &self.arrow_schema)?;
         Ok(ArrowRowGroupWriter::new(writers, &self.arrow_schema))
     }
+
+    /// Create column writers for a new row group.
+    pub fn create_column_writers(&self, row_group_index: usize) -> Result<Vec<ArrowColumnWriter>> {

Review Comment:
   I am not sure making `ArrowRowGroupWriter` public gets us much of anything, and it would not allow per-column parallel encoding.

   One benefit of getting the column writers individually is that the columns can then be encoded in parallel. With `ArrowRowGroupWriter`, parallelism is only possible at the row-group level.

   I looked at `ArrowRowGroupWriter` a bit more, and the only substantial thing it does is call [compute_leaves](https://docs.rs/parquet/latest/parquet/arrow/arrow_writer/fn.compute_leaves.html) in a loop, and that function is already public.

   https://github.com/apache/arrow-rs/blob/bac36900826e411564231b89e3eb544ea9082cab/parquet/src/arrow/arrow_writer/mod.rs#L831-L835
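
   As a rough illustration of what per-column parallelism looks like, here is a toy sketch that uses only the standard library. `ToyColumnWriter` and `encode_row_group_in_parallel` are hypothetical stand-ins invented for this example. They are not the parquet crate's `ArrowColumnWriter` API, and the "encoding" is just a byte copy. The point is only the shape: once each column has its own writer, each one can be handed to a separate thread within a single row group.

```rust
use std::thread;

/// Hypothetical stand-in for a per-column encoder.
/// This is NOT the parquet crate's `ArrowColumnWriter`.
struct ToyColumnWriter {
    name: String,
    encoded: Vec<u8>,
}

impl ToyColumnWriter {
    fn new(name: &str) -> Self {
        Self { name: name.to_string(), encoded: Vec::new() }
    }

    /// "Encode" the values by appending their little-endian bytes.
    fn write(&mut self, values: &[i32]) {
        for v in values {
            self.encoded.extend_from_slice(&v.to_le_bytes());
        }
    }

    /// Finish the column and hand back its name and encoded bytes.
    fn close(self) -> (String, Vec<u8>) {
        (self.name, self.encoded)
    }
}

/// Encode every column of one row group on its own scoped thread.
fn encode_row_group_in_parallel(
    writers: Vec<ToyColumnWriter>,
    columns: &[Vec<i32>],
) -> Vec<(String, Vec<u8>)> {
    assert_eq!(writers.len(), columns.len());
    thread::scope(|s| {
        let handles: Vec<_> = writers
            .into_iter()
            .zip(columns)
            .map(|(mut writer, values)| {
                s.spawn(move || {
                    writer.write(values);
                    writer.close()
                })
            })
            .collect();
        // Join in column order so the result matches the schema order.
        handles.into_iter().map(|h| h.join().unwrap()).collect()
    })
}

fn main() {
    let writers = vec![ToyColumnWriter::new("a"), ToyColumnWriter::new("b")];
    let columns = vec![vec![1, 2, 3], vec![4, 5]];
    for (name, bytes) in encode_row_group_in_parallel(writers, &columns) {
        println!("column {name}: {} bytes", bytes.len());
    }
}
```

   A wrapper that owns all of the column writers for a row group and only exposes a whole-batch write cannot be split up this way, which is why returning the individual writers is the more flexible choice.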