emkornfield commented on issue #32723: URL: https://github.com/apache/arrow/issues/32723#issuecomment-1565646113
1. Yes, I think so.
2. I think [ArrowReaderProperties](https://github.com/apache/arrow/blob/130f9e981aa98c25de5f5bfe55185db270cec313/cpp/src/parquet/properties.h#L778) is probably where this belongs. For per-column settings you can probably find inspiration from ParquetProperties (a global setting might be fine for an initial implementation).
3. IIRC it's not really a memory limit so much as a limitation of the underlying address space of the Binary/String arrays, whose 32-bit offsets allow for at most 2GB of data in a row group. I don't recall the code well enough to know whether there are other edge cases that you might encounter, but I think this would solve most issues.
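
To make the 2GB figure in point 3 concrete: Binary/String arrays address their value bytes with signed 32-bit offsets, so a single array can hold at most 2^31 - 1 bytes. The sketch below shows that arithmetic, plus one possible shape for a setting with a global default and per-column overrides as floated in point 2. The names `kMaxBinaryArrayBytes`, `FitsIn32BitOffsets`, and `LargeOffsetSettingsSketch` are hypothetical and not part of Arrow or Parquet. It also assumes the setting is a simple boolean, which the comment does not state.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <string>
#include <unordered_map>

// Binary/String arrays index their value buffer with signed 32-bit offsets,
// so one array can address at most 2^31 - 1 bytes of value data.
constexpr int64_t kMaxBinaryArrayBytes = std::numeric_limits<int32_t>::max();

// Returns true when `total_value_bytes` can be addressed by 32-bit offsets.
// (Hypothetical helper, for illustration only.)
inline bool FitsIn32BitOffsets(int64_t total_value_bytes) {
  assert(total_value_bytes >= 0);
  return total_value_bytes <= kMaxBinaryArrayBytes;
}

// Hypothetical settings holder, NOT an Arrow class: one global default plus
// optional per-column overrides keyed by column path.
struct LargeOffsetSettingsSketch {
  bool global_default = false;
  std::unordered_map<std::string, bool> per_column;

  // A per-column override wins; otherwise fall back to the global default.
  bool UseLargeOffsets(const std::string& column_path) const {
    auto it = per_column.find(column_path);
    return it != per_column.end() ? it->second : global_default;
  }
};
```

A global-only version would simply drop the `per_column` map, which matches the suggestion that a global setting may be enough for a first implementation.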
