piyushdubey commented on issue #40258:
URL: https://github.com/apache/arrow/issues/40258#issuecomment-2016886238

> As an alternative to Parquet.Net, you could use ParquetSharp, which wraps the Arrow C++ Parquet library and has built-in support for reading Parquet files as Arrow record batches: https://github.com/G-Research/ParquetSharp/blob/master/docs/Arrow.md
>
> Disclaimer: I'm a maintainer of ParquetSharp
>
> But otherwise your algorithm seems sensible. I'd only suggest you might want one RecordBatch per row group rather than per file, in case your files contain many row groups and could be large.

Thanks @adamreeve - It looks like I will eventually need to move to ParquetSharp entirely. Converting to Arrow with other Parquet parsers is pretty tedious.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
