alamb commented on issue #47: URL: https://github.com/apache/arrow-rs/issues/47#issuecomment-1242708749
Hi @jinyius

@tustvold has done some non-trivial work to support reading parquet files without having to clone file handles. For example, https://docs.rs/parquet/22.0.0/parquet/file/serialized_reader/struct.SerializedFileReader.html now takes a `ChunkReader`, which is implemented for `Bytes`: https://docs.rs/parquet/22.0.0/parquet/file/reader/trait.ChunkReader.html

Thus, in order to read such a file, you can buffer it into `Bytes`: https://docs.rs/parquet/22.0.0/parquet/file/reader/trait.ChunkReader.html#impl-ChunkReader-for-Bytes

Perhaps with something like this (untested):

```rust
use std::{fs::File, io::Read};
use bytes::Bytes;
use parquet::file::serialized_reader::SerializedFileReader;

let mut v = vec![];
// the handle must be `mut` because `read_to_end` takes `&mut self`
let mut parquet_file: File = open_your_parquet_file();
// read parquet into memory (TODO error checking)
parquet_file.read_to_end(&mut v).unwrap();
// convert to Bytes so we can read the file
let b: Bytes = v.into();
let reader = SerializedFileReader::new(b).unwrap();
```

> any update here as it's been a year? i can provide some test parquet files that triggers this issue if that helps.

If you could provide an example file and the code you are using that shows the error, I would be happy to help try to apply the method above. If it works for you, I think we should update the documentation to explain this.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
