Hello Kohei,

You can create an arrow::io::BufferReader to wrap your in-memory buffer:
https://arrow.apache.org/docs/cpp/api/io.html#in-memory-streams

and then pass it to parquet::arrow::FileReaderBuilder:
https://arrow.apache.org/docs/cpp/api/formats.html#_CPPv4N7parquet5arrow17FileReaderBuilderE

(BufferReader subclasses RandomAccessFile, which is the type that FileReaderBuilder::Open() accepts.)

Regards

Antoine.

On 19/06/2023 at 14:57, Kohei Yoshida wrote:
Hello there,

I would like to get some guidance on how to load Parquet files from in-memory buffers. I have already managed to load from files by following this tutorial:
https://arrow.apache.org/docs/cpp/parquet.html

I also spent some time looking through the Arrow API for a way to load from in-memory buffers, but so far no luck. Is there a way to achieve this using the existing Arrow API? Any help or guidance would be appreciated.

A little background on why I'm doing this: I'm currently implementing an import filter for the Parquet file format for LibreOffice Calc, and I'm doing so via the orcus library [1], which specializes in providing spreadsheet-related file format filters as an external library. The orcus API supports loading both from files and from in-memory buffers for every file format it handles. LibreOffice itself uses orcus's in-memory buffer API because of the way its file loading mechanism works.

Currently, I'm saving the incoming buffer to a temporary file and loading from that, but that's far from ideal...

Thanks,

Kohei Yoshida

[1] https://gitlab.com/orcus/orcus