Jenovesan opened a new issue, #42153: URL: https://github.com/apache/arrow/issues/42153
### Describe the usage question you have. Please include as many useful details as possible.

### Program Goal

Hello,

My program reads files sequentially. To speed it up, I want to preload the files asynchronously into a container so that they are already in memory when my program requests them.

### Solution?

I've been scouring the docs and code, and I think the best way to do this would be to have a `Dataset` containing the individual files as `RecordBatch`es, then use `Dataset::NewScan` to scan the whole dataset one `RecordBatch` at a time. As soon as a `RecordBatch` is read, I can store it in the container.

### Additional Information

Files are memory-mapped .feather files. In my dataset, there are thousands of files. Each file is either ~110KB or ~5KB in size.

### Conclusion

If someone could let me know whether this is the best way to achieve what I want, or guide me in a better direction, that would be great.

Any advice would be greatly appreciated,
Thanks

### Component(s)

C++
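
To make the preloading idea above concrete, here is a minimal sketch of a sliding-window prefetcher that uses only the C++ standard library. It is not an Arrow API and not necessarily the recommended approach: `Prefetcher`, `Loader`, and `window` are hypothetical names introduced for illustration. The loader callback is where the actual file read would go (for example, opening one Feather file and returning its contents as a table or record batches).

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <functional>
#include <future>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of Arrow): keeps up to `window` file loads
// in flight and hands results back in the original file order.
template <typename T>
class Prefetcher {
 public:
  // Assumption: the loader reads one file and returns its in-memory form.
  using Loader = std::function<T(const std::string&)>;

  Prefetcher(std::vector<std::string> paths, Loader load, std::size_t window)
      : paths_(std::move(paths)),
        load_(std::move(load)),
        window_(window == 0 ? 1 : window) {
    assert(load_ && "loader must be callable");
    Fill();
  }

  // Returns the next file's contents, or std::nullopt once every path has
  // been consumed. Blocks only if that particular load has not finished yet.
  std::optional<T> Next() {
    if (pending_.empty()) return std::nullopt;
    std::future<T> next = std::move(pending_.front());
    pending_.pop_front();
    Fill();             // start another load before waiting on this one
    return next.get();  // rethrows any exception thrown by the loader
  }

 private:
  // Launches loads until the window is full or no paths remain.
  void Fill() {
    while (pending_.size() < window_ && next_ < paths_.size()) {
      pending_.push_back(
          std::async(std::launch::async, load_, paths_[next_]));
      ++next_;
    }
  }

  std::vector<std::string> paths_;
  Loader load_;
  std::size_t window_;
  std::size_t next_ = 0;
  std::deque<std::future<T>> pending_;
};
```

The consumer calls `Next()` once per file, in order. With files as small as those described here, a window of a few dozen keeps memory use bounded while hiding most of the read latency. `std::async` may start one thread per pending load, so a thread pool could be a better fit for larger windows.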
