Hello, I am using Arrow to store data on disk temporarily, so disk space is not a concern (I understand that Parquet is preferable when efficient disk storage matters). Given this use case, it seems that Arrow's memory-mapping/zero-copy capabilities should provide better read performance.
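For context, here is roughly my current write path. This is a minimal sketch: the file paths and table contents are placeholders, and it writes the same table both ways so I can compare the outputs.

    import pyarrow as pa
    import pyarrow.feather as feather

    # Placeholder table standing in for the real data.
    table = pa.table({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})

    # Option A: the Feather v2 convenience wrapper, compression disabled.
    feather.write_feather(table, "/tmp/data.feather", compression="uncompressed")

    # Option B: the generic Arrow IPC file writer (RecordBatchFileWriter).
    with pa.ipc.new_file("/tmp/data.arrow", table.schema) as writer:
        writer.write_table(table)

Comparing the two output files is what led me to believe they are identical.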
Here are my questions:

1. For new applications, should we prefer the pa.ipc.new_file interface over pa.feather.write_feather? My understanding from reading [0] is that write_feather is provided mainly for backward compatibility, and with compression disabled it seems to produce files of the same size as the RecordBatchFileWriter (the files appear to be identical; see the sketch above).

2. Does compression affect the need to make copies? I assume a compressed file can no longer be read zero-copy, since its buffers have to be decompressed into newly allocated memory.

3. When analyzing the data with pandas, is there a way to load it via memory mapping, and if so, would that be expected to improve deserialization performance and memory utilization when multiple processes read the same table data simultaneously? (The read path I have in mind is sketched after my signature.) Assume I'm running on a modern server-class SSD.

Thank you!
Jonathan

[0] https://arrow.apache.org/faq/#what-about-the-feather-file-format
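P.S. For question 3, this is the kind of read path I have in mind; again just a sketch, with the file name a placeholder. My (possibly wrong) understanding is that read_all() over an uncompressed memory-mapped IPC file can reference the mapping rather than copying, while to_pandas() may still allocate.

    import pyarrow as pa

    # Map the IPC file instead of reading it into process memory;
    # the resulting Arrow buffers can reference the mapping directly.
    source = pa.memory_map("/tmp/data.arrow", "r")
    table = pa.ipc.open_file(source).read_all()

    # Conversion to pandas may still copy, since pandas generally
    # wants its own mutable, NumPy-backed memory.
    df = table.to_pandas()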