Sep Dehpour created ARROW-9526:
----------------------------------
Summary: [Python] Memory-mapped Arrow file to Parquet conversion
loads everything into memory
Key: ARROW-9526
URL: https://issues.apache.org/jira/browse/ARROW-9526
Project: Apache Arrow
Issue Type: Bug
Reporter: Sep Dehpour
When converting a memory-mapped Arrow file into a Parquet file, the whole
table is loaded into RAM. This effectively negates the point of memory
mapping. If this is not a bug, is there a proper way to convert a
memory-mapped Arrow file to Parquet without using excessive memory?
Example code:
{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

source = pa.memory_map(path_to_arrow_file, 'r')
table = pa.ipc.RecordBatchFileReader(source).read_all()
# The following line will load the whole thing into RAM
pq.write_table(table, path_to_parquet){code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)