[
https://issues.apache.org/jira/browse/ARROW-9526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sep Dehpour updated ARROW-9526:
-------------------------------
Summary: [Python] Memory-mapped Arrow file to Parquet conversion loads
everything into RAM (was: [Python] Memorymapped arrow file to parquet
conversions loaded everything into memory)
> [Python] Memory-mapped Arrow file to Parquet conversion loads everything into
> RAM
> ---------------------------------------------------------------------------------
>
> Key: ARROW-9526
> URL: https://issues.apache.org/jira/browse/ARROW-9526
> Project: Apache Arrow
> Issue Type: Bug
> Reporter: Sep Dehpour
> Priority: Minor
>
> When converting a memory-mapped Arrow file to a Parquet file, the whole
> table is loaded into RAM, which effectively defeats the purpose of memory
> mapping. If this is not a bug, is there a proper way to convert a
> memory-mapped Arrow file to Parquet without using excessive memory?
>
> Example code:
> {code:java}
> import pyarrow as pa
> import pyarrow.parquet as pq
>
> source = pa.memory_map(path_to_arrow_file, 'r')
> table = pa.ipc.RecordBatchFileReader(source).read_all()
> # The following line will load the whole thing into RAM
> pq.write_table(table, path_to_parquet){code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)