Sep Dehpour created ARROW-9526:
----------------------------------

             Summary: [Python] Memory-mapped Arrow file to Parquet conversion 
loads everything into memory
                 Key: ARROW-9526
                 URL: https://issues.apache.org/jira/browse/ARROW-9526
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Sep Dehpour


When converting a memory-mapped Arrow file into a Parquet file, the whole 
table is loaded into RAM. This effectively negates the point of memory mapping.

If this is not a bug, perhaps there is a proper way of converting the 
memory-mapped Arrow file to Parquet without using excessive memory?


Example code:
{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

source = pa.memory_map(path_to_arrow_file, 'r')
table = pa.ipc.RecordBatchFileReader(source).read_all()
# The following line will load the whole thing into RAM
pq.write_table(table, path_to_parquet){code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
