[ https://issues.apache.org/jira/browse/ARROW-12650?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Alessandro Molina updated ARROW-12650:
--------------------------------------
Component/s: Documentation
> [Python] Improve documentation regarding dealing with memory mapped files
> -------------------------------------------------------------------------
>
> Key: ARROW-12650
> URL: https://issues.apache.org/jira/browse/ARROW-12650
> Project: Apache Arrow
> Issue Type: Improvement
> Components: Documentation
> Reporter: Alessandro Molina
> Priority: Minor
>
> While one of Arrow's promises is that it makes it easy to read/write data
> bigger than memory, it's not immediately obvious from the pyarrow
> documentation how to deal with memory mapped files.
> We hint that you can open files as memory mapped (
> [https://arrow.apache.org/docs/python/memory.html?highlight=memory_map#on-disk-and-memory-mapped-files]
> ) but then we don't explain how to read/write Arrow Arrays or Tables from
> there.
> While most high-level functions to read/write formats (Parquet, Feather, ...)
> have an easy-to-guess {{memory_map=True}} option, we don't have any example
> of how that is meant to work for the Arrow format itself, for example how to
> do it using {{RecordBatchFile*}}.
> An addition to the memory mapping section with a more meaningful
> example, one that reads/writes actual Arrow data (instead of plain bytes),
> would probably be more helpful.