amol- commented on a change in pull request #10266:
URL: https://github.com/apache/arrow/pull/10266#discussion_r674832346
##########
File path: docs/source/python/ipc.rst
##########
@@ -154,6 +154,73 @@ DataFrame output:
df = pa.ipc.open_file(buf).read_pandas()
df[:5]
+Efficiently Writing and Reading Arrow Arrays
+--------------------------------------------
+
+Being optimized for zero-copy reads and memory-mapped data, Arrow makes it
+easy to read and write arrays while consuming a minimal amount of resident
+memory.
+
+When writing and reading raw Arrow data, we can use the Arrow File Format
+or the Arrow Streaming Format.
+
+To dump an array to a file, you can use :meth:`~pyarrow.ipc.new_file`,
+which provides a new :class:`~pyarrow.ipc.RecordBatchFileWriter` instance
+that can be used to write batches of data to that file.
+
+For example, to write an array of 10M integers, we could write it in 1000
+chunks of 10000 entries each:
+
+.. ipython:: python
Review comment:
I'm not fond of the `ipython` directive either, but we have a dedicated Jira
issue ( https://issues.apache.org/jira/browse/ARROW-13159 ); for now I adhered
to what seemed to be the practice in the rest of that file.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]