jorisvandenbossche commented on a change in pull request #10266:
URL: https://github.com/apache/arrow/pull/10266#discussion_r657791375



##########
File path: docs/source/python/memory.rst
##########
@@ -277,6 +277,95 @@ types than with normal Python file objects.
    !rm example.dat
    !rm example2.dat
 
+Efficiently Writing and Reading Arrow Arrays
+--------------------------------------------
+
+Being optimized for zero copy and memory mapped data, Arrow allows to easily
+read and write arrays consuming the minimum amount of resident memory.
+
+When writing and reading raw arrow data, we can use the Arrow File Format
+or the Arrow Streaming Format.
+
+To dump an array to file, you can use the :meth:`~pyarrow.ipc.new_file`
+which will provide a new :class:`~pyarrow.ipc.RecordBatchFileWriter` instance
+that can be used to write batches of data to that file.
+
+For example to write an array of 100M integers, we could write it in 1000 chunks
+of 100000 entries:
+
+.. ipython:: python

Review comment:
       Personally, I would prefer to have *some* way to still verify the example, but this doesn't need to be with the IPython directive (which actually only ensures the code runs without error, not that the output is correct). This has come up before as well, so I opened a separate JIRA to discuss this in general: https://issues.apache.org/jira/browse/ARROW-13159
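
       To make the distinction between "runs without error" and "output is correct" concrete, here is a minimal sketch using the standard library's `doctest` module. This is only one possible way to verify examples and is not necessarily what ARROW-13159 will settle on. The `GOOD` and `BAD` snippets and the `check` helper are hypothetical names made up for this illustration; the snippets reuse the 1000 x 100000 chunk arithmetic from the text under review.

```python
import doctest

# Hypothetical documentation snippet whose shown output is correct.
GOOD = """
>>> n_chunks, chunk_size = 1000, 100000
>>> n_chunks * chunk_size
100000000
"""

# Same code, but the documented output is wrong (a zero is missing).
# The code still runs without error, so a run-only check would pass.
BAD = """
>>> n_chunks, chunk_size = 1000, 100000
>>> n_chunks * chunk_size
1000000
"""


def check(snippet):
    """Run one doctest-style snippet and return (failed, attempted)."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(snippet, {}, "snippet", None, 0)
    runner = doctest.DocTestRunner(verbose=False)
    # Discard the failure report; only the counts matter here.
    return runner.run(test, out=lambda text: None)


good = check(GOOD)
bad = check(BAD)
print(good.failed, good.attempted)
print(bad.failed, bad.attempted)
```

       Both snippets execute cleanly, but only the output comparison catches the incorrect value in `BAD`.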



