westonpace commented on issue #33759: URL: https://github.com/apache/arrow/issues/33759#issuecomment-1416354105
> I thought it was likely related, since both issues occur when using `to_batches()` on small data; the difference is that I am reading directly from a mounted disk while the OP is reading over the network. If the scanner is the cause, as some comments have suggested, a fix would resolve both of our issues.

The OP's issue has been identified, they have found a workaround (don't store the full metadata in each file), and we have identified a long-term fix (#33888). That problem and its fix have nothing to do with #33624. In #33624 the total data transferred is larger than the on-disk size of the data, which would not be caused by Arrow retaining metadata in RAM.
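For illustration, here is a minimal PyArrow sketch of that kind of workaround, assuming the metadata in question is schema-level key-value metadata duplicated in each file's footer; the table and output path are placeholders, not taken from the issue:

```python
import pyarrow as pa
import pyarrow.parquet as pq

# Placeholder table standing in for one partition of the dataset.
table = pa.table({"x": list(range(1000))})

# Drop the schema-level key-value metadata so it is not duplicated
# in every file's footer, where the scanner would otherwise read
# (and potentially retain) it for each fragment it opens.
stripped = table.replace_schema_metadata(None)
pq.write_table(stripped, "part-0.parquet")
```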
