houqp opened a new issue #1657:
URL: https://github.com/apache/arrow-datafusion/issues/1657


   **Describe the bug**
   
   First reported by @ic4y at https://github.com/apache/arrow-datafusion/pull/1556#issuecomment-1012809108.
   
   This is also causing the TPC-H q7 benchmark to fail with an OOM error, as reported in 
https://github.com/apache/arrow-datafusion/issues/1652#issuecomment-1019622028.
   
   **To Reproduce**
   
   Compare peak memory usage between 
https://github.com/apache/arrow-datafusion/commit/2008b1dc06d5030f572634c7f8f2ba48562fa636
 and 
https://github.com/apache/arrow-datafusion/commit/c0c9c7231f9c5685fda5fc9294fdc1711384b6fb
 when processing a Parquet table.
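
   One way to capture the peak memory figure for such a comparison on Linux is to read the `VmHWM` (peak resident set size) field of `/proc/self/status` at the end of the run. The sketch below is only an illustration: `parse_vm_hwm_kb` and `peak_rss_kb` are hypothetical helper names, not part of DataFusion or arrow2, and the approach assumes a Linux `/proc` filesystem.

```rust
use std::fs;

/// Extracts the `VmHWM` (peak resident set size) value, in kB, from the
/// contents of a Linux `/proc/<pid>/status` file.
/// Hypothetical helper for illustration only.
fn parse_vm_hwm_kb(status: &str) -> Option<u64> {
    status
        .lines()
        // Keep only the text after the "VmHWM:" label, if this line has it.
        .find_map(|line| line.strip_prefix("VmHWM:"))
        // The first whitespace-separated token is the number; "kB" follows.
        .and_then(|rest| rest.split_whitespace().next())
        .and_then(|kb| kb.parse().ok())
}

/// Peak RSS of the current process in kB, or `None` when
/// `/proc/self/status` cannot be read (for example on non-Linux systems).
fn peak_rss_kb() -> Option<u64> {
    let status = fs::read_to_string("/proc/self/status").ok()?;
    parse_vm_hwm_kb(&status)
}
```

   Calling `peak_rss_kb()` after the query finishes, once in a build at each commit, gives two numbers that can be compared directly.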
   
   **Expected behavior**
   
   Memory usage should be on par with arrow-rs. Alternatively, arrow2 should 
provide an option that lets users trade off memory usage against array 
segmentation.
   
   **Additional context**
   
   Related upstream issue: https://github.com/jorgecarleitao/arrow2/issues/768
   


