andygrove commented on PR #1022:
URL: https://github.com/apache/datafusion-comet/pull/1022#issuecomment-2422750238
I am testing with TPC-H sf=100. I usually test with one executor and 8
cores, but with this PR I can only run with a single core. I tried running
with 2 cores using this config:
```
--conf spark.executor.instances=1 \
--conf spark.executor.memory=16G \
--conf spark.executor.cores=2 \
--conf spark.cores.max=2 \
--conf spark.memory.offHeap.enabled=true \
--conf spark.memory.offHeap.size=20g \
```
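For context, a minimal sketch of the full `spark-submit` invocation these flags might belong to; only the `--conf` flags above come from the comment, while the master URL, Comet jar variable, plugin wiring, and benchmark class/jar are assumed placeholders based on the Comet README:
```
# Hypothetical invocation; only the --conf flags repeated from above are
# from the original comment. $COMET_JAR, the master URL, and the
# benchmark class/jar are placeholders.
$SPARK_HOME/bin/spark-submit \
  --master spark://master:7077 \
  --jars $COMET_JAR \
  --conf spark.plugins=org.apache.spark.CometPlugin \
  --conf spark.executor.instances=1 \
  --conf spark.executor.memory=16G \
  --conf spark.executor.cores=2 \
  --conf spark.cores.max=2 \
  --conf spark.memory.offHeap.enabled=true \
  --conf spark.memory.offHeap.size=20g \
  --class org.example.TpchBenchmark \
  tpch-benchmark.jar
```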
The job fails with:
```
org.apache.spark.SparkException: Job aborted due to stage failure:
Task 0 in stage 251.0 failed 4 times, most recent failure:
Lost task 0.3 in stage 251.0 (TID 2171) (10.0.0.118 executor 0):
org.apache.comet.CometNativeException: External error:
Internal error: Partition is still not able to allocate enough memory for the array builders after spilling..
```
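Rough per-task arithmetic, assuming these allocations are bounded by Spark's unified memory manager, which gives each of N concurrently running tasks between 1/(2N) and 1/N of the off-heap pool (whether this PR's native buffers are capped that way is an assumption):
```
# Per-task off-heap budget for spark.memory.offHeap.size=20g:
#   1 core  (N = 1): 20g/2 .. 20g/1  ->  10g .. 20g per task
#   2 cores (N = 2): 20g/4 .. 20g/2  ->   5g .. 10g per task
```
That halving of the per-task budget would be consistent with the 2-core run failing to allocate the array builders even after spilling while the single-core run succeeds.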