FelixYBW commented on issue #4392:
URL: 
https://github.com/apache/incubator-gluten/issues/4392#issuecomment-3392636857

   > IIRC, Spark's sort-based shuffle is a heavy operator that stays on-heap
when off-heap is enabled. I would be glad to do some path-finding to see if we
can somehow fix this. [@FelixYBW](https://github.com/FelixYBW), could you
also help locate where the remaining on-heap consumption comes from? E.g., you
could try setting `spark.shuffle.sort.bypassMergeThreshold = 2147483647` to
disable vanilla sort-based shuffle, then see whether the on-heap consumption of
vanilla Spark is reduced. Thanks.
   
   Not the shuffle. From the workload, it's the sort operator. We enabled
off-heap but configured a small off-heap memory and a large on-heap memory; the
spill is triggered even though the on-heap memory is more than enough for the
sort.
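
   A setup of the kind described above might look like the sketch below. The
sizes are illustrative assumptions, not the values from the actual workload;
only the shape (small off-heap, large on-heap) comes from the comment. As far
as I understand, with `spark.memory.offHeap.enabled` set to `true` Spark's
Tungsten memory mode becomes off-heap, so the sort's execution memory is
requested from the off-heap pool. That would be consistent with a spill being
triggered while on-heap memory is still free.

```properties
# Hypothetical values, for illustration only (not from the reported workload).
# Off-heap is enabled, but its pool is small.
spark.memory.offHeap.enabled  true
spark.memory.offHeap.size     2g
# On-heap executor memory is large by comparison.
spark.executor.memory         32g
```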


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

