FelixYBW commented on issue #6947:
URL: 
https://github.com/apache/incubator-gluten/issues/6947#issuecomment-2495851658

   These three configurations have a big impact on off-heap and overhead memory 
usage:
   
   spark.gluten.sql.columnar.backend.velox.spillWriteBufferSize
   spark.gluten.sql.columnar.backend.velox.MaxSpillRunRows
   spark.gluten.sql.columnar.backend.velox.maxSpillFileSize
   
   spillWriteBufferSize controls the size of the buffer used when a spill writes 
data to disk. It appears to also control the read buffer size when spilled data 
is fetched back. Each file must have one buffer allocated in off-heap memory. If 
the size is too large, an OOM error is reported, triggered by getOutput.
   
   MaxSpillRunRows controls the batch size of a spill. The larger the number, the 
more overhead memory is allocated, because all memory allocated during a spill 
is overhead memory. The smaller the number, the more spill files are created.
   
   maxSpillFileSize controls the size of each spill file. The smaller the number, 
the more spill files are created.
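
   To make the interaction concrete, here is a rough back-of-the-envelope 
sketch in Python. It is not Gluten or Velox code. It assumes, per the 
description above, that each spill file needs one buffer of 
spillWriteBufferSize when it is read back, and that the file count is roughly 
the spilled bytes divided by maxSpillFileSize. The function name and all the 
numbers are hypothetical and only illustrate the trend.

```python
def estimate_spill_read_buffers(spilled_bytes, max_spill_file_size,
                                spill_write_buffer_size):
    """Rough estimate of spill file count and total read-buffer memory.

    Assumption (not taken from the Velox source): one buffer of
    spill_write_buffer_size bytes is held per spill file during read-back.
    """
    # Ceiling division with integers, so large byte counts stay exact.
    num_files = -(-spilled_bytes // max_spill_file_size)
    buffer_bytes = num_files * spill_write_buffer_size
    return num_files, buffer_bytes


MiB = 1024 ** 2
GiB = 1024 ** 3

# Hypothetical workload: 100 GiB spilled, 4 MiB buffer per file.
for file_size in (1 * GiB, 256 * MiB):
    files, buf = estimate_spill_read_buffers(100 * GiB, file_size, 4 * MiB)
    print(f"maxSpillFileSize={file_size // MiB} MiB -> "
          f"{files} files, ~{buf // MiB} MiB of buffers")
```

   Under these assumptions, shrinking the file size from 1 GiB to 256 MiB 
quadruples the number of files, and with it the buffer memory needed on 
read-back, unless spillWriteBufferSize is reduced as well.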
   
   #8025
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

