jinchengchenghh commented on issue #8025: URL: https://github.com/apache/incubator-gluten/issues/8025#issuecomment-2499462328
> Thank you, @jinchengchenghh. With the tuning of kMaxSpillRunRows and kSpillWriteBufferSize, one of my tasks succeeds but the other one still fails. It looks like there is still some large memory allocation in getoutput.

Maybe that is because the streams hold all the buffers and release them only after all the files have been read. Compression also consumes a lot of buffer memory, which is not tracked by the memory pool in the meantime: https://github.com/facebookincubator/velox/blob/main/velox/serializers/PrestoSerializer.cpp#L4416

I don't see compression in Spark's spill, so it doesn't need to request memory for compression. I will add a new config to control the Velox spill codec.

Is it still an OOM, or is the task killed by YARN?

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
