kecookier commented on issue #7860:
URL:
https://github.com/apache/incubator-gluten/issues/7860#issuecomment-2466010191
> Would you like to post the OOM error message you have seen? Or it's only
> an abnormal consumption of rss files?
@zhztheplayer We caught this issue by dumping `/proc/self/status` when the
executor was killed. It showed `RssFile` at almost 3G while `VmRSS` was 3.5G,
so most of the resident memory was file-backed.
This issue affects a number of our large-scale ETL processes.
[image: /proc/self/status]
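
For anyone who wants to check the same thing on their own executors, here is a minimal sketch of reading the kB-valued fields out of `/proc/self/status`-style text and computing how much of the resident set is file-backed. The helper names `parseStatusKb` and `fileBackedFraction` are made up for illustration and are not part of Gluten, Velox, or Arrow.

```cpp
#include <istream>
#include <map>
#include <sstream>
#include <string>

// Hypothetical helper: collects every "Key: <number> kB" line from
// /proc/<pid>/status-style text. Lines without a kB unit are skipped.
std::map<std::string, long> parseStatusKb(std::istream& in) {
  std::map<std::string, long> fields;
  std::string line;
  while (std::getline(in, line)) {
    const auto colon = line.find(':');
    if (colon == std::string::npos) continue;
    std::istringstream rest(line.substr(colon + 1));
    long value = 0;
    std::string unit;
    if (rest >> value >> unit && unit == "kB") {
      fields[line.substr(0, colon)] = value;
    }
  }
  return fields;
}

// Hypothetical helper: RssFile / VmRSS, or -1.0 if either field is missing.
double fileBackedFraction(const std::map<std::string, long>& fields) {
  const auto rss = fields.find("VmRSS");
  const auto file = fields.find("RssFile");
  if (rss == fields.end() || file == fields.end() || rss->second <= 0) {
    return -1.0;
  }
  return static_cast<double>(file->second) / static_cast<double>(rss->second);
}
```

On Linux this can be fed from `std::ifstream in("/proc/self/status");`. A fraction close to 1 means the resident set is dominated by file-backed pages such as mapped spill files, rather than by anonymous heap memory.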

> So the root cause is that the memory is not unmapped until the file is closed.
> When we merge the spills it eventually maps all the spill data to memory.
> "Killed by yarn" error makes sense here.
>
> Let's see if the ReadableFile performance is the same as MemoryMappedFile.
> It's an easy fix if so. Otherwise we need to manually unmap the file.
@FelixYBW
Arrow's `MemoryMappedFile` does not expose a method for getting the underlying
`mmap` address. However, `MemoryMappedFile::MemoryMap` has a method that returns
the region's data pointer, so we could add a method to `MemoryMappedFile` that
returns `head()` or `data()`.
Another approach is to open the file for each partition in `mergeSpills()` and
close it when that partition is finished. We ran some internal performance
tests, and they show that `ReadableFile` may cause some regression.
@ccat3z Can you submit a PR and trigger the community performance benchmark
by adding the comment `/Benchmark Velox`?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]