Re: [PR] [#1727] improvement(server): Introduce block num threshold for early buffer flush to mitigate GC issues [incubator-uniffle]

via GitHub Mon, 03 Jun 2024 01:45:23 -0700


xianjingfeng commented on PR #1759:
URL: https://github.com/apache/incubator-uniffle/pull/1759#issuecomment-2144631615


   > It is only for a single shuffle buffer. If you look at it from the 
perspective of the shuffle server, rather than a single job, after this PR is 
merged, it will help the shuffle server to handle more tasks more stably, 
including those tasks with many small blocks due to unreasonable 
configurations. It will not cause a decrease in the overall memory utilization 
of the shuffle server. As long as you run enough tasks in parallel, the memory 
usage will still go up. If the threshold in this PR is triggered, it indicates 
that there are too many small blocks. Our shuffle server should only maintain 
large blocks, and too many small blocks in heap memory will only drag down the 
shuffle server.
   
   But in our production environment, most blocks are small, and we can't 
modify `spark.rss.writer.buffer.spill.size` uniformly because some applications 
would be killed by YARN for exceeding memory limits.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


