rickyma commented on PR #1759: URL: https://github.com/apache/incubator-uniffle/pull/1759#issuecomment-2146959428
We can do the math here. According to https://github.com/apache/incubator-uniffle/pull/1759#issuecomment-2139067679, the heap memory usage is 4,000 blocks × 200 bytes per reduce partition (800 KB each). If a Spark job has 20,000 reduce partitions, that works out to 20,000 × 800 KB ≈ 14.9 GiB of heap memory. Assuming the actual size of these blocks is very small, they will never trigger a flush operation. And this is just one Spark job. If more extreme Spark jobs pile up, I think the Uniffle server will face increasingly severe GC pauses, or even OOM. Even more extreme, a Spark job with 200,000 reduce partitions would occupy 149 GiB of heap memory. And what about even more reduce partitions, say 2,000,000?
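The estimate above can be reproduced with a quick script. Note the per-partition figures (4,000 buffered blocks, ~200 bytes of heap per block) are taken from the linked comment, not measured here:

```python
# Rough heap-usage estimate for buffered block metadata on the Uniffle server.
# Assumptions (from the linked discussion): ~4,000 buffered blocks per reduce
# partition, ~200 bytes of heap per block record.
BLOCKS_PER_PARTITION = 4_000
BYTES_PER_BLOCK = 200

def heap_gib(reduce_partitions: int) -> float:
    """Estimated heap usage in GiB for a single Spark job."""
    total_bytes = reduce_partitions * BLOCKS_PER_PARTITION * BYTES_PER_BLOCK
    return total_bytes / (1024 ** 3)

print(f"{heap_gib(20_000):.1f} GiB")    # → 14.9 GiB
print(f"{heap_gib(200_000):.1f} GiB")   # → 149.0 GiB
print(f"{heap_gib(2_000_000):.1f} GiB") # → 1490.1 GiB
```

This assumes the blocks stay small enough that no flush is ever triggered, so every block's metadata remains resident on the heap at once.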
