GitHub user vanzin commented on the issue:

    https://github.com/apache/spark/pull/19170
  
    > Maybe we are using SHS too aggressively, but the GC issue is one of the 
major issues we met.
    
    Can you describe what this issue is? That is not what the bug is showing. 
The bug shows a heap dump with a lot of `BlockStatus` objects. I'm saying that 
with the new code, you should not get into that situation, because the SHS does 
not hold on to those objects. Is that not what you see?
    
    If you see `BlockStatus` objects still being referenced then there is 
probably a bug somewhere.
    
    Barring the issue above, to the best of my knowledge this patch would not 
help much with GC. The code still loads the data for these events from disk (= 
creates garbage) and still creates json4s objects for them (= more garbage). 
Doing this filtering would only avoid a trivial amount of garbage after that.
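
    To illustrate the allocation-ordering point, here is a rough sketch (in 
Python rather than the SHS's Scala, with hypothetical names, not the actual 
replay code): because each log line is deserialized before any filtering can 
be applied, the per-event garbage is created whether or not the event is kept.

    ```python
    import json

    # Hypothetical sketch: a history-server-style replay loop reads the
    # event log line by line and deserializes each line *before* the
    # event type can be inspected.
    def replay(lines, wanted_types):
        kept = []
        for line in lines:
            event = json.loads(line)  # allocation happens here, filtered or not
            if event.get("Event") in wanted_types:
                kept.append(event)    # filtering only skips this later step
        return kept

    log = [
        '{"Event": "SparkListenerTaskEnd", "Task ID": 1}',
        '{"Event": "SparkListenerBlockUpdated", "Block": "rdd_0_0"}',
    ]
    # The dict for the filtered-out block update was still allocated by
    # json.loads; only the downstream handling of it is saved.
    events = replay(log, {"SparkListenerTaskEnd"})
    ```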
