Github user a-roberts commented on the pull request:

    https://github.com/apache/spark/pull/12327#issuecomment-212991789
  
    @cloud-fan I've had a closer look at this and think a more robust approach
would be to use weak references to identify when an object is out of scope.
With IBM Java we see the 29% reduction between cache size 1 and cache size 2,
but with OpenJDK we see a 4% increase, suggesting that we can't rely on the
sizes being similar across JDK vendors. I now think this is a test case issue
rather than a problem in the ClosureCleaner or IBM Java code.
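    To make the weak-reference idea concrete, here is a minimal sketch (the
helper name and the GC retry loop are my own illustration, not code from the
PR):

```scala
import java.lang.ref.WeakReference

// Hypothetical helper (not from the PR): repeatedly trigger GC and check
// whether the weakly referenced object has been collected.
def isCollected(ref: WeakReference[AnyRef], maxAttempts: Int = 10): Boolean = {
  var attempts = 0
  while (ref.get() != null && attempts < maxAttempts) {
    System.gc()
    System.runFinalization()
    Thread.sleep(100)
    attempts += 1
  }
  ref.get() == null
}

// Usage sketch: keep only a weak reference to the REPL line object, drop the
// strong references / run the cleaner, then assert it becomes unreachable.
// val lineObjectRef = new WeakReference[AnyRef](replLineObject)
// assert(isCollected(lineObjectRef), "REPL line object was not cleaned")
```

    An assertion like that would be independent of how big the object happens
to be on any particular JVM.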
    
    With IBM Java our second cache size (after repartitioning, which uses the
ContextCleaner) is much smaller, whereas with OpenJDK it grows. Either we have
a bigger memory footprint or the cached size is being calculated incorrectly
(the calculation looks fine to me, and our object sizes are actually smaller).
The problem on Z was due to using repl/pom.xml instead of the pom.xml in the
Spark home directory (we get the same result with the right pom.xml), so it
can be discarded for this discussion.
    
    I'm going to figure out what's in the Scala REPL line objects between
vendors. I think the intention of this commit is to test that the REPL line
object is being cleaned, but the assertion currently in place doesn't look
correct: the size is bigger after the cleaning, and cacheSize2 is the result
of the cleaning, if I'm understanding the code correctly. Have I missed a
trick?
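
    For reference, this is roughly the kind of size-based comparison I mean
(a sketch using SparkContext.getRDDStorageInfo; the actual test code in the
PR may differ):

```scala
// Sketch only: read the cached size from the driver's storage info before and
// after the cleaning step, mirroring the cacheSize1 / cacheSize2 names above.
val cacheSize1 = sc.getRDDStorageInfo.map(_.memSize).sum   // before cleaning
// ... run the closure cleaning / repartition step ...
val cacheSize2 = sc.getRDDStorageInfo.map(_.memSize).sum   // after cleaning

// A relative size comparison like this is what breaks across vendors: object
// layout differs between IBM Java and OpenJDK, so the ratio of cacheSize2 to
// cacheSize1 is not stable.
assert(cacheSize2 < cacheSize1)
```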

