linehrr commented on issue #24461: [SPARK-27434][CORE] Fix mem leak due to 
hadoop fs caching mechanism when eventLog is enabled
URL: https://github.com/apache/spark/pull/24461#issuecomment-488114925
 
 
   @vanzin 
   the heap dumps show a HashSet under `FileSystem$Statistics` that is retaining a lot of memory, and from the class definition there is only one HashSet in that class.
   
   also, I think the HashMap there is misleading: HashSet is implemented on top of a HashMap, which it uses for its internal lookups:
    ```java
    public class HashSet<E>
        extends AbstractSet<E>
        implements Set<E>, Cloneable, java.io.Serializable
    {
        static final long serialVersionUID = -5024744406713321676L;

        private transient HashMap<E,Object> map;

        /**
         * Constructs a new, empty set; the backing <tt>HashMap</tt> instance has
         * default initial capacity (16) and load factor (0.75).
         */
        public HashSet() {
            map = new HashMap<>();
        }
    ```
   
   so that HashMap is just the internal data structure backing the HashSet. 
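
   This can be checked without a heap dump: on current JDKs, HashSet's only set-content field is a `map` field typed `HashMap`, which reflection can confirm (assuming the JDK keeps the field name `map`, as it has from at least Java 8 through 21):

```java
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.HashSet;

public class HashSetBacking {
    public static void main(String[] args) throws Exception {
        // HashSet declares a private field "map"; inspecting its static
        // type shows the set's contents actually live inside a HashMap,
        // which is why profilers attribute the retained bytes to HashMap.
        Field f = HashSet.class.getDeclaredField("map");
        System.out.println(f.getType().getSimpleName()); // HashMap
    }
}
```

   (Only the field's declared type is read here; no `setAccessible` is needed, so this works even under the module encapsulation of newer JDKs.)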
   
   I can take another look at the dump, but I'm not sure I'll find anything more useful; I'll post back if I do. But I agree, at this point this is less likely to be a Spark issue. 
