[ 
https://issues.apache.org/jira/browse/TEZ-1585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14133758#comment-14133758
 ] 

Rajesh Balamohan commented on TEZ-1585:
---------------------------------------

Update:
=======
- In session mode, DAGImpl creates a new UserGroupInformation for every job 
that is run in the session. 
- Internally, FileSystem.get() keeps its own static cache, and the cache key 
is built from UserGroupInformation.getCurrentUser(), so the UGI contributes 
to the key's hashCode(): 
{code}
      @Override
      public int hashCode() {
        return (scheme + authority).hashCode() + ugi.hashCode() + (int)unique;
      }
{code}
- Since UserGroupInformation.getCurrentUser() returns a UGI with a different 
hashCode for every job, each job adds a new key to FileSystem.CACHE.map.
- These entries are not released, which leaks about 200KB of data per job 
submitted to the session.

Is it possible to share appMasterUgi with DAGImpl when credentials are not 
available?
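
To illustrate the mechanism, here is a minimal standalone sketch that does not 
use Hadoop at all. Ugi and Key are hypothetical stand-ins for 
UserGroupInformation and the FileSystem cache key, and the sketch assumes that 
two UGI objects created for different jobs never compare equal. Only the 
hashCode() body is taken from the snippet above; everything else is an 
assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

class FsCacheLeakModel {

    /** Hypothetical stand-in for UserGroupInformation: equality is by object identity. */
    static final class Ugi {
        final String user;
        Ugi(String user) { this.user = user; }
        // equals()/hashCode() deliberately not overridden (assumption of this sketch).
    }

    /** Hypothetical stand-in for the FileSystem cache key. */
    static final class Key {
        final String scheme;
        final String authority;
        final Ugi ugi;
        final long unique;

        Key(String scheme, String authority, Ugi ugi, long unique) {
            this.scheme = scheme;
            this.authority = authority;
            this.ugi = ugi;
            this.unique = unique;
        }

        @Override
        public int hashCode() {
            // Same shape as the hashCode() quoted in the comment above.
            return (scheme + authority).hashCode() + ugi.hashCode() + (int) unique;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Key)) return false;
            Key other = (Key) o;
            return scheme.equals(other.scheme)
                && authority.equals(other.authority)
                && ugi.equals(other.ugi)
                && unique == other.unique;
        }
    }

    /** Runs `jobs` cache lookups and returns how many entries the cache holds afterwards. */
    static int cacheSizeAfterJobs(int jobs, boolean shareUgi) {
        Map<Key, Object> cache = new HashMap<>();
        Ugi shared = new Ugi("tez");
        for (int i = 0; i < jobs; i++) {
            // A fresh Ugi per job models what DAGImpl does in session mode.
            Ugi ugi = shareUgi ? shared : new Ugi("tez");
            Key key = new Key("hdfs", "namenode:8020", ugi, 0L);
            cache.computeIfAbsent(key, k -> new Object()); // placeholder for a FileSystem instance
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("new UGI per job: " + cacheSizeAfterJobs(100, false)); // 100
        System.out.println("shared UGI:      " + cacheSizeAfterJobs(100, true));  // 1
    }
}
```

In this model the cache grows by one entry per job when each job gets its own 
identity object, and stays at a single entry when one object is reused, which 
is the effect that sharing appMasterUgi would be expected to have.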

> Memory leak in tez session mode
> -------------------------------
>
>                 Key: TEZ-1585
>                 URL: https://issues.apache.org/jira/browse/TEZ-1585
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>         Attachments: app_master.tar.gz
>
>
> DAGAppMaster's memory gradually keeps increasing when running simple tez apps 
> repeatedly in session mode.  Taking memory snapshots revealed that 
> FileSystem.CACHE is getting populated with some keys which are not being 
> released.  
> Around 200KB of data is leaked for every job invocation in session mode.  
> I will attach the profiler snapshots soon.  I have not yet figured out who 
> calls FileSystem.get() without calling close().



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
