[jira] Commented: (HADOOP-3762) Task tracker died due to OOM

Tsz Wo (Nicholas), SZE (JIRA) Wed, 16 Jul 2008 14:21:53 -0700

    [ https://issues.apache.org/jira/browse/HADOOP-3762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12614110#action_12614110 ]


Tsz Wo (Nicholas), SZE commented on HADOOP-3762:
------------------------------------------------

For example, suppose a caller invokes FileSystem.get("hdfs://host:8020/", conf).
The method looks up the cache using the username (which may be null, since the
user might not have logged in yet) and the URI "hdfs://host:8020/" as the key.
If the lookup fails, it creates a file system object (depending on the
FileSystem class, the user might log in at this point) and puts it into the
cache under a new key.  The URI in the new key is "hdfs://host" (no port),
since 8020 is the DEFAULT_PORT.  Then a subsequent call to
FileSystem.get("hdfs://host:8020/", conf) will not hit the cache either.
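
The mismatch described above can be reproduced in a small standalone sketch.
This is not the Hadoop code: CacheKeyDemo, canonical(), key(), getBuggy() and
getConsistent() are hypothetical names made up for illustration, and the cache
is a plain HashMap keyed by a "user@uri" string. It only assumes what the
comment states, namely that the lookup key uses the URI as given while the
stored key uses a URI with the default port dropped.

```java
import java.net.URI;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a file system cache; not the real Hadoop classes.
class CacheKeyDemo {
    static final int DEFAULT_PORT = 8020;

    final Map<String, Object> cache = new HashMap<>();
    int created = 0; // number of "file system" objects created so far

    // Assumed behaviour: the created object reports its URI without the
    // port when the port equals DEFAULT_PORT.
    static URI canonical(URI uri) {
        if (uri.getPort() == DEFAULT_PORT) {
            return URI.create(uri.getScheme() + "://" + uri.getHost());
        }
        return uri;
    }

    static String key(String user, URI uri) {
        return user + "@" + uri;
    }

    // Looks up with the URI as given, but stores under the canonical URI,
    // so the same request never finds the entry it stored earlier.
    Object getBuggy(String uriString, String user) {
        URI uri = URI.create(uriString);
        Object fs = cache.get(key(user, uri));
        if (fs == null) {
            fs = new Object();
            created++;
            cache.put(key(user, canonical(uri)), fs);
        }
        return fs;
    }

    // Uses the same canonical key for both lookup and store.
    Object getConsistent(String uriString, String user) {
        String k = key(user, canonical(URI.create(uriString)));
        Object fs = cache.get(k);
        if (fs == null) {
            fs = new Object();
            created++;
            cache.put(k, fs);
        }
        return fs;
    }

    public static void main(String[] args) {
        CacheKeyDemo buggy = new CacheKeyDemo();
        buggy.getBuggy("hdfs://host:8020/", null);
        buggy.getBuggy("hdfs://host:8020/", null);
        System.out.println("buggy creations: " + buggy.created);

        CacheKeyDemo fixed = new CacheKeyDemo();
        fixed.getConsistent("hdfs://host:8020/", null);
        fixed.getConsistent("hdfs://host:8020/", null);
        System.out.println("consistent creations: " + fixed.created);
    }
}
```

In the sketch, every repeated getBuggy() call creates a new object. In the
task tracker, each such object would presumably carry its own DFSClient, which
would be consistent with the large number of LeaseChecker threads reported.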

> Task tracker died due to OOM 
> -----------------------------
>
>                 Key: HADOOP-3762
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3762
>             Project: Hadoop Core
>          Issue Type: Bug
>            Reporter: Runping Qi
>            Assignee: Tsz Wo (Nicholas), SZE
>            Priority: Blocker
>             Fix For: 0.18.0
>
>         Attachments: 3762_20080715.patch, 3762_20080715b.patch, 
> 3762_20080715c.patch, TaskTrackerStackTrace.txt
>
>
> When running about 100 moderate jobs on a small cluster (with 19 Task 
> Trackers),
> the task trackers all died due to OOM.
> I got a chance to dump the jstack stack trace of a task tracker before it died.
> Its image size was close to 4GB!
> I saw 1200+ threads of DFSClient.LeaseChecker.
> Clearly we have a severe resource leakage problem!

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
