[jira] [Commented] (HADOOP-8050) Deadlock in metrics

Kihwal Lee (Commented) (JIRA) Sun, 12 Feb 2012 21:35:46 -0800

    [ https://issues.apache.org/jira/browse/HADOOP-8050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13206689#comment-13206689 ]


Kihwal Lee commented on HADOOP-8050:
------------------------------------

bq. The correct fix (sans moving jmx to a sink) is not removing the lock on 
metrics system in the snapshot thread but fixing the lock order in 
MetricsSourceAdapter (so that source.getMetrics is called without holding the 
adapter lock).

I tried to do this in the new patch. Since updateJmxCache() doesn't block while 
calling getMetrics(), some callers may not get the latest metric data if 
updateJmxCache() is already being executed by another thread.
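
A minimal sketch of the lock ordering described above, not the attached patch. 
Only the names updateJmxCache() and getMetrics() come from this thread; 
SketchSource, SketchAdapter, read(), ttlMillis and the "updating" flag are 
hypothetical, and the real MetricsSourceAdapter may be structured differently.

```java
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a metrics source; not the real Hadoop interface.
interface SketchSource {
    List<String> getMetrics();
}

// Hypothetical adapter: the source is never called while the adapter lock is held.
class SketchAdapter {
    private final SketchSource source;
    private final long ttlMillis;
    private List<String> cache = Collections.emptyList(); // guarded by "this"
    private long lastUpdate = 0L;                          // guarded by "this"
    private boolean updating = false;                      // guarded by "this"

    SketchAdapter(SketchSource source, long ttlMillis) {
        this.source = source;
        this.ttlMillis = ttlMillis;
    }

    // Refreshes the cache if needed, then returns whatever is cached.
    List<String> read() {
        updateJmxCache();
        synchronized (this) {
            return cache;
        }
    }

    private void updateJmxCache() {
        synchronized (this) {
            boolean fresh = System.currentTimeMillis() - lastUpdate < ttlMillis;
            if (fresh || updating) {
                // Either the cache is current or another thread is refreshing it.
                // In the second case this caller may see older data.
                return;
            }
            updating = true;
        }
        List<String> latest = null;
        try {
            // Adapter lock is released here, so the source may take its own
            // locks without creating an adapter -> source lock-order cycle.
            latest = source.getMetrics();
        } finally {
            synchronized (this) {
                if (latest != null) {
                    cache = latest;
                    lastUpdate = System.currentTimeMillis();
                }
                updating = false;
            }
        }
    }
}
```

The trade-off is the one noted above: a thread that arrives while a refresh is 
in progress returns the existing cache instead of waiting for the new data.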

The patch passes all metrics-related tests.
                
> Deadlock in metrics
> -------------------
>
>                 Key: HADOOP-8050
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8050
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: metrics
>    Affects Versions: 0.20.204.0, 0.20.205.0, 0.23.0, 1.0.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>             Fix For: 1.1.0, 1.0.1
>
>         Attachments: hadoop-8050-branch-1.patch.txt, 
> hadoop-8050-branch-1.patch.txt, hadoop-8050-branch-1.patch.txt, 
> hadoop-8050-trunk.patch.txt, hadoop-8050-trunk.patch.txt, 
> hadoop-8050.patch.txt
>
>
> The metrics serving thread and the periodic snapshot thread can deadlock.
> It happened a few times on one of our namenodes. When it happens, RPC 
> still works but the web UI and hftp stop working. I haven't looked at trunk 
> too closely, but it might happen there too.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
