[ https://issues.apache.org/jira/browse/MAPREDUCE-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Vinod Kumar Vavilapalli updated MAPREDUCE-2954:
-----------------------------------------------
Priority: Critical (was: Major)
Hitting another variant of this too:
{code}
Java stack information for the threads listed above:
===================================================
"Thread-45":
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.getAttemptId(ApplicationAttemptIdPBImpl.java:90)
- waiting to lock <0xb5e2d1b0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:147)
- locked <0xb5e2cb28> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:31)
at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:215)
at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:34)
at java.util.concurrent.ConcurrentSkipListMap.doGet(ConcurrentSkipListMap.java:797)
at java.util.concurrent.ConcurrentSkipListMap.get(ConcurrentSkipListMap.java:1640)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:360)
at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:355)
at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:113)
at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
at java.lang.Thread.run(Thread.java:619)
"Thread-30":
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.getApplicationId(ApplicationAttemptIdPBImpl.java:101)
- waiting to lock <0xb5e2cb28> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:144)
- locked <0xb5e2d1b0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:31)
at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:215)
at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:34)
at java.util.concurrent.ConcurrentSkipListMap.doRemove(ConcurrentSkipListMap.java:1078)
at java.util.concurrent.ConcurrentSkipListMap.remove(ConcurrentSkipListMap.java:1673)
at java.util.concurrent.ConcurrentSkipListMap$Iter.remove(ConcurrentSkipListMap.java:2256)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.getNodeStatus(NodeStatusUpdaterImpl.java:223)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.access$300(NodeStatusUpdaterImpl.java:62)
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:262)
Found 1 deadlock.
{code}
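In the trace above, Thread-45 holds <0xb5e2cb28> and waits for <0xb5e2d1b0>, while Thread-30 holds <0xb5e2d1b0> and waits for <0xb5e2cb28>. That looks like a synchronized compareTo() calling synchronized getters on the other instance, so a.compareTo(b) and b.compareTo(a) take the two monitors in opposite order. Below is a minimal sketch of one way to avoid that shape. It assumes the locking pattern inferred from the "locked"/"waiting to lock" frames; AttemptIdSketch, AttemptId and raceCompletes are hypothetical names, not the Hadoop classes, and this is not necessarily the fix that will be committed.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class AttemptIdSketch {

    /** Hypothetical stand-in for a PB-backed id record; not the real Hadoop class. */
    public static final class AttemptId implements Comparable<AttemptId> {
        private long appId;
        private int attemptId;

        public AttemptId(long appId, int attemptId) {
            this.appId = appId;
            this.attemptId = attemptId;
        }

        // Getters are synchronized to mirror the record's locking (assumed).
        public synchronized long getAppId() { return appId; }
        public synchronized int getAttemptId() { return attemptId; }

        // Deadlock-prone shape (inferred from the traces): a synchronized
        // compareTo keeps this object's monitor while calling the other
        // object's synchronized getters.
        //
        // Shape used here: compareTo is NOT synchronized, so each getter
        // call takes and releases one monitor, and no thread ever holds
        // two monitors at once.
        @Override
        public int compareTo(AttemptId other) {
            int c = Long.compare(this.getAppId(), other.getAppId());
            return c != 0 ? c : Integer.compare(this.getAttemptId(), other.getAttemptId());
        }
    }

    /** Runs a.compareTo(b) and b.compareTo(a) concurrently; true if both threads finish within 10s. */
    public static boolean raceCompletes(AttemptId a, AttemptId b, int iterations) {
        CountDownLatch done = new CountDownLatch(2);
        Thread t1 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) a.compareTo(b);
            done.countDown();
        });
        Thread t2 = new Thread(() -> {
            for (int i = 0; i < iterations; i++) b.compareTo(a);
            done.countDown();
        });
        // Daemon threads so a stuck pair cannot keep the JVM alive.
        t1.setDaemon(true);
        t2.setDaemon(true);
        t1.start();
        t2.start();
        try {
            return done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        AttemptId a = new AttemptId(1L, 1);
        AttemptId b = new AttemptId(1L, 2);
        System.out.println("completed=" + raceCompletes(a, b, 100_000));
    }
}
```

The trade-off in this sketch is that the two fields are no longer read atomically as a pair. That only matters if ids can be mutated after they are used as map keys, which would already break the ConcurrentSkipListMap ordering.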
> Deadlock in NM with threads racing for ApplicationAttemptId
> -----------------------------------------------------------
>
> Key: MAPREDUCE-2954
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2954
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: mrv2
> Affects Versions: 0.23.0
> Reporter: Vinod Kumar Vavilapalli
> Priority: Critical
> Fix For: 0.23.0
>
>
> Found this:
> {code}
> Java stack information for the threads listed above:
> ===================================================
> "Thread-45":
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.getApplicationId(ApplicationAttemptIdPBImpl.java:101)
> - waiting to lock <0xb6a43ba0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:144)
> - locked <0xb6a443a0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:31)
> at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:215)
> at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:34)
> at java.util.concurrent.ConcurrentSkipListMap.doGet(ConcurrentSkipListMap.java:797)
> at java.util.concurrent.ConcurrentSkipListMap.get(ConcurrentSkipListMap.java:1640)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:360)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl$ContainerEventDispatcher.handle(ContainerManagerImpl.java:355)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:113)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:75)
> at java.lang.Thread.run(Thread.java:619)
> "Thread-30":
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.getApplicationId(ApplicationAttemptIdPBImpl.java:101)
> - waiting to lock <0xb6a443a0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:144)
> - locked <0xb6a43ba0> (a org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl)
> at org.apache.hadoop.yarn.api.records.impl.pb.ApplicationAttemptIdPBImpl.compareTo(ApplicationAttemptIdPBImpl.java:31)
> at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:215)
> at org.apache.hadoop.yarn.api.records.impl.pb.ContainerIdPBImpl.compareTo(ContainerIdPBImpl.java:34)
> at java.util.concurrent.ConcurrentSkipListMap.doRemove(ConcurrentSkipListMap.java:1078)
> at java.util.concurrent.ConcurrentSkipListMap.remove(ConcurrentSkipListMap.java:1673)
> at java.util.concurrent.ConcurrentSkipListMap$Iter.remove(ConcurrentSkipListMap.java:2256)
> at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.getNodeStatus(NodeStatusUpdaterImpl.java:223)
> at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.access$300(NodeStatusUpdaterImpl.java:62)
> at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:262)
> Found 1 deadlock.
> {code}