[ https://issues.apache.org/jira/browse/TEZ-3368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15553445#comment-15553445 ]
Jonathan Eagles commented on TEZ-3368:
--------------------------------------
Findbugs warning is pre-existing. [~jlowe], is this patch still ready to go in?
{code:xml}
<BugInstance rank="8" category="MT_CORRECTNESS"
instanceHash="1610d6e467b95e03ffdc2eb056397eb4" instanceOccurrenceNum="0"
priority="2" abbrev="JLM" type="JLM_JSR166_UTILCONCURRENT_MONITORENTER"
instanceOccurrenceMax="0">
<ShortMessage>
Synchronization performed on util.concurrent instance
</ShortMessage>
<LongMessage>
Synchronization performed on java.util.concurrent.atomic.AtomicBoolean in
org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.mainLoop()
</LongMessage>
<Class
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
primary="true">
<SourceLine start="1933"
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java"
sourcefile="YarnTaskSchedulerService.java" end="2155">
<Message>At YarnTaskSchedulerService.java:[lines 1933-2155]</Message>
</SourceLine>
<Message>
In class
org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager
</Message>
</Class>
<Method isStatic="false"
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
name="mainLoop" primary="true" signature="()V">
<SourceLine endBytecode="995" startBytecode="0" start="1970"
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java"
sourcefile="YarnTaskSchedulerService.java" end="2057"/>
<Message>
In method
org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.mainLoop()
</Message>
</Method>
<Type descriptor="Ljava/util/concurrent/atomic/AtomicBoolean;">
<SourceLine start="53" classname="java.util.concurrent.atomic.AtomicBoolean"
sourcepath="java/util/concurrent/atomic/AtomicBoolean.java"
sourcefile="AtomicBoolean.java" end="161">
<Message>At AtomicBoolean.java:[lines 53-161]</Message>
</SourceLine>
<Message>Type java.util.concurrent.atomic.AtomicBoolean</Message>
</Type>
<Field isStatic="false"
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
name="drainedDelayedContainersForTest" primary="true" role="FIELD_VALUE_OF"
signature="Ljava/util/concurrent/atomic/AtomicBoolean;">
<SourceLine
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java"
sourcefile="YarnTaskSchedulerService.java">
<Message>In YarnTaskSchedulerService.java</Message>
</SourceLine>
<Message>
Value loaded from field
org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.drainedDelayedContainersForTest
</Message>
</Field>
<SourceLine endBytecode="50" startBytecode="50" start="1985"
classname="org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager"
primary="true"
sourcepath="org/apache/tez/dag/app/rm/YarnTaskSchedulerService.java"
sourcefile="YarnTaskSchedulerService.java" end="1985">
<Message>At YarnTaskSchedulerService.java:[line 1985]</Message>
</SourceLine>
</BugInstance>
<BugCategory category="MT_CORRECTNESS">
<Description>Multithreaded correctness</Description>
</BugCategory>
<BugPattern category="MT_CORRECTNESS" abbrev="JLM"
type="JLM_JSR166_UTILCONCURRENT_MONITORENTER">
<ShortDescription>
Synchronization performed on util.concurrent instance
</ShortDescription>
<Details>
<p> This method performs synchronization on an object that is an instance of a
class from the java.util.concurrent package (or its subclasses). Instances of
these classes have their own concurrency control mechanisms that are orthogonal
to the synchronization provided by the Java keyword <code>synchronized</code>.
For example, synchronizing on an <code>AtomicBoolean</code> will not prevent
other threads from modifying the <code>AtomicBoolean</code>.</p> <p>Such code
may be correct, but should be carefully reviewed and documented, and may
confuse people who have to maintain the code at a later date. </p>
</Details>
</BugPattern>
<BugCode abbrev="JLM">
<Description>Synchronization on java.util.concurrent objects</Description>
</BugCode>{code}
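
To illustrate what the warning text describes, here is a minimal sketch (not code from Tez or from the patch; the class and method names are hypothetical). It holds the monitor of an AtomicBoolean while a second thread calls set(true) without synchronizing, showing that the monitor does not guard the value.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical demo class, not part of Tez.
class AtomicBooleanMonitorDemo {

    /**
     * Holds the AtomicBoolean's monitor while another thread calls set(true).
     * Returns true if the write completed while the monitor was still held.
     */
    static boolean modifiedWhileLocked() {
        final AtomicBoolean flag = new AtomicBoolean(false);
        final CountDownLatch written = new CountDownLatch(1);
        synchronized (flag) {
            Thread writer = new Thread(() -> {
                flag.set(true);      // does not acquire flag's monitor
                written.countDown();
            });
            writer.start();
            try {
                // If set() needed the monitor, this wait would time out.
                boolean finished = written.await(5, TimeUnit.SECONDS);
                return finished && flag.get();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("modified while locked: " + modifiedWhileLocked());
    }
}
```

Using the AtomicBoolean as a lock object can still be correct if every thread that needs mutual exclusion also synchronizes on it, which is why the warning asks for review rather than reporting a definite bug.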
> NPE in DelayedContainerManager
> ------------------------------
>
> Key: TEZ-3368
> URL: https://issues.apache.org/jira/browse/TEZ-3368
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Attachments: TEZ-3368.001.patch
>
>
> Saw a Tez AM hang due to an NPE in the DelayedContainerManager:
> {noformat}
> 2016-07-17 01:53:23,157 [ERROR] [DelayedContainerManager] |yarn.YarnUncaughtExceptionHandler|: Thread Thread[DelayedContainerManager,5,main] threw an Exception.
> java.lang.NullPointerException
>   at org.apache.tez.dag.app.rm.TezAMRMClientAsync.getMatchingRequestsForTopPriority(TezAMRMClientAsync.java:142)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.getMatchingRequestWithoutPriority(YarnTaskSchedulerService.java:1474)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$500(YarnTaskSchedulerService.java:84)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$NodeLocalContainerAssigner.assignReUsedContainer(YarnTaskSchedulerService.java:1869)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignReUsedContainerWithLocation(YarnTaskSchedulerService.java:1753)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.assignDelayedContainer(YarnTaskSchedulerService.java:733)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService.access$600(YarnTaskSchedulerService.java:84)
>   at org.apache.tez.dag.app.rm.YarnTaskSchedulerService$DelayedContainerManager.run(YarnTaskSchedulerService.java:2030)
> {noformat}
> After the DelayedContainerManager thread exited, the AM continued to receive
> requested containers, which went unused until their allocations expired. They
> were then re-requested, and the cycle repeated indefinitely.
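
The failure mode described above, a worker thread that exits on an uncaught exception and is never restarted, can be reproduced in isolation. The sketch below is only an illustration under assumed names (the class, method, and thread name are hypothetical and not taken from Tez): a thread hits a NullPointerException, the uncaught-exception handler records it, and the thread is gone afterwards.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class, not part of Tez.
class WorkerDeathDemo {

    /** Runs a worker that dereferences null; returns the exception that killed it. */
    static Throwable runAndCapture() {
        final AtomicReference<Throwable> failure = new AtomicReference<>();
        Thread worker = new Thread(() -> {
            String request = null;  // stands in for a lookup that returned nothing
            request.length();       // throws NullPointerException
        }, "DelayedContainerManager-sketch");
        // The handler only observes the failure; it does not restart the thread.
        worker.setUncaughtExceptionHandler((thread, error) -> failure.set(error));
        worker.start();
        try {
            worker.join();          // returns once the worker has terminated
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return failure.get();
    }

    public static void main(String[] args) {
        System.out.println("worker died with: " + runAndCapture());
    }
}
```

Anything that relied on the worker continuing to run, such as draining a queue of delayed containers, would stall from this point on unless the loop guards against the exception or the thread is restarted.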
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)