[ 
https://issues.apache.org/jira/browse/TEZ-3097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15611534#comment-15611534
 ] 

Harish Jaiprakash commented on TEZ-3097:
----------------------------------------

I was able to approximately reproduce the issue (not the concurrent 
modification itself, but events arriving after the DAG had finished):
* delay enqueueAndScheduleNextEvent(new 
VertexManagerEventOnVertexStarted(completions)) in 
VertexManager.onVertexStarted by 20-30 ms in a different thread.
* This simulates the case where the executor service has not yet invoked the 
onVertexStarted callback, but the task-completed events for the vertex have 
already been handled and the DAG has finished.
* We now have a finished DAG, and ShuffleVertexManagerBase.onVertexStarted is 
suddenly scheduled by the VertexManager, which causes new tasks to be 
scheduled and events to be sent to HistoryEventHandler.
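
A rough, self-contained sketch of the timing described above, in case it helps 
to reason about it outside Tez. Nothing here is Tez code: DelayedCallbackSketch, 
the dagFinished flag and the event string are hypothetical stand-ins, and a 
scheduled executor plays the role of the delayed onVertexStarted callback.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical illustration only; none of these names exist in Tez.
class DelayedCallbackSketch {

    // Returns the events that were recorded after the simulated DAG finished.
    static List<String> run(long delayMillis) {
        AtomicBoolean dagFinished = new AtomicBoolean(false);
        List<String> lateEvents = new CopyOnWriteArrayList<>();
        ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();

        // Stand-in for the onVertexStarted callback, delayed on another thread.
        executor.schedule(() -> {
            if (dagFinished.get()) {
                // The callback still schedules work and emits an event.
                lateEvents.add("TASK_STARTED after DAG finished");
            }
        }, delayMillis, TimeUnit.MILLISECONDS);

        // Meanwhile the task completions are handled and the DAG finishes.
        dagFinished.set(true);

        // Delayed tasks that are already scheduled still run after shutdown()
        // under the default policy of ScheduledThreadPoolExecutor.
        executor.shutdown();
        try {
            executor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return lateEvents;
    }

    public static void main(String[] args) {
        System.out.println(run(25));
    }
}
```

The late event shows up only because the callback runs after the flag is set, 
which is the same ordering the 20-30 ms delay forces in the test.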

Follow-up:
* Is this a bug? Should we handle the case where the DAG can finish before the 
above callback is scheduled? (No test fix is required in that case.)
* Is there a way to fire the task-finished events only after the vertex has 
fully started, i.e. fire task finished only after task start has been called? 
The only way I can think of is to wait for the task-started events to arrive 
in MockHistoryEventHandler (and make it thread safe too).
* Alternatively, since these extra events do not affect the outcome, use a 
concurrent queue instead of the list to make MockHistoryEventHandler thread 
safe.
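
To illustrate the last option, here is a minimal sketch of why a concurrent 
queue would avoid the ConcurrentModificationException. EventRecorderSketch is a 
hypothetical stand-in for MockHistoryEventHandler, events are plain strings, 
and the late event is injected from the iterating thread itself so that the 
result is deterministic. In the real test the add would come from another 
thread.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.ConcurrentModificationException;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for the test's MockHistoryEventHandler.
class EventRecorderSketch {
    private final Collection<String> events;

    EventRecorderSketch(Collection<String> backing) {
        this.events = backing;
    }

    void handle(String event) {
        events.add(event);
    }

    // Counts matching events while one late event arrives mid-iteration,
    // mimicking an event delivered while the test is verifying.
    int countWhileLateEventArrives(String wanted) {
        int count = 0;
        boolean injected = false;
        for (String e : events) {
            if (!injected) {
                handle("LATE_TASK_STARTED");
                injected = true;
            }
            if (e.equals(wanted)) {
                count++;
            }
        }
        return count;
    }

    // True if verification fails with ConcurrentModificationException.
    static boolean throwsCme(Collection<String> backing) {
        EventRecorderSketch recorder = new EventRecorderSketch(backing);
        recorder.handle("VERTEX_COMMIT_STARTED");
        recorder.handle("DAG_FINISHED");
        try {
            recorder.countWhileLateEventArrives("VERTEX_COMMIT_STARTED");
            return false;
        } catch (ConcurrentModificationException ex) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("ArrayList: " + throwsCme(new ArrayList<>()));
        System.out.println("ConcurrentLinkedQueue: "
            + throwsCme(new ConcurrentLinkedQueue<>()));
    }
}
```

ArrayList iterators are fail-fast, while ConcurrentLinkedQueue iterators are 
weakly consistent and never throw ConcurrentModificationException, so the 
verification loop tolerates late events at the cost of possibly seeing them.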

> Flaky test: TestCommit.testDAGCommitStartedEventFail_OnDAGSuccess
> -----------------------------------------------------------------
>
>                 Key: TEZ-3097
>                 URL: https://issues.apache.org/jira/browse/TEZ-3097
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.8.3
>            Reporter: Jeff Zhang
>
> {noformat}
> testDAGCommitStartedEventFail_OnDAGSuccess(org.apache.tez.dag.app.dag.impl.TestCommit)
>   Time elapsed: 0.072 sec  <<< ERROR!
> java.util.ConcurrentModificationException: null
>       at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:859)
>       at java.util.ArrayList$Itr.next(ArrayList.java:831)
>       at org.apache.tez.dag.app.dag.impl.TestCommit$MockHistoryEventHandler.verifyVertexCommitStartedEvent(TestCommit.java:2033)
>       at org.apache.tez.dag.app.dag.impl.TestCommit.testDAGCommitStartedEventFail_OnDAGSuccess(TestCommit.java:1880)
> {noformat}
> https://builds.apache.org/job/Tez-Build-Hadoop-2.4/245/console



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
