[ https://issues.apache.org/jira/browse/TEZ-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15109675#comment-15109675 ]
Jason Lowe commented on TEZ-3066:
---------------------------------

{noformat}
2016-01-19 14:50:28,819 [WARN] [RecoveryEventHandlingThread] |recovery.RecoveryService|: Error handling recovery event
java.util.ConcurrentModificationException
	at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
	at java.util.ArrayList$Itr.next(ArrayList.java:851)
	at org.apache.tez.dag.history.events.TaskAttemptFinishedEvent.toProto(TaskAttemptFinishedEvent.java:128)
	at org.apache.tez.dag.history.events.TaskAttemptFinishedEvent.toProtoStream(TaskAttemptFinishedEvent.java:165)
	at org.apache.tez.dag.history.recovery.RecoveryService.handleRecoveryEvent(RecoveryService.java:453)
	at org.apache.tez.dag.history.recovery.RecoveryService$1.run(RecoveryService.java:177)
	at java.lang.Thread.run(Thread.java:745)
2016-01-19 14:50:28,819 [ERROR] [EntityFileLoggingServiceEventHandler] |ats.EntityFileLoggingService|: Error writing entity log
java.util.ConcurrentModificationException
	at java.util.ArrayList$Itr.checkForComodification(ArrayList.java:901)
	at java.util.ArrayList$Itr.next(ArrayList.java:851)
	at org.apache.tez.dag.history.utils.DAGUtils.convertDataEventDependecyInfoToATS(DAGUtils.java:115)
	at org.apache.tez.dag.history.logging.ats.HistoryEventTimelineConversion.convertTaskAttemptFinishedEvent(HistoryEventTimelineConversion.java:468)
	at org.apache.tez.dag.history.logging.ats.HistoryEventTimelineConversion.convertToTimelineEntity(HistoryEventTimelineConversion.java:114)
	at org.apache.tez.dag.history.logging.ats.EntityFileLoggingService.handleEvent(EntityFileLoggingService.java:294)
	at org.apache.tez.dag.history.logging.ats.EntityFileLoggingService.access$500(EntityFileLoggingService.java:65)
	at org.apache.tez.dag.history.logging.ats.EntityFileLoggingService$EventProcessor.run(EntityFileLoggingService.java:327)
	at java.lang.Thread.run(Thread.java:745)
{noformat}

It looks like the TaskAttemptFinishedEvent dataEvents list is getting updated while the recovery service and history logging services are attempting to process it. When the TaskAttemptFinishedEvent is built, it's handed a data event list directly (not a copy), and it looks like VertexImpl#getTaskAttemptTezEvents can modify that list. So I think the recovery service and history logging services are looking at the event just as something is updating the last data event sent.

> TaskAttemptFinishedEvent ConcurrentModificationException if processed by
> RecoveryService and history logging simultaneously
> ---------------------------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-3066
>                 URL: https://issues.apache.org/jira/browse/TEZ-3066
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.0
>            Reporter: Jason Lowe
>
> A ConcurrentModificationException can occur if a TaskAttemptFinishedEvent is
> processed simultaneously by the recovery service and another history logging
> service. Sample stacktraces to follow.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
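
The failure mode described in the comment (an event holding a reference to a list that its producer keeps modifying) can be reproduced in isolation. The following is only a minimal sketch and is not Tez code: `FinishedEvent`, `serialize`, and `failsWhenProducerAppends` are hypothetical names, and the "concurrent" write is simulated deterministically on one thread by a hook that runs in the middle of the iteration. It contrasts keeping the caller's list with taking a defensive copy at construction time, which is one possible way to avoid the exception and not necessarily the fix adopted for this issue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.ConcurrentModificationException;
import java.util.List;

public class DataEventCopySketch {

    /** Hypothetical stand-in for an event that carries a list of data events. */
    static final class FinishedEvent {
        private final List<String> dataEvents;

        FinishedEvent(List<String> dataEvents, boolean copy) {
            // copy == false mirrors the situation in the comment: the caller's
            // list is stored directly, so later writes by the caller are visible here.
            this.dataEvents = copy
                ? Collections.unmodifiableList(new ArrayList<>(dataEvents))
                : dataEvents;
        }

        /** Walks the list as a serializer would; the hook simulates a writer racing with it. */
        int serialize(Runnable midIterationHook) {
            int count = 0;
            for (String ignored : dataEvents) {
                if (count == 0) {
                    midIterationHook.run(); // the producer mutates its list here
                }
                count++;
            }
            return count;
        }
    }

    /** Returns true if serializing the event throws ConcurrentModificationException. */
    static boolean failsWhenProducerAppends(boolean copy) {
        List<String> producerList = new ArrayList<>(Arrays.asList("event-1", "event-2"));
        FinishedEvent event = new FinishedEvent(producerList, copy);
        try {
            event.serialize(() -> producerList.add("event-3"));
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("shared list fails: " + failsWhenProducerAppends(false)); // true
        System.out.println("copied list fails: " + failsWhenProducerAppends(true));  // false
    }
}
```

A copy taken in the constructor only helps if it is made on the thread that owns the list (or under the same lock as the writers), because the copy itself iterates the source list.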