[ https://issues.apache.org/jira/browse/SPARK-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Imran Rashid resolved SPARK-9366.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Issue resolved by pull request 7681
[https://github.com/apache/spark/pull/7681]

> TaskEnd event emitted for task has different stage attempt ID than TaskStart for same task
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-9366
>                 URL: https://issues.apache.org/jira/browse/SPARK-9366
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.1
>            Reporter: Ryan Williams
>             Fix For: 1.5.0
>
>
> During a simple job I ran yesterday, I observed the following in the event
> log:
> {code}
> {"Event":"SparkListenerTaskStart","Stage ID":0,"Stage Attempt ID":1,"Task Info":{"Task ID":10244,"Index":55,"Attempt":1,"Launch Time":1437767843724,"Executor ID":"8","Host":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Locality":"PROCESS_LOCAL","Speculative":true,"Getting Result Time":0,"Finish Time":1437767844387,"Failed":false,"Accumulables":[]}}
> …
> {"Event":"SparkListenerTaskEnd","Stage ID":0,"Stage Attempt ID":2,"Task Type":"ShuffleMapTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":10244,"Index":55,"Attempt":1,"Launch Time":1437767843724,"Executor ID":"8","Host":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Locality":"PROCESS_LOCAL","Speculative":true,"Getting Result Time":0,"Finish Time":1437767844387,"Failed":false,"Accumulables":[]},"Task Metrics":{"Host Name":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Executor Deserialize Time":63,"Executor Run Time":579,"Result Size":2235,"JVM GC Time":0,"Result Serialization Time":1,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Write Metrics":{"Shuffle Bytes Written":2736,"Shuffle Write Time":1388809,"Shuffle Records Written":100},"Input Metrics":{"Data Read Method":"Network","Bytes Read":636000,"Records Read":100000}}}
> {code}
> The {{TaskStart}} event for task 10244 listed it (correctly) as coming from
> stage 0, attempt 1, but the {{TaskEnd}} shows it as part of stage 0, attempt
> 2.
> I'm pretty sure this is due to
> [this line|https://github.com/apache/spark/blob/1efe97dc9ed31e3b8727b81be633b7e96dd3cd34/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L930]
> in the DAGScheduler, which fills in the latest attempt ID for the task's
> stage, instead of the attempt that the task actually belongs to.
> I know there's a lot of flux right now around concurrent stage attempts and
> attempt-id-tracking, but this seems trivial to fix independent of that so
> I'll send a PR momentarily.
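
To check whether an event log is affected by the mismatch described in the issue, something like the following could be used. This is only a minimal sketch and not part of Spark: it assumes the log holds one JSON object per line with the field names shown in the excerpt ("Event", "Stage Attempt ID", "Task Info" / "Task ID"), and `find_attempt_mismatches` is a hypothetical helper name. The sample events are trimmed to the fields the check needs.

```python
import json


def find_attempt_mismatches(lines):
    """Return (task_id, start_attempt, end_attempt) for every task whose
    TaskEnd event carries a different stage attempt ID than its TaskStart.

    Hypothetical helper; field names are taken from the event log excerpt.
    """
    started = {}      # task ID -> stage attempt ID seen in TaskStart
    mismatches = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except ValueError:
            continue  # not JSON, e.g. an elided "…" line
        if not isinstance(event, dict):
            continue
        kind = event.get("Event")
        if kind not in ("SparkListenerTaskStart", "SparkListenerTaskEnd"):
            continue
        task_id = event["Task Info"]["Task ID"]
        attempt = event["Stage Attempt ID"]
        if kind == "SparkListenerTaskStart":
            started[task_id] = attempt
        elif task_id in started and started[task_id] != attempt:
            mismatches.append((task_id, started[task_id], attempt))
    return mismatches


# Trimmed versions of the two events quoted in the issue.
sample = [
    '{"Event":"SparkListenerTaskStart","Stage ID":0,"Stage Attempt ID":1,'
    '"Task Info":{"Task ID":10244,"Index":55,"Attempt":1}}',
    '…',
    '{"Event":"SparkListenerTaskEnd","Stage ID":0,"Stage Attempt ID":2,'
    '"Task Info":{"Task ID":10244,"Index":55,"Attempt":1}}',
]

print(find_attempt_mismatches(sample))  # [(10244, 1, 2)]
```

For a real log, the lines of the file can be passed in directly, e.g. `find_attempt_mismatches(open(path))` with `path` pointing at the application's event log. An empty result means every TaskEnd agreed with its TaskStart.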