[jira] [Resolved] (SPARK-9366) TaskEnd event emitted for task has different stage attempt ID than TaskStart for same task

Imran Rashid (JIRA) Mon, 27 Jul 2015 10:56:01 -0700

     [ https://issues.apache.org/jira/browse/SPARK-9366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Imran Rashid resolved SPARK-9366.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Issue resolved by pull request 7681
[https://github.com/apache/spark/pull/7681]

> TaskEnd event emitted for task has different stage attempt ID than TaskStart 
> for same task
> ------------------------------------------------------------------------------------------
>
>                 Key: SPARK-9366
>                 URL: https://issues.apache.org/jira/browse/SPARK-9366
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>    Affects Versions: 1.4.1
>            Reporter: Ryan Williams
>             Fix For: 1.5.0
>
>
> During a simple job I ran yesterday, I observed the following in the event 
> log:
> {code}
> {"Event":"SparkListenerTaskStart","Stage ID":0,"Stage Attempt ID":1,"Task Info":{"Task ID":10244,"Index":55,"Attempt":1,"Launch Time":1437767843724,"Executor ID":"8","Host":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Locality":"PROCESS_LOCAL","Speculative":true,"Getting Result Time":0,"Finish Time":1437767844387,"Failed":false,"Accumulables":[]}}
> …
> {"Event":"SparkListenerTaskEnd","Stage ID":0,"Stage Attempt ID":2,"Task Type":"ShuffleMapTask","Task End Reason":{"Reason":"Success"},"Task Info":{"Task ID":10244,"Index":55,"Attempt":1,"Launch Time":1437767843724,"Executor ID":"8","Host":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Locality":"PROCESS_LOCAL","Speculative":true,"Getting Result Time":0,"Finish Time":1437767844387,"Failed":false,"Accumulables":[]},"Task Metrics":{"Host Name":"demeter-csmaz10-6.demeter.hpc.mssm.edu","Executor Deserialize Time":63,"Executor Run Time":579,"Result Size":2235,"JVM GC Time":0,"Result Serialization Time":1,"Memory Bytes Spilled":0,"Disk Bytes Spilled":0,"Shuffle Write Metrics":{"Shuffle Bytes Written":2736,"Shuffle Write Time":1388809,"Shuffle Records Written":100},"Input Metrics":{"Data Read Method":"Network","Bytes Read":636000,"Records Read":100000}}}
> {code}
> The {{TaskStart}} event for task 10244 listed it (correctly) as coming from 
> stage 0, attempt 1, but the {{TaskEnd}} shows it as part of stage 0, attempt 
> 2.
> I'm pretty sure this is due to [this 
> line|https://github.com/apache/spark/blob/1efe97dc9ed31e3b8727b81be633b7e96dd3cd34/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L930]
>  in the DAGScheduler, which fills in the latest attempt ID for the task's 
> stage, instead of the attempt that the task actually belongs to.
> I know there's a lot of flux right now around concurrent stage attempts and 
> attempt-id-tracking, but this seems trivial to fix independent of that so 
> I'll send a PR momentarily.
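
The mismatch described above can be checked for mechanically in an event log. The following is a minimal Python sketch, not part of Spark or of the fix in the pull request. It assumes the log holds one JSON event per line, that "Task ID" is unique within the log, and that only the "Event", "Stage Attempt ID" and "Task Info" fields shown in the excerpt are needed. The function name find_attempt_mismatches and the abbreviated sample events are hypothetical.

```python
import json
from typing import Dict, Iterable, List, Tuple


def find_attempt_mismatches(lines: Iterable[str]) -> List[Tuple[int, int, int]]:
    """Return (task_id, start_attempt, end_attempt) for every task whose
    TaskStart and TaskEnd events disagree on "Stage Attempt ID".

    Assumes one JSON event per line and Task IDs unique within the log.
    """
    start_attempt: Dict[int, int] = {}
    mismatches: List[Tuple[int, int, int]] = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON event
        kind = event.get("Event")
        if kind not in ("SparkListenerTaskStart", "SparkListenerTaskEnd"):
            continue
        task_id = event["Task Info"]["Task ID"]
        attempt = event["Stage Attempt ID"]
        if kind == "SparkListenerTaskStart":
            # Remember which stage attempt the task was launched under.
            start_attempt[task_id] = attempt
        elif task_id in start_attempt and start_attempt[task_id] != attempt:
            mismatches.append((task_id, start_attempt[task_id], attempt))
    return mismatches


# Abbreviated versions of the two events quoted in the report.
sample = [
    '{"Event":"SparkListenerTaskStart","Stage ID":0,"Stage Attempt ID":1,'
    '"Task Info":{"Task ID":10244}}',
    '{"Event":"SparkListenerTaskEnd","Stage ID":0,"Stage Attempt ID":2,'
    '"Task Info":{"Task ID":10244}}',
]
print(find_attempt_mismatches(sample))  # prints [(10244, 1, 2)]
```

To run it against a real log, pass an open file object instead of the sample list, since iterating a file yields one line at a time.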



