Sorry, to clarify: there are two issues here: (1) attemptId has different meanings in different parts of the codebase, and (2) we currently don't propagate the 0-based per-task attempt identifier to the executors.
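To make (2) concrete, here is a rough sketch of what user code could look like once the per-task attempt counter is available on the executors. The TaskContext.get() and attemptNumber accessors below are assumptions about how the API might end up looking, not what exists today:

    // Sketch only: assumes the per-task attempt counter gets exposed on
    // TaskContext. `attemptNumber` is a hypothetical name, not the current API.
    import org.apache.spark.{SparkConf, SparkContext, TaskContext}

    object AttemptNumberDemo {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("attempt-number-demo").setMaster("local[2]"))

        // Each record reports which partition and which attempt processed it.
        // Today TaskContext.attemptId returns the context-unique taskId, so a
        // re-run of the same partition shows an unrelated value; with (2),
        // attemptNumber would start at 0 and increase only when the same task
        // is re-attempted.
        val tagged = sc.parallelize(1 to 4, 2).map { x =>
          val ctx = TaskContext.get()                 // assumed accessor for the running task's context
          (x, ctx.partitionId(), ctx.attemptNumber()) // attemptNumber is the proposed addition
        }.collect()

        tagged.foreach(println)
        sc.stop()
      }
    }
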
(1) should definitely be fixed. It sounds like Yin's original email was requesting that we add (2).

On Mon, Oct 20, 2014 at 1:45 PM, Kay Ousterhout <k...@eecs.berkeley.edu> wrote:

> Are you guys sure this is a bug? In the task scheduler, we keep two
> identifiers for each task: the "index", which uniquely identifies the
> computation+partition, and the "taskId", which is unique across all tasks
> for that Spark context (see
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L439).
> If multiple attempts of one task are run, they will have the same index
> but different taskIds. Historically, we have used "taskId" and
> "taskAttemptId" interchangeably (which arose from naming in Mesos, which
> uses similar naming).
>
> This was complicated when Mr. Xin added the "attempt" field to TaskInfo,
> which we show in the UI. This field uniquely identifies attempts for a
> particular task, but is not unique across different task indexes (it
> always starts at 0 for a given task). I'm guessing the right fix is to
> rename Task.taskAttemptId to Task.taskId to resolve this inconsistency --
> does that sound right to you, Reynold?
>
> -Kay
>
> On Mon, Oct 20, 2014 at 1:29 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>
>> There is a deeper issue here, which is that AFAIK we don't even store a
>> notion of attempt inside of Spark; we just use a new taskId with the
>> same index.
>>
>> On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>>
>> > Yeah, it seems we need to pass the attempt id to executors through
>> > TaskDescription. I have created
>> > https://issues.apache.org/jira/browse/SPARK-4014.
>> >
>> > On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin <r...@databricks.com> wrote:
>> >
>> >> I also ran into this earlier. It is a bug. Do you want to file a jira?
>> >>
>> >> I think part of the problem is that we don't actually have the attempt
>> >> id on the executors. If we do, that's great. If not, we'd need to
>> >> propagate that over.
>> >>
>> >> On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai <huaiyin....@gmail.com> wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> Is there any way to get the attempt number in a closure? It seems
>> >>> TaskContext.attemptId actually returns the taskId of a task (see
>> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L181
>> >>> and
>> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Task.scala#L47).
>> >>> It looks like a bug.
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Yin