Yes, it is for (2). I was confused because the doc of TaskContext.attemptId in release 1.1 <http://spark.apache.org/docs/1.1.0/api/scala/index.html#org.apache.spark.TaskContext> says it is "the number of attempts to execute this task". It seems the per-task attempt id used to populate the "attempt" field in the UI is maintained by TaskSetManager, and its value is assigned in resourceOffer.
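
For reference, here is a minimal sketch that shows the mismatch from inside a closure. It is a sketch only: it assumes a local master and uses the @DeveloperApi mapPartitionsWithContext from 1.1 to get at the live TaskContext.

    import org.apache.spark.{SparkConf, SparkContext, TaskContext}

    object AttemptIdProbe {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(
          new SparkConf().setAppName("attempt-id-probe").setMaster("local[2]"))
        // mapPartitionsWithContext hands us the live TaskContext per partition.
        val ids = sc.parallelize(1 to 4, 2).mapPartitionsWithContext {
          (ctx: TaskContext, iter: Iterator[Int]) =>
            Iterator((ctx.partitionId, ctx.attemptId))
        }.collect()
        // attemptId comes back as the cluster-wide TID (0, 1, 2, ...),
        // not the 0-based per-task attempt shown in the UI.
        ids.foreach { case (part, id) => println(s"partition=$part attemptId=$id") }
        sc.stop()
      }
    }
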
On Mon, Oct 20, 2014 at 4:56 PM, Reynold Xin <r...@databricks.com> wrote:

> Yes, as I understand it this is for (2).
>
> Imagine a use case in which I want to save some output. In order to make
> this atomic, the program uses part_[index]_[attempt].dat, and once it
> finishes writing, it renames this to part_[index].dat.
>
> Right now [attempt] is just the TID, which could show up like this
> (assuming this is not the first stage):
>
> part_0_1000
> part_1_1001
> part_0_1002 (some retry)
> ...
>
> This is fairly confusing. The natural thing to expect is
>
> part_0_0
> part_1_0
> part_0_1
> ...
>
>
> On Mon, Oct 20, 2014 at 1:47 PM, Kay Ousterhout <k...@eecs.berkeley.edu> wrote:
>
>> Sorry, to clarify: there are two issues here:
>>
>> (1) attemptId has different meanings in the codebase
>> (2) we currently don't propagate the 0-based per-task attempt identifier
>> to the executors
>>
>> (1) should definitely be fixed. It sounds like Yin's original email was
>> requesting that we add (2).
>>
>> On Mon, Oct 20, 2014 at 1:45 PM, Kay Ousterhout <k...@eecs.berkeley.edu> wrote:
>>
>>> Are you guys sure this is a bug? In the task scheduler, we keep two
>>> identifiers for each task: the "index", which uniquely identifies the
>>> computation+partition, and the "taskId", which is unique across all tasks
>>> for that Spark context (see
>>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L439).
>>> If multiple attempts of one task are run, they will have the same index
>>> but different taskIds. Historically, we have used "taskId" and
>>> "taskAttemptId" interchangeably (which arose from naming in Mesos, which
>>> uses similar naming).
>>>
>>> This was complicated when Mr. Xin added the "attempt" field to TaskInfo,
>>> which we show in the UI. This field uniquely identifies attempts for a
>>> particular task, but is not unique across different task indexes (it
>>> always starts at 0 for a given task). I'm guessing the right fix is to
>>> rename Task.taskAttemptId to Task.taskId to resolve this inconsistency --
>>> does that sound right to you, Reynold?
>>>
>>> -Kay
>>>
>>> On Mon, Oct 20, 2014 at 1:29 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>>>
>>>> There is a deeper issue here, which is that AFAIK we don't even store
>>>> a notion of attempt inside of Spark; we just use a new taskId with the
>>>> same index.
>>>>
>>>> On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>>>> > Yeah, it seems we need to pass the attempt id to executors through
>>>> > TaskDescription. I have created
>>>> > https://issues.apache.org/jira/browse/SPARK-4014.
>>>> >
>>>> > On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin <r...@databricks.com> wrote:
>>>> >
>>>> >> I also ran into this earlier. It is a bug. Do you want to file a
>>>> >> jira?
>>>> >>
>>>> >> I think part of the problem is that we don't actually have the
>>>> >> attempt id on the executors. If we do, that's great. If not, we'd
>>>> >> need to propagate it over.
>>>> >>
>>>> >> On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai <huaiyin....@gmail.com> wrote:
>>>> >>
>>>> >>> Hello,
>>>> >>>
>>>> >>> Is there any way to get the attempt number in a closure? It seems
>>>> >>> TaskContext.attemptId actually returns the taskId of a task (see this
>>>> >>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L181>
>>>> >>> and this
>>>> >>> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Task.scala#L47>).
>>>> >>> It looks like a bug.
>>>> >>>
>>>> >>> Thanks,
>>>> >>>
>>>> >>> Yin
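
P.S. To make Reynold's rename scheme concrete, here is a minimal sketch of the write-then-rename pattern. Note that attemptNumber here is hypothetical: it stands for the 0-based per-task attempt that SPARK-4014 proposes to propagate to executors through TaskDescription.

    import java.nio.file.{Files, Paths, StandardCopyOption}

    // Write a partition's output under a temp name, then atomically rename
    // it to commit. `attemptNumber` is the hypothetical 0-based per-task
    // attempt; today only the TID is visible in a closure.
    def writePartitionAtomically(index: Int, attemptNumber: Int, data: Array[Byte]): Unit = {
      val tmp  = Paths.get(s"part_${index}_${attemptNumber}.dat") // e.g. part_0_1 for a retry
      val dest = Paths.get(s"part_$index.dat")
      Files.write(tmp, data)
      // The rename is the commit point: whichever attempt renames first
      // wins, and a failed attempt leaves behind only its own temp file.
      Files.move(tmp, dest, StandardCopyOption.ATOMIC_MOVE)
    }

With a 0-based attempt number the temp files come out as part_0_0, part_0_1, ... rather than part_0_1000, part_0_1002, which is exactly the naming Reynold describes above.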