Sorry -- to clarify, there are two issues here:

(1) attemptId has different meanings in the codebase
(2) we currently don't propagate the 0-based per-task attempt identifier to
the executors.

(1) should definitely be fixed.  It sounds like Yin's original email was
requesting that we add (2).

On Mon, Oct 20, 2014 at 1:45 PM, Kay Ousterhout <k...@eecs.berkeley.edu> wrote:

> Are you guys sure this is a bug?  In the task scheduler, we keep two
> identifiers for each task: the "index", which uniquely identifies the
> computation+partition, and the "taskId", which is unique across all tasks
> for that Spark context (see
> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskSetManager.scala#L439).
> If multiple attempts of one task are run, they will have the same index
> but different taskIds.  Historically, we have used "taskId" and
> "taskAttemptId" interchangeably (a convention that arose from Mesos, which
> uses similar naming).
>
> This was complicated when Mr. Xin added the "attempt" field to TaskInfo,
> which we show in the UI.  This field uniquely identifies attempts for a
> particular task, but is not unique across different task indexes (it always
> starts at 0 for a given task).  I'm guessing the right fix is to rename
> Task.taskAttemptId to Task.taskId to resolve this inconsistency -- does
> that sound right to you, Reynold?
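>
> To make this concrete, here is a minimal, self-contained sketch
> (hypothetical names, not the actual scheduler code) of how the three
> identifiers relate:
>
>     // index: identifies the computation+partition; stable across retries.
>     // taskId: unique across all tasks in one SparkContext.
>     // attempt: 0-based count of launches of a given index.
>     case class TaskIds(index: Int, taskId: Long, attempt: Int)
>
>     class IdAllocator {
>       private var nextTaskId = 0L
>       private val attemptsByIndex = scala.collection.mutable.Map.empty[Int, Int]
>
>       def launch(index: Int): TaskIds = {
>         val attempt = attemptsByIndex.getOrElse(index, 0)
>         attemptsByIndex(index) = attempt + 1
>         val ids = TaskIds(index, nextTaskId, attempt)
>         nextTaskId += 1
>         ids
>       }
>     }
>
>     // Launching index 5 twice yields the same index, two different
>     // taskIds, and attempts 0 then 1 -- exactly the ambiguity above.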
>
> -Kay
>
> On Mon, Oct 20, 2014 at 1:29 PM, Patrick Wendell <pwend...@gmail.com> wrote:
>
>> There is a deeper issue here, which is that AFAIK we don't even store a
>> notion of attempt inside of Spark; we just use a new taskId with the
>> same index.
>>
>> On Mon, Oct 20, 2014 at 12:38 PM, Yin Huai <huaiyin....@gmail.com> wrote:
>> > Yeah, seems we need to pass the attempt id to executors through
>> > TaskDescription. I have created
>> > https://issues.apache.org/jira/browse/SPARK-4014.
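>> >
>> > Roughly the shape of the plumbing I have in mind (simplified; field
>> > names assumed, the real TaskDescription differs):
>> >
>> >     // Simplified sketch of a TaskDescription that also carries the
>> >     // 0-based per-task attempt number down to the executor.
>> >     case class TaskDescription(
>> >         taskId: Long,         // unique per SparkContext
>> >         attemptNumber: Int,   // assumed new field: 0-based per index
>> >         index: Int,
>> >         name: String,
>> >         serializedTask: java.nio.ByteBuffer)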
>> >
>> > On Mon, Oct 20, 2014 at 1:57 PM, Reynold Xin <r...@databricks.com> wrote:
>> >
>> >> I also ran into this earlier. It is a bug. Do you want to file a jira?
>> >>
>> >> I think part of the problem is that we don't actually have the attempt
>> >> id on the executors. If we do, that's great. If not, we'd need to
>> >> propagate that over.
>> >>
>> >> On Mon, Oct 20, 2014 at 7:17 AM, Yin Huai <huaiyin....@gmail.com> wrote:
>> >>
>> >>> Hello,
>> >>>
>> >>> Is there any way to get the attempt number in a closure? Seems
>> >>> TaskContext.attemptId actually returns the taskId of a task (see
>> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/executor/Executor.scala#L181
>> >>> and
>> >>> https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/Task.scala#L47).
>> >>> It looks like a bug.
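>> >>>
>> >>> For context, roughly what I was trying in the shell
>> >>> (mapPartitionsWithContext is just one way to get at the TaskContext):
>> >>>
>> >>>     // Expected attemptId to restart at 0 for each task's first
>> >>>     // attempt, but it behaves like the context-unique taskId.
>> >>>     sc.parallelize(1 to 100, 4).mapPartitionsWithContext {
>> >>>       (ctx, iter) => Iterator((ctx.partitionId, ctx.attemptId))
>> >>>     }.collect().foreach(println)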
>> >>>
>> >>> Thanks,
>> >>>
>> >>> Yin
>> >>>
>> >>
>> >>
>>