Github user squito commented on the issue:
https://github.com/apache/spark/pull/13603
@kayousterhout thanks for the feedback. Another round of updates. I
actually don't think SPARK-16106 is a blocker here: it doesn't affect the task
serializability tests. Those still work, and I added some extra asserts on
them. (The error was in another test, and I put in a workaround for it in any
case.)
In addition to your suggestions, I ended up making a couple of other
changes:
* make sure that we consistently check the blacklist with the `partitionId`,
not the index in the taskset (they are the same on initial stage submission,
but not on stage retry). Also updated the existing naming, since it used to say
`taskId`:
https://github.com/apache/spark/pull/13603/commits/6362b28468e752a61560bffdb2f383c69163ac36
* fixed `FakeTask` to have reasonable `partitionId`s (before, when you
called `FakeTask.createTaskSet(10)`, each task had `partitionId` = 0 but the
*stage* went from 0 to n):
https://github.com/apache/spark/pull/13603/commits/060dbfead7c5fe3ca57c1f2aaa3986455f34be09
* fix more leaked threads:
https://github.com/apache/spark/pull/13603/commits/fd48403e6c94cf220c7ed2ad86b7d0cd5bd05e94
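To illustrate the first point above, here is a minimal sketch of why the index in the taskset and the `partitionId` diverge on a stage retry. `SketchTask`, `taskSetFor`, and `isBlacklisted` are hypothetical stand-ins made up for this example; they are not Spark's actual classes or the code in this PR, and the blacklist is assumed to be a plain set of (executor, partition) pairs.

```scala
// Hypothetical stand-in for a task; not Spark's Task or FakeTask.
final case class SketchTask(stageId: Int, partitionId: Int)

object BlacklistKeySketch {
  // Build a task set containing one task per partition that still needs to run.
  // On a retry only the missing partitions are resubmitted, so a task's
  // position in the task set no longer matches its partition.
  def taskSetFor(stageId: Int, missingPartitions: Seq[Int]): IndexedSeq[SketchTask] =
    missingPartitions.map(p => SketchTask(stageId, p)).toIndexedSeq

  // Assumed blacklist shape: a set of (executorId, partitionId) pairs.
  def isBlacklisted(blacklist: Set[(String, Int)], executorId: String, partitionId: Int): Boolean =
    blacklist.contains((executorId, partitionId))

  def main(args: Array[String]): Unit = {
    // Initial submission: index == partitionId for every task.
    val firstAttempt = taskSetFor(stageId = 0, missingPartitions = 0 until 4)
    assert(firstAttempt.zipWithIndex.forall { case (t, i) => t.partitionId == i })

    // Retry: suppose only partitions 1 and 3 are resubmitted.
    val retry = taskSetFor(stageId = 0, missingPartitions = Seq(1, 3))
    val blacklist = Set(("exec-1", 3))

    val index = 1                      // position in the retried task set
    val task = retry(index)            // this task covers partition 3

    // Keyed by partitionId, the lookup finds the blacklisted pair.
    println(isBlacklisted(blacklist, "exec-1", task.partitionId))
    // Keyed by index, the lookup misses it, because index 1 != partition 3.
    println(isBlacklisted(blacklist, "exec-1", index))
  }
}
```

The two lookups agree on the first attempt and disagree on the retry, which is the inconsistency the first commit is meant to rule out.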
I'm also not sure what is going on in the [second half of the test for not
serializable tasks](
https://github.com/apache/spark/blob/master/core/src/test/scala/org/apache/spark/scheduler/TaskSchedulerImplSuite.scala#L123).
It's resubmitting the exact same task set as the previous attempt, which
never happens (it should have another attemptId at least), and submitting another
task set also with the same stageId and attemptId. I'm not sure what the
expected behavior is in this case. I think that test case should probably be
changed, but I've been playing enough whack-a-mole for now that I'm just going
to leave it.