Github user CodingCat commented on the pull request:
https://github.com/apache/spark/pull/228#issuecomment-39203409
Hi @mridulm, I have modified the code according to your suggestion, thank you!
Hi @kayousterhout, I thought about your suggestion. I totally agree that the
DAGScheduler does not need to know so many details about the task, and some
code in the current implementation needs to be refactored. However, there are
some difficulties in doing this in the TM (TaskSetManager).
For handling duplicate accumulator operations, I think the TM is a good
candidate for handling task-level duplication, i.e. the speculative task case.
However, some duplication comes from a higher level (e.g. resubmitted stages).
The TaskScheduler creates a new TM for every submitted stage, whether it is a
new stage or a resubmitted one, and the TM itself doesn't know which of the
two it is serving.
Furthermore, to deduplicate the accumulation, we actually need the
dependency information, which is maintained by the DAGScheduler. Here is an
example (from the first test case I wrote in this patch):
Stage 0 depends on stage 1 and stage 2. We cannot apply the accumulator
updates from stage 1 and stage 2 even after they have finished; otherwise, if
stage 0 fails due to a FetchFailed for stage 2's result, stage 2 will be
resubmitted, and the accumulators in stage 2's tasks will then be counted
twice.
Instead, we can only compute the accumulator values of the tasks in a stage
once we are sure that the stage has finished successfully and is no longer
needed by any job (this is what I did in the current implementation).
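To make the idea concrete, here is a minimal sketch of deferring accumulator
updates until a stage is no longer needed. It is not the code in this patch
and does not use Spark's actual scheduler API: the class
`DeferredAccumulators` and its methods `taskFinished` and
`stageNoLongerNeeded` are hypothetical names, and keying the buffered updates
by partition is an assumption made for illustration.

```scala
import scala.collection.mutable

// Hypothetical sketch, not Spark's DAGScheduler: buffers accumulator updates
// per stage and only makes them visible once the stage is no longer needed.
class DeferredAccumulators {
  // Committed (user-visible) values, keyed by accumulator id.
  val committed: mutable.Map[Long, Long] =
    mutable.Map.empty[Long, Long].withDefaultValue(0L)

  // Buffered updates: stageId -> (partition -> (accumulatorId -> delta)).
  private val pending =
    mutable.Map.empty[Int, mutable.Map[Int, Map[Long, Long]]]

  // Record a finished task's updates without committing them.
  // Assumption: updates are keyed by partition, so a rerun of the same
  // partition (e.g. after the stage is resubmitted) overwrites the earlier
  // entry instead of adding to it.
  def taskFinished(stageId: Int, partition: Int, updates: Map[Long, Long]): Unit = {
    val perPartition =
      pending.getOrElseUpdate(stageId, mutable.Map.empty[Int, Map[Long, Long]])
    perPartition(partition) = updates
  }

  // Called once the caller knows that the stage finished successfully and
  // that no job still needs it. Commits the buffered updates exactly once;
  // a second call for the same stage finds nothing to commit.
  def stageNoLongerNeeded(stageId: Int): Unit = {
    pending.remove(stageId).foreach { perPartition =>
      perPartition.values.foreach { updates =>
        updates.foreach { case (accId, delta) =>
          committed(accId) = committed(accId) + delta
        }
      }
    }
  }
}
```

In the example above, the tasks of stage 2 would call `taskFinished` once on
the first run and again after the FetchFailed resubmission, but
`stageNoLongerNeeded(2)` would only be called after stage 0 succeeds, so each
partition's update is applied a single time.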
So I think accumulators are actually tied to dependency information, and we
really need the state maintained by the DAGScheduler to achieve this goal.
Any suggestions?