GitHub user markhamstra commented on the issue:
https://github.com/apache/spark/pull/13677
> One goal of this change is to make it clearer which functions may create
new stages (as opposed to looking up stages that already exist).
This is something that I have been looking at of late, and I know that @squito
has looked at it some, too. In short, I'm pretty confident that we're doing some
silliness around creating new stages instead of reusing already existing
stages, then recognizing that all the tasks for the "new" stages are already
completed (at least we're smart enough to reuse the map outputs), so the "new"
stages just become "skipped".
I'll take a closer look at this tomorrow, and may have a follow-on PR in
the not too distant future.
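
For readers following along, the difference between looking up an existing
stage and creating a new one can be sketched roughly as below. This is only an
illustration under stated assumptions, not Spark's DAGScheduler code: the
names `Stage`, `StageRegistry`, `create_new` and `get_or_create` are
hypothetical, and keying stages by a shuffle id is an assumption made for the
example.

```python
import itertools
from dataclasses import dataclass
from typing import Dict


@dataclass
class Stage:
    stage_id: int
    shuffle_id: int


class StageRegistry:
    """Hypothetical registry mapping a shuffle id to the stage that produces it."""

    def __init__(self) -> None:
        self._by_shuffle: Dict[int, Stage] = {}
        self._next_id = itertools.count()

    def create_new(self, shuffle_id: int) -> Stage:
        # Always allocates a fresh stage and never consults the registry,
        # so it can duplicate a stage that already exists for this shuffle.
        return Stage(next(self._next_id), shuffle_id)

    def get_or_create(self, shuffle_id: int) -> Stage:
        # Looks up an existing stage first and creates one only on a miss.
        stage = self._by_shuffle.get(shuffle_id)
        if stage is None:
            stage = self.create_new(shuffle_id)
            self._by_shuffle[shuffle_id] = stage
        return stage


registry = StageRegistry()
first = registry.get_or_create(shuffle_id=7)
second = registry.get_or_create(shuffle_id=7)  # reuses the stage made above
duplicate = registry.create_new(shuffle_id=7)  # a redundant "new" stage
```

In this sketch, `second` is the same object as `first`, while `duplicate` is a
separate stage for the same shuffle. That second case is the kind of redundant
stage that would later show up as "skipped" once its work is found to be done.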