Github user YanTangZhai commented on the pull request:
https://github.com/apache/spark/pull/3794#issuecomment-71405599
@JoshRosen I don't think just calling rdd.partitions on the final RDD
achieves our goal. Moreover, rdd.partitions is already called there:
470     // Check to make sure we are not launching a task on a partition that does not exist.
471     val maxPartitions = rdd.partitions.length
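For context, the surrounding check in DAGScheduler.submitJob (paraphrased
from roughly that version of the code, so treat the exact wording as
approximate) only validates the requested partition ids against the final
RDD, without touching its ancestors:

    val maxPartitions = rdd.partitions.length
    partitions.find(p => p >= maxPartitions || p < 0).foreach { p =>
      throw new IllegalArgumentException(
        "Attempting to access a non-existent partition: " + p + ". " +
        "Total number of partitions: " + maxPartitions)
    }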
However, that check does not cover some scenarios, such as the example I
contrived. To avoid the thread-safety issue, do you think we could use
another method to get the parent stages that does not mutate any global
map, or could we simply use a method like the getParentPartitions I
committed earlier to get the partitions directly?
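To illustrate the second option, here is a minimal sketch (my own
illustration with assumed names, not the actual code in this PR) of how a
helper could walk rdd.dependencies and collect each ancestor RDD's
partitions using only local state, so no shared DAGScheduler map is
mutated:

    import scala.collection.mutable
    import org.apache.spark.Partition
    import org.apache.spark.rdd.RDD

    // Illustrative only: collect the partitions of rdd and all of its
    // ancestors by walking the dependency graph with purely local state.
    def getAncestorPartitions(rdd: RDD[_]): Map[Int, Array[Partition]] = {
      val visited = mutable.HashSet[RDD[_]]()
      val result = mutable.HashMap[Int, Array[Partition]]()
      val waiting = mutable.Stack[RDD[_]](rdd)
      while (waiting.nonEmpty) {
        val r = waiting.pop()
        if (!visited(r)) {
          visited += r
          result(r.id) = r.partitions  // forces the partitions to be computed
          r.dependencies.foreach(dep => waiting.push(dep.rdd))
        }
      }
      result.toMap
    }

Since everything here is local to the call, it should be safe to invoke
without touching the scheduler's shared state, which is the property I am
after.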