GitHub user darabos commented on the pull request:

https://github.com/apache/spark/pull/4533#issuecomment-73910087

Oh, thanks. I never looked into how `allowLocal` works. It looks like it results in local execution if the number of affected partitions is 1 (https://github.com/apache/spark/blob/v1.2.0/core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala#L749). So `take(n)` will always start with a local execution of partition 0, and then, if it decides it needs 10 more partitions, those partitions will be executed non-locally in parallel. Is that reading correct?
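
To make that reading concrete, here is a rough Python model of it. This is a sketch of the behavior described above, not Spark's code: `plan_take`, the `grow` rule for how many partitions to try next, and the `"local"`/`"cluster"` labels are all hypothetical, and the real scheduler may check more conditions than the partition count.

```python
def plan_take(partition_sizes, n, grow=lambda scanned: scanned * 4):
    """Model the rounds a take(n)-style scan would run.

    partition_sizes: number of rows in each partition, in order.
    grow: hypothetical rule for how many partitions to try in later rounds.
    Returns a list of (partition_indices, mode) tuples, one per round.
    """
    rounds = []
    collected = 0          # rows gathered so far
    scanned = 0            # partitions already visited
    total = len(partition_sizes)

    while collected < n and scanned < total:
        # The first round always looks at a single partition (partition 0).
        to_try = 1 if scanned == 0 else max(1, grow(scanned))
        batch = list(range(scanned, min(scanned + to_try, total)))

        # Assumed rule from the comment above: exactly one affected
        # partition means local execution, otherwise a normal parallel job.
        mode = "local" if len(batch) == 1 else "cluster"
        rounds.append((batch, mode))

        for i in batch:
            collected += min(partition_sizes[i], n - collected)
        scanned += len(batch)

    return rounds


if __name__ == "__main__":
    # Partition 0 alone is not enough, so a second, parallel round follows.
    for batch, mode in plan_take([2, 3, 3, 3, 3], 5):
        print(mode, batch)
```

If partition 0 already holds `n` rows, the model stops after the single local round, which matches the "always start with partition 0" part of the reading.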