Josh Rosen created SPARK-17283:
----------------------------------
Summary: Cancel job in RDD.take() as soon as enough output is
received
Key: SPARK-17283
URL: https://issues.apache.org/jira/browse/SPARK-17283
Project: Spark
Issue Type: Improvement
Components: Spark Core
Reporter: Josh Rosen
Assignee: Josh Rosen
The current implementation of RDD.take() waits until all partitions of each job
have been computed before checking whether enough rows have been received. If
take() instead performed this check on the fly as individual partitions
completed, it could stop early, offering large speedups for certain
interactive queries.
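
The proposed check can be illustrated with a small standalone simulation. This is only a sketch of the idea, not Spark's code: it does not use Spark at all, and the names take_with_early_stop, completions, and in_order_prefix are hypothetical. It assumes partition results arrive as (partition_index, rows) pairs in completion order, which may differ from partition order, and that take(n) must return the first n rows in partition order. Under those assumptions the job can stop once the contiguous run of finished partitions starting at index 0 holds at least n rows.

```python
from typing import Dict, Iterable, List, Tuple


def in_order_prefix(finished: Dict[int, List[int]], limit: int) -> List[int]:
    """Concatenate rows of finished partitions 0, 1, 2, ... until a gap or `limit` rows."""
    rows: List[int] = []
    index = 0
    while index in finished and len(rows) < limit:
        rows.extend(finished[index])
        index += 1
    return rows


def take_with_early_stop(
    completions: Iterable[Tuple[int, List[int]]], n: int
) -> Tuple[List[int], int]:
    """Consume partition results as they complete and stop once n rows are known.

    Returns (rows, partitions_consumed). Hypothetical helper for illustration only.
    """
    if n <= 0:
        return [], 0
    finished: Dict[int, List[int]] = {}
    consumed = 0
    for index, rows in completions:
        finished[index] = rows
        consumed += 1
        prefix = in_order_prefix(finished, n)
        if len(prefix) >= n:
            # Enough output received: a real scheduler would cancel the
            # remaining tasks of the job at this point.
            return prefix[:n], consumed
    # Every partition completed without reaching n rows.
    return in_order_prefix(finished, n)[:n], consumed


# Four partitions of two rows each, finishing out of order.
events = [(1, [3, 4]), (0, [1, 2]), (3, [7, 8]), (2, [5, 6])]
rows, consumed = take_with_early_stop(iter(events), 3)
print(rows, consumed)
```

In this example the first three rows are known after only two of the four partitions have finished, so the remaining two never need to be awaited. The current behavior described above corresponds to always consuming every completion event before checking the row count.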