Michael Armbrust created SPARK-2042:
---------------------------------------

             Summary: Take triggers unneeded shuffle.
                 Key: SPARK-2042
                 URL: https://issues.apache.org/jira/browse/SPARK-2042
             Project: Spark
          Issue Type: Improvement
          Components: SQL
    Affects Versions: 1.0.0
            Reporter: Michael Armbrust


This query really shouldn't trigger a shuffle:
{code}
sql("SELECT * FROM src LIMIT 10").take(5)
{code}

One fix would be to make the following changes:
 * Fix take to insert a logical limit and then collect()
 * Add a rule for collapsing adjacent limits
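
To illustrate the second proposed change, here is a minimal sketch of what collapsing adjacent limits could look like. It uses a toy logical plan rather than Spark's actual Catalyst classes: the names Plan, Scan, Limit, and CollapseLimits are hypothetical and exist only for this example. It also assumes that take(5) has already been rewritten into a logical limit placed on top of the query's own LIMIT 10, as the first change suggests.

```scala
// Toy logical plan for illustration only; these are NOT Spark's Catalyst classes.
sealed trait Plan
case class Scan(table: String) extends Plan
case class Limit(n: Int, child: Plan) extends Plan

object CollapseLimits {
  // Rewrite Limit(a, Limit(b, child)) into Limit(min(a, b), child),
  // repeating until no two limits are directly adjacent.
  def apply(plan: Plan): Plan = plan match {
    case Limit(outer, Limit(inner, child)) =>
      apply(Limit(math.min(outer, inner), child))
    case Limit(n, child) =>
      Limit(n, apply(child))
    case scan: Scan =>
      scan
  }
}

object Demo {
  def main(args: Array[String]): Unit = {
    // take(5) over "SELECT * FROM src LIMIT 10", modelled as a limit over a limit.
    val plan = Limit(5, Limit(10, Scan("src")))
    println(CollapseLimits(plan)) // prints Limit(5,Scan(src))
  }
}
```

With the two limits merged into one, the plan contains a single limit of 5 directly over the scan, so in principle nothing remains that would require a shuffle between them. How the real optimizer rule would be expressed is left open here.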