Robert Kruszewski created SPARK-16984:
-----------------------------------------

             Summary: executeTake tries all partitions if first parition is 
empty
                 Key: SPARK-16984
                 URL: https://issues.apache.org/jira/browse/SPARK-16984
             Project: Spark
          Issue Type: Bug
    Affects Versions: 2.0.0
            Reporter: Robert Kruszewski


in executeTake if the number of rows returned by first partition is 0 we try 
all partitions next time. This can lead to pathological cases where your first 
partition is empty and rest have data. This unfortunately can happen with 
skewed data. Empirically observed it's better to make few roundtrips instead of 
potentially killing driver with big collect



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org

Reply via email to