[jira] [Commented] (SPARK-13059) Sort inputsplits by size in HadoopRDD to avoid long tails

2016-02-04 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132200#comment-15132200 ] Sean Owen commented on SPARK-13059: The splits have a meaningful ordering for some input formats that

[jira] [Commented] (SPARK-13059) Sort inputsplits by size in HadoopRDD to avoid long tails

2016-02-01 Thread holdenk (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15127153#comment-15127153 ] holdenk commented on SPARK-13059: This sounds interesting - although having first() and take(1) still

[jira] [Commented] (SPARK-13059) Sort inputsplits by size in HadoopRDD to avoid long tails

2016-01-28 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15121044#comment-15121044 ] Sean Owen commented on SPARK-13059: I like the idea. Although I'm not 100% sure I think that wouldn't

[jira] [Commented] (SPARK-13059) Sort inputsplits by size in HadoopRDD to avoid long tails

2016-01-28 Thread Rajesh Balamohan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-13059?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15121214#comment-15121214 ] Rajesh Balamohan commented on SPARK-13059: Thanks [~srowen]. The same problem would exist even