[jira] [Commented] (SPARK-24906) Adaptively set split size for columnar file to ensure the task read data size fit expectation

2020-01-02 Thread Lior Chaga (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006691#comment-17006691 ] Lior Chaga commented on SPARK-24906: Looking at the PR, I see only using configurable estimations

[jira] [Commented] (SPARK-24906) Adaptively set split size for columnar file to ensure the task read data size fit expectation

2020-01-01 Thread Jason Guo (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006571#comment-17006571 ] Jason Guo commented on SPARK-24906: [~lio...@taboola.com] Yes, estimating with sampling would do

[jira] [Commented] (SPARK-24906) Adaptively set split size for columnar file to ensure the task read data size fit expectation

2020-01-01 Thread Lior Chaga (Jira)
[ https://issues.apache.org/jira/browse/SPARK-24906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006345#comment-17006345 ] Lior Chaga commented on SPARK-24906: [~habren] The suggested approach of using a multiplier will be