[jira] [Comment Edited] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517532#comment-16517532 ] Wenbo Zhao edited comment on SPARK-24578 at 6/19/18 8:55 PM: - woop,

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517532#comment-16517532 ] Wenbo Zhao commented on SPARK-24578: woop, [~attilapiros], sorry, I didn't see you have created a PR.

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517465#comment-16517465 ] Wenbo Zhao commented on SPARK-24578: [~attilapiros] if you don't mind, I could create a PR for it :)

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517458#comment-16517458 ] Wenbo Zhao commented on SPARK-24578: Hi [~vanzin].  the commit 

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517441#comment-16517441 ] Wenbo Zhao commented on SPARK-24578: [~irashid] Yes, that is exactly what I saw on our side.

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517440#comment-16517440 ] Wenbo Zhao commented on SPARK-24578: Hi [~attilapiros], I guess what you suggest is  {code:java}

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-19 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16517180#comment-16517180 ] Wenbo Zhao commented on SPARK-24578: After digging more details, this commit 

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516621#comment-16516621 ] Wenbo Zhao commented on SPARK-24578: Hi [~irashid], many thanks for clarifying my questions. I tried

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516609#comment-16516609 ] Wenbo Zhao commented on SPARK-24578: For now, we could reproduce this issue in completely different

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16516602#comment-16516602 ] Wenbo Zhao commented on SPARK-24578: [~irashid] We didn't touch 

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our

[jira] [Commented] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16515858#comment-16515858 ] Wenbo Zhao commented on SPARK-24578: A more easily reproducible cluster setting is 10 executors each

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following in some of our

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following {code:java} 18/06/15

[jira] [Updated] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24578?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24578: --- Description: After Spark 2.3, we observed lots of errors like the following   {code:java} 18/06/15

[jira] [Created] (SPARK-24578) Reading remote cache block behavior changes and causes timeout issue

2018-06-18 Thread Wenbo Zhao (JIRA)
Wenbo Zhao created SPARK-24578: -- Summary: Reading remote cache block behavior changes and causes timeout issue Key: SPARK-24578 URL: https://issues.apache.org/jira/browse/SPARK-24578 Project: Spark

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493569#comment-16493569 ] Wenbo Zhao commented on SPARK-24373: [~mgaido] Thanks. I didn't look at the comment carefully.

[jira] [Issue Comment Deleted] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Comment: was deleted (was: Same question as [~icexelloss]. Also, any plan to make your fix into a

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data when the analyzed plans are different after re-analyzing the plans

2018-05-29 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493545#comment-16493545 ] Wenbo Zhao commented on SPARK-24373: Same question as [~icexelloss]. Also, any plan to make your fix

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489820#comment-16489820 ] Wenbo Zhao commented on SPARK-24373: I guess we should use `planWithBarrier` in the

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489534#comment-16489534 ] Wenbo Zhao commented on SPARK-24373: It is not apparent to me that they are the same issue though

[jira] [Comment Edited] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489233#comment-16489233 ] Wenbo Zhao edited comment on SPARK-24373 at 5/24/18 3:37 PM: - I turned on the

[jira] [Commented] (SPARK-24373) "df.cache() df.count()" no longer eagerly caches data

2018-05-24 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489233#comment-16489233 ] Wenbo Zhao commented on SPARK-24373: I turned on the log trace of RuleExecutor and found that in my

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Description: Here is the code to reproduce in local mode {code:java} scala> val df = sc.range(1,

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Description: Here is the code to reproduce in local mode {code:java} scala> val df = sc.range(1,

[jira] [Updated] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache with UDF

2018-05-23 Thread Wenbo Zhao (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Wenbo Zhao updated SPARK-24373: --- Summary: Spark Dataset groupby.agg/count doesn't respect cache with UDF (was: Spark Dataset

[jira] [Created] (SPARK-24373) Spark Dataset groupby.agg/count doesn't respect cache

2018-05-23 Thread Wenbo Zhao (JIRA)
Wenbo Zhao created SPARK-24373: -- Summary: Spark Dataset groupby.agg/count doesn't respect cache Key: SPARK-24373 URL: https://issues.apache.org/jira/browse/SPARK-24373 Project: Spark Issue