[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495695#comment-16495695
]
Li Jin commented on SPARK-24373:
[~smilegator] Thank you for the suggestion.
> "df.cache() df.count()"
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16495691#comment-16495691
]
Xiao Li commented on SPARK-24373:
[~icexelloss] This is still possible since the query plans are
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493569#comment-16493569
]
Wenbo Zhao commented on SPARK-24373:
[~mgaido] Thanks. I didn't look at the comment carefully.
>
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493548#comment-16493548
]
Marco Gaido commented on SPARK-24373:
[~wbzhao] as I answered on the PR, the fix is complete and
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16493545#comment-16493545
]
Wenbo Zhao commented on SPARK-24373:
Same question as [~icexelloss]. Also, any plan to make your fix
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491309#comment-16491309
]
Li Jin commented on SPARK-24373:
[~smilegator] do you mean adding AnalysisBarrier to
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491304#comment-16491304
]
Xiao Li commented on SPARK-24373:
In the above example, each time we re-analyze the plan that is
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491271#comment-16491271
]
Marco Gaido commented on SPARK-24373:
[~smilegator] yes, you're right, the impact would be
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491224#comment-16491224
]
Xiao Li commented on SPARK-24373:
{code}
def count(): Long = withAction("count",
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491153#comment-16491153
]
Tomasz Gawęda commented on SPARK-24373:
[~LI,Xiao] That is a good idea :) Eager caching is
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491124#comment-16491124
]
Marco Gaido commented on SPARK-24373:
[~smilegator] I think an eager API is not related to the
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491088#comment-16491088
]
Xiao Li commented on SPARK-24373:
BTW, I plan to continue my work of
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491086#comment-16491086
]
Li Jin commented on SPARK-24373:
We use groupby() and pivot()
> "df.cache() df.count()" no longer
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16491081#comment-16491081
]
Xiao Li commented on SPARK-24373:
[~icexelloss] [~aweise] Are you also using the Dataset APIs groupBy(),
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490772#comment-16490772
]
Apache Spark commented on SPARK-24373:
User 'mgaido91' has created a pull request for this issue:
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16490551#comment-16490551
]
Marco Gaido commented on SPARK-24373:
[~wbzhao] yes, I do agree with you. That is the problem.
>
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489820#comment-16489820
]
Wenbo Zhao commented on SPARK-24373:
I guess we should use `planWithBarrier` in the
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489534#comment-16489534
]
Wenbo Zhao commented on SPARK-24373:
It is not apparent to me that they are the same issue, though
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489389#comment-16489389
]
Marcelo Vanzin commented on SPARK-24373:
This could be the same as SPARK-23309.
> "df.cache()
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489233#comment-16489233
]
Wenbo Zhao commented on SPARK-24373:
I turned on the log trace of RuleExecutor and found that in my
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489129#comment-16489129
]
Andreas Weise commented on SPARK-24373:
We are also facing increased runtime duration for our SQL
[
https://issues.apache.org/jira/browse/SPARK-24373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16489060#comment-16489060
]
Li Jin commented on SPARK-24373:
This is a reproduction in a unit test:
{code:java}
test("cache and count") {