Xiaoju Wu created SPARK-31872:
-
Summary: NotNullSafe to get complementary set
Key: SPARK-31872
URL: https://issues.apache.org/jira/browse/SPARK-31872
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-30443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068290#comment-17068290
]
Xiaoju Wu edited comment on SPARK-30443 at 3/27/20, 5:50 AM:
-
Also see this
[
https://issues.apache.org/jira/browse/SPARK-30443?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17068290#comment-17068290
]
Xiaoju Wu commented on SPARK-30443:
---
I also see this kind of warning log. SPARK-21492 may be related to
[
https://issues.apache.org/jira/browse/SPARK-31069?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoju Wu updated SPARK-31069:
--
Description:
"shuffle-chunk-fetch-handler-2-40" #250 daemon prio=5 os_prio=0
tid=0x02ac
Xiaoju Wu created SPARK-31069:
-
Summary: high cpu caused by chunksBeingTransferred in external
shuffle service
Key: SPARK-31069
URL: https://issues.apache.org/jira/browse/SPARK-31069
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-23811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17006539#comment-17006539
]
Xiaoju Wu commented on SPARK-23811:
---
[~XuanYuan] The issue seems to still exist after patch #17955, any
Xiaoju Wu created SPARK-30298:
-
Summary: bucket join cannot work for self-join with views
Key: SPARK-30298
URL: https://issues.apache.org/jira/browse/SPARK-30298
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-30072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16995627#comment-16995627
]
Xiaoju Wu commented on SPARK-30072:
---
[~cloud_fan] If the sql looks like:
SELECT * FROM df2 WHERE
[
https://issues.apache.org/jira/browse/SPARK-30072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991554#comment-16991554
]
Xiaoju Wu commented on SPARK-30072:
---
[~afroozeh] I think the change from checking if queryExecution
Xiaoju Wu created SPARK-30186:
-
Summary: support Dynamic Partition Pruning in Adaptive Execution
Key: SPARK-30186
URL: https://issues.apache.org/jira/browse/SPARK-30186
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-27290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16862698#comment-16862698
]
Xiaoju Wu commented on SPARK-27290:
---
[~joshrosen] Got it. I think we should identify in which patterns
Xiaoju Wu created SPARK-27431:
-
Summary: move HashedRelation to global UnifiedMemoryManager and
enable offheap
Key: SPARK-27431
URL: https://issues.apache.org/jira/browse/SPARK-27431
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-27290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16803737#comment-16803737
]
Xiaoju Wu commented on SPARK-27290:
---
[~ekoifman] HashAggregate cannot benefit from sorted input but
Xiaoju Wu created SPARK-27290:
-
Summary: remove unneed sort under Aggregate
Key: SPARK-27290
URL: https://issues.apache.org/jira/browse/SPARK-27290
Project: Spark
Issue Type: Improvement
[
https://issues.apache.org/jira/browse/SPARK-21492?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794962#comment-16794962
]
Xiaoju Wu commented on SPARK-21492:
---
Any updates? Do you have any discussion on the general fix
[
https://issues.apache.org/jira/browse/SPARK-25837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794932#comment-16794932
]
Xiaoju Wu commented on SPARK-25837:
---
Did you verify this fix with the reproduction case above? I tried
[
https://issues.apache.org/jira/browse/SPARK-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794805#comment-16794805
]
Xiaoju Wu commented on SPARK-23375:
---
But one of your test cases conflicts with what I talked about
[
https://issues.apache.org/jira/browse/SPARK-23375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16794791#comment-16794791
]
Xiaoju Wu commented on SPARK-23375:
---
I think there's another case in which sort is redundant:
Sort
Xiaoju Wu created SPARK-26779:
-
Summary: NullPointerException when disable wholestage codegen
Key: SPARK-26779
URL: https://issues.apache.org/jira/browse/SPARK-26779
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16628184#comment-16628184
]
Xiaoju Wu commented on SPARK-23839:
---
[~smilegator] Is there any plan on the cost-based optimizer?
>
Xiaoju Wu created SPARK-24088:
-
Summary: only HadoopRDD leverage HDFS Cache as preferred location
Key: SPARK-24088
URL: https://issues.apache.org/jira/browse/SPARK-24088
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1648#comment-1648
]
Xiaoju Wu edited comment on SPARK-23839 at 4/2/18 3:05 PM:
---
Yes, bucketing is
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1648#comment-1648
]
Xiaoju Wu commented on SPARK-23839:
---
Yes, bucketing is one of the cases to say that the cost of
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422116#comment-16422116
]
Xiaoju Wu commented on SPARK-23839:
---
[~maropu] My concern is, "bucket join always firstly" doesn't mean
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoju Wu updated SPARK-23839:
--
Description:
Since Spark 2.2, the cost-based JoinReorder rule has been implemented, and in Spark
2.3
[
https://issues.apache.org/jira/browse/SPARK-23839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421613#comment-16421613
]
Xiaoju Wu commented on SPARK-23839:
---
If there is already any discussion or ticket related to this topic, please let
Xiaoju Wu created SPARK-23839:
-
Summary: consider bucket join in cost-based JoinReorder rule
Key: SPARK-23839
URL: https://issues.apache.org/jira/browse/SPARK-23839
Project: Spark
Issue Type:
[
https://issues.apache.org/jira/browse/SPARK-17570?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16410752#comment-16410752
]
Xiaoju Wu commented on SPARK-17570:
---
[~tejasp] When you join 3 tables with bucket numbers 4, 8, and 12, if
[
https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoju Wu updated SPARK-17495:
--
Comment: was deleted
(was: [~tejasp] I can see HiveHash merged but never used. Seems the using of
[
https://issues.apache.org/jira/browse/SPARK-17495?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16387873#comment-16387873
]
Xiaoju Wu commented on SPARK-17495:
---
[~tejasp] I can see HiveHash merged but never used. Seems the
[
https://issues.apache.org/jira/browse/SPARK-22469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16385156#comment-16385156
]
Xiaoju Wu commented on SPARK-22469:
---
[~liutang123] Casting Decimal to Double can lose
[
https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoju Wu resolved SPARK-23493.
---
Resolution: Not A Bug
> insert-into depends on columns order, otherwise incorrect data inserted
>
[
https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374266#comment-16374266
]
Xiaoju Wu commented on SPARK-23493:
---
If that's the case, it should throw an exception to tell the users
[
https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374172#comment-16374172
]
Xiaoju Wu commented on SPARK-23493:
---
[~mgaido] "Columns are matched in order while inserting" This is
[
https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374076#comment-16374076
]
Xiaoju Wu commented on SPARK-23493:
---
This issue is similar to the issue described in ticket
[
https://issues.apache.org/jira/browse/SPARK-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374073#comment-16374073
]
Xiaoju Wu commented on SPARK-9278:
--
Created a new ticket to track this issue: SPARK-23493
>
[
https://issues.apache.org/jira/browse/SPARK-23493?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Xiaoju Wu updated SPARK-23493:
--
Description:
insert-into only works when the partitionBy key columns are placed last:
val data = Seq(
Xiaoju Wu created SPARK-23493:
-
Summary: insert-into depends on columns order, otherwise incorrect
data inserted
Key: SPARK-23493
URL: https://issues.apache.org/jira/browse/SPARK-23493
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374040#comment-16374040
]
Xiaoju Wu edited comment on SPARK-9278 at 2/23/18 7:48 AM:
---
Seems the issue
[
https://issues.apache.org/jira/browse/SPARK-9278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16374040#comment-16374040
]
Xiaoju Wu commented on SPARK-9278:
--
It seems the issue still exists; here's the test:
val data = Seq(
(7,