[GitHub] [spark] xuanyuanking edited a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
xuanyuanking edited a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-656527166 Some analysis and summary for adding only two time-consuming `hive.execution` suites: Test | Worker | Scala test time | - | ---

[GitHub] [spark] xuanyuanking edited a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
xuanyuanking edited a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-656527166 Some analysis and summary for adding only two time-consuming `hive.execution` suites (HashAggregationQueryWithControlledFallbackSuite, HiveCompatibilitySuite): Test

[GitHub] [spark] xuanyuanking commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
xuanyuanking commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659332063 retest this please This is an automated message from the Apache Git Service. To respond to the message, ple

[GitHub] [spark] peter-toth commented on pull request #28885: [SPARK-29375][SPARK-28940][SPARK-32041][SQL] Whole plan exchange and subquery reuse

2020-07-16 Thread GitBox
peter-toth commented on pull request #28885: URL: https://github.com/apache/spark/pull/28885#issuecomment-659332636 I've added a few more comments to explain how the PR works. @cloud-fan, @maryannxue, @maropu, @viirya please let me know if you have any concerns, suggestions or you ar

[GitHub] [spark] AmplabJenkins commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659333863

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659333863

[GitHub] [spark] LantaoJin removed a comment on pull request #29021: [SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
LantaoJin removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659245767 I have now added another test case that is very similar to the user's case in the description. I think it's done. Could you review it when you have a chance, @cloud-fan? ---

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659337425

[GitHub] [spark] AmplabJenkins commented on pull request #29021: [SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659337425

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455703428 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2665,6 +2665,15 @@ object SQLConf { .checkValue(_ >

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455705846 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -60,6 +63,81 @@ case class BroadcastHashJo

[GitHub] [spark] AmplabJenkins commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659340770

[GitHub] [spark] ulysses-you commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox
ulysses-you commented on a change in pull request #28840: URL: https://github.com/apache/spark/pull/28840#discussion_r455706552 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala ## @@ -236,6 +236,46 @@ case class ShowFunctionsCommand(

[GitHub] [spark] AmplabJenkins commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #28840: URL: https://github.com/apache/spark/pull/28840#issuecomment-659340827

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659340770

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #28840: URL: https://github.com/apache/spark/pull/28840#issuecomment-659340827

[GitHub] [spark] maropu commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
maropu commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659340965 > Would that alleviate your concern about SPIP @maropu? Yea, the decision sounds reasonable to me.

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455707587 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -60,6 +63,81 @@ case class BroadcastHashJo

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455709347 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ## @@ -60,6 +63,81 @@ case class BroadcastHashJo

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455710758 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala ## @@ -62,21 +62,30 @@ trait HashJoin extends BaseJoinExec {

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455712732 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala ## @@ -415,6 +417,216 @@ abstract class BroadcastJo

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455713560 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/joins/BroadcastJoinSuite.scala ## @@ -415,6 +417,216 @@ abstract class BroadcastJo

[GitHub] [spark] maropu edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
maropu edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659340965

[GitHub] [spark] maropu edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
maropu edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659340965 > Would that alleviate your concern about SPIP @maropu? Yea, the decision that @HyukjinKwon suggested sounds reasonable to me. ---

[GitHub] [spark] maropu edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
maropu edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659340965 > Would that alleviate your concern about SPIP @maropu? Yea, the approach that you suggested sounds reasonable to me. --

[GitHub] [spark] maropu edited a comment on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
maropu edited a comment on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659340965 > Would that alleviate your concern about SPIP @maropu? Yea, the approach that you suggested sounds reasonable to me. Thanks for summing up, @HyukjinKwon. ---

[GitHub] [spark] cloud-fan commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r455716738 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/adaptive/AdaptiveQueryExecSuite.scala ## @@ -565,7 +565,7 @@ class AdaptiveQueryEx

[GitHub] [spark] cloud-fan commented on a change in pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #29079: URL: https://github.com/apache/spark/pull/29079#discussion_r455720125 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala ## @@ -2659,12 +2660,24 @@ object SQLConf { buildConf("spar

[GitHub] [spark] cloud-fan commented on a change in pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #29079: URL: https://github.com/apache/spark/pull/29079#discussion_r455723207 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInJoinSuite.scala ## @@ -19,17 +19,21 @@ package org.apac

[GitHub] [spark] maropu commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-16 Thread GitBox
maropu commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659356390 > We have the following defined in spark/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4. but it seems pivotClause and lateralView need to replace th

[GitHub] [spark] cloud-fan commented on a change in pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #29079: URL: https://github.com/apache/spark/pull/29079#discussion_r455724248 ## File path: sql/core/src/test/scala/org/apache/spark/sql/execution/bucketing/CoalesceBucketsInJoinSuite.scala ## @@ -103,46 +119,69 @@ class Coalesce

[GitHub] [spark] cloud-fan commented on a change in pull request #29107: [SPARK-32308][SQL] Move by-name resolution logic of unionByName from API code to analysis phase

2020-07-16 Thread GitBox
cloud-fan commented on a change in pull request #29107: URL: https://github.com/apache/spark/pull/29107#discussion_r455725949 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/TypeCoercion.scala ## @@ -1099,6 +1101,64 @@ object TypeCoercion {

[GitHub] [spark] maropu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-16 Thread GitBox
maropu commented on a change in pull request #29085: URL: https://github.com/apache/spark/pull/29085#discussion_r455726405 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/BaseScriptTransformationExec.scala ## @@ -87,17 +131,55 @@ trait BaseScriptTransforma

[GitHub] [spark] maropu commented on a change in pull request #29085: [SPARK-32106][SQL]Implement SparkScriptTransformationExec in sql/core

2020-07-16 Thread GitBox
maropu commented on a change in pull request #29085: URL: https://github.com/apache/spark/pull/29085#discussion_r455731070 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/BaseScriptTransformationSuite.scala ## @@ -0,0 +1,227 @@ +/* + * Licensed to the

[GitHub] [spark] HyukjinKwon commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-16 Thread GitBox
HyukjinKwon commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659363180 I just did a very quick test. It seems that if the function to serialize is big, it can benefit the most, by up to 1250%: ```python >>> from pyspark import cloudpickl

[GitHub] [spark] HyukjinKwon commented on pull request #29114: [SPARK-32094][PYTHON] Update cloudpickle to v1.5.0

2020-07-16 Thread GitBox
HyukjinKwon commented on pull request #29114: URL: https://github.com/apache/spark/pull/29114#issuecomment-659363377 @BryanCutler, @viirya, @ueshin can you take a look when you're available?

[GitHub] [spark] SparkQA commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread GitBox
SparkQA commented on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659364715 **[Test build #125949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125949/testReport)** for PR 29045 at commit [`c0f6209`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659203978 **[Test build #125949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125949/testReport)** for PR 29045 at commit [`c0f6209`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29125: [SPARK-32018][SQL][3.0] UnsafeRow.setDecimal should set null with overflowed value

2020-07-16 Thread GitBox
SparkQA commented on pull request #29125: URL: https://github.com/apache/spark/pull/29125#issuecomment-659365406 **[Test build #125965 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125965/testReport)** for PR 29125 at commit [`7268c05`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659365245

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659365245 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29045: [SPARK-32234][SQL] Spark sql commands are failing on selecting the orc tables

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29045: URL: https://github.com/apache/spark/pull/29045#issuecomment-659365254 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] SparkQA commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659366838 **[Test build #125952 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125952/testReport)** for PR 29101 at commit [`a051700`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659204292 **[Test build #125952 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125952/testReport)** for PR 29101 at commit [`a051700`](https://gi

[GitHub] [spark] SparkQA commented on pull request #28977: [WIP] Add all hive.execution suite in the parallel test group

2020-07-16 Thread GitBox
SparkQA commented on pull request #28977: URL: https://github.com/apache/spark/pull/28977#issuecomment-659367884 **[Test build #125966 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125966/testReport)** for PR 28977 at commit [`0556b4b`](https://github.com

[GitHub] [spark] AmplabJenkins commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659367703

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659367703 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659367711 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] kiszk commented on pull request #29067: [SPARK-32274][SQL] Make SQL cache serialization pluggable

2020-07-16 Thread GitBox
kiszk commented on pull request #29067: URL: https://github.com/apache/spark/pull/29067#issuecomment-659371202 I will also review it this weekend.

[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-16 Thread GitBox
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659371463 **[Test build #125956 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125956/testReport)** for PR 28904 at commit [`247a0a1`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659210212 **[Test build #125956 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125956/testReport)** for PR 28904 at commit [`247a0a1`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659372181

[GitHub] [spark] SparkQA commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
SparkQA commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659372384 **[Test build #125967 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125967/testReport)** for PR 29021 at commit [`d65a210`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink metadata log to avoid memory issue

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-659372181

[GitHub] [spark] GuoPhilipse commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-16 Thread GitBox
GuoPhilipse commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659373259 > > We have the following defined in spark/sql/catalyst/src/main/antlr4/org/apache/spark/sql/catalyst/parser/SqlBase.g4. but it seems pivotClause and lateralView need to rep

[GitHub] [spark] LantaoJin commented on pull request #29123: [SPARK-32283][CORE] Kryo should support multiple user registrators

2020-07-16 Thread GitBox
LantaoJin commented on pull request #29123: URL: https://github.com/apache/spark/pull/29123#issuecomment-659374776 cc @RotemShaul @vanzin @dongjoon-hyun

[GitHub] [spark] LantaoJin commented on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-16 Thread GitBox
LantaoJin commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659375039 retest this please

[GitHub] [spark] SparkQA commented on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
SparkQA commented on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-659375246 **[Test build #125955 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125955/testReport)** for PR 29079 at commit [`11d138b`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-659210167 **[Test build #125955 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125955/testReport)** for PR 29079 at commit [`11d138b`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659375719

[GitHub] [spark] AmplabJenkins commented on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659375532

[GitHub] [spark] LantaoJin commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-16 Thread GitBox
LantaoJin commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659375446 retest this please

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659375719

[GitHub] [spark] AmplabJenkins commented on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-659375946

[GitHub] [spark] SparkQA commented on pull request #29021: [WIP][SPARK-32201][SQL] More general skew join pattern matching

2020-07-16 Thread GitBox
SparkQA commented on pull request #29021: URL: https://github.com/apache/spark/pull/29021#issuecomment-659376170 **[Test build #125968 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125968/testReport)** for PR 29021 at commit [`6f7dbc2`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659375532

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-659375946 Merged build finished. Test FAILed.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29079: [SPARK-32286][SQL] Coalesce bucketed table for shuffled hash join if applicable

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29079: URL: https://github.com/apache/spark/pull/29079#issuecomment-659375958 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] SparkQA commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-16 Thread GitBox
SparkQA commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659377791 **[Test build #125958 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125958/testReport)** for PR 27366 at commit [`57524d6`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659217990 **[Test build #125958 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125958/testReport)** for PR 27366 at commit [`57524d6`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659378513

[GitHub] [spark] SparkQA commented on pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

2020-07-16 Thread GitBox
SparkQA commented on pull request #28840: URL: https://github.com/apache/spark/pull/28840#issuecomment-659378696 **[Test build #125969 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125969/testReport)** for PR 28840 at commit [`5d4c152`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659378513 Merged build finished. Test FAILed.

[GitHub] [spark] maropu commented on pull request #29126: [SPARK-32324][SQL]Fix error messages during using PIVOT and lateral view

2020-07-16 Thread GitBox
maropu commented on pull request #29126: URL: https://github.com/apache/spark/pull/29126#issuecomment-659378822 I haven't dug into it, but the second case is probably matched in `querySpecification`: https://github.com/apache/spark/blob/db47c6e340a63100d7c0e85abf237adc4e2174cc/sql/cata

[GitHub] [spark] HyukjinKwon commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
HyukjinKwon commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659379654 retest this please

[GitHub] [spark] AmplabJenkins commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659379389 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-659378521 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] SparkQA commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-16 Thread GitBox
SparkQA commented on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659379423 **[Test build #125959 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125959/testReport)** for PR 29064 at commit [`11f0fed`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659221768 **[Test build #125959 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125959/testReport)** for PR 29064 at commit [`11f0fed`](https://gi

[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
SparkQA commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659380293 **[Test build #125970 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125970/testReport)** for PR 29117 at commit [`fcb833c`](https://github.com

[GitHub] [spark] AmplabJenkins removed a comment on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659379389 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659380022 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659380022 Merged build finished. Test FAILed. This is an automated message from the Apache Git Service. To r

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29064: [SPARK-32272][SQL] Add SQL standard command SET TIME ZONE

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29064: URL: https://github.com/apache/spark/pull/29064#issuecomment-659380036 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/125

[GitHub] [spark] SparkQA commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659381818 **[Test build #125953 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125953/testReport)** for PR 29101 at commit [`16c78d6`](https://github.co

[GitHub] [spark] SparkQA removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659204337 **[Test build #125953 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125953/testReport)** for PR 29101 at commit [`16c78d6`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659382730 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #29120: [SPARK-32291][SQL] COALESCE should not reduce the child parallelism if it contains a Join

2020-07-16 Thread GitBox
SparkQA commented on pull request #29120: URL: https://github.com/apache/spark/pull/29120#issuecomment-659382906 **[Test build #125971 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125971/testReport)** for PR 29120 at commit [`e56f5d4`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-16 Thread GitBox
SparkQA commented on pull request #29130: URL: https://github.com/apache/spark/pull/29130#issuecomment-659383125 **[Test build #125951 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125951/testReport)** for PR 29130 at commit [`dface2a`](https://github.co

[GitHub] [spark] wankunde commented on pull request #28850: [SPARK-32015][Core]Remote inheritable thread local variables after spark context is stopped

2020-07-16 Thread GitBox
wankunde commented on pull request #28850: URL: https://github.com/apache/spark/pull/28850#issuecomment-659383289 @srowen @Ngone51 Thanks for all your help. I will remove the reference after spark context is stopped in my application.

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659382730 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] SparkQA commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659383270 **[Test build #125946 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125946/testReport)** for PR 29101 at commit [`005f6fe`](https://github.co

[GitHub] [spark] AmplabJenkins commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659383387 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659203362 **[Test build #125946 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125946/testReport)** for PR 29101 at commit [`005f6fe`](https://gi

[GitHub] [spark] AmplabJenkins commented on pull request #29101: [SPARK-32302][SQL] Partially push down disjunctive predicates through Join/Partitions

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29101: URL: https://github.com/apache/spark/pull/29101#issuecomment-659383911 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA removed a comment on pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-16 Thread GitBox
SparkQA removed a comment on pull request #29130: URL: https://github.com/apache/spark/pull/29130#issuecomment-659203872 **[Test build #125951 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125951/testReport)** for PR 29130 at commit [`dface2a`](https://gi

[GitHub] [spark] AmplabJenkins removed a comment on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
AmplabJenkins removed a comment on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659383387 This is an automated message from the Apache Git Service. To respond to the message, please log on

[GitHub] [spark] AmplabJenkins commented on pull request #29130: [SPARK-32330][SQL] Preserve shuffled hash join build side partitioning

2020-07-16 Thread GitBox
AmplabJenkins commented on pull request #29130: URL: https://github.com/apache/spark/pull/29130#issuecomment-659383806 This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHu

[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits

2020-07-16 Thread GitBox
SparkQA commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-659383996 **[Test build #125972 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125972/testReport)** for PR 28961 at commit [`3811ae9`](https://github.com

[GitHub] [spark] SparkQA commented on pull request #29117: [WIP] Debug flaky pip installation test failure

2020-07-16 Thread GitBox
SparkQA commented on pull request #29117: URL: https://github.com/apache/spark/pull/29117#issuecomment-659384112 **[Test build #125973 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/125973/testReport)** for PR 29117 at commit [`fcb833c`](https://github.com
