[GitHub] [spark] dongjoon-hyun opened a new pull request #31912: [SPARK-34811][CORE] Redact fs.s3a.access.key like secret and token
dongjoon-hyun opened a new pull request #31912:
URL: https://github.com/apache/spark/pull/31912

### What changes were proposed in this pull request?

Just as we redact secrets and tokens, this PR aims to redact the access key (`fs.s3a.access.key`).

### Why are the changes needed?

### Does this PR introduce _any_ user-facing change?

### How was this patch tested?

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org
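For readers unfamiliar with key-based redaction, here is a minimal sketch of the idea, not the code from this PR. The object name `RedactionSketch`, the pattern, and the replacement text are all assumptions made for illustration; the point is only that a configuration value is hidden whenever its key matches a "sensitive" regex, so adding an access-key alternative to the pattern is enough to cover `fs.s3a.access.key`.

```scala
import scala.util.matching.Regex

object RedactionSketch {
  // Hypothetical pattern for illustration only; the real default lives in Spark's config.
  val sensitiveKey: Regex = "(?i)secret|password|token|access[.]key".r

  // Hypothetical placeholder that replaces a sensitive value.
  val Replacement: String = "*********(redacted)"

  // Replace the value of every entry whose key matches the sensitive pattern.
  def redact(conf: Seq[(String, String)]): Seq[(String, String)] =
    conf.map { case (key, value) =>
      if (sensitiveKey.findFirstIn(key).isDefined) (key, Replacement) else (key, value)
    }
}
```

With this sketch, an entry keyed `fs.s3a.access.key` is masked while unrelated keys such as `spark.app.name` pass through unchanged.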
[GitHub] [spark] SparkQA commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
SparkQA commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803522439 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40881/
[GitHub] [spark] SparkQA commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
SparkQA commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803521816 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40881/
[GitHub] [spark] tanelk commented on pull request #31907: [SPARK-34807][SQL] Move TransposeWindow before Operator push down
tanelk commented on pull request #31907:
URL: https://github.com/apache/spark/pull/31907#issuecomment-803521361

@wangyum, I'll answer your question (https://github.com/apache/spark/pull/31677#pullrequestreview-616975279) here.

Just to clarify: #31677 does not fix the issue you are trying to fix here, but the two are very similar. I improved `CollapseWindow`; `TransposeWindow` can be improved in a similar way to fix your issue. I tried a quick change:

```diff
diff --git a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
index 3e3550d5da..e629ccc268 100644
--- a/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
+++ b/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/Optimizer.scala
@@ -991,13 +991,24 @@ object TransposeWindow extends Rule[LogicalPlan] {
     })
   }
 
+  private def windowsCompatible(w1: Window, w2: Window): Boolean = {
+    w1.references.intersect(w2.windowOutputSet).isEmpty &&
+      w1.expressions.forall(_.deterministic) &&
+      w2.expressions.forall(_.deterministic) &&
+      compatiblePartitions(w1.partitionSpec, w2.partitionSpec)
+  }
+
   def apply(plan: LogicalPlan): LogicalPlan = plan transformUp {
-    case w1 @ Window(we1, ps1, os1, w2 @ Window(we2, ps2, os2, grandChild))
-        if w1.references.intersect(w2.windowOutputSet).isEmpty &&
-           w1.expressions.forall(_.deterministic) &&
-           w2.expressions.forall(_.deterministic) &&
-           compatiblePartitions(ps1, ps2) =>
-      Project(w1.output, Window(we2, ps2, os2, Window(we1, ps1, os1, grandChild)))
+    case w1 @ Window(_, _, _, w2 @ Window(_, _, _, grandChild))
+        if windowsCompatible(w1, w2) =>
+      Project(w1.output, w2.copy(child = w1.copy(child = grandChild)))
+
+    case w1 @ Window(_, _, _, Project(pl, w2 @ Window(_, _, _, grandChild)))
+      if windowsCompatible(w1, w2) && w1.references.subsetOf(grandChild.outputSet) =>
+      Project(
+        pl ++ w1.windowOutputSet,
+        w2.copy(child = w1.copy(child = grandChild))
+      )
   }
 }
```

It changes TPC-DS q47 and TPC-DS q57 in the same way your PR does, but I find this change to be more robust.
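To make the second pattern in the diff above easier to follow, here is a toy model of the rewrite, written as a sketch under loud assumptions: `Plan`, `Scan`, `Proj`, and `Win` are hypothetical stand-ins for Catalyst's plan nodes, `refs` is assumed to hold every column a window reads (including its partition columns), and `compatible` is a simplified stand-in for the `windowsCompatible` helper in the diff. It is not Spark's implementation.

```scala
object TransposeWindowSketch {
  // Hypothetical, heavily simplified plan nodes (not Catalyst classes).
  sealed trait Plan { def output: Seq[String] }
  case class Scan(output: Seq[String]) extends Plan
  case class Proj(output: Seq[String], child: Plan) extends Plan
  // `out` is the column the window adds; `refs` are all columns it reads.
  case class Win(out: String, refs: Set[String], partition: Seq[String], child: Plan)
      extends Plan {
    def output: Seq[String] = child.output :+ out
  }

  // Simplified stand-in for the compatibility check: the outer window must not
  // read the inner window's output, and its partition columns must be a
  // strict prefix (in any order) of the inner window's partition columns.
  def compatible(w1: Win, w2: Win): Boolean =
    !w1.refs.contains(w2.out) &&
      w1.partition.length < w2.partition.length &&
      w2.partition.take(w1.partition.length).toSet == w1.partition.toSet

  def transpose(plan: Plan): Plan = plan match {
    // Adjacent windows: swap them and restore the original column order.
    case w1 @ Win(_, _, _, w2 @ Win(_, _, _, grandChild)) if compatible(w1, w2) =>
      Proj(w1.output, w2.copy(child = w1.copy(child = grandChild)))
    // Windows separated by a projection: only safe when everything the outer
    // window reads is already produced by the grandchild.
    case w1 @ Win(_, _, _, Proj(pl, w2 @ Win(_, _, _, grandChild)))
        if compatible(w1, w2) && w1.refs.subsetOf(grandChild.output.toSet) =>
      Proj(pl :+ w1.out, w2.copy(child = w1.copy(child = grandChild)))
    case other => other
  }
}
```

The extra `subsetOf` guard is what keeps the projection case safe: if the projection had introduced a column that the outer window depends on, moving that window below the projection would leave it without its input.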
[GitHub] [spark] c21 commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
c21 commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803520810 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
AmplabJenkins removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803519860 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136291/
[GitHub] [spark] AmplabJenkins commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
AmplabJenkins commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803519860 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136291/
[GitHub] [spark] SparkQA removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
SparkQA removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803499228 **[Test build #136291 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136291/testReport)** for PR 31908 at commit [`7aa7e69`](https://github.com/apache/spark/commit/7aa7e69087460820a11ef4b0d4224ab8d463daa7).
[GitHub] [spark] SparkQA commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
SparkQA commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803519703 **[Test build #136291 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136291/testReport)** for PR 31908 at commit [`7aa7e69`](https://github.com/apache/spark/commit/7aa7e69087460820a11ef4b0d4224ab8d463daa7).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
SparkQA removed a comment on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803514644 **[Test build #136298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136298/testReport)** for PR 31911 at commit [`519ac59`](https://github.com/apache/spark/commit/519ac5915207e208ea32ceea76fd01088474ace1).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
AmplabJenkins removed a comment on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803518792
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
AmplabJenkins removed a comment on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803518594 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136294/
[GitHub] [spark] SparkQA removed a comment on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
SparkQA removed a comment on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803509661 **[Test build #136297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136297/testReport)** for PR 31878 at commit [`aeef7fb`](https://github.com/apache/spark/commit/aeef7fbaa73d152499d56e87a0046de88b392c2d).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
AmplabJenkins removed a comment on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803518595
[GitHub] [spark] AmplabJenkins commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
AmplabJenkins commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803519451 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136298/
[GitHub] [spark] SparkQA commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
SparkQA commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803519397 **[Test build #136298 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136298/testReport)** for PR 31911 at commit [`519ac59`](https://github.com/apache/spark/commit/519ac5915207e208ea32ceea76fd01088474ace1).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #31874: [SPARK-34708][SQL] Code-gen for left semi/anti broadcast nested loop join (build right side)
SparkQA commented on pull request #31874: URL: https://github.com/apache/spark/pull/31874#issuecomment-803519323 **[Test build #136300 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136300/testReport)** for PR 31874 at commit [`f2e846d`](https://github.com/apache/spark/commit/f2e846d873f1e062429a58ceb56cb1bf2aaab14d).
[GitHub] [spark] c21 commented on a change in pull request #31874: [SPARK-34708][SQL] Code-gen for left semi/anti broadcast nested loop join (build right side)
c21 commented on a change in pull request #31874:
URL: https://github.com/apache/spark/pull/31874#discussion_r598222928

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala ##
@@ -452,4 +457,50 @@ case class BroadcastNestedLoopJoinExec(
        |}
      """.stripMargin
   }
+
+  private def codegenLeftExistence(
+      ctx: CodegenContext,
+      input: Seq[ExprCode],
+      exists: Boolean): String = {
+    val (buildRowArray, buildRowArrayTerm) = prepareBroadcast(ctx)
+    val numOutput = metricTerm(ctx, "numOutputRows")
+
+    if (condition.isEmpty) {
+      if (buildRowArray.nonEmpty == exists) {
+        // Return streamed side if join condition is empty and
+        // 1. build side is non-empty for LeftSemi join
+        // or
+        // 2. build side is empty for LeftAnti join.
+        s"""
+           |$numOutput.add(1);
+           |${consume(ctx, input)}
+         """.stripMargin
+      } else {
+        // Return nothing if join condition is empty and
+        // 1. build side is empty for LeftSemi join
+        // or
+        // 2. build side is non-empty for LeftAnti join.
+        ""
+      }
+    } else {
+      val (buildRow, checkCondition, _) = getJoinCondition(ctx, input, streamed, broadcast)
+      val findMatchedRow = ctx.freshName("findMatchedRow")
+      val arrayIndex = ctx.freshName("arrayIndex")
+
+      s"""
+         |boolean $findMatchedRow = false;

Review comment:
sounds good, updated.

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala ##
@@ -395,8 +395,9 @@ case class BroadcastNestedLoopJoinExec(
     }
   }

-  override def supportCodegen: Boolean = {
-    joinType.isInstanceOf[InnerLike]
+  override def supportCodegen: Boolean = (joinType, buildSide) match {
+    case (_: InnerLike, _) | (LeftSemi, BuildRight) | (LeftAnti, BuildRight) => true

Review comment:
@cloud-fan - updated.
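The row-level behaviour that `codegenLeftExistence` is meant to generate can be sketched in plain Scala. This is an illustration under assumptions, not the generated code: `LeftExistenceSketch` and `leftExistence` are hypothetical names, rows are modelled as ordinary values, and SQL null semantics are ignored. The `exists` flag is `true` for a left semi join and `false` for a left anti join, mirroring the parameter in the diff.

```scala
object LeftExistenceSketch {
  // exists = true  -> LeftSemi: keep streamed rows that have a match on the build side.
  // exists = false -> LeftAnti: keep streamed rows that have no match on the build side.
  def leftExistence[L, R](
      streamed: Seq[L],
      build: Seq[R],
      condition: Option[(L, R) => Boolean],
      exists: Boolean): Seq[L] =
    condition match {
      case None =>
        // Without a join condition, any build row counts as a match, so the
        // answer depends only on whether the build side is empty.
        if (build.nonEmpty == exists) streamed else Seq.empty
      case Some(cond) =>
        // With a condition, scan the build side for each streamed row.
        streamed.filter(l => build.exists(r => cond(l, r)) == exists)
    }
}
```

The `None` branch corresponds to the two comment blocks in the diff: the whole streamed side is returned, or nothing is, and that decision is made once rather than per row.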
[GitHub] [spark] dongjoon-hyun commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
dongjoon-hyun commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803519024 Thank you so much, @attilapiros !
[GitHub] [spark] dongjoon-hyun commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
dongjoon-hyun commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803518979

The failure is irrelevant.
```
[info] - SPARK-34757: should ignore cache for SNAPSHOT dependencies *** FAILED *** (412 milliseconds)
[info]   0 equaled 0 (SparkSubmitUtilsSuite.scala:321)
```
[GitHub] [spark] AmplabJenkins commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
AmplabJenkins commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803518927 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136297/
[GitHub] [spark] SparkQA commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
SparkQA commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803518829 **[Test build #136299 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136299/testReport)** for PR 31909 at commit [`e34c5ae`](https://github.com/apache/spark/commit/e34c5aeac04ffc0d4f372472b863ee7401644187).
[GitHub] [spark] SparkQA commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
SparkQA commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803518828 **[Test build #136297 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136297/testReport)** for PR 31878 at commit [`aeef7fb`](https://github.com/apache/spark/commit/aeef7fbaa73d152499d56e87a0046de88b392c2d).
 * This patch **fails Spark unit tests**.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
AmplabJenkins commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803518792 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40880/
[GitHub] [spark] SparkQA commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
SparkQA commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803518785 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40880/
[GitHub] [spark] AmplabJenkins commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
AmplabJenkins commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803518595 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40879/
[GitHub] [spark] AmplabJenkins commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
AmplabJenkins commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803518594 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136294/
[GitHub] [spark] SparkQA commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
SparkQA commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803518349 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40880/
[GitHub] [spark] SparkQA commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
SparkQA commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803517820 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40879/
[GitHub] [spark] SparkQA commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
SparkQA commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803517127 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40879/
[GitHub] [spark] wangyum commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
wangyum commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803516777 Merged to master.
[GitHub] [spark] wangyum closed pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
wangyum closed pull request #31910: URL: https://github.com/apache/spark/pull/31910
[GitHub] [spark] HyukjinKwon commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
HyukjinKwon commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803515683 retest this please
[GitHub] [spark] HyukjinKwon commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
HyukjinKwon commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803515659 @bozhang2820 the test failure seems related to your PR https://github.com/apache/spark/commit/86ea5203201a1b7611f0beaa9c7759d480fb21af. Yeah I can also confirm that the test failure is not related to this PR
[GitHub] [spark] SparkQA removed a comment on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
SparkQA removed a comment on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803503898 **[Test build #136294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136294/testReport)** for PR 31910 at commit [`3f07b79`](https://github.com/apache/spark/commit/3f07b79dbafd1291dde88e346b45a775b5da1331).
[GitHub] [spark] SparkQA commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
SparkQA commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803515233 **[Test build #136294 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136294/testReport)** for PR 31910 at commit [`3f07b79`](https://github.com/apache/spark/commit/3f07b79dbafd1291dde88e346b45a775b5da1331).
 * This patch passes all tests.
 * This patch merges cleanly.
 * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
SparkQA commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803514644 **[Test build #136298 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136298/testReport)** for PR 31911 at commit [`519ac59`](https://github.com/apache/spark/commit/519ac5915207e208ea32ceea76fd01088474ace1).
[GitHub] [spark] c21 commented on pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
c21 commented on pull request #31911: URL: https://github.com/apache/spark/pull/31911#issuecomment-803514447 cc @maropu could you help take a look when you have time? Thanks. This PR has exactly the same change as https://github.com/apache/spark/pull/31892.
[GitHub] [spark] c21 opened a new pull request #31911: [SPARK-34796][SQL][3.1] Initialize counter variable for LIMIT code-gen in doProduce()
c21 opened a new pull request #31911: URL: https://github.com/apache/spark/pull/31911

### What changes were proposed in this pull request?

This PR fixes the LIMIT code-gen bug in https://issues.apache.org/jira/browse/SPARK-34796, where the counter variable from `BaseLimitExec` is used in code-gen without being initialized. The limit counter variable is used in upstream operators (LIMIT's child plan, e.g. the `ColumnarToRowExec` operator for early termination), but in the same stage some operators can take a shortcut and not call `BaseLimitExec`'s `doConsume()`, e.g. [HashJoin.codegenInner](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/HashJoin.scala#L402). So if we have a query with `LocalLimit - BroadcastHashJoin - FileScan` in the same stage, the whole-stage code-gen compilation will fail.

Here is an example:

```
test("failed limit query") {
  withTable("left_table", "empty_right_table", "output_table") {
    spark.range(5).toDF("k").write.saveAsTable("left_table")
    // The build side of the join is empty, which triggers the shortcut in the join code-gen.
    spark.range(0).toDF("k").write.saveAsTable("empty_right_table")

    // AQE must be disabled; with AQE enabled the join is eliminated before code-gen.
    withSQLConf(SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "false") {
      spark.sql("CREATE TABLE output_table (k INT) USING parquet")
      spark.sql(
        s"""
           |INSERT INTO TABLE output_table
           |SELECT t1.k FROM left_table t1
           |JOIN empty_right_table t2
           |ON t1.k = t2.k
           |LIMIT 3
           |""".stripMargin)
    }
  }
}
```

Query plan:

```
Execute InsertIntoHadoopFsRelationCommand file:/Users/chengsu/spark/sql/core/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/output_table, false, Parquet, Map(path -> file:/Users/chengsu/spark/sql/core/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/output_table), Append, CatalogTable(
Database: default
Table: output_table
Created Time: Thu Mar 18 21:46:26 PDT 2021
Last Access: UNKNOWN
Created By: Spark 3.2.0-SNAPSHOT
Type: MANAGED
Provider: parquet
Location: file:/Users/chengsu/spark/sql/core/spark-warehouse/org.apache.spark.sql.SQLQuerySuite/output_table
Schema: root
 |-- k: integer (nullable = true)
), org.apache.spark.sql.execution.datasources.InMemoryFileIndex@b25d08b, [k]
+- *(3) Project [ansi_cast(k#228L as int) AS k#231]
   +- *(3) GlobalLimit 3
      +- Exchange SinglePartition, ENSURE_REQUIREMENTS, [id=#179]
         +- *(2) LocalLimit 3
            +- *(2) Project [k#228L]
               +- *(2) BroadcastHashJoin [k#228L], [k#229L], Inner, BuildRight, false
                  :- *(2) Filter isnotnull(k#228L)
                  :  +- *(2) ColumnarToRow
                  :     +- FileScan parquet default.left_table[k#228L] Batched: true, DataFilters: [isnotnull(k#228L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/chengsu/spark/sql/core/spark-warehouse/org.apache.spark.sq..., PartitionFilters: [], PushedFilters: [IsNotNull(k)], ReadSchema: struct
                  +- BroadcastExchange HashedRelationBroadcastMode(List(input[0, bigint, false]),false), [id=#173]
                     +- *(1) Filter isnotnull(k#229L)
                        +- *(1) ColumnarToRow
                           +- FileScan parquet default.empty_right_table[k#229L] Batched: true, DataFilters: [isnotnull(k#229L)], Format: Parquet, Location: InMemoryFileIndex(1 paths)[file:/Users/chengsu/spark/sql/core/spark-warehouse/org.apache.spark.sq..., PartitionFilters: [], PushedFilters: [IsNotNull(k)], ReadSchema: struct
```

Codegen failure - https://gist.github.com/c21/ea760c75b546d903247582be656d9d66 . The uninitialized variable `_limit_counter_1` from `LocalLimitExec` is referenced in `ColumnarToRowExec`, but `BroadcastHashJoinExec` does not call `LocalLimitExec.doConsume()` to initialize the counter variable.

The fix is to move the counter variable initialization to `doProduce()`, because in the whole-stage code-gen framework `doProduce()` is always called if the upstream operators' `doProduce()`/`doConsume()` is called.

Note: this only happens when AQE is disabled, because with AQE enabled the optimization rule [EliminateUnnecessaryJoin](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/EliminateUnnecessaryJoin.scala#L69) changes the whole query to an empty `LocalRelation` if the broadcast side of the inner join is empty.

### Why are the changes needed?

Fix query failure.

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Added unit test in `SQLQuerySuite.scala`.
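The failure mode described in the PR above can be illustrated with a toy model. The sketch below is not Spark's code-gen API: `Ctx`, `Op`, `Scan`, `Join`, `Limit`, and `undeclaredVars` are hypothetical names invented for this illustration, and "declaring" a variable is reduced to adding its name to a set. It only assumes what the PR states: the scan references the limit counter for early termination, the join skips its parent's consume step when the build side is empty, and the limit declares its counter either in consume (old behavior) or in produce (the fix).

```scala
import scala.collection.mutable

// Toy model only: every name here is hypothetical, none of it is Spark's real API.
object LimitCounterSketch {
  // Tracks which variables the "generated code" declares and which it references.
  final class Ctx {
    val declared: mutable.Set[String] = mutable.Set.empty[String]
    val referenced: mutable.Set[String] = mutable.Set.empty[String]
    // Variables that are referenced but never declared would break compilation.
    def undeclared: Set[String] = referenced.diff(declared).toSet
  }

  sealed trait Op {
    var parent: Option[Op] = None
    // Limit counters contributed by ancestors, used by leaves for early termination.
    def limitVars: Seq[String] = parent.map(_.limitVars).getOrElse(Seq.empty)
    def produce(ctx: Ctx): Unit
    def consume(ctx: Ctx): Unit = parent.foreach(_.consume(ctx))
  }

  // Leaf operator: references the limit counters, then hands rows to its parent.
  final class Scan extends Op {
    def produce(ctx: Ctx): Unit = {
      ctx.referenced ++= limitVars
      consume(ctx)
    }
  }

  // Join: when the build side is empty it takes a shortcut and never calls the parent.
  final class Join(child: Op, buildSideEmpty: Boolean) extends Op {
    child.parent = Some(this)
    def produce(ctx: Ctx): Unit = child.produce(ctx)
    override def consume(ctx: Ctx): Unit =
      if (!buildSideEmpty) super.consume(ctx)
  }

  // Limit: declares its counter either in produce (the fix) or in consume (the bug).
  final class Limit(child: Op, initInProduce: Boolean) extends Op {
    private val counter = "_limit_counter"
    child.parent = Some(this)
    override def limitVars: Seq[String] = counter +: super.limitVars
    def produce(ctx: Ctx): Unit = {
      if (initInProduce) ctx.declared += counter
      child.produce(ctx)
    }
    override def consume(ctx: Ctx): Unit = {
      if (!initInProduce) ctx.declared += counter
      super.consume(ctx)
    }
  }

  // Builds Limit -> Join -> Scan and reports variables left undeclared.
  def undeclaredVars(initInProduce: Boolean, buildSideEmpty: Boolean): Set[String] = {
    val ctx = new Ctx
    new Limit(new Join(new Scan, buildSideEmpty), initInProduce).produce(ctx)
    ctx.undeclared
  }
}
```

In this model, `undeclaredVars(initInProduce = false, buildSideEmpty = true)` leaves the counter undeclared, while declaring it in `produce` does not depend on whether the join ever calls `consume`.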
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
AmplabJenkins removed a comment on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803512627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40878/
[GitHub] [spark] AmplabJenkins commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
AmplabJenkins commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803512627 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40878/
[GitHub] [spark] dongjoon-hyun commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
dongjoon-hyun commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803512564

The failure is irrelevant.

```
[info] - SPARK-34757: should ignore cache for SNAPSHOT dependencies *** FAILED *** (1 second, 254 milliseconds)
[info]   0 equaled 0 (SparkSubmitUtilsSuite.scala:321)
```
[GitHub] [spark] SparkQA commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
SparkQA commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803511979 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40878/
[GitHub] [spark] SparkQA commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
SparkQA commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803511376 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40878/
[GitHub] [spark] SparkQA commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
SparkQA commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803509661 **[Test build #136297 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136297/testReport)** for PR 31878 at commit [`aeef7fb`](https://github.com/apache/spark/commit/aeef7fbaa73d152499d56e87a0046de88b392c2d).
[GitHub] [spark] wangyum commented on pull request #31878: [SPARK-34784][BUILD] Upgrade Jackson to 2.12.2
wangyum commented on pull request #31878: URL: https://github.com/apache/spark/pull/31878#issuecomment-803509509 Jenkins, retest this please.
[GitHub] [spark] SparkQA removed a comment on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
SparkQA removed a comment on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803499257 **[Test build #136292 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136292/testReport)** for PR 31649 at commit [`145dba0`](https://github.com/apache/spark/commit/145dba00e7484d50304a16eebfb73006bd63855b).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins removed a comment on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803508278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136293/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
AmplabJenkins removed a comment on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803508565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40876/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
AmplabJenkins removed a comment on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803508319 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40877/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
AmplabJenkins removed a comment on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803508370 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136292/
[GitHub] [spark] SparkQA commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
SparkQA commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803508561 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40876/
[GitHub] [spark] AmplabJenkins commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
AmplabJenkins commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803508565 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40876/
[GitHub] [spark] AmplabJenkins commented on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
AmplabJenkins commented on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803508370 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136292/
[GitHub] [spark] SparkQA commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
SparkQA commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803508352 **[Test build #136296 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136296/testReport)** for PR 31886 at commit [`4185e73`](https://github.com/apache/spark/commit/4185e739a1823d3c1462790e98889dcdba57d39e).
[GitHub] [spark] AmplabJenkins commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803508342 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
SparkQA commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803508315 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40877/
[GitHub] [spark] AmplabJenkins commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
AmplabJenkins commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803508319 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40877/
[GitHub] [spark] AmplabJenkins commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803508278 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136293/
[GitHub] [spark] SparkQA commented on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
SparkQA commented on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803508239 **[Test build #136292 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136292/testReport)** for PR 31649 at commit [`145dba0`](https://github.com/apache/spark/commit/145dba00e7484d50304a16eebfb73006bd63855b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
SparkQA commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803507782 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40876/
[GitHub] [spark] SparkQA removed a comment on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
SparkQA removed a comment on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803499310 **[Test build #136293 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136293/testReport)** for PR 30679 at commit [`e1db9f4`](https://github.com/apache/spark/commit/e1db9f42fb0157666522ba7596f53e40b58047e4).
[GitHub] [spark] SparkQA commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
SparkQA commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803507704 **[Test build #136293 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136293/testReport)** for PR 30679 at commit [`e1db9f4`](https://github.com/apache/spark/commit/e1db9f42fb0157666522ba7596f53e40b58047e4). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
SparkQA commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803507614 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40877/
[GitHub] [spark] maropu commented on pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
maropu commented on pull request #31886: URL: https://github.com/apache/spark/pull/31886#issuecomment-803506269 NOTE: I'm checking if the result in the golden files is correct query-by-query.
[GitHub] [spark] maropu commented on a change in pull request #31886: [WIP][SPARK-34795][SQL][TEST] Adds a new job in GitHub Actions to check the output of TPC-DS queries
maropu commented on a change in pull request #31886: URL: https://github.com/apache/spark/pull/31886#discussion_r598214512

## File path: sql/core/src/test/resources/tpcds-query-results/v1_4/q1.sql.out ##

@@ -0,0 +1,125 @@
+-- Automatically generated by TPCDSQueryTestSuite
+
+-- !query
+WITH customer_total_return AS
+( SELECT
+sr_customer_sk AS ctr_customer_sk,
+sr_store_sk AS ctr_store_sk,
+sum(sr_return_amt) AS ctr_total_return
+ FROM store_returns, date_dim
+ WHERE sr_returned_date_sk = d_date_sk AND d_year = 2000
+ GROUP BY sr_customer_sk, sr_store_sk)
+SELECT c_customer_id
+FROM customer_total_return ctr1, store, customer
+WHERE ctr1.ctr_total_return >
+ (SELECT avg(ctr_total_return) * 1.2
+ FROM customer_total_return ctr2
+ WHERE ctr1.ctr_store_sk = ctr2.ctr_store_sk)
+ AND s_store_sk = ctr1.ctr_store_sk
+ AND s_state = 'TN'
+ AND ctr1.ctr_customer_sk = c_customer_sk
+ORDER BY c_customer_id
+LIMIT 100

Review comment: okay.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
AmplabJenkins removed a comment on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803504740 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136295/
[GitHub] [spark] SparkQA removed a comment on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
SparkQA removed a comment on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803503908 **[Test build #136295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136295/testReport)** for PR 31899 at commit [`34690ed`](https://github.com/apache/spark/commit/34690ed5e1cc54834e710031d5acb9483f23a8b8).
[GitHub] [spark] AmplabJenkins commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
AmplabJenkins commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803504740 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136295/
[GitHub] [spark] SparkQA commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
SparkQA commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803504714 **[Test build #136295 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136295/testReport)** for PR 31899 at commit [`34690ed`](https://github.com/apache/spark/commit/34690ed5e1cc54834e710031d5acb9483f23a8b8). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
AmplabJenkins removed a comment on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-802893284 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
AmplabJenkins removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803503801 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40873/
[GitHub] [spark] SparkQA commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
SparkQA commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803503908 **[Test build #136295 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136295/testReport)** for PR 31899 at commit [`34690ed`](https://github.com/apache/spark/commit/34690ed5e1cc54834e710031d5acb9483f23a8b8).
[GitHub] [spark] SparkQA commented on pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
SparkQA commented on pull request #31910: URL: https://github.com/apache/spark/pull/31910#issuecomment-803503898 **[Test build #136294 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136294/testReport)** for PR 31910 at commit [`3f07b79`](https://github.com/apache/spark/commit/3f07b79dbafd1291dde88e346b45a775b5da1331).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
AmplabJenkins removed a comment on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803503758 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40874/
[GitHub] [spark] maropu commented on pull request #30018: [SPARK-33122][SQL] Remove redundant aggregates in the Optimzier
maropu commented on pull request #30018: URL: https://github.com/apache/spark/pull/30018#issuecomment-803503800 Thanks for the reviews, @dongjoon-hyun ! Please open a new follow-up PR to address them, @tanelk .
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins removed a comment on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803503756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40875/
[GitHub] [spark] AmplabJenkins commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
AmplabJenkins commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803503801 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40873/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
AmplabJenkins removed a comment on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803503757 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136290/
[GitHub] [spark] SparkQA commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
SparkQA commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803503797 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40873/
[GitHub] [spark] AmplabJenkins commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803503756 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40875/
[GitHub] [spark] AmplabJenkins commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
AmplabJenkins commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803503757 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136290/
[GitHub] [spark] AmplabJenkins commented on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
AmplabJenkins commented on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803503758 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/40874/
[GitHub] [spark] maropu commented on pull request #31900: Update sql-ref-syntax-dml-insert-into.md
maropu commented on pull request #31900: URL: https://github.com/apache/spark/pull/31900#issuecomment-803503695 The fix itself looks fine to me. Please follow @HyukjinKwon's suggestion above.
[GitHub] [spark] SparkQA commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
SparkQA commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803503563 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40875/
[GitHub] [spark] maropu commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
maropu commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803503349 It seems the other DDL statements have the same syntax, e.g. `ALTER TABLE`. Could you fix them, too, in this PR?
[GitHub] [spark] maropu commented on pull request #31899: [SPARK-34525][DOCS] Update Spark Create Table DDL to reflect alternative key value notation
maropu commented on pull request #31899: URL: https://github.com/apache/spark/pull/31899#issuecomment-803503235 ok to test
[GitHub] [spark] SparkQA commented on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
SparkQA commented on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803503109 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40873/
[GitHub] [spark] SparkQA commented on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
SparkQA commented on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803503048 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40874/
[GitHub] [spark] SparkQA commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
SparkQA commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803502927 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40875/
[GitHub] [spark] SparkQA removed a comment on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
SparkQA removed a comment on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803493780 **[Test build #136290 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136290/testReport)** for PR 31909 at commit [`e34c5ae`](https://github.com/apache/spark/commit/e34c5aeac04ffc0d4f372472b863ee7401644187).
[GitHub] [spark] SparkQA commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
SparkQA commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803502542 **[Test build #136290 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136290/testReport)** for PR 31909 at commit [`e34c5ae`](https://github.com/apache/spark/commit/e34c5aeac04ffc0d4f372472b863ee7401644187). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on pull request #31649: [SPARK-34542][BUILD] Upgrade Parquet to 1.12.0
SparkQA commented on pull request #31649: URL: https://github.com/apache/spark/pull/31649#issuecomment-803502356 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/40874/
[GitHub] [spark] dongjoon-hyun commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
dongjoon-hyun commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803502131 Also, cc @mridulm since this is a behavior change.
[GitHub] [spark] dongjoon-hyun commented on pull request #31909: [SPARK-34809][CORE] Enable spark.hadoopRDD.ignoreEmptySplits by default
dongjoon-hyun commented on pull request #31909: URL: https://github.com/apache/spark/pull/31909#issuecomment-803501833 Thank you, @HyukjinKwon . Ya, I remember those commits. :)
[GitHub] [spark] williamhyun opened a new pull request #31910: [SPARK-34810][TEST] Update PostgreSQL test with the latest results.
williamhyun opened a new pull request #31910: URL: https://github.com/apache/spark/pull/31910

### What changes were proposed in this pull request?

This PR aims to update `PostgresIntegrationSuite` with the latest results.

### Why are the changes needed?

The latest PostgreSQL jar version is 42.2.19.
- https://jdbc.postgresql.org/documentation/changelog.html#version_42.2.19

Releases listed:
- 42.2.9 (2019-12-06)
- 42.2.10 (2020-01-30)
- 42.2.11 (2020-03-09)
- 42.2.12 (2020-03-31)
- 42.2.13 (2020-06-04)
- 42.2.14 (2020-06-10)
- 42.2.15 (2020-08-14)
- 42.2.16 (2020-08-20)
- 42.2.17 (2020-10-09)
- 42.2.18 (2020-10-15)
- 42.2.19 (2021-02-18)

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Pass the CI with the updated test cases.

```
build/sbt -Pdocker-integration-tests 'docker-integration-tests/testOnly org.apache.spark.sql.jdbc.PostgresIntegrationSuite'
```
[GitHub] [spark] HyukjinKwon commented on pull request #31900: Update sql-ref-syntax-dml-insert-into.md
HyukjinKwon commented on pull request #31900: URL: https://github.com/apache/spark/pull/31900#issuecomment-803500258 @robert4os, can you keep the PR template, answer each question, and follow http://spark.apache.org/contributing.html (e.g., fix the PR title, file a JIRA, etc.)?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #31908: [SPARK-34808][SQL] Removes outer join if it only has DISTINCT on streamed side
AmplabJenkins removed a comment on pull request #31908: URL: https://github.com/apache/spark/pull/31908#issuecomment-803445827 Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/136284/
[GitHub] [spark] AmplabJenkins removed a comment on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
AmplabJenkins removed a comment on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803499128 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #30679: [SPARK-33717][LAUNCHER] deprecate spark.launcher.childConectionTimeout
SparkQA commented on pull request #30679: URL: https://github.com/apache/spark/pull/30679#issuecomment-803499310 **[Test build #136293 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/136293/testReport)** for PR 30679 at commit [`e1db9f4`](https://github.com/apache/spark/commit/e1db9f42fb0157666522ba7596f53e40b58047e4).