[GitHub] [spark] gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
gengliangwang commented on issue #24849: [SPARK-28018][SQL] Allow upcasting decimal to double/float
URL: https://github.com/apache/spark/pull/24849#issuecomment-507140862

> - Add a decimal type for SQL literals that can be cast to float because the intended type of the literal is not known, or use some analysis rule that matches literals for the same purpose
> - Parse literals as floats and insert an implicit cast from float to decimal

Sorry, I meant that the sync did not reach a conclusion. I came up with these proposals during the sync, but I don't think they are good enough.

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
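As background for the decimal-versus-float discussion in this comment, the sketch below shows that a decimal value does not always survive a trip through a binary double. It uses only `java.math.BigDecimal` from the JDK; the object and method names are illustrative and are not taken from the PR, and the sketch says nothing about how Spark itself types literals.

```scala
import java.math.BigDecimal

object DecimalVsDouble {
  // True when converting the decimal to a double and back yields the same numeric value.
  // new BigDecimal(double) gives the exact binary value of the double, so any
  // rounding introduced by doubleValue() becomes visible in the comparison.
  def roundTripsThroughDouble(d: BigDecimal): Boolean =
    new BigDecimal(d.doubleValue()).compareTo(d) == 0

  def main(args: Array[String]): Unit = {
    Seq("0.5", "0.1", "12345678901234567890.12").foreach { s =>
      println(s"$s round-trips: ${roundTripsThroughDouble(new BigDecimal(s))}")
    }
  }
}
```

Values such as 0.5 have an exact binary representation, while 0.1 and long digit strings do not, which is one reason the choice of literal type is not a purely cosmetic decision.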
[GitHub] [spark] turboFei edited a comment on issue #24992: [SPARK-28194][SQL] Judge whether to reorder joinKeys to prevent None.get in EnsureRequirements
turboFei edited a comment on issue #24992: [SPARK-28194][SQL] Judge whether to reorder joinKeys to prevent None.get in EnsureRequirements
URL: https://github.com/apache/spark/pull/24992#issuecomment-507135803

@maropu It is difficult to decide this by checking Set(currentOrderOfKeys) == Set(expectedOrderOfKeys), because keys can repeat. For example:

```
Seq(1, 2, 2, 2)
Seq(1, 1, 2, 2)
Seq(1, 1, 1, 2)
```

So I added a check in `reorder`: when the join keys cannot be reordered, it returns (leftKeys, rightKeys) directly.
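The point about the three sequences can be checked directly. The sketch below is illustrative only: it uses plain `Int` keys and `==`, whereas the Spark code compares expressions with `semanticEquals`, and the object and method names are hypothetical.

```scala
object JoinKeyOrderCheck {
  // Set comparison ignores how many times each key occurs.
  def sameAsSets[A](a: Seq[A], b: Seq[A]): Boolean = a.toSet == b.toSet

  // Multiset comparison: every key must occur the same number of times in both.
  def counts[A](keys: Seq[A]): Map[A, Int] =
    keys.groupBy(identity).map { case (k, v) => k -> v.size }

  def sameAsMultisets[A](a: Seq[A], b: Seq[A]): Boolean = counts(a) == counts(b)

  def main(args: Array[String]): Unit = {
    val current = Seq(1, 2, 2, 2)
    val expected = Seq(1, 1, 2, 2)
    println(sameAsSets(current, expected))      // the set check cannot tell them apart
    println(sameAsMultisets(current, expected)) // the occurrence counts differ
  }
}
```

All three sequences from the comment collapse to the same set, so a set-equality guard would still let a mismatched pair reach the lookup that calls `.get`.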
[GitHub] [spark] SparkQA commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
SparkQA commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134687 **[Test build #107063 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107063/testReport)** for PR 24998 at commit [`4e8507c`](https://github.com/apache/spark/commit/4e8507c2b89854e2096ca98b2a0c03296620cbea).
[GitHub] [spark] AmplabJenkins removed a comment on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
AmplabJenkins removed a comment on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12256/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
AmplabJenkins removed a comment on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134241 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
AmplabJenkins commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134241 Merged build finished. Test PASSed.
[GitHub] [spark] jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134239 I'm OK with the changes.
[GitHub] [spark] AmplabJenkins commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
AmplabJenkins commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507134249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12256/ Test PASSed.
[GitHub] [spark] jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507133562 Jenkins, test this please.
[GitHub] [spark] SparkQA commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
SparkQA commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder URL: https://github.com/apache/spark/pull/25016#issuecomment-507132873 **[Test build #107062 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107062/testReport)** for PR 25016 at commit [`2513ea7`](https://github.com/apache/spark/commit/2513ea779767b4afbaa979a25ec402a7e2d9aa4c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
AmplabJenkins removed a comment on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder URL: https://github.com/apache/spark/pull/25016#issuecomment-507132340 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
AmplabJenkins removed a comment on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder URL: https://github.com/apache/spark/pull/25016#issuecomment-507132345 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12255/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
AmplabJenkins commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder URL: https://github.com/apache/spark/pull/25016#issuecomment-507132345 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12255/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
AmplabJenkins commented on issue #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder URL: https://github.com/apache/spark/pull/25016#issuecomment-507132340 Merged build finished. Test PASSed.
[GitHub] [spark] turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Judge whether to reorder joinKeys to prevent None.get in EnsureRequirements
turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Judge whether to reorder joinKeys to prevent None.get in EnsureRequirements
URL: https://github.com/apache/spark/pull/24992#discussion_r298889535

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ##
@@ -231,14 +231,15 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] {
     val keysAndIndexes = currentOrderOfKeys.zipWithIndex
     expectedOrderOfKeys.foreach(expression => {
-      val index = keysAndIndexes.find { case (e, idx) =>
+      keysAndIndexes.find { case (e, idx) =>
         // As we may have the same key used many times, we need to filter out its occurrence we
         // have already used.
         e.semanticEquals(expression) && !pickedIndexes.contains(idx)
-      }.map(_._2).get
-      pickedIndexes += index
-      leftKeysBuffer.append(leftKeys(index))
-      rightKeysBuffer.append(rightKeys(index))
+      }.map(_._2).map(index => {

Review comment: We can add a check in `reorder`.
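A minimal sketch of the kind of check discussed here, under stated assumptions: keys are generic values compared with `==` rather than Catalyst expressions compared with `semanticEquals`, and the signature is a simplified stand-in, not the actual `EnsureRequirements` code. When any expected key has no unused match, the original key order is returned instead of calling `.get` on an empty `Option`.

```scala
import scala.collection.mutable

object ReorderSketch {
  def reorder[K](
      leftKeys: Seq[K],
      rightKeys: Seq[K],
      expectedOrderOfKeys: Seq[K],
      currentOrderOfKeys: Seq[K]): (Seq[K], Seq[K]) = {
    val leftBuf = mutable.ArrayBuffer.empty[K]
    val rightBuf = mutable.ArrayBuffer.empty[K]
    val picked = mutable.Set.empty[Int]
    val keysAndIndexes = currentOrderOfKeys.zipWithIndex

    // forall stops at the first expected key that has no unused occurrence.
    val allMatched = expectedOrderOfKeys.length == currentOrderOfKeys.length &&
      expectedOrderOfKeys.forall { expected =>
        keysAndIndexes.collectFirst {
          case (k, idx) if k == expected && !picked.contains(idx) => idx
        } match {
          case Some(idx) =>
            picked += idx
            leftBuf += leftKeys(idx)
            rightBuf += rightKeys(idx)
            true
          case None => false
        }
      }

    // Fall back to the input order when the keys cannot be reordered.
    if (allMatched) (leftBuf.toSeq, rightBuf.toSeq) else (leftKeys, rightKeys)
  }
}
```

For example, `reorder(Seq("a", "b"), Seq("x", "y"), Seq("b", "a"), Seq("a", "b"))` swaps both sides together, while an expected order of `Seq(1, 1, 2, 2)` against a current order of `Seq(1, 2, 2, 2)` leaves the inputs untouched.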
[GitHub] [spark] maropu commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
maropu commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507130730 sure. @Dooyoung-Hwang are you there?
[GitHub] [spark] cloud-fan commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
cloud-fan commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507129518 LGTM, @maropu can you take it over if @Dooyoung-Hwang is not active?
[GitHub] [spark] SparkQA commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
SparkQA commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507129391 **[Test build #107061 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107061/testReport)** for PR 22347 at commit [`8666272`](https://github.com/apache/spark/commit/86662722e53bfcae2c75e61d170c983abd599b3a).
[GitHub] [spark] cloud-fan commented on a change in pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
cloud-fan commented on a change in pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
URL: https://github.com/apache/spark/pull/22347#discussion_r298891683

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ##
@@ -348,30 +349,30 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
         // Otherwise, interpolate the number of partitions we need to try, but overestimate
         // it by 50%. We also cap the estimation in the end.
         val limitScaleUpFactor = Math.max(sqlContext.conf.limitScaleUpFactor, 2)
-        if (buf.isEmpty) {
+        if (scannedRowCount == 0) {
           numPartsToTry = partsScanned * limitScaleUpFactor
         } else {
-          val left = n - buf.size
+          val left = n - scannedRowCount
           // As left > 0, numPartsToTry is always >= 1
-          numPartsToTry = Math.ceil(1.5 * left * partsScanned / buf.size).toInt
+          numPartsToTry = Math.ceil(1.5 * left * partsScanned / scannedRowCount).toInt
           numPartsToTry = Math.min(numPartsToTry, partsScanned * limitScaleUpFactor)
         }
       }
       val p = partsScanned.until(math.min(partsScanned + numPartsToTry, totalParts).toInt)
       val sc = sqlContext.sparkContext
-      val res = sc.runJob(childRDD,
-        (it: Iterator[Array[Byte]]) => if (it.hasNext) it.next() else Array.empty[Byte], p)
-
-      buf ++= res.flatMap(decodeUnsafeRows)
+      val res = sc.runJob(childRDD, (it: Iterator[(Long, Array[Byte])]) =>
+        if (it.hasNext) it.next() else (0L, Array.empty[Byte]), p)
+      buf ++= res.map(_._2)
+      scannedRowCount += res.map(_._1).sum
       partsScanned += p.size
     }
-    if (buf.size > n) {
-      buf.take(n).toArray
+    if (scannedRowCount > n) {
+      buf.iterator.flatMap(decodeUnsafeRows).take(n).toArray
     } else {
-      buf.toArray
+      buf.flatMap(decodeUnsafeRows).toArray

Review comment: ditto
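The partition estimate in the hunk above can be worked through in isolation. The sketch below models only the branch visible in the diff (the code that runs once some partitions have already been scanned). The object name, the standalone function, and the use of `Long` throughout are assumptions made so the example is self-contained; it is not the `SparkPlan` code itself.

```scala
object TakeEstimate {
  // n: rows requested; scannedRowCount: rows collected so far;
  // partsScanned: partitions already scanned; limitScaleUpFactor: growth cap per round.
  def numPartsToTry(
      n: Int,
      scannedRowCount: Long,
      partsScanned: Long,
      limitScaleUpFactor: Int): Long = {
    val cap = partsScanned * limitScaleUpFactor
    if (scannedRowCount == 0) {
      // Nothing found yet: grow as fast as the cap allows.
      cap
    } else {
      // Interpolate from the observed rows per partition, overestimate by 50%, then cap.
      val left = n - scannedRowCount
      val estimate = Math.ceil(1.5 * left * partsScanned / scannedRowCount).toLong
      Math.min(estimate, cap)
    }
  }

  def main(args: Array[String]): Unit = {
    // 80 of 100 rows found in 4 partitions: ceil(1.5 * 20 * 4 / 80) = 2 more partitions.
    println(numPartsToTry(n = 100, scannedRowCount = 80L, partsScanned = 4L, limitScaleUpFactor = 4))
  }
}
```

With 10 rows found in 2 partitions the raw estimate is 27 partitions, but the cap of `partsScanned * limitScaleUpFactor` limits the next round to 8.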
[GitHub] [spark] cloud-fan commented on a change in pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
cloud-fan commented on a change in pull request #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
URL: https://github.com/apache/spark/pull/22347#discussion_r298891651

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/SparkPlan.scala ##
@@ -348,30 +349,30 @@ abstract class SparkPlan extends QueryPlan[SparkPlan] with Logging with Serializ
         // Otherwise, interpolate the number of partitions we need to try, but overestimate
         // it by 50%. We also cap the estimation in the end.
         val limitScaleUpFactor = Math.max(sqlContext.conf.limitScaleUpFactor, 2)
-        if (buf.isEmpty) {
+        if (scannedRowCount == 0) {
           numPartsToTry = partsScanned * limitScaleUpFactor
         } else {
-          val left = n - buf.size
+          val left = n - scannedRowCount
           // As left > 0, numPartsToTry is always >= 1
-          numPartsToTry = Math.ceil(1.5 * left * partsScanned / buf.size).toInt
+          numPartsToTry = Math.ceil(1.5 * left * partsScanned / scannedRowCount).toInt
           numPartsToTry = Math.min(numPartsToTry, partsScanned * limitScaleUpFactor)
         }
       }
       val p = partsScanned.until(math.min(partsScanned + numPartsToTry, totalParts).toInt)
       val sc = sqlContext.sparkContext
-      val res = sc.runJob(childRDD,
-        (it: Iterator[Array[Byte]]) => if (it.hasNext) it.next() else Array.empty[Byte], p)
-
-      buf ++= res.flatMap(decodeUnsafeRows)
+      val res = sc.runJob(childRDD, (it: Iterator[(Long, Array[Byte])]) =>
+        if (it.hasNext) it.next() else (0L, Array.empty[Byte]), p)
+      buf ++= res.map(_._2)
+      scannedRowCount += res.map(_._1).sum
       partsScanned += p.size
     }
-    if (buf.size > n) {
-      buf.take(n).toArray
+    if (scannedRowCount > n) {
+      buf.iterator.flatMap(decodeUnsafeRows).take(n).toArray

Review comment: nit: since this is a performance-critical code path, I think we can optimize it further, because we know the length of the result array.

```
val result = new Array[InternalRow](n)
while (result.length < n) {
  // decode
}
result
```
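One way to read the suggestion above is sketched below, under stated assumptions: `decode` is a hypothetical stand-in for `decodeUnsafeRows`, rows are plain `Int` values instead of `InternalRow`, and the object name is illustrative. Because an array's `length` is fixed at allocation, the sketch tracks the fill position in a separate counter rather than testing `result.length`.

```scala
object TakeIntoArray {
  // Hypothetical decoder: turns one encoded block into an iterator of rows.
  def decode(block: Array[Byte]): Iterator[Int] = block.iterator.map(_.toInt)

  // Decodes blocks lazily into a pre-sized array and stops once n rows are filled.
  def takeDecoded(blocks: Seq[Array[Byte]], n: Int): Array[Int] = {
    val result = new Array[Int](n)
    var filled = 0
    val blockIter = blocks.iterator
    while (filled < n && blockIter.hasNext) {
      val rows = decode(blockIter.next())
      while (filled < n && rows.hasNext) {
        result(filled) = rows.next()
        filled += 1
      }
    }
    // If fewer than n rows exist, return only the filled prefix.
    if (filled == n) result else result.take(filled)
  }
}
```

Blocks after the one that completes the array are never decoded, which is the same saving the PR aims for, without the intermediate iterator chain of `flatMap(...).take(n).toArray`.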
[GitHub] [spark] AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507128897 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12254/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
AmplabJenkins commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507128897 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12254/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507128893 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
AmplabJenkins commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507128893 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
AmplabJenkins removed a comment on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-502206062 Can one of the admins verify this patch?
[GitHub] [spark] cloud-fan commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding.
cloud-fan commented on issue #22347: [SPARK-25353][SQL] executeTake in SparkPlan is modified to avoid unnecessary decoding. URL: https://github.com/apache/spark/pull/22347#issuecomment-507128498 retest this please
[GitHub] [spark] LiShuMing commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
LiShuMing commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507128537 In the `core/test` module, only this unit test failed. Maybe I need to check other modules?
[GitHub] [spark] cloud-fan closed pull request #25004: [SPARK-28205][SQL] useV1SourceList configuration should be for all data sources
cloud-fan closed pull request #25004: [SPARK-28205][SQL] useV1SourceList configuration should be for all data sources URL: https://github.com/apache/spark/pull/25004
[GitHub] [spark] jerryshao commented on a change in pull request #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
jerryshao commented on a change in pull request #25007: [SPARK-28209][CORE][SHUFFLE] Proposed new shuffle writer API
URL: https://github.com/apache/spark/pull/25007#discussion_r298889988

## File path: core/src/main/java/org/apache/spark/api/shuffle/ShuffleDataIO.java ##
@@ -0,0 +1,31 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.api.shuffle;

Review comment: I'm not sure it is appropriate to add these interfaces under `o.a.s.api`. Most of the things under the api package look related to RDD functions. How about the package `o.a.s.shuffle.api` instead?
[GitHub] [spark] cloud-fan commented on issue #25004: [SPARK-28205][SQL] useV1SourceList configuration should be for all data sources
cloud-fan commented on issue #25004: [SPARK-28205][SQL] useV1SourceList configuration should be for all data sources URL: https://github.com/apache/spark/pull/25004#issuecomment-507127467 thanks, merging to master!
[GitHub] [spark] cloud-fan commented on issue #24960: [SPARK-28156][SQL] Self-join should not miss cached view
cloud-fan commented on issue #24960: [SPARK-28156][SQL] Self-join should not miss cached view URL: https://github.com/apache/spark/pull/24960#issuecomment-507127363 Another idea: shall we apply `AliasViewChild` right before `EliminateView`?
[GitHub] [spark] turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#discussion_r298889535 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -231,14 +231,15 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { val keysAndIndexes = currentOrderOfKeys.zipWithIndex expectedOrderOfKeys.foreach(expression => { - val index = keysAndIndexes.find { case (e, idx) => + keysAndIndexes.find { case (e, idx) => // As we may have the same key used many times, we need to filter out its occurrence we // have already used. e.semanticEquals(expression) && !pickedIndexes.contains(idx) - }.map(_._2).get - pickedIndexes += index - leftKeysBuffer.append(leftKeys(index)) - rightKeysBuffer.append(rightKeys(index)) + }.map(_._2).map(index => { Review comment: We can add a check in `reorderJoinKeys`.
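One way to picture the check discussed in this thread is the sketch below. It is not the code from the PR: `ReorderSketch` and its `reorder` signature are hypothetical, keys are plain values compared with `==` instead of Catalyst expressions compared with `semanticEquals`, and the fallback of returning the keys unchanged is an assumption about the intended behaviour. It only illustrates replacing the unconditional `.get` with an explicit "could every expected key be matched?" test.

```scala
import scala.collection.mutable

object ReorderSketch {
  // Hypothetical stand-in for the reorder helper in EnsureRequirements.
  // Assumption: plain == replaces Expression.semanticEquals.
  def reorder[K, E](
      leftKeys: IndexedSeq[K],
      rightKeys: IndexedSeq[K],
      expectedOrderOfKeys: Seq[E],
      currentOrderOfKeys: Seq[E]): (Seq[K], Seq[K]) = {
    val keysAndIndexes = currentOrderOfKeys.zipWithIndex
    val pickedIndexes = mutable.Set.empty[Int]
    val leftBuf = mutable.ArrayBuffer.empty[K]
    val rightBuf = mutable.ArrayBuffer.empty[K]

    // forall stops at the first expected key with no unused match,
    // so no Option is ever unwrapped with .get.
    val allMatched = expectedOrderOfKeys.forall { expected =>
      keysAndIndexes.collectFirst {
        case (e, idx) if e == expected && !pickedIndexes.contains(idx) => idx
      } match {
        case Some(index) =>
          pickedIndexes += index
          leftBuf += leftKeys(index)
          rightBuf += rightKeys(index)
          true
        case None => false
      }
    }

    // Assumed fallback: keep the original order when reordering is impossible.
    if (allMatched) (leftBuf.toSeq, rightBuf.toSeq) else (leftKeys, rightKeys)
  }
}
```

With `expectedOrderOfKeys = Seq(1, 2, 2, 2)` and `currentOrderOfKeys = Seq(1, 1, 2, 2)`, the two sequences contain the same set of values, yet the third `2` has no unused match, so the sketch returns the keys unchanged rather than throwing.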
[GitHub] [spark] jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite
jerryshao commented on issue #24998: [SPARK-28202] [Core] [Test] Avoid noises of system props in SparkConfSuite URL: https://github.com/apache/spark/pull/24998#issuecomment-507126499 I would guess that other tests may also fail if you have locally modified properties. Have you hit this issue in other test suites?
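For readers unfamiliar with the problem, the sketch below shows one generic way a test could shield itself from locally set `spark.*` system properties. It is an illustration only, not the change made in the PR: `SysPropsSketch` and `withoutSparkSystemProps` are hypothetical names, and the `spark.` prefix filter is an assumption about which properties cause the noise.

```scala
object SysPropsSketch {
  // Hypothetical helper: runs `body` with every "spark."-prefixed JVM system
  // property temporarily cleared, then restores the saved values.
  def withoutSparkSystemProps[T](body: => T): T = {
    val props = System.getProperties
    // Snapshot the matching keys and values before touching anything.
    val saved: Map[String, String] = props
      .stringPropertyNames()
      .toArray(new Array[String](0))
      .filter(_.startsWith("spark."))
      .map(k => k -> props.getProperty(k))
      .toMap
    saved.keys.foreach(k => System.clearProperty(k))
    try body
    finally saved.foreach { case (k, v) => System.setProperty(k, v) }
  }
}
```

Properties that `body` itself sets are not cleaned up by this sketch; a real suite would likely need to handle those as well.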
[GitHub] [spark] turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#discussion_r298889300 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -231,14 +231,15 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { val keysAndIndexes = currentOrderOfKeys.zipWithIndex expectedOrderOfKeys.foreach(expression => { - val index = keysAndIndexes.find { case (e, idx) => + keysAndIndexes.find { case (e, idx) => // As we may have the same key used many times, we need to filter out its occurrence we // have already used. e.semanticEquals(expression) && !pickedIndexes.contains(idx) - }.map(_._2).get - pickedIndexes += index - leftKeysBuffer.append(leftKeys(index)) - rightKeysBuffer.append(rightKeys(index)) + }.map(_._2).map(index => { Review comment: > If `Set(currentOrderOfKeys) == Set(expectedOrderOfKeys)`, I think we cannot hit the exception... yes, you are right.
[GitHub] [spark] maropu commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
maropu commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#discussion_r29620 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -231,14 +231,15 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { val keysAndIndexes = currentOrderOfKeys.zipWithIndex expectedOrderOfKeys.foreach(expression => { - val index = keysAndIndexes.find { case (e, idx) => + keysAndIndexes.find { case (e, idx) => // As we may have the same key used many times, we need to filter out its occurrence we // have already used. e.semanticEquals(expression) && !pickedIndexes.contains(idx) - }.map(_._2).get - pickedIndexes += index - leftKeysBuffer.append(leftKeys(index)) - rightKeysBuffer.append(rightKeys(index)) + }.map(_._2).map(index => { Review comment: If `Set(currentOrderOfKeys) == Set(expectedOrderOfKeys)`, I think we cannot hit the exception...
[GitHub] [spark] cloud-fan commented on a change in pull request #24738: [SPARK-23098][SQL] Migrate Kafka Batch source to v2.
cloud-fan commented on a change in pull request #24738: [SPARK-23098][SQL] Migrate Kafka Batch source to v2. URL: https://github.com/apache/spark/pull/24738#discussion_r298887969 ## File path: external/kafka-0-10-sql/src/main/scala/org/apache/spark/sql/kafka010/KafkaMicroBatchStream.scala ## @@ -299,7 +299,7 @@ private[kafka010] class KafkaMicroBatchStream( if (content(0) == 'v') { val indexOfNewLine = content.indexOf("\n") if (indexOfNewLine > 0) { - val version = parseVersion(content.substring(0, indexOfNewLine), VERSION) + parseVersion(content.substring(0, indexOfNewLine), VERSION) Review comment: Maybe we can rename it to `validateVersion`, since the returned value is no longer used.
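The renaming suggestion can be pictured with the following sketch of a validate-only helper. It is hypothetical throughout: `VersionCheckSketch`, the `validateVersion` signature, the `v<number>` header format, and the exception types are assumptions for illustration and are not taken from Spark's actual `parseVersion`.

```scala
import scala.util.Try

object VersionCheckSketch {
  // Hypothetical validate-only helper: checks a header such as "v1" and
  // returns Unit, so callers are not tempted to bind an unused result.
  def validateVersion(text: String, maxSupportedVersion: Int): Unit = {
    require(text.nonEmpty && text.head == 'v', s"Malformed version header: $text")
    // The by-name default is evaluated only when the number cannot be parsed.
    val version = Try(text.tail.toInt).getOrElse(
      throw new IllegalArgumentException(s"Malformed version header: $text"))
    require(
      version > 0 && version <= maxSupportedVersion,
      s"Unsupported version: $version (max supported: $maxSupportedVersion)")
  }
}
```

A name that starts with `validate` and a `Unit` return type make it clear at the call site that the function is invoked only for its side effect of rejecting bad input.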
[GitHub] [spark] SparkQA commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
SparkQA commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507124499 **[Test build #107060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107060/testReport)** for PR 24983 at commit [`0860e4e`](https://github.com/apache/spark/commit/0860e4ebf424ff819609dd45db8587be7b5565ec).
[GitHub] [spark] AngersZhuuuu commented on issue #25015: [SPARK-28217][SQL] Allow a pluggable statistics plan visitor for a logical plan.
AngersZhuuuu commented on issue #25015: [SPARK-28217][SQL] Allow a pluggable statistics plan visitor for a logical plan. URL: https://github.com/apache/spark/pull/25015#issuecomment-507124404 [SPARK-27602](https://issues.apache.org/jira/browse/SPARK-27602) I've thought about doing this. You can make it more extensible.
[GitHub] [spark] turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
turboFei commented on a change in pull request #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#discussion_r298887634 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -231,14 +231,15 @@ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { val keysAndIndexes = currentOrderOfKeys.zipWithIndex expectedOrderOfKeys.foreach(expression => { - val index = keysAndIndexes.find { case (e, idx) => + keysAndIndexes.find { case (e, idx) => // As we may have the same key used many times, we need to filter out its occurrence we // have already used. e.semanticEquals(expression) && !pickedIndexes.contains(idx) - }.map(_._2).get - pickedIndexes += index - leftKeysBuffer.append(leftKeys(index)) - rightKeysBuffer.append(rightKeys(index)) + }.map(_._2).map(index => { Review comment: I'll check it.
[GitHub] [spark] turboFei removed a comment on issue #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
turboFei removed a comment on issue #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#issuecomment-507123648 Yes, it is a bug. ![image](https://user-images.githubusercontent.com/6757692/60413014-fde62a00-9c05-11e9-95b0-ee963cffda65.png) The keys in `currentOrderOfKeys` do not match those in `expectedOrderOfKeys`.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507124116 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507124120 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12253/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507124116 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507124120 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12253/ Test PASSed.
[GitHub] [spark] turboFei commented on issue #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements
turboFei commented on issue #24992: [SPARK-28194][SQL] Refactor code to prevent None.get in EnsureRequirements URL: https://github.com/apache/spark/pull/24992#issuecomment-507123648 Yes, it is a bug. ![image](https://user-images.githubusercontent.com/6757692/60413014-fde62a00-9c05-11e9-95b0-ee963cffda65.png) The keys in `currentOrderOfKeys` do not match those in `expectedOrderOfKeys`.
[GitHub] [spark] cloud-fan commented on issue #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution
cloud-fan commented on issue #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution URL: https://github.com/apache/spark/pull/24978#issuecomment-507123632 LGTM
[GitHub] [spark] xianyinxin commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
xianyinxin commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507123598 retest this please
[GitHub] [spark] cloud-fan commented on a change in pull request #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution
cloud-fan commented on a change in pull request #24978: [SPARK-28177][SQL] Adjust post shuffle partition number in adaptive execution URL: https://github.com/apache/spark/pull/24978#discussion_r298886464 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/exchange/EnsureRequirements.scala ## @@ -36,107 +36,12 @@ import org.apache.spark.sql.internal.SQLConf * the input partition ordering requirements are met. */ case class EnsureRequirements(conf: SQLConf) extends Rule[SparkPlan] { - private def defaultNumPreShufflePartitions: Int = conf.numShufflePartitions - - private def targetPostShuffleInputSize: Long = conf.targetPostShuffleInputSize - - private def adaptiveExecutionEnabled: Boolean = conf.adaptiveExecutionEnabled - - private def minNumPostShufflePartitions: Option[Int] = { -val minNumPostShufflePartitions = conf.minNumPostShufflePartitions -if (minNumPostShufflePartitions > 0) Some(minNumPostShufflePartitions) else None - } - - /** - * Adds [[ExchangeCoordinator]] to [[ShuffleExchangeExec]]s if adaptive query execution is enabled - * and partitioning schemes of these [[ShuffleExchangeExec]]s support [[ExchangeCoordinator]]. - */ - private def withExchangeCoordinator( - children: Seq[SparkPlan], - requiredChildDistributions: Seq[Distribution]): Seq[SparkPlan] = { -val supportsCoordinator = - if (children.exists(_.isInstanceOf[ShuffleExchangeExec])) { -// Right now, ExchangeCoordinator only support HashPartitionings. 
-children.forall { - case e @ ShuffleExchangeExec(hash: HashPartitioning, _, _) => true - case child => -child.outputPartitioning match { - case hash: HashPartitioning => true - case collection: PartitioningCollection => - collection.partitionings.forall(_.isInstanceOf[HashPartitioning]) - case _ => false -} -} - } else { -// In this case, although we do not have Exchange operators, we may still need to -// shuffle data when we have more than one children because data generated by -// these children may not be partitioned in the same way. -// Please see the comment in withCoordinator for more details. -val supportsDistribution = requiredChildDistributions.forall { dist => - dist.isInstanceOf[ClusteredDistribution] || dist.isInstanceOf[HashClusteredDistribution] -} -children.length > 1 && supportsDistribution - } - -val withCoordinator = - if (adaptiveExecutionEnabled && supportsCoordinator) { -val coordinator = - new ExchangeCoordinator( -targetPostShuffleInputSize, -minNumPostShufflePartitions) -children.zip(requiredChildDistributions).map { - case (e: ShuffleExchangeExec, _) => -// This child is an Exchange, we need to add the coordinator. -e.copy(coordinator = Some(coordinator)) - case (child, distribution) => -// If this child is not an Exchange, we need to add an Exchange for now. -// Ideally, we can try to avoid this Exchange. However, when we reach here, -// there are at least two children operators (because if there is a single child -// and we can avoid Exchange, supportsCoordinator will be false and we -// will not reach here.). Although we can make two children have the same number of -// post-shuffle partitions. Their numbers of pre-shuffle partitions may be different. 
-// For example, let's say we have the following plan -// Join -// / \ -// Agg Exchange -// / \ -//Exchange t2 -// / -// t1 -// In this case, because a post-shuffle partition can include multiple pre-shuffle -// partitions, a HashPartitioning will not be strictly partitioned by the hashcodes -// after shuffle. So, even we can use the child Exchange operator of the Join to -// have a number of post-shuffle partitions that matches the number of partitions of -// Agg, we cannot say these two children are partitioned in the same way. -// Here is another case -// Join -// / \ -// Agg1 Agg2 -// / \ -// Exchange1 Exchange2 -// / \ -// t1 t2 -// In this case, two Aggs shuffle data with the same column of the join condition. -// After we use ExchangeCoordinator, these two Aggs may not be partitioned in the same -// way. Let's say that Agg1 and Agg2 both have 5 pre-shuffle partitions and 2 -// post-shuffle partitions. It is possible that Agg1 fetches t
[GitHub] [spark] AngersZhuuuu edited a comment on issue #24909: [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first.
AngersZhuuuu edited a comment on issue #24909: [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first. URL: https://github.com/apache/spark/pull/24909#issuecomment-507122951 @gatorsmile @GregOwen Could you review this again? Thanks
[GitHub] [spark] AngersZhuuuu commented on issue #24909: [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first.
AngersZhuuuu commented on issue #24909: [SPARK-28106][SQL] When Spark SQL use "add jar" , before add to SparkContext, check jar path exist first. URL: https://github.com/apache/spark/pull/24909#issuecomment-507122951 @gatorsmile @GregOwen Could you review this again?
[GitHub] [spark] AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507117221 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507117224 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107058/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507117224 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107058/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507117221 Merged build finished. Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
SparkQA removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507107652 **[Test build #107058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107058/testReport)** for PR 24963 at commit [`bd813db`](https://github.com/apache/spark/commit/bd813dbd4e91437a2172c86c9e14f3941b3edd14).
[GitHub] [spark] SparkQA commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
SparkQA commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507117059 **[Test build #107058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107058/testReport)** for PR 24963 at commit [`bd813db`](https://github.com/apache/spark/commit/bd813dbd4e91437a2172c86c9e14f3941b3edd14). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
SparkQA commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111962 **[Test build #107059 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107059/testReport)** for PR 25019 at commit [`9a32d8e`](https://github.com/apache/spark/commit/9a32d8e482c245d8fb1c7aeafd48ac4381829e10).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12252/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111665 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12252/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111665 Merged build finished. Test PASSed.
[GitHub] [spark] maropu commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
maropu commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111650 Does this issue only exist for the INSERT command? It would probably be better to make the title and description more concrete.
[GitHub] [spark] maropu commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
maropu commented on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507111332 ok to test
[GitHub] [spark] AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins removed a comment on issue #25019: [SPARK-28195][SQL] Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110740 Can one of the admins verify this patch?
[GitHub] [spark] HyukjinKwon closed pull request #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
HyukjinKwon closed pull request #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012
[GitHub] [spark] cloud-fan commented on a change in pull request #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
cloud-fan commented on a change in pull request #25016: [SPARK-28200][SQL] Decimal overflow handling in ExpressionEncoder
URL: https://github.com/apache/spark/pull/25016#discussion_r298877783

## File path: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/encoders/ExpressionEncoderSuite.scala ##

@@ -379,6 +380,78 @@ class ExpressionEncoderSuite extends CodegenInterpretedPlanTest with AnalysisTes
     assert(e.getMessage.contains("tuple with more than 22 elements are not supported"))
   }

+  // Scala / Java big decimals --
+
+  encodeDecodeTest(BigDecimal(("9" * 20) + "." + "9" * 18),
+    "scala decimal within precision/scale limit")
+  encodeDecodeTest(new java.math.BigDecimal(("9" * 20) + "." + "9" * 18),
+    "java decimal within precision/scale limit")
+
+  encodeDecodeTest(BigDecimal(("9" * 20) + "." + "9" * 18).unary_-,

Review comment: Shall we use `-BigDecimal...` instead of `.unary_-`?
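For readers following the review suggestion above, here is a minimal standalone sketch (not code from the PR) of why the two spellings are interchangeable: Scala rewrites a prefix `-e` into the method call `e.unary_-`. The object name `UnaryMinusDemo` and its fields are hypothetical and exist only for this illustration.

```scala
object UnaryMinusDemo {
  // 20 integer digits and 18 fractional digits, built the same way as in the diff.
  val digits: String = ("9" * 20) + "." + ("9" * 18)

  // Explicit method-call form, as written in the diff under review.
  val viaMethod: BigDecimal = BigDecimal(digits).unary_-

  // Prefix-operator form suggested in the review comment.
  val viaPrefix: BigDecimal = -BigDecimal(digits)

  def main(args: Array[String]): Unit = {
    // Both forms call the same method, so the values must be equal and negative.
    assert(viaMethod == viaPrefix)
    assert(viaPrefix.signum == -1)
    println(viaPrefix)
  }
}
```

The prefix form is usually preferred for readability; the behavior is the same either way.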
[GitHub] [spark] AmplabJenkins removed a comment on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins removed a comment on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110404 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110740 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110326 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins commented on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110404 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
AmplabJenkins removed a comment on issue #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message URL: https://github.com/apache/spark/pull/25019#issuecomment-507110326 Can one of the admins verify this patch?
[GitHub] [spark] liupc opened a new pull request #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
liupc opened a new pull request #25019: [SPARK-28195]Fix CheckAnalysis not working for command and report misleading error message
URL: https://github.com/apache/spark/pull/25019

## What changes were proposed in this pull request?

This PR tries to fix the issue that the sub plan of a command is not checked during analysis, which can sometimes produce a misleading error message.

An example SQL statement is `insert overwrite directory '/path' using parquet select * from table1`.

When "table1" does not exist, we end up with a misleading error message:

```
Caused by: org.apache.spark.sql.catalyst.analysis.UnresolvedException: Invalid call to dataType on unresolved object, tree: 'kr.objective_id
at org.apache.spark.sql.catalyst.analysis.UnresolvedAttribute.dataType(unresolved.scala:105)
at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
at org.apache.spark.sql.types.StructType$$anonfun$fromAttributes$1.apply(StructType.scala:440)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
at scala.collection.immutable.List.foreach(List.scala:381)
at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
at scala.collection.immutable.List.map(List.scala:285)
at org.apache.spark.sql.types.StructType$.fromAttributes(StructType.scala:440)
at org.apache.spark.sql.catalyst.plans.QueryPlan.schema$lzycompute(QueryPlan.scala:159)
at org.apache.spark.sql.catalyst.plans.QueryPlan.schema(QueryPlan.scala:159)
at org.apache.spark.sql.execution.datasources.DataSource.planForWriting(DataSource.scala:544)
at org.apache.spark.sql.execution.command.InsertIntoDataSourceDirCommand.run(InsertIntoDataSourceDirCommand.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
at org.apache.spark.sql.execution.adaptive.QueryStage.executeCollect(QueryStage.scala:246)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$6.apply(Dataset.scala:190)
at org.apache.spark.sql.Dataset$$anonfun$52.apply(Dataset.scala:3277)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:77)
at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3276)
at org.apache.spark.sql.Dataset.<init>(Dataset.scala:190)
at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:75)
at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:642)
at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:694)
at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.org$apache$spark$sql$hive$thriftserver$SparkExecuteStatementOperation$$execute(SparkExecuteStatementOperation.scala:277)
... 11 more
```

## How was this patch tested?

Existing UT (AnalysisSuite)
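The kind of gap described in this PR can be illustrated with a toy model. The sketch below is not Spark code: every type and function name in it is hypothetical, and it assumes (the PR text does not spell this out) that the command holds its query in a field that the check does not traverse. It only shows how a check that walks `children` can miss an unresolved table nested inside a command, and how descending into the command's query surfaces a clearer error early.

```scala
object CheckCommandSketch {
  // Hypothetical miniature plan nodes; these are NOT Spark's LogicalPlan classes.
  sealed trait Plan { def children: Seq[Plan] }
  final case class UnresolvedTable(name: String) extends Plan {
    val children: Seq[Plan] = Nil
  }
  final case class Project(child: Plan) extends Plan {
    val children: Seq[Plan] = Seq(child)
  }
  // Assumption: the command keeps its query in a field and exposes no children.
  final case class InsertDirCommand(path: String, query: Plan) extends Plan {
    val children: Seq[Plan] = Nil
  }

  // Walks only `children`, so anything nested inside a command is skipped.
  def checkChildrenOnly(plan: Plan): Seq[String] = plan match {
    case UnresolvedTable(n) => Seq(s"Table or view not found: $n")
    case other => other.children.flatMap(checkChildrenOnly)
  }

  // Also descends into the query carried by the command.
  def checkIncludingCommandQuery(plan: Plan): Seq[String] = plan match {
    case UnresolvedTable(n) => Seq(s"Table or view not found: $n")
    case InsertDirCommand(_, q) => checkIncludingCommandQuery(q)
    case other => other.children.flatMap(checkIncludingCommandQuery)
  }
}
```

With `InsertDirCommand("/path", Project(UnresolvedTable("table1")))`, the first check reports nothing, while the second reports the missing table by name.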
[GitHub] [spark] HyukjinKwon commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
HyukjinKwon commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507110218 Merged to master.
[GitHub] [spark] itsvikramagr commented on a change in pull request #24922: [SPARK-28120][SS] Rocksdb state storage implementation
itsvikramagr commented on a change in pull request #24922: [SPARK-28120][SS] Rocksdb state storage implementation
URL: https://github.com/apache/spark/pull/24922#discussion_r298875972

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/state/WALUtils.scala ##

@@ -0,0 +1,280 @@
+/*

Review comment: Good point on self-review. I have abstracted out a lot of code from HDFS state store to create WALUtils. I didn't make any change in HDFS state store provider to reduce the scope of this PR. I can either start a new PR for the refactoring or I can do it once the rest of the code is reviewed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507108414 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12251/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins removed a comment on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507108411 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507108414 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/12251/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
AmplabJenkins commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507108411 Merged build finished. Test PASSed.
[GitHub] [spark] itsvikramagr commented on issue #24922: [SPARK-28120][SS] Rocksdb state storage implementation
itsvikramagr commented on issue #24922: [SPARK-28120][SS] Rocksdb state storage implementation URL: https://github.com/apache/spark/pull/24922#issuecomment-507108250 Thanks, @HeartSaVioR for the review. Let me work on your comments. Also, I am looking into generating performance numbers for various scenarios. Will soon get back with those as well.
[GitHub] [spark] SparkQA commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion
SparkQA commented on issue #24963: [SPARK-28159][ML] Make the transform natively in ml framework to avoid extra conversion URL: https://github.com/apache/spark/pull/24963#issuecomment-507107652 **[Test build #107058 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107058/testReport)** for PR 24963 at commit [`bd813db`](https://github.com/apache/spark/commit/bd813dbd4e91437a2172c86c9e14f3941b3edd14).
[GitHub] [spark] cloud-fan closed pull request #25010: [SPARK-28201][SQL] Revisit MakeDecimal behavior on overflow
cloud-fan closed pull request #25010: [SPARK-28201][SQL] Revisit MakeDecimal behavior on overflow URL: https://github.com/apache/spark/pull/25010
[GitHub] [spark] cloud-fan commented on issue #25010: [SPARK-28201][SQL] Revisit MakeDecimal behavior on overflow
cloud-fan commented on issue #25010: [SPARK-28201][SQL] Revisit MakeDecimal behavior on overflow URL: https://github.com/apache/spark/pull/25010#issuecomment-507106833 thanks, merging to master!
[GitHub] [spark] AmplabJenkins removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
AmplabJenkins removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507106133 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
AmplabJenkins removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507106139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107056/ Test PASSed.
[GitHub] [spark] SparkQA removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
SparkQA removed a comment on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507099969 **[Test build #107056 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107056/testReport)** for PR 25012 at commit [`bd89000`](https://github.com/apache/spark/commit/bd89000c1d08147b37957954edc09cf7f3ac469e).
[GitHub] [spark] AmplabJenkins commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
AmplabJenkins commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507106133 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
AmplabJenkins commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507106139 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107056/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API
SparkQA commented on issue #25012: [SPARK-28215][SQL][R] as_tibble was removed from Arrow R API URL: https://github.com/apache/spark/pull/25012#issuecomment-507106090 **[Test build #107056 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107056/testReport)** for PR 25012 at commit [`bd89000`](https://github.com/apache/spark/commit/bd89000c1d08147b37957954edc09cf7f3ac469e).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] [spark] cloud-fan commented on a change in pull request #24795: [SPARK-27945][SQL] Minimal changes to support columnar processing
cloud-fan commented on a change in pull request #24795: [SPARK-27945][SQL] Minimal changes to support columnar processing
URL: https://github.com/apache/spark/pull/24795#discussion_r298873668

## File path: sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala ##

@@ -251,6 +286,371 @@ object MyExtensions {
     (_: Seq[Expression]) => Literal(5, IntegerType))
 }

+case class CloseableColumnBatchIterator(itr: Iterator[ColumnarBatch],
+    f: ColumnarBatch => ColumnarBatch) extends Iterator[ColumnarBatch] {
+  var cb: ColumnarBatch = null
+
+  private def closeCurrentBatch(): Unit = {
+    if (cb != null) {
+      cb.close
+      cb = null
+    }
+  }
+
+  TaskContext.get().addTaskCompletionListener[Unit]((tc: TaskContext) => {
+    closeCurrentBatch()
+  })
+
+  override def hasNext: Boolean = {
+    closeCurrentBatch()
+    itr.hasNext
+  }
+
+  override def next(): ColumnarBatch = {
+    closeCurrentBatch()
+    cb = f(itr.next())
+    cb
+  }
+}
+
+object NoCloseColumnVector extends Logging {
+  def wrapIfNeeded(cv: ColumnVector): NoCloseColumnVector = cv match {
+    case ref: NoCloseColumnVector =>
+      ref
+    case vec => NoCloseColumnVector(vec)
+  }
+}
+
+/**
+ * Provide a ColumnVector so ColumnarExpression can close temporary values without
+ * having to guess what type it really is.
+ */
+case class NoCloseColumnVector(wrapped: ColumnVector) extends ColumnVector(wrapped.dataType) {
+  private var refCount = 1
+
+  /**
+   * Don't actually close the ColumnVector this wraps. The producer of the vector will take
+   * care of that.
+   */
+  override def close(): Unit = {
+    // Empty
+  }
+
+  override def hasNull: Boolean = wrapped.hasNull
+
+  override def numNulls(): Int = wrapped.numNulls
+
+  override def isNullAt(rowId: Int): Boolean = wrapped.isNullAt(rowId)
+
+  override def getBoolean(rowId: Int): Boolean = wrapped.getBoolean(rowId)
+
+  override def getByte(rowId: Int): Byte = wrapped.getByte(rowId)
+
+  override def getShort(rowId: Int): Short = wrapped.getShort(rowId)
+
+  override def getInt(rowId: Int): Int = wrapped.getInt(rowId)
+
+  override def getLong(rowId: Int): Long = wrapped.getLong(rowId)
+
+  override def getFloat(rowId: Int): Float = wrapped.getFloat(rowId)
+
+  override def getDouble(rowId: Int): Double = wrapped.getDouble(rowId)
+
+  override def getArray(rowId: Int): ColumnarArray = wrapped.getArray(rowId)
+
+  override def getMap(ordinal: Int): ColumnarMap = wrapped.getMap(ordinal)
+
+  override def getDecimal(rowId: Int, precision: Int, scale: Int): Decimal =
+    wrapped.getDecimal(rowId, precision, scale)
+
+  override def getUTF8String(rowId: Int): UTF8String = wrapped.getUTF8String(rowId)
+
+  override def getBinary(rowId: Int): Array[Byte] = wrapped.getBinary(rowId)
+
+  override protected def getChild(ordinal: Int): ColumnVector = wrapped.getChild(ordinal)
+}
+
+trait ColumnarExpression extends Expression with Serializable {
+  /**
+   * Returns true if this expression supports columnar processing through [[columnarEval]].
+   */
+  def supportsColumnar: Boolean = true
+
+  /**
+   * Returns the result of evaluating this expression on the entire
+   * [[org.apache.spark.sql.vectorized.ColumnarBatch]]. The result of
+   * calling this may be a single [[org.apache.spark.sql.vectorized.ColumnVector]] or a scalar
+   * value. Scalar values typically happen if they are a part of the expression i.e. col("a") + 100.
Review comment: I understand that this is in a test, not an API, but other people may look at this test to learn how to implement a columnar operator, and I feel the current example is not that good. IIUC, the goal is: 1. users can write a rule to replace an arbitrary SQL operator with a custom optimized columnar version 2. Spark automatically inserts column-to-row and row-to-column operators around the columnar operator. For 1, I think a pretty simple approach is to take in an expression tree and compile it to a columnar processor that can execute the expression tree in a columnar fashion. We don't need to create a `ColumnarExpression`, which seems overcomplicated to me as a column processor.
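The "compile an expression tree into a columnar processor" idea from the review comment could look roughly like the sketch below. This is only an illustration under loud assumptions: it does not use Spark at all, and `Batch`, `Expr`, `Col`, `Lit`, `Add`, `Result`, `Vec`, `Scalar` and `compile` are hypothetical stand-ins (for `ColumnarBatch`, Catalyst's `Expression`, and so on), not names from the pull request.

```scala
// Hypothetical, Spark-free model: a batch is just an array of Int columns.
object ColumnarCompileSketch {
  type Batch = Array[Array[Int]]

  // Toy expression tree (a stand-in for Catalyst's Expression).
  sealed trait Expr
  final case class Col(ordinal: Int) extends Expr
  final case class Lit(value: Int) extends Expr
  final case class Add(left: Expr, right: Expr) extends Expr

  // A compiled node yields either a whole column or a scalar.
  sealed trait Result
  final case class Vec(values: Array[Int]) extends Result
  final case class Scalar(value: Int) extends Result

  // Walk the tree once and return a function that processes entire batches,
  // so no per-expression columnar subclass is needed.
  def compile(expr: Expr): Batch => Result = expr match {
    case Col(i) => batch => Vec(batch(i))
    case Lit(v) => _ => Scalar(v)
    case Add(l, r) =>
      val (lf, rf) = (compile(l), compile(r))
      batch => (lf(batch), rf(batch)) match {
        case (Vec(a), Vec(b))       => Vec(Array.tabulate(a.length)(i => a(i) + b(i)))
        case (Vec(a), Scalar(s))    => Vec(a.map(_ + s))
        case (Scalar(s), Vec(b))    => Vec(b.map(s + _))
        case (Scalar(x), Scalar(y)) => Scalar(x + y)
      }
  }
}
```

With these toy types, `ColumnarCompileSketch.compile(Add(Col(0), Lit(100)))` returns a function that maps a batch whose first column is `1, 2, 3` to a `Vec` holding `101, 102, 103`. The scalar case mirrors the `col("a") + 100` situation mentioned in the quoted Scaladoc.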
[GitHub] [spark] cloud-fan commented on a change in pull request #24795: [SPARK-27945][SQL] Minimal changes to support columnar processing
cloud-fan commented on a change in pull request #24795: [SPARK-27945][SQL] Minimal changes to support columnar processing URL: https://github.com/apache/spark/pull/24795#discussion_r298873668 ## File path: sql/core/src/test/scala/org/apache/spark/sql/SparkSessionExtensionSuite.scala ## @@ -251,6 +286,371 @@ object MyExtensions { (_: Seq[Expression]) => Literal(5, IntegerType)) } +case class CloseableColumnBatchIterator(itr: Iterator[ColumnarBatch], +f: ColumnarBatch => ColumnarBatch) extends Iterator[ColumnarBatch] { + var cb: ColumnarBatch = null + + private def closeCurrentBatch(): Unit = { +if (cb != null) { + cb.close + cb = null +} + } + + TaskContext.get().addTaskCompletionListener[Unit]((tc: TaskContext) => { +closeCurrentBatch() + }) + + override def hasNext: Boolean = { +closeCurrentBatch() +itr.hasNext + } + + override def next(): ColumnarBatch = { +closeCurrentBatch() +cb = f(itr.next()) +cb + } +} + +object NoCloseColumnVector extends Logging { + def wrapIfNeeded(cv: ColumnVector): NoCloseColumnVector = cv match { +case ref: NoCloseColumnVector => + ref +case vec => NoCloseColumnVector(vec) + } +} + +/** + * Provide a ColumnVector so ColumnarExpression can close temporary values without + * having to guess what type it really is. + */ +case class NoCloseColumnVector(wrapped: ColumnVector) extends ColumnVector(wrapped.dataType) { + private var refCount = 1 + + /** + * Don't actually close the ColumnVector this wraps. The producer of the vector will take + * care of that. 
+ */ + override def close(): Unit = { +// Empty + } + + override def hasNull: Boolean = wrapped.hasNull + + override def numNulls(): Int = wrapped.numNulls + + override def isNullAt(rowId: Int): Boolean = wrapped.isNullAt(rowId) + + override def getBoolean(rowId: Int): Boolean = wrapped.getBoolean(rowId) + + override def getByte(rowId: Int): Byte = wrapped.getByte(rowId) + + override def getShort(rowId: Int): Short = wrapped.getShort(rowId) + + override def getInt(rowId: Int): Int = wrapped.getInt(rowId) + + override def getLong(rowId: Int): Long = wrapped.getLong(rowId) + + override def getFloat(rowId: Int): Float = wrapped.getFloat(rowId) + + override def getDouble(rowId: Int): Double = wrapped.getDouble(rowId) + + override def getArray(rowId: Int): ColumnarArray = wrapped.getArray(rowId) + + override def getMap(ordinal: Int): ColumnarMap = wrapped.getMap(ordinal) + + override def getDecimal(rowId: Int, precision: Int, scale: Int): Decimal = +wrapped.getDecimal(rowId, precision, scale) + + override def getUTF8String(rowId: Int): UTF8String = wrapped.getUTF8String(rowId) + + override def getBinary(rowId: Int): Array[Byte] = wrapped.getBinary(rowId) + + override protected def getChild(ordinal: Int): ColumnVector = wrapped.getChild(ordinal) +} + +trait ColumnarExpression extends Expression with Serializable { + /** + * Returns true if this expression supports columnar processing through [[columnarEval]]. + */ + def supportsColumnar: Boolean = true + + /** + * Returns the result of evaluating this expression on the entire + * [[org.apache.spark.sql.vectorized.ColumnarBatch]]. The result of + * calling this may be a single [[org.apache.spark.sql.vectorized.ColumnVector]] or a scalar + * value. Scalar values typically happen if they are a part of the expression i.e. col("a") + 100. 
Review comment: I understand that this is in a test, not an API, but other people may look at this test to learn how to implement a columnar operator, and I feel the current example is not that good. IIUC, the goal is: 1. users can write a rule to replace an arbitrary SQL operator with a custom optimized columnar version 2. Spark automatically inserts column-to-row and row-to-column operators around the columnar operator. For 1, I think a pretty simple approach is to take in an expression tree and compile it to a columnar processor that can execute the expression tree in a columnar fashion. We don't need to create a `ColumnarExpression`.
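For readers using the quoted diff as a reference, the close-the-previous-batch pattern of `CloseableColumnBatchIterator` can be isolated from Spark as in the sketch below. It is a minimal illustration, assuming any `AutoCloseable` resource in place of `ColumnarBatch`; `ClosePreviousIterator` and `closeCurrent` are hypothetical names, and the task-completion hook from the diff is left to the caller.

```scala
// Hypothetical helper, not part of Spark: closes the previously returned
// resource before advancing, mirroring the pattern in the quoted diff.
final class ClosePreviousIterator[A <: AutoCloseable](underlying: Iterator[A])
    extends Iterator[A] {
  private var current: Option[A] = None

  // Public so a caller can also invoke it from an end-of-task callback.
  def closeCurrent(): Unit = {
    current.foreach(_.close())
    current = None
  }

  override def hasNext: Boolean = {
    closeCurrent() // the last element is released as soon as we look ahead
    underlying.hasNext
  }

  override def next(): A = {
    closeCurrent()
    val a = underlying.next()
    current = Some(a)
    a
  }
}
```

One consequence of this design is that an element returned by `next()` is only valid until the following `hasNext` or `next()` call, so consumers must not hold on to earlier elements.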
[GitHub] [spark] AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507101280 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107057/ Test FAILed.
[GitHub] [spark] HyukjinKwon commented on issue #24946: [SPARK-27234][SS][PYTHON] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs)
HyukjinKwon commented on issue #24946: [SPARK-27234][SS][PYTHON] Use InheritableThreadLocal for current epoch in EpochTracker (to support Python UDFs) URL: https://github.com/apache/spark/pull/24946#issuecomment-507101455 gentle ping .. :-) ..
[GitHub] [spark] HyukjinKwon commented on issue #24958: [SPARK-28153][PYTHON] Use AtomicReference at InputFileBlockHolder (to support input_file_name with Python UDF)
HyukjinKwon commented on issue #24958: [SPARK-28153][PYTHON] Use AtomicReference at InputFileBlockHolder (to support input_file_name with Python UDF) URL: https://github.com/apache/spark/pull/24958#issuecomment-507101426 gentle ping .. :-) ..
[GitHub] [spark] AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507101274 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
SparkQA removed a comment on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507101082 **[Test build #107057 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/107057/testReport)** for PR 24983 at commit [`4c3e21d`](https://github.com/apache/spark/commit/4c3e21d5c739bd52b1c46675fcb06ef670cee6a9).
[GitHub] [spark] AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507101274 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder
AmplabJenkins commented on issue #24983: [SPARK-27714][SQL][CBO] Support Genetic Algorithm based join reorder URL: https://github.com/apache/spark/pull/24983#issuecomment-507101280 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/107057/ Test FAILed.