[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20929 @HyukjinKwon I think this is your area, so could you double-check this? Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Merged build finished. Test PASSed.
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20929 I confirmed that the JSON tests pass locally, so I merged the fix into this PR. Sorry to bother you, but could you check again?
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/269/ Test PASSed.
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Merged build finished. Test PASSed.
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20929 **[Test build #92016 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92016/testReport)** for PR 20929 at commit [`4544433`](https://github.com/apache/spark/commit/4544433760bd70cff41aa8e8bb718e6de0e3b877).
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20929 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4163/ Test PASSed.
[GitHub] spark issue #21514: [SPARK-22860] [Core] - hide key password from linux ps l...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21514 Is this the only place where we need to hide the password? For example, what about the logging of properties in [SparkSubmitArguments](https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala)?
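To make the concern about logged properties concrete, here is a minimal sketch of masking sensitive values before a property list is printed. It is not taken from the PR or from Spark's code: the `Redaction` object, the `redact` helper, the placeholder string, and the key pattern are all hypothetical choices for illustration.

```scala
// Hypothetical helper (not Spark's API): mask the values of properties whose
// keys look sensitive before the properties are written to a log.
object Redaction {
  // Assumed pattern for illustration; a real deployment would make this configurable.
  private val SensitiveKey = "(?i)secret|password".r
  val Placeholder = "*********(redacted)"

  def redact(props: Seq[(String, String)]): Seq[(String, String)] =
    props.map { case (key, value) =>
      // Replace the value only when the key matches the sensitive pattern.
      if (SensitiveKey.findFirstIn(key).isDefined) (key, Placeholder) else (key, value)
    }
}
```

Under these assumptions, any code path that prints properties would call `Redaction.redact` first, so a key password never reaches the log output in clear text.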
[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21567 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92008/ Test PASSed.
[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21567 Merged build finished. Test PASSed.
[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21567 **[Test build #92008 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92008/testReport)** for PR 21567 at commit [`0dd57ea`](https://github.com/apache/spark/commit/0dd57ea193cc4c3282961cea63ecf36b1d6a7e95). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Merged build finished. Test PASSed.
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92007/ Test PASSed.
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21582 **[Test build #92007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92007/testReport)** for PR 21582 at commit [`654aa45`](https://github.com/apache/spark/commit/654aa4530b4b5de9888a895051e741e82eefdfe1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21534: [SPARK-24526][build][test-maven] Spaces in the build dir...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21534 Merged build finished. Test FAILed.
[GitHub] spark issue #21534: [SPARK-24526][build][test-maven] Spaces in the build dir...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21534 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92009/ Test FAILed.
[GitHub] spark issue #21534: [SPARK-24526][build][test-maven] Spaces in the build dir...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21534 **[Test build #92009 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92009/testReport)** for PR 21534 at commit [`bb12f3e`](https://github.com/apache/spark/commit/bb12f3e2ad74f9d4c89e1c7adab4d306fa87b101). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21581: [SPARK-24574][SQL] array_contains function deals with Co...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21581 **[Test build #92015 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92015/testReport)** for PR 21581 at commit [`28aa515`](https://github.com/apache/spark/commit/28aa51554f4c730fae3c8090ac3c268e1ddfa4f8).
[GitHub] spark issue #21581: [SPARK-24574][SQL] array_contains function deals with Co...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21581 ok to test
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92014/ Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92014/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/268/ Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4162/ Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/268/
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/268/
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92014 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92014/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24).
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92013/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92013/ Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21583 retest this please
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/267/ Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4161/ Test PASSed.
[GitHub] spark pull request #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21582#discussion_r195959171

    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcSerializer.scala ---
    @@ -223,6 +223,6 @@ class OrcSerializer(dataSchema: StructType) {
        * Return a Orc value object for the given Spark schema.
        */
       private def createOrcValue(dataType: DataType) = {
    -    OrcStruct.createValue(TypeDescription.fromString(dataType.catalogString))
    +    OrcStruct.createValue(TypeDescription.fromString(OrcFileFormat.getQuotedSchemaString(dataType)))
    --- End diff --

@dongjoon-hyun Thanks for explaining it.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92013 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92013/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24).
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21583 retest this please
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/266/ Test FAILed.
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20636 Hmm, is this build failure related to this PR? Other PRs seem to have passed the build.
[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...
Github user edwinalu commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r195957438

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala ---
    @@ -169,6 +182,31 @@ private[spark] class EventLoggingListener(

       // Events that trigger a flush
       override def onStageCompleted(event: SparkListenerStageCompleted): Unit = {
    +    if (shouldLogExecutorMetricsUpdates) {
    +      // clear out any previous attempts, that did not have a stage completed event
    +      val prevAttemptId = event.stageInfo.attemptNumber() - 1
    +      for (attemptId <- 0 to prevAttemptId) {
    +        liveStageExecutorMetrics.remove((event.stageInfo.stageId, attemptId))
    +      }
    +
    +      // log the peak executor metrics for the stage, for each live executor,
    +      // whether or not the executor is running tasks for the stage
    +      val accumUpdates = new ArrayBuffer[(Long, Int, Int, Seq[AccumulableInfo])]()
    +      val executorMap = liveStageExecutorMetrics.remove(
    +        (event.stageInfo.stageId, event.stageInfo.attemptNumber()))
    +      executorMap.foreach {
    +        executorEntry => {
    +          for ((executorId, peakExecutorMetrics) <- executorEntry) {
    +            val executorMetrics = new ExecutorMetrics(-1, peakExecutorMetrics.metrics)
    --- End diff --

The last timestamp seems like it wouldn't have enough information, since peaks for different metrics could occur at different times, and with different combinations of stages running. Only -1 would be logged. Right now it's writing out SparkListenerExecutorMetricsUpdate events, which contain ExecutorMetrics, which has a timestamp. Do you think the timestamp should be removed from ExecutorMetrics? It seems good to have the timestamp for when the metrics were gathered, but it's not being exposed at this point.

For both the history server and the live UI, the goal is to show the peak value of each metric for each executor. For the executors tab, this is the peak value of each metric over the lifetime of the executor. For the stages tab, this is the peak value of each metric for that executor while the stage is running. The executor could be processing tasks for other stages as well, if there are concurrent stages, or no tasks for this stage if it isn't assigned any, but the values are the peaks between the time the stage starts and the time it ends.

Can you describe how the stage-level metrics would work with the last timestamp for any peak metric? Would there be a check to see if the event is being read from the history log?
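The per-stage peak tracking described in this comment can be sketched roughly as follows. This is a minimal, self-contained illustration and not the PR's implementation: `PeakTracker` and its three methods are hypothetical names, metrics are modeled as a plain `Array[Long]`, and every update is assumed to carry the same number of metrics.

```scala
import scala.collection.mutable

// Hypothetical sketch: for each running stage attempt, track the peak value of
// each metric seen for each executor between stage start and stage end.
object PeakTracker {
  type StageKey = (Int, Int) // (stageId, attemptId)

  // stage attempt -> executorId -> per-metric peak values
  private val liveStagePeaks =
    mutable.HashMap.empty[StageKey, mutable.HashMap[String, Array[Long]]]

  def onStageSubmitted(stage: StageKey): Unit =
    liveStagePeaks.getOrElseUpdate(stage, mutable.HashMap.empty)

  // An update from one executor counts toward every stage that is currently
  // running, whether or not the executor has tasks for that stage.
  def onMetricsUpdate(executorId: String, metrics: Array[Long]): Unit =
    liveStagePeaks.values.foreach { perExecutor =>
      val peaks = perExecutor.getOrElseUpdate(executorId, metrics.clone())
      for (i <- metrics.indices) peaks(i) = math.max(peaks(i), metrics(i))
    }

  // On completion, return the peaks for this attempt and drop the state for it
  // and for any earlier attempts that never saw a completion event.
  def onStageCompleted(stage: StageKey): Map[String, List[Long]] = {
    val (stageId, attemptId) = stage
    (0 until attemptId).foreach(a => liveStagePeaks.remove((stageId, a)))
    liveStagePeaks.remove(stage)
      .map(_.map { case (id, peaks) => id -> peaks.toList }.toMap)
      .getOrElse(Map.empty)
  }
}
```

In this sketch no timestamp is stored at all, which matches the point that a single timestamp cannot describe peaks that were reached at different times for different metrics.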
[GitHub] spark issue #21579: [SPARK-24573][INFRA] Runs SBT checkstyle after the build...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/21579 ping @cloud-fan Shall we merge this to unblock the testing of other PRs?
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92012/ Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92012/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/265/ Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4160/ Test PASSed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/265/
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-make-spark-distribution-unified/265/
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21583 Merged build finished. Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21583 I don't know whether Jenkins builds the distribution with `--pip`; we'll find out from whether this run succeeds. Locally, this worked when I ran: `dev/make-distribution.sh --pip --tgz -Phadoop-2.7 -Pkubernetes`
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20636 **[Test build #92011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92011/testReport)** for PR 20636 at commit [`a134091`](https://github.com/apache/spark/commit/a134091aad0c3f8e3674f6cd751c2b8d5d83e39e). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92011/ Test FAILed.
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21583 **[Test build #92012 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92012/testReport)** for PR 21583 at commit [`2707dee`](https://github.com/apache/spark/commit/2707dee967fd6c4cebbe96cc7ae40feb5bfced24).
[GitHub] spark issue #21583: [SPARK-23984][K8S][Test] Added Integration Tests for PyS...
Github user ifilonenko commented on the issue: https://github.com/apache/spark/pull/21583 @ssuchter @holdenk @mccheah for review
[GitHub] spark pull request #21583: [SPARK-23984][K8S][Test] Added Integration Tests ...
GitHub user ifilonenko opened a pull request: https://github.com/apache/spark/pull/21583

[SPARK-23984][K8S][Test] Added Integration Tests for PySpark on Kubernetes

## What changes were proposed in this pull request?

I added integration tests for PySpark (+ checking JVM options + RemoteFileTest), which weren't properly merged in the initial integration test PR.

## How was this patch tested?

I tested this with integration tests using: `dev/dev-run-integration-tests.sh --spark-tgz spark-2.4.0-SNAPSHOT-bin-2.7.3.tgz`

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ifilonenko/spark master

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21583.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #21583

commit 2707dee967fd6c4cebbe96cc7ae40feb5bfced24
Author: Ilan Filonenko
Date: 2018-06-18T02:48:24Z

    Fixed Remote File tests and added PySpark tests
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Merged build finished. Test FAILed.
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21580 Merged build finished. Test PASSed.
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4159/ Test PASSed.
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21580 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/264/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21580 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21221: [SPARK-23429][CORE] Add executor memory metrics t...
Github user squito commented on a diff in the pull request: https://github.com/apache/spark/pull/21221#discussion_r195955081 --- Diff: core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala --- @@ -169,6 +182,31 @@ private[spark] class EventLoggingListener( // Events that trigger a flush override def onStageCompleted(event: SparkListenerStageCompleted): Unit = { +if (shouldLogExecutorMetricsUpdates) { + // clear out any previous attempts, that did not have a stage completed event + val prevAttemptId = event.stageInfo.attemptNumber() - 1 + for (attemptId <- 0 to prevAttemptId) { +liveStageExecutorMetrics.remove((event.stageInfo.stageId, attemptId)) + } + + // log the peak executor metrics for the stage, for each live executor, + // whether or not the executor is running tasks for the stage + val accumUpdates = new ArrayBuffer[(Long, Int, Int, Seq[AccumulableInfo])]() + val executorMap = liveStageExecutorMetrics.remove( +(event.stageInfo.stageId, event.stageInfo.attemptNumber())) + executorMap.foreach { + executorEntry => { + for ((executorId, peakExecutorMetrics) <- executorEntry) { +val executorMetrics = new ExecutorMetrics(-1, peakExecutorMetrics.metrics) --- End diff -- I can see how this would work, but it also seems far more confusing than necessary. My understanding was that you'd always log the last timestamp which replaced the peak value for *any* metric. Are you ever logging something other than -1 for the timestamp? If not, we just shouldn't put any timestamp in the log. It might be helpful to step back a bit and, rather than focusing on the mechanics of what you're doing now, discuss the desired end behavior in the history server and the live UI based on the timestamp. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
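To make the bookkeeping under discussion easier to follow, here is a minimal sketch in plain Scala, not the PR's code. It assumes metrics arrive as an `Array[Long]` per executor and keeps no timestamp at all, which is the simplification the comment hints at if the timestamp is always -1. `PeakMetricsSketch`, `update`, `liveStagePeaks`, and the method signatures are hypothetical names chosen for illustration.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the per-stage bookkeeping; not Spark's EventLoggingListener.
object PeakMetricsSketch {
  type StageKey = (Int, Int) // (stageId, attemptId)

  // stage attempt -> executorId -> peak value seen so far for each metric
  val liveStagePeaks =
    mutable.HashMap.empty[StageKey, mutable.HashMap[String, Array[Long]]]

  /** Fold a new metrics sample into the running peaks of one stage attempt. */
  def update(key: StageKey, executorId: String, sample: Array[Long]): Unit = {
    val perExecutor =
      liveStagePeaks.getOrElseUpdate(key, mutable.HashMap.empty[String, Array[Long]])
    val merged = perExecutor.get(executorId) match {
      case Some(peak) => peak.zip(sample).map { case (p, s) => math.max(p, s) }
      case None => sample.clone()
    }
    perExecutor(executorId) = merged
  }

  /** Drop earlier attempts of the stage, then remove and return this attempt's peaks. */
  def onStageCompleted(stageId: Int, attemptNumber: Int): Map[String, Seq[Long]] = {
    // Empty range for attempt 0, like `0 to prevAttemptId` with prevAttemptId = -1.
    for (attemptId <- 0 until attemptNumber) liveStagePeaks.remove((stageId, attemptId))
    liveStagePeaks.remove((stageId, attemptNumber))
      .map(_.map { case (id, peak) => id -> peak.toSeq }.toMap)
      .getOrElse(Map.empty[String, Seq[Long]])
  }
}
```

Because the returned value carries only per-metric peaks, whatever is written to the event log would need no placeholder timestamp; whether that matches the desired history server behavior is exactly the open question in the review.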
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21580 **[Test build #92010 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92010/testReport)** for PR 21580 at commit [`9a07ea3`](https://github.com/apache/spark/commit/9a07ea361eccfefe348db8a9b50acf3c68ec7a56). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20636 **[Test build #92011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92011/testReport)** for PR 20636 at commit [`a134091`](https://github.com/apache/spark/commit/a134091aad0c3f8e3674f6cd751c2b8d5d83e39e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/263/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21580: [SPARK-24575][SQL] Prohibit window expressions inside WH...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21580 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4158/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20636 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20636: [SPARK-23415][SQL][TEST] Make behavior of BufferHolderSp...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20636 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21581: [SPARK-24574][SQL] array_contains function deals with Co...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21581 BTW, I found three other similar issues:
```
scala> Seq((Seq(1, 2, 3), 2)).toDF("a", "b").selectExpr("array_position(a, b)").show
+--------------------+
|array_position(a, b)|
+--------------------+
|                   2|
+--------------------+

scala> Seq((Seq(1, 2, 3), 2)).toDF("a", "b").selectExpr("element_at(a, b)").show
+----------------+
|element_at(a, b)|
+----------------+
|               2|
+----------------+

scala> Seq((Seq(1, 2, 3), 2)).toDF("a", "b").selectExpr("array_remove(a, b)").show
+------------------+
|array_remove(a, b)|
+------------------+
|            [1, 3]|
+------------------+
```
I think this is a tiny fix, so IMHO this PR might need to address all of these issues. cc: @ueshin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
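For readers unfamiliar with the three functions in the comment above, the following plain-Scala sketch models their results on ordinary sequences. It is an illustration under assumptions, not Spark's implementation: `ArrayFunctionsSketch` and its method names are hypothetical, `Option` stands in for SQL NULL, and only positive indices are handled for the `element_at` model.

```scala
// Plain-Scala model of the three SQL functions; hypothetical helper names.
object ArrayFunctionsSketch {
  /** 1-based position of the first match, 0 when the value is absent. */
  def arrayPosition[T](xs: Seq[T], value: T): Long = xs.indexOf(value) + 1L

  /** 1-based lookup; None models a NULL result for an index past the end. */
  def elementAt[T](xs: Seq[T], index: Int): Option[T] = {
    require(index > 0, "this sketch only models positive, 1-based indices")
    xs.lift(index - 1)
  }

  /** Remove every element equal to `value`. */
  def arrayRemove[T](xs: Seq[T], value: T): Seq[T] = xs.filterNot(_ == value)
}
```

With `a = Seq(1, 2, 3)` and `b = 2`, the model gives 2, `Some(2)`, and `Seq(1, 3)`, in line with the REPL output quoted in the comment.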
[GitHub] spark pull request #21581: [SPARK-24574][SQL] array_contains function deals ...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21581#discussion_r195954291 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala --- @@ -3077,12 +3077,16 @@ object functions { // /** - * Returns null if the array is null, true if the array contains `value`, and false otherwise. + * Returns null if the array is null, true if the array contains `value` or the content of --- End diff -- Do we need to update this comment? I think `content of value` is a little ambiguous. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21582#discussion_r195950366 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcFileFormat.scala --- @@ -59,6 +59,19 @@ private[sql] object OrcFileFormat { def checkFieldNames(names: Seq[String]): Unit = { names.foreach(checkFieldName) } + + def getQuotedSchemaString(dataType: DataType): String = dataType match { +case _: AtomicType => dataType.catalogString +case StructType(fields) => + fields.map(f => s"`${f.name}`:${getQuotedSchemaString(f.dataType)}") +.mkString("struct<", ",", ">") +case ArrayType(elementType, _) => + s"array<${getQuotedSchemaString(elementType)}>" +case MapType(keyType, valueType, _) => + s"map<${getQuotedSchemaString(keyType)},${getQuotedSchemaString(valueType)}>" +case _ => // UDT and others + dataType.catalogString --- End diff -- We don't need to recursively quote `udt.sqlType`? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
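The question about `udt.sqlType` can be illustrated with a self-contained sketch. The types below are minimal stand-ins, not Spark's `DataType` hierarchy, and `QuotedSchemaSketch`, `Field`, and `quoted` are hypothetical names. The sketch shows one possible reading of the suggestion: recurse into a user-defined type's underlying SQL type so that field names nested inside it are quoted as well.

```scala
// Toy type hierarchy for illustration only; these are not Spark's classes.
object QuotedSchemaSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object StringType extends DataType
  final case class Field(name: String, dataType: DataType)
  final case class StructType(fields: Seq[Field]) extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class MapType(keyType: DataType, valueType: DataType) extends DataType
  // A user-defined type that is stored as `sqlType`.
  final case class UserDefinedType(sqlType: DataType) extends DataType

  /** Render a schema string with every struct field name wrapped in backticks. */
  def quoted(dataType: DataType): String = dataType match {
    case IntType => "int"
    case StringType => "string"
    case StructType(fields) =>
      fields.map(f => s"`${f.name}`:${quoted(f.dataType)}").mkString("struct<", ",", ">")
    case ArrayType(elementType) => s"array<${quoted(elementType)}>"
    case MapType(k, v) => s"map<${quoted(k)},${quoted(v)}>"
    // Recurse so that field names inside the UDT's storage type are quoted too.
    case UserDefinedType(sqlType) => quoted(sqlType)
  }
}
```

In this model a UDT backed by a struct with a field named `a.b` renders as ``struct<`a.b`:int>``, whereas a fallback that does not recurse would leave that dotted name unquoted.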
[GitHub] spark pull request #21288: [SPARK-24206][SQL] Improve FilterPushdownBenchmar...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21288#discussion_r195949711 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala --- @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import java.io.File + +import scala.util.{Random, Try} + +import org.apache.spark.SparkConf +import org.apache.spark.sql.{DataFrame, SparkSession} +import org.apache.spark.sql.functions.monotonically_increasing_id +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.util.{Benchmark, Utils} + + +/** + * Benchmark to measure read performance with Filter pushdown. + * To run this: + * spark-submit --class + */ +object FilterPushdownBenchmark { + val conf = new SparkConf() +.setAppName("FilterPushdownBenchmark") +// Since `spark.master` always exists, overrides this value +.set("spark.master", "local[1]") --- End diff -- I'm afraid that other developers might misunderstand how to use this:
```
spark-submit --master local[1] --class
spark-submit --master local[*] --class
```
In both cases, the benchmark always uses `local[1]`. Or are you suggesting a different point of view? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
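The override behavior described in the comment above can be modelled without Spark. The sketch below treats configuration as a plain `Map` in which command-line settings arrive first; `MasterOverrideSketch`, `withSet`, and `withSetIfMissing` are hypothetical names. The second helper mirrors the idea behind `SparkConf.setIfMissing`, shown only as a contrast and not as what the PR does.

```scala
// Toy model: command-line settings are loaded first, then the program's own calls apply.
object MasterOverrideSketch {
  /** Unconditional assignment, like `.set("spark.master", "local[1]")` in the diff. */
  def withSet(fromCommandLine: Map[String, String]): Map[String, String] =
    fromCommandLine + ("spark.master" -> "local[1]")

  /** Assignment that keeps a value already supplied on the command line. */
  def withSetIfMissing(fromCommandLine: Map[String, String]): Map[String, String] =
    if (fromCommandLine.contains("spark.master")) fromCommandLine
    else fromCommandLine + ("spark.master" -> "local[1]")
}
```

Under the first helper, passing `--master local[*]` has no effect on the result, which is the source of the possible misunderstanding raised in the comment.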
[GitHub] spark issue #21534: [SPARK-24526][build][test-maven] Spaces in the build dir...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21534 **[Test build #92009 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92009/testReport)** for PR 21534 at commit [`bb12f3e`](https://github.com/apache/spark/commit/bb12f3e2ad74f9d4c89e1c7adab4d306fa87b101). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21288: [SPARK-24206][SQL] Improve FilterPushdownBenchmar...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21288#discussion_r195948346 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala --- @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import java.io.File + +import scala.util.{Random, Try} + +import org.apache.spark.SparkConf +import org.apache.spark.sql.{DataFrame, SparkSession} +import org.apache.spark.sql.functions.monotonically_increasing_id +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.util.{Benchmark, Utils} + + +/** + * Benchmark to measure read performance with Filter pushdown. + * To run this: + * spark-submit --class + */ +object FilterPushdownBenchmark { + val conf = new SparkConf() +.setAppName("FilterPushdownBenchmark") +// Since `spark.master` always exists, overrides this value +.set("spark.master", "local[1]") --- End diff -- What I mean is adding `--master local[1]` at line 34, too. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21534: [SPARK-24526][build][test-maven] Spaces in the build dir...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21534 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21567 **[Test build #92008 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92008/testReport)** for PR 21567 at commit [`0dd57ea`](https://github.com/apache/spark/commit/0dd57ea193cc4c3282961cea63ecf36b1d6a7e95). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21567: [SPARK-24560][CORE][MESOS] Fix some getTimeAsMs as getTi...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21567 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21575: [SPARK-24566][CORE] spark.storage.blockManagerSlaveTimeo...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21575 cc: @jiangxb1987 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21571: [SPARK-24565][SS] Add API for in Structured Streaming fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21571 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92005/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21571: [SPARK-24565][SS] Add API for in Structured Streaming fo...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21571 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21571: [SPARK-24565][SS] Add API for in Structured Streaming fo...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21571 **[Test build #92005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92005/testReport)** for PR 21571 at commit [`9062fb9`](https://github.com/apache/spark/commit/9062fb9053b67d59a1f2357adc28a705bf9ba713). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/21582#discussion_r195947273 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcSerializer.scala --- @@ -223,6 +223,6 @@ class OrcSerializer(dataSchema: StructType) { * Return a Orc value object for the given Spark schema. */ private def createOrcValue(dataType: DataType) = { - OrcStruct.createValue(TypeDescription.fromString(dataType.catalogString)) + OrcStruct.createValue(TypeDescription.fromString(OrcFileFormat.getQuotedSchemaString(dataType))) --- End diff -- Thank you for the review, @viirya. ORC 1.5 checks field name syntax more strictly, for example when a field name contains a dot. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21582 **[Test build #92007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92007/testReport)** for PR 21582 at commit [`654aa45`](https://github.com/apache/spark/commit/654aa4530b4b5de9888a895051e741e82eefdfe1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/262/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/4157/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21288: [SPARK-24206][SQL] Improve FilterPushdownBenchmar...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21288#discussion_r195946683 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala --- @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import java.io.File + +import scala.util.{Random, Try} + +import org.apache.spark.SparkConf +import org.apache.spark.sql.{DataFrame, SparkSession} +import org.apache.spark.sql.functions.monotonically_increasing_id +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.util.{Benchmark, Utils} + + +/** + * Benchmark to measure read performance with Filter pushdown. + * To run this: + * spark-submit --class + */ +object FilterPushdownBenchmark { + val conf = new SparkConf() +.setAppName("FilterPushdownBenchmark") +// Since `spark.master` always exists, overrides this value +.set("spark.master", "local[1]") --- End diff -- btw, I updated the description. Thanks! --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21288: [SPARK-24206][SQL] Improve FilterPushdownBenchmar...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21288#discussion_r195946600 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/benchmark/FilterPushdownBenchmark.scala --- @@ -0,0 +1,442 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one or more + * contributor license agreements. See the NOTICE file distributed with + * this work for additional information regarding copyright ownership. + * The ASF licenses this file to You under the Apache License, Version 2.0 + * (the "License"); you may not use this file except in compliance with + * the License. You may obtain a copy of the License at + * + *http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.spark.sql.execution.benchmark + +import java.io.File + +import scala.util.{Random, Try} + +import org.apache.spark.SparkConf +import org.apache.spark.sql.{DataFrame, SparkSession} +import org.apache.spark.sql.functions.monotonically_increasing_id +import org.apache.spark.sql.internal.SQLConf +import org.apache.spark.util.{Benchmark, Utils} + + +/** + * Benchmark to measure read performance with Filter pushdown. + * To run this: + * spark-submit --class + */ +object FilterPushdownBenchmark { + val conf = new SparkConf() +.setAppName("FilterPushdownBenchmark") +// Since `spark.master` always exists, overrides this value +.set("spark.master", "local[1]") --- End diff -- In the current PR, we cannot use `spark.master` in command line options. Are you suggesting that we drop `.set("spark.master", "local[1]")` and always set `spark.master` in the options for this benchmark? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20929: [SPARK-23772][SQL] Provide an option to ignore column of...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20929 ok, I'll check --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/92006/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21582 **[Test build #92006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/92006/testReport)** for PR 21582 at commit [`60e461e`](https://github.com/apache/spark/commit/60e461ee78e0b601e3f7bf7927730e0dabc234ef). * This patch **fails build dependency tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21582 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.1
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/21582#discussion_r195945989 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/orc/OrcSerializer.scala --- @@ -223,6 +223,6 @@ class OrcSerializer(dataSchema: StructType) { * Return a Orc value object for the given Spark schema. */ private def createOrcValue(dataType: DataType) = { - OrcStruct.createValue(TypeDescription.fromString(dataType.catalogString)) + OrcStruct.createValue(TypeDescription.fromString(OrcFileFormat.getQuotedSchemaString(dataType))) --- End diff -- Why this change? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org