[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...
Github user ConeyLiu closed the pull request at: https://github.com/apache/spark/pull/20026 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...
Github user ConeyLiu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20026#discussion_r162810684

    --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala ---
    @@ -152,7 +153,7 @@ private class DiskBlockData(
         file: File,
         blockSize: Long) extends BlockData {

    -  override def toInputStream(): InputStream = new FileInputStream(file)
    +  override def toInputStream(): InputStream = new NioBufferedFileInputStream(file)
    --- End diff --

    > IIUC for network (netty) transmission, it uses zero copy sendFile, which is another path (toNetty).

    Thanks for explaining, I did not notice this before.
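The distinction raised in this comment, a buffered stream for readers that consume small chunks versus a channel transfer for bulk delivery, can be sketched with the JDK alone. This is only an illustration under stated assumptions: `BlockReadSketch`, `openBuffered`, `transferAll`, and `roundTrip` are hypothetical names, `BufferedInputStream` stands in for Spark's `NioBufferedFileInputStream`, and `FileChannel.transferTo` stands in for the netty `sendFile` path. Whether `transferTo` avoids user-space copies depends on the platform and on the target channel; with the in-memory sink used in `roundTrip` it does not.

```scala
import java.io.{BufferedInputStream, ByteArrayOutputStream, FileInputStream, InputStream}
import java.nio.channels.{Channels, FileChannel, WritableByteChannel}
import java.nio.file.{Files, Path, StandardOpenOption}

// Hypothetical helper object; none of these names come from Spark.
object BlockReadSketch {
  // Stream path: buffering helps callers that read many small chunks,
  // such as a deserializer pulling a few bytes at a time.
  def openBuffered(path: Path): InputStream =
    new BufferedInputStream(new FileInputStream(path.toFile))

  // Channel path: hand the whole file to a sink channel. transferTo may
  // transfer fewer bytes than requested, so loop until the end of the file.
  def transferAll(path: Path, sink: WritableByteChannel): Long = {
    val channel = FileChannel.open(path, StandardOpenOption.READ)
    try {
      val size = channel.size()
      var position = 0L
      while (position < size) {
        position += channel.transferTo(position, size - position, sink)
      }
      position
    } finally {
      channel.close()
    }
  }

  // Usage sketch: write bytes to a temp file and read them back through
  // the channel path into an in-memory sink.
  def roundTrip(bytes: Array[Byte]): Array[Byte] = {
    val tmp = Files.createTempFile("block-sketch", ".bin")
    try {
      Files.write(tmp, bytes)
      val out = new ByteArrayOutputStream()
      transferAll(tmp, Channels.newChannel(out))
      out.toByteArray
    } finally {
      Files.delete(tmp)
    }
  }
}
```

The two entry points are kept separate on purpose: a change to the stream path leaves the channel path untouched, which matches the observation that network transmission goes through a different method.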
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20340 Merged build finished. Test PASSed.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20340 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86434/ Test PASSed.
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20177 Merged build finished. Test FAILed.
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20177 **[Test build #86425 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86425/testReport)** for PR 20177 at commit [`77e4d6d`](https://github.com/apache/spark/commit/77e4d6db1d647db7a7b2c13c922bab0bdd3e53fc).
* This patch **fails Scala style tests**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20340 **[Test build #86434 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86434/testReport)** for PR 20340 at commit [`e4e6a96`](https://github.com/apache/spark/commit/e4e6a96a6bad5c1a35e546ee536e421f163b858f).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20177 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86425/ Test FAILed.
[GitHub] spark issue #20342: [SPARK-23170] Dump the statistics of effective runs of a...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20342 Did you forget to add `[SQL]` to the title?
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20177 **[Test build #86425 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86425/testReport)** for PR 20177 at commit [`77e4d6d`](https://github.com/apache/spark/commit/77e4d6db1d647db7a7b2c13c922bab0bdd3e53fc).
[GitHub] spark pull request #20342: [SPARK-23170] Dump the statistics of effective ru...
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20342#discussion_r162814241

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.rules
    +
    +import com.google.common.util.concurrent.AtomicLongMap
    +import scala.collection.JavaConverters._
    +
    +case class QueryExecutionMetering() {
    --- End diff --

    nit: `RuleExecutionMetering`? Or is the current name meant to cover metrics collected from sources other than `RuleExecutor`?
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Merged build finished. Test PASSed.
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86429/ Test PASSed.
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #86429 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86429/testReport)** for PR 19285 at commit [`9d7c52d`](https://github.com/apache/spark/commit/9d7c52d53d98b971ad1ad05c828855f7298b6058).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20340 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86433/ Test PASSed.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20340 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/66/ Test PASSed.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20340 **[Test build #86433 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86433/testReport)** for PR 20340 at commit [`53d0346`](https://github.com/apache/spark/commit/53d03464f940fe0297e16b5da423c156a23b51cb).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20338: [SPARK-11222][BUILD][PYTHON] python code style checker u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20338 Merged build finished. Test PASSed.
[GitHub] spark issue #20337: [SPARK-11630] [core] ClosureCleaner moved from warning t...
Github user srowen commented on the issue: https://github.com/apache/spark/pull/20337 I think that's fine, especially as Scala 2.12 and its different implementation of some closures trigger this a lot. You could use string interpolation while here.
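For readers unfamiliar with the suggestion, string interpolation replaces `+` concatenation with an `s"..."` literal. The snippet below is a minimal sketch: the message text and the names `InterpolationSketch`, `concatenated`, and `interpolated` are hypothetical, since the log line under review is not quoted in this thread.

```scala
// Hypothetical example; the real ClosureCleaner log message is not shown here.
object InterpolationSketch {
  // Before: the message is built with string concatenation.
  def concatenated(className: String): String =
    "Skipping closure cleaning for " + className

  // After: the same message built with the s-interpolator.
  def interpolated(className: String): String =
    s"Skipping closure cleaning for $className"
}
```

Both forms produce the same string, so the change is purely stylistic.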
[GitHub] spark pull request #20342: [SPARK-23170][SQL] Dump the statistics of effecti...
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20342#discussion_r162815863

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.rules
    +
    +import com.google.common.util.concurrent.AtomicLongMap
    +import scala.collection.JavaConverters._
    --- End diff --

    my bad? I thought the correct one was...
    ```
    import scala.collection.JavaConverters._
    import com.google.common.util.concurrent.AtomicLongMap
    ```
[GitHub] spark issue #20338: [SPARK-11222][BUILD][PYTHON] python code style checker u...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20338 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86423/ Test PASSed.
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20340 **[Test build #86435 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86435/testReport)** for PR 20340 at commit [`c24de9c`](https://github.com/apache/spark/commit/c24de9c354cff537828c414b4c64742cf14d6634).
[GitHub] spark issue #20338: [SPARK-11222][BUILD][PYTHON] python code style checker u...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20338 **[Test build #86423 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86423/testReport)** for PR 20338 at commit [`8873441`](https://github.com/apache/spark/commit/88734413c5e33803b7d5f8c0f81899ed1d3577ff).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark pull request #20342: [SPARK-23170][SQL] Dump the statistics of effecti...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20342#discussion_r162815832

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.rules
    +
    +import com.google.common.util.concurrent.AtomicLongMap
    +import scala.collection.JavaConverters._
    --- End diff --

    The order?
[GitHub] spark issue #20026: [SPARK-22838][Core] Avoid unnecessary copying of data
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20026 **[Test build #86428 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86428/testReport)** for PR 20026 at commit [`b5ab25a`](https://github.com/apache/spark/commit/b5ab25a79411eba8cded9cc2eee2d9d942e4b6a5).
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20340 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/67/ Test PASSed.
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/20342 LGTM except for one more minor comment.
[GitHub] spark pull request #20026: [SPARK-22838][Core] Avoid unnecessary copying of ...
Github user ConeyLiu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20026#discussion_r162803175

    --- Diff: core/src/main/scala/org/apache/spark/storage/DiskStore.scala ---
    @@ -152,7 +153,7 @@ private class DiskBlockData(
         file: File,
         blockSize: Long) extends BlockData {

    -  override def toInputStream(): InputStream = new FileInputStream(file)
    +  override def toInputStream(): InputStream = new NioBufferedFileInputStream(file)
    --- End diff --

    Hi @jerryshao, thanks for reviewing. This is inspired by #15408.

    > the returned `InputStream` will be deserialized in `BlockManager`

    This is not entirely correct. Sometimes we don't need to deserialize the stream, for example for network transmission. Also, this change does not add extra work to deserialization, but it reduces the cost of network-like delivery.
[GitHub] spark pull request #20342: [SPARK-23170][SQL] Dump the statistics of effecti...
Github user maropu commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20342#discussion_r162815751

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.rules
    +
    +import com.google.common.util.concurrent.AtomicLongMap
    +import scala.collection.JavaConverters._
    --- End diff --

    super nit: wrong order
[GitHub] spark issue #20341: [MINOR] [SQL] Test case cleanups for recent PRs
Github user mgaido91 commented on the issue: https://github.com/apache/spark/pull/20341 LGTM
[GitHub] spark issue #20344: [MINOR] Typo fixes
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20344 **[Test build #86447 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86447/testReport)** for PR 20344 at commit [`9fff0ed`](https://github.com/apache/spark/commit/9fff0ed104650f4e92ae87deb91381cd79ac5bfa).
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86446 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86446/testReport)** for PR 20342 at commit [`8716c6e`](https://github.com/apache/spark/commit/8716c6eccf3d606a2c5f6275b8b3a9b343b01393).
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20340 **[Test build #86434 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86434/testReport)** for PR 20340 at commit [`e4e6a96`](https://github.com/apache/spark/commit/e4e6a96a6bad5c1a35e546ee536e421f163b858f).
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Merged build finished. Test PASSed.
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/76/ Test PASSed.
[GitHub] spark issue #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20341 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/71/ Test PASSed.
[GitHub] spark issue #20344: [MINOR] Typo fixes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20344 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/77/ Test PASSed.
[GitHub] spark issue #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20341 Merged build finished. Test PASSed.
[GitHub] spark issue #20344: [MINOR] Typo fixes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20344 Merged build finished. Test PASSed.
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86442/ Test PASSed.
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86442 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86442/testReport)** for PR 20342 at commit [`eda94fe`](https://github.com/apache/spark/commit/eda94fe6b374128f847586c2869626a40670e05b).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86427/ Test FAILed.
[GitHub] spark issue #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20341 cc @cloud-fan
[GitHub] spark issue #20342: [SPARK-23170] Dump the statistics of effective runs of a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86442 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86442/testReport)** for PR 20342 at commit [`eda94fe`](https://github.com/apache/spark/commit/eda94fe6b374128f847586c2869626a40670e05b).
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19285 Merged build finished. Test FAILed.
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #86427 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86427/testReport)** for PR 19285 at commit [`a2b9513`](https://github.com/apache/spark/commit/a2b951358e88bbbc0335a7eabda19e36d6586904).
* This patch **fails to build**.
* This patch merges cleanly.
* This patch adds no public classes.
[GitHub] spark issue #20342: [SPARK-23170] Dump the statistics of effective runs of a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/73/ Test PASSed.
[GitHub] spark issue #20342: [SPARK-23170] Dump the statistics of effective runs of a...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Merged build finished. Test PASSed.
[GitHub] spark pull request #20342: [SPARK-23170][SQL] Dump the statistics of effecti...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20342#discussion_r162817354

    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/rules/QueryExecutionMetering.scala ---
    @@ -0,0 +1,90 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements. See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License. You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.rules
    +
    +import com.google.common.util.concurrent.AtomicLongMap
    +import scala.collection.JavaConverters._
    --- End diff --

    Yeah. https://github.com/databricks/scala-style-guide#imports
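The linked style guide groups imports as `java.*`/`javax.*`, then `scala.*`, then third-party libraries, then project classes, with a blank line between groups. The sketch below shows that layout using only JDK and Scala imports so that it compiles without Guava; `ImportOrderSketch` and `inc` are hypothetical names, and the third-party import from the diff appears only as a comment to mark where it would go.

```scala
// Group 1: java.* and javax.*
import java.util.concurrent.atomic.AtomicLong

// Group 2: scala.*
import scala.collection.mutable

// Group 3: third-party libraries would follow here, for example:
// import com.google.common.util.concurrent.AtomicLongMap

// Group 4: project classes (org.apache.spark.* in Spark) would come last.

// Hypothetical counter, present only so that the imports above are used.
object ImportOrderSketch {
  private val counts = mutable.Map.empty[String, AtomicLong]

  // Increment and return the counter registered under `name`.
  def inc(name: String): Long =
    counts.getOrElseUpdate(name, new AtomicLong()).incrementAndGet()
}
```

Under this ordering, the `scala.collection.JavaConverters._` import in the diff belongs before the Guava import, which is what the review comments ask for.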
[GitHub] spark pull request #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20341#discussion_r162812208

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/DataFrameJoinSuite.scala ---
    @@ -276,16 +277,14 @@ class DataFrameJoinSuite extends QueryTest with SharedSQLContext {

       test("SPARK-23087: don't throw Analysis Exception in CheckCartesianProduct when join condition " +
         "is false or null") {
    -    val df = spark.range(10)
    -    val dfNull = spark.range(10).select(lit(null).as("b"))
    -    val planNull = df.join(dfNull, $"id" === $"b", "left").queryExecution.analyzed
    -
    -    spark.sessionState.executePlan(planNull).optimizedPlan
    -
    -    val dfOne = df.select(lit(1).as("a"))
    -    val dfTwo = spark.range(10).select(lit(2).as("b"))
    -    val planFalse = dfOne.join(dfTwo, $"a" === $"b", "left").queryExecution.analyzed
    -
    -    spark.sessionState.executePlan(planFalse).optimizedPlan
    +    withSQLConf(SQLConf.CROSS_JOINS_ENABLED.key -> "false") {
    +      val df = spark.range(10)
    +      val dfNull = spark.range(10).select(lit(null).as("b"))
    +      df.join(dfNull, $"id" === $"b", "left").queryExecution.optimizedPlan
    +
    +      val dfOne = df.select(lit(1).as("a"))
    +      val dfTwo = spark.range(10).select(lit(2).as("b"))
    +      dfOne.join(dfTwo, $"a" === $"b", "left").queryExecution.optimizedPlan
    +    }
    --- End diff --

    cc @mariobriggs
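The rewritten test wraps its body in `withSQLConf(...)` so that the setting applies only while the block runs. The general shape of such a scoped-configuration helper can be sketched as follows. This is not Spark's implementation: `ConfScopeSketch`, `withConf`, the plain mutable map used as the configuration store, and the key `example.crossJoin.enabled` are all assumptions made for illustration.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a session configuration; not Spark's SQLConf.
object ConfScopeSketch {
  val conf: mutable.Map[String, String] = mutable.Map.empty

  // Apply the given key/value pairs, run `body`, then restore the previous
  // state of every touched key, even if `body` throws.
  def withConf[T](pairs: (String, String)*)(body: => T): T = {
    val previous = pairs.map { case (key, _) => key -> conf.get(key) }
    pairs.foreach { case (key, value) => conf(key) = value }
    try body
    finally previous.foreach {
      case (key, Some(old)) => conf(key) = old   // key existed: put the old value back
      case (key, None) => conf.remove(key)       // key was absent: remove it again
    }
  }
}
```

A caller would write `ConfScopeSketch.withConf("example.crossJoin.enabled" -> "false") { ... }`, and the setting reverts once the block exits, so one test cannot leak configuration into the next.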
[GitHub] spark pull request #20344: [MINOR] Typo fixes
GitHub user jaceklaskowski opened a pull request: https://github.com/apache/spark/pull/20344 [MINOR] Typo fixes ## What changes were proposed in this pull request? Typo fixes ## How was this patch tested? Local build / Doc-only changes You can merge this pull request into a Git repository by running: $ git pull https://github.com/jaceklaskowski/spark typo-fixes Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20344.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20344 commit 9fff0ed104650f4e92ae87deb91381cd79ac5bfa Author: Jacek Laskowski Date: 2018-01-21T17:59:26Z [MINOR] Typo fixes --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20341#discussion_r162812178 --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDAFSuite.scala --- @@ -49,8 +49,12 @@ class HiveUDAFSuite extends QueryTest with TestHiveSingleton with SQLTestUtils { } protected override def afterAll(): Unit = { -sql(s"DROP TEMPORARY FUNCTION IF EXISTS mock") -sql(s"DROP TEMPORARY FUNCTION IF EXISTS hive_max") +try { + sql(s"DROP TEMPORARY FUNCTION IF EXISTS mock") + sql(s"DROP TEMPORARY FUNCTION IF EXISTS hive_max") --- End diff -- Actually, these `DROP TEMPORARY FUNCTION` calls are unnecessary. However, it also seems fine to keep them, because we should clean up local objects after use. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20338: [SPARK-11222][BUILD][PYTHON] python code style ch...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20338#discussion_r162804815 --- Diff: dev/lint-python --- @@ -35,11 +35,9 @@ compile_status="${PIPESTATUS[0]}" # Get pep8 at runtime so that we don't rely on it being installed on the build server. #+ See: https://github.com/apache/spark/pull/1744#issuecomment-50982162 -#+ TODOs: -#+ - Download pep8 from PyPI. It's more "official". -PEP8_VERSION="1.7.0" +PEP8_VERSION="2.3.1" PEP8_SCRIPT_PATH="$SPARK_ROOT_DIR/dev/pep8-$PEP8_VERSION.py" -PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/jcrocholl/pep8/$PEP8_VERSION/pep8.py" +PEP8_SCRIPT_REMOTE_PATH="https://raw.githubusercontent.com/PyCQA/pycodestyle/$PEP8_VERSION/pycodestyle.py" --- End diff -- Shall we leave a note that `pep8` has been formally renamed to `pycodestyle`, to reduce confusion, and mention it in the PR description too? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20341#discussion_r162812196 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/test/TestHive.scala --- @@ -492,8 +492,7 @@ private[hive] class TestHiveSparkSession( protected val originalUDFs: JavaSet[String] = FunctionRegistry.getFunctionNames /** - * Resets the test instance by deleting any tables that have been created. - * TODO: also clear out UDFs, views, etc. --- End diff -- This TODO has been addressed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20341: [MINOR] [SQL] Test case cleanups by recent PRs
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20341 **[Test build #86440 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86440/testReport)** for PR 20341 at commit [`0e0cdb2`](https://github.com/apache/spark/commit/0e0cdb269ea88d1233babb898a6aee0981132c75). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #18906: [SPARK-21692][PYSPARK][SQL] Add nullability support to P...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18906 **[Test build #86445 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86445/testReport)** for PR 18906 at commit [`414545d`](https://github.com/apache/spark/commit/414545d4d9b73fa2981332efc5dd2b03966fcb2e). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20340 **[Test build #86433 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86433/testReport)** for PR 20340 at commit [`53d0346`](https://github.com/apache/spark/commit/53d03464f940fe0297e16b5da423c156a23b51cb). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #86427 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86427/testReport)** for PR 19285 at commit [`a2b9513`](https://github.com/apache/spark/commit/a2b951358e88bbbc0335a7eabda19e36d6586904). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20338: [SPARK-11222][BUILD][PYTHON] python code style ch...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20338#discussion_r162804771 --- Diff: dev/tox.ini --- @@ -13,7 +13,7 @@ # See the License for the specific language governing permissions and # limitations under the License. -[pep8] -ignore=E402,E731,E241,W503,E226 +[pycodestyle] +ignore=E402,E731,E241,W503,E226,E722,E741,E305 --- End diff -- So, do `E722,E741,E305` include the rule that validates the blank line before starting doctests, which is described in the JIRA? It seems they are: E722: "do not use bare except"; E741: "ambiguous variable name 'l'"; E305: "expected 2 blank lines after class or function definition". --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
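For readers following the `tox.ini` change, the newly ignored codes can be extracted mechanically. The following is only a minimal sketch: the helper name `ignored_codes` and the two inline INI strings are assumptions made for illustration (the strings copy the `ignore=` lines from the diff above), and it uses the standard-library `configparser` rather than anything from the Spark build scripts.

```python
import configparser

# Inline copies of the old and new sections from the diff (illustrative only).
OLD_INI = "[pep8]\nignore=E402,E731,E241,W503,E226\n"
NEW_INI = "[pycodestyle]\nignore=E402,E731,E241,W503,E226,E722,E741,E305\n"


def ignored_codes(ini_text, section):
    """Hypothetical helper: return the list of codes on the `ignore` line."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    return [code.strip() for code in parser.get(section, "ignore").split(",")]


old_codes = ignored_codes(OLD_INI, "pep8")
new_codes = ignored_codes(NEW_INI, "pycodestyle")

# Codes that the PR adds to the ignore list, in their original order.
newly_ignored = [code for code in new_codes if code not in old_codes]
print(newly_ignored)
```

The difference is exactly the three codes asked about in the comment, which makes it easy to look each one up in the pycodestyle documentation.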
[GitHub] spark pull request #20340: [SPARK-21293][SS][SPARKR] Add doc example for str...
GitHub user felixcheung opened a pull request: https://github.com/apache/spark/pull/20340 [SPARK-21293][SS][SPARKR] Add doc example for streaming join, dedup ## What changes were proposed in this pull request? streaming programming guide changes ## How was this patch tested? manually You can merge this pull request into a Git repository by running: $ git pull https://github.com/felixcheung/spark rstreamdoc Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20340.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20340 commit 53d03464f940fe0297e16b5da423c156a23b51cb Author: Felix Cheung Date: 2018-01-21T09:41:10Z doc --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20341: [MINOR] [SQL] Test case cleanups by recent PRs
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/20341 [MINOR] [SQL] Test case cleanups by recent PRs ## What changes were proposed in this pull request? Revert the unneeded test case changes we made in SPARK-23000. Also fix the test suites that do not call `super.afterAll()` in the local `afterAll`. The `afterAll()` of `TestHiveSingleton` is what actually resets the environment. ## How was this patch tested? N/A You can merge this pull request into a Git repository by running: $ git pull https://github.com/gatorsmile/spark testRelated Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20341.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20341 commit 563e499442011338f3be219f0e7642fea160bbcf Author: gatorsmile Date: 2018-01-21T12:55:22Z Revert "[SPARK-23000] Use fully qualified table names in HiveMetastoreCatalogSuite" This reverts commit c7572b79da0a29e502890d7618eaf805a1c9f474. commit e0390567c4b46108b6c3bf7144e547c095fb705f Author: gatorsmile Date: 2018-01-21T13:01:09Z fix commit 0e0cdb269ea88d1233babb898a6aee0981132c75 Author: gatorsmile Date: 2018-01-21T13:20:53Z fix --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20340: [SPARK-21293][SS][SPARKR] Add doc example for str...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20340 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user ConeyLiu commented on the issue: https://github.com/apache/spark/pull/19285 Thanks for reviewing. The code has been updated; please help review it. Thanks again. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86446/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20177 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #19285: [SPARK-22068][CORE]Reduce the duplicate code betw...
Github user ConeyLiu commented on a diff in the pull request: https://github.com/apache/spark/pull/19285#discussion_r162802949 --- Diff: core/src/main/scala/org/apache/spark/storage/memory/MemoryStore.scala --- @@ -261,37 +263,93 @@ private[spark] class MemoryStore( // If this task attempt already owns more unroll memory than is necessary to store the // block, then release the extra memory that will not be used. val excessUnrollMemory = unrollMemoryUsedByThisBlock - size - releaseUnrollMemoryForThisTask(MemoryMode.ON_HEAP, excessUnrollMemory) + releaseUnrollMemoryForThisTask(memoryMode, excessUnrollMemory) transferUnrollToStorage(size) true } } + if (enoughStorageMemory) { entries.synchronized { - entries.put(blockId, entry) + entries.put(blockId, createMemoryEntry()) } logInfo("Block %s stored as values in memory (estimated size %s, free %s)".format( blockId, Utils.bytesToString(size), Utils.bytesToString(maxMemory - blocksMemoryUsed))) Right(size) } else { assert(currentUnrollMemoryForThisTask >= unrollMemoryUsedByThisBlock, "released too much unroll memory") +Left(unrollMemoryUsedByThisBlock) + } +} else { + Left(unrollMemoryUsedByThisBlock) +} + } + + /** + * Attempt to put the given block in memory store as values. + * + * It's possible that the iterator is too large to materialize and store in memory. To avoid + * OOM exceptions, this method will gradually unroll the iterator while periodically checking + * whether there is enough free memory. If the block is successfully materialized, then the + * temporary unroll memory used during the materialization is "transferred" to storage memory, + * so we won't acquire more memory than is actually needed to store the block. + * + * @return in case of success, the estimated size of the stored data. In case of failure, return + * an iterator containing the values of the block. 
The returned iterator will be backed + * by the combination of the partially-unrolled block and the remaining elements of the + * original input iterator. The caller must either fully consume this iterator or call + * `close()` on it in order to free the storage memory consumed by the partially-unrolled + * block. + */ + private[storage] def putIteratorAsValues[T]( + blockId: BlockId, + values: Iterator[T], + classTag: ClassTag[T]): Either[PartiallyUnrolledIterator[T], Long] = { + +// Underlying vector for unrolling the block +var vector = new SizeTrackingVector[T]()(classTag) +var arrayValues: Array[T] = null +var preciseSize: Long = -1 + +def storeValue(value: T): Unit = { + vector += value +} + +def estimateSize(precise: Boolean): Long = { + if (precise) { +// We only call need the precise size after all values unrolled. +arrayValues = vector.toArray +preciseSize = SizeEstimator.estimate(arrayValues) +preciseSize + } else { +vector.estimateSize() + } +} + +def createMemoryEntry(): MemoryEntry[T] = { + // We successfully unrolled the entirety of this block + DeserializedMemoryEntry[T](arrayValues, preciseSize, classTag) +} + +putIterator(blockId, values, classTag, MemoryMode.ON_HEAP, storeValue, + estimateSize, createMemoryEntry) match { + case Right(storedSize) => Right(storedSize) + case Left(unrollMemoryUsedByThisBlock) => +// We ran out of space while unrolling the values for this block +val (unrolledIterator, size) = if (vector != null) { --- End diff -- updated, thanks. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20177 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86424/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20177 **[Test build #86424 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86424/testReport)** for PR 20177 at commit [`0e873e5`](https://github.com/apache/spark/commit/0e873e5fcfe0858d869d9cb9cf63597c6746b734). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86446 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86446/testReport)** for PR 20342 at commit [`8716c6e`](https://github.com/apache/spark/commit/8716c6eccf3d606a2c5f6275b8b3a9b343b01393). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20249: [SPARK-23057][SPARK-19235][SQL] SET LOCATION should chan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20249 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20249: [SPARK-23057][SPARK-19235][SQL] SET LOCATION should chan...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20249 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86420/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86443 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86443/testReport)** for PR 20342 at commit [`72e63f2`](https://github.com/apache/spark/commit/72e63f25f85e640a1c22c74ba70e800f8e391410). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20343 **[Test build #86448 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86448/testReport)** for PR 20343 at commit [`9ac04ed`](https://github.com/apache/spark/commit/9ac04edc5aa770fb04b9ad4c12de75fa6d4ac2c8). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20338: [SPARK-11222][BUILD][PYTHON] python code style checker u...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20338 This doesn't actually run the Python lint in the Jenkins build, though, because none of the changed files end with `.py`. Mind double-checking and applying the following fix too?

```diff
diff --git a/dev/run-tests.py b/dev/run-tests.py
index 7e6f7ff0603..6fac324972e 100755
--- a/dev/run-tests.py
+++ b/dev/run-tests.py
@@ -576,7 +576,10 @@ def main():
                 for f in changed_files):
         # run_java_style_checks()
         pass
-    if not changed_files or any(f.endswith(".py") for f in changed_files):
+    if not changed_files or any(f.endswith("lint-python")
+                                or f.endswith("tox.ini")
+                                or f.endswith(".py")
+                                for f in changed_files):
         run_python_style_checks()
     if not changed_files or any(f.endswith(".R") for f in changed_files):
         run_sparkr_style_checks()
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
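The condition proposed in the comment above can be exercised on its own. This is a minimal sketch, not the code in `dev/run-tests.py`: the function name `should_run_python_style_checks` and the constant `LINT_RELATED_SUFFIXES` are hypothetical, and the three `endswith` calls are folded into one by passing a tuple, which `str.endswith` accepts.

```python
# Suffixes taken from the suggested diff; the constant name is an assumption.
LINT_RELATED_SUFFIXES = (".py", "lint-python", "tox.ini")


def should_run_python_style_checks(changed_files):
    """Hypothetical helper mirroring the suggested condition.

    Returns True when no changed files are known (run everything) or when
    any changed file is a Python source or one of the lint configuration files.
    """
    return not changed_files or any(
        f.endswith(LINT_RELATED_SUFFIXES) for f in changed_files
    )


# A change touching only the lint script or tox.ini now triggers the checks.
print(should_run_python_style_checks(["dev/lint-python", "dev/tox.ini"]))
# A Scala-only change still skips them.
print(should_run_python_style_checks(["core/src/main/scala/Foo.scala"]))
```

With the original `.py`-only condition, the first call would have been false for a PR such as this one, which is the gap the comment points out.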
[GitHub] spark issue #20249: [SPARK-23057][SPARK-19235][SQL] SET LOCATION should chan...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20249 **[Test build #86420 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86420/testReport)** for PR 20249 at commit [`90c4980`](https://github.com/apache/spark/commit/90c49809886e2f487dc4c4dc6ba45aa16bae8933). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20177: [SPARK-22954][SQL] Fix the exception thrown by Analyze c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20177 **[Test build #86424 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86424/testReport)** for PR 20177 at commit [`0e873e5`](https://github.com/apache/spark/commit/0e873e5fcfe0858d869d9cb9cf63597c6746b734). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19285: [SPARK-22068][CORE]Reduce the duplicate code between put...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19285 **[Test build #86429 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86429/testReport)** for PR 19285 at commit [`9d7c52d`](https://github.com/apache/spark/commit/9d7c52d53d98b971ad1ad05c828855f7298b6058). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20340: [SPARK-21293][SS][SPARKR] Add doc example for streaming ...
Github user felixcheung commented on the issue: https://github.com/apache/spark/pull/20340 merged to master/2.3 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20343 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20343 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/78/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/74/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20342 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/86441/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170][SQL] Dump the statistics of effective runs...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86441 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86441/testReport)** for PR 20342 at commit [`e790ab9`](https://github.com/apache/spark/commit/e790ab9950aa3ed9a0662e4d10f9d8611ff8f1ee). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class QueryExecutionMetering() ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19993: [SPARK-22799][ML] Bucketizer should throw exception if s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19993 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19993: [SPARK-22799][ML] Bucketizer should throw exception if s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19993 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/69/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCD...
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20343#discussion_r162824441 --- Diff: sql/core/src/test/scala/org/apache/spark/sql/TPCDSQuerySuite.scala --- @@ -339,6 +340,30 @@ class TPCDSQuerySuite extends BenchmarkQueryTest { } } + val tpcdsQueriesV2_7_0 = Seq( +"q1", "q2", "q3", "q4", "q5", "q5a", "q6", "q7", "q8", "q9", "q10", "q10a", "q11", +"q12", "q13", "q14_1", "q14_2", "q14a_1", "q14a_2", "q15", "q16", "q17", "q18", "q18a", "q19", +"q20", "q21", "q22", "q22a", "q23_1", "q23_2", "q24_1", "q24_2", "q25", "q26", "q27", "q27a", +"q28", "q29", "q30", "q31", "q32", "q33", "q34", "q35", "q35a", "q36", "q36a", "q37", "q38", +"q39_1", "q39_2", "q40", "q41", "q42", "q43", "q44", "q45", "q46", "q47", "q48", "q49", +"q50", "q51", "q51a", "q52", "q53", "q54", "q55", "q56", "q57", "q58", "q59", +"q60", "q61", "q62", "q63", "q64", "q65", "q66", "q67", "q67a", "q68", "q69", +"q70", "q70a", "q71", "q72", "q73", "q74", "q75", "q76", "q77", "q77a", "q78", "q79", +"q80", "q80a", "q81", "q82", "q83", "q84", "q85", "q86", "q86a", "q87", "q88", "q89", +"q90", "q91", "q92", "q93", "q94", "q95", "q96", "q97", "q98", "q99") + + tpcdsQueriesV2_7_0.foreach { name => +val queryString = resourceToString(s"tpcds-v2.7.0/$name.sql", --- End diff -- @maropu, it's great to have v2.7. Could you check the schema too? For example, should we update [the following in the schema](https://github.com/apache/spark/pull/20343/files#diff-38fa80d1dc9860f07e135dd02d259269R247)?

```
-|`web_country` STRING, `web_gmt_offset` STRING, `web_tax_percentage` DECIMAL(5,2))
+|`web_country` STRING, `web_gmt_offset` DECIMAL(5,2), `web_tax_percentage` DECIMAL(5,2))
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20343: [SPARK-23167][SQL] Add TPCDS queries v2.7 in TPCDSQueryS...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20343 **[Test build #86448 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86448/testReport)** for PR 20343 at commit [`9ac04ed`](https://github.com/apache/spark/commit/9ac04edc5aa770fb04b9ad4c12de75fa6d4ac2c8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20087: [SPARK-21786][SQL] The 'spark.sql.parquet.compres...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20087#discussion_r162802793 --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/SaveAsHiveFile.scala --- @@ -55,18 +55,28 @@ private[hive] trait SaveAsHiveFile extends DataWritingCommand { customPartitionLocations: Map[TablePartitionSpec, String] = Map.empty, partitionAttributes: Seq[Attribute] = Nil): Set[String] = { -val isCompressed = hadoopConf.get("hive.exec.compress.output", "false").toBoolean +val isCompressed = + fileSinkConf.getTableInfo.getOutputFileFormatClassName.toLowerCase(Locale.ROOT) match { +case formatName if formatName.endsWith("orcoutputformat") => + // For ORC,"mapreduce.output.fileoutputformat.compress", + // "mapreduce.output.fileoutputformat.compress.codec", and + // "mapreduce.output.fileoutputformat.compress.type" + // have no impact because it uses table properties to store compression information. --- End diff -- Although this is the existing behavior, could you investigate how Hive behaves when `Parquet.Compress` is set? See https://issues.apache.org/jira/browse/HIVE-7858. Is it the same as ORC? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20337: [SPARK-11630] [core] ClosureCleaner moved from warning t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20337 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20337: [SPARK-11630] [core] ClosureCleaner moved from warning t...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20337 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/60/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20337: [SPARK-11630] [core] ClosureCleaner moved from warning t...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20337 **[Test build #86421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86421/testReport)** for PR 20337 at commit [`e978154`](https://github.com/apache/spark/commit/e97815442bc77ce46b401cd680e5340d4b3e60be). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19993: [SPARK-22799][ML] Bucketizer should throw exception if s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19993 **[Test build #86437 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86437/testReport)** for PR 19993 at commit [`8c162a3`](https://github.com/apache/spark/commit/8c162a335258fc320061bc00653ca1c3d0c13f24). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20342: [SPARK-23170] Dump the statistics of effective runs of a...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20342 **[Test build #86441 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/86441/testReport)** for PR 20342 at commit [`e790ab9`](https://github.com/apache/spark/commit/e790ab9950aa3ed9a0662e4d10f9d8611ff8f1ee). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org