[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20482 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20482 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87023/ Test PASSed.
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20482 **[Test build #87023 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87023/testReport)** for PR 20482 at commit [`6042523`](https://github.com/apache/spark/commit/6042523d54dbebceb80ebd4b180bd9b73c5bd3ed). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Merged build finished. Test PASSed.
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20482 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87021/ Test PASSed.
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20482 Merged build finished. Test PASSed.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87022/ Test PASSed.
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20482 **[Test build #87021 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87021/testReport)** for PR 20482 at commit [`6042523`](https://github.com/apache/spark/commit/6042523d54dbebceb80ebd4b180bd9b73c5bd3ed). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20479 **[Test build #87022 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87022/testReport)** for PR 20479 at commit [`9a2f640`](https://github.com/apache/spark/commit/9a2f640f3e695970c0c4ffe93d6fe978c4013ed1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20460: [SPARK-23285][K8S] Allow fractional values for spark.exe...
Github user liyinan926 commented on the issue: https://github.com/apache/spark/pull/20460 Agreed. This is a fundamental change to the way Spark handles task scheduling, task parallelism, dynamic resource allocation, etc., and it impacts every scheduler backend. I'm closing this PR for now. Thanks for reviewing and giving feedback!
[GitHub] spark pull request #20460: [SPARK-23285][K8S] Allow fractional values for sp...
Github user liyinan926 closed the pull request at: https://github.com/apache/spark/pull/20460
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20495 Merged build finished. Test FAILed.
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20495 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87026/ Test FAILed.
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20495 **[Test build #87026 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87026/testReport)** for PR 20495 at commit [`9ecc809`](https://github.com/apache/spark/commit/9ecc809056800058cc95a1341fd9b85fa247867f). * This patch **fails to build**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20495 **[Test build #87026 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87026/testReport)** for PR 20495 at commit [`9ecc809`](https://github.com/apache/spark/commit/9ecc809056800058cc95a1341fd9b85fa247867f).
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20495 Merged build finished. Test PASSed.
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20495 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/560/ Test PASSed.
[GitHub] spark issue #20495: [SPARK-23327] [SQL] Update the description of three exte...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20495 cc @srinathshankar @rxin @cloud-fan
[GitHub] spark pull request #20495: [SPARK-23327] [SQL] Update the description of thr...
GitHub user gatorsmile opened a pull request: https://github.com/apache/spark/pull/20495
[SPARK-23327] [SQL] Update the description of three external API or functions
## What changes were proposed in this pull request?
Update the descriptions of three external APIs or functions: `createFunction`, `length` and `repartitionByRange`.
## How was this patch tested?
N/A
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/gatorsmile/spark updateFunc
Alternatively you can review and apply these changes as the patch at:
    https://github.com/apache/spark/pull/20495.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:
    This closes #20495
commit 48f552cb087b0f8e3a87d43191450474734cae06
Author: gatorsmile
Date: 2018-02-03T04:35:59Z
    update create function.
commit 21ea233b6ef44bac4aefe13ae9014badac3450b1
Author: gatorsmile
Date: 2018-02-03T06:51:03Z
    update the comment for repartitionByRange
commit 9ecc809056800058cc95a1341fd9b85fa247867f
Author: gatorsmile
Date: 2018-02-03T07:24:08Z
    update the comment of length
[GitHub] spark pull request #20484: [SPARK-23313][DOC] Add a migration guide for ORC
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20484#discussion_r165806324
--- Diff: docs/sql-programming-guide.md ---
@@ -1776,6 +1776,42 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 ## Upgrading From Spark SQL 2.2 to 2.3
+ - Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files and Hive ORC tables. To do that, the following configurations are newly added or change their default values.
+
+  - New configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.impl | native | The name of ORC implementation. It can be one of native and hive. native means the native ORC support that is built on Apache ORC 1.4.1. `hive` means the ORC library in Hive 1.2.1 which is used prior to Spark 2.3.
+    spark.sql.orc.enableVectorizedReader | true | Enables vectorized orc decoding in native implementation. If false, a new non-vectorized ORC reader is used in native implementation. For hive implementation, this is ignored.
+
+  - Changed configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.filterPushdown | true | Enables filter pushdown for ORC files. It is false by default prior to Spark 2.3.
+    spark.sql.hive.convertMetastoreOrc | true | Enable the Spark's ORC support, which can be configured by spark.sql.orc.impl, instead of Hive SerDe when reading from and writing to Hive ORC tables. It is false by default prior to Spark 2.3.
+
+ - Since Apache ORC 1.4.1 is a standalone library providing a subset of Hive ORC related configurations, you can use ORC configuration name and Hive configuration name. To see a full list of supported ORC configurations, see OrcConf.java (https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/OrcConf.java).
--- End diff --
You can do a search. We need to improve our ORC test coverage for sure.
If possible, please add test cases to check whether both `orc.stripe.size` and `hive.exec.orc.default.stripe.size` work for both of Spark's ORC readers. We also need the same tests to check whether `hive.exec.orc.default.stripe.size` works for Hive serde tables. To ensure the correctness of the documentation, I hope we can at least submit a PR for testing them before merging this PR?
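The dual-name behaviour under discussion (an ORC configuration name alongside a Hive configuration name) can be illustrated with a minimal, self-contained Java sketch. This is not Spark or ORC code: `DualNameConf`, the plain `Map` standing in for a configuration object, the lookup order (ORC name first, then Hive name), and the 64 MB default are all assumptions made for illustration. Whether Spark's two ORC readers actually honour both names is exactly what the requested test cases would need to establish.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dual-name configuration lookup; not the ORC or Spark API.
final class DualNameConf {
    // The two key names come from the review thread.
    static final String ORC_KEY = "orc.stripe.size";
    static final String HIVE_KEY = "hive.exec.orc.default.stripe.size";
    // Assumed default for this sketch only.
    static final long DEFAULT_STRIPE_SIZE = 64L * 1024 * 1024;

    // Assumed precedence: the ORC name wins, the Hive name is the fallback.
    static long stripeSize(Map<String, String> conf) {
        String value = conf.get(ORC_KEY);
        if (value == null) {
            value = conf.get(HIVE_KEY);
        }
        return value == null ? DEFAULT_STRIPE_SIZE : Long.parseLong(value);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        conf.put(HIVE_KEY, "2048");
        System.out.println(stripeSize(conf)); // only the Hive name is set
        conf.put(ORC_KEY, "1024");
        System.out.println(stripeSize(conf)); // both names are set
    }
}
```

A real test would set each key in turn, write an ORC file through each reader/writer path, and inspect the resulting stripe layout rather than a map lookup.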
[GitHub] spark pull request #20484: [SPARK-23313][DOC] Add a migration guide for ORC
Github user dongjoon-hyun commented on a diff in the pull request: https://github.com/apache/spark/pull/20484#discussion_r165805925
--- Diff: docs/sql-programming-guide.md ---
@@ -1776,6 +1776,42 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 ## Upgrading From Spark SQL 2.2 to 2.3
+ - Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files and Hive ORC tables. To do that, the following configurations are newly added or change their default values.
+
+  - New configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.impl | native | The name of ORC implementation. It can be one of native and hive. native means the native ORC support that is built on Apache ORC 1.4.1. `hive` means the ORC library in Hive 1.2.1 which is used prior to Spark 2.3.
+    spark.sql.orc.enableVectorizedReader | true | Enables vectorized orc decoding in native implementation. If false, a new non-vectorized ORC reader is used in native implementation. For hive implementation, this is ignored.
+
+  - Changed configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.filterPushdown | true | Enables filter pushdown for ORC files. It is false by default prior to Spark 2.3.
+    spark.sql.hive.convertMetastoreOrc | true | Enable the Spark's ORC support, which can be configured by spark.sql.orc.impl, instead of Hive SerDe when reading from and writing to Hive ORC tables. It is false by default prior to Spark 2.3.
+
+ - Since Apache ORC 1.4.1 is a standalone library providing a subset of Hive ORC related configurations, you can use ORC configuration name and Hive configuration name. To see a full list of supported ORC configurations, see OrcConf.java (https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/OrcConf.java).
--- End diff --
It's possible in another PR. BTW, about the test coverage:
- Do you want to see specifically `orc.stripe.size` and `hive.exec.orc.default.stripe.size` only?
- Did we have test coverage for the old Hive ORC code path before?
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20359 **[Test build #87025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87025/testReport)** for PR 20359 at commit [`52e6f19`](https://github.com/apache/spark/commit/52e6f19a660be4d6a1589d42a452884a8caf26ac).
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Merged build finished. Test PASSed.
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/559/ Test PASSed.
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20359 It's a well-known, unrelated failure:
- org.apache.spark.sql.kafka010.KafkaContinuousSourceStressForDontFailOnDataLossSuite.stress test for failOnDataLoss=false
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20359 Retest this please.
[GitHub] spark issue #18555: [SPARK-21353][CORE]add checkValue in spark.internal.conf...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/18555 **[Test build #87024 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87024/testReport)** for PR 18555 at commit [`c213be6`](https://github.com/apache/spark/commit/c213be69ae545cc360a9480969c1aa02d1e51e39).
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87020/ Test FAILed.
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Merged build finished. Test FAILed.
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20359 **[Test build #87020 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87020/testReport)** for PR 20359 at commit [`52e6f19`](https://github.com/apache/spark/commit/52e6f19a660be4d6a1589d42a452884a8caf26ac). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request #20486: [SPARK-23317][SQL] rename ContinuousReader.setOff...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20486
[GitHub] spark issue #20486: [SPARK-23317][SQL] rename ContinuousReader.setOffset to ...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20486 Thanks! Merged to master/2.3
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20482 **[Test build #87023 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87023/testReport)** for PR 20482 at commit [`6042523`](https://github.com/apache/spark/commit/6042523d54dbebceb80ebd4b180bd9b73c5bd3ed).
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20482 **[Test build #87021 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87021/testReport)** for PR 20482 at commit [`6042523`](https://github.com/apache/spark/commit/6042523d54dbebceb80ebd4b180bd9b73c5bd3ed).
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20479 **[Test build #87022 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87022/testReport)** for PR 20479 at commit [`9a2f640`](https://github.com/apache/spark/commit/9a2f640f3e695970c0c4ffe93d6fe978c4013ed1).
[GitHub] spark issue #20482: [SPARK-23311][SQL][TEST]add FilterFunction test case for...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20482 retest this please
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/558/ Test PASSed.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Merged build finished. Test PASSed.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20479 retest this please
[GitHub] spark pull request #20491: [SQL] Minor doc update: Add an example in DataFra...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20491
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Merged build finished. Test FAILed.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20479 **[Test build #87019 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87019/testReport)** for PR 20479 at commit [`9a2f640`](https://github.com/apache/spark/commit/9a2f640f3e695970c0c4ffe93d6fe978c4013ed1). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87019/ Test FAILed.
[GitHub] spark issue #20491: [SQL] Minor doc update: Add an example in DataFrameReade...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20491 Thanks! Merged to master/2.3
[GitHub] spark pull request #20484: [SPARK-23313][DOC] Add a migration guide for ORC
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/20484#discussion_r165804642
--- Diff: docs/sql-programming-guide.md ---
@@ -1776,6 +1776,42 @@ working with timestamps in `pandas_udf`s to get the best performance, see
 ## Upgrading From Spark SQL 2.2 to 2.3
+ - Since Spark 2.3, Spark supports a vectorized ORC reader with a new ORC file format for ORC files and Hive ORC tables. To do that, the following configurations are newly added or change their default values.
+
+  - New configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.impl | native | The name of ORC implementation. It can be one of native and hive. native means the native ORC support that is built on Apache ORC 1.4.1. `hive` means the ORC library in Hive 1.2.1 which is used prior to Spark 2.3.
+    spark.sql.orc.enableVectorizedReader | true | Enables vectorized orc decoding in native implementation. If false, a new non-vectorized ORC reader is used in native implementation. For hive implementation, this is ignored.
+
+  - Changed configurations
+
+    Property Name | Default | Meaning
+    spark.sql.orc.filterPushdown | true | Enables filter pushdown for ORC files. It is false by default prior to Spark 2.3.
+    spark.sql.hive.convertMetastoreOrc | true | Enable the Spark's ORC support, which can be configured by spark.sql.orc.impl, instead of Hive SerDe when reading from and writing to Hive ORC tables. It is false by default prior to Spark 2.3.
+
+ - Since Apache ORC 1.4.1 is a standalone library providing a subset of Hive ORC related configurations, you can use ORC configuration name and Hive configuration name. To see a full list of supported ORC configurations, see OrcConf.java (https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/OrcConf.java).
--- End diff --
You mean these Hive confs work for our native readers? Could you add test cases for them?
[GitHub] spark issue #19788: [SPARK-9853][Core] Optimize shuffle fetch of contiguous ...
Github user wangyum commented on the issue: https://github.com/apache/spark/pull/19788 Thanks @yucai, it's a great improvement for many output files. The figures below show our comparison: **Before**: https://user-images.githubusercontent.com/5399861/35762292-6b5f9f88-08cf-11e8-8aa5-0d10e4282599.png **After**: https://user-images.githubusercontent.com/5399861/35762790-9be2e468-08d8-11e8-8403-2f85993eee9d.png
[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20493 Merged build finished. Test PASSed.
[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20493 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87015/ Test PASSed.
[GitHub] spark issue #20493: [SPARK-23326][WEBUI]schedulerDelay should return 0 when ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20493 **[Test build #87015 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87015/testReport)** for PR 20493 at commit [`7889fb0`](https://github.com/apache/spark/commit/7889fb0e5e4515ade35c2a07703017e16ee6194a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20492: [SPARK-23310][CORE] Turn off read ahead input stream for...
Github user sameeragarwal commented on the issue: https://github.com/apache/spark/pull/20492 Just a minor comment, LGTM. Thanks!
[GitHub] spark pull request #20492: [SPARK-23310][CORE] Turn off read ahead input str...
Github user sameeragarwal commented on a diff in the pull request: https://github.com/apache/spark/pull/20492#discussion_r165804011
--- Diff: core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeSorterSpillReader.java ---
@@ -77,7 +77,7 @@ public UnsafeSorterSpillReader(
       SparkEnv.get().conf().getDouble("spark.unsafe.sorter.spill.read.ahead.fraction", 0.5);
     final boolean readAheadEnabled = SparkEnv.get() != null &&
-      SparkEnv.get().conf().getBoolean("spark.unsafe.sorter.spill.read.ahead.enabled", true);
+      SparkEnv.get().conf().getBoolean("spark.unsafe.sorter.spill.read.ahead.enabled", false);
--- End diff --
Can we add a comment here to add more context? Perhaps also link a JIRA/TODO to re-enable this in 2.4?
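To make the effect of the one-word change in this diff concrete, here is a minimal sketch of the default-flag pattern. It assumes a plain `Map` in place of Spark's configuration object; `ReadAheadFlag` and its helper methods are hypothetical names, and a `null` map models the `SparkEnv.get() != null` guard. With the default flipped to `false`, read-ahead stays off unless the key is set explicitly.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the flag lookup in the diff; not Spark code.
final class ReadAheadFlag {
    // Key name taken from the diff.
    static final String KEY = "spark.unsafe.sorter.spill.read.ahead.enabled";

    // Simplified analogue of a getBoolean(key, defaultValue) lookup.
    static boolean getBoolean(Map<String, String> conf, String key, boolean defaultValue) {
        String value = conf.get(key);
        return value == null ? defaultValue : Boolean.parseBoolean(value);
    }

    // A null conf models the case where no environment is available;
    // the default of false mirrors the "+" line of the diff.
    static boolean readAheadEnabled(Map<String, String> conf) {
        return conf != null && getBoolean(conf, KEY, false);
    }

    public static void main(String[] args) {
        Map<String, String> conf = new HashMap<>();
        System.out.println(readAheadEnabled(conf)); // key unset: default applies
        conf.put(KEY, "true");
        System.out.println(readAheadEnabled(conf)); // explicit opt-in
    }
}
```

Under this pattern, users who still want read-ahead can opt in by setting the key to `true`, which is why a comment or TODO next to the default, as the reviewer asks, is useful context for whoever later restores the old behaviour.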
[GitHub] spark issue #20460: [SPARK-23285][K8S] Allow fractional values for spark.exe...
Github user jerryshao commented on the issue: https://github.com/apache/spark/pull/20460 I would suggest starting a discussion, or even proposing a design, on the dev mailing list before making such a ground-breaking change. This may affect not only dynamic allocation but also the scheduler. It is better to collect all the feedback first (especially from those who work on the scheduler side).
[GitHub] spark issue #20492: [SPARK-23310][CORE] Turn off read ahead input stream for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20492 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87014/ Test PASSed.
[GitHub] spark issue #20492: [SPARK-23310][CORE] Turn off read ahead input stream for...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20492 Merged build finished. Test PASSed.
[GitHub] spark issue #20492: [SPARK-23310][CORE] Turn off read ahead input stream for...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20492 **[Test build #87014 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87014/testReport)** for PR 20492 at commit [`36b099b`](https://github.com/apache/spark/commit/36b099ba60c6616e5b15bd0849d7a1c8a935cba5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20445 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87018/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20445 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20445 **[Test build #87018 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87018/testReport)** for PR 20445 at commit [`1204755`](https://github.com/apache/spark/commit/1204755d8bdb0e8f0627a72bc8f456fdc12fc7ea). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20485: [SPARK-23315][SQL] failed to get output from canonicaliz...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20485 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87013/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20485: [SPARK-23315][SQL] failed to get output from canonicaliz...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20485 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20485: [SPARK-23315][SQL] failed to get output from canonicaliz...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20485 **[Test build #87013 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87013/testReport)** for PR 20485 at commit [`3aa0438`](https://github.com/apache/spark/commit/3aa043897bea5de1c230db6386d832e9b2993df3). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20359 **[Test build #87020 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87020/testReport)** for PR 20359 at commit [`52e6f19`](https://github.com/apache/spark/commit/52e6f19a660be4d6a1589d42a452884a8caf26ac). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/557/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20359: [SPARK-23186][SQL] Initialize DriverManager first before...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20359 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20479 **[Test build #87019 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87019/testReport)** for PR 20479 at commit [`9a2f640`](https://github.com/apache/spark/commit/9a2f640f3e695970c0c4ffe93d6fe978c4013ed1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/556/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20479 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20479: [SPARK-23305][SQL][TEST] Test `spark.sql.files.ignoreMis...
Github user dongjoon-hyun commented on the issue: https://github.com/apache/spark/pull/20479 Retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20491: [SQL] Minor doc update: Add an example in DataFrameReade...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20491 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20491: [SQL] Minor doc update: Add an example in DataFrameReade...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20491 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87012/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20491: [SQL] Minor doc update: Add an example in DataFrameReade...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20491 **[Test build #87012 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87012/testReport)** for PR 20491 at commit [`e5e5e0b`](https://github.com/apache/spark/commit/e5e5e0b44e22f58736dd27e5c048395670574f18). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20442 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20442: [SPARK-23265][ML]Update multi-column error handling logi...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20442 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/555/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Docum...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/20494 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user tdas commented on the issue: https://github.com/apache/spark/pull/20494 Merging to master and 2.3 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20445 **[Test build #87018 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87018/testReport)** for PR 20445 at commit [`1204755`](https://github.com/apache/spark/commit/1204755d8bdb0e8f0627a72bc8f456fdc12fc7ea). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20445 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20445: [SPARK-23092][SQL] Migrate MemoryStream to DataSourceV2 ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20445 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/554/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user jose-torres commented on the issue: https://github.com/apache/spark/pull/20494 LGTM --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20481: [SPARK-23307][WEBUI]Sort jobs/stages/tasks/queries with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20481 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20481: [SPARK-23307][WEBUI]Sort jobs/stages/tasks/queries with ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20481 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87005/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20481: [SPARK-23307][WEBUI]Sort jobs/stages/tasks/queries with ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20481 **[Test build #87005 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87005/testReport)** for PR 20481 at commit [`b83b396`](https://github.com/apache/spark/commit/b83b396dcd10fabf9d28ef57d4206fba2980efa5). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20487: [SPARK-23319][TESTS] Explicitly skips PySpark tests for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20487 **[Test build #87016 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87016/testReport)** for PR 20487 at commit [`6403198`](https://github.com/apache/spark/commit/640319812307b166f060366d54974c7352e3d7ba). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20487: [SPARK-23319][TESTS] Explicitly skips PySpark tests for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20487 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87016/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20487: [SPARK-23319][TESTS] Explicitly skips PySpark tests for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20487 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87017/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20494 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20494 **[Test build #87017 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87017/testReport)** for PR 20494 at commit [`de446b5`](https://github.com/apache/spark/commit/de446b5d024f7c646a647d346904519a2742eed1). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20387 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87007/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20387 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20387: [SPARK-23203][SQL]: DataSourceV2: Use immutable logical ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20387 **[Test build #87007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87007/testReport)** for PR 20387 at commit [`f1d9872`](https://github.com/apache/spark/commit/f1d9872a2699cdbd5c87b02e702dc8103335131d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20494 **[Test build #87017 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87017/testReport)** for PR 20494 at commit [`de446b5`](https://github.com/apache/spark/commit/de446b5d024f7c646a647d346904519a2742eed1). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20494 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Documentatio...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20494 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/553/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20488: [SPARK-23321][SQL]: Validate datasource v2 writes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20488 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87006/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20494: [SPARK-23064][SS][DOCS] Stream-stream joins Docum...
GitHub user tdas opened a pull request:

    https://github.com/apache/spark/pull/20494

    [SPARK-23064][SS][DOCS] Stream-stream joins Documentation - follow up

    ## What changes were proposed in this pull request?
    Further clarification of caveats in using stream-stream outer joins.

    ## How was this patch tested?
    N/A

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/tdas/spark SPARK-23064-2

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20494.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #20494

commit de446b5d024f7c646a647d346904519a2742eed1
Author: Tathagata Das
Date: 2018-02-02T23:37:41Z

    Improvement

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20488: [SPARK-23321][SQL]: Validate datasource v2 writes
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20488 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20487: [SPARK-23319][TESTS] Explicitly skips PySpark tes...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20487#discussion_r165795585

    --- Diff: python/pyspark/sql/utils.py ---
    @@ -115,18 +115,30 @@ def toJArray(gateway, jtype, arr):
     def require_minimum_pandas_version():
         """ Raise ImportError if minimum version of Pandas is not installed """
    +    minimum_pandas_version = "0.19.2"
    +
         from distutils.version import LooseVersion
    -    import pandas
    -    if LooseVersion(pandas.__version__) < LooseVersion('0.19.2'):
    -        raise ImportError("Pandas >= 0.19.2 must be installed on calling Python process; "
    -                          "however, your version was %s." % pandas.__version__)
    +    try:
    +        import pandas
    +    except ImportError:
    +        raise ImportError("Pandas >= %s must be installed; however, "
    +                          "it was not found." % minimum_pandas_version)
    --- End diff --

    I catch `ImportError` here just to make the error message nicer.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20487: [SPARK-23319][TESTS] Explicitly skips PySpark tes...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20487#discussion_r165795538

    --- Diff: python/pyspark/sql/tests.py ---
    @@ -48,19 +48,26 @@ else:
         import unittest

    -_have_pandas = False
    -_have_old_pandas = False
    +_pandas_requirement_message = None
     try:
    -    import pandas
    -    try:
    -        from pyspark.sql.utils import require_minimum_pandas_version
    -        require_minimum_pandas_version()
    -        _have_pandas = True
    -    except:
    -        _have_old_pandas = True
    -except:
    -    # No Pandas, but that's okay, we'll skip those tests
    -    pass
    +    from pyspark.sql.utils import require_minimum_pandas_version
    +    require_minimum_pandas_version()
    +except ImportError as e:
    +    from pyspark.util import _exception_message
    +    # If Pandas version requirement is not satisfied, skip related tests.
    +    _pandas_requirement_message = _exception_message(e)
    +
    +_pyarrow_requirement_message = None
    +try:
    +    from pyspark.sql.utils import require_minimum_pyarrow_version
    +    require_minimum_pyarrow_version()
    +except ImportError as e:
    +    from pyspark.util import _exception_message
    +    # If Arrow version requirement is not satisfied, skip related tests.
    +    _pyarrow_requirement_message = _exception_message(e)
    +
    +_have_pandas = _pandas_requirement_message is None
    +_have_pyarrow = _pyarrow_requirement_message is None
    --- End diff --

    Here is the logic I used: `_pyarrow_requirement_message` holds the error message for the PyArrow requirement if PyArrow is missing or its version does not match. If `_pyarrow_requirement_message` holds a message, `_have_pyarrow` becomes `False`.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20488: [SPARK-23321][SQL]: Validate datasource v2 writes
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20488 **[Test build #87006 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87006/testReport)** for PR 20488 at commit [`3580daf`](https://github.com/apache/spark/commit/3580daf15497a1d49112a0eddd556f74b9b3e280). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `case class NoopCommand() extends RunnableCommand ` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org