[GitHub] spark issue #20670: [SPARK-23405] Add constranits
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20670 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20670: [SPARK-23405] Add constranits
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20670 **[Test build #87648 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87648/testReport)** for PR 20670 at commit [`705ed46`](https://github.com/apache/spark/commit/705ed462bb307871e65199ce02576f12d60d2176). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20670: [SPARK-23405] Add constranits
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20670 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87648/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170444340

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -1146,3 +1146,25 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
         alterPartitionsMethod.invoke(hive, tableName, newParts, environmentContextInAlterTable)
       }
     }
    +
    +private[client] class Shim_v2_2 extends Shim_v2_1 {
    +
    +}
    +
    +private[client] class Shim_v2_3 extends Shim_v2_2 {
    +
    +  val environmentContext = new EnvironmentContext()
    +  environmentContext.putToProperties("DO_NOT_UPDATE_STATS", "true")
    +
    +  private lazy val alterPartitionsMethod =
    +    findMethod(
    +      classOf[Hive],
    +      "alterPartitions",
    +      classOf[String],
    +      classOf[JList[Partition]],
    +      classOf[EnvironmentContext])
    +
    +  override def alterPartitions(hive: Hive, tableName: String, newParts: JList[Partition]): Unit = {
    --- End diff --

    If we do not add `alterPartitionsMethod`, which test case will fail?
[GitHub] spark issue #20670: add constranits
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20670 **[Test build #87648 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87648/testReport)** for PR 20670 at commit [`705ed46`](https://github.com/apache/spark/commit/705ed462bb307871e65199ce02576f12d60d2176). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20670: add constranits
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20670 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20670: add constranits
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20670 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1035/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20670: add constranits
GitHub user KaiXinXiaoLei opened a pull request:

    https://github.com/apache/spark/pull/20670

    add constranits

    ## What changes were proposed in this pull request?

    I ran the following query: `select ls.cs_order_number from ls left semi join catalog_sales cs on ls.cs_order_number = cs.cs_order_number`. The `ls` table is small and contains one row. The `catalog_sales` table is large and contains 10 billion rows. The task hangs. I found that `cs_order_number` has many null values in the `catalog_sales` table. I think these null values should be filtered out in the logical plan.

    ## How was this patch tested?

    (Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
    (If this patch involves UI changes, please attach a screenshot; otherwise, remove this)

    Please review http://spark.apache.org/contributing.html before opening a pull request.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/KaiXinXiaoLei/spark Spark-23405

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20670.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20670

commit 705ed462bb307871e65199ce02576f12d60d2176
Author: KaiXinXiaoLei <584620569@...>
Date: 2018-02-25T06:06:39Z

    add constranits
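The reasoning in this pull request rests on a general SQL fact: an equality comparison involving NULL never evaluates to true, so a row whose join key is NULL can never satisfy the condition of a left semi join. The following is a minimal pure-Python sketch of that argument, not Spark's optimizer code. The helper names `sql_equal`, `left_semi_join`, and `drop_null_keys` and the tiny sample data are all hypothetical and exist only for illustration.

```python
from typing import Iterable, List, Optional


def sql_equal(a: Optional[int], b: Optional[int]) -> bool:
    """SQL-style equality: a comparison involving NULL (None) is never true."""
    if a is None or b is None:
        return False
    return a == b


def left_semi_join(left: Iterable[Optional[int]],
                   right: Iterable[Optional[int]]) -> List[Optional[int]]:
    """Keep each left key that matches at least one right key."""
    right_keys = list(right)
    return [l for l in left if any(sql_equal(l, r) for r in right_keys)]


def drop_null_keys(keys: Iterable[Optional[int]]) -> List[int]:
    """Hypothetical pre-filter, playing the role of an IS NOT NULL constraint."""
    return [k for k in keys if k is not None]


# Hypothetical stand-ins for the one-row `ls` table and the large
# `catalog_sales` table with many null join keys.
ls = [42]
catalog_sales = [None, None, 7, None, 42, None]

unfiltered = left_semi_join(ls, catalog_sales)
filtered = left_semi_join(ls, drop_null_keys(catalog_sales))
print(unfiltered == filtered, len(catalog_sales), len(drop_null_keys(catalog_sales)))
```

Under these assumptions the join result is the same with and without the pre-filter, while the filtered right side is smaller, which is the effect the pull request is after.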
[GitHub] spark issue #20658: [SPARK-23488][python] Add missing catalog methods to pyt...
Github user drboyer commented on the issue: https://github.com/apache/spark/pull/20658 @HyukjinKwon thanks for the review so far! Sorry for the delay, I somehow missed the Python style output in the test logs earlier. How's this look now? Can you elaborate more on "doctest" if it's still needed? From what I can tell the only documentation for the Catalog is [this simple reference](https://spark.apache.org/docs/latest/api/python/pyspark.sql.html?highlight=catalog#pyspark.sql.SparkSession.catalog) which would be unaffected by my change --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20658: [SPARK-23488][python] Add missing catalog methods to pyt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20658 **[Test build #87647 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87647/testReport)** for PR 20658 at commit [`a49ffa0`](https://github.com/apache/spark/commit/a49ffa010a46a0d87de124d8ddf66c8173b756fb). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20658: [SPARK-23488][python] Add missing catalog methods to pyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20658 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87647/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20658: [SPARK-23488][python] Add missing catalog methods to pyt...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20658 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20658: [SPARK-23488][python] Add missing catalog methods to pyt...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20658 **[Test build #87647 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87647/testReport)** for PR 20658 at commit [`a49ffa0`](https://github.com/apache/spark/commit/a49ffa010a46a0d87de124d8ddf66c8173b756fb). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20668 Also need to update `HiveClientVersions.scala` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170441161

    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/client/VersionsSuite.scala ---
    @@ -125,7 +126,7 @@ class VersionsSuite extends SparkFunSuite with Logging {
           // Hive changed the default of datanucleus.schema.autoCreateAll from true to false and
           // hive.metastore.schema.verification from false to true since 2.0
           // For details, see the JIRA HIVE-6113 and HIVE-12463
    -      if (version == "2.0" || version == "2.1") {
    +      if (version.split("\\.").head.toInt > 1) {
    --- End diff --

    ```Scala
    if (version == "2.0" || version == "2.1" || version == "2.2" || version == "2.3") {
    ```
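The two conditions in this review thread agree on every version named in the discussion but differ on versions outside that list. Here is a small Python transliteration of both predicates, offered only as a sketch. The function names are hypothetical, and `"3.0"` is a made-up future version used to show where the predicates diverge.

```python
def matches_explicit_list(version: str) -> bool:
    """The reviewer's suggestion: enumerate the supported 2.x versions."""
    return version in ("2.0", "2.1", "2.2", "2.3")


def matches_major_version(version: str) -> bool:
    """The condition in the diff: any major version greater than 1."""
    return int(version.split(".")[0]) > 1


# "3.0" is a hypothetical version, not one mentioned in the pull request.
for v in ("1.2", "2.0", "2.1", "2.2", "2.3", "3.0"):
    print(v, matches_explicit_list(v), matches_major_version(v))
```

The explicit list has to be extended for every new release, whereas the major-version check silently applies the same settings to any later major version; which trade-off is preferable is the reviewer's call.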
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170440954

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -1146,3 +1146,25 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
         alterPartitionsMethod.invoke(hive, tableName, newParts, environmentContextInAlterTable)
       }
     }
    +
    +private[client] class Shim_v2_2 extends Shim_v2_1 {
    +
    +}
    --- End diff --

    Please remove `{}`
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1028/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1028/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1027/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1034/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user ssuchter commented on the issue: https://github.com/apache/spark/pull/20669 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1027/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1033/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test status failure URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1026/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user ssuchter commented on the issue: https://github.com/apache/spark/pull/20669 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1032/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20669 Kubernetes integration test starting URL: https://amplab.cs.berkeley.edu/jenkins/job/testing-k8s-prb-spark-integration/1026/ --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20669: [SPARK-22839][K8S] Remove the use of init-container for ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20669 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20669: [SPARK-22839][K8S] Remove the use of init-contain...
GitHub user ifilonenko opened a pull request:

    https://github.com/apache/spark/pull/20669

    [SPARK-22839][K8S] Remove the use of init-container for downloading remote dependencies

    ## What changes were proposed in this pull request?

    This removes the init-container used for downloading remote dependencies. It builds on the work done by @vanzin to refactor driver/executor configuration, as elaborated in [this](https://issues.apache.org/jira/browse/SPARK-22839) ticket.

    ## How was this patch tested?

    This patch was tested with unit and integration tests.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ifilonenko/spark remove-init-container

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20669.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20669

commit 2fefd0edf2f15ba66620fd507bd0cd7ce01bcd1e
Author: Ilan Filonenko
Date: 2018-02-24T23:25:45Z

    Removed the use of init-container for downloading remote dependencies
[GitHub] spark issue #18894: [SPARK-21673] Use the correct sandbox environment variab...
Github user joerg84 commented on the issue: https://github.com/apache/spark/pull/18894 LGTM from a Mesos perspective --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20641: [SPARK-23464][MESOS] Fix mesos cluster scheduler options...
Github user susanxhuynh commented on the issue: https://github.com/apache/spark/pull/20641 Thanks for the PR! It seems that the previous attempt to fix this (SPARK-18114) was wrong -- I'm not sure why we didn't catch the problem before, maybe lack of testing? @krcz My suggestion for this patch is to add a test, in order to prevent another regression in the future. I've written a unit test for this -- you could do something similar: https://github.com/mesosphere/spark/commit/4812ba3d10264f6d22ec654fa16b5810d70c27a9 I will also do more testing with my own integration tests. cc @skonto --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87646/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20668 **[Test build #87646 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87646/testReport)** for PR 20668 at commit [`48343bc`](https://github.com/apache/spark/commit/48343bc8214468b58dcffcc8d968c870ee0189be). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87645/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20668 **[Test build #87645 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87645/testReport)** for PR 20668 at commit [`5b1fc01`](https://github.com/apache/spark/commit/5b1fc0145efbdd427e8b49bd0f840f709d4bc801). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser be...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20666#discussion_r170425193

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -209,13 +209,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
             :param mode: allows a mode for dealing with corrupt records during parsing. If None is
                          set, it uses the default value, ``PERMISSIVE``.
    -            * ``PERMISSIVE`` : sets other fields to ``null`` when it meets a corrupted \
    -              record, and puts the malformed string into a field configured by \
    -              ``columnNameOfCorruptRecord``. To keep corrupt records, an user can set \
    -              a string type field named ``columnNameOfCorruptRecord`` in an user-defined \
    -              schema. If a schema does not have the field, it drops corrupt records during \
    -              parsing. When inferring a schema, it implicitly adds a \
    -              ``columnNameOfCorruptRecord`` field in an output schema.
    +            * ``PERMISSIVE`` : when it meets a corrupted record, puts the malformed string \
    +              into a field configured by ``columnNameOfCorruptRecord``, and sets other \
    +              fields to ``null``. To keep corrupt records, an user can set a string type \
    +              field named ``columnNameOfCorruptRecord`` in an user-defined schema. If a \
    +              schema does not have the field, it drops corrupt records during parsing. \
    +              When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
    --- End diff --

    While we are here, I think we should say `it implicitly adds ... if a corrupted record is found`. I think it only adds `columnNameOfCorruptRecord` when it meets a corrupted record during schema inference.
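The `PERMISSIVE` behavior described in the docstring above can be modeled in a few lines of plain Python. This is a sketch of the documented wording only, not Spark's JSON parser. The function `parse_permissive` is hypothetical, the column name `_corrupt_record` is an assumed value for `columnNameOfCorruptRecord`, and reading "drops corrupt records" as "discards the malformed text but still emits a row of nulls" is also an assumption.

```python
import json
from typing import Dict, List, Optional

# Assumed value for ``columnNameOfCorruptRecord`` in this sketch.
CORRUPT_COLUMN = "_corrupt_record"


def parse_permissive(lines: List[str], schema: List[str]) -> List[Dict[str, Optional[object]]]:
    """Model of the docstring: malformed text goes to the corrupt column, other fields are None."""
    rows = []
    for line in lines:
        # Start from an all-null row for every field in the user-defined schema.
        row: Dict[str, Optional[object]] = {field: None for field in schema}
        try:
            record = json.loads(line)
            if not isinstance(record, dict):
                raise ValueError("expected a JSON object")
            for field in schema:
                if field != CORRUPT_COLUMN:
                    row[field] = record.get(field)
        except ValueError:  # json.JSONDecodeError is a subclass of ValueError
            # Keep the malformed string only if the schema has the corrupt column.
            if CORRUPT_COLUMN in schema:
                row[CORRUPT_COLUMN] = line
        rows.append(row)
    return rows


lines = ['{"a": 1, "b": "x"}', '{"a": oops']
with_column = parse_permissive(lines, ["a", "b", CORRUPT_COLUMN])
without_column = parse_permissive(lines, ["a", "b"])
print(with_column)
print(without_column)
```

With the corrupt column in the schema, the second row keeps the raw text `{"a": oops`; without it, the text is lost and only nulls remain, which is the distinction the docstring is trying to convey.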
[GitHub] spark pull request #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser be...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20666#discussion_r170425254

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -209,13 +209,15 @@ def json(self, path, schema=None, primitivesAsString=None, prefersDecimal=None,
             :param mode: allows a mode for dealing with corrupt records during parsing. If None is
                          set, it uses the default value, ``PERMISSIVE``.
    -            * ``PERMISSIVE`` : sets other fields to ``null`` when it meets a corrupted \
    -              record, and puts the malformed string into a field configured by \
    -              ``columnNameOfCorruptRecord``. To keep corrupt records, an user can set \
    -              a string type field named ``columnNameOfCorruptRecord`` in an user-defined \
    -              schema. If a schema does not have the field, it drops corrupt records during \
    -              parsing. When inferring a schema, it implicitly adds a \
    -              ``columnNameOfCorruptRecord`` field in an output schema.
    +            * ``PERMISSIVE`` : when it meets a corrupted record, puts the malformed string \
    +              into a field configured by ``columnNameOfCorruptRecord``, and sets other \
    +              fields to ``null``. To keep corrupt records, an user can set a string type \
    +              field named ``columnNameOfCorruptRecord`` in an user-defined schema. If a \
    +              schema does not have the field, it drops corrupt records during parsing. \
    +              When inferring a schema, it implicitly adds a ``columnNameOfCorruptRecord`` \
    +              field in an output schema. It doesn't support partial results. Even just one \
    --- End diff --

    It's trivial, but how about we avoid a contraction like `doesn't`? That is what I usually do in docs, although I am not sure it actually matters.
[GitHub] spark pull request #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser be...
Github user HyukjinKwon commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20666#discussion_r170425099

    --- Diff: python/pyspark/sql/readwriter.py ---
    @@ -393,13 +395,16 @@ def csv(self, path, schema=None, sep=None, encoding=None, quote=None, escape=Non
             :param mode: allows a mode for dealing with corrupt records during parsing. If None is
                          set, it uses the default value, ``PERMISSIVE``.
    -            * ``PERMISSIVE`` : sets other fields to ``null`` when it meets a corrupted \
    -              record, and puts the malformed string into a field configured by \
    -              ``columnNameOfCorruptRecord``. To keep corrupt records, an user can set \
    -              a string type field named ``columnNameOfCorruptRecord`` in an \
    -              user-defined schema. If a schema does not have the field, it drops corrupt \
    -              records during parsing. When a length of parsed CSV tokens is shorter than \
    -              an expected length of a schema, it sets `null` for extra fields.
    +            * ``PERMISSIVE`` : when it meets a corrupted record, puts the malformed string \
    +              into a field configured by ``columnNameOfCorruptRecord``, and sets other \
    +              fields to ``null``. To keep corrupt records, an user can set a string type \
    +              field named ``columnNameOfCorruptRecord`` in an user-defined schema. If a \
    +              schema does not have the field, it drops corrupt records during parsing. \
    +              It supports partial result for the records just with less or more tokens \
    +              than the schema. When it meets a malformed record whose parsed tokens is \
    --- End diff --

    How about `a malformed record whose parsed tokens is` -> `a malformed record having the length of parsed tokens shorter than the length of a schema`?
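The "partial result" wording for CSV records with fewer or more tokens than the schema can be illustrated with a short Python sketch. It models the docstring text only and is not Spark's CSV parser. The function `fit_tokens_to_schema` is hypothetical; padding missing fields with `None` follows the older docstring wording quoted in the diff, while dropping surplus tokens is an assumption made for this illustration.

```python
from typing import Dict, List, Optional


def fit_tokens_to_schema(tokens: List[str], schema: List[str]) -> Dict[str, Optional[str]]:
    """Align parsed CSV tokens with schema fields, tolerating a length mismatch."""
    width = len(schema)
    # Surplus tokens beyond the schema width are cut off (assumption of this sketch).
    kept: List[Optional[str]] = list(tokens[:width])
    # Missing trailing fields are filled with None, mirroring "sets null for extra fields".
    kept += [None] * (width - len(kept))
    return dict(zip(schema, kept))


schema = ["a", "b", "c"]
short_row = fit_tokens_to_schema(["1", "2"], schema)
long_row = fit_tokens_to_schema(["1", "2", "3", "4"], schema)
print(short_row)
print(long_row)
```

In this model both mismatched records still yield a usable row, which is one way to read "partial result" in the proposed docstring.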
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1031/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20668 **[Test build #87646 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87646/testReport)** for PR 20668 at commit [`48343bc`](https://github.com/apache/spark/commit/48343bc8214468b58dcffcc8d968c870ee0189be). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user wangyum commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170425667

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ---
    @@ -202,7 +202,6 @@ private[spark] object HiveUtils extends Logging {
         ConfVars.METASTORE_AGGREGATE_STATS_CACHE_MAX_READER_WAIT -> TimeUnit.MILLISECONDS,
         ConfVars.HIVES_AUTO_PROGRESS_TIMEOUT -> TimeUnit.SECONDS,
         ConfVars.HIVE_LOG_INCREMENTAL_PLAN_PROGRESS_INTERVAL -> TimeUnit.MILLISECONDS,
    -    ConfVars.HIVE_STATS_JDBC_TIMEOUT -> TimeUnit.SECONDS,
    --- End diff --

    This removes `HIVE_STATS_JDBC_TIMEOUT`. For details, see https://issues.apache.org/jira/browse/HIVE-12164
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user wangyum commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170425631

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveUtils.scala ---
    @@ -202,8 +202,6 @@ private[spark] object HiveUtils extends Logging {
         ConfVars.METASTORE_AGGREGATE_STATS_CACHE_MAX_READER_WAIT -> TimeUnit.MILLISECONDS,
         ConfVars.HIVES_AUTO_PROGRESS_TIMEOUT -> TimeUnit.SECONDS,
         ConfVars.HIVE_LOG_INCREMENTAL_PLAN_PROGRESS_INTERVAL -> TimeUnit.MILLISECONDS,
    -    ConfVars.HIVE_STATS_JDBC_TIMEOUT -> TimeUnit.SECONDS,
    -    ConfVars.HIVE_STATS_RETRIES_WAIT -> TimeUnit.MILLISECONDS,
    --- End diff --

    This removes `HIVE_STATS_JDBC_TIMEOUT`. For details, see https://issues.apache.org/jira/browse/HIVE-12164
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
Github user wangyum commented on a diff in the pull request:

    https://github.com/apache/spark/pull/20668#discussion_r170425408

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala ---
    @@ -1146,3 +1146,25 @@ private[client] class Shim_v2_1 extends Shim_v2_0 {
         alterPartitionsMethod.invoke(hive, tableName, newParts, environmentContextInAlterTable)
       }
     }
    +
    +private[client] class Shim_v2_2 extends Shim_v2_1 {
    +
    +}
    +
    +private[client] class Shim_v2_3 extends Shim_v2_2 {
    +
    +  val environmentContext = new EnvironmentContext()
    +  environmentContext.putToProperties("DO_NOT_UPDATE_STATS", "true")
    --- End diff --

    Otherwise, it throws a `NumberFormatException`:
    ```
    [info] Cause: java.lang.NumberFormatException: null
    [info] at java.lang.Long.parseLong(Long.java:552)
    [info] at java.lang.Long.parseLong(Long.java:631)
    [info] at org.apache.hadoop.hive.metastore.MetaStoreUtils.isFastStatsSame(MetaStoreUtils.java:315)
    [info] at org.apache.hadoop.hive.metastore.HiveAlterHandler.alterPartitions(HiveAlterHandler.java:605)
    [info] at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.alter_partitions_with_environment_context(HiveMetaStore.java:3837)
    ```
    For details, see https://issues.apache.org/jira/browse/HIVE-15653
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20668 **[Test build #87645 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87645/testReport)** for PR 20668 at commit [`5b1fc01`](https://github.com/apache/spark/commit/5b1fc0145efbdd427e8b49bd0f840f709d4bc801). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1030/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metasto...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20668 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20668: [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 ...
GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/20668 [SPARK-23510][SQL] Support Hive 2.2 and Hive 2.3 metastore ## What changes were proposed in this pull request? Support Hive 2.2 and Hive 2.3 metastore. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangyum/spark SPARK-23510 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20668.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20668 commit 5b1fc0145efbdd427e8b49bd0f840f709d4bc801 Author: Yuming Wang Date: 2018-02-24T16:19:35Z Support Hive 2.2 and Hive 2.3
[GitHub] spark pull request #20667: [SPARK-23508][CORE] Use timeStampedHashMap for Bl...
Github user Ngone51 commented on a diff in the pull request: https://github.com/apache/spark/pull/20667#discussion_r170424196 --- Diff: core/src/main/scala/org/apache/spark/storage/BlockManagerId.scala --- @@ -132,10 +133,15 @@ private[spark] object BlockManagerId { getCachedBlockManagerId(obj) } - val blockManagerIdCache = new ConcurrentHashMap[BlockManagerId, BlockManagerId]() + val blockManagerIdCache = new TimeStampedHashMap[BlockManagerId, BlockManagerId](true) - def getCachedBlockManagerId(id: BlockManagerId): BlockManagerId = { + def getCachedBlockManagerId(id: BlockManagerId, clearOldValues: Boolean = false): BlockManagerId = + { blockManagerIdCache.putIfAbsent(id, id) -blockManagerIdCache.get(id) +val blockManagerId = blockManagerIdCache.get(id) +if (clearOldValues) { + blockManagerIdCache.clearOldValues(System.currentTimeMillis - Utils.timeStringAsMs("10d")) --- End diff -- 10 days? I don't think *time* is a good criterion for deciding whether to remove a cached id, even if you set the threshold far lower or higher than '10d'. Consider an extreme case where a block is fetched frequently throughout the app's run. It would certainly be removed from the cache once it passes the time threshold, then re-cached the next time we get it, and this would repeat.
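One way to probe this concern is a small simulation. The sketch below uses a toy cache, `ToyTimedCache`, which is a hypothetical stand-in and not Spark's `TimeStampedHashMap`; the `refreshOnGet` flag and the explicit `now` arguments are assumptions made so the behaviour is deterministic. It suggests that whether a recently read entry is dropped by a time threshold depends on whether reads refresh the entry's timestamp.

```scala
import scala.collection.mutable

// Toy stand-in for a timestamped cache; NOT Spark's TimeStampedHashMap.
final class ToyTimedCache[K, V](refreshOnGet: Boolean) {
  // Each entry stores its value together with a timestamp.
  private val entries = mutable.HashMap.empty[K, (V, Long)]

  def putIfAbsent(key: K, value: V, now: Long): Unit =
    if (!entries.contains(key)) entries.update(key, (value, now))

  def get(key: K, now: Long): Option[V] =
    entries.get(key).map { case (value, _) =>
      // Optionally treat a read as "activity" and refresh the timestamp.
      if (refreshOnGet) entries.update(key, (value, now))
      value
    }

  // Drop every entry whose timestamp is older than the threshold.
  def clearOldValues(threshold: Long): Unit = {
    val stale = entries.collect { case (k, (_, ts)) if ts < threshold => k }.toList
    stale.foreach(k => entries.remove(k))
  }

  def contains(key: K): Boolean = entries.contains(key)
}

object ToyTimedCacheDemo {
  // Insert at t=0, read at t=90, evict everything older than t=50.
  def survives(refreshOnGet: Boolean): Boolean = {
    val cache = new ToyTimedCache[String, String](refreshOnGet)
    cache.putIfAbsent("bm-1", "bm-1", now = 0L)
    cache.get("bm-1", now = 90L)
    cache.clearOldValues(threshold = 50L)
    cache.contains("bm-1")
  }

  def main(args: Array[String]): Unit = {
    println(survives(refreshOnGet = false)) // false: evicted despite the recent read
    println(survives(refreshOnGet = true))  // true: the read refreshed the timestamp
  }
}
```

With insertion-time stamps the entry is evicted and must be re-cached on the next lookup, which is the churn described above; with refreshed stamps the same entry survives.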
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87644/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20666 **[Test build #87644 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87644/testReport)** for PR 20666 at commit [`4400cf2`](https://github.com/apache/spark/commit/4400cf2eb4d3b1b37c9e299e91db6e4a032e0c3a). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87643/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20647 **[Test build #87643 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87643/testReport)** for PR 20647 at commit [`a73370a`](https://github.com/apache/spark/commit/a73370a5bf56f45ce67cd6cdaf86b53a14a67b5b). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20666 **[Test build #87644 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87644/testReport)** for PR 20666 at commit [`4400cf2`](https://github.com/apache/spark/commit/4400cf2eb4d3b1b37c9e299e91db6e4a032e0c3a). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1029/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20667: [SPARK-23508][CORE] Use timeStampedHashMap for Blockmana...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20667 Can one of the admins verify this patch? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20667: [SPARK-23508][CORE] Use timeStampedHashMap for Bl...
GitHub user caneGuy opened a pull request: https://github.com/apache/spark/pull/20667 [SPARK-23508][CORE] Use timeStampedHashMap for BlockmanagerId in case blockManagerIdCache… …cause oom ## What changes were proposed in this pull request? `blockManagerIdCache` in `BlockManagerId` never removes old values, which may cause an OOM: `val blockManagerIdCache = new ConcurrentHashMap[BlockManagerId, BlockManagerId]()` Whenever a new BlockManagerId is created, it is put into this map. This patch uses `TimeStampedHashMap` instead for `JsonProtocol`. ## How was this patch tested? Existing tests. You can merge this pull request into a Git repository by running: $ git pull https://github.com/caneGuy/spark zhoukang/fix-history Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/20667.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #20667 commit fc1b6a0169c123a825a253defb021c73aebf1c98 Author: zhoukang Date: 2018-02-24T10:13:01Z Use timeStampedHashMap for BlockmanagerId in case blockManagerIdCache cause oom
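The growth described in this pull request can be illustrated with a small sketch of the `putIfAbsent` followed by `get` pattern on a `ConcurrentHashMap`. `ToyId` and `ToyIdCache` are hypothetical names used only for illustration; they are not Spark classes, and the sketch only assumes that ids with equal fields compare equal.

```scala
import java.util.concurrent.ConcurrentHashMap

// Hypothetical id type; case-class equality stands in for BlockManagerId equality.
final case class ToyId(executorId: String, host: String, port: Int)

// Interning cache: equal ids map to one shared instance, and nothing is ever removed.
final class ToyIdCache {
  private val cache = new ConcurrentHashMap[ToyId, ToyId]()

  def getCached(id: ToyId): ToyId = {
    cache.putIfAbsent(id, id) // keeps the first instance seen for this key
    cache.get(id)
  }

  def size: Int = cache.size()
}

object ToyIdCacheDemo {
  def main(args: Array[String]): Unit = {
    val cache = new ToyIdCache

    // Equal ids are deduplicated to the same object.
    val a = cache.getCached(ToyId("1", "host-a", 7077))
    val b = cache.getCached(ToyId("1", "host-a", 7077))
    println(a eq b) // true

    // Every distinct id adds an entry that stays for the life of the map.
    (2 to 1000).foreach(i => cache.getCached(ToyId(i.toString, "host-a", 7077)))
    println(cache.size) // 1000
  }
}
```

In a long-running process that keeps seeing new ids, a map like this only grows, which is the memory concern the patch is trying to address.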
[GitHub] spark pull request #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser be...
Github user viirya commented on a diff in the pull request: https://github.com/apache/spark/pull/20666#discussion_r170418454 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -550,12 +552,14 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging { * `mode` (default `PERMISSIVE`): allows a mode for dealing with corrupt records *during parsing. It supports the following case-insensitive modes. * - * `PERMISSIVE` : sets other fields to `null` when it meets a corrupted record, and puts - * the malformed string into a field configured by `columnNameOfCorruptRecord`. To keep + * `PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a + * field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To keep * corrupt records, an user can set a string type field named `columnNameOfCorruptRecord` * in an user-defined schema. If a schema does not have the field, it drops corrupt records - * during parsing. When a length of parsed CSV tokens is shorter than an expected length - * of a schema, it sets `null` for extra fields. + * during parsing. It supports partial result for the records just with less or more tokens --- End diff -- Yes. Will update accordingly. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87642/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20666 **[Test build #87642 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87642/testReport)** for PR 20666 at commit [`4ad330b`](https://github.com/apache/spark/commit/4ad330b1def558e17dfb693d428e1bd69248e5a3). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser be...
Github user HyukjinKwon commented on a diff in the pull request: https://github.com/apache/spark/pull/20666#discussion_r170417628 --- Diff: sql/core/src/main/scala/org/apache/spark/sql/DataFrameReader.scala --- @@ -550,12 +552,14 @@ class DataFrameReader private[sql](sparkSession: SparkSession) extends Logging { * `mode` (default `PERMISSIVE`): allows a mode for dealing with corrupt records *during parsing. It supports the following case-insensitive modes. * - * `PERMISSIVE` : sets other fields to `null` when it meets a corrupted record, and puts - * the malformed string into a field configured by `columnNameOfCorruptRecord`. To keep + * `PERMISSIVE` : when it meets a corrupted record, puts the malformed string into a + * field configured by `columnNameOfCorruptRecord`, and sets other fields to `null`. To keep * corrupt records, an user can set a string type field named `columnNameOfCorruptRecord` * in an user-defined schema. If a schema does not have the field, it drops corrupt records - * during parsing. When a length of parsed CSV tokens is shorter than an expected length - * of a schema, it sets `null` for extra fields. + * during parsing. It supports partial result for the records just with less or more tokens --- End diff -- I think the same instances need updating in `DataStreamReader`, `readwriter.py` and `streaming.py` too.
[GitHub] spark pull request #20597: [MINOR][TEST] Update from 2.2.0 to 2.2.1 in HiveE...
Github user seancxmao closed the pull request at: https://github.com/apache/spark/pull/20597 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1028/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20647 **[Test build #87643 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87643/testReport)** for PR 20647 at commit [`a73370a`](https://github.com/apache/spark/commit/a73370a5bf56f45ce67cd6cdaf86b53a14a67b5b). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20666 **[Test build #87642 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87642/testReport)** for PR 20666 at commit [`4ad330b`](https://github.com/apache/spark/commit/4ad330b1def558e17dfb693d428e1bd69248e5a3). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution/1027/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user cloud-fan commented on the issue: https://github.com/apache/spark/pull/20647 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user viirya commented on the issue: https://github.com/apache/spark/pull/20666 retest this please. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #10942: [SPARK-12850] [SQL] Support Bucket Pruning (Predicate Pu...
Github user gengliangwang commented on the issue: https://github.com/apache/spark/pull/10942 @lonehacker I have just created a jira ticket for the migration project: https://issues.apache.org/jira/browse/SPARK-23507 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87641/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20666 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20666: [SPARK-23448][SQL] Clarify JSON and CSV parser behavior ...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20666 **[Test build #87641 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87641/testReport)** for PR 20666 at commit [`4ad330b`](https://github.com/apache/spark/commit/4ad330b1def558e17dfb693d428e1bd69248e5a3). * This patch **fails due to an unknown error code, -9**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/87640/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20647 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20647: [SPARK-23303][SQL] improve the explain result for data s...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20647 **[Test build #87640 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/87640/testReport)** for PR 20647 at commit [`a73370a`](https://github.com/apache/spark/commit/a73370a5bf56f45ce67cd6cdaf86b53a14a67b5b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org