[GitHub] spark issue #21773: [SPARK-24810][SQL] Fix paths to test files in AvroSuite
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21773 **[Test build #93039 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93039/testReport)** for PR 21773 at commit [`98354ec`](https://github.com/apache/spark/commit/98354ecb716f2578bb699c03869b0344e46f960a). * This patch passes all tests. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `class AvroDeserializer(rootAvroType: Schema, rootCatalystType: DataType) ` * ` sealed trait CatalystDataUpdater ` * ` final class RowUpdater(row: InternalRow) extends CatalystDataUpdater ` * ` final class ArrayDataUpdater(array: ArrayData) extends CatalystDataUpdater ` * `class AvroSerializer(rootCatalystType: DataType, rootAvroType: Schema, nullable: Boolean) ` * `class IncompatibleSchemaException(msg: String, ex: Throwable = null) extends Exception(msg, ex)` * `class SerializableSchema(@transient var value: Schema)` --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21777: [WIP][SPARK-24498][SQL] Add JDK compiler for runtime cod...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21777 @maropu @kiszk Thank you for taking on this effort! Based on my initial understanding, the code generated by the JDK compiler can be better optimized by the JIT in many cases. Is my understanding right?
[GitHub] spark issue #21777: [WIP][SPARK-24498][SQL] Add JDK compiler for runtime cod...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21777 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93038/ Test FAILed.
[GitHub] spark issue #21777: [WIP][SPARK-24498][SQL] Add JDK compiler for runtime cod...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21777 Merged build finished. Test FAILed.
[GitHub] spark issue #21777: [WIP][SPARK-24498][SQL] Add JDK compiler for runtime cod...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21777 **[Test build #93038 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93038/testReport)** for PR 21777 at commit [`523bf3d`](https://github.com/apache/spark/commit/523bf3d96d21ebf07aa87b1842ea58a840d2c33b). * This patch **fails from timeout after a configured wait of \`300m\`**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21442 @HyukjinKwon Currently, in my opinion, the highest-priority PRs include Parquet nested column pruning (https://github.com/apache/spark/pull/21320/files), the new built-in Avro, higher-order functions, the data correctness issue of Shuffle+Repartition on RDD, gang scheduling, and so on. If you have bandwidth, please also help review these PRs.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 Also, the root cause of getting a lot of pings is that we somehow started to block the Jenkins build for some old PRs. I probably missed some threads on the dev mailing list, but I still don't know who started this. To be honest, I have been somewhat annoyed by this too, probably for a reason similar to the one you gave, even though I haven't said so until now for the reasons above.
[GitHub] spark issue #21780: [SPARK-23259][SQL] Clean up legacy code around hive exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21780 Merged build finished. Test FAILed.
[GitHub] spark issue #21780: [SPARK-23259][SQL] Clean up legacy code around hive exte...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21780 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93075/ Test FAILed.
[GitHub] spark issue #21780: [SPARK-23259][SQL] Clean up legacy code around hive exte...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21780 **[Test build #93075 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93075/testReport)** for PR 21780 at commit [`7688790`](https://github.com/apache/spark/commit/7688790287a86d2896390a0faf5dc2cd1b24e97c). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 Of course, time and priority matter, but there are pending PRs that have queued up precisely because of time and priority so far. Shall we check the PRs and see if there are important ones for Spark 2.4 before the branch is cut, since the release is close? Otherwise, most of them will probably be missed again for the same reasons. I don't think I have pinged and retriggered in a batch like this for roughly the last year.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21442 If you want to help with this, we should ping the reviewers/committers based on the priority of these PRs. Also, we should not trigger so many pings within a short period. Could you please reduce the size of the batch to 10s per day? We need to seriously consider which new features we really want to introduce in the upcoming release. We only have 15 days left before the code freeze of Spark 2.4.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 I am not blindly triggering the tests. I only re-triggered tests for some PRs while skimming stale PRs. For "ok to test", I haven't blindly triggered it either. I did so where a committer had initially triggered the tests or where I saw some value in the PR while skimming stale PRs. Getting pings is annoying, of course, but the root cause is that we haven't checked the PRs diligently so far, or that authors were inactive.
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20405 @HyukjinKwon Does the existing AnalysisBarrier introduce a regression when users use the hint like `df1.hint("broadcast", "t")`?
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21442 @HyukjinKwon Thank you for trying to trigger the tests, but it does not help when people get many pings within one hour. To save resources, we need to investigate more rather than blindly triggering the tests.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 Also, look at the stale PRs and see what we have put off looking at. Please consider that this is partly what we should have checked and reviewed earlier, and partly that authors haven't updated their PRs. Both cases need pings, right? This is what we should have been doing but couldn't. Most of these were inactive for more than a month.
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21711 **[Test build #93082 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93082/testReport)** for PR 21711 at commit [`bcdae88`](https://github.com/apache/spark/commit/bcdae885df053959cccf6cfc28269b87603c8b58).
[GitHub] spark issue #19538: [SPARK-20393][WEBU UI][BACKPORT-2.0] Strengthen Spark to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19538 Merged build finished. Test PASSed.
[GitHub] spark issue #19538: [SPARK-20393][WEBU UI][BACKPORT-2.0] Strengthen Spark to...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/19538 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93054/ Test PASSed.
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Merged build finished. Test PASSed.
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/994/ Test PASSed.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Merged build finished. Test FAILed.
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21711 retest this please
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93045/ Test FAILed.
[GitHub] spark issue #19538: [SPARK-20393][WEBU UI][BACKPORT-2.0] Strengthen Spark to...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/19538 **[Test build #93054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93054/consoleFull)** for PR 19538 at commit [`a599d91`](https://github.com/apache/spark/commit/a599d9165fcbf50855feb617255fcaf2bed85e4d). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21770 **[Test build #93045 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93045/testReport)** for PR 21770 at commit [`9880f26`](https://github.com/apache/spark/commit/9880f26a8b763565b02b29b0be7e3597f54ff3de). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 This should get updated, right? I should give a ping here anyway. I wonder why triggering a retest matters so much for some PRs. You are probably mostly talking about the "ok to test" comments that I left on the old PRs, which pinged people. In this case, I think the root cause is that we started to block the Jenkins tests for a reason I don't know. That already generates a lot of pings. If the author is willing to update their PR, then having to ask for a Jenkins build is a blocker again.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21442 @HyukjinKwon All the involved reviewers will get a ping. It is annoying to see many pings within one hour, right? My suggestion is to read the comments before triggering the test.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93042/ Test FAILed.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21770 Merged build finished. Test FAILed.
[GitHub] spark issue #21780: [SPARK-23259][SQL] Clean up legacy code around hive exte...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21780 Yup, please do.
[GitHub] spark issue #21770: [SPARK-24806][SQL] Brush up generated code so that JDK c...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21770 **[Test build #93042 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93042/testReport)** for PR 21770 at commit [`9fdbefd`](https://github.com/apache/spark/commit/9fdbefd4eeb41a0633cc165d5fb7970060f129b0). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 @gatorsmile, can I just see if the build passes against the latest build?
[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20864 @gatorsmile, I triggered tests for PRs where there was a committer's command for it. I don't know who started to block the tests in Jenkins, or for what reason. If the author is willing to update the PR, Jenkins is a blocker again.
[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20864 @HyukjinKwon We also do not need to trigger the test for this PR. This fix does not look good based on the above comment from @liufengdb.
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21442 @HyukjinKwon The code has a bug. Please read the discussion before you trigger the retest.
[GitHub] spark issue #21761: [SPARK-24771][BUILD]Upgrade Apache AVRO to 1.8.2
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21761 Please do not merge it.
[GitHub] spark issue #21582: [SPARK-24576][BUILD] Upgrade Apache ORC to 1.5.2
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/21582 @dongjoon-hyun Could you explain how the benchmark works? What is the workload pattern? How does the benchmark invoke Spark?
[GitHub] spark issue #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization of priv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21352 Merged build finished. Test FAILed.
[GitHub] spark issue #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization of priv...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21352 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93062/ Test FAILed.
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20856 Sure. Let me take a shot at reproducing it.
[GitHub] spark issue #21352: [SPARK-24305][SQL][FOLLOWUP] Avoid serialization of priv...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21352 **[Test build #93062 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93062/testReport)** for PR 21352 at commit [`2862d3e`](https://github.com/apache/spark/commit/2862d3e4ad7c2207f23db2f2d58fb27ba6e708c5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21707: Update for spark 2.2.2 release
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21707 **[Test build #93081 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93081/testReport)** for PR 21707 at commit [`c0c2613`](https://github.com/apache/spark/commit/c0c26136b2b19f651f2d430d0b331a5eda0ed5bb).
[GitHub] spark issue #21707: Update for spark 2.2.2 release
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21707 Merged build finished. Test PASSed.
[GitHub] spark issue #21707: Update for spark 2.2.2 release
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21707 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/993/ Test PASSed.
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user gatorsmile commented on the issue: https://github.com/apache/spark/pull/20856 We are not able to merge it without a valid test case. We need to understand the root cause of why `relation` can be null. @HyukjinKwon If you can help, please try to create a test case.
[GitHub] spark issue #21707: Update for spark 2.2.2 release
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21707 retest this please
[GitHub] spark issue #21779: [SPARK-24813][TESTS][HIVE][HOTFIX][BRANCH-2.2] HiveExter...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21779 Merged into branch-2.2.
[GitHub] spark issue #21779: [SPARK-24813][TESTS][HIVE][HOTFIX][BRANCH-2.2] HiveExter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21779 Merged build finished. Test PASSed.
[GitHub] spark issue #21779: [SPARK-24813][TESTS][HIVE][HOTFIX][BRANCH-2.2] HiveExter...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21779 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93046/ Test PASSed.
[GitHub] spark issue #21779: [SPARK-24813][TESTS][HIVE][HOTFIX][BRANCH-2.2] HiveExter...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21779 **[Test build #93046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93046/testReport)** for PR 21779 at commit [`c9bcdbf`](https://github.com/apache/spark/commit/c9bcdbf75638365f48819aada51161a3b3fb7ae2). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17086: [SPARK-24101][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17086 Merged build finished. Test PASSed.
[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20864 Merged build finished. Test FAILed.
[GitHub] spark issue #17086: [SPARK-24101][ML][MLLIB] ML Evaluators should use weight...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/17086 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93068/ Test PASSed.
[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20864 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93061/ Test FAILed.
[GitHub] spark issue #20864: [SPARK-23745][SQL]Remove the directories of the “hive....
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20864 **[Test build #93061 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93061/testReport)** for PR 20864 at commit [`b177bf4`](https://github.com/apache/spark/commit/b177bf441a04c0a700e33d7e40c1b2408c3c0c3b). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #17086: [SPARK-24101][ML][MLLIB] ML Evaluators should use weight...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/17086 **[Test build #93068 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93068/testReport)** for PR 17086 at commit [`9e59fd5`](https://github.com/apache/spark/commit/9e59fd592e9cbe43e9fc3d5c317cd3c4e2d6ac43). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21525: [SPARK-24513][ML] Attribute support in UnaryTransformer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21525 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93069/ Test PASSed.
[GitHub] spark issue #21525: [SPARK-24513][ML] Attribute support in UnaryTransformer
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21525 Merged build finished. Test PASSed.
[GitHub] spark issue #21525: [SPARK-24513][ML] Attribute support in UnaryTransformer
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21525 **[Test build #93069 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93069/testReport)** for PR 21525 at commit [`18b5b25`](https://github.com/apache/spark/commit/18b5b25b42e01680e873b915c9b77bfd724b3a45). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21781 **[Test build #93080 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93080/testReport)** for PR 21781 at commit [`9f7e2a2`](https://github.com/apache/spark/commit/9f7e2a24930d1f3ccf783aceb395604d8e2ba913). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21781 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93080/ Test FAILed.
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user maropu commented on the issue: https://github.com/apache/spark/pull/21711 It seems like Avro errors, so I'll trigger when it is fixed.
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21781 Merged build finished. Test FAILed.
[GitHub] spark issue #21102: [SPARK-23913][SQL] Add array_intersect function
Github user kiszk commented on the issue: https://github.com/apache/spark/pull/21102 cc @ueshin --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93079/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21711 **[Test build #93079 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93079/testReport)** for PR 21711 at commit [`bcdae88`](https://github.com/apache/spark/commit/bcdae885df053959cccf6cfc28269b87603c8b58). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20405 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93078/ Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20405 Merged build finished. Test FAILed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20405 **[Test build #93078 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93078/testReport)** for PR 20405 at commit [`47bb245`](https://github.com/apache/spark/commit/47bb245353202208f2c41634c3796c8e4d2be663). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21711 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/992/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20235: [Spark-22887][ML][TESTS][WIP] ML test for StructuredStre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20235 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/93058/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21781

Mostly they are proposed to be closed because of:

1. author's inactivity
2. committer's decision to close (including me)
3. account removed
4. PRs that are being taken over
5. etc.

Authors:
- Please come and say "please take out #X". I will leave this PR open for a few days.
- If you missed it, please reopen your PR when you find some time. Reopening would also refresh it and help the proposed changes make progress.

Contributors: please go ahead if you find some PRs worth taking over. Please leave a comment and open a PR.

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20235: [Spark-22887][ML][TESTS][WIP] ML test for StructuredStre...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20235 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21781 **[Test build #93080 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93080/testReport)** for PR 21781 at commit [`9f7e2a2`](https://github.com/apache/spark/commit/9f7e2a24930d1f3ccf783aceb395604d8e2ba913). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20235: [Spark-22887][ML][TESTS][WIP] ML test for StructuredStre...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20235 **[Test build #93058 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93058/testReport)** for PR 20235 at commit [`c0f3056`](https://github.com/apache/spark/commit/c0f3056d8b737ab23621950d58da714188fe641c). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21781 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21781: [INFRA] Close stale PR
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21781 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/991/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21781: [INFRA] Close stale PR
GitHub user HyukjinKwon opened a pull request: https://github.com/apache/spark/pull/21781 [INFRA] Close stale PR Closes #17422 Closes #17619 Closes #17536 Closes #18034 Closes #18229 Closes #18268 Closes #17973 Closes #18125 Closes #18918 Closes #18812 Closes #18457 Closes #19274 Closes #19456 Closes #19510 Closes #19420 Closes #20090 Closes #20177 Closes #20304 Closes #20319 Closes #20543 Closes #20437 Closes #21261 Closes #21726 Closes #14653 Closes #13143 Closes #17894 Closes #20430 Closes #19758 Closes #12951 Closes #17092 Closes #21079 Closes #21240 Closes #16910 Closes #12904 Closes #21731 Closes #21095 You can merge this pull request into a Git repository by running: $ git pull https://github.com/HyukjinKwon/spark closing-prs Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/21781.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #21781 commit 9f7e2a24930d1f3ccf783aceb395604d8e2ba913 Author: hyukjinkwon Date: 2018-07-16T03:37:48Z Close stale PRs --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21711: [SPARK-24681][SQL] Verify nested column names in Hive me...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21711 **[Test build #93079 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93079/testReport)** for PR 21711 at commit [`bcdae88`](https://github.com/apache/spark/commit/bcdae885df053959cccf6cfc28269b87603c8b58). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21711: [SPARK-24681][SQL] Verify nested column names in ...
Github user maropu commented on a diff in the pull request: https://github.com/apache/spark/pull/21711#discussion_r202568645

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
    @@ -138,17 +138,36 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       }
     
       /**
    -   * Checks the validity of data column names. Hive metastore disallows the table to use comma in
    -   * data column names. Partition columns do not have such a restriction. Views do not have such
    -   * a restriction.
    +   * Checks the validity of data column names. Hive metastore disallows the table to use some
    +   * special characters (',', ':', and ';') in data column names, including nested column names.
    +   * Partition columns do not have such a restriction. Views do not have such a restriction.
        */
       private def verifyDataSchema(
           tableName: TableIdentifier, tableType: CatalogTableType, dataSchema: StructType): Unit = {
         if (tableType != VIEW) {
    -      dataSchema.map(_.name).foreach { colName =>
    -        if (colName.contains(",")) {
    -          throw new AnalysisException("Cannot create a table having a column whose name contains " +
    -            s"commas in Hive metastore. Table: $tableName; Column: $colName")
    +      val invalidChars = Seq(",", ":", ";")
    +      def verifyNestedColumnNames(schema: StructType): Unit = schema.foreach { f =>
    +        f.dataType match {
    +          case st: StructType => verifyNestedColumnNames(st)
    +          case _ if invalidChars.exists(f.name.contains) =>
    +            val errMsg = "Cannot create a table having a nested column whose name contains " +
    +              s"invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
    --- End diff --

aha, I'll fix, thanks!

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21707: Update for spark 2.2.2 release
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21707 oops, branch-2.2 doesn't have the fix yet `https://github.com/apache/spark/pull/21779`. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21442 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/990/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/21442 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20405 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20405 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/989/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20856 Merged build finished. Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20405: [SPARK-23229][SQL] Dataset.hint should use planWithBarri...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20405 **[Test build #93078 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93078/testReport)** for PR 20405 at commit [`47bb245`](https://github.com/apache/spark/commit/47bb245353202208f2c41634c3796c8e4d2be663). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user AmplabJenkins commented on the issue: https://github.com/apache/spark/pull/20856 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/988/ Test PASSed. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21711: [SPARK-24681][SQL] Verify nested column names in ...
Github user gatorsmile commented on a diff in the pull request: https://github.com/apache/spark/pull/21711#discussion_r202567965

    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala ---
    @@ -138,17 +138,36 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       }
     
       /**
    -   * Checks the validity of data column names. Hive metastore disallows the table to use comma in
    -   * data column names. Partition columns do not have such a restriction. Views do not have such
    -   * a restriction.
    +   * Checks the validity of data column names. Hive metastore disallows the table to use some
    +   * special characters (',', ':', and ';') in data column names, including nested column names.
    +   * Partition columns do not have such a restriction. Views do not have such a restriction.
        */
       private def verifyDataSchema(
           tableName: TableIdentifier, tableType: CatalogTableType, dataSchema: StructType): Unit = {
         if (tableType != VIEW) {
    -      dataSchema.map(_.name).foreach { colName =>
    -        if (colName.contains(",")) {
    -          throw new AnalysisException("Cannot create a table having a column whose name contains " +
    -            s"commas in Hive metastore. Table: $tableName; Column: $colName")
    +      val invalidChars = Seq(",", ":", ";")
    +      def verifyNestedColumnNames(schema: StructType): Unit = schema.foreach { f =>
    +        f.dataType match {
    +          case st: StructType => verifyNestedColumnNames(st)
    +          case _ if invalidChars.exists(f.name.contains) =>
    +            val errMsg = "Cannot create a table having a nested column whose name contains " +
    +              s"invalid characters (${invalidChars.map(c => s"'$c'").mkString(", ")}) " +
    --- End diff --

Normally, in this case, what we do is like:

```Scala
val invalidCharsString = invalidChars.map(c => s"'$c'").mkString(", ")
val errMsg = "Cannot create a table having a nested column whose name contains " +
  s"invalid characters ($invalidCharsString) in Hive metastore. Table: $tableName; " +
  s"Column: ${f.name}"
```

--- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
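The nested-column check discussed in this review thread can be sketched without a Spark dependency. The snippet below is only an illustration under stated assumptions: `DataType`, `StringType`, `StructField`, and `StructType` are simplified hypothetical stand-ins for the Spark SQL types of the same names, and `NestedColumnCheck`, `invalidColumnNames`, and `errorMessage` are made-up helpers, not part of the PR. Unlike the excerpted diff, which throws on the first offending leaf column, this variant collects every offending name (including names of struct fields themselves) so the behavior is easy to inspect.

```scala
// Hypothetical, simplified stand-ins for Spark's schema types (assumption: not the real API).
sealed trait DataType
case object StringType extends DataType
final case class StructField(name: String, dataType: DataType)
final case class StructType(fields: Seq[StructField]) extends DataType

object NestedColumnCheck {
  // Characters the review thread says Hive metastore rejects in data column names.
  val invalidChars: Seq[String] = Seq(",", ":", ";")

  // Walks the schema recursively and returns every field name, at any depth,
  // that contains one of the invalid characters.
  def invalidColumnNames(schema: StructType): Seq[String] =
    schema.fields.flatMap { f =>
      val own =
        if (invalidChars.exists(c => f.name.contains(c))) Seq(f.name) else Seq.empty[String]
      val nested = f.dataType match {
        case st: StructType => invalidColumnNames(st)
        case _ => Seq.empty[String]
      }
      own ++ nested
    }

  // Builds the message in the style suggested in the review: the character list
  // is computed once and then interpolated by name.
  def errorMessage(tableName: String, colName: String): String = {
    val invalidCharsString = invalidChars.map(c => s"'$c'").mkString(", ")
    "Cannot create a table having a nested column whose name contains " +
      s"invalid characters ($invalidCharsString) in Hive metastore. Table: $tableName; " +
      s"Column: $colName"
  }
}
```

For a schema with a top-level `id` column and a struct column `info` holding fields `a:b` and `ok`, `NestedColumnCheck.invalidColumnNames` returns only `a:b`. Pulling the character list into `invalidCharsString` first keeps nested string interpolation out of the message, which is the readability point made in the review.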
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/20856 **[Test build #93077 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93077/testReport)** for PR 20856 at commit [`3981421`](https://github.com/apache/spark/commit/39814216026da32eee5aabf3886bbedd3b90ed08). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user SparkQA commented on the issue: https://github.com/apache/spark/pull/21442 **[Test build #93076 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/93076/testReport)** for PR 21442 at commit [`7a354fc`](https://github.com/apache/spark/commit/7a354fcd154ec2d8f88a5c1fbf1bd75fdb15ec49). --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21442: [SPARK-24402] [SQL] Optimize `In` expression when only o...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21442 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #21657: [SPARK-24676][SQL] Project required data from CSV...
Github user asfgit closed the pull request at: https://github.com/apache/spark/pull/21657 --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #19449: [SPARK-22219][SQL] Refactor code to get a value for "spa...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/19449 Hm, shall we leave this closed then? --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20856: [SPARK-23731][SQL] FileSourceScanExec throws NullPointer...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20856 retest this please --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #21240: [SPARK-21274][SQL] Add a new generator function replicat...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/21240 ping @dilipbiswal for an update. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark issue #20095: [SPARK-22126][ML] Added fitMultiple method with default ...
Github user HyukjinKwon commented on the issue: https://github.com/apache/spark/pull/20095 ping @MrBago --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org