[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521126149 Merged build finished. Test PASSed. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org With regards, Apache Git Services - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521126158 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14156/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25448: [SPARK-28697] Invalidate Database/Table names starting with underscore
AmplabJenkins removed a comment on issue #25448: [SPARK-28697] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521125826 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521126158 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14156/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`
AmplabJenkins commented on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()` URL: https://github.com/apache/spark/pull/25408#issuecomment-521126128 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521126149 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`
AmplabJenkins removed a comment on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()` URL: https://github.com/apache/spark/pull/25408#issuecomment-521126133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14157/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`
AmplabJenkins removed a comment on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()` URL: https://github.com/apache/spark/pull/25408#issuecomment-521126128 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()`
AmplabJenkins commented on issue #25408: [SPARK-28687][SQL] Support `epoch`, `isoyear`, `milliseconds` and `microseconds` at `extract()` URL: https://github.com/apache/spark/pull/25408#issuecomment-521126133 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14157/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25448: [SPARK-28697] Invalidate Database/Table names starting with underscore
AmplabJenkins commented on issue #25448: [SPARK-28697] Invalidate Database/Table names starting with underscore URL: https://github.com/apache/spark/pull/25448#issuecomment-521125989 Can one of the admins verify this patch?
[GitHub] [spark] dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521125981 BTW, unfortunately, the ongoing Jenkins tests will be killed in 5 minutes because it's already midnight in the PST timezone. I'll revisit this PR tomorrow. Thanks for testing, @wangyum and @HyukjinKwon.
[GitHub] [spark] AmplabJenkins commented on issue #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour
AmplabJenkins commented on issue #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour URL: https://github.com/apache/spark/pull/25448#issuecomment-521125826 Can one of the admins verify this patch?
[GitHub] [spark] ajithme commented on issue #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour
ajithme commented on issue #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour URL: https://github.com/apache/spark/pull/25448#issuecomment-521125734 @dongjoon-hyun @cloud-fan @HyukjinKwon please review and let me know your opinion on the fix.
[GitHub] [spark] dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
dongjoon-hyun commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
URL: https://github.com/apache/spark/pull/25443#issuecomment-521125536

@wangyum, if we change this PR's `builtinHiveVersion` to `2.3.6`, `HiveThriftServer2Suites` and `HiveMetastoreLazyInitializationSuite` seem to fail.

> val builtinHiveVersion: String = if (isHive23) "2.3.5" else "1.2.1"

Is that the reason we need to do this later?
[GitHub] [spark] ajithme opened a new pull request #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour
ajithme opened a new pull request #25448: [SPARK-28697] Invalidate names starting with _ to avoid unexpected behaviour
URL: https://github.com/apache/spark/pull/25448

## What changes were proposed in this pull request?

I think we should disallow identifiers that start with _ in CREATE DATABASE and CREATE TABLE. Part of the effect can be seen in SPARK-28697: when the table name starts with _ (like _sampleTable), the FileFormat assumes it to be a hidden folder and does not list it, which causes unusual behavior.

## How was this patch tested?

By checking that tables and databases whose names start with an underscore cannot be created. Added a test case for this.
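The hidden-folder effect described in this pull request can be illustrated with a small, self-contained sketch. It assumes the common Hadoop-style convention that entries whose names start with `_` or `.` are skipped during file listing. The object `UnderscoreNameCheck` and its methods `looksHidden` and `validateName` are hypothetical names chosen for illustration; this is not the code proposed in the PR.

```scala
object UnderscoreNameCheck {
  // Assumption: a file listing skips names that start with "_" or ".",
  // treating them as hidden. This mirrors a common convention only.
  def looksHidden(name: String): Boolean =
    name.startsWith("_") || name.startsWith(".")

  // Hypothetical validation hook: reject a database/table name up front
  // instead of letting it silently disappear from file listings later.
  def validateName(kind: String, name: String): Either[String, String] =
    if (name.startsWith("_"))
      Left(s"$kind name '$name' must not start with an underscore")
    else
      Right(name)
}
```

Under these assumptions, a table directory named `_sampleTable` would be treated as hidden by the listing, while `validateName("Table", "_sampleTable")` would reject the name at creation time.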
[GitHub] [spark] SparkQA commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
SparkQA commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521124655 **[Test build #109090 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109090/testReport)** for PR 25447 at commit [`f7d40b0`](https://github.com/apache/spark/commit/f7d40b075680d90b141c888524eb64545ce2081c).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521124064 Merged build finished. Test PASSed.
[GitHub] [spark] hddong commented on a change in pull request #25412: [SPARK-28691][EXAMPLES] Add Java/Scala DirectKerberizedKafkaWordCount examples
hddong commented on a change in pull request #25412: [SPARK-28691][EXAMPLES] Add Java/Scala DirectKerberizedKafkaWordCount examples
URL: https://github.com/apache/spark/pull/25412#discussion_r313726230

## File path: examples/src/main/scala/org/apache/spark/examples/streaming/DirectKerberizedKafkaWordCount.scala ##
@@ -0,0 +1,90 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+// scalastyle:off println
+package org.apache.spark.examples.streaming
+
+import org.apache.kafka.clients.CommonClientConfigs
+import org.apache.kafka.clients.consumer.ConsumerConfig
+import org.apache.kafka.common.security.auth.SecurityProtocol
+import org.apache.kafka.common.serialization.StringDeserializer
+
+import org.apache.spark.SparkConf
+import org.apache.spark.streaming._
+import org.apache.spark.streaming.kafka010._
+
+/**
+ * Consumes messages from one or more topics in Kafka and does wordcount.
+ * Usage: DirectKerberizedKafkaWordCount <brokers> <groupId> <topics>
+ *   <brokers> is a list of one or more Kafka brokers
+ *   <groupId> is a consumer group name to consume from topics
+ *   <topics> is a list of one or more kafka topics to consume from
+ *
+ * Example:
+ *   $ bin/run-example --files ${path}/kafka_jaas.conf \

Review comment:
> Where is `kafka_jaas.conf` file?
> Can we describe how to execute this example from the very first bash command?

`kafka_jaas.conf` can be created manually. I will add a template and describe it in this file.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521124067 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14155/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521124067 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14155/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521124064 Merged build finished. Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly
HyukjinKwon commented on issue #25447: [DO-NOT-MERGE][test-hadoop3.2][test-maven] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521123890 retest this please
[GitHub] [spark] sandeep-katta commented on issue #24566: [SPARK-27667][SQL] Get the current database from spark catalog instead of querying the Hive
sandeep-katta commented on issue #24566: [SPARK-27667][SQL] Get the current database from spark catalog instead of querying the Hive URL: https://github.com/apache/spark/pull/24566#issuecomment-521121995 @wangyum can you please review this? I have added the SQLConf.
[GitHub] [spark] PavithraRamachandran commented on a change in pull request #25394: [SPARK-28671][SQL]when a non exsistent permanent function is dropped, NoSuchPermanentFunctionException is thrown
PavithraRamachandran commented on a change in pull request #25394: [SPARK-28671][SQL]when a non exsistent permanent function is dropped,NoSuchPermanentFunctionException is thrown
URL: https://github.com/apache/spark/pull/25394#discussion_r313722698

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ##
@@ -1114,7 +1114,7 @@ class SessionCatalog(
       }
       externalCatalog.dropFunction(db, name.funcName)
     } else if (!ignoreIfNotExists) {
-      throw new NoSuchFunctionException(db = db, func = identifier.toString)
+      throw new NoSuchPermanentFunctionException(db = db, func = identifier.toString)

Review comment:
cc @maropu
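The branch touched by this diff can be sketched in isolation: dropping a missing function is either silently ignored or reported with the more specific exception, depending on `ignoreIfNotExists`. Everything below is a hypothetical stand-in for illustration. `ToyCatalog` is invented, and the locally defined `NoSuchPermanentFunctionException` only imitates the name of the class referenced in the diff; it is not Spark's class.

```scala
import scala.collection.mutable

// Hypothetical stand-in for the exception named in the diff.
final class NoSuchPermanentFunctionException(db: String, func: String)
  extends Exception(s"Function '$func' not found in database '$db'")

// Hypothetical in-memory catalog holding (database, function) pairs.
final class ToyCatalog {
  private val functions = mutable.Set.empty[(String, String)]

  def register(db: String, name: String): Unit = {
    functions.add((db, name))
    ()
  }

  // Removes the function if present. If it is absent, the call is a no-op
  // when ignoreIfNotExists is true and throws otherwise.
  def dropFunction(db: String, name: String, ignoreIfNotExists: Boolean): Unit = {
    val removed = functions.remove((db, name))
    if (!removed && !ignoreIfNotExists) {
      throw new NoSuchPermanentFunctionException(db, name)
    }
  }
}
```

In this sketch, a caller that passes `ignoreIfNotExists = false` for an unknown function gets an exception type that says the permanent function is missing, which is the behaviour the PR title describes.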
[GitHub] [spark] AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming… URL: https://github.com/apache/spark/pull/25439#issuecomment-521120214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14153/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
SparkQA commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming… URL: https://github.com/apache/spark/pull/25439#issuecomment-521120721 **[Test build #109088 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109088/testReport)** for PR 25439 at commit [`4d5965e`](https://github.com/apache/spark/commit/4d5965ecb48685faed63a751100433a273695e5b).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
AmplabJenkins removed a comment on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521120283 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14154/ Test PASSed.
[GitHub] [spark] SparkQA commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
SparkQA commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521120746 **[Test build #109089 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109089/testReport)** for PR 25115 at commit [`bbf5156`](https://github.com/apache/spark/commit/bbf515666495cbf5f12731b3cdab4a23960f3d77).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
AmplabJenkins removed a comment on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming… URL: https://github.com/apache/spark/pull/25439#issuecomment-521120210 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
AmplabJenkins removed a comment on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521120280 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
AmplabJenkins commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521120283 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14154/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
AmplabJenkins commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521120280 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming… URL: https://github.com/apache/spark/pull/25439#issuecomment-521120210 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
AmplabJenkins commented on issue #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming… URL: https://github.com/apache/spark/pull/25439#issuecomment-521120214 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14153/ Test PASSed.
[GitHub] [spark] cloud-fan commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
cloud-fan commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521119576 retest this please
[GitHub] [spark] cloud-fan closed pull request #25206: [SPARK-28265][SQL] Add renameTable to TableCatalog API
cloud-fan closed pull request #25206: [SPARK-28265][SQL] Add renameTable to TableCatalog API URL: https://github.com/apache/spark/pull/25206
[GitHub] [spark] choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#discussion_r313720790
## File path: streaming/src/main/scala/org/apache/spark/streaming/StreamingContext.scala ##
@@ -575,6 +577,8 @@ class StreamingContext private[streaming] (
       try {
         validate()
+        registerProgressListener()
Review comment: I think so. I believe `StreamingTab` shouldn't be responsible for registering/unregistering the listener, as it could be, and already is, used in other places (metrics). Moreover, there also seems to be a bug: if the UI is disabled, the listener isn't registered and metrics aren't reported.
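The ownership argument in this comment can be sketched in plain Scala. This is only an illustration, not Spark code: `ListenerBus`, `ProgressListener` and `Context` are hypothetical stand-ins, and the single idea shown is that the context registers its listener on start and removes it on stop whether or not a UI exists.

```scala
import scala.collection.mutable

// Hypothetical stand-ins for illustration; these are not Spark's classes.
trait Listener { def onBatchCompleted(batchId: Long): Unit }

final class ListenerBus {
  private val listeners = mutable.LinkedHashSet.empty[Listener]
  def add(l: Listener): Unit = synchronized { listeners += l }
  def remove(l: Listener): Unit = synchronized { listeners -= l }
  def size: Int = synchronized { listeners.size }
}

final class ProgressListener extends Listener {
  private var completed = 0L
  def onBatchCompleted(batchId: Long): Unit = synchronized { completed += 1 }
  def completedBatches: Long = synchronized { completed }
}

// The context owns the listener lifecycle, so registration does not depend on the UI.
final class Context(bus: ListenerBus, uiEnabled: Boolean) {
  val progressListener: Listener = new ProgressListener
  // The UI tab, if any, only reads from the listener; it does not register it.
  val tab: Option[String] = if (uiEnabled) Some("streaming") else None
  def start(): Unit = bus.add(progressListener)
  def stop(): Unit = bus.remove(progressListener)
}
```

With this split, a metrics consumer would still see the listener when `uiEnabled` is false, and nothing stays registered after `stop()`.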
[GitHub] [spark] sandeep-katta commented on a change in pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade
sandeep-katta commented on a change in pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade
URL: https://github.com/apache/spark/pull/25431#discussion_r313720598
## File path: docs/sql-migration-guide-upgrade.md ##
@@ -23,6 +23,9 @@ license: |
 {:toc}
 ## Upgrading From Spark SQL 2.4 to 3.0
+  - Since Spark 3.0, hive is upgraded to 2.3.x, so it is required to update the Hive
Review comment: Okay, understood; this should be in the scope of the Hive upgrade. Thank you @wangyum. I will close this PR as invalid.
[GitHub] [spark] sandeep-katta closed pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade
sandeep-katta closed pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade URL: https://github.com/apache/spark/pull/25431
[GitHub] [spark] viirya commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
viirya commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#discussion_r313719799
## File path: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ##
@@ -213,32 +221,36 @@ class StringIndexer @Since("1.4.0") (
     val labelsArray = $(stringOrderType) match {
       case StringIndexer.frequencyDesc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = false)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>
           counts.toSeq.sortWith(sortFunc).map(_._1).toArray
-        }
+        }.toArray
       case StringIndexer.frequencyAsc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = true)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>
           counts.toSeq.sortWith(sortFunc).map(_._1).toArray
-        }
+        }.toArray
       case StringIndexer.alphabetDesc =>
-        import dataset.sparkSession.implicits._
         dataset.persist()
-        val labels = inputCols.map { inputCol =>
-          dataset.select(inputCol).na.drop().distinct().sort(dataset(s"$inputCol").desc)
-            .as[String].collect()
-        }
+        val selectedCols = getSelectedCols(dataset, inputCols).map(collect_set(_))
+        val allLabels = dataset.select(selectedCols: _*)
+          .collect().toSeq.flatMap(_.toSeq).asInstanceOf[Seq[Seq[String]]]
Review comment: Distinct is done at the executors by the `collect_set` expression. Yes, sorting is done at the driver. This is a good point. I think it depends on the cardinality of the input labels. For StringIndexer, the cardinality should not be very high, and certainly not at the billion level that was suggested. Actually, for frequency-based string order, sorting was already done at the driver previously.
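The split described in this comment (deduplicate where the data lives, sort once on the driver) can be illustrated with plain Scala collections. This is a rough sketch and not Spark code: the nested `Seq` is a hypothetical stand-in for the partitions of one label column, and `LabelSortSketch` and its methods are names invented here.

```scala
object LabelSortSketch {
  // Each inner Seq stands in for one partition of a label column (hypothetical data).
  // Deduplicating per partition mimics the work done before anything reaches the driver.
  def distinctPerPartition(partitions: Seq[Seq[String]]): Seq[Set[String]] =
    partitions.map(_.toSet)

  // Driver side: merge the per-partition sets and sort the merged result.
  // The cost of this step grows with label cardinality, not with row count.
  def sortedLabels(partitions: Seq[Seq[String]], descending: Boolean): Array[String] = {
    val merged = distinctPerPartition(partitions).foldLeft(Set.empty[String])(_ ++ _)
    val ascending = merged.toArray.sorted
    if (descending) ascending.reverse else ascending
  }
}
```

The sketch makes the trade-off in the comment concrete: the driver-side sort is cheap as long as the number of distinct labels stays small.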
[GitHub] [spark] cloud-fan commented on issue #25206: [SPARK-28265][SQL] Add renameTable to TableCatalog API
cloud-fan commented on issue #25206: [SPARK-28265][SQL] Add renameTable to TableCatalog API URL: https://github.com/apache/spark/pull/25206#issuecomment-521117919 thanks, merging to master!
[GitHub] [spark] choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#discussion_r313719160
## File path: streaming/src/test/scala/org/apache/spark/streaming/InputStreamsSuite.scala ##
@@ -52,8 +52,6 @@ class InputStreamsSuite extends TestSuiteBase with BeforeAndAfter {
       // Set up the streaming context and input streams
       withStreamingContext(new StreamingContext(conf, batchDuration)) { ssc =>
-        ssc.addStreamingListener(ssc.progressListener)
-
Review comment: Correct.
[GitHub] [spark] choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#discussion_r313719072
## File path: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ##
@@ -373,7 +374,7 @@ class StreamingContextSuite extends SparkFunSuite with BeforeAndAfter with TimeL
     Thread.sleep(100)
   }
-  test ("registering and de-registering of streamingSource") {
+  test("registering and de-registering of streamingSource") {
Review comment: Got it, reverted.
[GitHub] [spark] choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#discussion_r313718989
## File path: streaming/src/test/scala/org/apache/spark/streaming/StreamingContextSuite.scala ##
@@ -392,6 +393,29 @@ class StreamingContextSuite extends SparkFunSuite with BeforeAndAfter with TimeL
     assert(!sourcesAfterStop.contains(streamingSourceAfterStop))
   }
+  test("registering and de-registering of progressListener") {
Review comment: Sure, updated.
[GitHub] [spark] choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
choojoyq commented on a change in pull request #25439: [SPARK-28709][DSTREAMS] - Fix StreamingContext leak through Streaming…
URL: https://github.com/apache/spark/pull/25439#discussion_r313718918
## File path: core/src/main/scala/org/apache/spark/ui/SparkUI.scala ##
@@ -138,6 +138,10 @@ private[spark] class SparkUI private (
     streamingJobProgressListener = Option(sparkListener)
   }
+  def clearStreamingJobProgressListener(): Unit = {
+    streamingJobProgressListener = None
+  }
+
Review comment: Removed.
[GitHub] [spark] wangyum commented on a change in pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade
wangyum commented on a change in pull request #25431: [SPARK-28711][DOCS] Update migration guide to add note about Hive upgrade
URL: https://github.com/apache/spark/pull/25431#discussion_r313718805
## File path: docs/sql-migration-guide-upgrade.md ##
@@ -23,6 +23,9 @@ license: |
 {:toc}
 ## Upgrading From Spark SQL 2.4 to 3.0
+  - Since Spark 3.0, hive is upgraded to 2.3.x, so it is required to update the Hive
Review comment: There are two things here:
1. If you want to improve the performance of the Hive Metastore Server, the correct way is to upgrade your Hive Metastore Server to the latest version. `SCHEMA_VERSION` should be updated by [Hive itself](https://github.com/apache/hive/blob/c57a59611fa168ee38c6ee0ee60b1d6c4994f9f8/metastore/scripts/upgrade/mysql/upgrade-1.2.0-to-1.3.0.mysql.sql).
2. Upgrading the built-in Hive to 2.3.x still brings benefits if your Hive Metastore Server is 1.2.x, such as [SPARK-12014](https://issues.apache.org/jira/browse/SPARK-12014), [SPARK-27500](https://issues.apache.org/jira/browse/SPARK-27500) and [SPARK-26321](https://issues.apache.org/jira/browse/SPARK-26321).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115705 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109085/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521112095 **[Test build #109085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109085/testReport)** for PR 25447 at commit [`0c89766`](https://github.com/apache/spark/commit/0c897661afb5f716c404d6892b550b04140be153).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115698 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115698 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115681 **[Test build #109085 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109085/testReport)** for PR 25447 at commit [`0c89766`](https://github.com/apache/spark/commit/0c897661afb5f716c404d6892b550b04140be153). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115705 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109085/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115415 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109087/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115392 **[Test build #109087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109087/testReport)** for PR 25447 at commit [`130bef4`](https://github.com/apache/spark/commit/130bef4c51373554afc3427cd09a6af52a63ef86).
[GitHub] [spark] SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115392 **[Test build #109087 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109087/testReport)** for PR 25447 at commit [`130bef4`](https://github.com/apache/spark/commit/130bef4c51373554afc3427cd09a6af52a63ef86).
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115410 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115404 **[Test build #109087 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109087/testReport)** for PR 25447 at commit [`130bef4`](https://github.com/apache/spark/commit/130bef4c51373554afc3427cd09a6af52a63ef86). * This patch **fails due to an unknown error code, 125**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115415 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109087/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521115410 Merged build finished. Test FAILed.
[GitHub] [spark] viirya commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
viirya commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#discussion_r313716725
## File path: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ##
@@ -213,32 +221,36 @@ class StringIndexer @Since("1.4.0") (
     val labelsArray = $(stringOrderType) match {
       case StringIndexer.frequencyDesc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = false)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>
Review comment: I picked this number following other places in Spark that use `ThreadUtils.parmap`. I'm not sure we can use the driver core config (`spark.driver.cores`), as it is not meant for this purpose. According to the documentation, this config applies only in cluster mode.
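For readers unfamiliar with the pattern, a bounded parallel map over per-column counts might look roughly like the sketch below. It is not Spark's `ThreadUtils.parmap`, which is internal to Spark; `ParSortSketch`, `boundedParMap` and `labelsByFrequencyDesc` are hypothetical names, and breaking ties by label is an assumption made only to keep the output deterministic.

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ParSortSketch {
  // Simplified stand-in for a bounded parallel map: at most maxThreads tasks run at once.
  def boundedParMap[A, B](in: Seq[A], maxThreads: Int)(f: A => B): Seq[B] = {
    val pool = Executors.newFixedThreadPool(maxThreads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = in.map(a => Future(f(a)))
      // Future.sequence preserves input order, so results line up with the columns.
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally {
      pool.shutdown()
    }
  }

  // Sort each column's (label, count) pairs by descending count, one task per column.
  def labelsByFrequencyDesc(countsPerColumn: Seq[Map[String, Long]]): Seq[Array[String]] =
    boundedParMap(countsPerColumn, maxThreads = 8) { counts =>
      counts.toSeq.sortBy { case (label, n) => (-n, label) }.map(_._1).toArray
    }
}
```

The thread cap is an ordinary parameter in this sketch, which is why a fixed value such as 8 works without reading any driver configuration.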
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521114985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14152/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521114985 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14152/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521114983 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521114983 Merged build finished. Test PASSed.
[GitHub] [spark] brkyvz commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
brkyvz commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#discussion_r313715521
## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ##
@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalog.v2
+
+import scala.collection.mutable
+import scala.util.control.NonFatal
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow
+ * the caller to look up a catalog by name.
+ */
+class CatalogManager(conf: SQLConf) extends Logging {
+
+  private val catalogs = mutable.HashMap.empty[String, CatalogPlugin]
+
+  def catalog(name: String): CatalogPlugin = synchronized {
+    catalogs.getOrElseUpdate(name, Catalogs.load(name, conf))
+  }
+
+  def defaultCatalog: Option[CatalogPlugin] = {
+    conf.defaultV2Catalog.flatMap { catalogName =>
+      try {
+        Some(catalog(catalogName))
+      } catch {
+        case NonFatal(e) =>
+          logError(s"Cannot load default v2 catalog: $catalogName", e)
+          None
+      }
+    }
+  }
+
+  def v2SessionCatalog: Option[CatalogPlugin] = {
+    try {
+      Some(catalog(CatalogManager.SESSION_CATALOG_NAME))
+    } catch {
+      case NonFatal(e) =>
+        logError("Cannot load v2 session catalog", e)
+        None
+    }
+  }
+
+  private def getDefaultNamespace(c: CatalogPlugin) = c match {
+    case c: SupportsNamespaces => c.defaultNamespace()
+    case _ => Array.empty[String]
+  }
+
+  private var _currentNamespace = {
+    // The builtin catalog use "default" as the default database.
Review comment: I think we're saying the same thing. I totally agree that, for a catalog `c1`,
```sql
SELECT ... FROM c1.t
```
should be a fully qualified identifier. I'm saying that we should push the namespace configuration into the catalog that supports it. It shouldn't be part of the CatalogManager.
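One way to picture the suggestion is the plain-Scala sketch below. All names in it (`Plugin`, `HasNamespaces`, `InMemoryCatalog`, `FlatCatalog`, `Manager`) are hypothetical simplifications rather than Spark's API; it only assumes that the manager caches loaded catalogs under a lock and asks each catalog for its own default namespace instead of holding that state itself.

```scala
import scala.collection.mutable

// Hypothetical, simplified interfaces for illustration; not Spark's actual API.
trait Plugin { def name: String }
trait HasNamespaces extends Plugin { def defaultNamespace: Array[String] }

final class InMemoryCatalog(val name: String) extends HasNamespaces {
  val defaultNamespace: Array[String] = Array("default")
}
final class FlatCatalog(val name: String) extends Plugin

final class Manager(load: String => Plugin) {
  private val catalogs = mutable.HashMap.empty[String, Plugin]

  // Load each catalog at most once; synchronized guards the mutable map.
  def catalog(name: String): Plugin = synchronized {
    catalogs.getOrElseUpdate(name, load(name))
  }

  // The manager only asks; the catalog decides what its default namespace is.
  def defaultNamespaceOf(name: String): Array[String] = catalog(name) match {
    case c: HasNamespaces => c.defaultNamespace
    case _ => Array.empty[String]
  }
}
```

In this arrangement a catalog without namespace support simply yields an empty namespace, and the manager carries no namespace state of its own.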
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#discussion_r313714736

## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceResolution.scala ##

@@ -45,8 +45,8 @@ case class DataSourceResolution(
   import org.apache.spark.sql.catalog.v2.CatalogV2Implicits._
   import lookup._

-  lazy val v2SessionCatalog: CatalogPlugin = lookup.sessionCatalog
-    .getOrElse(throw new AnalysisException("No v2 session catalog implementation is available"))
+  def v2SessionCatalog: CatalogPlugin = lookup.sessionCatalog

Review comment:
The `LookupCatalog` has some convenient utils, e.g. `CatalogObjectIdentifier`, `AsTableIdentifier`, etc. I think we should still keep it. BTW good point about making this rule take `CatalogManager` directly. Will update it soon.
[GitHub] [spark] cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
cloud-fan commented on a change in pull request #25368: [SPARK-28635][SQL] create CatalogManager to track registered v2 catalogs
URL: https://github.com/apache/spark/pull/25368#discussion_r313714259

## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalog/v2/CatalogManager.scala ##

@@ -0,0 +1,100 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalog.v2
+
+import scala.collection.mutable
+import scala.util.control.NonFatal
+
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.internal.SQLConf
+
+/**
+ * A thread-safe manager for [[CatalogPlugin]]s. It tracks all the registered catalogs, and allow
+ * the caller to look up a catalog by name.
+ */
+class CatalogManager(conf: SQLConf) extends Logging {
+
+  private val catalogs = mutable.HashMap.empty[String, CatalogPlugin]
+
+  def catalog(name: String): CatalogPlugin = synchronized {
+    catalogs.getOrElseUpdate(name, Catalogs.load(name, conf))
+  }
+
+  def defaultCatalog: Option[CatalogPlugin] = {
+    conf.defaultV2Catalog.flatMap { catalogName =>
+      try {
+        Some(catalog(catalogName))
+      } catch {
+        case NonFatal(e) =>
+          logError(s"Cannot load default v2 catalog: $catalogName", e)
+          None
+      }
+    }
+  }
+
+  def v2SessionCatalog: Option[CatalogPlugin] = {
+    try {
+      Some(catalog(CatalogManager.SESSION_CATALOG_NAME))
+    } catch {
+      case NonFatal(e) =>
+        logError("Cannot load v2 session catalog", e)
+        None
+    }
+  }
+
+  private def getDefaultNamespace(c: CatalogPlugin) = c match {
+    case c: SupportsNamespaces => c.defaultNamespace()
+    case _ => Array.empty[String]
+  }
+
+  private var _currentNamespace = {
+    // The builtin catalog use "default" as the default database.

Review comment:
I think the current namespace only makes sense for the current catalog. E.g. in `SELECT ... FROM t`, `t` can be a table in the current catalog's current namespace. However, for `SELECT ... FROM c1.t`, it's confusing to say `t` is a table in catalog `c1`'s current namespace. When a table identifier starts with a catalog name, it should be a fully qualified identifier, and we can't apply the current namespace here. A catalog (including `V2SessionCatalog`) can report its default namespace, which will be used as the current namespace when switching to that catalog for the first time.
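The resolution rule described in this comment can be sketched roughly as follows. This is a simplified model under stated assumptions, not Spark's analyzer code: `SessionState`, `Resolved`, and `resolve` are hypothetical names, and the handling of a multi-part name without a catalog prefix (treating the leading parts as an explicit namespace in the current catalog) is an assumption that the comment does not spell out.

```scala
object NamespaceResolutionSketch {
  // Hypothetical result type: which catalog, namespace and table a name resolves to.
  final case class Resolved(catalog: String, namespace: Seq[String], table: String)

  // Hypothetical session state: registered catalog names plus the current catalog
  // and that catalog's current namespace.
  final case class SessionState(
      catalogs: Set[String],
      currentCatalog: String,
      currentNamespace: Seq[String])

  def resolve(nameParts: Seq[String], state: SessionState): Resolved = {
    require(nameParts.nonEmpty, "identifier must have at least one part")
    if (nameParts.length > 1 && state.catalogs.contains(nameParts.head)) {
      // Starts with a catalog name: the rest is fully qualified,
      // so the current namespace is NOT applied.
      val rest = nameParts.tail
      Resolved(nameParts.head, rest.init, rest.last)
    } else if (nameParts.length == 1) {
      // Bare table name: current catalog, current namespace.
      Resolved(state.currentCatalog, state.currentNamespace, nameParts.head)
    } else {
      // Assumption: an explicit namespace without a catalog prefix stays in the current catalog.
      Resolved(state.currentCatalog, nameParts.init, nameParts.last)
    }
  }
}
```

Under this sketch, `SELECT ... FROM t` picks up the current namespace, while `SELECT ... FROM c1.t` resolves `t` with an empty namespace in `c1`, regardless of what the session's current namespace is.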
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-52677 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14150/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins removed a comment on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-52669 Merged build finished. Test PASSed.
[GitHub] [spark] xianyinxin commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2
xianyinxin commented on issue #25115: [SPARK-28351][SQL] Support DELETE in DataSource V2 URL: https://github.com/apache/spark/pull/25115#issuecomment-521112238 It seems the failing PySpark test has nothing to do with this PR.
[GitHub] [spark] SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
SparkQA commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-521112095 **[Test build #109085 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109085/testReport)** for PR 25447 at commit [`0c89766`](https://github.com/apache/spark/commit/0c897661afb5f716c404d6892b550b04140be153).
[GitHub] [spark] SparkQA commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
SparkQA commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521112119 **[Test build #109086 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109086/testReport)** for PR 25443 at commit [`77a70ae`](https://github.com/apache/spark/commit/77a70ae1b98b538a315ca7f53e44fd15a49b0ec2).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-52726 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-52730 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14151/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-52677 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14150/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-52730 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14151/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
AmplabJenkins commented on issue #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447#issuecomment-52669 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-52726 Merged build finished. Test PASSed.
[GitHub] [spark] felixcheung commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
felixcheung commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#discussion_r313712444

## File path: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ##

@@ -213,32 +221,36 @@ class StringIndexer @Since("1.4.0") (
     val labelsArray = $(stringOrderType) match {
       case StringIndexer.frequencyDesc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = false)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>

Review comment:
How is 8 picked here? Should this be ~= the number of driver cores or something?
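To illustrate the alternative raised in this question, here is a minimal sketch that derives the thread count from the cores visible to the JVM, not from a hard-coded 8. It uses only the Scala and Java standard libraries in place of the `ThreadUtils.parmap` call in the diff; `parallelSort` is a hypothetical helper, and both the descending-frequency order and the tie-break by label are assumptions made for the example, not necessarily what `StringIndexer.getSortFunc` does.

```scala
import java.util.concurrent.Executors

import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object ParallelSortSketch {
  // Hypothetical helper: sorts each column's label counts on a bounded thread pool.
  // maxThreads defaults to the number of processors available to this JVM.
  def parallelSort(
      counts: Seq[Map[String, Long]],
      maxThreads: Int = Runtime.getRuntime.availableProcessors()): Seq[Array[String]] = {
    // Never use more threads than there are columns to sort, and always at least one.
    val threads = math.max(1, math.min(maxThreads, counts.length))
    val pool = Executors.newFixedThreadPool(threads)
    implicit val ec: ExecutionContext = ExecutionContext.fromExecutorService(pool)
    try {
      val futures = counts.map { c =>
        Future {
          // Descending frequency; ties broken by label so the result is deterministic.
          c.toSeq.sortBy { case (label, n) => (-n, label) }.map(_._1).toArray
        }
      }
      Await.result(Future.sequence(futures), Duration.Inf)
    } finally {
      pool.shutdown()
    }
  }
}
```

Whether the driver's core count is the right bound is itself a judgment call, since the driver JVM may share cores with other work; the sketch only shows how such a bound could be computed and capped.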
[GitHub] [spark] felixcheung commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
felixcheung commented on a change in pull request #25442: [SPARK-28722][ML] Change sequential label sorting in StringIndexer fit to parallel
URL: https://github.com/apache/spark/pull/25442#discussion_r313713261

## File path: mllib/src/main/scala/org/apache/spark/ml/feature/StringIndexer.scala ##

@@ -213,32 +221,36 @@ class StringIndexer @Since("1.4.0") (
     val labelsArray = $(stringOrderType) match {
       case StringIndexer.frequencyDesc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = false)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>
           counts.toSeq.sortWith(sortFunc).map(_._1).toArray
-        }
+        }.toArray
       case StringIndexer.frequencyAsc =>
         val sortFunc = StringIndexer.getSortFunc(ascending = true)
-        countByValue(dataset, inputCols).map { counts =>
+        val orgStrings = countByValue(dataset, inputCols).toSeq
+        ThreadUtils.parmap(orgStrings, "sortingStringLabels", 8) { counts =>
           counts.toSeq.sortWith(sortFunc).map(_._1).toArray
-        }
+        }.toArray
       case StringIndexer.alphabetDesc =>
-        import dataset.sparkSession.implicits._
         dataset.persist()
-        val labels = inputCols.map { inputCol =>
-          dataset.select(inputCol).na.drop().distinct().sort(dataset(s"$inputCol").desc)
-            .as[String].collect()
-        }
+        val selectedCols = getSelectedCols(dataset, inputCols).map(collect_set(_))
+        val allLabels = dataset.select(selectedCols: _*)
+          .collect().toSeq.flatMap(_.toSeq).asInstanceOf[Seq[Seq[String]]]

Review comment:
So this can be selecting a large number of columns and collecting it all to the driver for distinct/sort? Isn't this possibly very slow if we have billions of records?
[GitHub] [spark] HyukjinKwon opened a new pull request #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly
HyukjinKwon opened a new pull request #25447: [DO-NOT-MERGE] Investigate JAVA_HOME not being set properly URL: https://github.com/apache/spark/pull/25447 ## What changes were proposed in this pull request? Do not merge ## How was this patch tested? N/A
[GitHub] [spark] SparkQA commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
SparkQA commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402#issuecomment-521110620 **[Test build #109084 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/109084/testReport)** for PR 25402 at commit [`673d95a`](https://github.com/apache/spark/commit/673d95a58fb1b80618c9d626acc8d1a64dd61d51).
[GitHub] [spark] AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-521110184 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14149/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
AmplabJenkins removed a comment on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402#issuecomment-521110185 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins removed a comment on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-521110178 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521110129 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14147/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
AmplabJenkins removed a comment on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402#issuecomment-521110189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14148/ Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521110127 Merged build finished. Test PASSed.
[GitHub] [spark] HyukjinKwon commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
HyukjinKwon commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521110213 Let me open a separate PR and proceed
[GitHub] [spark] AmplabJenkins commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
AmplabJenkins commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402#issuecomment-521110185 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521110127 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-521110184 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14149/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths
AmplabJenkins commented on issue #25348: [SPARK-28554][SQL] Adds a v1 fallback writer implementation for v2 data source codepaths URL: https://github.com/apache/spark/pull/25348#issuecomment-521110178 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins commented on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521110129 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14147/ Test PASSed.
[GitHub] [spark] AmplabJenkins commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog
AmplabJenkins commented on issue #25402: [SPARK-28666] Support saveAsTable for V2 tables through Session Catalog URL: https://github.com/apache/spark/pull/25402#issuecomment-521110189 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/14148/ Test PASSed.
[GitHub] [spark] felixcheung commented on issue #24922: [SPARK-28120][SS] Rocksdb state storage implementation
felixcheung commented on issue #24922: [SPARK-28120][SS] Rocksdb state storage implementation URL: https://github.com/apache/spark/pull/24922#issuecomment-521110033 cool. is the issue here https://github.com/apache/spark/pull/24922#issuecomment-510327508 resolved?
[GitHub] [spark] AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins
AmplabJenkins removed a comment on issue #25443: [WIP][SPARK-28723][test-hadoop3.2][test-maven] Test JDK 11 with Hadoop-3.2/Hive 2.3.6 on jenkins URL: https://github.com/apache/spark/pull/25443#issuecomment-521109054 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/109083/ Test FAILed.