[GitHub] [spark] SparkQA removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652142021 **[Test build #124722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124722/testReport)** for PR 28940 at commit [`aac01ca`](https://github.com/apache/spark/commit/aac01ca11c3c024e8a75753e43a217cadb1d8c46). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] SparkQA commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652207766 **[Test build #124722 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124722/testReport)** for PR 28940 at commit [`aac01ca`](https://github.com/apache/spark/commit/aac01ca11c3c024e8a75753e43a217cadb1d8c46). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202419 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124745/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652196837 **[Test build #124745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202127 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124739/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202412 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202412
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202383 **[Test build #124745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
SparkQA removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174626 **[Test build #124739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124739/testReport)** for PR 28647 at commit [`7f9f685`](https://github.com/apache/spark/commit/7f9f68571a535c2ecb46f9036e4988167416f49f).
[GitHub] [spark] AmplabJenkins commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202119
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202119 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
SparkQA commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652201843 **[Test build #124739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124739/testReport)** for PR 28647 at commit [`7f9f685`](https://github.com/apache/spark/commit/7f9f68571a535c2ecb46f9036e4988167416f49f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199044 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124740/ Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199037
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199037 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182370 **[Test build #124740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124740/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8).
[GitHub] [spark] SparkQA commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652198809 **[Test build #124740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124740/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] maropu commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning
maropu commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r448120574
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ##
@@ -60,6 +60,26 @@ case class BroadcastHashJoinExec(
     }
   }
+  override def outputPartitioning: Partitioning = {
+    def buildKeys: Seq[Expression] = buildSide match {
+      case BuildLeft => leftKeys
+      case BuildRight => rightKeys
+    }
+
+    joinType match {
+      case _: InnerLike =>
Review comment: NVM, on second thought, it's difficult to handle this issue on that side.
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652196837 **[Test build #124745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3).
[GitHub] [spark] HyukjinKwon commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
HyukjinKwon commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652196123 Thank you guys. Merged to master and branch-3.0.
[GitHub] [spark] HyukjinKwon closed pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
HyukjinKwon closed pull request #28955: URL: https://github.com/apache/spark/pull/28955
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652195101
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652195101
[GitHub] [spark] sarutak edited a comment on pull request #28803: [SPARK-31971][WEBUI] Add pagination support for all jobs timeline
sarutak edited a comment on pull request #28803: URL: https://github.com/apache/spark/pull/28803#issuecomment-652154162 Hi @gengliangwang, shall we restart discussion?
[GitHub] [spark] AmplabJenkins commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
AmplabJenkins commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652193469
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
AmplabJenkins removed a comment on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652193469
[GitHub] [spark] SparkQA commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
SparkQA commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652192426 **[Test build #124700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124700/testReport)** for PR 28955 at commit [`7a36dd3`](https://github.com/apache/spark/commit/7a36dd397c6276594308529a9fd6ac2c0e81a5c6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
SparkQA removed a comment on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652101599 **[Test build #124700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124700/testReport)** for PR 28955 at commit [`7a36dd3`](https://github.com/apache/spark/commit/7a36dd397c6276594308529a9fd6ac2c0e81a5c6).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448115259
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
 select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null);
 select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null);
--- overflow exception:
+-- overflow exception
 select TIMESTAMP_SECONDS(1230219000123123);
 select TIMESTAMP_SECONDS(-1230219000123123);
 select TIMESTAMP_MILLIS(92233720368547758);
 select TIMESTAMP_MILLIS(-92233720368547758);
+-- truncate exception
+select TIMESTAMP_SECONDS(0.1234567);
Review comment: Shall we have a test case for `allow truncation` together because this PR allows truncation for `double` type?
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448115666
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ##
@@ -2309,6 +2309,108 @@ class HiveDDLSuite
     }
   }
+  test("SPARK-20680: Spark-sql do not support for void column datatype") {
+    withTable("t") {
+      withView("tabVoidType") {
+        val client =
+          spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client
+        client.runSqlHive("CREATE TABLE t (t1 int)")
+        client.runSqlHive("INSERT INTO t VALUES (3)")
+        client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t")
+        checkAnswer(spark.table("tabVoidType"), Row(null))
+        // No exception shows
+        val desc = spark.sql("DESC tabVoidType").collect().toSeq
+        assert(desc.contains(Row("col", "null", null)))
+      }
+    }
+
+    // Forbid CTAS with null type
+    withTable("t1", "t2", "t3") {
+      val e1 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t1 USING PARQUET AS SELECT null as null_col")
+      }.getMessage
+      assert(e1.contains("Cannot create tables with VOID type"))
+
+      val e2 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t2 AS SELECT null as null_col")
+      }.getMessage
+      assert(e2.contains("Cannot create tables with VOID type"))
+
+      val e3 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t3 STORED AS PARQUET AS SELECT null as null_col")
+      }.getMessage
+      assert(e3.contains("Cannot create tables with VOID type"))
+    }
+
+    // Forbid creating table with void/null type in Spark
+    Seq("void", "null").foreach { colType =>
+      withTable("t1", "t2", "t3") {
+        val e1 = intercept[AnalysisException] {
+          spark.sql(s"CREATE TABLE t1 (v $colType) USING parquet")
+        }.getMessage
+        assert(e1.contains("Cannot create tables with VOID type"))
+        val e2 = intercept[AnalysisException] {
+          spark.sql(s"CREATE TABLE t2 (v $colType) USING hive")
Review comment: can we follow the CTAS test and use `STORED AS PARQUET`?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448115408
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
 select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null);
 select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null);
--- overflow exception:
+-- overflow exception
 select TIMESTAMP_SECONDS(1230219000123123);
 select TIMESTAMP_SECONDS(-1230219000123123);
 select TIMESTAMP_MILLIS(92233720368547758);
 select TIMESTAMP_MILLIS(-92233720368547758);
+-- truncate exception
+select TIMESTAMP_SECONDS(0.1234567);
Review comment: This PR aims to allow truncation for both ANSI and legacy mode. Did I understand correctly?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448115259
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
 select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null);
 select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null);
--- overflow exception:
+-- overflow exception
 select TIMESTAMP_SECONDS(1230219000123123);
 select TIMESTAMP_SECONDS(-1230219000123123);
 select TIMESTAMP_MILLIS(92233720368547758);
 select TIMESTAMP_MILLIS(-92233720368547758);
+-- truncate exception
+select TIMESTAMP_SECONDS(0.1234567);
Review comment: Shall we have a test case for `allow truncate` together because this PR allows truncation for `double` type?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448114383
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
Review comment: Since this has `Decimal` and `Double`, can we have `Float` together by using `TIMESTAMP_SECONDS(CAST(1.23 AS FLOAT))`?
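The review comments on #28956 above distinguish two behaviours for fractional `TIMESTAMP_SECONDS` input: the decimal literal `0.1234567` is listed under "truncate exception", while truncation is said to be allowed for the `double` type. A minimal Java sketch of that distinction follows. It is not Spark's implementation: the class `TimestampSecondsSketch`, both method names, and the assumption that the target unit is microseconds are hypothetical and chosen only for illustration.

```java
import java.math.BigDecimal;

/** Hypothetical illustration only; not the code under review in #28956. */
final class TimestampSecondsSketch {
    // Assumption: the timestamp is stored as a count of microseconds.
    private static final long MICROS_PER_SECOND = 1_000_000L;

    private TimestampSecondsSketch() {}

    /**
     * Exact conversion for decimal input. longValueExact() throws
     * ArithmeticException when the scaled value still has a nonzero
     * fractional part or does not fit in a long.
     */
    static long decimalSecondsToMicros(BigDecimal seconds) {
        return seconds.multiply(BigDecimal.valueOf(MICROS_PER_SECOND)).longValueExact();
    }

    /**
     * Lossy conversion for double input. The cast truncates toward zero,
     * so sub-microsecond digits are silently dropped.
     */
    static long doubleSecondsToMicros(double seconds) {
        return (long) (seconds * MICROS_PER_SECOND);
    }

    public static void main(String[] args) {
        System.out.println(decimalSecondsToMicros(new BigDecimal("1.23"))); // prints 1230000
        System.out.println(doubleSecondsToMicros(0.1234567d));              // prints 123456
        try {
            decimalSecondsToMicros(new BigDecimal("0.1234567"));
        } catch (ArithmeticException e) {
            System.out.println("decimal input would need truncation");
        }
    }
}
```

A `Float` input, as requested in the last comment, would widen to `double` in this sketch and take the lossy path.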
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188385
[GitHub] [spark] AmplabJenkins commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188378
[GitHub] [spark] SparkQA commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188033 **[Test build #124744 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124744/testReport)** for PR 28940 at commit [`a80bd6c`](https://github.com/apache/spark/commit/a80bd6c8a5d93187cf06f941c2a9d296a7b6ca61).
[GitHub] [spark] pan3793 commented on a change in pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
pan3793 commented on a change in pull request #28940: URL: https://github.com/apache/spark/pull/28940#discussion_r448112801 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java ## @@ -50,14 +58,18 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi * the internal code in java.io.File would normalize it later, creating a new "foo/bar" * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File * uses, since it is in the package-private class java.io.FileSystem. + * + * On Windows, separator "\" is used instead of "/". + * + * "\\" is legal character in path name on Unix like OS, but illegal on Windows. Review comment: Changed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
AmplabJenkins removed a comment on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-652186162 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124714/ Test FAILed.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448111898 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2309,6 +2309,108 @@ class HiveDDLSuite } } + test("SPARK-20680: Spark-sql do not support for void column datatype") { +withTable("t") { + withView("tabVoidType") { +val client = + spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client +client.runSqlHive("CREATE TABLE t (t1 int)") +client.runSqlHive("INSERT INTO t VALUES (3)") +client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t") +checkAnswer(spark.table("tabVoidType"), Row(null)) +// No exception shows +val desc = spark.sql("DESC tabVoidType").collect().toSeq +assert(desc.contains(Row("col", "null", null))) Review comment: Shall we change `NullType.toString` to use `void`, to match the parser side?
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448111771 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2309,6 +2309,108 @@ class HiveDDLSuite } } + test("SPARK-20680: Spark-sql do not support for void column datatype") { +withTable("t") { + withView("tabVoidType") { +val client = + spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client +client.runSqlHive("CREATE TABLE t (t1 int)") +client.runSqlHive("INSERT INTO t VALUES (3)") +client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t") Review comment: Shall we check TABLE as well, instead of only VIEW?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
AmplabJenkins removed a comment on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-652186156 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
AmplabJenkins commented on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-652186156
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r44876 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala ## @@ -292,6 +293,8 @@ case class PreprocessTableCreation(sparkSession: SparkSession) extends Rule[Logi "in the table definition of " + table.identifier, sparkSession.sessionState.conf.caseSensitiveAnalysis) +assertNoNullTypeInSchema(schema) Review comment: Is this needed? I think the changes in `ResolveCatalogs` and `ResolveSessionCatalog` should cover all the commands. ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ## @@ -106,7 +107,7 @@ class ResolveHiveSerdeTable(session: SparkSession) extends Rule[LogicalPlan] { } else { withStorage } - + assertNoNullTypeInSchema(withSchema.schema) Review comment: ditto
[GitHub] [spark] SparkQA removed a comment on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
SparkQA removed a comment on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-652130288 **[Test build #124714 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124714/testReport)** for PR 28942 at commit [`c10e38f`](https://github.com/apache/spark/commit/c10e38f8669ab9d61b3956052d5a01ce672204c7).
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448110946 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -270,6 +275,7 @@ class ResolveSessionCatalog( SessionCatalogAndTable(catalog, tbl), _, _, _, _, _, _, _, _, _) => val provider = c.provider.getOrElse(conf.defaultDataSourceName) if (!isV2Provider(provider)) { +assertNoNullTypeInSchema(c.tableSchema) Review comment: Ditto, this check can be done at the beginning.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
AmplabJenkins removed a comment on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652185021 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124741/ Test FAILed.
[GitHub] [spark] HeartSaVioR commented on a change in pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
HeartSaVioR commented on a change in pull request #28904: URL: https://github.com/apache/spark/pull/28904#discussion_r448050453 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/CompactibleFileStreamLog.scala ## @@ -106,10 +106,8 @@ abstract class CompactibleFileStreamLog[T <: AnyRef : ClassTag]( interval } - /** - * Filter out the obsolete logs. - */ - def compactLogs(logs: Seq[T]): Seq[T] + /** Determine whether the log should be retained or not. */ + def shouldRetain(log: T): Boolean Review comment: The UT I had to fix leverages this behavior. It is only possible when we collect all entries, since it assumes a DELETE_ACTION can come after an ADD_ACTION "in any further batches" and that `compactLogs` is able to filter such entries out. This also requires us to materialize all entries.
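The trade-off described in this review comment can be sketched as follows. This is a minimal, self-contained illustration and not the Spark implementation: the `Entry` class, the string actions, and the `retainPerEntry` helper are hypothetical stand-ins for the real log entry types. It shows that a whole-log `compactLogs` can drop an "add" entry that is deleted in a later batch, while a per-entry `shouldRetain` only sees one entry at a time and cannot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class CompactionSketch {
  /** Hypothetical log entry: a file path plus an "add" or "delete" action. */
  static final class Entry {
    final String path;
    final String action;

    Entry(String path, String action) {
      this.path = path;
      this.action = action;
    }
  }

  /** Whole-log compaction: needs every entry materialized to find later deletes. */
  static List<Entry> compactLogs(List<Entry> logs) {
    Set<String> deleted = new HashSet<>();
    for (Entry e : logs) {
      if ("delete".equals(e.action)) {
        deleted.add(e.path);
      }
    }
    List<Entry> kept = new ArrayList<>();
    for (Entry e : logs) {
      // Drop the delete marker and every entry for a deleted path.
      if (!deleted.contains(e.path)) {
        kept.add(e);
      }
    }
    return kept;
  }

  /** Per-entry decision: sees a single entry, so it cannot look ahead. */
  static boolean shouldRetain(Entry e) {
    return !"delete".equals(e.action);
  }

  /** Streams over the log and applies shouldRetain to one entry at a time. */
  static List<Entry> retainPerEntry(List<Entry> logs) {
    List<Entry> kept = new ArrayList<>();
    for (Entry e : logs) {
      if (shouldRetain(e)) {
        kept.add(e);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<Entry> logs = Arrays.asList(
        new Entry("a", "add"), new Entry("b", "add"), new Entry("a", "delete"));
    System.out.println("compactLogs keeps " + compactLogs(logs).size() + " entries");
    System.out.println("per-entry filter keeps " + retainPerEntry(logs).size() + " entries");
  }
}
```

In this sketch the per-entry filter still keeps the "add" entry for path `a`, because nothing tells it that a delete arrives later; that is the behavior difference the fixed UT depended on.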
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448110859 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/datetimeExpressions.scala ## @@ -408,19 +407,66 @@ abstract class NumberToTimestampBase extends UnaryExpression } } +// scalastyle:off line.size.limit @ExpressionDescription( - usage = "_FUNC_(seconds) - Creates timestamp from the number of seconds since UTC epoch.", + usage = "_FUNC_(seconds) - Creates timestamp from the number of seconds (can be fractional) since UTC epoch.", examples = """ Examples: > SELECT _FUNC_(1230219000); 2008-12-25 07:30:00 + > SELECT _FUNC_(1230219000.123); + 2008-12-25 07:30:00.123 """, group = "datetime_funcs", since = "3.1.0") -case class SecondsToTimestamp(child: Expression) - extends NumberToTimestampBase { +// scalastyle:on line.size.limit +case class SecondsToTimestamp(child: Expression) extends UnaryExpression + with ExpectsInputTypes with NullIntolerant{ Review comment: nit: `NullIntolerant{` -> `NullIntolerant {`.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448110841 ## File path: sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala ## @@ -102,6 +105,7 @@ class ResolveSessionCatalog( nameParts @ SessionCatalogAndTable(catalog, tbl), _, _, _, _, _) => loadTable(catalog, tbl.asIdentifier).collect { case v1Table: V1Table => + a.dataType.foreach(failNullType) Review comment: This can be done before the `loadTable` call.
[GitHub] [spark] SparkQA removed a comment on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
SparkQA removed a comment on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652184176 **[Test build #124741 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124741/testReport)** for PR 28961 at commit [`a8c5fcc`](https://github.com/apache/spark/commit/a8c5fcc9c05361e5fe78f6d54abfc1ab69a6f486).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
AmplabJenkins removed a comment on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652185016 Merged build finished. Test FAILed.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448110622 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/types/NullType.scala ## @@ -38,4 +38,12 @@ class NullType private() extends DataType { * @since 1.3.0 */ @Stable -case object NullType extends NullType +case object NullType extends NullType { + + def containsNullType(dt: DataType): Boolean = dt match { Review comment: Let's not add a new method to a stable public class. Can we fold it into the `failNullType` method?
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448110440 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala ## @@ -346,4 +346,17 @@ private[sql] object CatalogV2Util { } } } + + def failNullType(dt: DataType): Unit = { +if (NullType.containsNullType(dt)) { + throw new AnalysisException( +"Cannot create tables with VOID type.") +} + } + + def assertNoNullTypeInSchema(schema: StructType): Unit = { +schema.foreach { f => + failNullType(CatalystSqlParser.parseDataType(schema.catalogString)) Review comment: Shouldn't this be `failNullType(f.dataType)`?
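The suggestion in this review comment (check each field's own data type instead of re-parsing the whole schema string once per field) can be sketched as follows. This is a minimal illustration in plain Java and not Spark code: the tiny `DataType` model, the `Field` class, and the use of `IllegalArgumentException` in place of `AnalysisException` are all assumptions made so that the example is self-contained. The recursive null-type check is kept as a helper next to `failNullType` rather than on the type itself.

```java
import java.util.Arrays;
import java.util.List;

final class NullTypeCheckSketch {
  /** Hypothetical stand-ins for the catalyst data types. */
  interface DataType {}

  static final class NullType implements DataType {}

  static final class IntType implements DataType {}

  static final class ArrayType implements DataType {
    final DataType elementType;

    ArrayType(DataType elementType) {
      this.elementType = elementType;
    }
  }

  static final class Field {
    final String name;
    final DataType dataType;

    Field(String name, DataType dataType) {
      this.name = name;
      this.dataType = dataType;
    }
  }

  static final class StructType implements DataType {
    final List<Field> fields;

    StructType(Field... fields) {
      this.fields = Arrays.asList(fields);
    }
  }

  /** Recursively looks for a null type inside arrays and nested structs. */
  private static boolean containsNullType(DataType dt) {
    if (dt instanceof NullType) {
      return true;
    }
    if (dt instanceof ArrayType) {
      return containsNullType(((ArrayType) dt).elementType);
    }
    if (dt instanceof StructType) {
      for (Field f : ((StructType) dt).fields) {
        if (containsNullType(f.dataType)) {
          return true;
        }
      }
    }
    return false;
  }

  static void failNullType(DataType dt) {
    if (containsNullType(dt)) {
      throw new IllegalArgumentException("Cannot create tables with VOID type.");
    }
  }

  /** Checks each field's own type; no schema string is re-parsed per field. */
  static void assertNoNullTypeInSchema(StructType schema) {
    for (Field f : schema.fields) {
      failNullType(f.dataType);
    }
  }

  public static void main(String[] args) {
    assertNoNullTypeInSchema(new StructType(new Field("id", new IntType())));
    System.out.println("schema without a null type passes");
  }
}
```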
[GitHub] [spark] AmplabJenkins commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
AmplabJenkins commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652185016
[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
SparkQA commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652185009 **[Test build #124741 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124741/testReport)** for PR 28961 at commit [`a8c5fcc`](https://github.com/apache/spark/commit/a8c5fcc9c05361e5fe78f6d54abfc1ab69a6f486). * This patch **fails Scala style tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
AmplabJenkins removed a comment on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652184546
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28948: [SPARK-31935][SQL][FOLLOWUP] Hadoop file system config should be effective in data source options
AmplabJenkins removed a comment on pull request #28948: URL: https://github.com/apache/spark/pull/28948#issuecomment-652184523
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448110252 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -2211,6 +2211,8 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging DecimalType(precision.getText.toInt, 0) case ("decimal" | "dec" | "numeric", precision :: scale :: Nil) => DecimalType(precision.getText.toInt, scale.getText.toInt) + case ("void", Nil) => NullType + case ("null", Nil) => NullType Review comment: I'm not sure about this. `null` is also a literal syntax, and this may introduce ambiguity if `null` is also a type name.
[GitHub] [spark] AmplabJenkins commented on pull request #28948: [SPARK-31935][SQL][FOLLOWUP] Hadoop file system config should be effective in data source options
AmplabJenkins commented on pull request #28948: URL: https://github.com/apache/spark/pull/28948#issuecomment-652184523
[GitHub] [spark] AmplabJenkins commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
AmplabJenkins commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652184546
[GitHub] [spark] SparkQA commented on pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
SparkQA commented on pull request #28961: URL: https://github.com/apache/spark/pull/28961#issuecomment-652184176 **[Test build #124741 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124741/testReport)** for PR 28961 at commit [`a8c5fcc`](https://github.com/apache/spark/commit/a8c5fcc9c05361e5fe78f6d54abfc1ab69a6f486).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28948: [SPARK-31935][SQL][FOLLOWUP] Hadoop file system config should be effective in data source options
AmplabJenkins removed a comment on pull request #28948: URL: https://github.com/apache/spark/pull/28948#issuecomment-652120925 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124651/ Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28948: [SPARK-31935][SQL][FOLLOWUP] Hadoop file system config should be effective in data source options
SparkQA commented on pull request #28948: URL: https://github.com/apache/spark/pull/28948#issuecomment-652184175 **[Test build #124742 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124742/testReport)** for PR 28948 at commit [`2d7975e`](https://github.com/apache/spark/commit/2d7975ec67d12025f6c09d688d6cdb033ef5072f).
[GitHub] [spark] SparkQA commented on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
SparkQA commented on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-652184186 **[Test build #124743 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124743/testReport)** for PR 27983 at commit [`c33e0fb`](https://github.com/apache/spark/commit/c33e0fbbd2a5feaa8db2b5b2238f707ea4a73dc0).
[GitHub] [spark] gengliangwang commented on pull request #28948: [SPARK-31935][SQL][FOLLOWUP] Hadoop file system config should be effective in data source options
gengliangwang commented on pull request #28948: URL: https://github.com/apache/spark/pull/28948#issuecomment-652183971 retest this please.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
dongjoon-hyun commented on a change in pull request #28940: URL: https://github.com/apache/spark/pull/28940#discussion_r448109490 ## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java ## @@ -50,14 +58,18 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi * the internal code in java.io.File would normalize it later, creating a new "foo/bar" * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File * uses, since it is in the package-private class java.io.FileSystem. + * + * On Windows, separator "\" is used instead of "/". + * + * "\\" is legal character in path name on Unix like OS, but illegal on Windows. Review comment: Maybe, - `is legal character` -> `is a legal character`. - `Unix like` -> `Unix-like`.
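The Javadoc under review explains why the path is normalized up front and notes that the separator differs between Windows and Unix-like systems. A minimal sketch of separator-aware normalization follows. It is not the code from this pull request: the class name and the `joinNormalized` helper are hypothetical, and the separator is passed as a parameter (instead of reading `java.io.File.separator`) only so that both cases can be exercised on any OS.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class PathJoinSketch {
  /**
   * Joins path parts with the given separator, collapses runs of that
   * separator into one, and drops a trailing separator (except for a
   * path that consists of the separator alone).
   */
  static String joinNormalized(String separator, String... parts) {
    String joined = String.join(separator, parts);
    // Quote the separator: "\" is a regex metacharacter and also special
    // in replacement strings, so both sides need escaping.
    String runOfSeparators = "(?:" + Pattern.quote(separator) + ")+";
    String collapsed =
        joined.replaceAll(runOfSeparators, Matcher.quoteReplacement(separator));
    if (collapsed.length() > 1 && collapsed.endsWith(separator)) {
      collapsed = collapsed.substring(0, collapsed.length() - separator.length());
    }
    return collapsed;
  }

  public static void main(String[] args) {
    System.out.println(joinNormalized("/", "/foo/", "bar", "baz.data"));
    System.out.println(joinNormalized("\\", "C:\\tmp\\", "0c", "f"));
  }
}
```

Quoting matters here because a bare `\` would be treated as an escape character both in the regular expression and in the replacement string.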
[GitHub] [spark] LantaoJin opened a new pull request #28961: [SPARK-32143][SQL] Prevent a skewed join from producing too many partition splits
LantaoJin opened a new pull request #28961: URL: https://github.com/apache/spark/pull/28961
### What changes were proposed in this pull request?
Add a configuration `spark.sql.adaptive.skewJoin.maxPartitionSplits` to prevent a skewed join from producing too many partition splits.
### Why are the changes needed?
When handling a skewed SortMergeJoin, if matching partitions on the left side and the right side are both skewed and are split into too many partitions, plan generation may take a very long time and eventually hit OOM. Even with a fallback to a normal SMJ, the query cannot succeed either, so we should fail this query fast. The logs below show that it took over 1 hour to generate the plan in AQE when handling a skewed join that produced too many splits.
```
20/06/30 12:31:26,271 INFO [HiveServer2-Background-Pool: Thread-821384] adaptive.OptimizeSkewedJoin:54 :
20/06/30 12:31:26,299 INFO [HiveServer2-Background-Pool: Thread-821384] adaptive.OptimizeSkewedJoin:54 : Left side partition 1 (3 TB) is skewed, split it into *39150* parts.
20/06/30 12:31:26,315 INFO [HiveServer2-Background-Pool: Thread-821384] adaptive.OptimizeSkewedJoin:54 : Right side partition 1 (11 TB) is skewed, split it into *17022* parts.
20/06/30 12:32:24,952 INFO [HiveServer2-Background-Pool: Thread-821384] adaptive.OptimizeSkewedJoin:54 : Right side partition 8 (1 GB) is skewed, split it into 17 parts.
...
20/06/30 13:27:25,158 INFO [HiveServer2-Background-Pool: Thread-821384] adaptive.AdaptiveSparkPlanExec:54 : Final plan: CollectLimit 1000
```
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added a UT.
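To give a sense of scale for the split counts in the log above, here is a small worked example. It assumes that every split of a skewed left partition is paired with every split of the matching right partition, which is not stated in the pull request description; the class and method names are hypothetical, and the PR's actual threshold logic is not reproduced here.

```java
final class SkewSplitSketch {
  /**
   * Number of (left split, right split) pairs for one skewed partition,
   * under the assumption that each left split is joined with each right split.
   * Uses long arithmetic so large split counts cannot overflow an int.
   */
  static long pairCount(int leftSplits, int rightSplits) {
    return (long) leftSplits * rightSplits;
  }

  public static void main(String[] args) {
    // Split counts taken from the log lines for partition 1.
    System.out.println(pairCount(39150, 17022));
  }
}
```

Under that assumption, partition 1 alone would yield 39150 × 17022 = 666,411,300 pairs, which is consistent with plan generation taking very long.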
[GitHub] [spark] AmplabJenkins removed a comment on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
AmplabJenkins removed a comment on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-652183126
[GitHub] [spark] AmplabJenkins commented on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
AmplabJenkins commented on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-652183126
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182623
[GitHub] [spark] AmplabJenkins commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182623
[GitHub] [spark] HeartSaVioR commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
HeartSaVioR commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652182305
org.apache.spark.sql.hive.thriftserver.HiveSessionImplSuite.(It is not a test it is a sbt.testing.SuiteSelector)
This seems to be failing frequently. I'll check the result of the other build, which is running as 124719.
[GitHub] [spark] SparkQA commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182370 **[Test build #124740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124740/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8).
[GitHub] [spark] HyukjinKwon commented on pull request #27983: [SPARK-32105][SQL]Refactor current ScriptTransformationExec code
HyukjinKwon commented on pull request #27983: URL: https://github.com/apache/spark/pull/27983#issuecomment-652182347 retest this please
[GitHub] [spark] LantaoJin commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
LantaoJin commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182015 retest this please
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652180003 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124736/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652179998 Merged build finished. Test FAILed.
[GitHub] [spark] HyukjinKwon closed pull request #28959: [SPARK-32088][PYTHON][FOLLOWUP] Replace `collect()` by `show()` in the example for `timestamp_seconds`
HyukjinKwon closed pull request #28959: URL: https://github.com/apache/spark/pull/28959
[GitHub] [spark] AmplabJenkins commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652179998 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652164083 **[Test build #124736 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124736/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8).
[GitHub] [spark] SparkQA commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652179862 **[Test build #124736 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124736/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
dongjoon-hyun commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652177687 Thank you for updating.
[GitHub] [spark] stackedsax commented on pull request #23340: [SPARK-23431][CORE] Expose the new executor memory metrics at the stage level
stackedsax commented on pull request #23340: URL: https://github.com/apache/spark/pull/23340#issuecomment-652176572 @edwinalu thanks for getting this started so long ago. Looks like there are some conflicts after so much time has passed. Do you have time/interest to continue working on this or would you like some help getting this sorted out?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652176078 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124716/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
AmplabJenkins removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652176070 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
AmplabJenkins commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652176070
[GitHub] [spark] SparkQA commented on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
SparkQA commented on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652175830 **[Test build #124716 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124716/testReport)** for PR 28904 at commit [`849e421`](https://github.com/apache/spark/commit/849e421ae0ef2f2b1b98f1aa6517ec73f56cda15). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] ulysses-you commented on a change in pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
ulysses-you commented on a change in pull request #28647: URL: https://github.com/apache/spark/pull/28647#discussion_r448102285
## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
@@ -1066,7 +1066,10 @@ private[hive] object HiveClientImpl extends Logging {
     hiveTable.setSerializationLib(
       table.storage.serde.getOrElse("org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe"))
     table.storage.properties.foreach { case (k, v) => hiveTable.setSerdeParam(k, v) }
-    table.properties.foreach { case (k, v) => hiveTable.setProperty(k, v) }
+    // Hive only retain the useful properties through serde class annotation.
+    // For better compatible with Hive, we remove the metastore properties.
+    val hiveProperties = table.properties -- HIVE_METASTORE_GENERATED_PROPERTIES
Review comment: It also affects `createTable`, but that seems fine.
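The `--` in the quoted diff is Scala's standard `Map` operation for removing a collection of keys. A minimal sketch follows; the property names and the contents of `generatedKeys` are made up for illustration and are not the actual contents of `HIVE_METASTORE_GENERATED_PROPERTIES`.

```scala
object FilterProps {
  // Hypothetical stand-in for HIVE_METASTORE_GENERATED_PROPERTIES.
  val generatedKeys: Set[String] = Set("generated.a", "generated.b")

  /** Drop every entry whose key appears in generatedKeys. */
  def userProperties(props: Map[String, String]): Map[String, String] =
    props -- generatedKeys

  def main(args: Array[String]): Unit = {
    val props = Map("comment" -> "demo", "generated.a" -> "1")
    println(userProperties(props)) // prints Map(comment -> demo)
  }
}
```

Keys in `generatedKeys` that are absent from the map are simply ignored, so the filter is safe to apply to any table's properties.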
[GitHub] [spark] SparkQA removed a comment on pull request #28904: [SPARK-30462][SS] Streamline the logic on file stream source and sink to avoid memory issue
SparkQA removed a comment on pull request #28904: URL: https://github.com/apache/spark/pull/28904#issuecomment-652136149 **[Test build #124716 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124716/testReport)** for PR 28904 at commit [`849e421`](https://github.com/apache/spark/commit/849e421ae0ef2f2b1b98f1aa6517ec73f56cda15).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.
AmplabJenkins removed a comment on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-652174941
[GitHub] [spark] AmplabJenkins commented on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.
AmplabJenkins commented on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-652174941
[GitHub] [spark] AmplabJenkins removed a comment on pull request #23340: [SPARK-23431][CORE] Expose the new executor memory metrics at the stage level
AmplabJenkins removed a comment on pull request #23340: URL: https://github.com/apache/spark/pull/23340#issuecomment-652174521 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174925
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174925 Merged build finished. Test PASSed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174928 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/29351/ Test PASSed.
[GitHub] [spark] SparkQA commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
SparkQA commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174626 **[Test build #124739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124739/testReport)** for PR 28647 at commit [`7f9f685`](https://github.com/apache/spark/commit/7f9f68571a535c2ecb46f9036e4988167416f49f).
[GitHub] [spark] AmplabJenkins commented on pull request #23340: [SPARK-23431][CORE] Expose the new executor memory metrics at the stage level
AmplabJenkins commented on pull request #23340: URL: https://github.com/apache/spark/pull/23340#issuecomment-652174792 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins removed a comment on pull request #23340: [SPARK-23431][CORE] Expose the new executor memory metrics at the stage level
AmplabJenkins removed a comment on pull request #23340: URL: https://github.com/apache/spark/pull/23340#issuecomment-531895234 Can one of the admins verify this patch?
[GitHub] [spark] AmplabJenkins commented on pull request #23340: [SPARK-23431][CORE] Expose the new executor memory metrics at the stage level
AmplabJenkins commented on pull request #23340: URL: https://github.com/apache/spark/pull/23340#issuecomment-652174521 Can one of the admins verify this patch?
[GitHub] [spark] SparkQA commented on pull request #28917: [SPARK-31847][CORE][TESTS] DAGSchedulerSuite: Rewrite the test framework to support apply specified spark configurations.
SparkQA commented on pull request #28917: URL: https://github.com/apache/spark/pull/28917#issuecomment-652174594 **[Test build #124738 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124738/testReport)** for PR 28917 at commit [`da6a9f7`](https://github.com/apache/spark/commit/da6a9f75a76cd86bc1245943b07ebbbcd44a1b13).