[GitHub] [spark] HyukjinKwon commented on pull request #28771: [2.4][SPARK-31935][SQL] Hadoop file system config should be effective in data source options
HyukjinKwon commented on pull request #28771: URL: https://github.com/apache/spark/pull/28771#issuecomment-652231653 Yeah, it should be best to have LGTM or at least some positive comments. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448149646 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala ## @@ -292,6 +293,8 @@ case class PreprocessTableCreation(sparkSession: SparkSession) extends Rule[Logi "in the table definition of " + table.identifier, sparkSession.sessionState.conf.caseSensitiveAnalysis) +assertNoNullTypeInSchema(schema) Review comment: Without this, `CREATE TABLE t1 USING PARQUET AS SELECT null as null_col` in Spark will throw `Parquet data source does not support null data type.` instead of `Cannot create tables with VOID type`. Compared with the error message from Hive, `SemanticException [Error 10305]: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: col`, that is confusing. So it is better to keep it.
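The kind of check discussed in this comment can be sketched as follows. This is a minimal, self-contained illustration, not the code from the pull request: the data type classes below are toy stand-ins rather than Spark's `org.apache.spark.sql.types`, and whether the real `assertNoNullTypeInSchema` also inspects nested types is an assumption. Only the error message text is taken from the comment.

```scala
// Toy stand-ins for Spark's data types (hypothetical, NOT the real Spark classes).
object VoidTypeCheckSketch {
  sealed trait DataType
  case object IntType extends DataType
  case object NullType extends DataType
  final case class ArrayType(elementType: DataType) extends DataType
  final case class StructField(name: String, dataType: DataType)
  final case class StructType(fields: Seq[StructField]) extends DataType

  // True if NullType appears anywhere in the (possibly nested) type.
  // Checking nested types is an assumption made for this sketch.
  def containsNullType(dt: DataType): Boolean = dt match {
    case NullType           => true
    case ArrayType(element) => containsNullType(element)
    case StructType(fields) => fields.exists(f => containsNullType(f.dataType))
    case _                  => false
  }

  // Hypothetical counterpart of the assertNoNullTypeInSchema call in the diff:
  // fail early with a VOID-specific message instead of a data source error.
  def assertNoNullTypeInSchema(schema: StructType): Unit =
    if (containsNullType(schema)) {
      throw new IllegalArgumentException("Cannot create tables with VOID type")
    }
}
```

Running such a check before the data source is consulted is what lets the user see the VOID-specific message rather than the Parquet one.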
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652230833 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124715/ Test FAILed.
[GitHub] [spark] dongjoon-hyun edited a comment on pull request #28771: [2.4][SPARK-31935][SQL] Hadoop file system config should be effective in data source options
dongjoon-hyun edited a comment on pull request #28771: URL: https://github.com/apache/spark/pull/28771#issuecomment-652230135 Note that I'm fine with that backporting because it seems that you were confident with this and it looked urgent to you.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652230826 Merged build finished. Test FAILed.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448154450 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2309,6 +2309,108 @@ class HiveDDLSuite } } + test("SPARK-20680: Spark-sql do not support for void column datatype") { +withTable("t") { + withView("tabVoidType") { +val client = + spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client +client.runSqlHive("CREATE TABLE t (t1 int)") +client.runSqlHive("INSERT INTO t VALUES (3)") +client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t") +checkAnswer(spark.table("tabVoidType"), Row(null)) +// No exception shows +val desc = spark.sql("DESC tabVoidType").collect().toSeq +assert(desc.contains(Row("col", "null", null))) Review comment: I mean `DataType.simpleString`. I think it looks better if DESC TABLE returns `Row("col", "void", null)` here.
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652230826
[GitHub] [spark] dongjoon-hyun commented on pull request #28771: [2.4][SPARK-31935][SQL] Hadoop file system config should be effective in data source options
dongjoon-hyun commented on pull request #28771: URL: https://github.com/apache/spark/pull/28771#issuecomment-652230135 Note that I'm fine with this because it seems that you were confident with this and it looked urgent to you.
[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652136148 **[Test build #124715 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124715/testReport)** for PR 28953 at commit [`f2e68a2`](https://github.com/apache/spark/commit/f2e68a2c27f430ae13ff53792174f70a55e95be8).
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448153666 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala ## @@ -2211,6 +2211,8 @@ class AstBuilder(conf: SQLConf) extends SqlBaseBaseVisitor[AnyRef] with Logging DecimalType(precision.getText.toInt, 0) case ("decimal" | "dec" | "numeric", precision :: scale :: Nil) => DecimalType(precision.getText.toInt, scale.getText.toInt) + case ("void", Nil) => NullType + case ("null", Nil) => NullType Review comment: Em, you are right. I will remove this.
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652229722 **[Test build #124715 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124715/testReport)** for PR 28953 at commit [`f2e68a2`](https://github.com/apache/spark/commit/f2e68a2c27f430ae13ff53792174f70a55e95be8). * This patch **fails PySpark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] dongjoon-hyun commented on pull request #28771: [2.4][SPARK-31935][SQL] Hadoop file system config should be effective in data source options
dongjoon-hyun commented on pull request #28771: URL: https://github.com/apache/spark/pull/28771#issuecomment-652228985 Hi, @gengliangwang . Did you merge this without any LGTM from other committers? cc @gatorsmile , @cloud-fan , @HyukjinKwon
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448150865 ## File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala ## @@ -106,7 +107,7 @@ class ResolveHiveSerdeTable(session: SparkSession) extends Rule[LogicalPlan] { } else { withStorage } - + assertNoNullTypeInSchema(withSchema.schema) Review comment: This can be removed.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
dongjoon-hyun commented on a change in pull request #28962: URL: https://github.com/apache/spark/pull/28962#discussion_r448150897 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala ## @@ -123,7 +123,8 @@ object NormalizeFloatingNumbers extends Rule[LogicalPlan] { val fields = expr.dataType.asInstanceOf[StructType].fields.indices.map { i => normalize(GetStructField(expr, i)) } - CreateStruct(fields) + val struct = CreateStruct(fields) + If(IsNull(expr), Literal(null, struct.dataType), struct) Review comment: The AS-IS also looks good to me.
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
dongjoon-hyun commented on a change in pull request #28962: URL: https://github.com/apache/spark/pull/28962#discussion_r448150338 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/optimizer/NormalizeFloatingNumbers.scala ## @@ -123,7 +123,8 @@ object NormalizeFloatingNumbers extends Rule[LogicalPlan] { val fields = expr.dataType.asInstanceOf[StructType].fields.indices.map { i => normalize(GetStructField(expr, i)) } - CreateStruct(fields) + val struct = CreateStruct(fields) + If(IsNull(expr), Literal(null, struct.dataType), struct) Review comment: I'm just wondering if we need a new Literal here. Maybe we can simply use `expr`? ```scala - val struct = CreateStruct(fields) - If(IsNull(expr), Literal(null, struct.dataType), struct) + If(IsNull(expr), expr, CreateStruct(fields)) ```
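To make the effect of the null guard in this diff concrete, here is a small toy model. It does not use Catalyst: a struct is modelled as an `Option` of optional doubles, and all names are hypothetical. It assumes the problem named in the PR title is that rebuilding a struct field by field turns a null struct into a non-null struct of null fields.

```scala
// Toy model only; none of these names are Catalyst API.
object NullStructSketch {
  // None models a null struct; an inner None models a null field.
  type Struct = Option[Vector[Option[Double]]]

  // Simplified normalization: map -0.0 to 0.0 and leave other values alone.
  def normalizeField(d: Double): Double = if (d == 0.0d) 0.0d else d

  // Reading field i of a null struct yields null (modelled after GetStructField).
  def getField(s: Struct, i: Int): Option[Double] = s.flatMap(_(i))

  // Without a guard: always rebuild, so a null struct becomes a struct of nulls.
  def rebuildAlways(s: Struct, arity: Int): Struct =
    Some(Vector.tabulate(arity)(i => getField(s, i).map(normalizeField)))

  // With the guard, in the shape If(IsNull(expr), expr, CreateStruct(fields)):
  // a null input is returned unchanged.
  def rebuildWithNullGuard(s: Struct, arity: Int): Struct =
    if (s.isEmpty) s else rebuildAlways(s, arity)
}
```

In this model, returning the input itself for the null case and returning a fresh typed null are indistinguishable, which is the point of the reviewer's question.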
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448149646 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala ## @@ -292,6 +293,8 @@ case class PreprocessTableCreation(sparkSession: SparkSession) extends Rule[Logi "in the table definition of " + table.identifier, sparkSession.sessionState.conf.caseSensitiveAnalysis) +assertNoNullTypeInSchema(schema) Review comment: Without this, "CREATE TABLE t1 USING PARQUET AS SELECT null as null_col" will throw "Parquet data source does not support null data type." instead of "Cannot create tables with VOID type"
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
AmplabJenkins removed a comment on pull request #28951: URL: https://github.com/apache/spark/pull/28951#issuecomment-652225009
[GitHub] [spark] AmplabJenkins commented on pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
AmplabJenkins commented on pull request #28951: URL: https://github.com/apache/spark/pull/28951#issuecomment-652225009
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
AmplabJenkins removed a comment on pull request #28956: URL: https://github.com/apache/spark/pull/28956#issuecomment-652224005
[GitHub] [spark] dongjoon-hyun commented on pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
dongjoon-hyun commented on pull request #28951: URL: https://github.com/apache/spark/pull/28951#issuecomment-652224404 Thank you all! This lands at `master/3.0/2.4`.
[GitHub] [spark] AmplabJenkins commented on pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
AmplabJenkins commented on pull request #28956: URL: https://github.com/apache/spark/pull/28956#issuecomment-652224005
[GitHub] [spark] SparkQA commented on pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
SparkQA commented on pull request #28956: URL: https://github.com/apache/spark/pull/28956#issuecomment-652223503 **[Test build #124747 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124747/testReport)** for PR 28956 at commit [`e41095d`](https://github.com/apache/spark/commit/e41095d73b106c62ea9e9c8e3dce2c47be2a6ed1).
[GitHub] [spark] SparkQA removed a comment on pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
SparkQA removed a comment on pull request #28951: URL: https://github.com/apache/spark/pull/28951#issuecomment-652109787 **[Test build #124704 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124704/testReport)** for PR 28951 at commit [`984c652`](https://github.com/apache/spark/commit/984c65275bdf68bf8ff1f9ce3c91696a90e65439).
[GitHub] [spark] SparkQA commented on pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
SparkQA commented on pull request #28951: URL: https://github.com/apache/spark/pull/28951#issuecomment-652223150 **[Test build #124704 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124704/testReport)** for PR 28951 at commit [`984c652`](https://github.com/apache/spark/commit/984c65275bdf68bf8ff1f9ce3c91696a90e65439). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448145159 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2309,6 +2309,108 @@ class HiveDDLSuite } } + test("SPARK-20680: Spark-sql do not support for void column datatype") { +withTable("t") { + withView("tabVoidType") { +val client = + spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client +client.runSqlHive("CREATE TABLE t (t1 int)") +client.runSqlHive("INSERT INTO t VALUES (3)") +client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t") +checkAnswer(spark.table("tabVoidType"), Row(null)) +// No exception shows +val desc = spark.sql("DESC tabVoidType").collect().toSeq +assert(desc.contains(Row("col", "null", null))) Review comment: `NullType.toString` returns "NullType". What does this comment mean?
[GitHub] [spark] dongjoon-hyun closed pull request #28951: [SPARK-32131][SQL] Fix AnalysisException messages at UNION/EXCEPT/MINUS operations
dongjoon-hyun closed pull request #28951: URL: https://github.com/apache/spark/pull/28951
[GitHub] [spark] gengliangwang commented on a change in pull request #28805: [SPARK-28169][SQL] Convert scan predicate condition to CNF
gengliangwang commented on a change in pull request #28805: URL: https://github.com/apache/spark/pull/28805#discussion_r448143683 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -249,8 +251,36 @@ trait PredicateHelper extends Logging { resultStack.top } - private def groupExpressionsByQualifier(expressions: Seq[Expression]): Seq[Expression] = { - expressions.groupBy(_.references.map(_.qualifier)).map(_._2.reduceLeft(And)).toSeq + /** + * Convert an expression to conjunctive normal form when pushing predicates through Join, + * when expand predicates, we can group by the qualifier avoiding generate unnecessary + * expression to control the length of final result since there are multiple tables. + * + * @param condition condition need to be converted + * @return the CNF result as sequence of disjunctive expressions. If the number of expressions + * exceeds threshold on converting `Or`, `Seq.empty` is returned. + */ + def conjunctiveNormalFormAndGroupExpsByQualifier(condition: Expression): Seq[Expression] = { Review comment: On second thought, the method name `conjunctiveNormalFormAndGroupExpsByQualifier` is too long and the `And` is weird. How about changing to `CNFWithGroupExpressionsByQualifier`?
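The grouping step quoted in this diff (`groupBy` on the qualifiers, then `reduceLeft(And)` within each group) can be illustrated with a self-contained toy. The expression classes below are hypothetical stand-ins, not Catalyst's `Expression` hierarchy, and each predicate simply carries its set of table qualifiers.

```scala
// Toy stand-ins, not Catalyst classes.
object GroupByQualifierSketch {
  sealed trait Expr { def qualifiers: Set[String] }
  final case class Pred(sql: String, qualifiers: Set[String]) extends Expr
  final case class And(left: Expr, right: Expr) extends Expr {
    def qualifiers: Set[String] = left.qualifiers ++ right.qualifiers
  }

  // Predicates that touch the same set of tables are AND-ed into one
  // expression, so later expansion sees fewer, larger operands.
  def groupByQualifier(exprs: Seq[Expr]): Seq[Expr] =
    exprs.groupBy(_.qualifiers).map(_._2.reduceLeft[Expr](And(_, _))).toSeq
}
```

For example, two predicates on table `a` and one on table `b` collapse into two expressions; the order of the resulting groups is not guaranteed, because it comes from a map.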
[GitHub] [spark] gengliangwang commented on a change in pull request #28805: [SPARK-28169][SQL] Convert scan predicate condition to CNF
gengliangwang commented on a change in pull request #28805: URL: https://github.com/apache/spark/pull/28805#discussion_r448143802 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/predicates.scala ## @@ -249,8 +251,36 @@ trait PredicateHelper extends Logging { resultStack.top } - private def groupExpressionsByQualifier(expressions: Seq[Expression]): Seq[Expression] = { - expressions.groupBy(_.references.map(_.qualifier)).map(_._2.reduceLeft(And)).toSeq + /** + * Convert an expression to conjunctive normal form when pushing predicates through Join, + * when expand predicates, we can group by the qualifier avoiding generate unnecessary + * expression to control the length of final result since there are multiple tables. + * + * @param condition condition need to be converted + * @return the CNF result as sequence of disjunctive expressions. If the number of expressions + * exceeds threshold on converting `Or`, `Seq.empty` is returned. + */ + def conjunctiveNormalFormAndGroupExpsByQualifier(condition: Expression): Seq[Expression] = { +conjunctiveNormalForm(condition, (expressions: Seq[Expression]) => + expressions.groupBy(_.references.map(_.qualifier)).map(_._2.reduceLeft(And)).toSeq) + } + + /** + * Convert an expression to conjunctive normal form for predicate pushdown and partition pruning. + * When expanding predicates, this method groups expressions by their references for reducing + * the size of pushed down predicates and corresponding codegen. In partition pruning strategies, + * we split filters by [[splitConjunctivePredicates]] and partition filters by judging if it's + * references is subset of partCols, if we combine expressions group by reference when expand + * predicate of [[Or]], it won't impact final predicate pruning result since + * [[splitConjunctivePredicates]] won't split [[Or]] expression. + * + * @param condition condition need to be converted + * @return the CNF result as sequence of disjunctive expressions. 
If the number of expressions + * exceeds threshold on converting `Or`, `Seq.empty` is returned. + */ + def conjunctiveNormalFormAndGroupExpsByReference(condition: Expression): Seq[Expression] = { Review comment: How about changing to `CNFWithGroupExpressionsByReference`?
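The doc comment in the diff above describes converting a predicate to conjunctive normal form while grouping the operands of each `Or` by the references they touch. A minimal, self-contained sketch of that idea follows. It is not Spark's implementation: `Pred`, `And`, `Or`, `group_by_reference`, `cnf`, and the `max_nodes` threshold are all hypothetical names for a toy expression tree, and the recursion is a simplification chosen for readability.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Pred:
    """Toy leaf predicate; `refs` names the tables or columns it reads."""
    text: str
    refs: FrozenSet[str]


@dataclass(frozen=True)
class And:
    left: object
    right: object


@dataclass(frozen=True)
class Or:
    left: object
    right: object


def references(expr) -> FrozenSet[str]:
    """Collect every reference used anywhere inside `expr`."""
    if isinstance(expr, Pred):
        return expr.refs
    return references(expr.left) | references(expr.right)


def group_by_reference(exprs: List[object]) -> List[object]:
    """AND together all expressions that share the same reference set."""
    groups = {}
    for e in exprs:
        groups.setdefault(references(e), []).append(e)
    grouped = []
    for members in groups.values():
        combined = members[0]
        for m in members[1:]:
            combined = And(combined, m)
        grouped.append(combined)
    return grouped


def cnf(expr, group: Callable[[List[object]], List[object]],
        max_nodes: int = 128) -> List[object]:
    """Return a list of conjuncts, or [] if expanding an Or exceeds max_nodes."""
    if isinstance(expr, And):
        left = cnf(expr.left, group, max_nodes)
        right = cnf(expr.right, group, max_nodes)
        return left + right if left and right else []
    if isinstance(expr, Or):
        # Grouping before distribution keeps the cross product small.
        left = group(cnf(expr.left, group, max_nodes))
        right = group(cnf(expr.right, group, max_nodes))
        if not left or not right or len(left) * len(right) > max_nodes:
            return []
        return [Or(l, r) for l in left for r in right]
    return [expr]
```

For `(a1 AND a2 AND b1) OR c1`, where `a1` and `a2` both reference table `a`, grouping yields two conjuncts, `(a1 AND a2) OR c1` and `b1 OR c1`, instead of three.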
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448143565 ## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ## @@ -2309,6 +2309,108 @@ class HiveDDLSuite } } + test("SPARK-20680: Spark-sql do not support for void column datatype") { +withTable("t") { + withView("tabVoidType") { +val client = + spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client +client.runSqlHive("CREATE TABLE t (t1 int)") +client.runSqlHive("INSERT INTO t VALUES (3)") +client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t") Review comment: `client.runSqlHive("CREATE TABLE tabVoidType AS SELECT NULL AS col FROM t")` will throw FAILED: SemanticException [Error 10305]: CREATE-TABLE-AS-SELECT creates a VOID type, please use CAST to specify the type, near field: col
[GitHub] [spark] MaxGekk commented on pull request #27366: [SPARK-30648][SQL] Support filters pushdown in JSON datasource
MaxGekk commented on pull request #27366: URL: https://github.com/apache/spark/pull/27366#issuecomment-652220337 @HyukjinKwon @cloud-fan Please have a look at this PR.
[GitHub] [spark] cloud-fan commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
cloud-fan commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448142644 ## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ## @@ -2,13 +2,16 @@ -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null); +select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d); select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null); select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null); --- overflow exception: +-- overflow exception select TIMESTAMP_SECONDS(1230219000123123); select TIMESTAMP_SECONDS(-1230219000123123); select TIMESTAMP_MILLIS(92233720368547758); select TIMESTAMP_MILLIS(-92233720368547758); +-- truncate exception +select TIMESTAMP_SECONDS(0.1234567); Review comment: Yes, float/double are approximate values, so truncation should always be allowed.
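The distinction drawn in the review comment above can be sketched as follows: an exact decimal input that cannot be represented in microseconds is rejected, while an approximate float/double input is simply truncated. This is only an illustration of the rule as stated in the thread, not Spark's code; `seconds_to_micros` and the error message are hypothetical.

```python
from decimal import Decimal
from typing import Union

MICROS_PER_SECOND = 1_000_000


def seconds_to_micros(value: Union[int, float, Decimal]) -> int:
    """Convert seconds since the epoch to whole microseconds."""
    if isinstance(value, Decimal):
        # Exact type: losing digits would silently change the value, so fail.
        micros = value * MICROS_PER_SECOND
        if micros != micros.to_integral_value():
            raise ArithmeticError("Rounding necessary")
        return int(micros)
    # int or float: floats are approximate anyway, so truncate toward zero.
    return int(value * MICROS_PER_SECOND)
```

Under these assumptions, `Decimal("0.1234567")` raises because its seventh fractional digit does not fit into microseconds, while the float `0.1234567` is truncated to `123456`.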
[GitHub] [spark] cloud-fan edited a comment on pull request #28954: [SPARK-32083][SQL] Apply CoalesceShufflePartitions when input RDD has 0 partitions with AQE
cloud-fan edited a comment on pull request #28954: URL: https://github.com/apache/spark/pull/28954#issuecomment-652219037 After more thought, maybe a better way is to add a new rule in `AdaptiveSparkPlanExec.optimizer` that converts `LogicalQueryStage` to an empty `LocalRelation` if the size is 0. This is not really "coalesce partitions", so we'd better not do it in `CoalesceShufflePartitions`.
[GitHub] [spark] LantaoJin commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
LantaoJin commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448138143 ## File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala ## @@ -346,4 +346,17 @@ private[sql] object CatalogV2Util { } } } + + def failNullType(dt: DataType): Unit = { +if (NullType.containsNullType(dt)) { + throw new AnalysisException( +"Cannot create tables with VOID type.") +} + } + + def assertNoNullTypeInSchema(schema: StructType): Unit = { +schema.foreach { f => + failNullType(CatalystSqlParser.parseDataType(schema.catalogString)) Review comment: Ah, yes. If we remove `CatalystSqlParser.parseDataType(schema.catalogString)` first, we can then also remove `case ("null", Nil) => NullType`.
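The check discussed above can be illustrated without re-parsing the schema string: walk each field's data type directly and fail as soon as a null (VOID) type is found at any nesting level. The sketch below uses a hypothetical toy type model (`NullType`, `IntType`, `ArrayType`, `MapType`, `StructType`, `StructField`); it mirrors the intent of `failNullType` and `assertNoNullTypeInSchema` but is not Spark's code, and `ValueError` stands in for `AnalysisException`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class NullType:
    pass


@dataclass(frozen=True)
class IntType:
    pass


@dataclass(frozen=True)
class ArrayType:
    element: object


@dataclass(frozen=True)
class MapType:
    key: object
    value: object


@dataclass(frozen=True)
class StructField:
    name: str
    data_type: object


@dataclass(frozen=True)
class StructType:
    fields: Tuple[StructField, ...]


def contains_null_type(dt) -> bool:
    """True if `dt` is the null type or nests it anywhere."""
    if isinstance(dt, NullType):
        return True
    if isinstance(dt, ArrayType):
        return contains_null_type(dt.element)
    if isinstance(dt, MapType):
        return contains_null_type(dt.key) or contains_null_type(dt.value)
    if isinstance(dt, StructType):
        return any(contains_null_type(f.data_type) for f in dt.fields)
    return False


def assert_no_null_type_in_schema(schema: StructType) -> None:
    """Check each field's own type; no round trip through a type string."""
    for field in schema.fields:
        if contains_null_type(field.data_type):
            raise ValueError("Cannot create tables with VOID type.")
```

Checking `field.data_type` per field avoids the pattern in the quoted diff, where the loop variable `f` is unused and the whole schema string is parsed on every iteration.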
[GitHub] [spark] cloud-fan commented on pull request #28954: [SPARK-32083][SQL] Apply CoalesceShufflePartitions when input RDD has 0 partitions with AQE
cloud-fan commented on pull request #28954: URL: https://github.com/apache/spark/pull/28954#issuecomment-652219037 After more thought, maybe a better way is to add a new rule in `AdaptiveSparkPlanExec.optimizer` that converts `LogicalQueryStage` to an empty `LocalRelation` if the size is 0.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
AmplabJenkins removed a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652217235 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124734/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
AmplabJenkins removed a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652217228 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
SparkQA removed a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652160085 **[Test build #124734 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124734/testReport)** for PR 28833 at commit [`fdf57bf`](https://github.com/apache/spark/commit/fdf57bf2b6e4606c357965bc82126c82e1675ac5).
[GitHub] [spark] AmplabJenkins commented on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
AmplabJenkins commented on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652217228
[GitHub] [spark] SparkQA commented on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
SparkQA commented on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652217007 **[Test build #124734 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124734/testReport)** for PR 28833 at commit [`fdf57bf`](https://github.com/apache/spark/commit/fdf57bf2b6e4606c357965bc82126c82e1675ac5). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] zhli1142015 commented on pull request #28859: [SPARK-32024][WEBUI] Update ApplicationStoreInfo.size during HistoryServerDiskManager initializing
zhli1142015 commented on pull request #28859: URL: https://github.com/apache/spark/pull/28859#issuecomment-652215562 Gentle ping, @jiangxb1987, @dongjoon-hyun, @ueshin, @gengliangwang: could you please help review?
[GitHub] [spark] dongjoon-hyun commented on pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
dongjoon-hyun commented on pull request #28962: URL: https://github.com/apache/spark/pull/28962#issuecomment-652214503 Thank you so much, @viirya.
[GitHub] [spark] dongjoon-hyun closed pull request #28949: [SPARK-32028][WEBUI][2.4] fix app id link for multi attempts app in history summary page
dongjoon-hyun closed pull request #28949: URL: https://github.com/apache/spark/pull/28949
[GitHub] [spark] manuzhang commented on a change in pull request #28954: [SPARK-32083][SQL] Apply CoalesceShufflePartitions when input RDD has 0 partitions with AQE
manuzhang commented on a change in pull request #28954: URL: https://github.com/apache/spark/pull/28954#discussion_r448135495 ## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/CoalesceShufflePartitions.scala ## @@ -79,8 +66,29 @@ case class CoalesceShufflePartitions(session: SparkSession) extends Rule[SparkPl case stage: ShuffleQueryStageExec if stageIds.contains(stage.id) => CustomShuffleReaderExec(stage, partitionSpecs) } + } + + if (validMetrics.isEmpty) { Review comment: I think it's like coalescing one fewer shuffle, and it's handled by the `nonEmpty` code.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
AmplabJenkins removed a comment on pull request #28962: URL: https://github.com/apache/spark/pull/28962#issuecomment-652210749
[GitHub] [spark] SparkQA commented on pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
SparkQA commented on pull request #28962: URL: https://github.com/apache/spark/pull/28962#issuecomment-652212688 **[Test build #124746 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124746/testReport)** for PR 28962 at commit [`41a318e`](https://github.com/apache/spark/commit/41a318eb5808e3f3838c12301330ea9c1b3a351f).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
AmplabJenkins removed a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652211295
[GitHub] [spark] AmplabJenkins commented on pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
AmplabJenkins commented on pull request #28962: URL: https://github.com/apache/spark/pull/28962#issuecomment-652210749
[GitHub] [spark] SparkQA removed a comment on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
SparkQA removed a comment on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652114014 **[Test build #124706 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124706/testReport)** for PR 28833 at commit [`5aa4c1a`](https://github.com/apache/spark/commit/5aa4c1ae4ab9f67d947dcc33f7e1ff07a1d22858).
[GitHub] [spark] SparkQA commented on pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
SparkQA commented on pull request #28833: URL: https://github.com/apache/spark/pull/28833#issuecomment-652210408 **[Test build #124706 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124706/testReport)** for PR 28833 at commit [`5aa4c1a`](https://github.com/apache/spark/commit/5aa4c1ae4ab9f67d947dcc33f7e1ff07a1d22858). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] viirya commented on pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
viirya commented on pull request #28962: URL: https://github.com/apache/spark/pull/28962#issuecomment-652210688 cc @cloud-fan @HyukjinKwon
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652209724 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124722/ Test FAILed.
[GitHub] [spark] viirya opened a new pull request #28962: [SPARK-32136][SQL] NormalizeFloatingNumbers should work on null struct
viirya opened a new pull request #28962: URL: https://github.com/apache/spark/pull/28962 ### What changes were proposed in this pull request? This patch fixes an incorrect groupBy result when the grouping key is a null struct. ### Why are the changes needed? `NormalizeFloatingNumbers` reconstructs a struct if the input expression is of StructType. If the input struct is null, it reconstructs a struct with null-valued fields instead of returning null. ### Does this PR introduce _any_ user-facing change? Yes, it fixes the incorrect groupBy result. ### How was this patch tested? Unit test.
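The bug described in the pull request above can be reproduced in miniature. The sketch below models a struct as a Python dict and a null struct as `None`; `normalize_float`, `normalize_struct_naive`, `normalize_struct`, and the fixed field list `FIELDS` are hypothetical names for illustration and do not correspond to Spark's `NormalizeFloatingNumbers` internals.

```python
import math
from typing import Dict, Optional

FIELDS = ("a", "b")  # hypothetical struct schema with two double fields


def normalize_float(x: Optional[float]) -> Optional[float]:
    """Canonicalize NaN and map -0.0 to 0.0 so equal keys compare equal."""
    if x is None:
        return None
    if math.isnan(x):
        return float("nan")
    if x == 0.0:
        return 0.0  # also covers -0.0, since -0.0 == 0.0
    return x


def normalize_struct_naive(s: Optional[Dict[str, float]]):
    """Rebuild the struct field by field; a null input yields null fields."""
    return {name: normalize_float(None if s is None else s[name])
            for name in FIELDS}


def normalize_struct(s: Optional[Dict[str, float]]):
    """Same rebuild, but a null struct stays null."""
    if s is None:
        return None
    return {name: normalize_float(s[name]) for name in FIELDS}
```

With the naive version, a null key and a struct whose fields are all null become indistinguishable after normalization, which is the kind of collision that would produce a wrong grouping result; the guarded version keeps them apart.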
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652209718 Merged build finished. Test FAILed.
[GitHub] [spark] dongjoon-hyun commented on pull request #28957: [WIP][SPARK-32138] Drop Python 2.7, 3.4 and 3.5
dongjoon-hyun commented on pull request #28957: URL: https://github.com/apache/spark/pull/28957#issuecomment-652210013 Wow, nice investigation, @Fokko! Thanks.
[GitHub] [spark] AmplabJenkins commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652209724
[GitHub] [spark] SparkQA removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652142021 **[Test build #124722 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124722/testReport)** for PR 28940 at commit [`aac01ca`](https://github.com/apache/spark/commit/aac01ca11c3c024e8a75753e43a217cadb1d8c46).
[GitHub] [spark] SparkQA commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652207766 **[Test build #124722 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124722/testReport)** for PR 28940 at commit [`aac01ca`](https://github.com/apache/spark/commit/aac01ca11c3c024e8a75753e43a217cadb1d8c46). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202419 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124745/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202412 Merged build finished. Test FAILed.
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202412
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652202383 **[Test build #124745 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3). * This patch **fails to generate documentation**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652196837 **[Test build #124745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3).
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202127 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124739/ Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
SparkQA removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652174626 **[Test build #124739 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124739/testReport)** for PR 28647 at commit [`7f9f685`](https://github.com/apache/spark/commit/7f9f68571a535c2ecb46f9036e4988167416f49f).
[GitHub] [spark] AmplabJenkins commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202119
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
AmplabJenkins removed a comment on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652202119 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA commented on pull request #28647: [SPARK-31828][SQL] Retain table properties at CreateTableLikeCommand
SparkQA commented on pull request #28647: URL: https://github.com/apache/spark/pull/28647#issuecomment-652201843 **[Test build #124739 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124739/testReport)** for PR 28647 at commit [`7f9f685`](https://github.com/apache/spark/commit/7f9f68571a535c2ecb46f9036e4988167416f49f). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199044 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124740/ Test FAILed.
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199037 Merged build finished. Test FAILed.
[GitHub] [spark] SparkQA removed a comment on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA removed a comment on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652182370 **[Test build #124740 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124740/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8).
[GitHub] [spark] AmplabJenkins commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
AmplabJenkins commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652199037
[GitHub] [spark] SparkQA commented on pull request #28947: [SPARK-32129][SQL] Support AQE skew join with Union
SparkQA commented on pull request #28947: URL: https://github.com/apache/spark/pull/28947#issuecomment-652198809 **[Test build #124740 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124740/testReport)** for PR 28947 at commit [`d011e9a`](https://github.com/apache/spark/commit/d011e9a11f416c73af4a602f9966db35c2643dd8). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] maropu commented on a change in pull request #28676: [SPARK-31869][SQL] BroadcastHashJoinExec can utilize the build side for its output partitioning
maropu commented on a change in pull request #28676: URL: https://github.com/apache/spark/pull/28676#discussion_r448120574
## File path: sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastHashJoinExec.scala ##
@@ -60,6 +60,26 @@ case class BroadcastHashJoinExec(
     }
   }
 
+  override def outputPartitioning: Partitioning = {
+    def buildKeys: Seq[Expression] = buildSide match {
+      case BuildLeft => leftKeys
+      case BuildRight => rightKeys
+    }
+
+    joinType match {
+      case _: InnerLike =>
Review comment: NVM, on second thought, it's difficult to handle this issue on that side.
[GitHub] [spark] SparkQA commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
SparkQA commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652196837 **[Test build #124745 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124745/testReport)** for PR 28953 at commit [`0db0376`](https://github.com/apache/spark/commit/0db0376b4eaed4b02739080b1ba3d1e4c6e97bd3).
[GitHub] [spark] HyukjinKwon commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
HyukjinKwon commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652196123 Thank you guys. Merged to master and branch-3.0.
[GitHub] [spark] HyukjinKwon closed pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
HyukjinKwon closed pull request #28955: URL: https://github.com/apache/spark/pull/28955
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins removed a comment on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652195101
[GitHub] [spark] AmplabJenkins commented on pull request #28953: [SPARK-32013][SQL] Support query execution before/after reading/writing DataFrame over JDBC
AmplabJenkins commented on pull request #28953: URL: https://github.com/apache/spark/pull/28953#issuecomment-652195101
[GitHub] [spark] sarutak edited a comment on pull request #28803: [SPARK-31971][WEBUI] Add pagination support for all jobs timeline
sarutak edited a comment on pull request #28803: URL: https://github.com/apache/spark/pull/28803#issuecomment-652154162 Hi @gengliangwang, shall we restart discussion?
[GitHub] [spark] AmplabJenkins commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
AmplabJenkins commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652193469
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
AmplabJenkins removed a comment on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652193469
[GitHub] [spark] SparkQA commented on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
SparkQA commented on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652192426 **[Test build #124700 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124700/testReport)** for PR 28955 at commit [`7a36dd3`](https://github.com/apache/spark/commit/7a36dd397c6276594308529a9fd6ac2c0e81a5c6). * This patch passes all tests. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] [spark] SparkQA removed a comment on pull request #28955: [SPARK-32142][SQL][TESTS] Keep the original tests and codes to avoid potential conflicts in dev
SparkQA removed a comment on pull request #28955: URL: https://github.com/apache/spark/pull/28955#issuecomment-652101599 **[Test build #124700 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124700/testReport)** for PR 28955 at commit [`7a36dd3`](https://github.com/apache/spark/commit/7a36dd397c6276594308529a9fd6ac2c0e81a5c6).
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448115259
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
 select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null);
 select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null);
--- overflow exception:
+-- overflow exception
 select TIMESTAMP_SECONDS(1230219000123123);
 select TIMESTAMP_SECONDS(-1230219000123123);
 select TIMESTAMP_MILLIS(92233720368547758);
 select TIMESTAMP_MILLIS(-92233720368547758);
+-- truncate exception
+select TIMESTAMP_SECONDS(0.1234567);
Review comment: Shall we have a test case for `allow truncation` together because this PR allows truncation for `double` type?
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448115666
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ##
@@ -2309,6 +2309,108 @@ class HiveDDLSuite
     }
   }
 
+  test("SPARK-20680: Spark-sql do not support for void column datatype") {
+    withTable("t") {
+      withView("tabVoidType") {
+        val client =
+          spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client
+        client.runSqlHive("CREATE TABLE t (t1 int)")
+        client.runSqlHive("INSERT INTO t VALUES (3)")
+        client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t")
+        checkAnswer(spark.table("tabVoidType"), Row(null))
+        // No exception shows
+        val desc = spark.sql("DESC tabVoidType").collect().toSeq
+        assert(desc.contains(Row("col", "null", null)))
+      }
+    }
+
+    // Forbid CTAS with null type
+    withTable("t1", "t2", "t3") {
+      val e1 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t1 USING PARQUET AS SELECT null as null_col")
+      }.getMessage
+      assert(e1.contains("Cannot create tables with VOID type"))
+
+      val e2 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t2 AS SELECT null as null_col")
+      }.getMessage
+      assert(e2.contains("Cannot create tables with VOID type"))
+
+      val e3 = intercept[AnalysisException] {
+        spark.sql("CREATE TABLE t3 STORED AS PARQUET AS SELECT null as null_col")
+      }.getMessage
+      assert(e3.contains("Cannot create tables with VOID type"))
+    }
+
+    // Forbid creating table with void/null type in Spark
+    Seq("void", "null").foreach { colType =>
+      withTable("t1", "t2", "t3") {
+        val e1 = intercept[AnalysisException] {
+          spark.sql(s"CREATE TABLE t1 (v $colType) USING parquet")
+        }.getMessage
+        assert(e1.contains("Cannot create tables with VOID type"))
+        val e2 = intercept[AnalysisException] {
+          spark.sql(s"CREATE TABLE t2 (v $colType) USING hive")
Review comment: can we follow the CTAS test and use `STORED AS PARQUET`?
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448115408
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
 select TIMESTAMP_MILLIS(1230219000123),TIMESTAMP_MILLIS(-1230219000123),TIMESTAMP_MILLIS(null);
 select TIMESTAMP_MICROS(1230219000123123),TIMESTAMP_MICROS(-1230219000123123),TIMESTAMP_MICROS(null);
--- overflow exception:
+-- overflow exception
 select TIMESTAMP_SECONDS(1230219000123123);
 select TIMESTAMP_SECONDS(-1230219000123123);
 select TIMESTAMP_MILLIS(92233720368547758);
 select TIMESTAMP_MILLIS(-92233720368547758);
+-- truncate exception
+select TIMESTAMP_SECONDS(0.1234567);
Review comment: This PR aims to allow truncation for both ANSI and legacy mode. Did I understand correctly?
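The overflow and truncation cases in the `datetime.sql` diff can be illustrated outside Spark. What follows is a minimal Java sketch, not Spark's implementation: the class and method names are hypothetical, and it assumes that seconds are converted to a 64-bit count of microseconds, that decimal input has to convert exactly, and that `double` input is allowed to lose digits below one microsecond.

```java
import java.math.BigDecimal;

// Hypothetical helper, not part of Spark: models a seconds -> microseconds conversion.
final class TimestampSecondsSketch {
    private static final long MICROS_PER_SECOND = 1_000_000L;

    // Integral seconds: multiplyExact throws ArithmeticException on 64-bit overflow.
    static long longSecondsToMicros(long seconds) {
        return Math.multiplyExact(seconds, MICROS_PER_SECOND);
    }

    // Decimal seconds: longValueExact throws ArithmeticException when digits
    // beyond microsecond precision would have to be dropped.
    static long decimalSecondsToMicros(BigDecimal seconds) {
        return seconds.multiply(BigDecimal.valueOf(MICROS_PER_SECOND)).longValueExact();
    }

    // Double seconds: the cast silently drops anything below one microsecond.
    static long doubleSecondsToMicros(double seconds) {
        return (long) (seconds * MICROS_PER_SECOND);
    }

    // Small utility so the failing cases can be checked without try/catch noise.
    static boolean throwsArithmetic(Runnable r) {
        try {
            r.run();
            return false;
        } catch (ArithmeticException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(longSecondsToMicros(1230219000L));
        System.out.println(decimalSecondsToMicros(new BigDecimal("1.23")));
        // Overflow case from the test file.
        System.out.println(throwsArithmetic(() -> longSecondsToMicros(1230219000123123L)));
        // Truncation case from the test file: seven fractional digits.
        System.out.println(throwsArithmetic(
            () -> decimalSecondsToMicros(new BigDecimal("0.1234567"))));
        // Same digits as a double: truncated instead of rejected in this sketch.
        System.out.println(doubleSecondsToMicros(0.1234567d));
    }
}
```

In this sketch the decimal path rejects `0.1234567` while the double path truncates it, which is the asymmetry the review comments ask to have covered by a test.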
[GitHub] [spark] dongjoon-hyun commented on a change in pull request #28956: [SPARK-31797][SQL][FOLLOWUP] TIMESTAMP_SECONDS supports fractional input
dongjoon-hyun commented on a change in pull request #28956: URL: https://github.com/apache/spark/pull/28956#discussion_r448114383
## File path: sql/core/src/test/resources/sql-tests/inputs/datetime.sql ##
@@ -2,13 +2,16 @@
 -- [SPARK-31710] TIMESTAMP_SECONDS, TIMESTAMP_MILLISECONDS and TIMESTAMP_MICROSECONDS to timestamp transfer
 select TIMESTAMP_SECONDS(1230219000),TIMESTAMP_SECONDS(-1230219000),TIMESTAMP_SECONDS(null);
+select TIMESTAMP_SECONDS(1.23), TIMESTAMP_SECONDS(1.23d);
Review comment: Since this has `Decimal` and `Double`, can we have `Float` together by using `TIMESTAMP_SECONDS(CAST(1.23 AS FLOAT))`?
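One possible reason a `Float` case deserves its own test: a 32-bit float widened to a 64-bit double is not the same value as the double written with the same literal. The Java sketch below shows only that numeric effect. The class name is hypothetical, and nothing here is a claim about how Spark handles `FLOAT` input internally.

```java
// Hypothetical illustration, not Spark code: float vs. double views of 1.23 seconds.
final class FloatSecondsSketch {
    // Widen a float to double, as happens whenever a float is used in double arithmetic.
    static double widen(float seconds) {
        return (double) seconds;
    }

    // True when the widened float differs from the double with the same literal.
    static boolean differsFromDouble(float f, double d) {
        return widen(f) != d;
    }

    public static void main(String[] args) {
        System.out.println(widen(1.23f));
        System.out.println(1.23d);
        System.out.println(differsFromDouble(1.23f, 1.23d));
    }
}
```

Values that are exactly representable in binary, such as `0.5`, do not show this difference, so a test value like `1.23` is the more interesting one.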
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins removed a comment on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188385
[GitHub] [spark] AmplabJenkins commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
AmplabJenkins commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188378
[GitHub] [spark] SparkQA commented on pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
SparkQA commented on pull request #28940: URL: https://github.com/apache/spark/pull/28940#issuecomment-652188033 **[Test build #124744 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/124744/testReport)** for PR 28940 at commit [`a80bd6c`](https://github.com/apache/spark/commit/a80bd6c8a5d93187cf06f941c2a9d296a7b6ca61).
[GitHub] [spark] pan3793 commented on a change in pull request #28940: [SPARK-32121][SHUFFLE] Support Windows OS in ExecutorDiskUtils
pan3793 commented on a change in pull request #28940: URL: https://github.com/apache/spark/pull/28940#discussion_r448112801
## File path: common/network-shuffle/src/main/java/org/apache/spark/network/shuffle/ExecutorDiskUtils.java ##
@@ -50,14 +58,18 @@ public static File getFile(String[] localDirs, int subDirsPerLocalDir, String fi
    * the internal code in java.io.File would normalize it later, creating a new "foo/bar"
    * String copy. Unfortunately, we cannot just reuse the normalization code that java.io.File
    * uses, since it is in the package-private class java.io.FileSystem.
+   *
+   * On Windows, separator "\" is used instead of "/".
+   *
+   * "\\" is legal character in path name on Unix like OS, but illegal on Windows.
Review comment: Changed.
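The Javadoc in this diff describes joining path pieces with the platform separator and normalizing the result up front. A minimal Java sketch of that idea follows. It is an assumption-laden illustration, not the code in `ExecutorDiskUtils`: the class and method names are hypothetical, the separator is passed in so both platforms can be exercised on one machine, and special forms such as Windows UNC prefixes are deliberately not handled.

```java
import java.io.File;

// Hypothetical sketch, not the ExecutorDiskUtils implementation.
final class PathJoinSketch {
    // Join three path pieces with `sep`, collapse runs of `sep` into one,
    // and drop a trailing `sep`. Characters other than `sep` are left alone,
    // so a backslash survives when the separator is "/".
    static String joinNormalized(String dir, String sub, String name, char sep) {
        String raw = dir + sep + sub + sep + name;
        StringBuilder out = new StringBuilder(raw.length());
        for (int i = 0; i < raw.length(); i++) {
            char c = raw.charAt(i);
            boolean repeatedSep =
                c == sep && out.length() > 0 && out.charAt(out.length() - 1) == sep;
            if (!repeatedSep) {
                out.append(c);
            }
        }
        if (out.length() > 1 && out.charAt(out.length() - 1) == sep) {
            out.setLength(out.length() - 1);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // On the running platform, File.separatorChar picks "/" or "\".
        System.out.println(joinNormalized("tmp", "0c", "shuffle_0_0_0.data", File.separatorChar));
        // Forcing each separator shows both behaviours regardless of the host OS.
        System.out.println(joinNormalized("/foo/", "bar", "baz", '/'));
        System.out.println(joinNormalized("C:\\tmp\\", "0c", "shuffle_0", '\\'));
    }
}
```

Passing the separator as a parameter is a choice made for this sketch only, so that the Windows branch can be checked on a Unix host.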
[GitHub] [spark] AmplabJenkins removed a comment on pull request #28942: [SPARK-32125][UI] Support get taskList by status in Web UI and SHS Rest API
AmplabJenkins removed a comment on pull request #28942: URL: https://github.com/apache/spark/pull/28942#issuecomment-652186162 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/124714/ Test FAILed.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448111898
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ##
@@ -2309,6 +2309,108 @@ class HiveDDLSuite
     }
   }
 
+  test("SPARK-20680: Spark-sql do not support for void column datatype") {
+    withTable("t") {
+      withView("tabVoidType") {
+        val client =
+          spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client
+        client.runSqlHive("CREATE TABLE t (t1 int)")
+        client.runSqlHive("INSERT INTO t VALUES (3)")
+        client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t")
+        checkAnswer(spark.table("tabVoidType"), Row(null))
+        // No exception shows
+        val desc = spark.sql("DESC tabVoidType").collect().toSeq
+        assert(desc.contains(Row("col", "null", null)))
Review comment: shall we change `NullType.toString` to use void? to match the parser side.
[GitHub] [spark] cloud-fan commented on a change in pull request #28833: [SPARK-20680][SQL] Spark-sql do not support for creating table with void column datatype
cloud-fan commented on a change in pull request #28833: URL: https://github.com/apache/spark/pull/28833#discussion_r448111771
## File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala ##
@@ -2309,6 +2309,108 @@ class HiveDDLSuite
     }
   }
 
+  test("SPARK-20680: Spark-sql do not support for void column datatype") {
+    withTable("t") {
+      withView("tabVoidType") {
+        val client =
+          spark.sharedState.externalCatalog.unwrapped.asInstanceOf[HiveExternalCatalog].client
+        client.runSqlHive("CREATE TABLE t (t1 int)")
+        client.runSqlHive("INSERT INTO t VALUES (3)")
+        client.runSqlHive("CREATE VIEW tabVoidType AS SELECT NULL AS col FROM t")
Review comment: shall we check TABLE as well instead of only VIEW?