[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529558445



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
##########
@@ -243,4 +243,22 @@ class AlterTablePartitionV2SQLSuite extends DatasourceV2SQLBase {
       assert(!partTable.partitionExists(expectedPartition))
     }
   }
+
+  test("SPARK-33529: handle __HIVE_DEFAULT_PARTITION__") {
+    val t = "testpart.ns1.ns2.tbl"
+    withTable(t) {
+      sql(s"CREATE TABLE $t (part0 string) USING foo PARTITIONED BY (part0)")
+      val partTable = catalog("testpart")
+        .asTableCatalog
+        .loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
+        .asPartitionable
+      val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
+      assert(!partTable.partitionExists(expectedPartition))
+      val partSpec = "PARTITION (part0 = '__HIVE_DEFAULT_PARTITION__')"

Review comment:
   > It's more like a hive specific thing and we should let v2 implementation to decide ...
   
   It is already a Spark-specific thing too. Implementations don't see `'__HIVE_DEFAULT_PARTITION__'` at all because it is replaced by `null` during the analysis phase.
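   The substitution described above can be sketched as follows. This is a minimal, self-contained illustration, not the actual Spark resolver code; the object and method names are invented for the example (the sentinel string itself matches `ExternalCatalogUtils.DEFAULT_PARTITION_NAME` in Spark):
   
   ```scala
   // Hypothetical sketch: the Hive sentinel string is mapped to null at
   // analysis time, so a V2 implementation never sees the raw sentinel.
   object PartitionValueResolution {
     // Same string as ExternalCatalogUtils.DEFAULT_PARTITION_NAME in Spark.
     val DefaultPartitionName = "__HIVE_DEFAULT_PARTITION__"
   
     // A raw value from a PARTITION (...) spec becomes None (i.e. null)
     // when it equals the sentinel; any other string is kept as-is.
     def resolve(raw: String): Option[String] =
       if (raw == DefaultPartitionName) None else Some(raw)
   }
   ```
   
   Under this sketch, `resolve("__HIVE_DEFAULT_PARTITION__")` yields `None` while `resolve("abc")` yields `Some("abc")`, mirroring how the spec value reaches the partitionable table as a `null` in the `InternalRow`.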





This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529474127



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
##########
@@ -243,4 +243,22 @@ class AlterTablePartitionV2SQLSuite extends DatasourceV2SQLBase {
       assert(!partTable.partitionExists(expectedPartition))
     }
   }
+
+  test("SPARK-33529: handle __HIVE_DEFAULT_PARTITION__") {
+    val t = "testpart.ns1.ns2.tbl"
+    withTable(t) {
+      sql(s"CREATE TABLE $t (part0 string) USING foo PARTITIONED BY (part0)")
+      val partTable = catalog("testpart")
+        .asTableCatalog
+        .loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
+        .asPartitionable
+      val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
+      assert(!partTable.partitionExists(expectedPartition))
+      val partSpec = "PARTITION (part0 = '__HIVE_DEFAULT_PARTITION__')"

Review comment:
   > does part_col = null work?
   
   I have checked that. `null` is recognized as the string `"null"`.








[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529470852



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
##########
@@ -243,4 +243,22 @@ class AlterTablePartitionV2SQLSuite extends DatasourceV2SQLBase {
       assert(!partTable.partitionExists(expectedPartition))
     }
   }
+
+  test("SPARK-33529: handle __HIVE_DEFAULT_PARTITION__") {
+    val t = "testpart.ns1.ns2.tbl"
+    withTable(t) {
+      sql(s"CREATE TABLE $t (part0 string) USING foo PARTITIONED BY (part0)")
+      val partTable = catalog("testpart")
+        .asTableCatalog
+        .loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
+        .asPartitionable
+      val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
+      assert(!partTable.partitionExists(expectedPartition))
+      val partSpec = "PARTITION (part0 = '__HIVE_DEFAULT_PARTITION__')"

Review comment:
   For example, if we have a string partition column, how could we distinguish `null` from the string `"null"`?
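   One way to see the collision: when partition values are encoded into directory names, `null` needs a sentinel that cannot be confused with any real string value. A minimal sketch, with illustrative names only (this is not Spark's actual path-escaping code):
   
   ```scala
   // Hypothetical sketch of why null needs its own sentinel in a string
   // partition column: without one, null and the string "null" collide.
   object NullVsStringNull {
     val Sentinel = "__HIVE_DEFAULT_PARTITION__"
   
     // Encode a possibly-null string partition value into a directory name.
     // None (null) gets the sentinel; any real string is written verbatim.
     def dirName(col: String, value: Option[String]): String =
       s"$col=${value.getOrElse(Sentinel)}"
   }
   ```
   
   Here `dirName("part0", None)` produces `part0=__HIVE_DEFAULT_PARTITION__` while `dirName("part0", Some("null"))` produces `part0=null`, so the two cases stay distinguishable.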








[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529469591



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
##########
@@ -243,4 +243,22 @@ class AlterTablePartitionV2SQLSuite extends DatasourceV2SQLBase {
       assert(!partTable.partitionExists(expectedPartition))
     }
   }
+
+  test("SPARK-33529: handle __HIVE_DEFAULT_PARTITION__") {
+    val t = "testpart.ns1.ns2.tbl"
+    withTable(t) {
+      sql(s"CREATE TABLE $t (part0 string) USING foo PARTITIONED BY (part0)")
+      val partTable = catalog("testpart")
+        .asTableCatalog
+        .loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
+        .asPartitionable
+      val expectedPartition = InternalRow.fromSeq(Seq[Any](null))
+      assert(!partTable.partitionExists(expectedPartition))
+      val partSpec = "PARTITION (part0 = '__HIVE_DEFAULT_PARTITION__')"

Review comment:
   ok. How can users specify a `null` partition value?








[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529377394



##########
File path: sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTablePartitionV2SQLSuite.scala
##########
@@ -243,4 +243,22 @@ class AlterTablePartitionV2SQLSuite extends DatasourceV2SQLBase {
       assert(!partTable.partitionExists(expectedPartition))
     }
   }
+
+  test("SPARK-33529: handle __HIVE_DEFAULT_PARTITION__") {
+    val t = "testpart.ns1.ns2.tbl"
+    withTable(t) {
+      sql(s"CREATE TABLE $t (part0 string) USING foo PARTITIONED BY (part0)")
+      val partTable = catalog("testpart")
+        .asTableCatalog
+        .loadTable(Identifier.of(Array("ns1", "ns2"), "tbl"))
+        .asPartitionable
+      val expectedPartition = InternalRow.fromSeq(Seq[Any](null))

Review comment:
   `'__HIVE_DEFAULT_PARTITION__'` should be handled as `null`








[GitHub] [spark] MaxGekk commented on a change in pull request #30482: [SPARK-33529][SQL] Handle '__HIVE_DEFAULT_PARTITION__' while resolving V2 partition specs

2020-11-24 Thread GitBox


MaxGekk commented on a change in pull request #30482:
URL: https://github.com/apache/spark/pull/30482#discussion_r529373882



##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/util/PartitioningUtils.scala
##########
@@ -18,9 +18,15 @@
 package org.apache.spark.sql.util
 
 import org.apache.spark.sql.AnalysisException
+import org.apache.spark.sql.catalyst.InternalRow
 import org.apache.spark.sql.catalyst.analysis.Resolver
+import org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec
+import org.apache.spark.sql.catalyst.catalog.ExternalCatalogUtils
+import org.apache.spark.sql.catalyst.expressions.{Cast, Literal}
+import org.apache.spark.sql.catalyst.util.{CaseInsensitiveMap, DateTimeUtils}
+import org.apache.spark.sql.types.StructType
 
-object PartitioningUtils {
+private[sql] object PartitioningUtils {

Review comment:
   Addressed @cloud-fan's comment: https://github.com/apache/spark/pull/30454#discussion_r528549153




