rdblue commented on a change in pull request #374: Migrate spark table to iceberg table
URL: https://github.com/apache/incubator-iceberg/pull/374#discussion_r318677245
 
 

 ##########
 File path: spark/src/main/scala/org/apache/iceberg/spark/SparkTableUtil.scala
 ##########
 @@ -131,18 +136,22 @@ object SparkTableUtil {
         s"$name=${partition(name)}"
       }.mkString("/")
 
-      DataFiles.builder(spec)
-          .withPath(path)
-          .withFormat(format)
-          .withPartitionPath(partitionKey)
-          .withFileSizeInBytes(fileSize)
-          .withMetrics(new Metrics(rowCount,
-            arrayToMap(columnSizes),
-            arrayToMap(valueCounts),
-            arrayToMap(nullValueCounts),
-            arrayToMap(lowerBounds),
-            arrayToMap(upperBounds)))
-          .build()
+      var builder = DataFiles.builder(spec)
+        .withPath(path)
+        .withFormat(format)
+        .withFileSizeInBytes(fileSize)
+        .withMetrics(new Metrics(rowCount,
+          arrayToMap(columnSizes),
+          arrayToMap(valueCounts),
+          arrayToMap(nullValueCounts),
+          arrayToMap(lowerBounds),
+          arrayToMap(upperBounds)))
+
+      if (partitionKey == "") {
+        builder.build()
+      } else {
+        builder.withPartitionPath(partitionKey).build()
 
 Review comment:
   That's correct.
   
   Iceberg treats the conversion from a partition tuple to a partition path as 
one-way. We only want to use `withPartitionPath` for conversions like this one, 
where we are importing data from Hive tables. In that case, it isn't clear what 
the date format will be, because many people use strings for date partition 
fields.
   
   I think we could probably add support for ISO-8601 dates, but we want to 
avoid putting a lot of effort into parsing strings into partition values.
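
   To illustrate why the reverse direction is awkward, here is a minimal sketch of what parsing a Hive-style partition path involves. It does not use any Iceberg API: `PartitionPathSketch`, `parsePartitionPath`, and `isoDateToDays` are hypothetical names for this example only, and it assumes unescaped `name=value` segments separated by `/`.

```scala
import java.time.LocalDate
import scala.util.Try

// Hypothetical helper for illustration; not part of Iceberg or SparkTableUtil.
object PartitionPathSketch {

  // Split a path such as "dt=2019-08-28/region=us" into ordered (name, value) pairs.
  // An empty path (unpartitioned table) yields no pairs.
  // Assumption: segments are not Hive-escaped (no URL-encoded characters).
  def parsePartitionPath(path: String): Seq[(String, String)] =
    if (path.isEmpty) {
      Seq.empty
    } else {
      path.split("/").toSeq.map { segment =>
        val idx = segment.indexOf('=')
        require(idx > 0, s"Malformed partition segment: $segment")
        (segment.substring(0, idx), segment.substring(idx + 1))
      }
    }

  // Interpret a partition value as days since 1970-01-01, but only when it is
  // a strict ISO-8601 date (yyyy-MM-dd). Any other string format returns None,
  // because the intended date format cannot be inferred from the string alone.
  def isoDateToDays(value: String): Option[Long] =
    Try(LocalDate.parse(value).toEpochDay).toOption
}
```

   The values come back as plain strings, so a value like `20190828` or `08/28/2019` cannot be mapped to a date partition without guessing the format, which is the effort the comment above wants to avoid.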
