yihua commented on code in PR #6163:
URL: https://github.com/apache/hudi/pull/6163#discussion_r928069250
##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/hudi/SparkHoodieTableFileIndex.scala:
##########
@@ -96,10 +97,24 @@ class SparkHoodieTableFileIndex(spark: SparkSession,
val partitionFields = partitionColumns.get().map(column =>
StructField(column, StringType))
StructType(partitionFields)
} else {
-      val partitionFields = partitionColumns.get().map(column =>
-        nameFieldMap.getOrElse(column, throw new IllegalArgumentException(s"Cannot find column: '" +
-          s"$column' in the schema[${schema.fields.mkString(",")}]")))
-      StructType(partitionFields)
+      val partitionFields = partitionColumns.get().filter(column => nameFieldMap.contains(column))
+        .map(column => nameFieldMap.apply(column))
+
+      if (partitionFields.size != partitionColumns.get().size) {
Review Comment:
Got it. For the time being, I'll land this fix. As a follow-up, could one
of you add docs in the code to clarify why the check is needed? It's not clear
from reading the code at first glance.
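
A minimal sketch of why the size check matters, using hypothetical data (the names `nameFieldMap` and `partitionColumns` mirror the PR; everything else here is an illustrative assumption, not Hudi's actual code path):

```scala
object PartitionSchemaCheckSketch extends App {
  // Hypothetical scenario: the table is partitioned by ("year", "month"),
  // but only "year" made it into the data files' schema, so "month" has
  // no entry in the name-to-field map.
  val nameFieldMap: Map[String, String] = Map("year" -> "IntegerType")
  val partitionColumns: Seq[String] = Seq("year", "month")

  // The fix silently drops partition columns missing from the schema
  // instead of throwing IllegalArgumentException on the first miss...
  val partitionFields: Seq[String] = partitionColumns
    .filter(column => nameFieldMap.contains(column))
    .map(column => nameFieldMap.apply(column))

  // ...so a size mismatch is the only signal that some partition columns
  // were not found, letting the caller take a fallback path (e.g. typing
  // all partition columns as StringType) rather than failing outright.
  if (partitionFields.size != partitionColumns.size) {
    val missing = partitionColumns.filterNot(nameFieldMap.contains)
    println(s"Partition columns missing from schema: ${missing.mkString(",")}")
  }
}
```

The comparison works because `filter` preserves order and never adds elements, so `partitionFields.size < partitionColumns.size` exactly when at least one partition column was absent from the schema.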
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]