nsivabalan commented on a change in pull request #3946:
URL: https://github.com/apache/hudi/pull/3946#discussion_r781538712



##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
##########
@@ -201,7 +201,7 @@ class DefaultSource extends RelationProvider
       HadoopFsRelation(
         fileIndex,
         fileIndex.partitionSchema,
-        fileIndex.dataSchema,
+        fileIndex.schema,

Review comment:
       HoodieFileIndex.dataSchema represents the schema without partition fields,
while HoodieFileIndex.schema represents the Hudi table schema (obtained using the
table schema resolver). At least logically, it doesn't look right to switch from
dataSchema to schema unless we have concrete reasons.
   Also, to answer Danny's comment: data files can contain partition columns.
In fact, support for removing partition columns from user fields was added only
recently. So, in most cases, data files will contain the partition fields/columns.
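
   To make the distinction concrete, here is a minimal sketch of how the two
schemas relate as described above. It is plain Scala, not Hudi or Spark code:
`Field`, `SchemaSketch`, and the column names are hypothetical stand-ins, and
it does not claim to mirror how HoodieFileIndex actually computes these values.

```scala
// Hypothetical, simplified model: a plain case class stands in for a schema field.
final case class Field(name: String, dataType: String)

object SchemaSketch {
  // Full table schema, as a table schema resolver would report it (illustrative columns).
  val tableSchema: Seq[Field] = Seq(
    Field("uuid", "string"),
    Field("fare", "double"),
    Field("region", "string") // assumed partition column
  )

  // Partition schema: only the partition columns.
  val partitionSchema: Seq[Field] = tableSchema.filter(_.name == "region")

  // Data schema: the table schema with the partition columns removed.
  val dataSchema: Seq[Field] = {
    val partitionNames = partitionSchema.map(_.name).toSet
    tableSchema.filterNot(f => partitionNames.contains(f.name))
  }

  def main(args: Array[String]): Unit = {
    println(dataSchema.map(_.name).mkString(", "))  // uuid, fare
    println(tableSchema.map(_.name).mkString(", ")) // uuid, fare, region
  }
}
```

   In this toy model, passing the full table schema where the data schema is
expected would list `region` twice alongside the partition schema, which is
the kind of overlap the concern above is about.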




