nsivabalan commented on a change in pull request #3946:
URL: https://github.com/apache/hudi/pull/3946#discussion_r781538712
##########
File path:
hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
##########
@@ -201,7 +201,7 @@ class DefaultSource extends RelationProvider
HadoopFsRelation(
fileIndex,
fileIndex.partitionSchema,
- fileIndex.dataSchema,
+ fileIndex.schema,
Review comment:
HoodieFileIndex.dataSchema represents the schema without partition fields,
while HoodieFileIndex.schema represents the Hudi table schema (obtained using the
table schema resolver). At least logically, it doesn't look right to switch from
dataSchema to schema unless we have concrete reasons.
Also, to answer Danny's comment: data files can contain partition columns.
In fact, support for removing partition columns from user fields was added only
recently, so in most cases data files will contain the partition fields/columns.
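
To make the distinction concrete, here is a minimal plain-Scala sketch of how the three schemas relate. It is only an illustration, not the HoodieFileIndex implementation: the Field case class is a hypothetical stand-in for Spark's StructField, and the column names (uuid, fare, region) are made up.

```scala
object SchemaSketch {
  // Hypothetical stand-in for a Spark StructField; only name and type matter here.
  final case class Field(name: String, dataType: String)

  // Full table schema, as a table schema resolver might return it.
  // The columns are illustrative assumptions, not taken from any real table.
  val tableSchema: Seq[Field] = Seq(
    Field("uuid", "string"),
    Field("fare", "double"),
    Field("region", "string") // partition column, also written into the data files
  )

  // Assumed partition column for this example.
  val partitionColumns: Set[String] = Set("region")

  // Analogue of fileIndex.partitionSchema: only the partition fields.
  val partitionSchema: Seq[Field] =
    tableSchema.filter(f => partitionColumns.contains(f.name))

  // Analogue of fileIndex.dataSchema: the table schema minus partition fields.
  val dataSchema: Seq[Field] =
    tableSchema.filterNot(f => partitionColumns.contains(f.name))

  def main(args: Array[String]): Unit = {
    println("table schema:     " + tableSchema.map(_.name).mkString(", "))
    println("data schema:      " + dataSchema.map(_.name).mkString(", "))
    println("partition schema: " + partitionSchema.map(_.name).mkString(", "))
  }
}
```

In this model, dataSchema and partitionSchema together cover the table schema exactly once, whereas passing the full table schema in the data-schema slot would list the partition column in both arguments.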