[GitHub] [hudi] nsivabalan commented on a change in pull request #3946: [HUDI-2711] Fallback to fulltable scan for IncrementalRelation if underlying files have been cleared or moved by cleaner

GitBox Mon, 10 Jan 2022 12:51:33 -0800


nsivabalan commented on a change in pull request #3946:
URL: https://github.com/apache/hudi/pull/3946#discussion_r781538712




##########
File path: hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/DefaultSource.scala
##########
@@ -201,7 +201,7 @@ class DefaultSource extends RelationProvider
       HadoopFsRelation(
         fileIndex,
         fileIndex.partitionSchema,
-        fileIndex.dataSchema,
+        fileIndex.schema,

Review comment:
   HoodieFileIndex.dataSchema represents the schema without the partition 
fields, while HoodieFileIndex.schema represents the Hudi table schema (obtained 
using the table schema resolver). At least logically, it doesn't look right to 
switch from dataSchema to schema unless we have concrete reasons.
   Also, to answer Danny's comment: data files can contain partition columns. 
In fact, support for removing partition columns from user fields was added only 
recently, so in most cases data files will contain the partition fields/columns.
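
   To illustrate the distinction, here is a minimal sketch in plain Scala. The 
`Field` type, the column names, and the choice of `region` as the partition 
column are all hypothetical, and the sketch does not use the actual 
HoodieFileIndex or Spark classes; it only models the relationship described 
above.

```scala
// Hypothetical stand-in for a schema field; this is NOT Spark's StructField.
final case class Field(name: String, dataType: String)

object SchemaSketch {
  // Assumed example table: three user columns, partitioned by "region".
  val tableSchema: Seq[Field] = Seq(
    Field("uuid", "string"),
    Field("fare", "double"),
    Field("ts", "long"),
    Field("region", "string")
  )

  // Assumed partition column for this example.
  val partitionColumns: Set[String] = Set("region")

  // Partition schema: only the partition columns.
  val partitionSchema: Seq[Field] =
    tableSchema.filter(f => partitionColumns.contains(f.name))

  // Data schema as described in the comment: table schema minus partition fields.
  val dataSchema: Seq[Field] =
    tableSchema.filterNot(f => partitionColumns.contains(f.name))

  def main(args: Array[String]): Unit = {
    println("schema:          " + tableSchema.map(_.name).mkString(", "))
    println("dataSchema:      " + dataSchema.map(_.name).mkString(", "))
    println("partitionSchema: " + partitionSchema.map(_.name).mkString(", "))
  }
}
```

   In this model, passing the full table schema where the data schema is 
expected would list `region` in both the data schema and the partition schema, 
which is the logical mismatch in question.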




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
