rahil-c commented on code in PR #13591:
URL: https://github.com/apache/hudi/pull/13591#discussion_r2226630154


##########
hudi-spark-datasource/hudi-spark-common/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/HoodieFileGroupReaderBasedParquetFileFormat.scala:
##########
@@ -217,6 +218,7 @@ class 
HoodieFileGroupReaderBasedParquetFileFormat(tablePath: String,
                 .withStart(file.start)
                 .withLength(baseFileLength)
                 .withShouldUseRecordPosition(shouldUseRecordPosition)
+                
.withEnableOptimizedLogBlockScan(props.getBoolean(HoodieReaderConfig.ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN.key(),
 HoodieReaderConfig.ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN.defaultValue().toBoolean))

Review Comment:
   @danny0405 does the field `TypedProperties props` always guaranteed to have 
all the configs values from the `HoodieWriteConfig` or  `HoodieMetadataConfig`? 
If so then I think this fallback logic can work.
   
   However if it does not, then I do not think we should be doing this in the 
FileGroupReader builder, as this can cause the value of 
`enableOptimizedLogBlockScan` to be incorrectly overwritten depending on the 
invoker.
   
   The field `enableOptimizedLogBlockScan` is a boolean that seems to be set in 
the builder to `false` 
   `private boolean enableOptimizedLogBlockScan = false`, so I do not think we 
can check for it being `null`.
   
    If we do the check to see if its `false` this could be intentionally have 
been set by the user in places where we have access to the 
`HoodieMetadataConfig` or the `HoodieReaderConfig` and pass it in via 
`.withEnableOptimizedLogBlockScan(config.enableOptimizedLogBlocksScan())`. 
   
   So doing something like this, could cause the default value for the config 
to be used in the event the `props` was not correctly populated and did not 
have the key.
   `withEnableOptimizedLogBlockScan = 
props.getBoolean(HoodieReaderConfig.ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN.key(), 
HoodieReaderConfig.ENABLE_OPTIMIZED_LOG_BLOCKS_SCAN.defaultValue().toBoolean)`
   
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to