nsivabalan commented on code in PR #17601:
URL: https://github.com/apache/hudi/pull/17601#discussion_r2820238568


##########
hudi-common/src/main/java/org/apache/hudi/common/table/log/AbstractHoodieLogRecordReader.java:
##########
@@ -182,6 +186,8 @@ protected AbstractHoodieLogRecordReader(HoodieStorage storage, String basePath,
     this.forceFullScan = forceFullScan;
     this.internalSchema = internalSchema == null ? InternalSchema.getEmptyInternalSchema() : internalSchema;
     this.enableOptimizedLogBlocksScan = enableOptimizedLogBlocksScan;
+    this.enableLogicalTimestampFieldRepair = storage.getConf().getBoolean(HoodieFileReader.ENABLE_LOGICAL_TIMESTAMP_REPAIR,

Review Comment:
   hey @yihua:
   In 1.x we have the FileGroupReader abstraction and hence a reader context through which we can pass config values from the driver to the executors. But in 0.x, the log record reader has no such medium for passing ad-hoc configs.
   
   For example, the value of `HoodieFileReader.ENABLE_LOGICAL_TIMESTAMP_REPAIR`.
   
   We do not even have an argument like a `Map<String, String>` or a Hadoop conf that we could pass to the log record reader:
   
   
https://github.com/apache/hudi/blob/164b35a79e6decb868780964b2bdac1fc35f23b7/hudi-common/src/main/java/org/apache/hudi/common/table/log/HoodieMergedLogRecordScanner.java#L479
   
   So it is cumbersome to share information from the driver to the executors, which is why @linliu-code resorted to computing this value within the Spark tasks. That makes the computation repetitive, as the sketch below illustrates.
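   
   A minimal sketch of the current 0.x shape, to make the repetition concrete (the constant and field names follow the diff above; the signature is abbreviated, and the diff truncates before the default value, so `false` here is purely illustrative):
   
   ```java
   // Current 0.x pattern: every Spark task that constructs a log record
   // reader re-derives the flag from the storage configuration, since there
   // is no way to hand it a value computed once on the driver.
   protected AbstractHoodieLogRecordReader(HoodieStorage storage, String basePath /* , ... */) {
     // ...
     this.enableLogicalTimestampFieldRepair = storage.getConf()
         .getBoolean(HoodieFileReader.ENABLE_LOGICAL_TIMESTAMP_REPAIR, false /* illustrative default */);
     // ...
   }
   ```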
   
   Do you think it is worth adding a Hadoop conf or some kind of params (`Map<String, String>`) as an argument to `HoodieMergedLogRecordScanner` and `AbstractHoodieLogRecordReader`?
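   
   For concreteness, here is a minimal sketch of that option, assuming a hypothetical `withExtraProps` builder method and `extraProps` map (these names are illustrative, not an existing Hudi API):
   
   ```java
   import java.util.HashMap;
   import java.util.Map;
   
   // Resolve ad-hoc configs once on the driver; the key reuses the constant
   // from the diff above, everything else here is hypothetical.
   Map<String, String> extraProps = new HashMap<>();
   extraProps.put(HoodieFileReader.ENABLE_LOGICAL_TIMESTAMP_REPAIR, "true");
   
   // Thread the map through the scanner builder so executor-side readers
   // receive the driver-computed values instead of recomputing them.
   HoodieMergedLogRecordScanner scanner = HoodieMergedLogRecordScanner.newBuilder()
       // ... existing builder options (storage, base path, log files, etc.) ...
       .withExtraProps(extraProps) // hypothetical new builder method
       .build();
   
   // Inside AbstractHoodieLogRecordReader the flag could then come from the map:
   boolean enableRepair = Boolean.parseBoolean(
       extraProps.getOrDefault(HoodieFileReader.ENABLE_LOGICAL_TIMESTAMP_REPAIR, "false"));
   ```
   
   A plain `Map<String, String>` might be the lighter-weight medium since it does not tie the reader signatures to Hadoop types, but either option would avoid the per-task recomputation.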
   
   
   


