Litianye commented on pull request #1719:
URL: https://github.com/apache/hudi/pull/1719#issuecomment-642497057


   > @Litianye no worry we can work through this together. I believe all the 
sources treat empty string and `Option.empty()` the same. If not then it's a 
bug. If we don't fix it now, the empty string will haunt us later.
   
   Got it! I made a small change in `DeltaSync`. 
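
   To illustrate the idea of treating an empty string and `Option.empty()` the same, here is a minimal sketch. It is not the actual `DeltaSync` change: it uses `java.util.Optional` as a stand-in for Hudi's `Option`, and the class and method names (`CheckpointNormalizer`, `normalize`) are hypothetical.

```java
import java.util.Optional;

// Hypothetical helper, not part of Hudi. java.util.Optional stands in
// for org.apache.hudi.common.util.Option to keep the sketch self-contained.
final class CheckpointNormalizer {

    // Collapse "no checkpoint" and "empty-string checkpoint" into one case,
    // so downstream sources only have to handle Optional.empty().
    static Optional<String> normalize(Optional<String> checkpoint) {
        return checkpoint.filter(s -> !s.isEmpty());
    }

    public static void main(String[] args) {
        System.out.println(normalize(Optional.empty()).isPresent());   // false
        System.out.println(normalize(Optional.of("")).isPresent());    // false
        System.out.println(normalize(Optional.of("42")).isPresent());  // true
    }
}
```

   Normalizing once, before the checkpoint is handed to a source, would mean the individual sources do not each need their own empty-string check.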
   
   In the Kafka sources (`JsonKafkaSource` and `AvroKafkaSource`), 
`Option.empty()` and `Option.of("")` both fall into the reset strategy branch: 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java#L184
   
   In the DFS sources (`AvroDFSSource`, `JsonDFSSource`, `CsvDFSSource`, and 
`ParquetDFSSource`), I think `Option.of("")` will never occur in the method at 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/DFSPathSelector.java#L64,
 because `DFSPathSelector` generates the checkpoint from 
`FileStatus.getModificationTime()`.
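
   As a rough sketch of why a checkpoint built from modification times cannot be an empty string (this is not the real `DFSPathSelector` code; `ModTimeCheckpoint` and `nextCheckpoint` are hypothetical names, and plain `long` values stand in for `FileStatus.getModificationTime()`):

```java
import java.util.List;
import java.util.Optional;

// Hypothetical sketch, not the actual DFSPathSelector logic.
final class ModTimeCheckpoint {

    // The checkpoint is the largest modification time rendered as a decimal
    // string. It is either absent (no files) or non-empty, never "".
    static Optional<String> nextCheckpoint(List<Long> modificationTimes) {
        return modificationTimes.stream()
                .max(Long::compare)
                .map(ts -> Long.toString(ts));
    }

    public static void main(String[] args) {
        System.out.println(nextCheckpoint(List.of(1591000000000L, 1592000000000L)));
        System.out.println(nextCheckpoint(List.of()));
    }
}
```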
   
   In `HoodieIncrSource`, this case is already handled at line 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HoodieIncrSource.java#L106
   
   In `HiveIncrPullSource`, the line at 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HiveIncrPullSource.java#L123
 will return an empty string. Can you assess this and suggest how we can fix it?



