Litianye commented on pull request #1719:
URL: https://github.com/apache/hudi/pull/1719#issuecomment-643142558


   > > In the `KafkaSource` implementations, such as `JsonKafkaSource` and `AvroKafkaSource`, 
`Option.empty()` and `Option.of("")` are both handled in the reset strategy branch
   > 
   > Can we remove the empty string handling in the source and elsewhere as 
well? It might confuse other people when they are reading the code there.
   > > 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/HiveIncrPullSource.java#L123
   > > 
   > > will return an empty string. Can you assess this and how we can fix it?
   > 
   > IMO this is fine, we need to store the empty string in some cases. As long 
as we can eliminate empty string during the offset calculation, the code should 
look cleaner.
   
   Hmm, if an empty string is fine in `HiveIncrPullSource`, it's easier to achieve 
the goal:
   > I believe all the sources treat empty string and `Option.empty()` the same.
   
   My one remaining doubt is whether we should remove every empty string check in 
`Source`, or just treat an empty string and `Option.empty()` the same.
   
   I have removed the empty string check at this line: 
https://github.com/apache/hudi/blob/df2e0c760e7df0bd1b200867b3f0d2ca3a3f1fce/hudi-utilities/src/main/java/org/apache/hudi/utilities/sources/helpers/KafkaOffsetGen.java#L58
   The check is unnecessary in that method.
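
   To illustrate the second option (treating an empty string and `Option.empty()` 
the same), here is a minimal sketch. It is not the code in this PR: it uses 
`java.util.Optional` as a stand-in for Hudi's `Option`, and the class 
`CheckpointNormalizer` and its methods `normalize` and `chooseStart` are 
hypothetical names made up for this example.

```java
import java.util.Optional;

// Hypothetical helper, not part of Hudi. java.util.Optional stands in for
// Hudi's own Option type so the sketch is self-contained.
final class CheckpointNormalizer {

    // Collapse an empty-string checkpoint into an empty Optional, so callers
    // only ever have to handle one "no checkpoint" case.
    static Optional<String> normalize(Optional<String> lastCheckpoint) {
        return lastCheckpoint.filter(s -> !s.isEmpty());
    }

    // Illustrative stand-in for an offset calculation: resume from the
    // checkpoint when one exists, otherwise fall back to the reset strategy.
    static String chooseStart(Optional<String> lastCheckpoint) {
        return normalize(lastCheckpoint)
            .map(cp -> "resume from " + cp)
            .orElse("apply reset strategy");
    }
}
```

   With a single normalization step like this at the boundary, the offset 
calculation itself would not need its own empty string check, while a source 
such as `HiveIncrPullSource` could still store an empty string.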


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

