garyli1019 commented on issue #1362: HUDI-644 Enable user to get checkpoint 
from previous commits in DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1362#issuecomment-593770649
 
 
   I think running the parallel jobs once sounds a little hacky. The best 
approach would be to generate the checkpoint string and pass it to the delta 
streamer on the first run. For that, I would need to write a checkpoint 
generator that scans all the files written by Kafka Connect. This is definitely 
doable but takes some effort. 
   So I think we can do the following to help users migrate to delta streamer:
   - Provide checkPointGenerator helper functions that generate the checkpoint 
from popular sink connectors (Kafka Connect, Spark Streaming, etc.).
   - Allow the user to commit without using delta streamer, to close the gap 
when the checkpoint is difficult to generate.
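
   As a rough sketch of the checkpoint generator idea, the snippet below 
derives a checkpoint string from sink file names. It rests on assumptions that 
would need to be verified: that the files follow the 
`<topic>+<partition>+<startOffset>+<endOffset>.<ext>` naming used by the Kafka 
Connect HDFS sink, that the end offset in the name is inclusive, and that the 
delta streamer Kafka checkpoint has the form `topic,partition:offset,...` with 
each offset being the next one to read. The function name `generate_checkpoint` 
is hypothetical and not part of Hudi.

```python
import re
from typing import Dict, Iterable

# Assumed file name layout: <topic>+<partition>+<startOffset>+<endOffset>.<ext>
FILE_NAME = re.compile(
    r"^(?P<topic>.+)\+(?P<partition>\d+)\+(?P<start>\d+)\+(?P<end>\d+)\.\w+$"
)


def generate_checkpoint(topic: str, file_names: Iterable[str]) -> str:
    """Build a 'topic,partition:offset,...' string from sink file names.

    Hypothetical helper: both the file naming scheme and the checkpoint
    format are assumptions, not a confirmed Hudi API.
    """
    next_offsets: Dict[int, int] = {}
    for name in file_names:
        match = FILE_NAME.match(name)
        if match is None or match.group("topic") != topic:
            continue  # skip files of other topics or with unexpected names
        partition = int(match.group("partition"))
        # Assumption: the end offset in the name is inclusive, so the next
        # record to consume for this partition is end + 1.
        candidate = int(match.group("end")) + 1
        next_offsets[partition] = max(next_offsets.get(partition, 0), candidate)

    if not next_offsets:
        raise ValueError(f"no files found for topic {topic!r}")

    parts = ",".join(f"{p}:{o}" for p, o in sorted(next_offsets.items()))
    return f"{topic},{parts}"
```

   In practice the file names would come from listing the sink's output 
directory, and the resulting string would be supplied to the delta streamer on 
its first run.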
   Any thoughts? 

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
