garyli1019 commented on issue #1362: HUDI-644 Enable user to get checkpoint from previous commits in DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1362#issuecomment-593770649

I think running the parallel jobs once sounds a bit hacky. The best way would be to generate the checkpoint string and pass it to the delta streamer on the first run. For that, I would need to write a checkpoint generator that scans all the files generated by Kafka Connect. This is definitely doable but needs some effort.

So I think we can do the following to help users migrate to the delta streamer:
- Provide checkPointGenerator helper functions that help users generate the checkpoint from popular sink connectors (Kafka Connect, Spark Streaming, etc.).
- Allow the user to commit without using the delta streamer, to fix the gap when the checkpoint is difficult to generate.

Any thoughts?
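
To make the checkpoint generator idea concrete, here is a rough sketch in Python rather than a proposed implementation. It rests on several assumptions that are not confirmed in this thread: that the sink wrote files named `<topic>+<partition>+<startOffset>+<endOffset>.<ext>` (as an HDFS-style Kafka Connect sink might), that the delta streamer's Kafka checkpoint string looks like `topic,partition:offset,partition:offset`, and that each offset in it is the next offset to read. The function name `generate_checkpoint` is hypothetical.

```python
import os
import re
from typing import Dict, Iterable

# Assumed file name layout: <topic>+<partition>+<startOffset>+<endOffset>.<ext>
# This is an assumption about the sink connector, not something Hudi defines.
_FILE_RE = re.compile(
    r"^(?P<topic>[^+]+)\+(?P<partition>\d+)\+(?P<start>\d+)\+(?P<end>\d+)\.\w+$"
)


def generate_checkpoint(topic: str, file_paths: Iterable[str]) -> str:
    """Build a checkpoint string from the files a sink connector has written.

    Hypothetical helper: for every partition of `topic`, take the highest
    end offset seen in the file names and resume from the offset after it.
    """
    next_offsets: Dict[int, int] = {}
    for path in file_paths:
        match = _FILE_RE.match(os.path.basename(path))
        if match is None or match.group("topic") != topic:
            continue  # skip files from other topics or with unknown names
        partition = int(match.group("partition"))
        # Assumes the end offset in the file name is inclusive.
        next_offset = int(match.group("end")) + 1
        if next_offset > next_offsets.get(partition, -1):
            next_offsets[partition] = next_offset

    if not next_offsets:
        raise ValueError(f"no files found for topic {topic!r}")

    # Assumed checkpoint layout: topic,partition:offset,partition:offset
    parts = ",".join(f"{p}:{o}" for p, o in sorted(next_offsets.items()))
    return f"{topic},{parts}"
```

The returned string would then be passed to the delta streamer on its first run. If the connector uses a different naming scheme, or the end offset is exclusive, the regular expression and the `+ 1` would need to change accordingly.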
