nsivabalan commented on PR #5633: URL: https://github.com/apache/hudi/pull/5633#issuecomment-1256807510
I don't think there is a bug here. Here is the context. SqlSource differs from other sources in that it has no concept of a checkpoint. Say we define the SQL as "select * from tbl1", run syncOnce(), and ingest into Hudi. The next time, we invoke the same SQL again, "select * from tbl1". There is nothing like a checkpoint that would let us resume from where the previous invocation left off. So, for SqlSource only, the checkpoint that gets serialized into the commit metadata is always null. If we invoke syncOnce() again, Hudi tries to fetch the checkpoint from the commit metadata, gets null, and we simply invoke the SQL as is.

Your test had an issue: we have to generate new data for the SQL test. I have enhanced the SqlSource test to cover two syncOnce() calls: https://github.com/apache/hudi/pull/6781

Let me know if this makes sense.
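
To make the flow above concrete, here is a toy model of it in Python. It is only a sketch and not Hudi code: `ToySqlSource`, `ToyCommitTimeline`, and `sync_once` are hypothetical names invented for illustration, and the real SqlSource and syncOnce() are Java and more involved. It only shows why a null checkpoint means the same query is re-run in full on every sync.

```python
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, int]


class ToySqlSource:
    """Hypothetical stand-in for a checkpoint-less source (not Hudi's API)."""

    def __init__(self, run_query: Callable[[], List[Row]]) -> None:
        self.run_query = run_query

    def fetch(self, last_checkpoint: Optional[str]) -> Tuple[List[Row], Optional[str]]:
        # The incoming checkpoint is ignored: the same query runs every time,
        # and there is no position to report back, so the new checkpoint is None.
        return self.run_query(), None


class ToyCommitTimeline:
    """Hypothetical stand-in for commit metadata: one (rows, checkpoint) per commit."""

    def __init__(self) -> None:
        self.commits: List[Tuple[List[Row], Optional[str]]] = []

    def last_checkpoint(self) -> Optional[str]:
        return self.commits[-1][1] if self.commits else None


def sync_once(source: ToySqlSource, timeline: ToyCommitTimeline) -> int:
    """One sync round: read the last checkpoint, fetch, commit. Returns rows ingested."""
    checkpoint = timeline.last_checkpoint()  # always None for this source
    rows, new_checkpoint = source.fetch(checkpoint)
    timeline.commits.append((rows, new_checkpoint))
    return len(rows)


# Two sync rounds over a table that gains a row in between.
table: List[Row] = [{"id": 1}]
source = ToySqlSource(lambda: list(table))
timeline = ToyCommitTimeline()

first = sync_once(source, timeline)
table.append({"id": 2})  # new data generated before the second sync
second = sync_once(source, timeline)
```

In this sketch the second round reads the whole table again rather than only the new row, which is why a test of two consecutive syncs needs fresh data in between to observe any difference.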
