dongkelun commented on PR #5633: URL: https://github.com/apache/hudi/pull/5633#issuecomment-1165064752
> Sorry, I don't understand why you are setting "--checkpoint earliest" w/ your spark-submit job. You should not set any checkpoint value if I am not wrong. Can you help me understand? "earliest/latest" is meant for auto reset for kafka sources.

First of all, you are absolutely correct. The reason I set a checkpoint value is that in version 0.9.0, SqlSource cannot extract data if no checkpoint is set. The following is logged:

```java
No new data, source checkpoint has not changed. Nothing to commit. Old checkpoint=(Optional.empty). New Checkpoint=(null)
```

So I tried setting the checkpoint to a meaningless value. With that I could extract the data, but this exception is thrown when I extract again.

In the new version, the problem that data cannot be extracted has been solved by adding the parameter `--allow-commit-on-no-checkpoint-change`. However, if a user mistakenly sets a checkpoint that should not be set, this exception will still occur, so I think we should solve this problem and avoid the exception.
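
To illustrate the behaviour described above, here is a minimal sketch of the skip decision. It assumes that a sync round is skipped when the old and new checkpoints are equal, unless `--allow-commit-on-no-checkpoint-change` is set. The class `CheckpointSketch` and the method `shouldSkipCommit` are hypothetical names for illustration only, not Hudi's actual code.

```java
import java.util.Objects;
import java.util.Optional;

// Hypothetical illustration only; these are not Hudi's real classes or methods.
public class CheckpointSketch {

    // Returns true when the round would be skipped with "Nothing to commit".
    // Assumption: the round is skipped when the checkpoint did not change
    // and the allow-commit flag is not set.
    public static boolean shouldSkipCommit(Optional<String> oldCheckpoint,
                                           String newCheckpoint,
                                           boolean allowCommitOnNoCheckpointChange) {
        boolean unchanged = Objects.equals(oldCheckpoint.orElse(null), newCheckpoint);
        return unchanged && !allowCommitOnNoCheckpointChange;
    }

    public static void main(String[] args) {
        // No checkpoint set and the source returns none: the round is skipped.
        System.out.println(shouldSkipCommit(Optional.empty(), null, false));
        // Same situation with the flag enabled: the commit goes ahead.
        System.out.println(shouldSkipCommit(Optional.empty(), null, true));
        // A dummy checkpoint differs from the new (null) one: the commit goes ahead.
        System.out.println(shouldSkipCommit(Optional.of("earliest"), null, false));
    }
}
```

Under these assumptions, the third case is why setting a meaningless checkpoint let the first extraction through even without the flag.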
