danny0405 commented on issue #11017: URL: https://github.com/apache/hudi/issues/11017#issuecomment-2056090368
> One is that the new data overwrites the old data

Not sure whether you are using the `upsert` operation, which relies on the index for updates. If you are using the `FLINK_STATE` index, you need to recover the job from the latest checkpoint. Another choice is to use the hashing index by setting the option `index.type` to `BUCKET`, in which case there is no need for state recovery.

> The second type is that new tasks only retrieve data from the most recent commit and do not count the entire amount of data

Are you referring to the state accumulated by Flink? Did you recover from the latest checkpoint state?
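For illustration, the bucket-index suggestion can be sketched as a set of table options for a Flink SQL `WITH` clause. Only `index.type` = `BUCKET` and the `upsert` operation come from the comment above. The helper names, the table path, and the bucket count are hypothetical, and the `hoodie.bucket.index.num.buckets` and `write.operation` keys are written from memory of the Hudi Flink options, so verify them against the documentation for your Hudi version.

```python
def hudi_bucket_index_options(path, num_buckets=4):
    """Build Hudi table options that use the bucket (hash) index.

    `path` and `num_buckets` are illustrative values, not recommendations.
    """
    return {
        "connector": "hudi",
        "path": path,                      # hypothetical table location
        "write.operation": "upsert",       # updates are routed through the index
        "index.type": "BUCKET",            # hash-based index, no Flink state to restore
        # Assumed option name for the bucket count; check your Hudi version.
        "hoodie.bucket.index.num.buckets": str(num_buckets),
    }


def render_with_clause(options):
    """Render an options dict as a Flink SQL WITH (...) clause."""
    lines = [f"  '{key}' = '{value}'" for key, value in options.items()]
    return "WITH (\n" + ",\n".join(lines) + "\n)"


options = hudi_bucket_index_options("file:///tmp/hudi/demo_table")
with_clause = render_with_clause(options)
print(with_clause)
```

The rendered clause can be appended to a `CREATE TABLE` statement. With the `FLINK_STATE` index the same table would instead depend on restoring the job from its latest checkpoint, as noted above.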
