junmingliu opened a new issue, #5641: URL: https://github.com/apache/seatunnel/issues/5641
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

Data is duplicated when a batch job is restored from a checkpoint. The server log is attached below:

[server.log](https://github.com/apache/seatunnel/files/12923836/server.log)

### SeaTunnel Version

dev branch; the commit ID is as below:

### SeaTunnel Config

```conf
env {
  # You can set flink configuration here
  job.mode = "BATCH"
  checkpoint.interval = "10000"
  checkpoint.timeout = 9000000
}

source {
  Jdbc {
    url = "jdbc:mysql://XXX:3306/XXX?serverTimezone=GMT%2b8&useCompression=true&useSSL=false&useCursorFetch=true&allowPublicKeyRetrieval=true"
    driver = "com.mysql.cj.jdbc.Driver"
    connection_check_timeout_sec = 100
    user = "XXX"
    password = "XXX"
    partition_column = "id"
    partition_num = 20
    fetch_size = 5000
    query = "select * from indicator_bigdata limit 8000000"
    parallelism = 2
  }
}

transform {
  # If you would like to get more information about how to configure seatunnel and see full list of transform plugins,
  # please go to https://seatunnel.apache.org/docs/transform/sql
}

sink {
  jdbc {
    url = "jdbc:postgresql://XXX:5432/postgres"
    driver = "org.postgresql.Driver"
    user = "XXX"
    password = "XXX"
    batch_size = 5000
    batch_inteval_ms = 0
    database = postgres
    table = public.indicator_bigdata
    generate_sink_sql = true
    is_exactly_once = true
    xa_data_source_class_name = "org.postgresql.xa.PGXADataSource"
    max_commit_attempts = 3
    transaction_timeout_sec = 86400
  }
  # If you would like to get more information about how to configure seatunnel and see full list of sink plugins,
  # please go to https://seatunnel.apache.org/docs/category/sink-v2
}
```

### Running Command

```shell
# First, at 2023-10-16 18:56:40, run:
./bin/seatunnel.sh -c ../mysql2pg.template

# Second, at 2023-10-16 18:58, run (the command completed at 2023-10-16 19:03):
./bin/seatunnel.sh -s 766253013990899713

# Finally, at 2023-10-16 19:08, run:
./bin/seatunnel.sh -c ../mysql2pg.template -r 766253013990899713
```

### Error Exception

```log
Data is duplicated.
Detail: Key "(id)=(2000026)" already exists.
Call getNextException to see other errors in the batch.
```

### Zeta or Flink or Spark Version

_No response_

### Java or Scala Version

_No response_

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
