matianhe3 opened a new issue, #5610: URL: https://github.com/apache/seatunnel/issues/5610
### Search before asking

- [X] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

When the source table contains more than 1 million rows, the data appears to be duplicated: I end up with more than 2 million rows in StarRocks. SeaTunnel also does not stop; the job keeps running and the number of duplicate rows keeps increasing.

### SeaTunnel Version

2.3.3 zeta

### SeaTunnel Config

```conf
env {
  execution.parallelism = 3
  job.mode = BATCH
  job.name = mkt
}

source {
  Jdbc {
    result_table_name = source
    url = "jdbc:sqlserver://"
    driver = com.microsoft.sqlserver.jdbc.SQLServerDriver
    connection_check_timeout_sec = 100
    user = marketing
    password = "Qq@329429240"
    query = """select id, keyword, city, realcity, ver, imei, channel, uid, os, lands,
      server_time, m_usercode, m_username, m_roomid, m_channel, m_addtime,
      m_updatetime, m_searchcode, m_type, dt
      FROM YS_BI.marketing.marketing_share_room_record """
  }
}

transform {
}

sink {
  StarRocks {
    source_table_name = [result]
    nodeUrls = ["clickhouse01:8030", "clickhouse02:8030", "clickhouse03:8030"]
    base-url = "jdbc:mysql:loadBalance://clickhouse01:9030,clickhouse02:9030,clickhouse03:9030"
    username = root
    password = ""
    database = dws
    table = mkt_record
    save_mode_create_template = """
      CREATE TABLE IF NOT EXISTS `${database}`.`${table_name}` (
        server_time DATETIME NOT NULL,
        dt DATE NOT NULL,
        ${rowtype_fields}
      ) ENGINE = OLAP
      DUPLICATE KEY(server_time)
      PARTITION BY (dt)
      DISTRIBUTED BY HASH(dt)
    """
  }
}
```

### Running Command

```shell
seatunnel.sh -c mkt.conf
```

### Error Exception

```log
duplicate data
```

### Zeta or Flink or Spark Version

Zeta

### Java or Scala Version

java 1.8

### Screenshots

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
