brantyou opened a new issue, #64806: URL: https://github.com/apache/doris/issues/64806
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/doris/issues?q=is%3Aissue) and found no similar issues.

### Version

4.1.1

### What's Wrong?

When using the multi-table CDC mode of `CREATE STREAMING JOB` in Doris 4.1.1 to synchronize tables and data from a specified MySQL database with auto table creation enabled, all Chinese characters in the synchronized data are garbled and displayed as `???`.

For the identical MySQL table, the TVF mode of `CREATE STREAMING JOB` (manually create a Doris primary key model table first, then synchronize the data) displays Chinese characters normally, without garbling.

We executed `SHOW CREATE TABLE` to compare the DDL statements generated by Doris under the two modes and found no obvious differences between them.

In the CDC mode synchronization task, the following parameters have already been appended to the MySQL `jdbc_url`:

`?useUnicode=true&characterEncoding=utf-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull`

Yet Chinese content still turns into `???`.

The table options of the corresponding MySQL table are:

`ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci ROW_FORMAT=DYNAMIC`

### What You Expected?

When synchronizing MySQL tables and data using a Doris 4.1.1 multi-table CDC streaming job with auto table creation, Chinese text should be parsed and stored correctly and displayed as normal Chinese characters instead of `???`. The result should be consistent with the TVF streaming job mode, where manually created Doris primary key tables handle Chinese content correctly.

### How to Reproduce?

1. Environment: Apache Doris 4.1.1; source MySQL table with charset utf8mb4. MySQL table options: `ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_0900_ai_ci ROW_FORMAT=DYNAMIC`
2. Create a multi-table CDC streaming job with auto table creation enabled to sync all tables from a specified MySQL database.
3. Append the charset and timezone parameters to the MySQL `jdbc_url`: `?useUnicode=true&characterEncoding=utf-8&serverTimezone=Asia/Shanghai&zeroDateTimeBehavior=convertToNull`
4. Insert or read Chinese data in the MySQL source table and wait for CDC synchronization to complete.
5. Query the synced data in Doris. All Chinese characters show as `???`.

### Anything Else?

_No response_

### Are you willing to submit PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)
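### Diagnostic sketch

The symptom described above (every Chinese character becoming `?`) is what a lossy charset conversion usually looks like: text is encoded into a charset that cannot represent it, and each unrepresentable character is replaced by `?` (byte `0x3F`). The Python sketch below only illustrates that mechanism and is not taken from the report. `latin-1` is a stand-in for whatever non-UTF-8 charset might be involved; the report does not identify the actual cause. `classify_hex` is a hypothetical helper for interpreting a hex dump of a stored value, assuming one is obtained with something like `SELECT HEX(col)` on the Doris side.

```python
def lossy_roundtrip(text: str, charset: str = "latin-1") -> str:
    """Encode text into `charset`, replacing unrepresentable characters with '?'.

    Assumption: latin-1 is only an illustrative stand-in for a non-UTF-8 charset.
    """
    return text.encode(charset, errors="replace").decode(charset)


def classify_hex(hex_string: str) -> str:
    """Classify the bytes of a stored value given as a hex string.

    Hypothetical helper: the input is assumed to be the hex of the stored bytes.
    """
    raw = bytes.fromhex(hex_string)
    # Only 0x3F bytes: the original characters were already lost when written.
    if raw and all(b == 0x3F for b in raw):
        return "literal-question-marks"
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError:
        return "not-utf8"
    # Valid UTF-8 with non-ASCII characters: the stored data itself is intact.
    if any(ord(ch) > 0x7F for ch in text):
        return "valid-utf8-non-ascii"
    return "ascii-only"


if __name__ == "__main__":
    print(lossy_roundtrip("中文"))
    print(classify_hex("3F3F"))
    print(classify_hex("中文".encode("utf-8").hex()))
```

If the stored bytes turn out to be literal `3F`, the characters were most likely lost somewhere on the write path (source read, CDC conversion, or load) rather than in the query client. If they are valid UTF-8, the problem is more likely on the display side.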
