cs17899219 opened a new issue, #620: URL: https://github.com/apache/doris-flink-connector/issues/620
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/incubator-doris/issues?q=is%3Aissue) and found no similar issues.

### Description

The current logic in `TypeConverter.java` uses a multiplier of `3` to calculate the required byte length for the Doris `VARCHAR` type:

```java
// Current implementation
return length * 3 > 65533
        ? DorisType.STRING
        : String.format("%s(%s)", DorisType.VARCHAR, length * 3);
```

This assumes a maximum of 3 bytes per character, which is insufficient for the widely used utf8mb4 character set (common in MySQL/MariaDB and other sources). The utf8mb4 encoding supports the full range of Unicode characters (including emojis), requiring up to 4 bytes per character.

If a source column contains 4-byte characters, the calculated byte length may underestimate the required size, leading to:

- Data truncation or corruption during the synchronization process.
- Load failures with errors such as "data length exceeded" or "row size too large" when Doris enforces the byte limit.

### Solution

This change updates the byte multiplier from 3 to 4 to safely accommodate the full utf8mb4 character set, ensuring the calculated byte length is always sufficient for the defined character length, thus guaranteeing data integrity and preventing sync failures.

### Are you willing to submit PR?

- [x] Yes I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
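For illustration, the multiplier change proposed in this issue could look roughly like the sketch below. It is not the actual `TypeConverter.java` code: the class `VarcharLengthSketch`, the method `toDorisStringType`, and the constant names are hypothetical, and the string literals `"STRING"` and `"VARCHAR"` stand in for `DorisType.STRING` and `DorisType.VARCHAR`. The widening to `long` is an extra guard against integer overflow that the issue itself does not mention.

```java
import java.nio.charset.StandardCharsets;

public class VarcharLengthSketch {
    // Byte limit taken from the snippet quoted in the issue.
    static final int MAX_VARCHAR_BYTES = 65533;
    // utf8mb4 needs up to 4 bytes per character (the old multiplier was 3).
    static final int UTF8MB4_MAX_BYTES_PER_CHAR = 4;

    // Hypothetical helper: maps a source character length to a Doris type string.
    static String toDorisStringType(int charLength) {
        // Widen to long so very large source lengths cannot overflow int.
        long byteLength = (long) charLength * UTF8MB4_MAX_BYTES_PER_CHAR;
        return byteLength > MAX_VARCHAR_BYTES
                ? "STRING" // stand-in for DorisType.STRING
                : String.format("%s(%d)", "VARCHAR", byteLength); // stand-in for DorisType.VARCHAR
    }

    public static void main(String[] args) {
        // U+1F600 is a single code point that takes 4 bytes in UTF-8.
        String emoji = "\uD83D\uDE00";
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length); // prints 4

        System.out.println(toDorisStringType(255));   // prints VARCHAR(1020)
        System.out.println(toDorisStringType(16383)); // prints VARCHAR(65532)
        System.out.println(toDorisStringType(16384)); // prints STRING
    }
}
```

One consequence of the larger multiplier is that the cutover to `STRING` happens at a shorter source length: with a multiplier of 4, any column longer than 16383 characters exceeds the 65533-byte limit, whereas with a multiplier of 3 the threshold was 21844 characters.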
