[ https://issues.apache.org/jira/browse/FLINK-35102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Leonard Xu resolved FLINK-35102.
--------------------------------
Resolution: Fixed
fixed via 5e52d8620cee995b55f003956c3f7d7a3bdf4a22
> Incorrect Type mapping for Flink CDC Doris connector
> ----------------------------------------------------
>
> Key: FLINK-35102
> URL: https://issues.apache.org/jira/browse/FLINK-35102
> Project: Flink
> Issue Type: Bug
> Components: Flink CDC
> Reporter: Xiqian YU
> Assignee: Xiqian YU
> Priority: Major
> Labels: pull-request-available
> Fix For: cdc-3.1.0
>
>
> According to the Flink CDC Doris connector docs, CHAR and VARCHAR are mapped
> with their length multiplied by 3, since Doris uses UTF-8 variable-length
> encoding internally:
> |CHAR(n)|CHAR(n*3)|In Doris, strings are stored in UTF-8 encoding, so English
> characters occupy 1 byte and Chinese characters occupy 3 bytes. The length
> here is multiplied by 3. The maximum length of CHAR is 255. Once exceeded, it
> will automatically be converted to VARCHAR type.|
> |VARCHAR(n)|VARCHAR(n*3)|Same as above. The length here is multiplied by 3.
> The maximum length of VARCHAR is 65533. Once exceeded, it will automatically
> be converted to STRING type.|
> However, the Doris connector currently maps `CHAR(n)` to `CHAR(n)` and
> `VARCHAR(n)` to `VARCHAR(n * 4)`, which is inconsistent with the mapping
> specified in the docs.
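
For illustration, the documented mapping quoted above can be sketched as follows. This is a minimal sketch, not the connector's actual code: the class and method names are hypothetical, and the handling of a CHAR whose tripled length also exceeds the VARCHAR limit (falling through to STRING) is an assumption, since the docs only mention conversion to VARCHAR.

```java
// Hypothetical helper illustrating the documented CHAR/VARCHAR mapping.
// It is not taken from the Flink CDC code base.
final class DorisTypeMappingSketch {
    // Limits quoted in the connector docs.
    static final long MAX_CHAR_LENGTH = 255;
    static final long MAX_VARCHAR_LENGTH = 65533;

    private DorisTypeMappingSketch() {}

    // CHAR(n) -> CHAR(n * 3), or the VARCHAR rule once n * 3 exceeds 255.
    static String mapChar(int n) {
        long length = n * 3L; // long arithmetic avoids int overflow for large n
        if (length <= MAX_CHAR_LENGTH) {
            return "CHAR(" + length + ")";
        }
        // Assumption: an oversized CHAR follows the same rule as VARCHAR.
        return mapVarchar(n);
    }

    // VARCHAR(n) -> VARCHAR(n * 3), or STRING once n * 3 exceeds 65533.
    static String mapVarchar(int n) {
        long length = n * 3L;
        if (length <= MAX_VARCHAR_LENGTH) {
            return "VARCHAR(" + length + ")";
        }
        return "STRING";
    }

    public static void main(String[] args) {
        System.out.println(mapChar(10));
        System.out.println(mapChar(86));
        System.out.println(mapVarchar(21845));
    }
}
```

Under these assumptions, `mapChar(85)` still fits in a Doris CHAR, while `mapChar(86)` already falls over to VARCHAR because 86 * 3 exceeds 255.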