[ 
https://issues.apache.org/jira/browse/FLINK-35102?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonard Xu resolved FLINK-35102.
--------------------------------
    Resolution: Fixed

fixed via 5e52d8620cee995b55f003956c3f7d7a3bdf4a22

> Incorrect Type mapping for Flink CDC Doris connector
> ----------------------------------------------------
>
>                 Key: FLINK-35102
>                 URL: https://issues.apache.org/jira/browse/FLINK-35102
>             Project: Flink
>          Issue Type: Bug
>          Components: Flink CDC
>            Reporter: Xiqian YU
>            Assignee: Xiqian YU
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: cdc-3.1.0
>
>
> According to the Flink CDC Doris connector docs, CHAR and VARCHAR lengths 
> are multiplied by 3, since Doris uses UTF-8 variable-length encoding internally.
> |CHAR(n)|CHAR(n*3)|In Doris, strings are stored in UTF-8 encoding, so English 
> characters occupy 1 byte and Chinese characters occupy 3 bytes. The length 
> here is multiplied by 3. The maximum length of CHAR is 255. Once exceeded, it 
> will automatically be converted to VARCHAR type.|
> |VARCHAR(n)|VARCHAR(n*3)|Same as above. The length here is multiplied by 3. 
> The maximum length of VARCHAR is 65533. Once exceeded, it will automatically 
> be converted to STRING type.|
> However, the Doris connector currently maps `CHAR(n)` to `CHAR(n)` and 
> `VARCHAR(n)` to `VARCHAR(n * 4)`, which is inconsistent with the 
> specification in the docs.
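
For illustration, the mapping rule quoted from the docs above could be sketched roughly as follows. This is a minimal sketch and not the connector's actual code or the contents of the fix commit: the class name, method names, and the string-based return values are all hypothetical, and only the multiplier (3) and the limits (255 and 65533) come from the quoted docs.

```java
// Hypothetical sketch of the documented rule; NOT the Flink CDC connector's real code.
final class DorisLengthMappingSketch {
    static final int BYTES_PER_CHAR = 3;          // multiplier quoted in the docs
    static final int MAX_CHAR_LENGTH = 255;       // CHAR limit quoted in the docs
    static final int MAX_VARCHAR_LENGTH = 65533;  // VARCHAR limit quoted in the docs

    // CHAR(n) -> CHAR(n*3), falling back to VARCHAR once the CHAR limit is exceeded.
    static String mapChar(int n) {
        long length = (long) n * BYTES_PER_CHAR;
        if (length <= MAX_CHAR_LENGTH) {
            return "CHAR(" + length + ")";
        }
        return sizedVarcharOrString(length);
    }

    // VARCHAR(n) -> VARCHAR(n*3), falling back to STRING once the VARCHAR limit is exceeded.
    static String mapVarchar(int n) {
        return sizedVarcharOrString((long) n * BYTES_PER_CHAR);
    }

    // Shared fallback: takes the already-multiplied length.
    private static String sizedVarcharOrString(long length) {
        return length <= MAX_VARCHAR_LENGTH ? "VARCHAR(" + length + ")" : "STRING";
    }

    public static void main(String[] args) {
        System.out.println(mapChar(10));        // fits within the CHAR limit
        System.out.println(mapChar(100));       // 300 > 255, so VARCHAR
        System.out.println(mapVarchar(100));    // fits within the VARCHAR limit
        System.out.println(mapVarchar(30000));  // 90000 > 65533, so STRING
    }
}
```

Under this sketch, the behaviour described in the report (`CHAR(n)` left unchanged and `VARCHAR(n)` multiplied by 4) would correspond to skipping the multiplier in `mapChar` and using 4 instead of 3 in `mapVarchar`.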



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
