[ https://issues.apache.org/jira/browse/FLINK-24413?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17428239#comment-17428239 ]

Timo Walther edited comment on FLINK-24413 at 10/13/21, 1:55 PM:
-----------------------------------------------------------------

Thanks for investigating this topic [~matriv].

I would vote for a combination of Spark's and Snowflake's behavior:
- `CAST` trims and pads if needed.
- We can introduce an additional runtime check when going to the sink, similar 
to the `NOT NULL` enforcer.
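
The trim-and-pad semantics proposed in the first bullet could look like this minimal sketch (plain Java, hypothetical helper names, not actual Flink code):

```java
// Hypothetical sketch of CAST semantics for CHAR(n)/VARCHAR(n); not Flink API.
public class CharCastSketch {

    // CAST(s AS CHAR(n)): trim to n characters, pad with spaces up to n.
    public static String castToChar(String s, int n) {
        if (s.length() > n) {
            return s.substring(0, n);
        }
        StringBuilder sb = new StringBuilder(s);
        while (sb.length() < n) {
            sb.append(' ');
        }
        return sb.toString();
    }

    // CAST(s AS VARCHAR(n)): trim to n characters, no padding.
    public static String castToVarchar(String s, int n) {
        return s.length() > n ? s.substring(0, n) : s;
    }
}
```

With these semantics, `CAST('abcdfe' AS CHAR(3))` yields `'abc'` and `CAST('a' AS CHAR(3))` yields `'a  '`, matching the PostgreSQL behavior quoted in the issue description below.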

In general, we should encourage users to use STRING and BYTES. CHAR and VARCHAR 
exist mainly for ecosystem and catalog integration, i.e. to connect to other 
systems and declare schemas. It should be the responsibility of the data 
producer (i.e. a source or function) to create a valid `StringData` matching 
the schema it was configured with. I would not add this check to 
`DataStructureConverters`, because that would mean the converters must be able 
to work with invalid input data that doesn't match the requested data type. We 
can add helper methods such as `StringData.fromString(string, length)` to make 
this easier for developers.
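
The sink-side runtime check mentioned above could reduce to a simple length enforcement step, analogous to the existing `NOT NULL` enforcer. A hedged sketch (plain Java; the class, enum, and method names are assumptions, not existing Flink API):

```java
// Hypothetical sink-side length enforcer, analogous in spirit to Flink's
// NOT NULL enforcer; names and behavior options are illustrative only.
public class LengthEnforcerSketch {

    public enum Behavior { ERROR, TRIM }

    // Checks a string against the declared CHAR/VARCHAR length before it
    // reaches the sink; either fails fast or trims, depending on config.
    public static String enforce(String value, int maxLength, Behavior behavior) {
        if (value.length() <= maxLength) {
            return value;
        }
        if (behavior == Behavior.ERROR) {
            throw new IllegalStateException(
                "String of length " + value.length()
                + " exceeds declared CHAR/VARCHAR length " + maxLength);
        }
        return value.substring(0, maxLength); // TRIM behavior
    }
}
```

Whether to fail or trim could be made configurable, mirroring how the `NOT NULL` enforcer exposes an error/drop choice.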


> Casting to a CHAR() and VARCHAR() doesn't trim the string to the specified 
> precision
> ------------------------------------------------------------------------------------
>
>                 Key: FLINK-24413
>                 URL: https://issues.apache.org/jira/browse/FLINK-24413
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Table SQL / API
>            Reporter: Marios Trivyzas
>            Priority: Major
>              Labels: pull-request-available
>
> *CAST('abcdfe' AS CHAR(3))* should trim the string to 3 chars but currently 
> returns the whole string *'abcdfe'*.
>
> PostgreSQL and Oracle, for example, behave as follows:
> postgres=# select '123456afas'::char(4);
>  bpchar 
>  --------
>  1234
>  (1 row)
> postgres=# select '123456afas'::varchar(5);
>  varchar 
>  ---------
>  12345
>  (1 row)



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
