[ 
https://issues.apache.org/jira/browse/FLINK-24924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17444893#comment-17444893
 ] 

Caizhi Weng commented on FLINK-24924:
-------------------------------------

Thanks for the mention.

I don't quite like this idea. In a production pipeline, an upstream error 
might produce invalid data once in a while. In a streaming system, throwing 
an exception on invalid data is very unfriendly to users: the record will 
fail the job again and again, even after a restart, whereas most users (in my 
experience with some production pipelines) would rather skip the record and 
keep the pipeline flowing.

Instead of {{TRY_CAST}}, can we have something like {{CAST_OR_FAIL}}, or at 
least an option to control whether an exception is thrown? Thousands of 
users and tens of thousands of jobs use {{CAST(...) IS NOT NULL}} to filter 
invalid data in production. If this behavior suddenly changes, updating 
their SQL will be a burden for them.
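
For illustration, here is a minimal sketch of the filter pattern I mean. The 
table and column names ({{events}}, {{id}}, {{ts_str}}) are hypothetical, and 
the second query assumes the {{TRY_CAST}} semantics proposed in the 
description below:

{code:sql}
-- Hypothetical table and column names, for illustration only.
-- Today: an unparsable ts_str casts to NULL, so the row is simply skipped.
SELECT id, CAST(ts_str AS TIMESTAMP) AS ts
FROM events
WHERE CAST(ts_str AS TIMESTAMP) IS NOT NULL;

-- Under the proposal, the CAST above would throw on invalid input, so every
-- such query would have to be rewritten to keep skipping bad records:
SELECT id, TRY_CAST(ts_str AS TIMESTAMP) AS ts
FROM events
WHERE TRY_CAST(ts_str AS TIMESTAMP) IS NOT NULL;
{code}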

> TO_TIMESTAMP and TO_DATE should fail
> ------------------------------------
>
>                 Key: FLINK-24924
>                 URL: https://issues.apache.org/jira/browse/FLINK-24924
>             Project: Flink
>          Issue Type: Sub-task
>            Reporter: Francesco Guardiani
>            Priority: Major
>
> In a similar fashion to what is described in 
> https://issues.apache.org/jira/browse/FLINK-24385, TO_TIMESTAMP and TO_DATE 
> should fail instead of returning {{null}}.
> For these two functions in particular, a parsing failure could lead to 
> very unexpected behavior; for example, it could produce records with a null 
> rowtime.
> We should change these functions to fail by default when parsing generates an 
> error. We can let users handle errors by letting them use TRY_CAST for the 
> same functionality:
> {code:sql}
> -- This fails when input is invalid
> TO_TIMESTAMP(input)
> -- Behaves the same as above
> CAST(input AS TIMESTAMP)
> -- This returns null when input is invalid
> TRY_CAST(input AS TIMESTAMP)
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
