Hi Caizhi, Thanks for bringing this up for discussion. I think the important part is what do developers expect as the default behaviour of a CAST function when casting fails. If I look at Postgres [1] or MSSQL [2], the default behaviour of a CAST failing would be to return an error, which would be the new behaviour. Returning a value when a CAST fails can lead to users not understanding immediately where that value comes from. So, I would be in favor of the new behaviour by default, but including a configuration flag to maintain the old behaviour to avoid that you need to rewrite all these jobs.
Best regards, Martijn [1] https://www.postgresql.org/docs/current/sql-createcast.html [2] https://docs.microsoft.com/en-us/sql/t-sql/functions/try-cast-transact-sql?view=sql-server-ver15 On Thu, 18 Nov 2021 at 03:17, Caizhi Weng <tsreape...@gmail.com> wrote: > Hi devs! > > We're discussing the behavior of casting functions (including cast, > to_timestamp, to_date, etc.) for invalid input in > https://issues.apache.org/jira/browse/FLINK-24924. As this topic is > crucial > to compatibility and usability we'd like to continue discussing this > publicly in the mailing list. > > The main topic is to discuss that shall casting functions return null (keep > its current behavior) or throw exceptions (introduce a new behavior). I'm > trying to conclude the ideas on both sides. Correct me if I miss something. > > *From the devs who support throwing exceptions (new behavior):* > > The main concern is that if we silently return null then unexpected results > or exceptions (mainly NullPointerException) may be produced. However, it > will be hard for users to reason for this because there is no detailed > message. If we throw exceptions in the first place, then it's much easier > to catch the exception with nice detailed messages explaining what is going > wrong. Especially for this case of DATE/TIME/TIMESTAMP it's very helpful to > have a detailed error and see where and why the parsing broke. > > For compatibility concerns, we can provide a TRY_CAST function which is > exactly the same as the current CAST function by returning nulls for > invalid input. > > *From the devs who support return null (current behavior):* > > The main concern is compatibility and usability. > > On usability: The upstream system may occasionally produce invalid data and > if we throw exceptions when seeing this it will fail the job again and > again even after restart (because the invalid data is always > there). Streaming computing is a resident program and users do not want it > to frequently fail and cannot automatically recover. Most users are willing > to just skip that record and continue processing. Imagine an online job > running for a couple of weeks and suddenly fails due to some unexpected > dirty data. What choices do users have to quickly resume the job? > > On compatibility: There are currently thousands of users and tens of > thousands of jobs relying on the current behavior to filter out invalid > input. If we change the behavior it will be a disaster for users as they > have to rewrite and check their SQL very carefully. > > > What do you think? We're looking forward to your feedback. >