[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16814703#comment-16814703 ]

Hequn Cheng commented on FLINK-10049:
-------------------------------------

Yes, I meant unifying the logic for the two problems I listed. Whether a function returns null or throws an NPE depends on the detailed logic of each UDF and may vary. What we have to do is make sure the semantics are correct and try to avoid exceptions where returning null is acceptable. :-)

+1 to renaming the title.

Best, Hequn

> Unify the processing logic for NULL arguments in SQL built-in functions
> -----------------------------------------------------------------------
>
>                 Key: FLINK-10049
>                 URL: https://issues.apache.org/jira/browse/FLINK-10049
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Xingcan Cui
>            Assignee: vinoyang
>            Priority: Major
>
> Currently, the built-in functions treat NULL arguments in different ways.
> E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general
> SQL-way of handling NULL values should be that if one argument is NULL the
> result is NULL. We should unify the processing logic for that.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813002#comment-16813002 ]

Xingcan Cui commented on FLINK-10049:
-------------------------------------

Hey guys, thanks for the comments.

I haven't gone through all the documents, but it does seem that different SQL engines handle this problem differently. However, since all fields in Flink SQL are nullable in the current version, simply throwing an NPE and terminating the execution should always be avoided.

IMO, each UDF is responsible for handling {{NULL}} arguments itself, with the correct semantics. {{NULL}} means "unknown" in SQL, so most scalar functions should output "unknown" for an unknown input. We can add a {{RETURNS NULL ON NULL INPUT}} option to UDF definitions (maybe as a method to be overridden), but it works more like an optimization, which means that even without this declaration, the function should return {{NULL}} after being invoked (just in case).

Actually, there's no need to unify the processing logic. We just need to keep the correct semantics and avoid terminating the (continuous) queries unexpectedly. Thus, I plan to rename this ticket to "Correctly handle NULL arguments in SQL built-in functions".

The exception handling mechanism is a slightly different matter, and we'd better discuss it elsewhere.

What do you think?
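To illustrate the point that a scalar function should handle a NULL argument itself and return "unknown" rather than throw, here is a minimal Java sketch. It is not Flink's actual LOG10 implementation; the class name `NullSafeLog10` and the method `eval` are hypothetical names chosen for this example, and SQL NULL is modelled as a Java `null` on a boxed `Double`.

```java
// Hypothetical sketch, not Flink code: a LOG10-like scalar function that
// propagates NULL instead of throwing a NullPointerException.
final class NullSafeLog10 {
    private NullSafeLog10() {}

    /** Returns null for a null argument; otherwise the base-10 logarithm. */
    static Double eval(Double x) {
        if (x == null) {
            return null; // unknown input, unknown output
        }
        return Math.log10(x);
    }

    public static void main(String[] args) {
        System.out.println(eval(100.0));
        System.out.println(eval(null));
    }
}
```

Without the explicit check, unboxing the null `Double` in the call to `Math.log10` would throw an NPE, which is the kind of failure that would terminate a continuous query.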
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812935#comment-16812935 ]

Hequn Cheng commented on FLINK-10049:
-------------------------------------

[~xccui] Thanks for opening the issue. Good catch, and +1 to unifying the processing logic.

This is an interesting topic and I would like to share some thoughts too. I think improving UDFs is a big topic that deserves a discussion of its own. As for the null input problem raised by this issue, I see two problems to be addressed:

- How to handle exceptions in UDFs. We may need a global configuration to control how exceptions thrown by UDFs are handled, for example an option to return null.
- How to handle NULL input and NULL output. We can let the UDF process null input by default. However, we can also provide an option like SQL Server's {{RETURNS NULL ON NULL INPUT}}, which returns NULL when any of the arguments is NULL, without actually invoking the body of the function.

What do you guys think?
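The two options discussed in this comment can be sketched in plain Java as wrappers around a function body. This is only an illustration under stated assumptions: the class `NullOnNullInput` and both method names are hypothetical, nothing here is an existing Flink API, and a single-argument `java.util.function.Function` stands in for a UDF.

```java
import java.util.function.Function;

// Hypothetical sketch, not a Flink API: two wrappers around a UDF body.
final class NullOnNullInput {
    private NullOnNullInput() {}

    /** Mimics RETURNS NULL ON NULL INPUT: the body is never invoked for a null argument. */
    static <A, R> Function<A, R> returnsNullOnNullInput(Function<A, R> body) {
        return arg -> arg == null ? null : body.apply(arg);
    }

    /** Mimics a "return null" exception policy: a runtime exception in the body becomes null. */
    static <A, R> Function<A, R> returnsNullOnException(Function<A, R> body) {
        return arg -> {
            try {
                return body.apply(arg);
            } catch (RuntimeException e) {
                return null; // swallow the failure and report "unknown"
            }
        };
    }

    public static void main(String[] args) {
        Function<String, Integer> length = returnsNullOnNullInput(String::length);
        System.out.println(length.apply("flink"));
        System.out.println(length.apply(null));
    }
}
```

The first wrapper skips the body entirely, so it behaves like an optimization on top of a body that already handles NULL correctly. The second wrapper changes semantics, since it also hides genuine bugs, which is one reason to discuss exception handling separately from NULL handling.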
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16812931#comment-16812931 ]

vinoyang commented on FLINK-10049:
----------------------------------

[~twalthr] What's your opinion?
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16811903#comment-16811903 ]

LiuJi commented on FLINK-10049:
-------------------------------

It seems that string functions in different SQL engines may behave differently for NULL values. For example, as you mentioned above, LOG10(NULL) throws an NPE, but in other SQL engines it returns NULL. The same applies to ABS: it may throw an exception when given a NULL parameter. I'm not sure whether there is a standard that specifies this behavior for string functions? [~xccui]

> Unify the processing logic for NULL arguments in SQL built-in functions
> -----------------------------------------------------------------------
>
>                 Key: FLINK-10049
>                 URL: https://issues.apache.org/jira/browse/FLINK-10049
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table SQL / API
>            Reporter: Xingcan Cui
>            Priority: Major
>
> Currently, the built-in functions treat NULL arguments in different ways.
> E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general
> SQL-way of handling NULL values should be that if one argument is NULL the
> result is NULL. We should unify the processing logic for that.
[jira] [Commented] (FLINK-10049) Unify the processing logic for NULL arguments in SQL built-in functions
[ https://issues.apache.org/jira/browse/FLINK-10049?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569014#comment-16569014 ]

vinoyang commented on FLINK-10049:
----------------------------------

[~xccui] Agreed! It seems that for some string functions, if a NULL parameter is passed, the function's logic is not invoked and the table framework returns NULL directly.

> Unify the processing logic for NULL arguments in SQL built-in functions
> -----------------------------------------------------------------------
>
>                 Key: FLINK-10049
>                 URL: https://issues.apache.org/jira/browse/FLINK-10049
>             Project: Flink
>          Issue Type: Improvement
>          Components: Table API & SQL
>            Reporter: Xingcan Cui
>            Priority: Major
>
> Currently, the built-in functions treat NULL arguments in different ways.
> E.g., ABS(NULL) returns NULL, while LOG10(NULL) throws an NPE. The general
> SQL-way of handling NULL values should be that if one argument is NULL the
> result is NULL. We should unify the processing logic for that.