[ 
https://issues.apache.org/jira/browse/DRILL-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bohdan Kazydub updated DRILL-6815:
----------------------------------
    Description: 
If a (simple) function is declared with the NULL_IF_NULL null handling strategy 
(`nulls = NullHandling.NULL_IF_NULL`), additional code is generated 
that checks whether any of the inputs is NULL (not set). If one is, the 
output is set to NULL; otherwise the function's code is executed and, at the 
end, the output value is marked as set if ANY of the inputs is OPTIONAL (see 
https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).
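
For illustration only, the wrapper logic described above can be approximated 
in plain Java. This is a minimal sketch, not Drill's generated code: 
NullableHolder, NullIfNullSketch, eval() and callCurrent() are hypothetical 
stand-ins, and the real code generator inlines the function body rather than 
calling a method.

```java
// Hypothetical stand-in for a nullable value holder (not a Drill class).
final class NullableHolder {
    int isSet;      // 1 = value present, 0 = NULL
    String value;

    NullableHolder(int isSet, String value) {
        this.isSet = isSet;
        this.value = value;
    }
}

class NullIfNullSketch {
    // Stand-in for a function body that tries to return NULL for an empty input.
    static void eval(NullableHolder in, NullableHolder out) {
        if (in.value.isEmpty()) {
            out.isSet = 0;                      // attempt to produce NULL
        } else {
            out.value = in.value.toUpperCase();
        }
    }

    // Approximates the wrapper described above for an OPTIONAL input.
    static NullableHolder callCurrent(NullableHolder in) {
        NullableHolder out = new NullableHolder(0, null);
        if (in.isSet == 0) {
            out.isSet = 0;                      // NULL input: body is skipped
        } else {
            eval(in, out);
            out.isSet = 1;                      // set AFTER the body: overrides eval()
        }
        return out;
    }

    public static void main(String[] args) {
        NullableHolder result = callCurrent(new NullableHolder(1, ""));
        // The body asked for NULL, but the wrapper marked the output as set.
        System.out.println("isSet = " + result.isSet);   // prints "isSet = 1"
    }
}
```

In this sketch the assignment after the body always wins, so whatever the 
body writes to isSet is lost.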

The problem is that this behavior makes it impossible to set the output value 
to NULL from within the function's evaluation body, which may prove useful in 
certain situations, e.g. when the input is an empty string and the output 
should be NULL in that case. Sometimes this results in the creation of two 
separate functions with NullHandling.INTERNAL (one for OPTIONAL and one for 
REQUIRED inputs) instead of one with NULL_IF_NULL. This does not follow the 
Principle of Least Astonishment, as the strategy effectively behaves more like 
"null if and only if null", whereas the documentation for NULL_IF_NULL is as 
follows:
{code}
enum NullHandling {
    ...

    /**
     * Null output if any null input:
     * Indicates that a method's associated logical operation returns NULL if
     * either input is NULL, and therefore that the method must not be called
     * with null inputs.  (The calling framework must handle NULLs.)
     */
    NULL_IF_NULL
}
{code}
It looks as if this behavior was not intended.

The intent of this improvement is to allow a function's eval() method to 
output NULL values when the NULL_IF_NULL null handling strategy is chosen.
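
One possible way to get this behavior, sketched in plain Java under stated 
assumptions: mark the output as set before the body runs, so that the body 
can still reset it. This is not the actual patch; NullableOut, 
NullIfNullProposedSketch, eval() and callProposed() are hypothetical names 
used only for illustration.

```java
// Hypothetical stand-in for a nullable value holder (not a Drill class).
final class NullableOut {
    int isSet;      // 1 = value present, 0 = NULL
    String value;

    NullableOut(int isSet, String value) {
        this.isSet = isSet;
        this.value = value;
    }
}

class NullIfNullProposedSketch {
    // Stand-in for a function body that returns NULL for an empty input.
    static void eval(NullableOut in, NullableOut out) {
        if (in.value.isEmpty()) {
            out.isSet = 0;                      // body decides the output is NULL
        } else {
            out.value = in.value.toUpperCase();
        }
    }

    // Assumed ordering: mark the output as set BEFORE the body runs.
    static NullableOut callProposed(NullableOut in) {
        NullableOut out = new NullableOut(0, null);
        if (in.isSet == 0) {
            out.isSet = 0;                      // NULL input still short-circuits
        } else {
            out.isSet = 1;                      // default: output is set
            eval(in, out);                      // body may reset isSet to 0
        }
        return out;
    }

    public static void main(String[] args) {
        NullableOut result = callProposed(new NullableOut(1, ""));
        System.out.println("isSet = " + result.isSet);   // prints "isSet = 0"
    }
}
```

With this ordering a NULL input still yields NULL without running the body, 
and a body that never touches isSet keeps its output marked as set.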

  was:
If a (simple) function is declared with NULL_IF_NULL null handling strategy 
(`nulls = NullHandling.NULL_IF_NULL`) there is a additional code generated 
which checks if any of the inputs is NULL (not set). In case if there is, 
output is set to be null otherwise function's code is executed and at the end 
output value is marked as set in case if ANY of the inputs is OPTIONAL (see 
[https://github.com/apache/drill/blob/8edeb49873d1a1710cfe28e0b49364d07eb1aef4/exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/DrillSimpleFuncHolder.java#L143).]

The problem is, this behavior makes it impossible to make output value NULL 
from within [function's evaluation 
body|https://github.com/apache/drill/blob/7b0c9034753a8c5035fd1c0f1f84a37b376e6748/exec/java-exec/src/main/java/org/apache/drill/exec/expr/DrillSimpleFunc.java#L22].
 Which may prove useful in certain situations, e.g. when input is an empty 
string and output should be NULL in the case etc. Sometimes it may result in 
two separate functions instead of one with NULL_IF_NULL. It does not follow a 
[Principle of Least 
Astonishment|https://en.wikipedia.org/wiki/Principle_of_least_astonishment] as 
effectively it behaves more like "null if and only if null" and documentation 
for NULL_IF_NULL is currently


> Improve code generation to handle functions with NullHandling.NULL_IF_NULL 
> better
> ---------------------------------------------------------------------------------
>
>                 Key: DRILL-6815
>                 URL: https://issues.apache.org/jira/browse/DRILL-6815
>             Project: Apache Drill
>          Issue Type: Improvement
>            Reporter: Bohdan Kazydub
>            Priority: Major



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
