[ 
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024606#comment-16024606
 ] 

Dudu Markovitz commented on HIVE-12082:
---------------------------------------

[~szehon] [~sershe] -

_GREATEST_ and _LEAST_ return _NULL_ if any of their arguments is _NULL_, on every 
database I'm familiar with.
However, I strongly feel this is a bad design that does not fit realistic 
use-cases.
SQL users tend to use _GREATEST_/_LEAST_ as a horizontal _MAX_/_MIN_, in which 
case it makes a lot of sense to ignore _NULL_ values.
There is a work-around for _NULL_ values, but it is cumbersome and error-prone, e.g. -
{code}
GREATEST (coalesce(x,-999999999),coalesce(y,-999999999),coalesce(z,-999999999))
{code}
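
To make the "error-prone" part concrete, here is a small Python model of the work-around. It is not Hive code: _None_ stands in for _NULL_, and the function names are made up for illustration. It only sketches how the placeholder can leak into the result.

```python
SENTINEL = -999999999  # same placeholder value as in the work-around above


def coalesce(value, default):
    """Two-argument model of SQL COALESCE: return value unless it is NULL."""
    return default if value is None else value


def greatest_strict(*args):
    """NULL-propagating GREATEST: any NULL argument makes the result NULL."""
    if any(a is None for a in args):
        return None
    return max(args)


def greatest_with_sentinel(*args):
    """GREATEST(coalesce(x, SENTINEL), coalesce(y, SENTINEL), ...)."""
    return greatest_strict(*(coalesce(a, SENTINEL) for a in args))


print(greatest_with_sentinel(3, None, 7))         # 7, as intended
print(greatest_with_sentinel(None, None, None))   # -999999999, not NULL
print(greatest_with_sentinel(-2000000000, None))  # -999999999, a value that was never in the data
```

The last two calls show the two failure modes: an all-_NULL_ row yields the placeholder instead of _NULL_, and any real value below the placeholder is silently replaced by it.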

I would like to suggest 2 possible options to handle this function differently:
1. configuration: something like _hive.greatest.least.ignore.null_ with default 
value of _false_.
2. Enhance the function syntax: {code}GREATEST/LEAST (...) [IGNORE NULLS]{code}
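
As a rough sketch of the semantics I have in mind for either option, here is a Python model (again not Hive code; _None_ stands in for _NULL_, and the _ignore_nulls_ parameter is a hypothetical stand-in for the proposed setting or the _IGNORE NULLS_ clause). Returning _NULL_ when every argument is _NULL_ is my assumption; the proposal above does not spell that case out.

```python
def _pick(chooser, args, ignore_nulls):
    """Shared logic for the GREATEST/LEAST models below."""
    if not args:
        raise ValueError("at least one argument is required")
    if ignore_nulls:
        present = [a for a in args if a is not None]
        # Assumption: an all-NULL argument list still yields NULL.
        return chooser(present) if present else None
    # Default behaviour: any NULL argument makes the result NULL.
    if any(a is None for a in args):
        return None
    return chooser(args)


def greatest(*args, ignore_nulls=False):
    """Model of GREATEST (...) [IGNORE NULLS]."""
    return _pick(max, args, ignore_nulls)


def least(*args, ignore_nulls=False):
    """Model of LEAST (...) [IGNORE NULLS]."""
    return _pick(min, args, ignore_nulls)


print(greatest(1, None))                      # None
print(greatest(1, None, ignore_nulls=True))   # 1
print(least(5, None, 2, ignore_nulls=True))   # 2
print(greatest(None, None, ignore_nulls=True))  # None
```

With the default of _false_ the behaviour matches other databases, and users who want the horizontal _MAX_/_MIN_ behaviour opt in explicitly.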

What say you?





> Null comparison for greatest and least operator
> -----------------------------------------------
>
>                 Key: HIVE-12082
>                 URL: https://issues.apache.org/jira/browse/HIVE-12082
>             Project: Hive
>          Issue Type: Bug
>          Components: UDF
>            Reporter: Szehon Ho
>            Assignee: Szehon Ho
>             Fix For: 2.0.0
>
>         Attachments: HIVE-12082.2.patch, HIVE-12082.patch
>
>
> In MySQL comparisons, if any of the entries is null, then the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
>  and 
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following MySQL query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> |              NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> |               NULL |
> +--------------------+
> 1 row in set (0.00 sec)
> {noformat}
> This is in contrast to Hive, where nulls are ignored in the comparisons.
> {noformat}
> hive> select greatest(null, 1) from test;
> OK
> 1
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
