[
https://issues.apache.org/jira/browse/HIVE-12082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16024606#comment-16024606
]
Dudu Markovitz commented on HIVE-12082:
---------------------------------------
[~szehon] [~sershe] -
_GREATEST_ and _LEAST_ return _NULL_ if any of their arguments is _NULL_, on every
database I'm familiar with.
However, I strongly feel this is a bad design that does not fit realistic
use-cases.
SQL users tend to use _GREATEST_/_LEAST_ as a horizontal _MAX_/_MIN_, in which
case it makes a lot of sense to ignore _NULL_ values.
There is a workaround for _NULL_ values, but it is cumbersome and error-prone, e.g.:
{code}
GREATEST (coalesce(x,-999999999),coalesce(y,-999999999),coalesce(z,-999999999))
{code}
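To illustrate why the sentinel workaround is error-prone, here is a minimal Python sketch that models it, with None standing in for _NULL_. The helper names (coalesce, greatest_with_sentinel) are hypothetical and only mimic the SQL expression above; this is not Hive code.

```python
SENTINEL = -999999999  # same magic value as in the workaround above


def coalesce(value, default):
    """Two-argument model of SQL COALESCE, with None standing in for NULL."""
    return default if value is None else value


def greatest_with_sentinel(*args):
    """Model of GREATEST(coalesce(x, SENTINEL), coalesce(y, SENTINEL), ...)."""
    return max(coalesce(a, SENTINEL) for a in args)


print(greatest_with_sentinel(3, None, 7))          # 7, as intended
print(greatest_with_sentinel(None, None, None))    # -999999999: the sentinel leaks out instead of NULL
print(greatest_with_sentinel(-2000000000, None))   # -999999999: a value that is in none of the inputs
```

The last two calls show the failure modes: when every input is _NULL_ the sentinel is returned as if it were data, and when a real value is smaller than the sentinel the result is simply wrong.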
I would like to suggest 2 possible options to handle this function differently:
1. Configuration: something like _hive.greatest.least.ignore.null_ with a default
value of _false_.
2. Enhanced function syntax: {code}GREATEST/LEAST (...) [IGNORE NULLS]{code}
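As a rough sketch of the semantics both options are meant to expose, the following Python model uses None for _NULL_ and a hypothetical ignore_nulls flag in place of either the configuration property or the _IGNORE NULLS_ clause. It only illustrates the intended behavior and assumes at least one argument is passed; it says nothing about how Hive would implement it.

```python
def greatest(*args, ignore_nulls=False):
    """Model of GREATEST with None as NULL.

    ignore_nulls=False: return None as soon as any argument is None.
    ignore_nulls=True:  skip None arguments; return None only if all are None.
    """
    if ignore_nulls:
        values = [a for a in args if a is not None]
        return max(values) if values else None
    if any(a is None for a in args):
        return None
    return max(args)


def least(*args, ignore_nulls=False):
    """Model of LEAST, mirroring greatest() above."""
    if ignore_nulls:
        values = [a for a in args if a is not None]
        return min(values) if values else None
    if any(a is None for a in args):
        return None
    return min(args)


print(greatest(1, None))                     # None (strict behavior)
print(greatest(1, None, ignore_nulls=True))  # 1
print(least(None, None, ignore_nulls=True))  # None (nothing left to compare)
```

With the flag left at its default of false, the model matches the strict behavior; turning it on gives the horizontal _MAX_/_MIN_ behavior without any sentinel value.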
What say you?
> Null comparison for greatest and least operator
> -----------------------------------------------
>
> Key: HIVE-12082
> URL: https://issues.apache.org/jira/browse/HIVE-12082
> Project: Hive
> Issue Type: Bug
> Components: UDF
> Reporter: Szehon Ho
> Assignee: Szehon Ho
> Fix For: 2.0.0
>
> Attachments: HIVE-12082.2.patch, HIVE-12082.patch
>
>
> In MySQL comparisons, if any of the entries is null, the result is null.
> [https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html|https://dev.mysql.com/doc/refman/5.0/en/comparison-operators.html]
> and
> [https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html|https://dev.mysql.com/doc/refman/5.0/en/type-conversion.html].
> This can be demonstrated by the following mysql query:
> {noformat}
> mysql> select greatest(1, null) from test;
> +-------------------+
> | greatest(1, null) |
> +-------------------+
> | NULL |
> +-------------------+
> 1 row in set (0.00 sec)
> mysql> select greatest(-1, null) from test;
> +--------------------+
> | greatest(-1, null) |
> +--------------------+
> | NULL |
> +--------------------+
> 1 row in set (0.00 sec)
> {noformat}
> This is in contrast to Hive, where nulls are ignored in the comparisons.
> {noformat}
> hive> select greatest(null, 1) from test;
> OK
> 1
> {noformat}