Csaba Ringhofer created IMPALA-11911:
----------------------------------------
Summary: Incorrect handling of NULL arguments in Hive GenericUDFs
Key: IMPALA-11911
URL: https://issues.apache.org/jira/browse/IMPALA-11911
Project: IMPALA
Issue Type: Bug
Components: Frontend
Affects Versions: Impala 4.2.0
Reporter: Csaba Ringhofer
If an argument of a GenericUDF is NULL, Impala passes a null reference instead
of a DeferredObject:
https://github.com/apache/impala/blob/5abbb9bd17373c8aafe6d213d328e16934cdca07/fe/src/main/java/org/apache/impala/hive/executor/HiveUdfExecutorGeneric.java#L74
This seems wrong: the example GenericUDFs I checked in Hive assume that the
argument itself is never null, while the DeferredObject's get() function can
return null:
https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java#L165
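The difference can be sketched with plain-Java stand-ins. The DeferredObject interface and the identity() function below are hypothetical simplifications, not Hive's or Impala's actual classes. The only assumption is that a UDF calls get() on each argument without first checking the array slot for null.

```java
class NullArgSketch {
  // Hypothetical, simplified stand-in for Hive's DeferredObject.
  interface DeferredObject {
    Object get();
  }

  // Hypothetical UDF body: it assumes args[0] is a non-null wrapper and that
  // only the wrapped value may be null.
  static Object identity(DeferredObject[] args) {
    return args[0].get();
  }

  // Wraps a possibly-null value so that the array slot itself is never null.
  static DeferredObject wrap(final Object value) {
    return () -> value;
  }

  // Returns true if calling the UDF with the given args throws an NPE.
  static boolean throwsNpe(DeferredObject[] args) {
    try {
      identity(args);
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }
}
```

With these stand-ins, passing a null array slot makes identity() throw a NullPointerException, while passing wrap(null) lets it return null normally.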
This also makes sense, as one of the goals of DeferredObject is lazy evaluation,
so we may not know before calling get() whether the argument is null:
https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java#L92
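To illustrate the lazy-evaluation point, here is a minimal sketch. LazyDeferred and ifThenElse() are hypothetical names invented for this example and do not reproduce Hive's evaluator. It assumes the wrapped expression runs only on the first get(), so the caller cannot tell beforehand whether the argument is NULL.

```java
import java.util.function.Supplier;

class LazyArgSketch {
  // Hypothetical, simplified stand-in for Hive's DeferredObject.
  interface DeferredObject {
    Object get();
  }

  // Hypothetical lazy wrapper: the child expression runs on the first get().
  static final class LazyDeferred implements DeferredObject {
    private final Supplier<Object> expr;
    private boolean evaluated = false;
    private Object value;
    int evaluations = 0;  // counts how often the expression actually ran

    LazyDeferred(Supplier<Object> expr) {
      this.expr = expr;
    }

    @Override
    public Object get() {
      if (!evaluated) {
        value = expr.get();
        evaluated = true;
        evaluations++;
      }
      return value;
    }
  }

  // Hypothetical IF-like function: only the selected branch is evaluated,
  // and a NULL condition selects the else branch.
  static Object ifThenElse(DeferredObject cond, DeferredObject thenArg,
      DeferredObject elseArg) {
    return Boolean.TRUE.equals(cond.get()) ? thenArg.get() : elseArg.get();
  }
}
```

In this sketch every argument is a non-null wrapper, and a NULL value only becomes visible once get() is called on the branch that is actually used.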
Even Impala's test UDFs throw an exception for NULL:
{code}
create function generic_identity(int) returns int
location '/test-warehouse/impala-hive-udfs.jar'
symbol='org.apache.impala.TestGenericUdf';
select generic_identity(cast(NULL as int));
WARNINGS: UDF WARNING: Hive UDF
path=hdfs://localhost:20500/test-warehouse/impala-hive-udfs.jar
class=org.apache.impala.TestGenericUdf failed due to: NullPointerException: null
{code}