Csaba Ringhofer created IMPALA-11911:
----------------------------------------

             Summary: Incorrect handling of NULL arguments in Hive GenericUDFs
                 Key: IMPALA-11911
                 URL: https://issues.apache.org/jira/browse/IMPALA-11911
             Project: IMPALA
          Issue Type: Bug
          Components: Frontend
    Affects Versions: Impala 4.2.0
            Reporter: Csaba Ringhofer


If an argument of a GenericUDF is NULL, Impala passes a Java null instead of a 
DeferredObject:
https://github.com/apache/impala/blob/5abbb9bd17373c8aafe6d213d328e16934cdca07/fe/src/main/java/org/apache/impala/hive/executor/HiveUdfExecutorGeneric.java#L74

This seems to be wrong: the example GenericUDFs I checked in Hive assume that 
the DeferredObject argument itself is never null, and that only the value 
returned by its get() function can be null:
https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFIf.java#L165

This also makes sense, as one of the goals of DeferredObject is lazy evaluation, 
so we may not know before calling get() whether the argument is null:
https://github.com/apache/hive/blob/7082fd1dfd087c99e6f00a7a0e95a30e198fede8/ql/src/java/org/apache/hadoop/hive/ql/exec/ExprNodeGenericFuncEvaluator.java#L92

Even Impala's test UDFs throw an exception for NULL:
{code}
create function generic_identity(int) returns int
location '/test-warehouse/impala-hive-udfs.jar'
symbol='org.apache.impala.TestGenericUdf';

select generic_identity(cast(NULL as int));

WARNINGS: UDF WARNING: Hive UDF 
path=hdfs://localhost:20500/test-warehouse/impala-hive-udfs.jar 
class=org.apache.impala.TestGenericUdf failed due to: NullPointerException: null
{code}
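
The difference between the two ways of passing NULL can be sketched in plain Java. This is only an illustration under stated assumptions: the DeferredObject interface below is a hypothetical stand-in that models just get() (it is not Hive's real class), and NullArgSketch, wrapAsReported and wrapAlways are made-up names, not Impala code.

```java
public class NullArgSketch {
    /** Hypothetical stand-in for Hive's DeferredObject; only get() is modeled. */
    public interface DeferredObject {
        Object get();
    }

    /** Identity UDF body that, like the Hive UDFs linked above, calls get() unconditionally. */
    public static Object evaluate(DeferredObject[] arguments) {
        return arguments[0].get();
    }

    /** Behavior described in this report: a NULL value becomes a null array slot. */
    public static DeferredObject[] wrapAsReported(Object[] values) {
        DeferredObject[] out = new DeferredObject[values.length];
        for (int i = 0; i < values.length; i++) {
            final Object v = values[i];
            out[i] = (v == null) ? null : () -> v;
        }
        return out;
    }

    /** Assumed expected behavior: every slot holds an object, and get() may return null. */
    public static DeferredObject[] wrapAlways(Object[] values) {
        DeferredObject[] out = new DeferredObject[values.length];
        for (int i = 0; i < values.length; i++) {
            final Object v = values[i];
            out[i] = () -> v;
        }
        return out;
    }

    public static void main(String[] args) {
        Object[] row = { null };
        // The wrapper is non-null, so the UDF simply returns the null value.
        System.out.println(evaluate(wrapAlways(row)));
        try {
            // The array slot itself is null, so arguments[0].get() throws.
            evaluate(wrapAsReported(row));
        } catch (NullPointerException e) {
            System.out.println("NullPointerException, as in the warning above");
        }
    }
}
```

With wrapAlways the identity UDF returns null for a NULL input; with wrapAsReported the same UDF fails before it can inspect the value.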




