I am getting an exception when using the Percentile UDAF on an empty data
set (details below). Has anyone seen or solved this before?

Thanks
Dilip

1. Create the following table and load it with 4 rows (10, 20, 30, 40):

   CREATE TABLE pct_test (
       val INT
   );

2. SELECT PERCENTILE(val, 0.5) FROM pct_test;   works fine

3. SELECT PERCENTILE(val, 0.5) FROM pct_test WHERE val > 100; fails with the
following exception.

java.lang.RuntimeException: Hive Runtime Error while closing operators
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:347)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
        at org.apache.hadoop.mapred.Child.main(Child.java:170)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException:
org.apache.hadoop.hive.ql.metadata.HiveException: Unable to execute
method public boolean
org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
 on object 
org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0
of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator
with arguments {null, null} of size 2
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:897)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:539)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
        at org.apache.hadoop.hive.ql.exec.Operator.close(Operator.java:548)
        at org.apache.hadoop.hive.ql.exec.ExecMapper.close(ExecMapper.java:324)
        ... 4 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Unable to
execute method public boolean
org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator.iterate(org.apache.hadoop.io.LongWritable,double)
 on object 
org.apache.hadoop.hive.ql.udf.udafpercentile$percentilelongevalua...@ded0f0
of class org.apache.hadoop.hive.ql.udf.UDAFPercentile$PercentileLongEvaluator
with arguments {null, null} of size 2
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:725)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.iterate(GenericUDAFBridge.java:169)
        at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFEvaluator.aggregate(GenericUDAFEvaluator.java:139)
        at org.apache.hadoop.hive.ql.exec.GroupByOperator.closeOp(GroupByOperator.java:865)
        ... 11 more
Caused by: java.lang.IllegalArgumentException
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:597)
        at org.apache.hadoop.hive.ql.exec.FunctionRegistry.invoke(FunctionRegistry.java:701)
        ... 14 more


In this simple example, the percentile may not be well defined, since there
are no rows to operate on. The real problem, however, is that the same
exception also occurs on larger data sets, where some of the map tasks
involved may output 0 rows.
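
For what it's worth, the innermost IllegalArgumentException looks like what
Java reflection throws when Method.invoke is handed null for a primitive
parameter, such as the double in iterate(LongWritable, double) called with
arguments {null, null}. The sketch below is not Hive code. The class names
are hypothetical, Long stands in for LongWritable to avoid the Hadoop
dependency, and the boxed variant only illustrates one way a null could be
accepted and ignored, not necessarily how Hive should fix it.

```java
import java.lang.reflect.Method;

public class NullPrimitiveDemo {

    // Hypothetical stand-in for an evaluator whose iterate() takes a primitive double.
    public static class PrimitiveEvaluator {
        public boolean iterate(Long value, double percentile) {
            return true;
        }
    }

    // Hypothetical variant taking a boxed Double, so null can be passed and checked.
    public static class BoxedEvaluator {
        public int seen = 0;

        public boolean iterate(Long value, Double percentile) {
            // Count only non-null input; a null pair (empty input) is ignored.
            if (value != null && percentile != null) {
                seen++;
            }
            return true;
        }
    }

    // Looks up iterate() reflectively and calls it with {null, null},
    // mirroring the arguments reported in the stack trace.
    public static String tryInvoke(Object target, Class<?>... paramTypes) {
        try {
            Method m = target.getClass().getMethod("iterate", paramTypes);
            Object result = m.invoke(target, new Object[] {null, null});
            return "returned " + result;
        } catch (IllegalArgumentException e) {
            // null cannot be unboxed into the primitive double parameter
            return "IllegalArgumentException";
        } catch (ReflectiveOperationException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryInvoke(new PrimitiveEvaluator(), Long.class, double.class));
        System.out.println(tryInvoke(new BoxedEvaluator(), Long.class, Double.class));
    }
}
```

If this is indeed the cause, the failure would not depend on the data itself,
only on iterate() being invoked with null arguments when a map task has no
rows to aggregate.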
