Re: Help to solve UDAF errors!

Abhishek Bhattacharya Tue, 12 Feb 2013 09:49:25 -0800

Hi Mark,

Thanks for the response!
UDAFPercentile.java has two terminate() methods because it handles two
different input types through its two inner classes, PercentileLongEvaluator
and PercentileLongArrayEvaluator.
My class handles only one input type: iterate() receives a double from a
single table column, and I want terminate() to return an
ArrayList<DoubleWritable>.
What is wrong in my class?
Also, is there any way for a UDF/UDAF/UDTF to process all the rows of a
table and output only a subset of them, chosen by an aggregate computed over
one column? My case is computing the top-n-percent of a column and then
outputting every row that passes the filter, with all of its other columns.
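
Separately from the ClassCastException, two things in the quoted evaluator
can be checked in plain Java. merge() combines the lists but does not carry
rowIndex or percent over, so an evaluator that only ever sees merge() calls
reaches terminate() with both still at 0. And the loop in terminate() calls
resList.remove(i) with an increasing i, which skips elements because the list
shifts left after every removal. Below is a minimal sketch of the intended
logic, not a Hive API: the class TopNPercentState and its method names are
hypothetical, it uses integer arithmetic for the row count (an assumption; the
original uses double arithmetic), and it is not wired to UDAFEvaluator.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper for illustration only; not part of Hive.
final class TopNPercentState {
    private final List<Double> values = new ArrayList<>();
    private int percent; // assumed to be the same constant on every row

    // Counterpart of iterate(): record one row value and the requested percent.
    void add(double value, int pct) {
        values.add(value);
        percent = pct;
    }

    // Counterpart of merge(): fold in another partial state, including its
    // percent, so the final step does not depend on which instance saw rows.
    void merge(TopNPercentState other) {
        if (!other.values.isEmpty()) {
            values.addAll(other.values);
            percent = other.percent;
        }
    }

    // Counterpart of terminate(): the largest percent% of values, ascending.
    List<Double> result() {
        List<Double> sorted = new ArrayList<>(values);
        Collections.sort(sorted);
        // Row count is taken from the merged list itself, rounded down.
        int keep = (int) ((long) percent * sorted.size() / 100);
        int from = Math.max(0, sorted.size() - keep);
        // subList copies the tail in one step instead of removing by index.
        return new ArrayList<>(sorted.subList(from, sorted.size()));
    }

    public static void main(String[] args) {
        TopNPercentState left = new TopNPercentState();
        TopNPercentState right = new TopNPercentState();
        for (int i = 1; i <= 5; i++) {
            left.add(i, 20);
        }
        for (int i = 6; i <= 10; i++) {
            right.add(i, 20);
        }
        TopNPercentState total = new TopNPercentState();
        total.merge(left);
        total.merge(right);
        System.out.println(total.result()); // prints [9.0, 10.0]
    }
}
```

With ten values and percent = 20, the sketch keeps the two largest values.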


Thanks,
Abhishek



On Sun, Feb 10, 2013 at 12:36 PM, Mark Grover
<grover.markgro...@gmail.com> wrote:

> Hi Abhishek,
> The code looks incomplete.
>
> See the comment at
> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/UDAF.java#L22
> Those are all the methods your UDAF class needs to implement but you seem
> to be missing them.
>
> Mark
>
> On Sat, Feb 9, 2013 at 11:08 PM, Abhishek Bhattacharya
> <abhat...@fiu.edu> wrote:
>
>> Thanks for the response.
>> The link to the code is:
>> https://github.com/Abhishek2301/Hive/blob/master/src/UDAFTopNPercent.java
>> Please let me know how to fix it!
>>
>> Thanks,
>> Abhishek
>>
>>
>>
>> On Fri, Feb 8, 2013 at 5:02 PM, Mark Grover
>> <grover.markgro...@gmail.com> wrote:
>>
>>> Abhishek,
>>> The code doesn't seem to be complete.
>>>
>>> Look at
>>> https://github.com/apache/hive/blob/trunk/ql/src/java/org/apache/hadoop/hive/ql/udf/UDAFPercentile.java
>>> for reference. It has two terminate()'s - one for UDAF and one for the
>>> Evaluator.
>>>
>>> Do you mind posting your complete code on github somewhere so it's
>>> easier to analyze?
>>>
>>> Mark
>>>
>>> On Fri, Feb 8, 2013 at 2:05 PM, Abhishek Bhattacharya
>>> <abhat...@fiu.edu> wrote:
>>>
>>>> Hi,
>>>>
>>>> I have implemented a simple UDAF for top-n-percent as follows:
>>>> import java.util.ArrayList;
>>>> import java.util.Collections;
>>>>
>>>> import org.apache.hadoop.hive.ql.exec.UDAF;
>>>> import org.apache.hadoop.hive.ql.exec.UDAFEvaluator;
>>>>
>>>> public class UDAFTopNPercent extends UDAF{
>>>>
>>>>     public static class Result {
>>>>         ArrayList<Double> list;
>>>>         double min;
>>>>     }
>>>>
>>>>     public class TopNPercentEvaluator implements UDAFEvaluator {
>>>>
>>>>         private Result res;
>>>>         private int rowIndex;
>>>>         private int percent;
>>>>
>>>>         public TopNPercentEvaluator() {
>>>>             super();
>>>>             res = new Result();
>>>>             init();
>>>>             rowIndex = 0;
>>>>         }
>>>>         @Override
>>>>         public void init() {
>>>>             res.list = new ArrayList<Double>();
>>>>             res.min = Double.MAX_VALUE;
>>>>         }
>>>>
>>>>         public boolean iterate(Double rowVal, int pct) {
>>>>             ArrayList<Double> resList = res.list;
>>>>             rowIndex++;
>>>>             resList.add(rowVal);
>>>>             percent = pct;
>>>>             return true;
>>>>         }
>>>>
>>>>         public ArrayList<Double> terminatePartial() {
>>>>             ArrayList<Double> resList = res.list;
>>>>             Collections.sort(resList);
>>>>             return resList;
>>>>         }
>>>>
>>>>         public boolean merge(ArrayList<Double> otherList) {
>>>>             ArrayList<Double> resList = res.list;
>>>>             resList.addAll(otherList);
>>>>             return true;
>>>>         }
>>>>
>>>>         public ArrayList<Double> terminate() {
>>>>             ArrayList<Double> resList = res.list;
>>>>             double num_rows = (double)percent/100.0*rowIndex;
>>>>             Collections.sort(resList);
>>>>             int lastIdx = resList.size()- (int) num_rows;
>>>>             if(lastIdx <= 0) {
>>>>                 return resList;
>>>>             }
>>>>             for(int i=0; i<lastIdx; i++) {
>>>>                 resList.remove(i);
>>>>             }
>>>>             return resList;
>>>>         }
>>>>     }
>>>>
>>>>     /**
>>>>      * @param args
>>>>      */
>>>>     public static void main(String[] args) {
>>>>         // TODO Auto-generated method stub
>>>>
>>>>     }
>>>>
>>>> }
>>>>
>>>> But it throws an error; the first few lines are:
>>>> FAILED: Hive Internal Error:
>>>> java.lang.ClassCastException(org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector
>>>> cannot be cast to
>>>> org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector)
>>>> java.lang.ClassCastException:
>>>> org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableFloatObjectInspector
>>>> cannot be cast to
>>>> org.apache.hadoop.hive.serde2.objectinspector.StructObjectInspector
>>>>         at
>>>> org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorConverters.getConverter(ObjectInspectorConverters.java:116)
>>>>         at
>>>> org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:300)
>>>>         at
>>>> org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge$GenericUDAFBridgeEvaluator.init(GenericUDAFBridge.java:129)
>>>>
>>>> Please help me to debug this!
>>>> Is the error caused by returning ArrayList<Double> from terminate()?
>>>> How should I return a List from UDAF?
>>>>
>>>> Thanks,
>>>> Abhishek
>>>>
>>>
>>>
>>
>>
>> --
>> Thanks and Regards,
>>
>> Abhishek Bhattacharya
>> PhD Computer Science
>> School of Computing and Information Sciences
>> Florida International University
>>
>
>


-- 
Thanks and Regards,

Abhishek Bhattacharya
PhD Computer Science
School of Computing and Information Sciences
Florida International University
