Hi Zheng,

Wouldn't the query you mentioned need a GROUP BY clause? I need the top x
customers per product id. Sorry, could you please explain?
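
To make the question concrete, here is a minimal plain-Java sketch of the per-group state that a three-argument iterate(int max, int attribute, int count) would keep. It assumes the engine creates one accumulator per group (for example one per product_id when the query groups by product_id), which would be why the grouping column no longer has to be passed in. The names TopXSketch, Entry and attributes() are made up for illustration and are not Hive API.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a UDAF evaluator; it does not extend any Hive class.
class TopXSketch {
    /** One (attribute, count) pair, e.g. (customer_id, product_count). */
    static final class Entry {
        final int attribute;
        final int count;

        Entry(int attribute, int count) {
            this.attribute = attribute;
            this.count = count;
        }
    }

    // State for a single group only (assumption: one instance per GROUP BY key).
    private final List<Entry> top = new ArrayList<>();

    /** Same parameter shape as iterate(int max, int attribute, int count). */
    public boolean iterate(int max, int attribute, int count) {
        top.add(new Entry(attribute, count));
        // Keep the list ordered by count, largest first; ties keep arrival order.
        top.sort(Comparator.comparingInt((Entry e) -> e.count).reversed());
        if (top.size() > max) {
            top.remove(top.size() - 1); // drop the smallest once over the limit
        }
        return true;
    }

    /** Attributes currently in the top list, highest count first. */
    public List<Integer> attributes() {
        List<Integer> result = new ArrayList<>();
        for (Entry e : top) {
            result.add(e.attribute);
        }
        return result;
    }

    public static void main(String[] args) {
        // Rows of one product_id group: (customer_id, product_count).
        TopXSketch group = new TopXSketch();
        group.iterate(2, 10, 3);
        group.iterate(2, 11, 7);
        group.iterate(2, 12, 5);
        System.out.println(group.attributes());
    }
}
```

With this shape the accumulator never sees product_id, so the grouping would have to come from the query itself.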

Thanks and Regards,
Sonal


On Thu, Feb 4, 2010 at 12:07 PM, Sonal Goyal <[email protected]> wrote:

> Hi Zheng,
>
> Thanks for your email and your feedback. I will try to change the code as
> suggested by you.
>
> Here is the output of describe:
>
> hive> describe products_bought;
> OK
>
> product_id    int
> customer_id    int
> product_count    int
>
>
> My function was working fine earlier with this table and iterate(int,
> int, int, int). Once I introduced the other iterate, it stopped working.
>
>
> Thanks and Regards,
> Sonal
>
>
>
> On Thu, Feb 4, 2010 at 11:37 AM, Zheng Shao <[email protected]> wrote:
>
>> Hi Sonal,
>>
>> 1. We usually move the group_by column out of the UDAF - just like we
>> do "SELECT key, sum(value) FROM table".
>>
>> I think you should write:
>>
>> SELECT customer_id, topx(2, product_id, product_count)
>> FROM products_bought
>>
>> and in topx:
>> public boolean iterate(int max, int attribute, int count).
>>
>>
>> 2. Can you run "describe products_bought"? Does product_count column
>> have type "int"?
>>
>> You might want to try removing the other iterate function to see
>> whether that solves the problem.
>>
>>
>> Zheng
>>
>>
>> On Wed, Feb 3, 2010 at 9:58 PM, Sonal Goyal <[email protected]>
>> wrote:
>> > Hi Zheng,
>> >
>> > My query is:
>> >
>> > select a.myTable.key, a.myTable.attribute, a.myTable.count from (select
>> > explode (t.pc) as myTable from (select topx(2, product_id, customer_id,
>> > product_count) as pc from (select product_id, customer_id, product_count
>> > from products_bought order by product_id, product_count desc) r ) t )a;
>> >
>> > My overloaded iterate methods are:
>> >
>> > public boolean iterate(int max, int groupBy, int attribute, int count)
>> >
>> > public boolean iterate(int max, int groupBy, int attribute, double count)
>> >
>> > Before overloading, my query was running fine. My table products_bought is:
>> > product_id int, customer_id int, product_count int
>> >
>> > And I get:
>> > FAILED: Error in semantic analysis: Ambiguous method for class
>> > org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> >
>> > The hive logs say:
>> > 2010-02-03 11:18:15,721 ERROR processors.DeleteResourceProcessor
>> > (SessionState.java:printError(255)) - Usage: delete [FILE|JAR|ARCHIVE]
>> > <value> [<value>]*
>> > 2010-02-03 11:22:14,663 ERROR ql.Driver (SessionState.java:printError(255))
>> > - FAILED: Error in semantic analysis: Ambiguous method for class
>> > org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> > org.apache.hadoop.hive.ql.exec.AmbiguousMethodException: Ambiguous method
>> > for class org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> >         at org.apache.hadoop.hive.ql.exec.DefaultUDAFEvaluatorResolver.getEvaluatorClass(DefaultUDAFEvaluatorResolver.java:83)
>> >         at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge.getEvaluator(GenericUDAFBridge.java:57)
>> >         at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:594)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:1882)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2270)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2821)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4543)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5058)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5587)
>> >         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:114)
>> >         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>> >         at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:370)
>> >         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:362)
>> >         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
>> >         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:200)
>> >         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:311)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >
>> >
>> >
>> > Thanks and Regards,
>> > Sonal
>> >
>> >
>> > On Thu, Feb 4, 2010 at 12:12 AM, Zheng Shao <[email protected]> wrote:
>> >>
>> >> Can you post the Hive query? What are the types of the parameters that
>> >> you passed to the function?
>> >>
>> >> Zheng
>> >>
>> >> On Wed, Feb 3, 2010 at 3:23 AM, Sonal Goyal <[email protected]>
>> >> wrote:
>> >> > Hi,
>> >> >
>> >> > I am writing a UDAF which takes in 4 parameters. I have 2 cases: one
>> >> > where all the parameters are ints, and a second where the last
>> >> > parameter is double. I wrote two evaluators for this, with iterate as
>> >> >
>> >> > public boolean iterate(int max, int groupBy, int attribute, int count)
>> >> >
>> >> > and
>> >> >
>> >> > public boolean iterate(int max, int groupBy, int attribute, double count)
>> >> >
>> >> > However, when I run a query, I get the exception:
>> >> > org.apache.hadoop.hive.ql.exec.AmbiguousMethodException: Ambiguous
>> >> > method for class org.apache.hadoop.hive.udaf.TopXPerGroup with
>> >> > [int, int, int, int]
>> >> >         at org.apache.hadoop.hive.ql.exec.DefaultUDAFEvaluatorResolver.getEvaluatorClass(DefaultUDAFEvaluatorResolver.java:83)
>> >> >         at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge.getEvaluator(GenericUDAFBridge.java:57)
>> >> >         at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:594)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:1882)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2270)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2821)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4543)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5058)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >> >         at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5587)
>> >> >         at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:114)
>> >> >         at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>> >> >         at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:370)
>> >> >         at org.apache.hadoop.hive.ql.Driver.run(Driver.java:362)
>> >> >         at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
>> >> >         at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:200)
>> >> >         at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:311)
>> >> >         at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >         at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >         at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >         at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >         at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >> >
>> >> > One option for me is to write a resolver, which I will do. But I just
>> >> > wanted to know if this is a bug in Hive whereby it is not able to get
>> >> > the right evaluator, or if this is a gap in my understanding.
>> >> >
>> >> > I look forward to hearing your views on this.
>> >> >
>> >> > Thanks and Regards,
>> >> > Sonal
>> >> >
>> >>
>> >>
>> >>
>> >> --
>> >> Yours,
>> >> Zheng
>> >
>> >
>>
>>
>>
>> --
>> Yours,
>> Zheng
>>
>
>
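
For anyone reading the thread later: the "Ambiguous method ... with [int, int, int, int]" error quoted above is consistent with a resolver that accepts implicit int-to-double conversion and does not prefer an exact match. The following is a toy model of that behaviour in plain Java, written under that assumption only. It is not the code of Hive's DefaultUDAFEvaluatorResolver, and the names OverloadAmbiguitySketch, accepts and candidates are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only; the conversion rule below is an assumption, not Hive's implementation.
class OverloadAmbiguitySketch {
    /** A parameter accepts an argument on exact match, or when int widens to double. */
    static boolean accepts(Class<?> param, Class<?> arg) {
        return param == arg || (param == double.class && arg == int.class);
    }

    /** Returns every signature whose parameters all accept the argument types. */
    static List<Class<?>[]> candidates(List<Class<?>[]> signatures, Class<?>[] args) {
        List<Class<?>[]> matches = new ArrayList<>();
        for (Class<?>[] sig : signatures) {
            if (sig.length != args.length) {
                continue;
            }
            boolean ok = true;
            for (int i = 0; i < sig.length; i++) {
                ok &= accepts(sig[i], args[i]);
            }
            if (ok) {
                matches.add(sig);
            }
        }
        return matches;
    }

    /** The two iterate signatures from the thread, as parameter-type arrays. */
    static List<Class<?>[]> iterateSignatures() {
        List<Class<?>[]> sigs = new ArrayList<>();
        sigs.add(new Class<?>[] {int.class, int.class, int.class, int.class});
        sigs.add(new Class<?>[] {int.class, int.class, int.class, double.class});
        return sigs;
    }

    public static void main(String[] args) {
        Class<?>[] allInts = {int.class, int.class, int.class, int.class};
        int n = candidates(iterateSignatures(), allInts).size();
        // More than one candidate means a resolver without tie-breaking cannot choose.
        System.out.println(n > 1 ? "ambiguous" : "resolved");
    }
}
```

Under this model the all-int call matches both overloads, while a call whose last argument is double matches only the second, which fits the observation that the query worked before the double overload was added.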
