Hi Zheng,

Wouldn't the query you mentioned need a GROUP BY clause? I need the top x customers per product id. Sorry, can you please explain?
Thanks and Regards,
Sonal

On Thu, Feb 4, 2010 at 12:07 PM, Sonal Goyal <[email protected]> wrote:

> Hi Zheng,
>
> Thanks for your email and your feedback. I will try to change the code as
> suggested by you.
>
> Here is the output of describe:
>
> hive> describe products_bought;
> OK
> product_id int
> customer_id int
> product_count int
>
> My function was working fine earlier with this table and iterate(int,
> int, int, int). Once I introduced the other iterate, it stopped working.
>
> Thanks and Regards,
> Sonal
>
> On Thu, Feb 4, 2010 at 11:37 AM, Zheng Shao <[email protected]> wrote:
>
>> Hi Sonal,
>>
>> 1. We usually move the group_by column out of the UDAF - just like we
>> do "SELECT key, sum(value) FROM table".
>>
>> I think you should write:
>>
>> SELECT customer_id, topx(2, product_id, product_count)
>> FROM products_bought
>>
>> and in topx:
>> public boolean iterate(int max, int attribute, int count).
>>
>> 2. Can you run "describe products_bought"? Does product_count column
>> have type "int"?
>>
>> You might want to try removing the other iterate function to see
>> whether that solves the problem.
>>
>> Zheng
>>
>> On Wed, Feb 3, 2010 at 9:58 PM, Sonal Goyal <[email protected]> wrote:
>> > Hi Zheng,
>> >
>> > My query is:
>> >
>> > select a.myTable.key, a.myTable.attribute, a.myTable.count from (select
>> > explode (t.pc) as myTable from (select topx(2, product_id, customer_id,
>> > product_count) as pc from (select product_id, customer_id, product_count
>> > from products_bought order by product_id, product_count desc) r ) t )a;
>> >
>> > My overloaded iterators are:
>> >
>> > public boolean iterate(int max, int groupBy, int attribute, int count)
>> >
>> > public boolean iterate(int max, int groupBy, int attribute, double count)
>> >
>> > Before overloading, my query was running fine. My table products_bought is:
>> > product_id int, customer_id int, product_count int
>> >
>> > And I get:
>> > FAILED: Error in semantic analysis: Ambiguous method for class
>> > org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> >
>> > The hive logs say:
>> > 2010-02-03 11:18:15,721 ERROR processors.DeleteResourceProcessor (SessionState.java:printError(255)) - Usage: delete [FILE|JAR|ARCHIVE] <value> [<value>]*
>> > 2010-02-03 11:22:14,663 ERROR ql.Driver (SessionState.java:printError(255)) - FAILED: Error in semantic analysis: Ambiguous method for class org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> > org.apache.hadoop.hive.ql.exec.AmbiguousMethodException: Ambiguous method for class org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> >     at org.apache.hadoop.hive.ql.exec.DefaultUDAFEvaluatorResolver.getEvaluatorClass(DefaultUDAFEvaluatorResolver.java:83)
>> >     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge.getEvaluator(GenericUDAFBridge.java:57)
>> >     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:594)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:1882)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2270)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2821)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4543)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5058)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5587)
>> >     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:114)
>> >     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>> >     at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:370)
>> >     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:362)
>> >     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
>> >     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:200)
>> >     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:311)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >
>> > Thanks and Regards,
>> > Sonal
>> >
>> > On Thu, Feb 4, 2010 at 12:12 AM, Zheng Shao <[email protected]> wrote:
>> >>
>> >> Can you post the Hive query? What are the types of the parameters that
>> >> you passed to the function?
>> >>
>> >> Zheng
>> >>
>> >> On Wed, Feb 3, 2010 at 3:23 AM, Sonal Goyal <[email protected]> wrote:
>> >> > Hi,
>> >> >
>> >> > I am writing a UDAF which takes in 4 parameters. I have 2 cases - one
>> >> > where all the parameters are ints, and second where the last parameter
>> >> > is double. I wrote two evaluators for this, with iterate as
>> >> >
>> >> > public boolean iterate(int max, int groupBy, int attribute, int count)
>> >> >
>> >> > and
>> >> >
>> >> > public boolean iterate(int max, int groupBy, int attribute, double count)
>> >> >
>> >> > However, when I run a query, I get the exception:
>> >> > org.apache.hadoop.hive.ql.exec.AmbiguousMethodException: Ambiguous method for class org.apache.hadoop.hive.udaf.TopXPerGroup with [int, int, int, int]
>> >> >     at org.apache.hadoop.hive.ql.exec.DefaultUDAFEvaluatorResolver.getEvaluatorClass(DefaultUDAFEvaluatorResolver.java:83)
>> >> >     at org.apache.hadoop.hive.ql.udf.generic.GenericUDAFBridge.getEvaluator(GenericUDAFBridge.java:57)
>> >> >     at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getGenericUDAFEvaluator(FunctionRegistry.java:594)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.getGenericUDAFEvaluator(SemanticAnalyzer.java:1882)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapGroupByOperator(SemanticAnalyzer.java:2270)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genGroupByPlanMapAggr1MR(SemanticAnalyzer.java:2821)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4543)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5058)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4999)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:5020)
>> >> >     at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5587)
>> >> >     at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:114)
>> >> >     at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:317)
>> >> >     at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:370)
>> >> >     at org.apache.hadoop.hive.ql.Driver.run(Driver.java:362)
>> >> >     at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:140)
>> >> >     at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:200)
>> >> >     at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:311)
>> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>> >> >     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>> >> >     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>> >> >     at java.lang.reflect.Method.invoke(Method.java:597)
>> >> >     at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
>> >> >
>> >> > One option for me is to write a resolver, which I will do. But I just
>> >> > wanted to know if this is a bug in Hive whereby it is not able to get
>> >> > the right evaluator, or if this is a gap in my understanding.
>> >> >
>> >> > I look forward to hearing your views on this.
>> >> >
>> >> > Thanks and Regards,
>> >> > Sonal
>> >>
>> >> --
>> >> Yours,
>> >> Zheng
>>
>> --
>> Yours,
>> Zheng
