Hey Ping, I just tried the same UDF/query but I am unable to reproduce that NPE. Which version of Hive are you using?
Cheers,
Paul

From: Ping Zhu [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 5:48 PM
To: [email protected]
Subject: Re: deploy simple UDF function

This problem still exists. My small test case is: I created a table string_table with one column of string type and inserted one record into it. I created another UDF, "udftest", which takes a Text argument and returns a boolean value. The query is "select * from string_table where udftest(col) = true;". It returns the error "FAILED: Unknown exception: null".

UDF function source code:

    package com.example;

    import org.apache.hadoop.hive.ql.exec.UDF;
    import org.apache.hadoop.io.Text;

    public final class UDFTest extends UDF {
      public boolean evaluate(final Text s) {
        if (s == null) {
          return false;
        }
        return true;
      }
    }

On Wed, Jul 21, 2010 at 5:20 PM, Paul Yang <[email protected]> wrote:

I did notice that if the where clause is not a Boolean expression, there is an exception thrown, e.g. SELECT key FROM src WHERE 1;

I filed a JIRA for this issue at: https://issues.apache.org/jira/browse/HIVE-1478

Glad that your query works now, but "where f(col) = true" should not cause an error, as the = operator returns a boolean value. So it's strange that you got an error... if you see this problem again, could you post a test case?

From: Ping Zhu [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 5:10 PM
To: [email protected]
Subject: Re: deploy simple UDF function

It was "where f(col) = true". Sorry for the typo.

Ping

On Wed, Jul 21, 2010 at 5:03 PM, Paul Yang <[email protected]> wrote:

Hold on, how did 'where f(col) is true' compile? I don't think "is true" is valid HQL. Can you post the full query?

From: Ping Zhu [mailto:[email protected]]
Sent: Wednesday, July 21, 2010 4:58 PM
To: [email protected]
Subject: Re: deploy simple UDF function

I figured out the source of the error: the UDF function (say, f) returns a boolean value. The where clause in the Hive query was "where f(col) is true)". I changed the where clause to "where f(col)", and then it works. I ran another contrived test by changing the return type of the UDF to int and changing the where clause in the Hive query to "where f(col)=1". It also works. Is this an issue/bug in the Hive compiler?

Ping

On Wed, Jul 21, 2010 at 3:55 PM, Ping Zhu <[email protected]> wrote:

I have tested this simple UDF function locally. The function itself is properly implemented.

On Wed, Jul 21, 2010 at 3:54 PM, Ping Zhu <[email protected]> wrote:

Hi,

I have a problem with calling a simple UDF function in a Hive query. I compiled the function and created a jar file on my local PC. Then the jar file was sent to a remote Hive cluster and deployed. When this UDF function is called in a Hive query, it returns the error "FAILED: Unknown exception: null".
I checked the Hive log file; the detailed error message is:

2010-07-21 15:45:33,590 ERROR ql.Driver (SessionState.java:printError(248)) - FAILED: Unknown exception: null
java.lang.NullPointerException
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFUtils$ConversionHelper.<init>(GenericUDFUtils.java:214)
    at org.apache.hadoop.hive.ql.udf.generic.GenericUDFBridge.initialize(GenericUDFBridge.java:140)
    at org.apache.hadoop.hive.ql.plan.exprNodeGenericFuncDesc.newInstance(exprNodeGenericFuncDesc.java:141)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getFuncExprNodeDesc(TypeCheckProcFactory.java:444)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.getXpathOrFuncExprNodeDesc(TypeCheckProcFactory.java:541)
    at org.apache.hadoop.hive.ql.parse.TypeCheckProcFactory$DefaultExprProcessor.process(TypeCheckProcFactory.java:634)
    at org.apache.hadoop.hive.ql.lib.DefaultRuleDispatcher.dispatch(DefaultRuleDispatcher.java:80)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.dispatch(DefaultGraphWalker.java:83)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.walk(DefaultGraphWalker.java:117)
    at org.apache.hadoop.hive.ql.lib.DefaultGraphWalker.startWalking(DefaultGraphWalker.java:95)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genExprNodeDesc(SemanticAnalyzer.java:5283)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:1005)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genFilterPlan(SemanticAnalyzer.java:991)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genBodyPlan(SemanticAnalyzer.java:4234)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genPlan(SemanticAnalyzer.java:4714)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:5203)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:105)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:275)
    at org.apache.hadoop.hive.ql.Driver.runCommand(Driver.java:320)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:312)
    at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:123)
    at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:181)
    at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:287)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:156)

The versions of Hive installed on my local PC and the remote Hive cluster are 0.6 and 0.5 respectively. I copied the corresponding jar files needed to compile the UDF function from the remote Hive cluster, but it still does not work. Any suggestions/comments will be highly appreciated.

Thanks and best regards,
Ping
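
A possible check, offered only as a sketch: the stack trace in the original message fails in GenericUDFBridge.initialize, inside the ConversionHelper constructor, while the query is still being compiled. Given the 0.6/0.5 mismatch mentioned in the thread, it may be worth confirming which evaluate signatures the compiled class actually exposes. The snippet below does that with plain Java reflection. The class name EvaluateSignatures and the stand-in SampleUdf (which takes a String rather than Text so that it compiles without any Hadoop or Hive jars) are hypothetical and not part of the thread.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

public class EvaluateSignatures {

    // Hypothetical stand-in for the UDF in this thread. It takes a String
    // instead of org.apache.hadoop.io.Text so the sketch needs no extra jars.
    public static final class SampleUdf {
        public boolean evaluate(final String s) {
            return s != null;
        }
    }

    // Returns a readable signature for every public method named "evaluate".
    public static List<String> describe(Class<?> cls) {
        List<String> signatures = new ArrayList<>();
        for (Method m : cls.getMethods()) { // getMethods() yields public methods only
            if (!m.getName().equals("evaluate")) {
                continue;
            }
            StringBuilder sb = new StringBuilder();
            sb.append(m.getReturnType().getName()).append(" evaluate(");
            Class<?>[] params = m.getParameterTypes();
            for (int i = 0; i < params.length; i++) {
                if (i > 0) {
                    sb.append(", ");
                }
                sb.append(params[i].getName());
            }
            sb.append(")");
            signatures.add(sb.toString());
        }
        return signatures;
    }

    public static void main(String[] args) throws Exception {
        // With no argument, inspect the stand-in class; otherwise load the
        // named class (it and its dependencies must be on the classpath).
        Class<?> cls = args.length > 0 ? Class.forName(args[0]) : SampleUdf.class;
        for (String signature : describe(cls)) {
            System.out.println(signature); // stand-in: boolean evaluate(java.lang.String)
        }
    }
}
```

To inspect the real UDF, the class name com.example.UDFTest could be passed as the first argument, with the UDF jar and the cluster's Hive and Hadoop jars on the classpath. This only reports what the class declares; it does not by itself explain the NullPointerException.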
