Re: Hive permanent functions are not available in Spark SQL
Hi Pala, Can you add the full stacktrace of the exception? For now, can you use create temporary function to workaround the issue? Thanks, Yin On Wed, Sep 30, 2015 at 11:01 AM, Pala M Muthaia < mchett...@rocketfuelinc.com.invalid> wrote: > +user list > > On Tue, Sep 29, 2015 at 3:43 PM, Pala M Muthaia < > mchett...@rocketfuelinc.com> wrote: > >> Hi, >> >> I am trying to use internal UDFs that we have added as permanent >> functions to Hive, from within Spark SQL query (using HiveContext), but i >> encounter NoSuchObjectException, i.e. the function could not be found. >> >> However, if i execute 'show functions' command in spark SQL, the >> permanent functions appear in the list. >> >> I am using Spark 1.4.1 with Hive 0.13.1. I tried to debug this by looking >> at the log and code, but it seems both the show functions command as well >> as udf query both go through essentially the same code path, but the former >> can see the UDF but the latter can't. >> >> Any ideas on how to debug/fix this? >> >> >> Thanks, >> pala >> > >
Re: Hive permanent functions are not available in Spark SQL
Thanks for getting back Yin. I have copied the stack below. The associated query is just this: "hc.sql("select murmurhash3('abc') from dual")". The UDF murmurhash3 is already available in our hive metastore. Regarding temporary function, can i create a temp function with existing Hive UDF code, not arbitrary scala code? Thanks. - 15/10/01 16:59:59 ERROR metastore.RetryingHMSHandler: MetaException(message:NoSuchObjectException(message:Function default.murmurhash3 does not exist)) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4613) at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4740) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) at com.sun.proxy.$Proxy30.get_function(Unknown Source) at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1721) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) at com.sun.proxy.$Proxy31.getFunction(Unknown Source) at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2662) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:579) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:645) at org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:652) at org.apache.spark.sql.hive.HiveFunctionRegistry.lookupFunction(hiveUdfs.scala:54) at org.apache.spark.sql.hive.HiveContext$$anon$3.org $apache$spark$sql$catalyst$analysis$OverrideFunctionRegistry$$super$lookupFunction(HiveContext.scala:376) at org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$$anonfun$lookupFunction$2.apply(FunctionRegistry.scala:44) at org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$$anonfun$lookupFunction$2.apply(FunctionRegistry.scala:44) at scala.Option.getOrElse(Option.scala:120) at org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$class.lookupFunction(FunctionRegistry.scala:44) at org.apache.spark.sql.hive.HiveContext$$anon$3.lookupFunction(HiveContext.scala:376) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$13$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:465) at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$13$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:463) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:222) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:222) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:221) at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:242) at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) at scala.collection.Iterator$class.foreach(Iterator.scala:727) at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103) at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47) at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273) at scala.collection.AbstractIterator.to(Iterator.scala:1157) at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265) at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157) at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252) at scala.collection.AbstractIterator.toArray(Iterator.scala:1157) at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:272) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:227) at org.apache.spark.sql.catalyst.plans.QueryPlan.org $apache$spark$sql$catalyst$plans$QueryPlan$$transformExpressionDown$1(QueryPlan.scala:75) at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$1$$anonfun$apply$1.apply(QueryPlan.scala:90) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) at
Re: Hive permanent functions are not available in Spark SQL
Yes. You can use create temporary function to create a function based on a Hive UDF ( https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL#LanguageManualDDL-Create/Drop/ReloadFunction ). Regarding the error, I think the problem is that starting from Spark 1.4, we have two separate Hive client, one for metastore side and one for execution side. For your case, the function's metadata is stored in metastore. But, when we do function look up, we are looking at the Hive in execution side. Can you file a jira? On Thu, Oct 1, 2015 at 2:04 PM, Pala M Muthaiawrote: > Thanks for getting back Yin. I have copied the stack below. The associated > query is just this: "hc.sql("select murmurhash3('abc') from dual")". The > UDF murmurhash3 is already available in our hive metastore. > > Regarding temporary function, can i create a temp function with existing > Hive UDF code, not arbitrary scala code? > > Thanks. > > - > 15/10/01 16:59:59 ERROR metastore.RetryingHMSHandler: > MetaException(message:NoSuchObjectException(message:Function > default.murmurhash3 does not exist)) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newMetaException(HiveMetaStore.java:4613) > at > org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.get_function(HiveMetaStore.java:4740) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:105) > at com.sun.proxy.$Proxy30.get_function(Unknown Source) > at > org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getFunction(HiveMetaStoreClient.java:1721) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:89) > at com.sun.proxy.$Proxy31.getFunction(Unknown Source) > at org.apache.hadoop.hive.ql.metadata.Hive.getFunction(Hive.java:2662) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfoFromMetastore(FunctionRegistry.java:546) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.getQualifiedFunctionInfo(FunctionRegistry.java:579) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:645) > at > org.apache.hadoop.hive.ql.exec.FunctionRegistry.getFunctionInfo(FunctionRegistry.java:652) > at > org.apache.spark.sql.hive.HiveFunctionRegistry.lookupFunction(hiveUdfs.scala:54) > at org.apache.spark.sql.hive.HiveContext$$anon$3.org > $apache$spark$sql$catalyst$analysis$OverrideFunctionRegistry$$super$lookupFunction(HiveContext.scala:376) > at > org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$$anonfun$lookupFunction$2.apply(FunctionRegistry.scala:44) > at > org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$$anonfun$lookupFunction$2.apply(FunctionRegistry.scala:44) > at scala.Option.getOrElse(Option.scala:120) > at > org.apache.spark.sql.catalyst.analysis.OverrideFunctionRegistry$class.lookupFunction(FunctionRegistry.scala:44) > at > org.apache.spark.sql.hive.HiveContext$$anon$3.lookupFunction(HiveContext.scala:376) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$13$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:465) > at > org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$13$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:463) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:222) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:222) > at > org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51) > at > org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:221) > at > org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:242) > at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) > at scala.collection.Iterator$class.foreach(Iterator.scala:727) > at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) > at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48) > at > scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103) > at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47) > at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273) > at scala.collection.AbstractIterator.to(Iterator.scala:1157) > at > scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265) > at
Re: Hive permanent functions are not available in Spark SQL
+user list On Tue, Sep 29, 2015 at 3:43 PM, Pala M Muthaiawrote: > Hi, > > I am trying to use internal UDFs that we have added as permanent functions > to Hive, from within Spark SQL query (using HiveContext), but i encounter > NoSuchObjectException, i.e. the function could not be found. > > However, if i execute 'show functions' command in spark SQL, the permanent > functions appear in the list. > > I am using Spark 1.4.1 with Hive 0.13.1. I tried to debug this by looking > at the log and code, but it seems both the show functions command as well > as udf query both go through essentially the same code path, but the former > can see the UDF but the latter can't. > > Any ideas on how to debug/fix this? > > > Thanks, > pala >