[ https://issues.apache.org/jira/browse/SPARK-12051?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon resolved SPARK-12051.
----------------------------------
Resolution: Duplicate
I am resolving this as a duplicate per your comment in
https://issues.apache.org/jira/browse/SPARK-11609?focusedCommentId=15081761&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15081761
Please reopen this if I misunderstood.
> Can't register UDF from Hive thrift server
> ------------------------------------------
>
> Key: SPARK-12051
> URL: https://issues.apache.org/jira/browse/SPARK-12051
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.5.0
> Reporter: Alex Liu
>
> Start thriftserver, then from beeline
> {code}
> 0: jdbc:hive2://localhost:10000> create temporary function c_to_string as 'org.apache.hadoop.hive.cassandra.ql.udf.UDFCassandraBinaryToString';
> +---------+--+
> | result |
> +---------+--+
> +---------+--+
> No rows selected (0.483 seconds)
> 0: jdbc:hive2://localhost:10000> select c_to_string(c4, 'time') from test_table2;
> Error: org.apache.spark.sql.AnalysisException: undefined function c_to_string; line 1 pos 23 (state=,code=0)
> {code}
> The log shows
> {code}
> OK
> ERROR 2015-11-30 08:29:37 org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation: Error executing query, currentState RUNNING,
> org.apache.spark.sql.AnalysisException: undefined function c_to_string; line 1 pos 23
> at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58) ~[spark-hive_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2$$anonfun$1.apply(hiveUDFs.scala:58) ~[spark-hive_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.Option.getOrElse(Option.scala:120) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:57) ~[spark-hive_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.hive.HiveFunctionRegistry$$anonfun$lookupFunction$2.apply(hiveUDFs.scala:53) ~[spark-hive_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.util.Try.getOrElse(Try.scala:77) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.hive.HiveFunctionRegistry.lookupFunction(hiveUDFs.scala:53) ~[spark-hive_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$23.apply(Analyzer.scala:490) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5$$anonfun$applyOrElse$23.apply(Analyzer.scala:490) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.package$.withPosition(package.scala:48) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:489) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10$$anonfun$applyOrElse$5.applyOrElse(Analyzer.scala:486) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$3.apply(TreeNode.scala:227) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:226) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:232) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$transformDown$1.apply(TreeNode.scala:232) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:249) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) ~[scala-library-2.10.5.jar:na]
> at scala.collection.Iterator$class.foreach(Iterator.scala:727) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.to(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildren(TreeNode.scala:279) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:232) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionDown$1(QueryPlan.scala:76) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:86) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1$1.apply(QueryPlan.scala:90) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:244) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableLike$class.map(TraversableLike.scala:244) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractTraversable.map(Traversable.scala:105) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.catalyst.plans.QueryPlan.org$apache$spark$sql$catalyst$plans$QueryPlan$$recursiveTransform$1(QueryPlan.scala:90) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.QueryPlan$$anonfun$1.apply(QueryPlan.scala:94) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.collection.Iterator$$anon$11.next(Iterator.scala:328) ~[scala-library-2.10.5.jar:na]
> at scala.collection.Iterator$class.foreach(Iterator.scala:727) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.foreach(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103) ~[scala-library-2.10.5.jar:na]
> at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.to(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252) ~[scala-library-2.10.5.jar:na]
> at scala.collection.AbstractIterator.toArray(Iterator.scala:1157) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressionsDown(QueryPlan.scala:94) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.QueryPlan.transformExpressions(QueryPlan.scala:65) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10.applyOrElse(Analyzer.scala:486) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$$anonfun$apply$10.applyOrElse(Analyzer.scala:484) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveOperators$1.apply(LogicalPlan.scala:57) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:51) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperators(LogicalPlan.scala:56) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:484) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveFunctions$.apply(Analyzer.scala:483) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:83) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1$$anonfun$apply$1.apply(RuleExecutor.scala:80) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:111) ~[scala-library-2.10.5.jar:na]
> at scala.collection.immutable.List.foldLeft(List.scala:84) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:80) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$execute$1.apply(RuleExecutor.scala:72) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at scala.collection.immutable.List.foreach(List.scala:318) ~[scala-library-2.10.5.jar:na]
> at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:72) ~[spark-catalyst_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:910) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:910) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.SQLContext$QueryExecution.assertAnalyzed(SQLContext.scala:908) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.DataFrame.<init>(DataFrame.scala:132) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.DataFrame$.apply(DataFrame.scala:51) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:719) ~[spark-sql_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:224) ~[spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) [spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_66]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_66]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1125) [hadoop-core-1.0.4.18.jar:na]
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:182) [spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_66]
> at java.lang.Thread.run(Thread.java:745) [na:1.8.0_66]
> ERROR 2015-11-30 08:29:37 org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation: Error running hive query:
> org.apache.hive.service.cli.HiveSQLException: org.apache.spark.sql.AnalysisException: undefined function c_to_string; line 1 pos 23
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation.runInternal(SparkExecuteStatementOperation.scala:259) ~[spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1$$anon$2.run(SparkExecuteStatementOperation.scala:171) ~[spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at java.security.AccessController.doPrivileged(Native Method) [na:1.8.0_66]
> at javax.security.auth.Subject.doAs(Subject.java:422) [na:1.8.0_66]
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1125) [hadoop-core-1.0.4.18.jar:na]
> at org.apache.spark.sql.hive.thriftserver.SparkExecuteStatementOperation$$anon$1.run(SparkExecuteStatementOperation.scala:182) [spark-hive-thriftserver_2.10-1.5.0.3.jar:1.5.0.3]
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_66]
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) [na:1.8.0_66]
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_66]
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(Threa
> {code}
> Spark SQL implements its own UDF registry. Beeline sends the UDF
> registration command to the thrift server, which registers the function in
> Hive's UDF registry instead of Spark SQL's, so the subsequent lookup fails.
> We need a custom command to register UDFs in Spark SQL's own registry.
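> For comparison, here is a minimal sketch of registering a function through
> Spark SQL's own registry from inside an application. Assumptions: Spark 1.5
> with its usual HiveContext setup, and the lambda body is a made-up
> placeholder for the real UDFCassandraBinaryToString logic:
> {code}
> // Hypothetical sketch: register via sqlContext.udf, which writes to Spark
> // SQL's FunctionRegistry rather than Hive's.
> import org.apache.spark.{SparkConf, SparkContext}
> import org.apache.spark.sql.hive.HiveContext
>
> val sc = new SparkContext(new SparkConf().setAppName("udf-registry-sketch"))
> val sqlContext = new HiveContext(sc)
>
> // Placeholder standing in for the Cassandra binary-to-string conversion.
> sqlContext.udf.register("c_to_string", (bytes: Array[Byte]) => new String(bytes))
>
> // Resolves, because HiveFunctionRegistry.lookupFunction consults Spark SQL's
> // registry before falling back to Hive's (the Try.getOrElse frames above).
> sqlContext.sql("SELECT c_to_string(c4) FROM test_table2").show()
> {code}
> This only helps inside an application, though; a beeline session has no
> handle on sqlContext.udf, which is why a dedicated registration command is
> needed.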