[ 
https://issues.apache.org/jira/browse/SPARK-2569?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Armbrust resolved SPARK-2569.
-------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.1.0

> Customized UDFs in hive not running with Spark SQL
> --------------------------------------------------
>
>                 Key: SPARK-2569
>                 URL: https://issues.apache.org/jira/browse/SPARK-2569
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.0.0
>         Environment: Linux or Mac; Hive 0.9.0 and Hive 0.13.0 with Hadoop 
> 1.0.4, Scala 2.10.3, Spark 1.0.0
>            Reporter: jacky hung
>            Assignee: Michael Armbrust
>            Priority: Critical
>             Fix For: 1.1.0
>
>
> Start spark-shell and initialize it (create a hiveContext, import ._, etc.; 
> make sure the jar containing the UDFs is on the classpath).
> Running hql("CREATE TEMPORARY FUNCTION t_ts AS 'udf.Timestamp'") succeeds.
> I then ran
> hql("select t_ts(time) from data_common where xxxx limit 1").collect().foreach(println),
> which failed with a NullPointerException.
> We discussed it on the mailing list:
> http://apache-spark-user-list.1001560.n3.nabble.com/run-sparksql-hiveudf-error-throw-NPE-td8888.html#a9006
> java.lang.NullPointerException
>     org.apache.spark.sql.hive.HiveFunctionFactory$class.getFunctionClass(hiveUdfs.scala:117)
>     org.apache.spark.sql.hive.HiveUdf.getFunctionClass(hiveUdfs.scala:157)
>     org.apache.spark.sql.hive.HiveFunctionFactory$class.createFunction(hiveUdfs.scala:119)
>     org.apache.spark.sql.hive.HiveUdf.createFunction(hiveUdfs.scala:157)
>     org.apache.spark.sql.hive.HiveUdf.function$lzycompute(hiveUdfs.scala:170)
>     org.apache.spark.sql.hive.HiveUdf.function(hiveUdfs.scala:170)
>     org.apache.spark.sql.hive.HiveSimpleUdf.method$lzycompute(hiveUdfs.scala:181)
>     org.apache.spark.sql.hive.HiveSimpleUdf.method(hiveUdfs.scala:180)
>     org.apache.spark.sql.hive.HiveSimpleUdf.wrappers$lzycompute(hiveUdfs.scala:186)
>     org.apache.spark.sql.hive.HiveSimpleUdf.wrappers(hiveUdfs.scala:186)
>     org.apache.spark.sql.hive.HiveSimpleUdf.eval(hiveUdfs.scala:220)
>     org.apache.spark.sql.catalyst.expressions.MutableProjection.apply(Projection.scala:64)
>     org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:160)
>     org.apache.spark.sql.execution.Aggregate$$anonfun$execute$1$$anonfun$7.apply(Aggregate.scala:153)
>     org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:580)
>     org.apache.spark.rdd.RDD$$anonfun$12.apply(RDD.scala:580)
>     org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)
>     org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:261)
>     org.apache.spark.rdd.RDD.iterator(RDD.scala:228)
>     org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:35)



--
This message was sent by Atlassian JIRA
(v6.2#6252)
