Hive UDFs are only applicable to a HiveContext and its subclass instances. Is CassandraAwareSQLContext a direct subclass of HiveContext, or of SQLContext?
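One quick way to answer that question is to inspect the superclass chain at runtime. The sketch below uses hypothetical stand-in classes (SqlContextLike, HiveContextLike, CassandraAwareLike are illustrative names only, not the real Spark or Calliope classes) so it runs without a Spark dependency; if the connector's context extends the plain SQLContext rather than HiveContext, Hive UDF resolution is not inherited:

```scala
// Hypothetical stand-ins for SQLContext / HiveContext / CassandraAwareSQLContext,
// used only to illustrate the subclass check; the real classes live in
// org.apache.spark.sql and the Calliope connector.
class SqlContextLike
class HiveContextLike extends SqlContextLike
class CassandraAwareLike extends SqlContextLike // extends the base, not the Hive variant

// Name of the direct superclass of a given class.
def directSuperclass(c: Class[_]): String = c.getSuperclass.getSimpleName
```

If directSuperclass on the real context class yields the SQLContext name rather than the HiveContext name, Hive UDFs such as from_unixtime will not be registered for it.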
From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 5:10 PM
To: Cheng, Hao
Cc: user@spark.apache.org
Subject: Re: Supporting Hive features in Spark SQL Thrift JDBC server

val sc: SparkContext = new SparkContext(conf)
val sqlCassContext = new CassandraAwareSQLContext(sc)  // I used the Calliope Cassandra Spark connector
val rdd: SchemaRDD = sqlCassContext.sql("select * from db.profile")
rdd.cache
rdd.registerTempTable("profile")
rdd.first  // enforce caching
val q = "select from_unixtime(floor(createdAt/1000)) from profile where sampling_bucket=0"
val rdd2 = rdd.sqlContext.sql(q)
println("Result: " + rdd2.first)

And I get the following errors:

Exception in thread "main" org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved attributes: 'from_unixtime('floor(('createdAt / 1000))) AS c0#7, tree:
Project ['from_unixtime('floor(('createdAt / 1000))) AS c0#7]
 Filter (sampling_bucket#10 = 0)
  Subquery profile
   Project [company#8,bucket#9,sampling_bucket#10,profileid#11,createdat#12L,modifiedat#13L,version#14]
    CassandraRelation localhost, 9042, 9160, normaldb_sampling, profile, org.apache.spark.sql.CassandraAwareSQLContext@778b692d, None, None, false, Some(Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:72)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$apply$1.applyOrElse(Analyzer.scala:70)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:165)
    at org.apache.spark.sql.catalyst.trees.TreeNode$$anonfun$4.apply(TreeNode.scala:183)
    at scala.collection.Iterator$$anon$11.next(Iterator.scala:328)
    at scala.collection.Iterator$class.foreach(Iterator.scala:727)
    at scala.collection.AbstractIterator.foreach(Iterator.scala:1157)
    at scala.collection.generic.Growable$class.$plus$plus$eq(Growable.scala:48)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:103)
    at scala.collection.mutable.ArrayBuffer.$plus$plus$eq(ArrayBuffer.scala:47)
    at scala.collection.TraversableOnce$class.to(TraversableOnce.scala:273)
    at scala.collection.AbstractIterator.to(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toBuffer(TraversableOnce.scala:265)
    at scala.collection.AbstractIterator.toBuffer(Iterator.scala:1157)
    at scala.collection.TraversableOnce$class.toArray(TraversableOnce.scala:252)
    at scala.collection.AbstractIterator.toArray(Iterator.scala:1157)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformChildrenDown(TreeNode.scala:212)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:168)
    at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:156)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:70)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:68)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)
    at scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)
    at scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)
    at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)
    at scala.collection.immutable.List.foreach(List.scala:318)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:402)
    at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:402)
    at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:403)
    at org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:403)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:407)
    at org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:405)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:411)
    at org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:411)
    at org.apache.spark.sql.SchemaRDD.collect(SchemaRDD.scala:438)
    at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:440)
    at org.apache.spark.sql.SchemaRDD.take(SchemaRDD.scala:103)
    at org.apache.spark.rdd.RDD.first(RDD.scala:1091)
    at boot.SQLDemo$.main(SQLDemo.scala:65)  // my code
    at boot.SQLDemo.main(SQLDemo.scala)  // my code

On Tue, Mar 3, 2015 at 8:57 AM, Cheng, Hao <hao.ch...@intel.com> wrote:

Can you provide the detailed failure call stack?

From: shahab [mailto:shahab.mok...@gmail.com]
Sent: Tuesday, March 3, 2015 3:52 PM
To: user@spark.apache.org
Subject: Supporting Hive features in Spark SQL Thrift JDBC server

Hi,

According to the Spark SQL documentation, "...Spark SQL supports the vast majority of Hive features, such as User Defined Functions (UDFs)", and one of these UDFs is the current_date() function, which should therefore be supported. However, I get an error when I use this UDF in my SQL query. A couple of other UDFs cause similar errors.

Am I missing something in my JDBC server?

/Shahab
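A side note on the failing query: from_unixtime(floor(createdAt/1000)) treats createdAt as epoch milliseconds. Where the Hive UDF is not registered, the same conversion can be sketched in plain Scala. fromUnixtime below is a hypothetical helper, not a Spark API; it formats in UTC, whereas Hive's from_unixtime uses the session time zone:

```scala
import java.time.{Instant, ZoneOffset}
import java.time.format.DateTimeFormatter

// Hypothetical helper replicating from_unixtime(floor(createdAt / 1000)):
// createdAt is assumed to be epoch milliseconds; output uses Hive's default
// "yyyy-MM-dd HH:mm:ss" pattern, but in UTC rather than the session time zone.
def fromUnixtime(createdAtMillis: Long): String = {
  val seconds = createdAtMillis / 1000L // floor(createdAt / 1000) for non-negative input
  DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss")
    .withZone(ZoneOffset.UTC)
    .format(Instant.ofEpochSecond(seconds))
}

// fromUnixtime(0L) → "1970-01-01 00:00:00"
```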