Re: LATERAL VIEW explode issue

2015-05-20 Thread kiran mavatoor
Hi Yana,
I was creating the context in the program with new SQLContext(sc). That caused the problem when I submitted the job using spark-submit, whereas when I ran the same program in spark-shell the default context is a HiveContext (it seems), so everything worked there. That was the source of the confusion.
As a solution, I called new HiveContext(sc) instead of SQLContext.
Cheers,
Kiran.
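For reference, a minimal sketch of that fix as a spark-submit job (names and data are illustrative; the UDF body is a stand-in, since the real implementation isn't shown in this thread). LATERAL VIEW is HiveQL, so it needs the HiveContext parser; Spark 1.3's plain SQLContext parser rejects it:

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

// Illustrative row type; the real schema of "locations" isn't shown here.
case class Location(id: Int, countries: String)

object LateralViewJob {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("lateral-view-job"))

    // HiveContext brings the HiveQL parser, which understands LATERAL VIEW;
    // a plain new SQLContext(sc) rejects the same statement at parse time.
    val hiveContext = new HiveContext(sc)
    import hiveContext.implicits._

    sc.parallelize(Seq(Location(1, """{"IN":"1","US":"2"}""")))
      .toDF().registerTempTable("locations")

    // Stand-in for the real UDF: the actual JSON parsing isn't shown in
    // the thread, so this just wraps the input in a one-entry map.
    hiveContext.udf.register("jsonStringToMapUdf",
      (json: String) => Map("raw" -> json))

    hiveContext.sql(
      "SELECT id, mapKey FROM locations LATERAL VIEW " +
      "explode(map_keys(jsonStringToMapUdf(countries))) countries AS mapKey")
      .collect().foreach(println)
  }
}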


On Wednesday, May 20, 2015 6:38 PM, yana wrote:

Just a guess, but are you using HiveContext in one case vs. SQLContext in another? You don't show a stack trace, but this looks like a parser error, which would make me guess a different context, or a different Spark version on the cluster you are submitting to...


Original message
From: kiran mavatoor
Date: 05/20/2015 5:57 AM (GMT-05:00)
To: User
Subject: LATERAL VIEW explode issue
Hi,
When I use "LATERAL VIEW explode" on the registered temp table in spark shell, 
it works.  But when I use the same in spark-submit (as jar file) it is not 
working. its giving error -  "failure: ``union'' expected but identifier VIEW 
found"
sql statement i am using is
SELECT id,mapKey FROM locations LATERAL VIEW 
explode(map_keys(jsonStringToMapUdf(countries))) countries AS mapKey
I registered "jsonStringToMapUdf" as my sql function.
ThanksKiran9008099770  
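A sketch of how such a UDF might be registered (assuming sqlContext is an existing SQLContext or HiveContext; the parsing below is a naive stand-in, and a real implementation would use a JSON library):

// Registers a function usable from SQL as jsonStringToMapUdf(col).
// The hand-rolled parsing is only a placeholder for real JSON handling.
sqlContext.udf.register("jsonStringToMapUdf", (json: String) =>
  json.stripPrefix("{").stripSuffix("}")
      .split(",")
      .map(_.split(":").map(_.trim.stripPrefix("\"").stripSuffix("\"")))
      .collect { case Array(k, v) => k -> v }
      .toMap)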

  


example code for current date in spark sql

2015-05-05 Thread kiran mavatoor
Hi,
In Hive, I use unix_timestamp() as 'update_on' to insert the current date into the 'update_on' column of a table. Now I am converting this to Spark SQL. Please suggest example code for inserting the current date and time into a table column using Spark SQL.
Cheers,
Kiran.
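One way to do this, as a sketch (assuming a HiveContext, so Hive's unix_timestamp() and from_unixtime() built-ins are available; the table and column names are made up):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object CurrentDateExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("current-date"))
    val hiveContext = new HiveContext(sc)

    // unix_timestamp() returns the current time in epoch seconds, just as
    // in Hive; from_unixtime() renders it as 'yyyy-MM-dd HH:mm:ss'.
    hiveContext.sql(
      "INSERT INTO TABLE target_table " +
      "SELECT id, from_unixtime(unix_timestamp()) AS update_on " +
      "FROM source_table")
  }
}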

spark sql LEFT OUTER JOIN java.lang.ClassCastException

2015-04-27 Thread kiran mavatoor
Hi There,
I am using a Spark SQL LEFT OUTER JOIN query.
The SQL query is:
scala> val test = sqlContext.sql("SELECT e.departmentID FROM employee e LEFT OUTER JOIN department d ON d.departmentId = e.departmentId").toDF()
In Spark 1.3.1 it works fine, but a build from the latest master gives the error below:
15/04/27 23:02:49 ERROR Executor: Exception in task 4.0 in stage 67.0 (TID 118)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 4.0 in stage 67.0 (TID 118) on executor localhost: java.lang.ClassCastException (null) [duplicate 1]
15/04/27 23:02:49 ERROR Executor: Exception in task 2.0 in stage 67.0 (TID 116)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 2.0 in stage 67.0 (TID 116) on executor localhost: java.lang.ClassCastException (null) [duplicate 2]
15/04/27 23:02:49 ERROR Executor: Exception in task 3.0 in stage 67.0 (TID 117)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 3.0 in stage 67.0 (TID 117) on executor localhost: java.lang.ClassCastException (null) [duplicate 3]
15/04/27 23:02:49 ERROR Executor: Exception in task 0.0 in stage 66.0 (TID 112)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 0.0 in stage 66.0 (TID 112) on executor localhost: java.lang.ClassCastException (null) [duplicate 1]
15/04/27 23:02:49 INFO TaskSchedulerImpl: Removed TaskSet 66.0, whose tasks have all completed, from pool
15/04/27 23:02:49 ERROR Executor: Exception in task 5.0 in stage 67.0 (TID 119)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 5.0 in stage 67.0 (TID 119) on executor localhost: java.lang.ClassCastException (null) [duplicate 4]
15/04/27 23:02:49 ERROR Executor: Exception in task 0.0 in stage 67.0 (TID 114)
java.lang.ClassCastException
15/04/27 23:02:49 INFO TaskSetManager: Lost task 0.0 in stage 67.0 (TID 114) on executor localhost: java.lang.ClassCastException (null) [duplicate 5]
15/04/27 23:02:49 INFO TaskSchedulerImpl: Removed TaskSet 67.0, whose tasks have all completed, from pool
org.apache.spark.SparkException: Job aborted due to stage failure: Task 1 in stage 66.0 failed 1 times, most recent failure: Lost task 1.0 in stage 66.0 (TID 113, localhost): java.lang.ClassCastException
Driver stacktrace:
  at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1241)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1232)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1231)
  at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47)
  at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1231)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:705)
  at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:705)
  at scala.Option.foreach(Option.scala:236)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:705)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1424)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1385)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Thanks,
Kiran.
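For context, a minimal sketch of the kind of setup that exercises this query (the case classes, table contents, and names are illustrative, not from the original report):

import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Illustrative row types; the original schemas aren't shown in the report.
case class Employee(employeeId: Int, departmentId: Int)
case class Department(departmentId: Int, name: String)

object LeftOuterJoinRepro {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("left-outer-join"))
    val sqlContext = new SQLContext(sc)
    import sqlContext.implicits._

    // Register both sides of the join as temp tables.
    sc.parallelize(Seq(Employee(1, 10), Employee(2, 20)))
      .toDF().registerTempTable("employee")
    sc.parallelize(Seq(Department(10, "sales")))
      .toDF().registerTempTable("department")

    // The query from the report; employees with no matching department
    // still appear, with NULLs on the department side.
    sqlContext.sql(
      "SELECT e.departmentId FROM employee e " +
      "LEFT OUTER JOIN department d ON d.departmentId = e.departmentId")
      .collect().foreach(println)
  }
}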