I have resolved the hanging issue below by using yarn-client mode, as follows:
spark-shell --master yarn --deploy-mode client --driver-class-path /home/hduser/jars/ojdbc6.jar

val channels = sqlContext.read.format("jdbc").options(
  Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
      "dbtable" -> "(select * from sh.channels where channel_id = 14)",
      "user" -> "sh",
      "password" -> "sh")).load
channels.show

But I am getting this error from channels.show:

16/02/12 16:03:37 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, rhes564, PROCESS_LOCAL, 1929 bytes)
16/02/12 16:03:37 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on rhes564:33141 (size: 2.7 KB, free: 1589.8 MB)
16/02/12 16:03:38 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, rhes564): java.sql.SQLException: No suitable driver found for jdbc:oracle:thin:@rhes564:1521:mydb
        at java.sql.DriverManager.getConnection(DriverManager.java:596)
        at java.sql.DriverManager.getConnection(DriverManager.java:187)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$getConnector$1.apply(JDBCRDD.scala:188)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anonfun$getConnector$1.apply(JDBCRDD.scala:181)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.<init>(JDBCRDD.scala:360)
        at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD.compute(JDBCRDD.scala:352)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
        at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:300)
        at org.apache.spark.rdd.RDD.iterator(RDD.scala:264)
        at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:66)
        at org.apache.spark.scheduler.Task.run(Task.scala:88)
        at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:214)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:724)
16/02/12 16:03:38 INFO TaskSetManager: Starting task 0.1 in stage 0.0 (TID 1, rhes564, PROCESS_LOCAL, 1929 bytes)
16/02/12 16:03:38 INFO TaskSetManager: Lost task 0.1 in stage 0.0 (TID 1) on executor rhes564: java.sql.SQLException (No suitable driver found for jdbc:oracle:thin:@rhes564:1521:mydb) [duplicate 1]
16/02/12 16:03:38 INFO TaskSetManager: Starting task 0.2 in stage 0.0 (TID 2, rhes564, PROCESS_LOCAL, 1929 bytes)
16/02/12 16:03:38 INFO TaskSetManager: Lost task 0.2 in stage 0.0 (TID 2) on executor rhes564: java.sql.SQLException (No suitable driver found for jdbc:oracle:thin:@rhes564:1521:mydb) [duplicate 2]
16/02/12 16:03:38 INFO TaskSetManager: Starting task 0.3 in stage 0.0 (TID 3, rhes564, PROCESS_LOCAL, 1929 bytes)
16/02/12 16:03:38 INFO TaskSetManager: Lost task 0.3 in stage 0.0 (TID 3) on executor rhes564: java.sql.SQLException (No suitable driver found for jdbc:oracle:thin:@rhes564:1521:mydb) [duplicate 3]
16/02/12 16:03:38 ERROR TaskSetManager: Task 0 in stage 0.0 failed 4 times; aborting job
16/02/12 16:03:38 INFO YarnScheduler: Removed TaskSet 0.0, whose tasks have all completed, from pool
16/02/12 16:03:38 INFO YarnScheduler: Cancelling stage 0
16/02/12 16:03:38 INFO DAGScheduler: ResultStage 0 (show at <console>:26) failed in 1.182 s
16/02/12 16:03:38 INFO DAGScheduler: Job 0 failed: show at <console>:26, took 1.316319 s
16/02/12 16:03:39 INFO SparkContext: Invoking stop() from shutdown hook
16/02/12 16:03:39 INFO SparkUI: Stopped Spark web UI at http://50.140.197.217:4040
16/02/12 16:03:39 INFO DAGScheduler: Stopping DAGScheduler
16/02/12 16:03:39 INFO YarnClientSchedulerBackend: Interrupting monitor thread
16/02/12 16:03:39 INFO YarnClientSchedulerBackend: Shutting down all executors
16/02/12 16:03:39 INFO YarnClientSchedulerBackend: Asking each executor to shut down
16/02/12 16:03:39 INFO YarnClientSchedulerBackend: Stopped

Dr Mich Talebzadeh

LinkedIn
https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
http://talebzadehmich.wordpress.com

NOTE: The information in this email is proprietary and confidential. This message is for the designated recipient only; if you are not the intended recipient, you should destroy it immediately. Any information in this message shall not be understood as given or endorsed by Peridale Technology Ltd, its subsidiaries or their employees, unless expressly so stated. It is the responsibility of the recipient to ensure that this email is virus free; therefore neither Peridale Technology Ltd, its subsidiaries nor their employees accept any responsibility.

From: Mich Talebzadeh [mailto:m...@peridale.co.uk]
Sent: 12 February 2016 10:45
To: user@spark.apache.org
Subject: Connection via JDBC to Oracle hangs after count call

Hi,

I use the following to connect to an Oracle DB from the Spark shell 1.5.2:

spark-shell --master spark://50.140.197.217:7077 --driver-class-path /home/hduser/jars/ojdbc6.jar

In Scala I do:

scala> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
sqlContext: org.apache.spark.sql.SQLContext = org.apache.spark.sql.SQLContext@f9d4387

scala> val channels = sqlContext.read.format("jdbc").options(
     |   Map("url" -> "jdbc:oracle:thin:@rhes564:1521:mydb",
     |       "dbtable" -> "(select * from sh.channels where channel_id = 14)",
     |       "user" -> "sh",
     |       "password" -> "xxxxxxx")).load
channels: org.apache.spark.sql.DataFrame = [CHANNEL_ID: decimal(0,-127), CHANNEL_DESC: string, CHANNEL_CLASS: string, CHANNEL_CLASS_ID: decimal(0,-127), CHANNEL_TOTAL: string, CHANNEL_TOTAL_ID: decimal(0,-127)]

scala> channels.count()

But the latter command keeps hanging.
Any ideas appreciated.

Thanks,

Mich Talebzadeh
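PS. For what it is worth, a common cause of the "No suitable driver found" error above is that --driver-class-path only puts ojdbc6.jar on the driver JVM's classpath, while the exception is thrown on the executors, which never see the jar. A sketch of an invocation that also ships the jar to the executors (untested; the jar path is taken from the commands above):

# Sketch (untested): --jars distributes ojdbc6.jar to the executors,
# whereas --driver-class-path covers only the driver JVM.
spark-shell --master yarn --deploy-mode client \
  --driver-class-path /home/hduser/jars/ojdbc6.jar \
  --jars /home/hduser/jars/ojdbc6.jar

If DriverManager still fails to find the driver, adding "driver" -> "oracle.jdbc.OracleDriver" to the options Map (assuming that is the driver class shipped in ojdbc6.jar) should make Spark's JDBC data source load the class explicitly rather than rely on DriverManager discovery.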