[
https://issues.apache.org/jira/browse/PHOENIX-5582?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Istvan Toth resolved PHOENIX-5582.
----------------------------------
Resolution: Cannot Reproduce
This looks more like a Spark configuration error.
If this is still an issue, please reopen the ticket.
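For reference, in cluster mode the Phoenix client classes must be on the
executor classpath as well, not only on the driver. A minimal sketch of one
way to do that from the application itself (the jar path is a placeholder;
substitute the client jar from your own Phoenix installation):
{code:scala}
import org.apache.spark.sql.SparkSession

// Sketch: spark.jars ships the listed jars to both the driver and the
// executors. The path below is an assumption about where the Phoenix
// client jar lives; adjust it for your environment.
val spark = SparkSession
  .builder()
  .appName("PhoenixDataSource")
  .config("spark.jars", "/opt/phoenix/phoenix-5.0.0-HBase-2.0-client.jar")
  .getOrCreate()
{code}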
> When we use Phoenix-Spark module to R/W phoenix data in Spark cluster mode,
> No suitable driver found exception will occur
> -------------------------------------------------------------------------------------------------------------------------
>
> Key: PHOENIX-5582
> URL: https://issues.apache.org/jira/browse/PHOENIX-5582
> Project: Phoenix
> Issue Type: Bug
> Affects Versions: 5.0.0
> Reporter: iteblog
> Priority: Minor
>
> When we use the Phoenix-Spark module to read and write Phoenix data in
> Spark cluster mode, a "No suitable driver found" exception occurs.
> Maven dependencies:
> {code:xml}
> <dependency>
>   <groupId>org.apache.phoenix</groupId>
>   <artifactId>phoenix-spark</artifactId>
>   <version>5.0.0-HBase-2.0</version>
> </dependency>
> <dependency>
>   <groupId>org.apache.phoenix</groupId>
>   <artifactId>phoenix-core</artifactId>
>   <version>5.0.0-HBase-2.0</version>
> </dependency>
> {code}
> My test code is as follows:
> {code:scala}
> package phoenix.datasource
>
> import org.apache.spark.sql.SparkSession
>
> object Test {
>   def main(args: Array[String]): Unit = {
>     val spark = SparkSession
>       .builder()
>       .appName("PhoenixDataSource")
>       .getOrCreate()
>
>     val zk = "test-master1-001.com,test-master2-001.com,test-master3-001.com:2181"
>
>     val df = spark.read.format("org.apache.phoenix.spark")
>       .option("table", "search_info_test")
>       .option("zkUrl", zk)
>       .load()
>
>     df.selectExpr("ID", "NAME").show(20, 100)
>   }
> }
> {code}
> If you run the above code in local mode, everything is fine. However, if
> you run it in cluster mode, the following exception occurs:
> {code}
> 19/11/21 11:20:17 ERROR PhoenixInputFormat: Failed to get the query plan with error [No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;]
> 19/11/21 11:20:17 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.lang.RuntimeException: java.sql.SQLException: No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:208)
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.createRecordReader(PhoenixInputFormat.java:76)
>     at org.apache.spark.rdd.NewHadoopRDD$$anon$1.liftedTree1$1(NewHadoopRDD.scala:197)
>     at org.apache.spark.rdd.NewHadoopRDD$$anon$1.<init>(NewHadoopRDD.scala:196)
>     at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:151)
>     at org.apache.spark.rdd.NewHadoopRDD.compute(NewHadoopRDD.scala:70)
>     at org.apache.phoenix.spark.PhoenixRDD.compute(PhoenixRDD.scala:64)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:288)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
>     at org.apache.spark.scheduler.Task.run(Task.scala:121)
>     at org.apache.spark.executor.Executor$TaskRunner$$anonfun$10.apply(Executor.scala:408)
>     at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1360)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:414)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1147)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:622)
>     at java.lang.Thread.run(Thread.java:834)
> Caused by: java.sql.SQLException: No suitable driver found for jdbc:phoenix:test-master1-001.com,test-master2-001.com,test-master3-001.com:2181:/hbase;
>     at java.sql.DriverManager.getConnection(DriverManager.java:699)
>     at java.sql.DriverManager.getConnection(DriverManager.java:217)
>     at org.apache.phoenix.mapreduce.util.ConnectionUtil.getConnection(ConnectionUtil.java:113)
>     at org.apache.phoenix.mapreduce.util.ConnectionUtil.getInputConnection(ConnectionUtil.java:58)
>     at org.apache.phoenix.mapreduce.PhoenixInputFormat.getQueryPlan(PhoenixInputFormat.java:180)
>     ... 43 more
> {code}
> The reason is that the {{org.apache.phoenix.spark.PhoenixRDD}} class
> registers the {{PhoenixDriver}} only on the driver side, not on the
> executors, which causes the above problem. Using the phoenix-spark module
> to write data to Phoenix in Spark cluster mode triggers the same exception.
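> One possible workaround (a sketch only, not a guaranteed fix) is to force
> {{PhoenixDriver}} to load on the executors before the read runs, since its
> static initializer registers it with {{java.sql.DriverManager}}:
> {code:scala}
> // Sketch: run a throwaway job so that PhoenixDriver's static initializer
> // executes on the executors and registers the driver with DriverManager.
> // The partition count (100) is an arbitrary assumption meant to exceed the
> // number of executor slots; it does not guarantee every executor is hit.
> spark.sparkContext
>   .parallelize(0 until 100, 100)
>   .foreachPartition { _ =>
>     Class.forName("org.apache.phoenix.jdbc.PhoenixDriver")
>   }
> {code}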
--
This message was sent by Atlassian Jira
(v8.20.10#820010)