[ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929072#comment-16929072 ]
Philippe ROSSIGNOL commented on SPARK-21179:
--------------------------------------------
For me, UseNativeQuery=0 works well with the Simba Spark JDBC driver.
So there is a misunderstanding in the Simba Spark documentation: it states
that 0 is the default value, but in reality the default value is 1, not 0!
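For reference, a minimal spark-shell sketch of that workaround, reusing the host, credentials, and driver class from the reproduction quoted below (adjust them to your environment); the only change from step 4 is UseNativeQuery=0:

// Same options as step 4 of the reproduction below, but with UseNativeQuery=0
// so the Simba driver rewrites the query Spark generates into HiveQL.
val hivespark = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=0")
  .option("driver", "com.simba.hive.jdbc41.HS2Driver")
  .option("dbtable", "wh2.hivespark")
  .option("user", "hdfs")
  .option("password", "hdfs")
  .load()
hivespark.show()  // should now print (1, USA) instead of throwing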
> Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused
> by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to
> int.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-21179
> URL: https://issues.apache.org/jira/browse/SPARK-21179
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell, SQL
> Affects Versions: 1.6.0, 2.0.0, 2.1.1
> Environment: OS: Linux
> HDP version 2.5.0.1-60
> Hive version: 1.2.1
> Spark version 2.0.0.2.5.0.1-60
> JDBC: Download the latest Hortonworks JDBC driver
> Reporter: Matthew Walton
> Priority: Major
> Labels: bulk-closed
>
> I'm trying to fetch data back in Spark SQL using a JDBC connection to Hive.
> Unfortunately, when I try to query data that resides in an INT column I get
> the following error:
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
> Steps to reproduce:
> 1) On Hive create a simple table with an INT column and insert some data (I
> used SQuirreL Client with the Hortonworks JDBC driver):
> create table wh2.hivespark (country_id int, country_name string)
> insert into wh2.hivespark values (1, 'USA')
> 2) Copy the Hortonworks Hive JDBC driver to the machine where you will run
> Spark Shell
> 3) Start Spark shell loading the Hortonworks Hive JDBC driver jar files
> ./spark-shell --jars /home/spark/jdbc/hortonworkshive/HiveJDBC41.jar,/home/spark/jdbc/hortonworkshive/TCLIServiceClient.jar,/home/spark/jdbc/hortonworkshive/commons-codec-1.3.jar,/home/spark/jdbc/hortonworkshive/commons-logging-1.1.1.jar,/home/spark/jdbc/hortonworkshive/hive_metastore.jar,/home/spark/jdbc/hortonworkshive/hive_service.jar,/home/spark/jdbc/hortonworkshive/httpclient-4.1.3.jar,/home/spark/jdbc/hortonworkshive/httpcore-4.1.3.jar,/home/spark/jdbc/hortonworkshive/libfb303-0.9.0.jar,/home/spark/jdbc/hortonworkshive/libthrift-0.9.0.jar,/home/spark/jdbc/hortonworkshive/log4j-1.2.14.jar,/home/spark/jdbc/hortonworkshive/ql.jar,/home/spark/jdbc/hortonworkshive/slf4j-api-1.5.11.jar,/home/spark/jdbc/hortonworkshive/slf4j-log4j12-1.5.11.jar,/home/spark/jdbc/hortonworkshive/zookeeper-3.4.6.jar
> 4) In Spark shell load the data from Hive using the JDBC driver
> val hivespark = spark.read.format("jdbc")
>   .options(Map(
>     "url" -> "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=1;user=hdfs;password=hdfs",
>     "dbtable" -> "wh2.hivespark"))
>   .option("driver", "com.simba.hive.jdbc41.HS2Driver")
>   .option("user", "hdfs")
>   .option("password", "hdfs")
>   .load()
> 5) In Spark shell try to display the data
> hivespark.show()
> At this point you should see the error:
> scala> hivespark.show()
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
>     at com.simba.hiveserver2.exceptions.ExceptionConverter.toSQLException(Unknown Source)
>     at com.simba.hiveserver2.utilities.conversion.TypeConverter.toInt(Unknown Source)
>     at com.simba.hiveserver2.jdbc.common.SForwardResultSet.getInt(Unknown Source)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.getNext(JDBCRDD.scala:437)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.hasNext(JDBCRDD.scala:535)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Note: I also tested this using a JDBC driver from Progress DataDirect and
> saw a similar error message, so this does not seem to be driver-specific.
> scala> hivespark.show()
> 17/06/22 12:07:59 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
> java.sql.SQLException: [DataDirect][Hive JDBC Driver]Value can not be converted to requested type.
> Also, if I query this table directly from the SQuirreL Client tool, there is
> no error.
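A plausible explanation for the driver-independent failure (my inference, not confirmed in this report): Spark's JDBC source quotes column names with double quotes in the SELECT it generates, while HiveQL treats a double-quoted token as a string literal, so every row returns the literal text "country_id", which cannot be converted to int. UseNativeQuery=0 works around this by letting the driver rewrite the query. Under that same assumption, another workaround is to register a custom JdbcDialect that quotes identifiers with backticks before loading the DataFrame; a minimal sketch:

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Sketch under the quoting assumption above: make Spark emit `country_id`
// instead of "country_id" in queries sent over hive2 JDBC URLs.
val hiveDialect = new JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:hive2")
  override def quoteIdentifier(colName: String): String = s"`$colName`"
}
JdbcDialects.registerDialect(hiveDialect)

After registering the dialect, the spark.read call from step 4 can be retried unchanged.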