[ https://issues.apache.org/jira/browse/SPARK-21179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16929072#comment-16929072 ]
Philippe ROSSIGNOL commented on SPARK-21179:
--------------------------------------------
For me, UseNativeQuery=0 works well with the Simba Spark JDBC driver.
So there is a misunderstanding in the Simba Spark documentation: it states
that 0 is the default value, but in reality the default value is 1, not 0!
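For reference, a minimal spark-shell sketch of that workaround, reusing the host, credentials, and driver class from the reproduction quoted below (adjust them to your environment); the only change from step 4 is UseNativeQuery=0:

// Same options as step 4 of the reproduction below, but with UseNativeQuery=0
// so the Simba driver rewrites the query Spark generates into HiveQL.
val hivespark = spark.read.format("jdbc")
  .option("url", "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=0")
  .option("driver", "com.simba.hive.jdbc41.HS2Driver")
  .option("dbtable", "wh2.hivespark")
  .option("user", "hdfs")
  .option("password", "hdfs")
  .load()
hivespark.show()  // should now print (1, USA) instead of throwing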
> Unable to return Hive INT data type into Spark via Hive JDBC driver: Caused
> by: java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to
> int.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: SPARK-21179
> URL: https://issues.apache.org/jira/browse/SPARK-21179
> Project: Spark
> Issue Type: Bug
> Components: Spark Shell, SQL
> Affects Versions: 1.6.0, 2.0.0, 2.1.1
> Environment: OS: Linux
> HDP version 2.5.0.1-60
> Hive version: 1.2.1
> Spark version 2.0.0.2.5.0.1-60
> JDBC: Download the latest Hortonworks JDBC driver
> Reporter: Matthew Walton
> Priority: Major
> Labels: bulk-closed
>
> I'm trying to fetch data back in Spark SQL using a JDBC connection to Hive.
> Unfortunately, when I try to query data that resides in an INT column I get
> the following error:
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
> Steps to reproduce:
> 1) On Hive create a simple table with an INT column and insert some data (I
> used SQuirreL Client with the Hortonworks JDBC driver):
> create table wh2.hivespark (country_id int, country_name string)
> insert into wh2.hivespark values (1, 'USA')
> 2) Copy the Hortonworks Hive JDBC driver to the machine where you will run
> Spark Shell
> 3) Start Spark shell loading the Hortonworks Hive JDBC driver jar files
> ./spark-shell --jars /home/spark/jdbc/hortonworkshive/HiveJDBC41.jar,/home/spark/jdbc/hortonworkshive/TCLIServiceClient.jar,/home/spark/jdbc/hortonworkshive/commons-codec-1.3.jar,/home/spark/jdbc/hortonworkshive/commons-logging-1.1.1.jar,/home/spark/jdbc/hortonworkshive/hive_metastore.jar,/home/spark/jdbc/hortonworkshive/hive_service.jar,/home/spark/jdbc/hortonworkshive/httpclient-4.1.3.jar,/home/spark/jdbc/hortonworkshive/httpcore-4.1.3.jar,/home/spark/jdbc/hortonworkshive/libfb303-0.9.0.jar,/home/spark/jdbc/hortonworkshive/libthrift-0.9.0.jar,/home/spark/jdbc/hortonworkshive/log4j-1.2.14.jar,/home/spark/jdbc/hortonworkshive/ql.jar,/home/spark/jdbc/hortonworkshive/slf4j-api-1.5.11.jar,/home/spark/jdbc/hortonworkshive/slf4j-log4j12-1.5.11.jar,/home/spark/jdbc/hortonworkshive/zookeeper-3.4.6.jar
> 4) In Spark shell load the data from Hive using the JDBC driver
> val hivespark = spark.read.format("jdbc")
>   .options(Map(
>     "url" -> "jdbc:hive2://localhost:10000/wh2;AuthMech=3;UseNativeQuery=1;user=hdfs;password=hdfs",
>     "dbtable" -> "wh2.hivespark"))
>   .option("driver", "com.simba.hive.jdbc41.HS2Driver")
>   .option("user", "hdfs")
>   .option("password", "hdfs")
>   .load()
> 5) In Spark shell try to display the data
> hivespark.show()
> At this point you should see the error:
> scala> hivespark.show()
> 17/06/22 12:14:37 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
> java.sql.SQLDataException: [Simba][JDBC](10140) Error converting value to int.
>     at com.simba.hiveserver2.exceptions.ExceptionConverter.toSQLException(Unknown Source)
>     at com.simba.hiveserver2.utilities.conversion.TypeConverter.toInt(Unknown Source)
>     at com.simba.hiveserver2.jdbc.common.SForwardResultSet.getInt(Unknown Source)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.getNext(JDBCRDD.scala:437)
>     at org.apache.spark.sql.execution.datasources.jdbc.JDBCRDD$$anon$1.hasNext(JDBCRDD.scala:535)
>     at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
>     at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
>     at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
>     at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>     at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:784)
>     at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
>     at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
>     at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
>     at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
>     at org.apache.spark.scheduler.Task.run(Task.scala:85)
>     at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>     at java.lang.Thread.run(Thread.java:745)
> Note: I also tested this using a JDBC driver from Progress DataDirect and
> saw a similar error message, so this does not seem to be driver-specific.
> scala> hivespark.show()
> 17/06/22 12:07:59 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
> java.sql.SQLException: [DataDirect][Hive JDBC Driver]Value can not be converted to requested type.
> Also, if I query this table directly from the SQuirreL Client tool, there is
> no error.
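A plausible explanation for the driver-independent failure (my inference, not confirmed in this report): Spark's JDBC source quotes column names with double quotes in the SELECT it generates, while HiveQL treats a double-quoted token as a string literal, so every row returns the literal text "country_id", which cannot be converted to int. UseNativeQuery=0 works around this by letting the driver rewrite the query. Under that same assumption, another workaround is to register a custom JdbcDialect that quotes identifiers with backticks before loading the DataFrame; a minimal sketch:

import org.apache.spark.sql.jdbc.{JdbcDialect, JdbcDialects}

// Sketch under the quoting assumption above: make Spark emit `country_id`
// instead of "country_id" in queries sent over hive2 JDBC URLs.
val hiveDialect = new JdbcDialect {
  override def canHandle(url: String): Boolean = url.startsWith("jdbc:hive2")
  override def quoteIdentifier(colName: String): String = s"`$colName`"
}
JdbcDialects.registerDialect(hiveDialect)

After registering the dialect, the spark.read call from step 4 can be retried unchanged.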