[
https://issues.apache.org/jira/browse/SPARK-22113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16180273#comment-16180273
]
Liang-Chi Hsieh commented on SPARK-22113:
-----------------------------------------
I can reproduce it. Currently, Spark does not work with Hive when you connect
to it through JDBC. I'm just not sure whether we want to support that.
Another thing I mentioned in my previous comment is the behavior difference
between your second query and my test:
??Another thing is, my test shows a little different behavior with Michael
Fu's. In my test, the table column will be prefixed with table name. But looks
like it isn't in Michael Fu's case because {{dw_date}} can be resolved in the
second query.??
For example, in my test, I can't resolve the {{dw_date}} column directly. I
need to refer to it as {{tablename.dw_date}}. I am not sure whether the Hive
version causes this difference. But it doesn't change the fact that JDBC
currently doesn't work with Hive in Spark.
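A possible explanation for both symptoms, offered as an assumption and not verified in this issue: a generic JDBC source usually wraps column names in ANSI double quotes when it builds the query, while HiveQL treats a double-quoted token as a string literal and expects backticks around identifiers. The sketch below uses plain Java with no Spark dependency. All class and method names are hypothetical, and it only illustrates the SQL text each quoting style would produce, plus the table-name prefix mentioned above.

```java
public class JdbcQuotingSketch {

    // ANSI-style quoting, assumed here to be what a generic JDBC reader emits.
    // Escaping of embedded quote characters is omitted for brevity.
    static String ansiQuote(String column) {
        return "\"" + column + "\"";
    }

    // HiveQL-style quoting with backticks (escaping likewise omitted).
    static String backtickQuote(String column) {
        return "`" + column + "`";
    }

    // Build the projection query for one already-quoted column.
    static String buildSelect(String quotedColumn, String table) {
        return "SELECT " + quotedColumn + " FROM " + table;
    }

    // Drop a leading "tablename." prefix from a result-set column label.
    static String stripTablePrefix(String columnLabel) {
        int dot = columnLabel.lastIndexOf('.');
        return dot < 0 ? columnLabel : columnLabel.substring(dot + 1);
    }

    public static void main(String[] args) {
        String table = "tfdw.dwd_dim_date";

        // Hive would read "dw_date" as a string constant, so every row
        // would contain the text dw_date instead of the column value.
        System.out.println(buildSelect(ansiQuote("dw_date"), table));

        // With backticks, Hive resolves dw_date as a column.
        System.out.println(buildSelect(backtickQuote("dw_date"), table));

        // A prefixed label such as dwd_dim_date.dw_date reduced to dw_date.
        System.out.println(stripTablePrefix("dwd_dim_date.dw_date"));
    }
}
```

If this assumption holds, the JDBC read in the report would return the literal column name for every row, which would match the second screenshot.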
> Dataset shows in Hive is inconsistent with JDBC
> -----------------------------------------------
>
> Key: SPARK-22113
> URL: https://issues.apache.org/jira/browse/SPARK-22113
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Environment: version 2.2.0
> Reporter: Michael Fu
>
> I am trying to query data from Hive in Spark. According to the spark-sql
> documentation, there are two ways to do this:
> The first way is to initialize the session with _enableHiveSupport_:
> {code:java}
> SparkSession session =
>     SparkSession.builder().enableHiveSupport().getOrCreate();
> session.sql("select dw_date from tfdw.dwd_dim_date limit 10").show();
> {code}
> The dataset shows the correct result:
> !https://i.stack.imgur.com/gBJCj.png!
> The second way is through JDBC
> {code:java}
> Dataset<Row> ds = session.read()
>     .format("jdbc")
>     .option("driver", "org.apache.hive.jdbc.HiveDriver")
>     .option("url", "jdbc:hive2://iZ11syxr6afZ:21050/;auth=noSasl")
>     .option("dbtable", "tfdw.dwd_dim_date")
>     .load();
> ds.select("dw_date").limit(10).show();
> {code}
> But the dataset only shows the column name in the result rather than the
> data in the column:
> !https://i.stack.imgur.com/FBMDN.png!
> I think the two pictures should be consistent. Is there anything I missed?
> Many thanks!