[jira] [Comment Edited] (SPARK-25833) Views without column names created by Hive are not readable by Spark

Dilip Biswal (JIRA) Sat, 27 Oct 2018 13:40:12 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-25833?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16666178#comment-16666178
 ]


Dilip Biswal edited comment on SPARK-25833 at 10/27/18 8:39 PM:
----------------------------------------------------------------

This looks like a duplicate of 
https://issues.apache.org/jira/browse/SPARK-24864. Please see the discussion 
there. Hive and Spark are two different systems and use different schemes to 
generate names for unaliased columns. To make the view readable from Spark, 
we should use explicit aliases in the view definition.
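
A minimal sketch of the alias approach, based on the reproduction in the issue 
description. The view name v1_aliased and the column alias c1 are hypothetical 
names chosen for illustration, and the Spark query is expected (not verified 
here) to resolve, since the stored view text should no longer refer to _c0:

```sql
-- Hive CLI: give the scalar expression an explicit alias (c1 is a hypothetical
-- name) so that Hive stores c1 in the view text instead of a generated _c0.
CREATE VIEW v1_aliased AS SELECT * FROM (SELECT 1 AS c1) t1;

-- Spark SQL: the stored view text should now refer to t1.c1, which Spark can
-- resolve against the subquery output.
SELECT * FROM v1_aliased;
```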

cc [~smilegator] [~srowen]
Thank you.


was (Author: dkbiswal):
This looks like a duplicate of 
https://issues.apache.org/jira/browse/SPARK-24864. Please see the discussion 
there. Hive and Spark are two different systems and use different schemes to 
generate names for unaliased columns. To make the view readable from Spark, 
we should use explicit aliases in the view definition.

Thank you.

> Views without column names created by Hive are not readable by Spark
> --------------------------------------------------------------------
>
>                 Key: SPARK-25833
>                 URL: https://issues.apache.org/jira/browse/SPARK-25833
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 2.3.2
>            Reporter: Chenxiao Mao
>            Priority: Major
>
> A simple example to reproduce this issue.
> Create a view via the Hive CLI:
> {code:sql}
> hive> CREATE VIEW v1 AS SELECT * FROM (SELECT 1) t1;
> {code}
> Query that view via Spark:
> {code:sql}
> spark-sql> select * from v1;
> Error in query: cannot resolve '`t1._c0`' given input columns: [1]; line 1 pos 7;
> 'Project [*]
> +- 'SubqueryAlias v1, `default`.`v1`
>    +- 'Project ['t1._c0]
>       +- SubqueryAlias t1
>          +- Project [1 AS 1#41]
>             +- OneRowRelation$
> {code}
> Check the view definition:
> {code:sql}
> hive> desc extended v1;
> OK
> _c0                   int
> ...
> viewOriginalText:SELECT * FROM (SELECT 1) t1, 
> viewExpandedText:SELECT `t1`.`_c0` FROM (SELECT 1) `t1`
> ...
> {code}
> The column name _c0 in the view definition above is generated automatically 
> by Hive, and Spark does not recognize it.
> See the [Hive LanguageManual|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=30746446&navigatingVersions=true#LanguageManualDDL-CreateView]
> for more details:
> {quote}If no column names are supplied, the names of the view's columns will 
> be derived automatically from the defining SELECT expression. (If the SELECT 
> contains unaliased scalar expressions such as x+y, the resulting view column 
> names will be generated in the form _C0, _C1, etc.)
> {quote}


