[
https://issues.apache.org/jira/browse/SPARK-25833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chenxiao Mao updated SPARK-25833:
---------------------------------
Description:
Spark cannot read views that Hive created without explicit column names.
A simple example reproduces the issue.
Create a view via the Hive CLI:
{code:sql}
hive> CREATE VIEW v1 AS SELECT * FROM (SELECT 1) t1
{code}
Query that view via Spark:
{code:sql}
spark-sql> select * from v1;
Error in query: cannot resolve '`t1._c0`' given input columns: [1]; line 1 pos 7;
'Project [*]
+- 'SubqueryAlias v1, `default`.`v1`
   +- 'Project ['t1._c0]
      +- SubqueryAlias t1
         +- Project [1 AS 1#41]
            +- OneRowRelation$
{code}
Check the view definition:
{code:sql}
hive> desc extended v1;
OK
_c0 int
...
viewOriginalText:SELECT * FROM (SELECT 1) t1,
viewExpandedText:SELECT `t1`.`_c0` FROM (SELECT 1) `t1`
...
{code}
The _c0 column name in the view definition above is generated automatically by
Hive, and Spark cannot resolve it: in the Spark plan the subquery column is
named 1, not _c0.
See the [Hive LanguageManual|https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=30746446&navigatingVersions=true#LanguageManualDDL-CreateView]
for more details:
{quote}If no column names are supplied, the names of the view's columns will be
derived automatically from the defining SELECT expression. (If the SELECT
contains unaliased scalar expressions such as x+y, the resulting view column
names will be generated in the form _C0, _C1, etc.)
{quote}
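A possible workaround, assuming the view can be recreated on the Hive side, is
to alias every expression explicitly so that Hive does not need to generate a
_c0 name. This is only a sketch: the view name v2 and the alias c1 are
hypothetical and do not come from the failing setup.
{code:sql}
-- Hypothetical view v2: the explicit alias c1 replaces the generated _c0
hive> CREATE VIEW v2 AS SELECT * FROM (SELECT 1 AS c1) t1;
-- The expanded view text should then reference `t1`.`c1`
spark-sql> select * from v2;
{code}
With an explicit alias, the column name stored in the view definition should
match the name Spark derives for the subquery output, so the reference can
resolve.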
> Views without column names created by Hive are not readable by Spark
> --------------------------------------------------------------------
>
> Key: SPARK-25833
> URL: https://issues.apache.org/jira/browse/SPARK-25833
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.3.2
> Reporter: Chenxiao Mao
> Priority: Major
>