[ 
https://issues.apache.org/jira/browse/HIVE-22601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993783#comment-16993783
 ] 

okumin commented on HIVE-22601:
-------------------------------

The bug is in the step that rewrites TOK_SETCOLREFs in an AST.

A TOK_SETCOLREF is a placeholder that stands for column names.

In my case, the schema of the UNION ALL result is unknown at the AST level,
so this placeholder is required.

[https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L81]

This step assumes that each expression has at most one alias.

As a result, only the c3 column was retained in my case.

[https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L450]
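
To illustrate the single-alias assumption, here is a minimal Java sketch. It is not the code in ParseUtils: `Node`, `lastAliasOnly`, `allAliases` and `stackSelExpr` are hypothetical names made up for this example, and it assumes that the TOK_SELEXPR for `stack(...) AS (c1, c2, c3)` has the function as its first child followed by one Identifier child per alias.

```java
import java.util.ArrayList;
import java.util.List;

final class AliasSketch {
    // Hypothetical, simplified stand-in for an AST node; not Hive's ASTNode.
    static final class Node {
        final String type;  // e.g. "TOK_SELEXPR", "TOK_FUNCTION", "Identifier"
        final String text;  // identifier text, or null for other node types
        final List<Node> children = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // Models the "at most one alias" assumption: only the last child is inspected.
    static List<String> lastAliasOnly(Node selExpr) {
        List<String> aliases = new ArrayList<>();
        Node last = selExpr.children.get(selExpr.children.size() - 1);
        if (last.type.equals("Identifier")) {
            aliases.add(last.text);
        }
        return aliases;
    }

    // Models multi-alias handling: every Identifier child after the expression is kept.
    static List<String> allAliases(Node selExpr) {
        List<String> aliases = new ArrayList<>();
        for (int i = 1; i < selExpr.children.size(); i++) {
            Node child = selExpr.children.get(i);
            if (child.type.equals("Identifier")) {
                aliases.add(child.text);
            }
        }
        return aliases;
    }

    // Assumed AST shape for: stack(1, 'a', 'b', 'c') AS (c1, c2, c3)
    static Node stackSelExpr() {
        return new Node("TOK_SELEXPR", null)
            .add(new Node("TOK_FUNCTION", null))
            .add(new Node("Identifier", "c1"))
            .add(new Node("Identifier", "c2"))
            .add(new Node("Identifier", "c3"));
    }

    public static void main(String[] args) {
        Node selExpr = stackSelExpr();
        System.out.println(lastAliasOnly(selExpr)); // prints [c3]
        System.out.println(allAliases(selExpr));    // prints [c1, c2, c3]
    }
}
```

In this sketch, inspecting only the last child yields c3 alone, which matches the result I observed, while collecting every Identifier child yields all three aliases.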

The following is the part of the syntax that allows multiple aliases; it appears with a UDTF.
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L88]
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L4530-L4537]

In the other cases where a TOK_SELEXPR has multiple children, those children
include tokens other than Identifier.

So my patch would not break any of the following cases.
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L56-L58]
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L84]
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L155-L161]
 * [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L334-L338]

The Checkstyle errors come from existing, incorrectly formatted code. I will
fix them in a separate ticket.

https://issues.apache.org/jira/browse/HIVE-22618

As for asflicense, it is not related to my patch.

> Some columns will be lost when a UDTF has multiple aliases in some cases
> ------------------------------------------------------------------------
>
>                 Key: HIVE-22601
>                 URL: https://issues.apache.org/jira/browse/HIVE-22601
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>    Affects Versions: 3.1.2
>            Reporter: okumin
>            Assignee: okumin
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: HIVE-22601.1.patch, HIVE-22601.2.patch, HIVE-22601.patch
>
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Only one column is retained when a UDTF with multiple aliases is combined
> with a top-level UNION.
> For example, the result of the following SQL should have three columns: c1,
> c2 and c3.
> {code:sql}
> SELECT stack(1, 'a', 'b', 'c') AS (c1, c2, c3)
> UNION ALL
> SELECT stack(1, 'd', 'e', 'f') AS (c1, c2, c3);
> {code}
> However, only the c3 column is returned.
> {code:java}
> +---------+
> | _u1.c3  |
> +---------+
> | c       |
> | f       |
> +---------+
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
