[ https://issues.apache.org/jira/browse/HIVE-22601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16993783#comment-16993783 ]
okumin commented on HIVE-22601:
-------------------------------
There is a bug in the step that rewrites TOK_SETCOLREFs in an AST.
A TOK_SETCOLREF is a placeholder that stands in for column names.
In my case, the schema of the UNION ALL result is unknown at the AST level, so this placeholder is required.
[https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L81]
This step assumes that each expression has at most one alias.
As a result, only the c3 column was retained in my case.
[https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/ParseUtils.java#L450]
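To make the single-alias assumption concrete, here is a minimal sketch. It does not use Hive's real classes and it is not the patch itself: Node, lastAliasOnly, allTrailingAliases and stackExample are hypothetical names, and the tree shape (a TOK_SELEXPR whose first child is the UDTF call, followed by one Identifier child per alias) is a simplified assumption about what the parser produces for stack(...) AS (c1, c2, c3).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class SetColRefAliasSketch {

  // Hypothetical, simplified stand-in for an AST node; this is not Hive's ASTNode.
  static final class Node {
    final String type; // e.g. "TOK_SELEXPR", "TOK_FUNCTION", "Identifier"
    final String text;
    final List<Node> children;

    Node(String type, String text, Node... children) {
      this.type = type;
      this.text = text;
      this.children = Arrays.asList(children);
    }
  }

  // Models the single-alias assumption: only the last child is inspected.
  static List<String> lastAliasOnly(Node selExpr) {
    if (selExpr.children.isEmpty()) {
      return Collections.emptyList();
    }
    Node last = selExpr.children.get(selExpr.children.size() - 1);
    return last.type.equals("Identifier")
        ? Collections.singletonList(last.text)
        : Collections.<String>emptyList();
  }

  // Collects every trailing Identifier child. Child 0 is the selected
  // expression itself, so it is never treated as an alias.
  static List<String> allTrailingAliases(Node selExpr) {
    List<String> aliases = new ArrayList<>();
    for (int i = selExpr.children.size() - 1; i >= 1; i--) {
      Node child = selExpr.children.get(i);
      if (!child.type.equals("Identifier")) {
        break;
      }
      aliases.add(child.text);
    }
    Collections.reverse(aliases); // restore declaration order
    return aliases;
  }

  // Assumed tree for: stack(1, 'a', 'b', 'c') AS (c1, c2, c3)
  static Node stackExample() {
    return new Node("TOK_SELEXPR", "TOK_SELEXPR",
        new Node("TOK_FUNCTION", "stack"),
        new Node("Identifier", "c1"),
        new Node("Identifier", "c2"),
        new Node("Identifier", "c3"));
  }

  public static void main(String[] args) {
    Node expr = stackExample();
    System.out.println(lastAliasOnly(expr));      // prints [c3]
    System.out.println(allTrailingAliases(expr)); // prints [c1, c2, c3]
  }
}
```

In this model, looking only at the last child yields c3 alone, which matches the symptom in the issue description, while walking all trailing Identifier children keeps c1, c2 and c3.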
This is the relevant syntax, which appears with a UDTF.
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L88]
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SemanticAnalyzer.java#L4530-L4537]
In the other cases where a TOK_SELEXPR has multiple children, it would contain
tokens other than Identifier.
So my patch wouldn't break any of those cases.
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L56-L58]
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/SelectClauseParser.g#L84]
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L155-L161]
* [https://github.com/apache/hive/blob/eb72a0c01d3dc5dc50c220a548c7986a440eef3f/ql/src/java/org/apache/hadoop/hive/ql/parse/FromClauseParser.g#L334-L338]
The Checkstyle errors come from existing formatting problems. I will fix them in
another ticket.
https://issues.apache.org/jira/browse/HIVE-22618
As for asflicense, it is not related to my patch.
> Some columns will be lost when a UDTF has multiple aliases in some cases
> ------------------------------------------------------------------------
>
> Key: HIVE-22601
> URL: https://issues.apache.org/jira/browse/HIVE-22601
> Project: Hive
> Issue Type: Bug
> Components: Query Processor
> Affects Versions: 3.1.2
> Reporter: okumin
> Assignee: okumin
> Priority: Major
> Labels: pull-request-available
> Attachments: HIVE-22601.1.patch, HIVE-22601.2.patch, HIVE-22601.patch
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Only one column will be retained when putting UDTFs with multiple aliases and
> a top-level UNION together.
> For example, the result of the following SQL should have three columns, c1,
> c2 and c3.
> {code:java}
> SELECT stack(1, 'a', 'b', 'c') AS (c1, c2, c3)
> UNION ALL
> SELECT stack(1, 'd', 'e', 'f') AS (c1, c2, c3);
> {code}
> However, only the c3 column is returned.
> {code:java}
> +---------+
> | _u1.c3  |
> +---------+
> | c       |
> | f       |
> +---------+
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)