[
https://issues.apache.org/jira/browse/HIVE-25485?focusedWorklogId=651811&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-651811
]
ASF GitHub Bot logged work on HIVE-25485:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 16/Sep/21 15:26
Start Date: 16/Sep/21 15:26
Worklog Time Spent: 10m
Work Description: kgyrtkirk commented on a change in pull request #2608:
URL: https://github.com/apache/hive/pull/2608#discussion_r710227073
##########
File path: ql/src/test/results/clientpositive/llap/union_literals.q.out
##########
@@ -0,0 +1,397 @@
+PREHOOK: query: explain
+SELECT * FROM (
+ VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+ AS Colors
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+POSTHOOK: query: explain
+SELECT * FROM (
+ VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+ AS Colors
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+ Stage-0 is a root stage
+
+STAGE PLANS:
+ Stage: Stage-0
+ Fetch Operator
+ limit: -1
+ Processor Tree:
+ TableScan
+ alias: _dummy_table
+ Row Limit Per Split: 1
+ Select Operator
+ expressions: array(const struct(1,'1'),const struct(2,'orange'),const struct(5,'yellow'),const struct(10,'green'),const struct(11,'blue'),const struct(12,'indigo'),const struct(20,'violet')) (type: array<struct<col1:int,col2:string>>)
+ outputColumnNames: _col0
+ UDTF Operator
+ function name: inline
+ Select Operator
+ expressions: col1 (type: int), col2 (type: string)
+ outputColumnNames: _col0, _col1
+ ListSink
+
+PREHOOK: query: explain
+SELECT * FROM (
+ VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+ AS Colors
+union all
+ select 2,'2'
+union all
+ select 2,'2'
+PREHOOK: type: QUERY
+PREHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+POSTHOOK: query: explain
+SELECT * FROM (
+ VALUES(1, '1'),
+ (2, 'orange'),
+ (5, 'yellow'),
+ (10, 'green'),
+ (11, 'blue'),
+ (12, 'indigo'),
+ (20, 'violet'))
+ AS Colors
+union all
+ select 2,'2'
+union all
+ select 2,'2'
+POSTHOOK: type: QUERY
+POSTHOOK: Input: _dummy_database@_dummy_table
+#### A masked pattern was here ####
+STAGE DEPENDENCIES:
+ Stage-1 is a root stage
+ Stage-0 depends on stages: Stage-1
+
+STAGE PLANS:
+ Stage: Stage-1
+ Tez
+#### A masked pattern was here ####
+ Edges:
+ Map 1 <- Union 2 (CONTAINS)
+ Map 3 <- Union 2 (CONTAINS)
+#### A masked pattern was here ####
+ Vertices:
+ Map 1
+ Map Operator Tree:
+ TableScan
+ alias: _dummy_table
+ Row Limit Per Split: 1
+ Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: array(const struct(2,'2'),const struct(2,'2')) (type: array<struct<col1:int,col2:string>>)
+ outputColumnNames: _col0
+ Statistics: Num rows: 1 Data size: 56 Basic stats: COMPLETE Column stats: COMPLETE
+ UDTF Operator
+ Statistics: Num rows: 1 Data size: 56 Basic stats: COMPLETE Column stats: COMPLETE
+ function name: inline
+ Select Operator
+ expressions: col1 (type: int), col2 (type: string)
+ outputColumnNames: _col0, _col1
+ Statistics: Num rows: 1 Data size: 8 Basic stats: COMPLETE Column stats: COMPLETE
+ File Output Operator
+ compressed: false
+ Statistics: Num rows: 2 Data size: 16 Basic stats: COMPLETE Column stats: COMPLETE
+ table:
+ input format: org.apache.hadoop.mapred.SequenceFileInputFormat
+ output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
+ serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe
+ Execution mode: llap
+ LLAP IO: no inputs
+ Map 3
+ Map Operator Tree:
+ TableScan
+ alias: _dummy_table
+ Row Limit Per Split: 1
+ Statistics: Num rows: 1 Data size: 10 Basic stats: COMPLETE Column stats: COMPLETE
+ Select Operator
+ expressions: array(const struct(1,'1'),const struct(2,'orange'),const struct(5,'yellow'),const struct(10,'green'),const struct(11,'blue'),const struct(12,'indigo'),const struct(20,'violet')) (type: array<struct<col1:int,col2:string>>)
Review comment:
No merge is done here because at the rule level things are a bit
different:
* for the above struct:
`RecordType(INTEGER NOT NULL EXPR$0, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" NOT NULL EXPR$1) NOT NULL`
* for the values:
`RecordType(INTEGER col1, VARCHAR(2147483647) CHARACTER SET "UTF-16LE" col2)`
There are two kinds of differences:
* the column names differ between inline tables and selects of literals
(`EXPR$0`, `EXPR$1` vs. `col1`, `col2`)
* the types also differ: the first is `NOT NULL` throughout, the second is
nullable
Seeing this, I opted to be safe instead of smart, and transform only rows
whose type is exactly the same.
I've changed the test case to add a case showing that similar inline
tables are also flattened.
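To illustrate the "safe instead of smart" choice, here is a minimal sketch that groups UNION ALL branches by their exact row type string and merges only the branches that match. It is a toy model, not the actual rule from this pull request: `UnionLiteralMergeSketch`, `Branch` and `mergeByExactType` are hypothetical names, and row types are compared as plain strings rather than through Calcite's type objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of the merge decision; not Hive's or Calcite's API. */
final class UnionLiteralMergeSketch {

    /** One UNION ALL branch producing constant rows, tagged with its row type. */
    static final class Branch {
        final String rowType;     // full type description, names and nullability included
        final List<String> rows;  // literal rows, e.g. "(2,'2')"

        Branch(String rowType, List<String> rows) {
            this.rowType = rowType;
            this.rows = new ArrayList<>(rows);
        }
    }

    /**
     * Merges branches whose row type strings are exactly equal.
     * Branches with any difference (column name or nullability) stay separate.
     * Groups keep the order in which their type first appeared.
     */
    static List<Branch> mergeByExactType(List<Branch> branches) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (Branch b : branches) {
            grouped.computeIfAbsent(b.rowType, k -> new ArrayList<>()).addAll(b.rows);
        }
        List<Branch> merged = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : grouped.entrySet()) {
            merged.add(new Branch(e.getKey(), e.getValue()));
        }
        return merged;
    }

    public static void main(String[] args) {
        // Simplified type strings modelled on the two RecordTypes quoted above.
        String valuesType = "RecordType(INTEGER col1, VARCHAR col2)";
        String selectType = "RecordType(INTEGER NOT NULL EXPR$0, VARCHAR NOT NULL EXPR$1) NOT NULL";

        List<Branch> input = Arrays.asList(
            new Branch(valuesType, Arrays.asList("(1,'1')", "(2,'orange')")),
            new Branch(selectType, Arrays.asList("(2,'2')")),
            new Branch(selectType, Arrays.asList("(2,'2')")));

        for (Branch b : mergeByExactType(input)) {
            System.out.println(b.rowType + " -> " + b.rows);
        }
    }
}
```

With the three branches from the test query, the two `select 2,'2'` branches collapse into one while the `VALUES` branch stays separate, which matches the two map vertices in the plan above. Regrouping is acceptable here only because UNION ALL does not guarantee row order.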
Issue Time Tracking
-------------------
Worklog Id: (was: 651811)
Time Spent: 1h (was: 50m)
> Transform selects of literals under a UNION ALL to inline table scan
> --------------------------------------------------------------------
>
> Key: HIVE-25485
> URL: https://issues.apache.org/jira/browse/HIVE-25485
> Project: Hive
> Issue Type: Improvement
> Reporter: Zoltan Haindrich
> Assignee: Zoltan Haindrich
> Priority: Major
> Labels: pull-request-available
> Time Spent: 1h
> Remaining Estimate: 0h
>
> {code}
> select 1
> union all
> select 1
> union all
> [...]
> union all
> select 1
> {code}
> results in a very big plan, with a number of vertices proportional to the
> number of UNION ALL branches, so it can be slow to execute.