[ https://issues.apache.org/jira/browse/HIVE-19192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16436770#comment-16436770 ]
Gopal V commented on HIVE-19192:
--------------------------------
UNION ALL doesn't seem to suffer from the same problem. In Hive 1.2+, a bare UNION
means UNION DISTINCT rather than UNION ALL, so the optimizer path for UNION DISTINCT is taken.
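
To compare the two forms side by side, something like the following could generate matching test queries. This is only a sketch: the table `src`, the column `key`, the output file names, and the helper names are all hypothetical, and the attached .q files may be shaped quite differently.

```python
def build_union_query(n_unions, distinct=True, table="src", column="key"):
    """Return a HiveQL query whose branches are joined by n_unions set operators.

    `table` and `column` are placeholder names, not taken from the issue.
    """
    if n_unions < 0:
        raise ValueError("n_unions must be non-negative")
    # Spell the operator out explicitly so the result does not depend on
    # what a bare UNION means in a given Hive version.
    op = "UNION DISTINCT" if distinct else "UNION ALL"
    # n_unions operators join n_unions + 1 SELECT branches.
    branches = [
        f"SELECT {column} FROM {table} WHERE {column} = {i}"
        for i in range(n_unions + 1)
    ]
    return f"\n{op}\n".join(branches)


def write_explain_file(path, n_unions, distinct):
    """Write an EXPLAIN statement for the generated query to `path`."""
    with open(path, "w") as f:
        f.write("EXPLAIN\n" + build_union_query(n_unions, distinct) + ";\n")


if __name__ == "__main__":
    # Same union counts as the attachments on the issue.
    for n in (50, 100, 200):
        write_explain_file(f"union-distinct-{n}.q", n, distinct=True)
        write_explain_file(f"union-all-{n}.q", n, distinct=False)
```

Each generated file can then be run with `beeline -u <jdbc-url> -f <file>` and the reported times compared between the UNION DISTINCT and UNION ALL variants.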
> HiveServer2 query compilation : query compilation time increases if sql has
> multiple unions
> --------------------------------------------------------------------------------------------
>
> Key: HIVE-19192
> URL: https://issues.apache.org/jira/browse/HIVE-19192
> Project: Hive
> Issue Type: Improvement
> Components: Hive, HiveServer2
> Affects Versions: 1.2.1, 2.1.0
> Environment: Hive-1.2.1
> Hive-2.1.0
>
> Reporter: Rajkumar Singh
> Priority: Major
> Attachments: query-with-100-union.q, query-with-200-union.q,
> query-with-50-union.q
>
>
> Query compilation time suffers badly when a SQL statement has many unions. Here is a
> simple reproduction of the problem: attached are queries with 50, 100, and 200
> unions (forgive me for this bad SQL). When I run EXPLAIN against HiveServer2, I
> can see the compilation time increase manyfold.
> {code}
> query-with-50-union.q
> 1,671 rows selected (10.662 seconds)
> query-with-100-union.q
> 3,321 rows selected (101.709 seconds)
> query-with-200-union.q
> 6,588 rows selected (1074.487 seconds)
> {code}
> Running such SQL against HiveServer2 can starve other queries waiting to enter the
> single-threaded compilation stage.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)