[
https://issues.apache.org/jira/browse/HIVE-24031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17176322#comment-17176322
]
Stamatis Zampetakis commented on HIVE-24031:
--------------------------------------------
I ran the query above with {{TestMiniLlapLocalCliDriver}}, and the
profiling ([^query_big_array_constructor.nps]) shows that the vast majority of
time is spent creating defensive copies of the node expression list inside
ASTNode#getChildren.
!ASTNode_getChildren_cost.png!
The method is called extensively from various places in the code, especially
those walking over the expression tree, so it needs to be efficient. I propose
to drop the defensive copy (possibly protecting the list from modifications via
an unmodifiable collection) and let clients make copies of the list if they deem
it necessary. In most cases, if not all, making copies of the list seems
useless.
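To illustrate the difference between the two approaches, here is a minimal sketch. It is not the actual ASTNode code: the class {{SketchNode}} and the method names {{getChildrenCopy}} and {{getChildrenView}} are hypothetical and exist only for this example. The copying getter does work proportional to the number of children on every call, while the read-only view is a constant-time wrapper.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tree node; NOT Hive's actual ASTNode.
class SketchNode {
    private final List<SketchNode> children = new ArrayList<>();

    void addChild(SketchNode child) {
        children.add(child);
    }

    // Defensive copy: allocates and fills a new list on every call, O(n).
    List<SketchNode> getChildrenCopy() {
        return new ArrayList<>(children);
    }

    // Read-only view: wraps the backing list in O(1), copies no elements.
    // Attempts to modify it throw UnsupportedOperationException.
    List<SketchNode> getChildrenView() {
        return Collections.unmodifiableList(children);
    }
}
```

One caveat with the view: it reflects later changes made to the node's children, so a client that needs a stable snapshot, or that modifies the tree while iterating, would still have to make its own copy.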
> Infinite planning time on syntactically big queries
> ---------------------------------------------------
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png,
> query_big_array_constructor.nps
>
>
> Syntactically big queries (~1 million tokens), such as the query shown below,
> lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}