[ https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=473922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473922 ]
ASF GitHub Bot logged work on HIVE-24031:
-----------------------------------------
Author: ASF GitHub Bot
Created on: 24/Aug/20 15:05
Start Date: 24/Aug/20 15:05
Worklog Time Spent: 10m
Work Description: zabetak opened a new pull request #1424:
URL: https://github.com/apache/hive/pull/1424
### What changes were proposed in this pull request?
1. Drop the defensive copy of children inside ASTNode#getChildren.
2. Protect clients from accidentally modifying the list by returning an
unmodifiable collection.
### Why are the changes needed?
Profiling shows that the vast majority of time is spent creating defensive
copies of the node expression list inside ASTNode#getChildren.
The method is called extensively from various places in the code,
especially those walking over the expression tree, so it needs to be
efficient.
Most of the time, creating defensive copies is not necessary. In those
cases (if any) where the list needs to be modified, clients should make
a copy themselves.
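To illustrate the trade-off, here is a minimal sketch of the two approaches. It is not the actual Hive code: `NodeSketch`, `getChildrenCopy` and `getChildrenView` are hypothetical names used only for this example, and the real ASTNode may differ in its details.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tree node; this is NOT Hive's ASTNode.
public class NodeSketch {
    private final List<NodeSketch> children = new ArrayList<>();

    public void addChild(NodeSketch child) {
        children.add(child);
    }

    // Defensive copy: every call allocates and fills a new list,
    // so the cost grows with the number of children.
    public List<NodeSketch> getChildrenCopy() {
        return new ArrayList<>(children);
    }

    // Read-only view over the same backing list: constant cost per call.
    // Mutating calls on the view throw UnsupportedOperationException.
    public List<NodeSketch> getChildrenView() {
        return Collections.unmodifiableList(children);
    }

    public static void main(String[] args) {
        NodeSketch root = new NodeSketch();
        for (int i = 0; i < 3; i++) {
            root.addChild(new NodeSketch());
        }

        List<NodeSketch> view = root.getChildrenView();
        try {
            view.add(new NodeSketch());
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }

        // A client that needs to modify the list makes its own copy.
        List<NodeSketch> mine = new ArrayList<>(root.getChildrenView());
        mine.remove(0);
        System.out.println(root.getChildrenView().size() + " " + mine.size());
    }
}
```

With a node that has a very large number of children, a tree walk that calls the copying variant once per visited child ends up copying the whole list each time, which is the kind of cost the profiling above points at.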
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
The test was added in a separate branch since it is not meant to be
committed upstream, for the following reasons:
- the query that reproduces the problem takes up a few MBs
- it requires some changes to the default configuration.
To run the test, use the following commands:
```
git checkout -b HIVE-24031-TEST master
git pull git@github.com:zabetak/hive.git HIVE-24031-PLUS-TEST
mvn clean install -DskipTests
cd itests
mvn clean install -DskipTests
cd qtest
mvn test -Dtest=TestMiniLlapLocalCliDriver \
  -Dqfile=big_query_with_array_constructor.q -Dtest.output.overwrite
```
Issue Time Tracking
-------------------
Worklog Id: (was: 473922)
Remaining Estimate: 0h
Time Spent: 10m
> Infinite planning time on syntactically big queries
> ---------------------------------------------------
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png,
> query_big_array_constructor.nps
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below,
> lead to very big (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}
--
This message was sent by Atlassian Jira
(v8.3.4#803005)