[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-09-08 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=480276&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-480276
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 08/Sep/20 16:45
Start Date: 08/Sep/20 16:45
Worklog Time Spent: 10m 
  Work Description: kgyrtkirk merged pull request #1424:
URL: https://github.com/apache/hive/pull/1424


   



This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Issue Time Tracking
---

Worklog Id: (was: 480276)
Time Spent: 50m  (was: 40m)

> Infinite planning time on syntactically big queries
> ---
>
> Key: HIVE-24031
> URL: https://issues.apache.org/jira/browse/HIVE-24031
> Project: Hive
> Issue Type: Bug
> Components: Query Planning
> Reporter: Stamatis Zampetakis
> Assignee: Stamatis Zampetakis
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0
>
> Attachments: ASTNode_getChildren_cost.png, query_big_array_constructor.nps
>
> Time Spent: 50m
> Remaining Estimate: 0h
>
> Syntactically big queries (~1 million tokens), such as the query shown below,
> lead to very long (seemingly infinite) planning times.
> {code:sql}
> select posexplode(array('item1', 'item2', ..., 'item1M'));
> {code}
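
A query of the shape quoted above can be generated programmatically. The following is only a sketch: the class and method names (`BigQuerySketch`, `bigArrayQuery`) are hypothetical and not part of Hive, and the element count that triggers the slowdown is whatever the reporter used (roughly one million tokens).

```java
// Hypothetical helper, not part of Hive: builds a query with n string literals
// inside a single array(...) constructor, mirroring the example in the issue.
final class BigQuerySketch {

    static String bigArrayQuery(int n) {
        StringBuilder sb = new StringBuilder("select posexplode(array(");
        for (int i = 1; i <= n; i++) {
            if (i > 1) {
                sb.append(", ");
            }
            sb.append("'item").append(i).append('\'');
        }
        return sb.append("));").toString();
    }

    public static void main(String[] args) {
        // Small n for display; a very large n would approximate the reported case.
        System.out.println(bigArrayQuery(3));
        // prints: select posexplode(array('item1', 'item2', 'item3'));
    }
}
```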



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477932&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477932
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 15:02
Start Date: 02/Sep/20 15:02
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1424:
URL: https://github.com/apache/hive/pull/1424


   ### What changes were proposed in this pull request?
   
   1. Drop the defensive copy of children inside ASTNode#getChildren.
   2. Protect clients from accidentally modifying the list by returning an
   unmodifiable collection.
   
   ### Why are the changes needed?
   Profiling shows that the vast majority of time is spent creating defensive
   copies of the node expression list inside ASTNode#getChildren.
   
   The method is called extensively from various places in the code,
   especially those walking over the expression tree, so it needs to be
   efficient.
   
   Most of the time, creating defensive copies is not necessary. In those
   cases (if any) where the list needs to be modified, clients should make
   a copy themselves.
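
The trade-off described above can be illustrated with a minimal sketch. `SketchNode` is a hypothetical stand-in and is not Hive's `ASTNode`; the method names `getChildrenCopy` and `getChildrenView` are assumptions made for illustration only. The sketch contrasts a per-call defensive copy, whose cost grows with the number of children, with a constant-time read-only view from `Collections.unmodifiableList`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical stand-in for a tree node; this is NOT Hive's ASTNode.
final class SketchNode {
    private final List<SketchNode> children = new ArrayList<>();
    private final String text;

    SketchNode(String text) {
        this.text = text;
    }

    void addChild(SketchNode child) {
        children.add(child);
    }

    String getText() {
        return text;
    }

    // Defensive copy: every call allocates and fills a new list, O(children).
    List<SketchNode> getChildrenCopy() {
        return new ArrayList<>(children);
    }

    // Read-only view: O(1) per call; mutation attempts throw
    // UnsupportedOperationException instead of silently changing the node.
    List<SketchNode> getChildrenView() {
        return Collections.unmodifiableList(children);
    }
}

final class AstChildrenSketch {
    public static void main(String[] args) {
        SketchNode array = new SketchNode("array");
        for (int i = 1; i <= 5; i++) {
            array.addChild(new SketchNode("item" + i));
        }

        List<SketchNode> view = array.getChildrenView();
        try {
            view.add(new SketchNode("unexpected"));
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }

        // A client that really needs to modify the list copies it explicitly.
        List<SketchNode> mine = new ArrayList<>(view);
        mine.add(new SketchNode("extra"));
        System.out.println(mine.size() + " vs " + array.getChildrenView().size());
    }
}
```

With the copying accessor, a walker that calls it once per child index on a node with n children performs on the order of n² element copies, which is consistent with the profile described above for an array constructor with about a million items.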
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   The test was added in a separate branch since it is not meant to be
   committed upstream, for the following reasons:
   
   - the query for reproducing the problem takes up a few MBs
   - it requires some changes in the default configurations.
   
   To run the test, use the following commands:
   ```
   git checkout -b HIVE-24031-TEST master
   git pull g...@github.com:zabetak/hive.git HIVE-24031-PLUS-TEST
   mvn clean install -DskipTests
   cd itests
   mvn clean install -DskipTests
   cd qtest
   mvn test -Dtest=TestMiniLlapLocalCliDriver -Dqfile=big_query_with_array_constructor.q -Dtest.output.overwrite
   ```





Issue Time Tracking
---

Worklog Id: (was: 477932)
Time Spent: 40m  (was: 0.5h)



[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477930&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477930
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 15:02
Start Date: 02/Sep/20 15:02
Worklog Time Spent: 10m 
  Work Description: zabetak commented on pull request #1424:
URL: https://github.com/apache/hive/pull/1424#issuecomment-685795081


   Closing pull request to trigger pre-commits





Issue Time Tracking
---

Worklog Id: (was: 477930)
Time Spent: 20m  (was: 10m)



[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-09-02 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=477931&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-477931
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 02/Sep/20 15:02
Start Date: 02/Sep/20 15:02
Worklog Time Spent: 10m 
  Work Description: zabetak closed pull request #1424:
URL: https://github.com/apache/hive/pull/1424


   





Issue Time Tracking
---

Worklog Id: (was: 477931)
Time Spent: 0.5h  (was: 20m)



[jira] [Work logged] (HIVE-24031) Infinite planning time on syntactically big queries

2020-08-24 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/HIVE-24031?focusedWorklogId=473922&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-473922
 ]

ASF GitHub Bot logged work on HIVE-24031:
-

Author: ASF GitHub Bot
Created on: 24/Aug/20 15:05
Start Date: 24/Aug/20 15:05
Worklog Time Spent: 10m 
  Work Description: zabetak opened a new pull request #1424:
URL: https://github.com/apache/hive/pull/1424







Issue Time Tracking
---

Worklog Id: (was: 473922)
Remaining Estimate: 0h
Time Spent: 10m
