[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd

2018-08-15 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11224 )

Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java
File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java:

http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@261
PS1, Line 261: sleep((long)(unusedTableTtlSec) * 1000L);
TimeUnit.SECONDS.sleep(unusedTableTtlSec);



--
To view, visit http://gerrit.cloudera.org:8080/11224
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89
Gerrit-Change-Number: 11224
Gerrit-PatchSet: 1
Gerrit-Owner: Tianyi Wang 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tianyi Wang 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 15 Aug 2018 20:29:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6271: Impala daemon should log a message when it's being shut down

2018-08-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10847 )

Change subject: IMPALA-6271: Impala daemon should log a message when it's being 
shut down
..


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10847/6/be/src/catalog/catalogd-main.cc
File be/src/catalog/catalogd-main.cc:

http://gerrit.cloudera.org:8080/#/c/10847/6/be/src/catalog/catalogd-main.cc@58
PS6, Line 58: using namespace apache::thrift;
If you make more changes to this file then you could remove this duplicated 
line.



--
To view, visit http://gerrit.cloudera.org:8080/10847
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Id20da9e30440b7348557beccb8a0da14775fcc29
Gerrit-Change-Number: 10847
Gerrit-PatchSet: 6
Gerrit-Owner: Pranay Singh
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Pranay Singh
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 06 Aug 2018 17:59:14 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6709: Simplify tests that copy local files to tables

2018-08-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11127 )

Change subject: IMPALA-6709: Simplify tests that copy local files to tables
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11127/1/tests/common/test_parquet.py
File tests/common/test_parquet.py:

http://gerrit.cloudera.org:8080/#/c/11127/1/tests/common/test_parquet.py@31
PS1, Line 31:   hdfs_file = 
'/test-warehouse/{0}.db/{1}'.format(unique_database, filename)
This is the nittiest of nits, but I am learning python:
Do we use both '%' and 'format' ?
Is it good style to mix and match them in the same file?



--
To view, visit http://gerrit.cloudera.org:8080/11127
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ie00a4561825facf8abe2e8e74a6b6e93194f416f
Gerrit-Change-Number: 11127
Gerrit-PatchSet: 1
Gerrit-Owner: Zoltan Borok-Nagy 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Mon, 06 Aug 2018 17:42:41 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-04 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11387


Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..

IMPALA-6568 add missing Query Compilation section to profiles.

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

ALTERNATIVES:

To fix this I set the Timeline in TExecRequest before returning it.

>result.setTimeline(timeline.toThrift());
>return result;

I considered writing this as

>return result.setTimeline(timeline.toThrift());

To try to make sure this code gets copied in any future return statements,
but I didn’t think it was worth the cost to readability.

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" timeline appears in the profile for various queries designed
to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 46 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/1
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950
PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning 
finished");
Looks like both FE and BE set "Planning Finished" in the timeline. Is that 
wrong?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 2
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: WIP: IMPALA-6568 add missing Query Compilation section to 
profiles.
..

WIP: IMPALA-6568 add missing Query Compilation section to profiles.

WIP: so Todd can look at refactor
TODO: change tests to check that "Planning finished" is present for explain 
queries

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

ALTERNATIVES:

To fix this I set the Timeline in TExecRequest before returning it.

>result.setTimeline(timeline.toThrift());
>return result;

I considered writing this as

>return result.setTimeline(timeline.toThrift());

To try to make sure this code gets copied in any future return statements,
but I didn’t think it was worth the cost to readability.

REFACTOR

I refactored the main createExecRequest() method to try to make the flow 
clearer.

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" timeline appears in the profile for various queries designed
to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 134 insertions(+), 56 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/2
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..

IMPALA-6568 add missing Query Compilation section to profiles.

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

I refactored the main createExecRequest() method to
- try to make the flow clearer,
- allow the timeline to be set in the TExecRequest in only one place.
- to set "Planning finished" in all timelines

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" and "Planning finished" timelines appear in the profile for
various queries designed to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 147 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/3
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167
PS1, Line 1167: timeline.markEvent("Planning finished");
> hmm, but even with the earlier 'planning finished', statements like LOAD DA
OK, thanks, I think I see what you mean now.



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:47 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1065
PS3, Line 1065:   if (!analysisResult.isCreateTableAsSelectStmt()) {
  : return result;
  :   }
> I personally think it's weird, but the impala style is to use single-line i
I tried this but I don't like it. I think its more readable how it is. I don't 
want to get into a code  style war for my first (or second) change... so I will 
change this if you think it is best, but this doesn't look inconsistent with 
existing code.


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1134
PS3, Line 1134:   private void setMtDopForCatalogOp(AnalysisResult 
analysisResult,
> can be static? same with several of the extracted functions below
Done


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1150
PS3, Line 1150: createTExecRequest
> maybe rename this to 'createBaseExecRequest' or something? Otherwise it's s
Done


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1164
PS3, Line 1164:   AnalysisResult 
analysisResult,
> it seems this is just used to get the insertStmt, and the insertStmt is alr
Done


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1185
PS3, Line 1185: void addQueryMetadata
> it seems that 'result' is only used as an argument here for storing the fin
Done



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 18:45:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73
PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal 
profile for the
spelling: thread-local


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
> Ya I think we should have some guard that prevents nested scopes. I doubt i
I see now. I think maybe all that is needed is more comments, perhaps in 
createNewWIthScope(), explaining the existence of a stack of FrontendProfiles. 
Plus some exercise of the nestable  scopes in the forthcoming unit test



-- 
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 16:13:27 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile

2018-09-06 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11388 )

Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
..


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java
File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189
PS3, Line 189:   private static final String DB_LIST_STATS_CATEGORY = 
"DatabaseList";
Is it worth having some sort of enum for these keys? Maybe it will make the 
code more untidy?


http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java
File fe/src/main/java/org/apache/impala/service/FrontendProfile.java:

http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151
PS3, Line 151:   oldThreadLocalValue_ = THREAD_LOCAL.get();
Can you explain what is going on here? Is there a case where 
oldThreadLocalValue_ is not null?



--
To view, visit http://gerrit.cloudera.org:8080/11388
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276
Gerrit-Change-Number: 11388
Gerrit-PatchSet: 3
Gerrit-Owner: Todd Lipcon 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..

IMPALA-6568 add missing Query Compilation section to profiles.

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

I refactored the main createExecRequest() method to
- try to make the flow clearer,
- allow the timeline to be set in the TExecRequest in only one place.
- to set "Planning finished" in all timelines

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" and "Planning finished" timelines appear in the profile for
various queries designed to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 146 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/4
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java
File fe/src/main/java/org/apache/impala/service/Frontend.java:

http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@a1069
PS3, Line 1069:
> so was this the main culprit (early return) for omitting the timeline for a
Yes, the early returns were not setting the timeline correctly, which is the 
source of the bug.
Do you want me to split out the cleanup into a separate jira so it can be 
better reviewed?


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1034
PS3, Line 1034: t
> nit: uppercase to getTExecRequest
Done


http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1035
PS3, Line 1035:StringBuilder 
explainString)
> indentation here should be 4 spaces from the beginning of the line
I think you mean 6 spaces from the beginning of the line?
So 4 spaces indented from the method's indentation?
I see I was doing this wrong everywhere so I fixed the other cases too.


http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py@221
PS3, Line 221:   def test_query_profile_contains_all_events(self):
> use unique_database fixture here so that the various objects here are scope
Thanks for making me learn about pytest :-)


http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py@224
PS3, Line 224: self.hdfs_client.delete_file_dir('tmp/impala6568')
 : se
> would it simplify things here if you used an existing file?
I have tidied this up so it is more consistent with other tests



--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 07 Sep 2018 23:40:06 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.

2018-09-11 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/11387 )

Change subject: IMPALA-6568 add missing Query Compilation section to profiles.
..

IMPALA-6568 add missing Query Compilation section to profiles.

The profile command is used to display low-level information about the
most recent query. When a client makes request for the Profile, it
sends a GetRuntimeProfile request for the last queryId to to the server.
The queryId is used to find the ClientRequestState, an object which tracks
information about the execution, including Profile data which is stored in
several RuntimeProfile objects. The reply to the GetRuntimeProfile message
contains the Profile, pretty-printed as a string.

When a query is sent to the Front End to be compiled, the TExecRequest
that is returned from createExecRequest() in the JVM contains a Timeline,
which is a named sequence of events with timing information. This Timeline
is added to the Summary Profile in the ClientRequestState.

In the following cases the Front End was not setting the Timeline in the 
TExecRequest:
 - All DDL operations
 - Load data statements
 - Set statements
 - Explain statements
And this meant that the profile would not contain the "Query Compilation" 
timeline.

I refactored the main createExecRequest() method to
- try to make the flow clearer,
- allow the timeline to be set in the TExecRequest in only one place.
- to set "Planning finished" in all timelines

TESTING:

Add a new test to test_observability.py which checks that the "Query
Compilation" and "Planning finished" timelines appear in the profile for
various queries designed to exercise the new code paths in createExecRequest.

Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
---
M fe/src/main/java/org/apache/impala/service/Frontend.java
M tests/query_test/test_observability.py
2 files changed, 144 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/5
--
To view, visit http://gerrit.cloudera.org:8080/11387
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5
Gerrit-Change-Number: 11387
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-15 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 1:

(5 comments)

Another patch should be coming soon

http://gerrit.cloudera.org:8080/#/c/11582/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/1//COMMIT_MSG@31
PS1, Line 31: All unit tests pass
> I assume that you ran the e2e tests, and also that you checked that there i
I did run e2e tests, and now I have also checked that they encode values of bit 
width 1 and 2. Bit width 1 is used when writing definition levels (i.e. when a 
value is null or not). Bit width 2 is used when encoding a dictionary with 3 or 
4 entries.


http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@208
PS1, Line 208: min_repeated_run_length
> mention that it must be a multiple of 8, less than MAX_RUN_LENGTH_BUFFER
Done


http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@209
PS1, Line 209:  which is the best choice
> seems like from your DCHECK(TestInfo::is_test()) that this is the required
Done


http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@242
PS1, Line 242: MinRepeatedRunLength(bit_width)
> should this be min_repeated_run_length_, in case its set to a non-default v
I've changed this function again after thinking it through more carefully, and 
it no longer uses MinRepeatedRunLength() in this way.


http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@380
PS1, Line 380: run_length
> min_repeated_run_length_?
Done



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 15 Oct 2018 23:52:21 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then print the
analyzed query as text in the explain header to aid in diagnosis. This
query text is generated from the analyzed statement tree, so it includes
query rewriting. In addition implicit casts are printed, and numeric
literals are printed with a cast so that their type is visible.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If a query in a .test file produces a diff when run by PlannerTest,
then print the name of the .test file in the output.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=extended;
EXPLAIN_LEVEL set to extended
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Query with implicit casts: SELECT * FROM functional_kudu.alltypestiny WHERE 
CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=2
(etc.)

Note that the initial "Query:" that is printed originates from
impala-shell (suppressed with --quiet), but for neatness I put the new
"Query with implicit casts:" output at the top of the header.

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
The output of some Planner Tests in .test files has been updated to
include the sql with implict casts that is printed when explain_level is
at at least 'extended' level.

QUESTIONS FOR REVIEWERS:

Is "Query with implicit casts:" a good description?
Is the EXPLAIN header the right place for this output?

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M 

[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..


Patch Set 1:

(4 comments)

Thanks for the review. What do you think about:
Is "Query with implicit casts:" a good description?
Is the EXPLAIN header the right place for this output?

http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ParseNode.java
File fe/src/main/java/org/apache/impala/analysis/ParseNode.java:

http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@30
PS1, Line 30:   /**
:* Returns the SQL string corresponding to this node.
:*/
> Can you explain the relation of the two toSql() functions in the comment? I
Thanks fro thinking about this.  This is a case where default parameters would 
help.


http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java:

http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@1
PS1, Line 1: package org.apache.impala.analysis;
> Apache license is missing (also noticed by the jenkins job).
Done


http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java:

http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@344
PS1, Line 344: TestToSqlWithImplicitCasts
> Can you add a test (or maybe a few) to check that nested structures like su
good idea


http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java:

http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@546
PS1, Line 546: errorLog.append("\nTest file: " + testFile + ".test"
 : + "\n");
> nit: fits to one line
Done



--
To view, visit http://gerrit.cloudera.org:8080/11719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
Gerrit-Change-Number: 11719
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Thu, 18 Oct 2018 22:08:31 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-22 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 2:

(7 comments)

Thanks for the reviews

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20
PS2, Line 20: By default RleEncoder will now use run length encoding for runs of
: length 24 for single bit values, and of length 16 for 2 bit wide 
values.
: All other bit widths will use the existing length 8 runs.
> About the solution in general:
I agree with your calculations. I think sometimes min_run_length=24 is better 
at bit width=1
I’m going to use the notation of R16 for a repeated run of length 16, and L16 
for a literal run of length 16.
So your example (bit width 1) is:
repeated lengths:   R24 R16 R24 R16 R24
min_run_length 8 (old)  2   2   2   2   2
min_run_length 24 (new) 2   3   2   3   2
min_run_length 16   2   2   2   2   2

The case where min_run_length=24 is better is where we avoid interrupting a 
literal run
L24 R16 L24 R16 L24
Old solution4   2   4   2   4
min_run_length 24   4   2   3   2   3 (one long literal run)
min_run_length 16   4   2   4   2   4

The case with R8 interrupting a literal:
L24 R8  L24 R8  L24
Old solution4   2   4   2   4
min_run_length 24   4   1   3   1   3 (one long literal run)
min_run_length 16   4   1   3   1   3 (one long literal run)

The cases for bit_width = 2 are similar.

My thinking originally was that the case of a run interrupting a literal would 
be more common than alternating runs.
But I think you are right that I should only change things where there is 
always a win.
I will use min_run_length=16 for bit width 1, and that will be the only case 
where min_run_length != 8.


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@203
PS2, Line 203: the previous run is flushed out
> I think that it would be helpful to update this comment with a brief discus
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@211
PS2, Line 211: mu;tiple
> typo: multiple
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@348
PS2, Line 348: most values
> most MAX_RUN_LENGTH_BUFFER values
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@350
PS2, Line 350: _
> nit: period, here and elsewhere
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@475
PS2, Line 475: // updates num_buffered_values
> This comment does not seam true here.
I think it is true, even when update_indicator_byte = false.
But the comment is bad, should be
// updates num_buffered_values_
(I am not used to that trailing underscore)


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@549
PS2, Line 549: we
> typo
Done



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 22 Oct 2018 23:24:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..


Patch Set 2:

> It seems good to me, but I would ping Greg Rahn in the Jira, maybe
 > he has some ideas.

I have pinged him

 > I have one issue with the current output: as I saw in the .test
 > files, queries are printed in one line by default, which can make
 > complex queries very difficult to read. It would be much nicer to
 > break them at logical points, but I have no idea how to do it
 > easily.

I agree that a clever printing would be nice, but that sounds tricky. I could 
wrap at say the last space before 80 columns with only small effort, but this 
will make the header harder to parse in tests. Should I try this?


-- 
To view, visit http://gerrit.cloudera.org:8080/11719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
Gerrit-Change-Number: 11719
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Fri, 19 Oct 2018 17:04:34 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11719


Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then print the
analyzed query as text in the explain header to aid in diagnosis. This
query text is generated from the analyzed statement tree, so it includes
query rewriting. In addition implicit casts are printed, and numeric
literals are printed with a cast so that their type is visible.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If a query in a .test file produces a diff when run by PlannerTest,
then print the name of the .test file in the output.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=extended;
EXPLAIN_LEVEL set to extended
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Query with implicit casts: SELECT * FROM functional_kudu.alltypestiny WHERE 
CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=2
(etc.)

Note that the initial "Query:" that is printed originates from
impala-shell (suppressed with --quiet), but for neatness I put the new
"Query with implicit casts:" output at the top of the header.

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
The output of some Planner Tests in .test files has been updated to
include the sql with implict casts that is printed when explain_level is
at at least 'extended' level.

QUESTIONS FOR REVIEWERS:

Is "Query with implicit casts:" a good description?
Is the EXPLAIN header the right place for this output?

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M 

[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So a run is encoded
as 2 bytes, but the actual cost of a run of small bit width values in
the middle of a literal is really 3 bytes, as a new indicator is needed
after the run.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 24 for single bit values, and of length 16 for 2 bit wide values.
All other bit widths will use the existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 507 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/2
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-23 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-test.cc@148
PS2, Line 148: Rle
> nit: RleTest
Done, missed this before



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 23 Oct 2018 19:39:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-23 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 495 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/3
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 2:

(5 comments)

Thanks Thomas and Csaba for reviews so far

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20
PS2, Line 20: By default RleEncoder will now use run length encoding for runs of
: length 24 for single bit values, and of length 16 for 2 bit wide 
values.
: All other bit widths will use the existing length 8 runs.
> >This could be avoided by using smaller
I agree that this might be a better way, but I am happy with the simple 
improvement we get from this change.


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@266
PS3, Line 266:  1 byte for the repeated value
> I prefer to phrase it as "never worse than 8". For alternating long repeate
Done


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@267
PS3, Line 267:
> Maybe "better than 16" would be better.
Done


http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@203
PS2, Line 203: the previous run is flushed out
> Done
Done for real this time


http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc@216
PS3, Line 216: c
> Oops, I missed this one:
Done



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 29 Oct 2018 21:14:39 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 500 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/4
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-31 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed sql in the
explain header, this is the rewritten sql, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan these are shown in
the same format.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed as queries rewritten to use
LEFT SEMI JOIN are not legal sql. In addition some space characters may
be removed from the query for prettier display.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B 
thread-reservation=2
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
   predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
   mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=97B cardinality=1
   in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
Add a unit test for the code which wraps text at 60 characters.
The output of some Planner Tests in .test files has been updated to
include the Analyzed sql that is printed when explain_level is
at at least 'extended' level.

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M 

[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-31 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20
PS2, Line 20: length 16 for single bit values. All other bit widths will use the
: existing length 8 runs.
:
> Yes, it looks good, but I think that the worst case (alternating L8 R16) wo
OK


http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@265
PS4, Line 265: return std::max(MinBufferSize(bit_width), literal_max_size);
> I think that is still not always the worst case - the example you wrote for
You are right, and I have updated MaxBufferSize (and added a test) in next patch



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 31 Oct 2018 23:36:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-31 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..


Patch Set 4:

(21 comments)

Thanks for review comments

http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG@51
PS4, Line 51: ""
> This quote seems strange here - is it included in the output?
This is what you get when you run 'impala-shell.sh -B'


http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG@69
PS4, Line 69: 80
> Isn't it 60?
yes


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java
File fe/src/main/java/org/apache/impala/analysis/ParseNode.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@32
PS4, Line 32: sql
> nit: pls use consistent case (SQL, sql, Sql)
Thanks


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@32
PS4, Line 32: @param
> pls omit these javadoc annotations.
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@39
PS4, Line 39:* This should return the same result as calling 
toSql(ToSqlOptions.DEFAULT).
> would be good to do this via a default interface method. pls add a todo for
Clever idea


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java
File fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@25
PS4, Line 25: default, normal
> "normal" makes sense if you know what is currently done. perhaps "original
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@38
PS4, Line 38:   // This does have the consequence that the sql with implict 
casts may not pssibly fail
> nit: spelling
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@38
PS4, Line 38: may not pssibly fail
:   // to parse if resubmitted as
> I found this difficult to understand. Did you mean that Sql produced in thi
Yes, rewritten SQL is already not always valid SQL.
EXISTS queries that are rewritten as semi-joins are not legal SQL.
I clarified the comment slightly


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/TypeDef.java
File fe/src/main/java/org/apache/impala/analysis/TypeDef.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/TypeDef.java@165
PS4, Line 165: return parsedType_.toSql();
> pass down here?
It looks like it should, but the parsedType is not a ParseNode so I did not 
propagate ToSqlOptions to there.


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java
File fe/src/main/java/org/apache/impala/common/PrintUtils.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@110
PS4, Line 110: 32
> what's this?
Just some padding for if lines expand. Maybe this is too ugly.


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@115
PS4, Line 115: exiting
> sp. existing?
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java
File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@424
PS4, Line 424: @pa
> remove
I don't really understand what is wrong with param?
But done.


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@451
PS4, Line 451: // AnalyzesOk(stmt.toSql(true), ctx);
> rm?
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java
File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@750
PS4, Line 750: @para
> remove the comment annotations.
Done


http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@772
PS4, Line 772:   if (line.contains("Analyzed query:")) {
 : builder.append(line).append("\n");
 : inImplictCasts = true;
 :   } else if (inImplictCasts) {
 : // Keep copying the query with implicit casts.
 : // This works because this is the last thing in the 
header.
 : 

[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed sql in the
explain header, this is the rewritten sql, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan these are shown in
the same format.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed as queries rewritten to use
LEFT SEMI JOIN are not legal sql. In addition some space characters may
be removed from the query for prettier display.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B 
thread-reservation=2
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
   predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
   mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=97B cardinality=1
   in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
Add a unit test for the code which wraps text at 80 characters.
The output of some Planner Tests in .test files has been updated to
include the Analyzed sql that is printed when explain_level is
at at least 'extended' level.

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M 

[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..


Patch Set 4:

I added a new patch because rebase was required


--
To view, visit http://gerrit.cloudera.org:8080/11719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
Gerrit-Change-Number: 11719
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 30 Oct 2018 20:34:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 2:

(4 comments)

Thanks Csaba

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20
PS2, Line 20: By default RleEncoder will now use run length encoding for runs of
: length 24 for single bit values, and of length 16 for 2 bit wide 
values.
: All other bit widths will use the existing length 8 runs.
> Ok, the biggest win (50% size for alternating runs of 8 in the 1 bit case)
So you want me to write something like this in the jira?

If you know the structure of the data then better encodings are
possible. For example with bit_width=1, using min_run_length=24 is
better in the case where we avoid interrupting a literal run.

Using the notation of 'RXX' for a repeated run of length XX (so R16 is a
run of lngth 16), and 'LYY' for a literal run of length YY.

 L24 R16 L24 R16 L24
min_run_length 8 4   2   4   2   4
min_run_length 16 (new default)  4   2   4   2   4
min_run_length 244   2   3   2   3 (one long literal run)

So it is possible to optimize by detecting this situation and avoiding
breaking a long literal run for a run of length 16.


http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@250
PS4, Line 250: iter
> Can you simplify this expression? MAX_VALUES_PER_LITERAL_RUN must be divisi
Good idea


http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@264
PS4, Line 264: insertin
> Aren't we double counting the indicator byte here? My assumption is that th
I think this is correct as is but I am changing it to the clearer
literal_max_size = num_runs * (1 + bytes_per_run)
-- 1 is the indicator
-- bytes_per_run is the encoded bytes


http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@265
PS4, Line 265: /  1 byte for the encoded run length
> This seems to assume that a single big literal run is the worsts case - can
I think before this change a long literal was not always the worst case, but 
now it is.



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 30 Oct 2018 22:22:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 499 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/7
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 484 insertions(+), 256 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/6
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-30 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 510 insertions(+), 255 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/5
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-10-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed sql in the
explain header, this is the rewritten sql, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan these are shown in
the same format.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed as queries rewritten to use
LEFT SEMI JOIN are not legal sql. In addition some space characters may
be removed from the query for prettier display.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B 
thread-reservation=2
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
   predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
   mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=97B cardinality=1
   in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
Add a unit test for the code which wraps text at 80 characters.
The output of some Planner Tests in .test files has been updated to
include the Analyzed sql that is printed when explain_level is
at at least 'extended' level.

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M 

[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-11-01 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed sql in the
explain header, this is the rewritten sql, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan these are shown in
the same format.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed as queries rewritten to use
LEFT SEMI JOIN are not legal sql. In addition some space characters may
be removed from the query for prettier display.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B 
thread-reservation=2
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
   predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
   mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=97B cardinality=1
   in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
Add a unit test for the code which wraps text at 60 characters.
The output of some Planner Tests in .test files has been updated to
include the Analyzed sql that is printed when explain_level is
at at least 'extended' level.

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M 

[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-11-01 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..


Patch Set 4:

(1 comment)

I agree that maintenance is a concern.
1) for ParseNode I could remove toSql() from the interface and force all 
callers to use toSql(DEFAULT). This is easy in terms of effort as IntelliJ will 
do it for me, but results in a  lot of code changes, especially as many 
exception messages call toSql(). I did consider doing this originally but 
wanted to make this change as clear as possible. I can do this quickly in a 
follow-up jira. There are few more associated classes in analysis which would 
also be changed.
2) Do you want me to look at changes to toSql() that are not part of the 
ParseNode interface (or associated classes)?
There is org.apache.impala.catalog.Type which has toSql() {return toSql(1)} 
where the argument controls how deeply we show nested Types.
There is org.apache.impala.analysis.TableName.toSql() and 
org.apache.impala.analysis.PlanHint.toSql()
Right now I would propose leaving these as is, but they could also be changed 
if we wanted to never have a naked toSql() call.

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java
File fe/src/main/java/org/apache/impala/common/PrintUtils.java:

http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@110
PS4, Line 110: 32
> ok, just mention it in a brief comment. I prefer such inline constants to h
I removed it.



--
To view, visit http://gerrit.cloudera.org:8080/11719
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
Gerrit-Change-Number: 11719
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Thu, 01 Nov 2018 20:32:36 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-11-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So an encoded
run always has an overhead of at least one byte. This means that for
single bit values, encoding 8 values as a run is inefficient.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
(only) to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 16 for single bit values. All other bit widths will use the
existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All end-to-end and unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the Rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.
The Overflow test is enhanced to be more exhaustive (while still
completing in a second or two).

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 583 insertions(+), 256 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/9
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-11-05 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11582 )

Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h
File be/src/util/rle-encoding.h:

http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h@259
PS8, Line 259: MaxBufferSize
> I am not 100% sure here, but I think that this could be expressed in an eas
OK, interesting. I think this is what you mean (see below).
I'm saying this is not clearer to humans (but it was worth trying)

  static int MaxBufferSize(int bit_width, int num_values) {
int data_bytes;
int overhead;
if (bit_width > 2) {
  // The largest encoded size is for all long literals with no repeated 
runs.
  int bytes_per_run = (bit_width * MAX_VALUES_PER_LITERAL_RUN) / 8;
  int num_runs =
  static_cast(BitUtil::Ceil(num_values, 
MAX_VALUES_PER_LITERAL_RUN));
  data_bytes = bytes_per_run * num_runs;
  overhead = num_runs;
} else {
  data_bytes = static_cast(BitUtil::Ceil(num_values * bit_width, 8));
  if (bit_width == 1) {
// Worst case is we use 4 bytes for every 3 of the input e.g. L8 R16 L8 
R16 L8.
overhead = 1 + static_cast(BitUtil::Ceil(data_bytes, 3));
  } else { // bit_width == 2
// Worst case is we use 3 bytes for every 2 of the input e.g. L8 R8 L8 
R8 L8.
overhead = 1 + static_cast(BitUtil::Ceil(data_bytes, 2));
  }
}
return std::max(MinBufferSize(bit_width), data_bytes + overhead);
  }


http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h@261
PS8, Line 261: encoded
> Maybe repeated would be better here?
Done


http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc
File be/src/util/rle-test.cc:

http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc@636
PS8, Line 636: buffer_len
> I think that this var could have a more descriptive name like expected_max_
Done


http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc@642
PS8, Line 642: bug
> typo: big :)
:-(



--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 05 Nov 2018 18:45:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.

2018-11-14 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/11719 )

Change subject: IMPALA-5821: Add query with implicit casts to extended explain 
output.
..

IMPALA-5821: Add query with implicit casts to extended explain output.

If explain_level is at 'extended' level or higher, then enhance the
output from the explain command. (1) Show the analyzed sql in the
explain header, this is the rewritten sql, which includes implicit
casts, and literals are printed with a cast so that their type is
visible. (2) When predicates are shown in the plan these are shown in
the same format.

The toSql() method can be called on a ParseNode tree to return
the sql corresponding ot the tree. In the past toSQl() has been
enhanced to print rewritten sql by partially overloading toSql() [with
toSql(boolean)]. This current change requires changing toSQl() in
many places as NumericLiteral can appear at different points in ia
parse tree. To avoid many new fragile overloads of toSql() I added
toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the
form of the Sql that is returned. This changes many files but is safer
and means that any future options to toSql() can be added painlessly.

If SHOW_IMPLICIT_CASTS is passed to toSql() then
- in CastExpr print the implicit cast
- in NumericLiteral print the literal with a cast to show the type

Add a PlannerTestOption directive that will force the query text showing
implicit casts to be included in the PLAN section of a .test file.

The analyzed query text is wrapped at 80 characters. Note that the
analyzed query cannot always be executed as queries rewritten to use
LEFT SEMI JOIN are not legal sql. In addition some space characters may
be removed from the query for prettier display.

Documentation of this change will be done as IMPALA-7718

EXAMPLE OUTPUT:

[localhost:21000] default> set explain_level=2;
EXPLAIN_LEVEL set to 2
[localhost:21000] default> explain select * from functional_kudu.alltypestiny 
where bigint_col < 1000 / 100;
Query: explain select * from functional_kudu.alltypestiny where bigint_col < 
1000 / 100
Max Per-Host Resource Reservation: Memory=0B Threads=2
Per-Host Resource Estimates: Memory=10MB
Codegen disabled by planner
Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col
AS DOUBLE) < CAST(10 AS DOUBLE)
""
F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1
|  Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B 
thread-reservation=2
PLAN-ROOT SINK
|  mem-estimate=0B mem-reservation=0B thread-reservation=0
|
00:SCAN KUDU [functional_kudu.alltypestiny]
   predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE)
   mem-estimate=4.88MB mem-reservation=0B thread-reservation=1
   tuple-ids=0 row-size=97B cardinality=1
   in pipelines: 00(GETNEXT)
Fetched 16 row(s) in 0.03s

TESTING:

All end-to-end tests pass.
Added a new test in ExprRewriterTest which prints sql with implict casts
for some interesting queries.
Add a unit test for the code which wraps text at 60 characters.
The output of some Planner Tests in .test files has been updated to
include the Analyzed sql that is printed when explain_level is
at at least 'extended' level.

Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M 

[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths

2018-10-04 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11582


Change subject: IMPALA-6658: improve Parquet RLE for low bit widths
..

IMPALA-6658: improve Parquet RLE for low bit widths

RleEncoder buffers values in its own cache to detect run lengths that
can be efficiently encoded. When a run is detected it is written with an
indicator byte which encodes the length of the run. So a run is encoded
as 2 bytes, but the actual cost of a run of small bit width values in
the middle of a literal is really 3 bytes, as a new indicator is needed
after the run.

Change RleEncoder to have the ability to use run lengths other than 8.
A new parameter to the constructor (min_run_length) allows test callers
to set the minimum run length.

By default RleEncoder will now use run length encoding for runs of
length 24 for single bit values, and of length 16 for 2 bit wide values.
All other bit widths will use the existing length 8 runs.

Internally RleEncoder must buffer more values so that the longer runs
can be detected. The internal buffer “buffered_values_” is larger
and is now a circular buffer so that the first 8 bytes of the buffer can
be separately flushed to BitWriter.

Testing:

All unit tests pass

The unit test rle-test is enhanced to run all tests against RleEncoders
using all possible values of min_run_length. In Addition, rle-test is
refactored so that the rle tests are in a class that inherits from
::testing::Test so that a SetUp() method can be used.

Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
---
M be/src/util/rle-encoding.h
M be/src/util/rle-test.cc
2 files changed, 497 insertions(+), 254 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/1
--
To view, visit http://gerrit.cloudera.org:8080/11582
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76
Gerrit-Change-Number: 11582
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.

2018-09-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11459 )

Change subject: IMPALA-589: Add sql function returning the impalad coordinator 
hostname.
..

IMPALA-589: Add sql function returning the impalad coordinator hostname.

In every execution of an Impala query, one of the impalad daemons acts
as the coordinator node. In some cases, such as when using a proxy, a
user cannot predict which host will act as the coordinator. To aid in
diagnosis, we provide a sql function which returns the name of the host
on which the coordinator is running.

EXTERNAL DESCRIPTION:

Add a builtin function called coordinator(), which returns the name of
the host which is running the impalad that is acting as the coordinator
for the current query.

IMPLEMENTATION:

All functions are passed a FunctionContext object which is the interface
to the rest of the system for a UDF. From the FunctionContext we get
the TQueryCtx (query context) which contains a TNetworkAddress for
the coordinator. The hostname of the coordinator is copied from the
TNetworkAddress.

Can the TNetworkAddress for the coordinator be unitialized? In the
thrift source the coord_address is marked optional, but my reading of
the code says this is always set in a real impalad. To future-proof the
code a null StringVal is returned if coord_address is unset.

TESTING:
- Added a basic unit test for the new function.
- Added a unit test which simulates the case when coord_address is
  unset.
- Hand tested in a development deployment.
- Ran regression tests and got a clean run.

LIMITATIONS:

This change only adds the coordinator() function.
It does not add the debug_url() function that was mentioned in the
comments on the original Jira.

Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M docs/topics/impala_misc_functions.xml
5 files changed, 60 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/2
--
To view, visit http://gerrit.cloudera.org:8080/11459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
Gerrit-Change-Number: 11459
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.

2018-09-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11459 )

Change subject: IMPALA-589: Add sql function returning the impalad coordinator 
hostname.
..


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11459/1/be/src/exprs/expr-test.cc
File be/src/exprs/expr-test.cc:

http://gerrit.cloudera.org:8080/#/c/11459/1/be/src/exprs/expr-test.cc@4961
PS1, Line 4961:   ObjectPool pool;
> This does not seem to be used.
Thanks



--
To view, visit http://gerrit.cloudera.org:8080/11459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
Gerrit-Change-Number: 11459
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 20 Sep 2018 00:38:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.

2018-09-24 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11459 )

Change subject: IMPALA-589: Add sql function returning the impalad coordinator 
hostname.
..

IMPALA-589: Add sql function returning the impalad coordinator hostname.

In every execution of an Impala query, one of the impalad daemons acts
as the coordinator node. In some cases, such as when using a proxy, a
user cannot predict which host will act as the coordinator. To aid in
diagnosis, we provide a sql function which returns the name of the host
on which the coordinator is running.

EXTERNAL DESCRIPTION:

Add a builtin function called coordinator(), which returns the name of
the host which is running the impalad that is acting as the coordinator
for the current query.

TESTING:
- Added a basic unit test for the new function.
- Added a unit test which simulates the case when coord_address is
  unset.
- Added a query that uses coordinator() to exprs.test
- Hand tested in a development deployment.
- Ran regression tests and got a clean run.

Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M docs/topics/impala_misc_functions.xml
M testdata/workloads/functional-query/queries/QueryTest/exprs.test
6 files changed, 66 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/3
--
To view, visit http://gerrit.cloudera.org:8080/11459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
Gerrit-Change-Number: 11459
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.

2018-09-24 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11459 )

Change subject: IMPALA-589: Add sql function returning the impalad coordinator 
hostname.
..


Patch Set 2:

(1 comment)

Thanks for making me look at the *.test files

http://gerrit.cloudera.org:8080/#/c/11459/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11459/2//COMMIT_MSG@7
PS2, Line 7: IMPALA-589: Add sql function returning the impalad coordinator 
hostname.
> I appreciate your thoroughness, but we generally make our commit messages a
Done



--
To view, visit http://gerrit.cloudera.org:8080/11459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
Gerrit-Change-Number: 11459
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 24 Sep 2018 16:09:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..

IMPALA-7801: Remove toSql() from ParseNode interface.

In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode,
to allow options to be passed when generating SQL from a parse tree.
This change was suggested as part of the review of IMPALA-5821.

The old "toSql()" method is currently implemented everywhere by calling
toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention
could be accidentally ignored, To make the code more maintainable,
remove the old "toSql()" method and have all callers call the new method
instead.

In addition, similarly replace "toSql()" in a few more associated
classes: AnalyticWindow, LimitElement, Boundary, OrderByElement,
SelectListItem.

TESTING:

No new tests are added as there are no functional changes.
All end to end tests run clean.

Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionParam.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LimitElement.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/OrderByElement.java
M fe/src/main/java/org/apache/impala/analysis/ParseNode.java
M fe/src/main/java/org/apache/impala/analysis/PartitionDef.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpecBase.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/RangePartition.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java
M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TypeDef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M 

[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11942


Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..

IMPALA-7801: Remove toSql() from ParseNode interface.

In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode,
to allow options to be passed when generating SQL from a parse tree.
This change was suggested as part of the review of IMPALA-5821.

The old "toSql()" method is currently implemented everywhere by calling
toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention
could be accidentally ignored, To make the code more maintainable,
remove the old "toSql()" method and have all callers call the new method
instead.

In addition, similarly replace "toSql()" in a few more associated
classes: AnalyticWindow, LimitElement, Boundary, OrderByElement,
SelectListItem.

TESTING:

No new tests are added as there are no functional changes.
All end to end tests run clean.

Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/KuduPartitionParam.java
M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java
M fe/src/main/java/org/apache/impala/analysis/LimitElement.java
M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java
M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java
M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java
M fe/src/main/java/org/apache/impala/analysis/OrderByElement.java
M fe/src/main/java/org/apache/impala/analysis/ParseNode.java
M fe/src/main/java/org/apache/impala/analysis/PartitionDef.java
M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java
M fe/src/main/java/org/apache/impala/analysis/PartitionSpecBase.java
M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java
M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
M fe/src/main/java/org/apache/impala/analysis/RangePartition.java
M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java
M fe/src/main/java/org/apache/impala/analysis/SlotRef.java
M fe/src/main/java/org/apache/impala/analysis/StatementBase.java
M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java
M fe/src/main/java/org/apache/impala/analysis/Subquery.java
M fe/src/main/java/org/apache/impala/analysis/TableRef.java
M fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java
M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java
M fe/src/main/java/org/apache/impala/analysis/TypeDef.java
M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java
M fe/src/main/java/org/apache/impala/analysis/WithClause.java
M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java
M 

[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12142


Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..

IMPALA-7468: Port CancelQueryFInstances() to KRPC.

When the Coordinator needs to cancel a query (usually because a user has
hit Control-C), it does this by sending a CancelQueryFInstances message
to each fragment instance. This change switches this code to use KRPC.

Add new protobuf definitions for the messages, and remove the old thrift
definitions. Move the server-side implementation of Cancel() from
ImpalaInternalService to ControlService. Rework the scheduler so
that the FInstanceExecParams always contains the KRPC address of the
fragment executors, this address can then be used if a query is to be
cancelled.

For now keep the KRPC calls to CancelQueryFInstances() as synchronous.

While moving the client-side code, remove the fault injection code that
was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and
FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with
--fault_injection_rpc_exception_type=1) as this tickles code in
client-cache.h which is now not used.

TESTING:
  Ran all end-to-end tests.
  No new tests as test_cancellation.py provides good coverage.
  Checked in debugger that DebugAction style fault injection (triggered
  from test_cancellation.py) was working correctly.

Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
---
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M common/protobuf/control_service.proto
M common/thrift/ImpalaInternalService.thrift
12 files changed, 165 insertions(+), 105 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/1
--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/12142 )

Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..

IMPALA-7468: Port CancelQueryFInstances() to KRPC.

When the Coordinator needs to cancel a query (usually because a user has
hit Control-C), it does this by sending a CancelQueryFInstances message
to each fragment instance. This change switches this code to use KRPC.

Add new protobuf definitions for the messages, and remove the old thrift
definitions. Move the server-side implementation of Cancel() from
ImpalaInternalService to ControlService. Rework the scheduler so
that the FInstanceExecParams always contains the KRPC address of the
fragment executors, this address can then be used if a query is to be
cancelled.

For now keep the KRPC calls to CancelQueryFInstances() as synchronous.

While moving the client-side code, remove the fault injection code that
was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and
FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with
--fault_injection_rpc_exception_type=1) as this tickles code in
client-cache.h which is now not used.

TESTING:
  Ran all end-to-end tests.
  No new tests as test_cancellation.py provides good coverage.
  Checked in debugger that DebugAction style fault injection (triggered
  from test_cancellation.py) was working correctly.

Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
---
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M common/protobuf/control_service.proto
M common/thrift/ImpalaInternalService.thrift
12 files changed, 164 insertions(+), 105 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/2
--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 


[Impala-ASF-CR] IMPALA-8047 Support .proto files in .clang-format

2019-01-04 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12165


Change subject: IMPALA-8047 Support .proto files in .clang-format
..

IMPALA-8047 Support .proto files in .clang-format

The .proto file extension is used for the Google Protocol Buffers
language. Impala uses this language to specify the format of messages
used by KRPC. Add support for this language to .clang-format so that we
can have consistent formatting.

The proposed support is:

Language: Proto
BasedOnStyle: Google
ColumnLimit: 90

This produces only a few diffs when run against the existing Impala
code. I’m not proposing to make any changes to .proto files, this is
just to show what clang-format will do. Apart from wrapping comments and
code at 90 chars, the diffs are mostly of the form

-syntax="proto2";
+syntax = "proto2";

-  message Certificate {};
+  message Certificate {
+  };

-  optional bool client_timeout_defined = 4 [ default = false ];
+  optional bool client_timeout_defined = 4 [default = false];

-UNKNOWN= 999;
-NEGOTIATE  = 1;
-SASL_SUCCESS   = 0;
-SASL_INITIATE  = 2;
+UNKNOWN = 999;
+NEGOTIATE = 1;
+SASL_SUCCESS = 0;
+SASL_INITIATE = 2;

This last change can be configured using “AlignConsecutiveAssignments:
true” but that creates a different set of diffs.

Change-Id: I0c2dcfc21fc8f9206adb64166fbd05580516791f
---
M .clang-format
1 file changed, 4 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12165/2
--
To view, visit http://gerrit.cloudera.org:8080/12165
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0c2dcfc21fc8f9206adb64166fbd05580516791f
Gerrit-Change-Number: 12165
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12142 )

Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..


Patch Set 1:

(6 comments)

Thanks for the helpful review

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h@294
PS1, Line 294:   Status DoCancelQueryFInstancesRrpcWithRetry(
> Nit: The prevailing style seems to place reference ('&') and pointer ('*')
Up to now I have just been using what the clang-format rules dictate. In 
clang-format terms I think you are saying we should use 
'DerivePointerAlignment: true' (which means: "analyze the formatted file for 
the most common alignment of & and *. Pointer and reference alignment styles 
are going to be updated according to the preferences found in the file. 
PointerAlignment is then used only as fallback"), is that the standard practice 
for Impala?


http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/query-schedule.h@104
PS1, Line 104:   const TNetworkAddress& krpc_host, int 
per_fragment_instance_idx,
> Nit: I wonder if there are trailing whitespaces in these lines? I don't und
There is no trailing whitespace I could find.
I think this formatting is the result of the clang-format parameter: 
ConstructorInitializerAllOnOneLineOrOnePerLine: true
which is splitting the initializers to one-per-line.


http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.h
File be/src/scheduling/scheduler.h:

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.h@433
PS1, Line 433:   void ComputeFragmentExecParams(const BackendConfig 
executor_config,
> Nit: Please retain existing placement of '&' and '*'.
Thanks for catching my not using a reference for executor_config.


http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.cc
File be/src/scheduling/scheduler.cc:

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.cc@333
PS1, Line 333: ComputeFragmentExecParams(executor_config, plan_exec_info,
> Nit: Trailing spaces?
Yes there is something wrong here and I have reformatted


http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto@163
PS1, Line 163:   optional UniqueIdPB query_id = 1;
> Don't understand how the query id can be optional...
OK you made me think!
I had just copied what other people had done in implementing krpc. But now I 
have investigated further and found that many people argue that having required 
fields is a bad idea, and the syntax is even removed in future protobuf 
implementations, see for example 
https://stackoverflow.com/questions/31801257/why-required-and-optional-is-removed-in-protocol-buffers-3
Overall this makes sense to me, so I think optional is OK here, are you 
persuaded?


http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto@179
PS1, Line 179:   // Cancellation is asynchronous.
> Commit message says cancellation is synchronous. Do you want to change this
When the coordinator cancels a query it sends CancelQueryFInstancesRequestPB 
messages to the hosts executing fragment instances for the query. These calls 
are synchronous, that is the coordinator waits for a response to each 
CancelQueryFInstances call before making the next one. This is what the commit 
message means by 'For now keep the KRPC calls to CancelQueryFInstances() as 
synchronous'. When a cancellation occurs on an impalad executing a fragment 
instance then the cancellation is implemented by setting flags saying the query 
is cancelled in various data structures, and by waking various sleeping 
threads. This should be enough to stop the query executing, but the 
CancelQueryFInstances call returns without doing any checking that all query 
execution has stopped. So in this sense cancel is asynchronous. I updated the 
comment to make this clearer.



--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Thu, 03 Jan 2019 00:33:30 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12142 )

Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..

IMPALA-7468: Port CancelQueryFInstances() to KRPC.

When the Coordinator needs to cancel a query (usually because a user has
hit Control-C), it does this by sending a CancelQueryFInstances message
to each fragment instance. This change switches this code to use KRPC.

Add new protobuf definitions for the messages, and remove the old thrift
definitions. Move the server-side implementation of Cancel() from
ImpalaInternalService to ControlService. Rework the scheduler so
that the FInstanceExecParams always contains the KRPC address of the
fragment executors, this address can then be used if a query is to be
cancelled.

For now keep the KRPC calls to CancelQueryFInstances() as synchronous.

While moving the client-side code, remove the fault injection code that
was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and
FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with
--fault_injection_rpc_exception_type=1) as this tickles code in
client-cache.h which is now not used.

TESTING:
  Ran all end-to-end tests.
  No new tests as test_cancellation.py provides good coverage.
  Checked in debugger that DebugAction style fault injection (triggered
  from test_cancellation.py) was working correctly.

Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
---
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M common/protobuf/control_service.proto
M common/thrift/ImpalaInternalService.thrift
12 files changed, 158 insertions(+), 111 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/3
--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.

2019-01-23 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12260


Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC.
..

IMPALA-7985: Port RemoteShutdown() to KRPC.

The :shutdown command is used to shutdown a remote server. The common
case is that a user specifies the impalad to shutdown by specifying a
host e.g. :shutdown('host100'). If a user has more than one impalad on a
remote host then the form :shutdown(':') can be used to
specify the port by which the imapald can be contacted. Prior to
IMPALA-7985 this port was the backend port, e.g.
:shutdown('host100:22000'). With IMPALA-7985 the port to use is the KRPC
port, e.g. :shutdown('host100:27000').

Shutdown is implemented by making an rpc call to the target impalad.
This changes the implementation of this call to use KRPC.

To aid the user in finding the KRPC port, the KRPC address is added to
the /backends section of the debug web page.

We attempt to detect the case where :shutdown is pointed at a thrift
port (like the backend port) and print an informative message.

Documentation of this change will be done in IMPALA-8098.

For discussion of why it was chosen to implement this change in an
incompatible way, see comments in
https://issues.apache.org/jira/browse/IMPALA-7985.

TESTING

Ran all end-to-end tests.
Add a test for /backends to test_web_pages.py.
In test_restart_services.py add a call to the old backend port to the
test. Some expected error messages were changed in line with what KRPC
returns.

Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
---
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/service/impala-http-handler.cc
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M common/protobuf/control_service.proto
M common/thrift/ImpalaInternalService.thrift
M tests/custom_cluster/test_restart_services.py
M tests/webserver/test_web_pages.py
M www/backends.tmpl
16 files changed, 219 insertions(+), 144 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/12260/1
--
To view, visit http://gerrit.cloudera.org:8080/12260
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e
Gerrit-Change-Number: 12260
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2018-12-12 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..


Patch Set 3:

I tested performance of the two scalar expressions by running queries on my 
desktop box.

ValidTupleId is faster with the new codegen. To prove this I used a query with 
many 'sum(distinct column_name)' clauses. In this case the new code runs 2% 
faster. I think this would be undetectable in any normal query.

I ran many queries trying to analyze whether IsNotEmptyPredicate is faster with 
the new codegen. In most queries using the predicate there was no detectable 
difference. I was able to construct a pathological query with 100 union all 
expressions, each of which read from a nested type.  With this query runtime 
was 5% slower with the new codegen. I don't think this would be detectable in 
any normal query. Arguably we should still use this new codegen'd version as it 
fits with our vision of removing all calls to GetCodegendComputeFnWrapper().


--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Thu, 13 Dec 2018 00:21:19 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7183: Include remote host when logging unknown execution status.

2018-12-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12102


Change subject: IMPALA-7183: Include remote host when logging unknown execution 
status.
..

IMPALA-7183: Include remote host when logging unknown execution status.

ControlService::ReportExecStatus reports the execution status of
a fragment to the coordinator. When the coordinator receives a
ReportExecStatus call, it looks up the ClientRequestState for the query
id. If this cannot be found, usually because the query is in the process
of being canceled, then it logs a message before responding to the
RPC. Add the remote host name to this message. We try to use the DNS
reverse-lookup name, if that fails we use the IP address instead.

Example output: "ReportExecStatus(): Received report for unknown query
ID (probably closed or cancelled): 9a437a10d83cfd86:3c961136
remote host=localhost"

TESTING
 Ran end-to-end tests.
 Forced code to execute by hand to check output is correct.

Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422
---
M be/src/service/control-service.cc
1 file changed, 9 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/12102/1
--
To view, visit http://gerrit.cloudera.org:8080/12102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422
Gerrit-Change-Number: 12102
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2018-12-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen
was done by calling ScalarExpr::GetCodegendComputeFnWrapper which
generates a static method that calls the scalar expression evaluation
method. Make this more efficient by using cross-compiled code which is
customized at codegen time.

Add cross-compiled files is-not-empty-predicate-ir.cc and
valid-tuple-id-ir.cc which contain the scalar expression evaluation
methods. At Codegen time generate a static method that calls the
cross-compiled methods. This is done using a new common piece of code in
ScalarExpr::GetCodegendCrossCompiledComputeFnWrapper.

So the improvement in this change is that the evaluation method
code is cross-compiled to ir.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper()
in TupleIsNullPredicate. In the current Impala code this method
is never called. This is because TupleIsNullPredicate is always
wrapped in an IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
 Correct the link to llvm tutorial in LlvmCodegen.
 Make a method private in TupleIsNullPredicate.java.

PERFORMANCE:
 I tested performance on a local mini-cluster. I wrote some
 pathological queries to test the new code. The new codegen'd code is
 very similar in performance. ValidTupleIdExpr seems a bit faster, but
 IsNotEmptyPredicate  seems a little slower. Overall these changes are
 not purely for performance  but to move away from
 GetCodegendComputeFnWrapper.

TESTING:
 The changed scalar expressions are well exercised by current tests.
 Ran exhaustive end-to-end tests (pending).

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/scalar-expr.cc
M be/src/exprs/scalar-expr.h
M be/src/exprs/slot-ref.cc
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
14 files changed, 156 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/4
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7183: Include remote host when logging unknown execution status.

2018-12-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/12102 )

Change subject: IMPALA-7183: Include remote host when logging unknown execution 
status.
..

IMPALA-7183: Include remote host when logging unknown execution status.

ControlService::ReportExecStatus reports the execution status of
a fragment to the coordinator. When the coordinator receives a
ReportExecStatus call, it looks up the ClientRequestState for the query
id. If this cannot be found, usually because the query is in the process
of being canceled, then it logs a message before responding to the
RPC. Add the IP address and port of the remote host to this message.

Example output (from minicluster): "ReportExecStatus(): Received report
for unknown query ID (probably closed or cancelled):
174dbb16aa00a5c2:382e5cc8 remote host=127.0.0.1:47112"

TESTING
 Ran end-to-end tests.
 Forced code to execute by hand to check output is correct.

Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422
---
M be/src/service/control-service.cc
1 file changed, 3 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/12102/2
--
To view, visit http://gerrit.cloudera.org:8080/12102
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422
Gerrit-Change-Number: 12102
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2018-12-11 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen
was done by calling ScalarExpr::GetCodegendComputeFnWrapper which
generates a static method that calls the scalar expression evaluation
methods. Make this more efficient by using cross-compiled code which is
customized at codegen time.

Add cross-compiled files is-not-empty-predicate-ir.cc and
valid-tuple-id-ir.cc These files contain the methods that will be
specialized during code generation These methods have to be static
to match the expected method signatures Refactor the existing scalar
methods so that they call a new internal static method which contains
the implementation of the scalar expression This static method is called
from both the old member function and the new cross-compiled method.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper()
in TupleIsNullPredicate. In the current Impala code this method
is never called. This is because TupleIsNullPredicate is always
wrapped in an IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
 Correct the link to llvm tutorial in LlvmCodegen.
 Make a method private in TupleIsNullPredicate.java.

TESTING:
 The changed scalar expressions are well exercised by current tests.
 Ran end-to-end tests.

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/slot-ref.cc
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
12 files changed, 205 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/2
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2018-12-11 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/12068


Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen
was done by calling ScalarExpr::GetCodegendComputeFnWrapper which
generates a static method that calls the scalar expression evaluation
methods. Make this more efficient by using cross-compiled code which is
customized at codegen time.

Add cross-compiled files is-not-empty-predicate-ir.cc and
valid-tuple-id-ir.cc These files contain the methods that will be
specialized during code generation These methods have to be static
to match the expected method signatures Refactor the existing scalar
methods so that they call a new internal static method which contains
the implementation of the scalar expression This static method is called
from both the old member function and the new cross-compiled method.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper()
in TupleIsNullPredicate. In the current Impala code this method
is never called. This is because TupleIsNullPredicate is always
wrapped in an IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
 Correct the link to llvm tutorial in LlvmCodegen.
 Make a method private in TupleIsNullPredicate.java.

TESTING:
 The changed scalar expressions are well exercised by current tests.
 Ran end-to-end tests.

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/slot-ref.cc
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
12 files changed, 205 insertions(+), 41 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/1
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2018-12-11 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen
was done by calling ScalarExpr::GetCodegendComputeFnWrapper which
generates a static method that calls the scalar expression evaluation
methods. Make this more efficient by using cross-compiled code which is
customized at codegen time.

Add cross-compiled files is-not-empty-predicate-ir.cc and
valid-tuple-id-ir.cc These files contain the methods that will be
specialized during code generation These methods have to be static
to match the expected method signatures Refactor the existing scalar
methods so that they call a new internal static method which contains
the implementation of the scalar expression This static method is called
from both the old member function and the new cross-compiled method.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper()
in TupleIsNullPredicate. In the current Impala code this method
is never called. This is because TupleIsNullPredicate is always
wrapped in an IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
 Correct the link to llvm tutorial in LlvmCodegen.
 Make a method private in TupleIsNullPredicate.java.

TESTING:
 The changed scalar expressions are well exercised by current tests.
 Ran end-to-end tests.

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
M be/src/exprs/slot-ref.cc
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java
12 files changed, 206 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/3
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6924: Add child queries to profile in compute stats

2018-11-28 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11977 )

Change subject: IMPALA-6924: Add child queries to profile in compute stats
..


Patch Set 2:

(2 comments)

Looks like a useful change

http://gerrit.cloudera.org:8080/#/c/11977/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11977/2//COMMIT_MSG@9
PS2, Line 9: COMPUTE STATS triggers to child queries which do the actual stats
Should this be "triggers some child queries"?
Or "triggers two child queries"?


http://gerrit.cloudera.org:8080/#/c/11977/2/be/src/service/child-query.h
File be/src/service/child-query.h:

http://gerrit.cloudera.org:8080/#/c/11977/2/be/src/service/child-query.h@129
PS2, Line 129:   RuntimeProfile* profile_;
Do you want a /// comment here?



--
To view, visit http://gerrit.cloudera.org:8080/11977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5006c3b366d381eed4687e550cdfc463be3d1350
Gerrit-Change-Number: 11977
Gerrit-PatchSet: 2
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 29 Nov 2018 01:07:12 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6924: Add child queries to profile in compute stats

2018-12-04 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11977 )

Change subject: IMPALA-6924: Add child queries to profile in compute stats
..


Patch Set 4: Code-Review+1

(5 comments)

This all looks good to me. I have comments that are more like questions so I 
can learn things.

http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/client-request-state.cc
File be/src/service/client-request-state.cc:

http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/client-request-state.cc@572
PS4, Line 572:   
child_queries.push_back(ChildQuery(compute_stats_params.tbl_stats_query, this,
I know you didn't write this line, but I see some people advocating that 
emplace_back is better than push_back as it can avoid a copy?


http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/impala-hs2-server.cc
File be/src/service/impala-hs2-server.cc:

http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/impala-hs2-server.cc@907
PS4, Line 907: return_val.__set_profile(ss.str());
So this covers both the string and base64 cases?


http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile-test.cc
File be/src/util/runtime-profile-test.cc:

http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile-test.cc@93
PS4, Line 93:   EXPECT_TRUE(deserialized_profile->GetCounter("Not there") == 
NULL);
== nullptr


http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile.cc
File be/src/util/runtime-profile.cc:

http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile.cc@888
PS4, Line 888:   uint32_t deserialized_len = result_len;
Would there be any advantage to using a cast here?


http://gerrit.cloudera.org:8080/#/c/11977/4/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/11977/4/tests/query_test/test_observability.py@265
PS4, Line 265: matches = re.findall("Query \(id=.{,33}\)", 
results.runtime_profile)
So 33 is the length of a query id? That made me think.



--
To view, visit http://gerrit.cloudera.org:8080/11977
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5006c3b366d381eed4687e550cdfc463be3d1350
Gerrit-Change-Number: 11977
Gerrit-PatchSet: 4
Gerrit-Owner: Thomas Marshall 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 04 Dec 2018 17:41:49 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has abandoned this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Abandoned

Will let Paul Rogers fix (or at least think about) in a bigger change
--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Patch Set 4:

Thanks Csaba, hope you didn't waste too much time on this


--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 20 Nov 2018 00:04:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..

IMPALA-7801: Remove toSql() from ParseNode interface.

In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode,
to allow options to be passed when generating SQL from a parse tree.
This change was suggested as part of the review of IMPALA-5821.

The old "toSql()" method is currently implemented everywhere by calling
toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention
could be accidentally ignored, To make the code more maintainable,
remove the old "toSql()" method and have all callers call the new method
instead.

In addition, similarly replace "toSql()" in a few more associated
classes: AnalyticWindow, LimitElement, Boundary, OrderByElement,
SelectListItem.

Finally, rename ToSqlOptions to ToSqlOption anbd noive it to be part
of the ParseNode interface, and, in ToSqlOption, rename DEFAULT to
ORIGINAL.

TESTING:

No new tests are added as there are no functional changes.
All end to end tests run clean.

Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokeRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java
M 

[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..

IMPALA-7801: Remove toSql() from ParseNode interface.

In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode,
to allow options to be passed when generating SQL from a parse tree.
This change was suggested as part of the review of IMPALA-5821.

The old "toSql()" method is currently implemented everywhere by calling
toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention
could be accidentally ignored, To make the code more maintainable,
remove the old "toSql()" method and have all callers call the new method
instead.

In addition, similarly replace "toSql()" in a few more associated
classes: AnalyticWindow, LimitElement, Boundary, OrderByElement,
SelectListItem.

Finally, rename ToSqlOptions to ToSqlOption anbd noive it to be part
of the ParseNode interface, and, in ToSqlOption, rename DEFAULT to
ORIGINAL.

TESTING:

No new tests are added as there are no functional changes.
All end to end tests run clean.

Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
---
M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java
M 
fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java
M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java
M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java
M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java
M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java
M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java
M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java
M fe/src/main/java/org/apache/impala/analysis/CastExpr.java
M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java
M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java
M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java
M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java
M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java
M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java
M fe/src/main/java/org/apache/impala/analysis/FromClause.java
M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java
M fe/src/main/java/org/apache/impala/analysis/GrantRevokeRoleStmt.java
M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java
M fe/src/main/java/org/apache/impala/analysis/InPredicate.java
M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java
M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java
M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java
M 

[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Patch Set 2:

(6 comments)

Thanks Paul for the interesting review and discussion

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java
File fe/src/main/java/org/apache/impala/analysis/Expr.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@395
PS2, Line 395:   String sql = toSql(DEFAULT);
> Let's think about the meaning of the options for expressions. Expr nodes wo
I understand the point about Expr nodes but there is no simple solution. The 
ToSqlOption(s) does allow more complex behaviors by having different methods 
that can be called on a ToSqlOption object.

Csaba also suggested having some sort of context object to be passed as well. 
Maybe that is closer to what you want. In this change I am just trying to 
remove the default toSql() methods.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1300
PS2, Line 1300:   name, (printExpr) ? " '" + toSql(DEFAULT) + "'" : "", 
type_.toString()));
> This comment applies in a zillion other places. Our general rule for error
In this change, as requested by Vuk, I am removing the default toSql() method. 
He thought, and I think I agree, that forcing a ToSqlOption object parameter 
makes things more consistent. If we don't want that then maybe I should abandon 
this change.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java
File fe/src/main/java/org/apache/impala/analysis/ParseNode.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@33
PS2, Line 33:   String toSql(ToSqlOptions options);
> A common practice in Java is to include the enum definition in the interfac
Nice suggestion, done.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java
File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@217
PS2, Line 217:   "OFFSET requires an ORDER BY clause: " + 
limitElement_.toSql(DEFAULT).trim());
> I wonder if we can solve a mess here with exceptions also. Rather than buil
Nice idea, but IMHO beyond the scope of this change.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java
File fe/src/main/java/org/apache/impala/planner/SortNode.java:

http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@44
PS2, Line 44: import static org.apache.impala.analysis.ToSqlOptions.DEFAULT;
> Two suggestions.
Rename is done in next patch.
I am not sure I agree about importing the enum (i.e. not a static import of the 
needed enums). I think it is tidier to have just ORIGINAL in the code. As it is 
(now) a common idiom I believe people will get used to reading this correctly.


http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@214
PS2, Line 214: output.append(info_.getSortExprs().get(i).toSql(DEFAULT) 
+ " ");
> This is an example of the point made elsewhere. It SEEMS we are asking the
OK, thanks for explaining this. I will add another followup jira which will say 
something like:

IMPALA-7801 changed the code so that all ParseNode trees can be asked to create 
sql by calling toSql(SqlOption.ORIGINAL) or toSql(SqlOption.REWRITTEN). 
Although an Expr node produces output when toSql(SqlOption.ORIGINAL)  is 
called, it is a lie in that it is not the original sql. The current Expr may 
have replaced a tree of the original Expr objects during query rewrite. The 
replaced objects are gone, so we cannot reconstruct the "original" sql.
A possible solution is to observe that the select-like ParseNodes and Expr 
ParseNodes have different semantics and to change the interface of toSql()  to 
express this.



--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Fri, 16 Nov 2018 23:46:37 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-16 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Patch Set 4:

reviewers should ignore patch set 3 which is very wrong


--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Sat, 17 Nov 2018 00:49:32 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has removed Vuk Ercegovac from this change.  ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Removed reviewer Vuk Ercegovac.
--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.

2018-11-19 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11942 )

Change subject: IMPALA-7801: Remove toSql() from ParseNode interface.
..


Patch Set 4:

I'm not happy with using ORIGINAL as the default argument. This doesn't really 
solve the problems that Paul has identified with the semantics of this 
parameter. Using ORIGINAL really just makes things worse in this regard. So I 
am thinking I will abandon this change unless someone wants the change to use 
an explicit DEFAULT to describe the give-me-whatever-you-have case.


--
To view, visit http://gerrit.cloudera.org:8080/11942
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33
Gerrit-Change-Number: 11942
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Paul Rogers 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Mon, 19 Nov 2018 17:38:24 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

2019-01-09 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/12068 )

Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
..

IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.

These two classes evaluate scalar expressions. Previously codegen was
done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates
a static method that calls the scalar expression evaluation method. Make
this more efficient by generating code which is customized using
information available at codegen time.

Add new cross-compiled files is-not-empty-predicate-ir.cc null-literal-
ir.cc slot-ref-ir.cc  valid-tuple-id-ir.cc.

IsNotEmptyPredicate works by getting a CollectionVal object from the
single child Expr node, and counting its tuples. As preparation for
codegen, split out the logic that counts the tuples to a new cross-
compiled method "IsNotEmpty()". At codegen time we know the type and
value of the child node. Generate a call to a node-specific non-virtual
cross-compiled method to get the CollectionVal object from the child.
Then generate a call to return the result of calling IsNotNull() on the
CollectionVal object.

A ValidTupleIdExpr node contains a vector of tuple ids. It works by
probing each row for the tuple ids in the vector to find a non-null
tuple. As preparation for codegen, split out the logic that probes the
row for a non-null tuple id into a new cross-compiled method
"IsTupleFromRowNonNull()".  At codegen time we know the vector of tuple
ids. We unroll the loop through the tuple ids, generating code that
calls "IsTupleFromRowNonNull()", and returning the tuple id if/when a
non-null tuple is found.

IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in
TupleIsNullPredicate. In the current Impala code this method is never
called. This is because TupleIsNullPredicate is always wrapped in an
IfExpr. This is always codegen'd by IfExpr's
GetCodegendComputeFnWrapper() method. There is a separate Jira
IMPALA-7655 to improve codegen of IfExpr.

Minor corrections:
  Correct the link to llvm tutorial in LlvmCodegen.

PERFORMANCE:
  I tested performance on a local mini-cluster. I wrote some
  pathological queries to test the new code. The new codegen'd code is
  very similar in performance. ValidTupleIdExpr seems a bit faster, but
  IsNotEmptyPredicate is only very slightly faster or identical.
  Overall these changes are not purely for performance  but to move
  away from GetCodegendComputeFnWrapper.

TESTING:
  The changed scalar expressions are well exercised by current tests.
  Ran exhaustive end-to-end tests.

Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
---
M be/src/codegen/gen_ir_descriptions.py
M be/src/codegen/impala-ir.cc
M be/src/codegen/llvm-codegen.h
M be/src/exprs/CMakeLists.txt
A be/src/exprs/is-not-empty-predicate-ir.cc
M be/src/exprs/is-not-empty-predicate.cc
M be/src/exprs/is-not-empty-predicate.h
A be/src/exprs/null-literal-ir.cc
M be/src/exprs/null-literal.cc
M be/src/exprs/null-literal.h
M be/src/exprs/scalar-expr.h
A be/src/exprs/slot-ref-ir.cc
M be/src/exprs/slot-ref.cc
M be/src/exprs/slot-ref.h
A be/src/exprs/valid-tuple-id-ir.cc
M be/src/exprs/valid-tuple-id.cc
M be/src/exprs/valid-tuple-id.h
17 files changed, 378 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/5
--
To view, visit http://gerrit.cloudera.org:8080/12068
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3
Gerrit-Change-Number: 12068
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12142 )

Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..


Patch Set 3:

(11 comments)

Thanks for the code reviews

http://gerrit.cloudera.org:8080/#/c/12142/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12142/3//COMMIT_MSG@9
PS3, Line 9: (usually because a user has
   : hit Control-C)
> Not that important, but is this really true? Some other cancellation situat
Thanks good point, "usually" is definitely wrong.


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.h@239
PS3, Line 239: Execution backend.
> Thrift execution backend?
Done


http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h@294
PS1, Line 294:   Status DoCancelQueryFInstancesRrpcWithRetry(
> I believe our current clang-format rules will tell you to put the symbols n
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc@427
PS3, Line 427: auto
> We generally avoid the use of auto as it hurts readability.
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc@440
PS3, Line 440: DoCancelQueryFInstancesRrpcWithRetry
> Is there a reasonable way to make this generic like DoRpcWithRetry()? We'll
I'm doing retry using a lambda


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/scheduling/query-schedule.h
File be/src/scheduling/query-schedule.h:

http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/scheduling/query-schedule.h@88
PS3, Line 88: execution backend
> Thrift execution backend?
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h
File be/src/service/control-service.h:

http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@41
PS3, Line 41:   public:
> Not your code, but while you're here would you mind fixing the spacing in t
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@63
PS3, Line 63:   /// Cancel any executing fragment instances for the query id 
specified in 'req'.
> nit: could you move this under ReportExecStatus() to keep all of the rpc ha
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@64
PS3, Line 64: void
> virtual
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.cc
File be/src/service/control-service.cc:

http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.cc@173
PS3, Line 173: "query"
> Since this is a constant, its not really necessary to Substitute() it.
Done


http://gerrit.cloudera.org:8080/#/c/12142/3/common/protobuf/control_service.proto
File common/protobuf/control_service.proto:

http://gerrit.cloudera.org:8080/#/c/12142/3/common/protobuf/control_service.proto@181
PS3, Line 181:   rpc CancelQueryFInstances(CancelQueryFInstancesRequestPB) 
returns (CancelQueryFInstancesResponsePB);
> Long line.
Done, I added IMPALA-8047 for the clang-format support



--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 07 Jan 2019 18:36:44 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.

2019-01-07 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/12142 )

Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC.
..

IMPALA-7468: Port CancelQueryFInstances() to KRPC.

When the Coordinator needs to cancel a query (for example because a user
has hit Control-C), it does this by sending a CancelQueryFInstances
message to each fragment instance. This change switches this code to use
KRPC.

Add new protobuf definitions for the messages, and remove the old thrift
definitions. Move the server-side implementation of Cancel() from
ImpalaInternalService to ControlService. Rework the scheduler so
that the FInstanceExecParams always contains the KRPC address of the
fragment executors, this address can then be used if a query is to be
cancelled.

For now keep the KRPC calls to CancelQueryFInstances() as synchronous.

While moving the client-side code, remove the fault injection code that
was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and
FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with
--fault_injection_rpc_exception_type=1) as this tickles code in
client-cache.h which is now not used.

TESTING:
  Ran all end-to-end tests.
  No new tests as test_cancellation.py provides good coverage.
  Checked in debugger that DebugAction style fault injection (triggered
  from test_cancellation.py) was working correctly.

Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
---
M be/src/runtime/backend-client.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/scheduling/query-schedule.h
M be/src/scheduling/scheduler.cc
M be/src/scheduling/scheduler.h
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/service/impala-internal-service.cc
M be/src/service/impala-internal-service.h
M common/protobuf/control_service.proto
M common/thrift/ImpalaInternalService.thrift
12 files changed, 192 insertions(+), 139 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/4
--
To view, visit http://gerrit.cloudera.org:8080/12142
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969
Gerrit-Change-Number: 12142
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 


[Impala-ASF-CR] IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes.

2018-09-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11459


Change subject: IMPALA-589: Add a sql function to return the impalad 
coordinator hostname for diagnostic purposes.
..

IMPALA-589: Add a sql function to return the impalad coordinator hostname for 
diagnostic purposes.

In every execution of an Impala query, one of the impalad daemons acts as
the coordinator node. In some cases, such as when using a proxy, a user
cannot predict which host will act as the coordinator. To aid in diagnosis,
we provide a sql function which returns the name of the host on which the
coordinator is running.

EXTERNAL DESCRIPTION:

Add a builtin function called coordinator(), which returns the name of the
host which is running the impalad that is acting as the coordinator for the
current query.

IMPLEMENTATION:

All functions are passed a FunctionContext object which is the interface to the
rest of the system for a UDF.  From the FunctionContext we get the TQueryCtx
(query context) which contains a TNetworkAddress for the coordinator. The
hostname of the coordinator is copied from the TNetworkAddress.

Can the TNetworkAddress for the coordinator be unitialized?  In the thrift
source the coord_address is marked optional, but my reading of the code
says this is always set in a real impalad. To future-proof the code a null
StringVal is returned if coord_address is unset.

TESTING:
- Added a basic unit test for the new function.
- Added a unit test which simulates the case when coord_address is unset.
- Hand tested in a development deployment.
- Ran regression tests and got a clean run.

LIMITATIONS:

This change only adds the coordinator() function.
It does not add the debug_url() function that was mentioned in the comments on
the original Jira.

Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
---
M be/src/exprs/expr-test.cc
M be/src/exprs/utility-functions-ir.cc
M be/src/exprs/utility-functions.h
M common/function-registry/impala_functions.py
M docs/topics/impala_misc_functions.xml
5 files changed, 61 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/1
--
To view, visit http://gerrit.cloudera.org:8080/11459
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47
Gerrit-Change-Number: 11459
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 


[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter

2018-09-17 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11450 )

Change subject: IMPALA-7492: Add support for DATE text parser/formatter
..


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc
File be/src/runtime/date-value.cc:

http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc@89
PS2, Line 89:   }
Do you want to set an informative string if the value is not valid?



--
To view, visit http://gerrit.cloudera.org:8080/11450
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831
Gerrit-Change-Number: 11450
Gerrit-PatchSet: 2
Gerrit-Owner: Attila Jeges 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Csaba Ringhofer 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Zoltan Borok-Nagy 
Gerrit-Comment-Date: Mon, 17 Sep 2018 23:59:19 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7579: use the generic filesystem client to delete and create files in test query profile contains all events.

2018-09-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/11461


Change subject: IMPALA-7579: use the generic filesystem_client to delete and 
create files in test_query_profile_contains_all_events.
..

IMPALA-7579: use the generic filesystem_client to delete and create files in 
test_query_profile_contains_all_events.

This bug was introduced by IMPALA-6568 which added the
test_query_profile_contains_all_events test.  This test creates a file in
the filesystem so that is can be used by 'load data inpath'.  The test was
using the hdfs_client object to do file system operations, but this client
only works with hdfs.  Switch to using the filesystem_client object which
can work on other filesystems, including s3.

Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95
---
M tests/query_test/test_observability.py
1 file changed, 2 insertions(+), 2 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/11461/1
--
To view, visit http://gerrit.cloudera.org:8080/11461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95
Gerrit-Change-Number: 11461
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-7579: fix test query profile contains all events on S3

2018-09-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#2). ( 
http://gerrit.cloudera.org:8080/11461 )

Change subject: IMPALA-7579: fix test_query_profile_contains_all_events on S3
..

IMPALA-7579: fix test_query_profile_contains_all_events on S3

This bug was introduced by IMPALA-6568 which added the
test_query_profile_contains_all_events test. This test creates a file
in the filesystem so that it can be used by 'load data inpath'. The
test was using the hdfs_client object to do file system operations, but
this client only works with hdfs. Switch to using the filesystem_client
object which can work on other filesystems, including s3. Also create
the file that will be moved by 'load data inpath' under the unique
database directory; this means the file can be created without having to
check if it exists already and must be deleted.

Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95
---
M tests/query_test/test_observability.py
1 file changed, 2 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/11461/2
--
To view, visit http://gerrit.cloudera.org:8080/11461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95
Gerrit-Change-Number: 11461
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 


[Impala-ASF-CR] IMPALA-7579: fix test query profile contains all events on S3

2018-09-18 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11461 )

Change subject: IMPALA-7579: fix test_query_profile_contains_all_events on S3
..


Patch Set 1:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-7579: use the generic filesystem_client to delete and 
create files in test_query_profile_contains_all_events.
> nit: keep commit message lines to 72 characters (and don't wrap the title l
Done


http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@7
PS1, Line 7: IMPALA-7579: use the generic filesystem_client to delete and 
create files in test_query_profile_contains_all_events.
> make this line less verbose... something like "fix S3 file ops in test_quer
Done


http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@11
PS1, Line 11: is
> nit: it
Done


http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@11
PS1, Line 11:
> nit: get rid of extra ws
Done


http://gerrit.cloudera.org:8080/#/c/11461/1/tests/query_test/test_observability.py
File tests/query_test/test_observability.py:

http://gerrit.cloudera.org:8080/#/c/11461/1/tests/query_test/test_observability.py@224
PS1, Line 224: tmp
> is this where the db is created? I thought it was under 'test-warehouse'
Thanks for the suggestion



--
To view, visit http://gerrit.cloudera.org:8080/11461
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95
Gerrit-Change-Number: 11461
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Vuk Ercegovac 
Gerrit-Comment-Date: Wed, 19 Sep 2018 00:33:13 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8371: Return appropriate error code for unified backend tests

2019-04-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12885 )

Change subject: IMPALA-8371: Return appropriate error code for unified backend 
tests
..


Patch Set 2: Code-Review+1

(2 comments)

LGTM

http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py
File bin/gen-backend-test-script.py:

http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py@21
PS2, Line 21: # 1: The file location to write the generated script
Aren't these parameters now named (--gtest_filter,--test_script_output) rather 
than positional?


http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py@32
PS2, Line 32: # The script template requires substitutions to set the variables 
used by the body.
Nit: Clearer maybe to say "the script header template".



--
To view, visit http://gerrit.cloudera.org:8080/12885
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia146d026d42f76d5ea12d92798f299182de03eef
Gerrit-Change-Number: 12885
Gerrit-PatchSet: 2
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Joe McDonnell 
Gerrit-Comment-Date: Tue, 02 Apr 2019 16:38:59 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-04-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#8). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. DoRpcWithRetry now only sleeps if the remote
service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M common/protobuf/rpc_test.proto
M tests/custom_cluster/test_rpc_timeout.py
14 files changed, 268 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/8
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-04-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..


Patch Set 5:

(3 comments)

Thanks Michael. Please see the upcoming Patch Set 8.

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@333
PS5, Line 333:   // Call DoRpcWithRetry with no backoff on our busy service.
 :   const int retries = 3;
 :   impala::Status rpc_status = RpcMgr::DoRpcWithRetry(proxy, 
::Ping,
 :   request, , query_ctx, "ping failed", retries, 
MonoDelta::FromSeconds(100));
 :   EXPECT_ERROR(rpc_status, TErrorCode::GENERAL);
 :   EXPECT_STR_CONTAINS(rpc_status.msg().msg(),
 :   "dropped due to backpressure. The service queue contains 1 
items out of a maximum "
 :   "of 1; memory consumption is ");
 :   ASSERT_EQ(overflow_metric->GetValue(), retries);
 :
 :   // Call DoRpcWithRetry, but this time we do backoff on a busy 
service, and this waiting
 :   // allows the outstanding aysnc calls to complete, so that 
then this call will succeed.
 :   impala::Status rpc_status_backoff =
 :   RpcMgr::DoRpcWithRetry(proxy, ::Ping, 
request, ,
 :   query_ctx, "ping failed", 10, 
MonoDelta::FromSeconds(100), 2);
 :   ASSERT_OK(rpc_status_backoff);
 :   ASSERT_GT(overflow_metric->GetValue(), retries);
 :
 :   // Check that async calls did complete.
 :   ASSERT_FALSE(async1_done.pending());
 :   ASSERT_FALSE(async2_done.pending());
> How about the following idea:
I did think about this. So the nice thing about your suggestion (thanks) is 
that we can set up the full queue.
Now we want to have a call to DoRpcWithRetry that will block (because queue is 
full) and then succeed (after retry).
So concurrently with the call to DoRpcWithRetry we need to unblock the queue.
We could start the call to DoRpcWithRetry in a thread, and then sleep for a 
while, and then unblock the queue.
That would work but is functionally equivalent to what we have now in that it 
depends on a sleep call.
So I understand the desire for a completely deterministic test but I still 
don't see a simple way to achieve that. I am happy to listen to more 
suggestions, or explanations of what I am not understanding.


http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h@186
PS7, Line 186: og' fun
> timeout_ms
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@178
PS5, Line 178:   // A no-op method that is used as part of overloading 
DoRpcWithRetry().
 :   static void logging_noop(){};
 :
 :   // The type of the log method passed to DoRpcWithRetry().
 :   typedef boost::function RpcLogFn;
> They still seem to be in PS7
Done



--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 02 Apr 2019 18:17:46 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-04-02 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..


Patch Set 8:

Reviewers please wait for a future Patch Ste with a simplified test


--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 02 Apr 2019 21:15:30 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3

2019-03-25 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@37
PS4, Line 37: # parse command line options
It seems like options to impala-config.sh are currently passed by environment 
variable, for example USE_KUDU_DEBUG_BUILD can be set before sourcing 
impala-config.sh. Is there a particular reason you chose to add arguments to 
impala-config.sh rather than using the existing mechanism?



--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 4
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Mon, 25 Mar 2019 22:52:50 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#3). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. Allow callers to specify a logging function
that is called for rpc failures.

Reviewers should note that the following changes to retry parameters
were made:
 Cancel: add a 3 second sleep if service is busy.
 RemoteShutdown: add a 3 second sleep if service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_timeout.py
13 files changed, 309 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/3
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12846 )

Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3
..


Patch Set 7: Code-Review+1

(5 comments)

a few nits but looks good

http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG@10
PS6, Line 10: the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 
2.1.1.
Nit: to optionally use


http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh
File bin/create-test-configuration.sh:

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh@132
PS6, Line 132:   # Certain configurations (like 
SentrySyncHMSNotificationsPostListener) does not work
s/does not/do not/


http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh
File bin/impala-config.sh:

http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@163
PS6, Line 163: export CDH_BUILD_NUMBER=909265
Nit: I wonder if CDP_BUILD_NUMBER should go here so that the 2 build numbers 
are together


http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@534
PS6, Line 534:   export 
HIVE_HOME="$CDP_COMPONENTS_HOME/apache-hive-${IMPALA_HIVE_VERSION}-bin"
Why in one case is the home apache-hive-xxx and in the other it is hive-xxx ?


http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@748
PS6, Line 748:   echo "CDP_BUILD_NUMBER= $CDP_BUILD_NUMBER"
It could be confusing that you always echo CDH_BUILD_NUMBER but you only echo 
CDP_BUILD_NUMBER when doing a CDP build.



--
To view, visit http://gerrit.cloudera.org:8080/12846
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605
Gerrit-Change-Number: 12846
Gerrit-PatchSet: 7
Gerrit-Owner: Vihang Karajgaonkar 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Reviewer: Todd Lipcon 
Gerrit-Reviewer: Vihang Karajgaonkar 
Gerrit-Comment-Date: Tue, 26 Mar 2019 19:12:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-04-03 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#9). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. DoRpcWithRetry now only sleeps if the remote
service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_timeout.py
13 files changed, 274 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/9
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6216: Make PYTHON EGG CACHE configurable

2019-04-04 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12911 )

Change subject: IMPALA-6216: Make PYTHON_EGG_CACHE configurable
..


Patch Set 1: Code-Review+1

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12911
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I695b2b31d9045eef1a53268f6516858935aed508
Gerrit-Change-Number: 12911
Gerrit-PatchSet: 1
Gerrit-Owner: Yongzhi Chen 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Laszlo Gaal 
Gerrit-Comment-Date: Thu, 04 Apr 2019 16:22:26 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..


Patch Set 5:

(20 comments)

Thanks for the reviews. Please see patch set 7.

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.h
File be/src/rpc/rpc-mgr-test.h:

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.h@296
PS5, Line 296: tha
> typo
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc
File be/src/rpc/rpc-mgr-test.cc:

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@214
PS5, Line 214:   ASSERT_OK(static_cast(scan_mem_impl)
 : ->
> Not sure if it's output of clang-tidy or something but I find it a bit hard
Yes this is clang-format, but I see now how to avoid this getting "fixed".


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@263
PS5, Line 263: shoudl
> typo
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@291
PS5, Line 291: NULL
> nullptr;
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@298
PS5, Line 298: RpcController controller;
> not used ?
yes!


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@316
PS5, Line 316: sleep_ms = 100;
> Instead of sharing "sleep_ms" between RPCs, it may be cleaner to pass the s
Thanks, this is cleaner


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@322
PS5, Line 322: proxy.get()->
> proxy->
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@333
PS5, Line 333:   // Call DoRpcWithRetry with no backoff on our busy service.
 :   const int retries = 3;
 :   impala::Status rpc_status = RpcMgr::DoRpcWithRetry(proxy, 
::Ping,
 :   request, , query_ctx, "ping failed", retries, 
MonoDelta::FromSeconds(100));
 :   EXPECT_ERROR(rpc_status, TErrorCode::GENERAL);
 :   EXPECT_STR_CONTAINS(rpc_status.msg().msg(),
 :   "dropped due to backpressure. The service queue contains 1 
items out of a maximum "
 :   "of 1; memory consumption is ");
 :   ASSERT_EQ(overflow_metric->GetValue(), retries);
 :
 :   // Call DoRpcWithRetry, but this time we do backoff on a busy 
service, and this waiting
 :   // allows the outstanding aysnc calls to complete, so that 
then this call will succeed.
 :   impala::Status rpc_status_backoff =
 :   RpcMgr::DoRpcWithRetry(proxy, ::Ping, 
request, ,
 :   query_ctx, "ping failed", 10, 
MonoDelta::FromSeconds(100), 2);
 :   ASSERT_OK(rpc_status_backoff);
 :   ASSERT_GT(overflow_metric->GetValue(), retries);
 :
 :   // Check that async calls did complete.
 :   ASSERT_FALSE(async1_done.pending());
 :   ASSERT_FALSE(async2_done.pending());
> These set of tests seem rather timing dependent. For instance, it may occur
I reviewed the test and tried to make to more bulletproof. I think any test 
that really does fill up the queue and test that the retry works in this case 
will have some asynchrony. I don't like tests with sleep calls but I think this 
should be reliable. Let me know if you are not persuaded.


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@378
PS5, Line 378:   // If we fail 2x and retry 2x, then the call fails.
 :   unique_ptr failing_proxy2 =
 :   make_unique(2);
 :
 :   Status rpc_status_fail =
 :   RpcMgr::DoRpcWithRetry(failing_proxy2, 
::Ping, request,
 :   , boost::bind(::log, this), 
query_ctx, "ping failed", 2,
 :   MonoDelta::FromSeconds(10));
> Please see comments in rpc-mgr.inline.h that we probably should only retry
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h
File be/src/rpc/rpc-mgr.h:

http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@178
PS5, Line 178:   // A no-op method that is used as part of overloading 
DoRpcWithRetry().
 :   static void logging_noop(){};
 :
 :   // The type of the log method passed to DoRpcWithRetry().
 :   typedef boost::function RpcLogFn;
> Not needed as discussed in person.
Removed


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@189
PS5, Line 189: class
> typename
Done


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@191
PS5, Line 191: rpc_call
> It's important to highlight that this only works iff rpc_call is idempotent
Thanks


http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@193
PS5, Line 193: const kudu::MonoDelta& timeout,
 :   const int server_busy_backoff_s = 0
> It 

[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#7). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. DoRpcWithRetry now only sleeps if the remote
service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M common/protobuf/rpc_test.proto
M tests/custom_cluster/test_rpc_timeout.py
14 files changed, 274 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/7
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-6826: Extend bootstrap system.sh to Ubuntu 18.04

2019-03-29 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12893 )

Change subject: IMPALA-6826: Extend bootstrap_system.sh to Ubuntu 18.04
..


Patch Set 1: Code-Review+1

(2 comments)

LGTM

http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG@13
PS1, Line 13: - add 'ant' to the list of onstalled packages
typo: installed


http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG@17
PS1, Line 17: These changes enable bootstratp_system.sh to set up an Impala 
development
typo: bootstrap



--
To view, visit http://gerrit.cloudera.org:8080/12893
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Iad790f72ea6b62258aed2225eb7bdf79590c350f
Gerrit-Change-Number: 12893
Gerrit-PatchSet: 1
Gerrit-Owner: Laszlo Gaal 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Sat, 30 Mar 2019 00:01:10 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#4). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. Allow callers to specify a logging function
that is called for rpc failures.

Reviewers should note that the following changes to retry parameters
were made:
 Cancel: add a 3 second sleep if service is busy.
 RemoteShutdown: add a 3 second sleep if service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_timeout.py
13 files changed, 303 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/4
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#5). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. Allow callers to specify a logging function
that is called for rpc failures.

Reviewers should note that the following changes to retry parameters
were made:
 Cancel: add a 3 second sleep if service is busy.
 RemoteShutdown: add a 3 second sleep if service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_timeout.py
13 files changed, 302 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/5
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12672/4/be/src/rpc/rpc-mgr.inline.h
File be/src/rpc/rpc-mgr.inline.h:

http://gerrit.cloudera.org:8080/#/c/12672/4/be/src/rpc/rpc-mgr.inline.h@56
PS4, Line 56: const int times_to_try, const kudu::MonoDelta& timeout, const 
int server_busy_backoff_s,
> line too long (92 > 90)
Done



--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Tue, 26 Mar 2019 21:51:11 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] Improve error handling for validation of unified backend executable

2019-04-01 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12895 )

Change subject: Improve error handling for validation of unified backend 
executable
..


Patch Set 1: Code-Review+1

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12895
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ia84cd074be79bc9d99b0621a1ae216f3debf5833
Gerrit-Change-Number: 12895
Gerrit-PatchSet: 1
Gerrit-Owner: Joe McDonnell 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 01 Apr 2019 16:14:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8330: Impala shell config file should use flag names

2019-03-25 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12823 )

Change subject: IMPALA-8330: Impala shell config file should use flag names
..


Patch Set 2:

(2 comments)

Looks good

http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG@10
PS2, Line 10: both short and long flag names instead of dest names because dest 
names
My life would be easier if you explained what a 'dest name' is here.


http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG@24
PS2, Line 24: ; Flags can be repeated.
Flags can be repeated. So if I have
query_options=DEFAULT_FILE_FORMAT=parquet
Q=DEFAULT_FILE_FORMAT=orc
will it all work OK?



--
To view, visit http://gerrit.cloudera.org:8080/12823
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ic43603c1b538af08fddcab1b2c1f6ad1af1a6cb9
Gerrit-Change-Number: 12823
Gerrit-PatchSet: 2
Gerrit-Owner: Fredy Wijaya 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Bharath Vissapragada 
Gerrit-Reviewer: Fredy Wijaya 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Comment-Date: Mon, 25 Mar 2019 18:51:20 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads

2019-03-25 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12579 )

Change subject: IMPALA-7800: Reject new connections after --fe_service_threads
..


Patch Set 8:

(5 comments)

All I have are nits and questions

http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h
File be/src/rpc/TAcceptQueueServer.h:

http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h@97
PS8, Line 97:   /// Name of the thrift server
Add terminating period (nit).


http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h@113
PS8, Line 113:   /// Number of connections rejected due to timeout
Add terminating period (nit).


http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.cpp
File be/src/rpc/TAcceptQueueServer.cpp:

http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.cpp@220
PS8, Line 220:   } catch (string s) {
I know this isn't part of this change, but can we really ever get string 
exceptions?


http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/service/impala-server.cc@241
PS8, Line 241: "the queue before we time it out and reject the connection 
request. A value of 0 "
In the flags that configure the accept queue we call it "the post-accept, 
pre-setup connection queue", which is rather verbose, but do we want to use the 
same wording for consistency? Also the description might be clearer if it was 
rewritten to not use "we" (this is a definite nit)


http://gerrit.cloudera.org:8080/#/c/12579/8/tests/custom_cluster/test_frontend_connection_limit.py
File tests/custom_cluster/test_frontend_connection_limit.py:

http://gerrit.cloudera.org:8080/#/c/12579/8/tests/custom_cluster/test_frontend_connection_limit.py@89
PS8, Line 89:   raise ImpalaBeeswaxException(e.message, e)
I am confused about this. Why do we raise an exception here?  I see that the 
test is marked xfail(raises=ImpalaBeeswaxException)but doesn't make the test 
fragile in the case where some other bit of code raises a 
ImpalaBeeswaxException? Maybe this is a python testing pattern I am unfamiliar 
with.



--
To view, visit http://gerrit.cloudera.org:8080/12579
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb345c1d84cc2f691f54ded467f253e758f87e64
Gerrit-Change-Number: 12579
Gerrit-PatchSet: 8
Gerrit-Owner: Zoram Thanga 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Reviewer: Zoram Thanga 
Gerrit-Comment-Date: Mon, 25 Mar 2019 18:00:02 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().

2019-03-26 Thread Andrew Sherman (Code Review)
Andrew Sherman has uploaded a new patch set (#6). ( 
http://gerrit.cloudera.org:8080/12672 )

Change subject: IMPALA-8143: Enhance DoRpcWithRetry().
..

IMPALA-8143: Enhance DoRpcWithRetry().

Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if
the remote service is busy. Allow callers to specify a logging function
that is called for rpc failures.

Reviewers should note that the following changes to retry parameters
were made:
 Cancel: add a 3 second sleep if service is busy.
 RemoteShutdown: add a 3 second sleep if service is busy.

TESTING:

Ran all end-to-end tests.

Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in
two ways. Firstly we test the case where the remote service is busy
(which we force by filling up the queue), and secondly we exercise
DoRpcWithRetry by using a fake Proxy that simulates failures.

Clean up unused values of FaultInjectionUtil.RpcCallType and match
values with test_rpc_timeout.py.

Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
---
M be/src/rpc/impala-service-pool.cc
M be/src/rpc/impala-service-pool.h
M be/src/rpc/rpc-mgr-test.cc
M be/src/rpc/rpc-mgr-test.h
M be/src/rpc/rpc-mgr.h
M be/src/rpc/rpc-mgr.inline.h
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/query-state.h
M be/src/service/client-request-state.cc
M be/src/service/control-service.cc
M be/src/service/control-service.h
M be/src/testutil/fault-injection-util.h
M tests/custom_cluster/test_rpc_timeout.py
13 files changed, 302 insertions(+), 58 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/6
--
To view, visit http://gerrit.cloudera.org:8080/12672
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390
Gerrit-Change-Number: 12672
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Sherman 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 


[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()

2019-02-27 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12624 )

Change subject: IMPALA-8256: Better error message for 
ImpalaServicePool::RejectTooBusy()
..


Patch Set 1:

Oh I see, the service-pool.cc version does not reject messages because of lack 
of memory, doh.


--
To view, visit http://gerrit.cloudera.org:8080/12624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98
Gerrit-Change-Number: 12624
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 27 Feb 2019 21:16:03 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()

2019-02-27 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12624 )

Change subject: IMPALA-8256: Better error message for 
ImpalaServicePool::RejectTooBusy()
..


Patch Set 1:

(3 comments)

Change looks basically good, I have a few nits.

http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc
File be/src/rpc/impala-service-pool.cc:

http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@119
PS1, Line 119:  "The service queue is full; it has $3 items; 
memory "
Isn't the message "The service queue is full" misleading?


http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@120
PS1, Line 120:  "consumption is $4",
Nit: needs terminating "."


http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@125
PS1, Line 125:  
PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES));
There's a similar message in service-pool.cc
Is it right that you are not changing the message there because there is no 
service_mem_tracker in that class?



--
To view, visit http://gerrit.cloudera.org:8080/12624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98
Gerrit-Change-Number: 12624
Gerrit-PatchSet: 1
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Wed, 27 Feb 2019 21:11:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()

2019-03-01 Thread Andrew Sherman (Code Review)
Andrew Sherman has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12624 )

Change subject: IMPALA-8256: Better error message for 
ImpalaServicePool::RejectTooBusy()
..


Patch Set 3: Code-Review+1

LGTM


--
To view, visit http://gerrit.cloudera.org:8080/12624
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98
Gerrit-Change-Number: 12624
Gerrit-PatchSet: 3
Gerrit-Owner: Michael Ho 
Gerrit-Reviewer: Andrew Sherman 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Lars Volker 
Gerrit-Reviewer: Michael Ho 
Gerrit-Reviewer: Thomas Marshall 
Gerrit-Comment-Date: Fri, 01 Mar 2019 16:53:46 +
Gerrit-HasComments: No


  1   2   3   4   5   6   7   >