[Impala-ASF-CR] IMPALA-7448: Invalidate recently unused tables from catalogd
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11224 ) Change subject: IMPALA-7448: Invalidate recently unused tables from catalogd .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java File fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java: http://gerrit.cloudera.org:8080/#/c/11224/1/fe/src/main/java/org/apache/impala/catalog/CatalogServiceCatalog.java@261 PS1, Line 261: sleep((long)(unusedTableTtlSec) * 1000L); TimeUnit.SECONDS.sleep(unusedTableTtlSec); -- To view, visit http://gerrit.cloudera.org:8080/11224 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ib549717abefcffb14d9a3814ee8cf0de8bd49e89 Gerrit-Change-Number: 11224 Gerrit-PatchSet: 1 Gerrit-Owner: Tianyi Wang Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tianyi Wang Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 15 Aug 2018 20:29:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6271: Impala daemon should log a message when it's being shut down
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/10847 ) Change subject: IMPALA-6271: Impala daemon should log a message when it's being shut down .. Patch Set 6: (1 comment) http://gerrit.cloudera.org:8080/#/c/10847/6/be/src/catalog/catalogd-main.cc File be/src/catalog/catalogd-main.cc: http://gerrit.cloudera.org:8080/#/c/10847/6/be/src/catalog/catalogd-main.cc@58 PS6, Line 58: using namespace apache::thrift; If you make more changes to this file then you could remove this duplicated line. -- To view, visit http://gerrit.cloudera.org:8080/10847 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Id20da9e30440b7348557beccb8a0da14775fcc29 Gerrit-Change-Number: 10847 Gerrit-PatchSet: 6 Gerrit-Owner: Pranay Singh Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Pranay Singh Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 06 Aug 2018 17:59:14 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6709: Simplify tests that copy local files to tables
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11127 ) Change subject: IMPALA-6709: Simplify tests that copy local files to tables .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11127/1/tests/common/test_parquet.py File tests/common/test_parquet.py: http://gerrit.cloudera.org:8080/#/c/11127/1/tests/common/test_parquet.py@31 PS1, Line 31: hdfs_file = '/test-warehouse/{0}.db/{1}'.format(unique_database, filename) This is the nittiest of nits, but I am learning python: Do we use both '%' and 'format' ? Is it good style to mix and match them in the same file? -- To view, visit http://gerrit.cloudera.org:8080/11127 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ie00a4561825facf8abe2e8e74a6b6e93194f416f Gerrit-Change-Number: 11127 Gerrit-PatchSet: 1 Gerrit-Owner: Zoltan Borok-Nagy Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 06 Aug 2018 17:42:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11387 Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. IMPALA-6568 add missing Query Compilation section to profiles. The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. ALTERNATIVES: To fix this I set the Timeline in TExecRequest before returning it. >result.setTimeline(timeline.toThrift()); >return result; I considered writing this as >return result.setTimeline(timeline.toThrift()); To try to make sure this code gets copied in any future return statements, but I didn’t think it was worth the cost to readability. TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" timeline appears in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 46 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/1 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/11388/2/be/src/service/impala-server.cc@950 PS2, Line 950: (*request_state)->query_events()->MarkEvent("Planning finished"); Looks like both FE and BE set "Planning Finished" in the timeline. Is that wrong? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 2 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 05 Sep 2018 19:02:33 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: WIP: IMPALA-6568 add missing Query Compilation section to profiles. .. WIP: IMPALA-6568 add missing Query Compilation section to profiles. WIP: so Todd can look at refactor TODO: change tests to check that "Planning finished" is present for explain queries The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. ALTERNATIVES: To fix this I set the Timeline in TExecRequest before returning it. >result.setTimeline(timeline.toThrift()); >return result; I considered writing this as >return result.setTimeline(timeline.toThrift()); To try to make sure this code gets copied in any future return statements, but I didn’t think it was worth the cost to readability. REFACTOR I refactored the main createExecRequest() method to try to make the flow clearer. TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" timeline appears in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 134 insertions(+), 56 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/2 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. IMPALA-6568 add missing Query Compilation section to profiles. The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. I refactored the main createExecRequest() method to - try to make the flow clearer, - allow the timeline to be set in the TExecRequest in only one place. - to set "Planning finished" in all timelines TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" and "Planning finished" timelines appear in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 147 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/3 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/1/fe/src/main/java/org/apache/impala/service/Frontend.java@1167 PS1, Line 1167: timeline.markEvent("Planning finished"); > hmm, but even with the earlier 'planning finished', statements like LOAD DA OK, thanks, I think I see what you mean now. -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Comment-Date: Wed, 05 Sep 2018 22:30:47 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1065 PS3, Line 1065: if (!analysisResult.isCreateTableAsSelectStmt()) { : return result; : } > I personally think it's weird, but the impala style is to use single-line i I tried this but I don't like it. I think its more readable how it is. I don't want to get into a code style war for my first (or second) change... so I will change this if you think it is best, but this doesn't look inconsistent with existing code. http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1134 PS3, Line 1134: private void setMtDopForCatalogOp(AnalysisResult analysisResult, > can be static? same with several of the extracted functions below Done http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1150 PS3, Line 1150: createTExecRequest > maybe rename this to 'createBaseExecRequest' or something? Otherwise it's s Done http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1164 PS3, Line 1164: AnalysisResult analysisResult, > it seems this is just used to get the insertStmt, and the insertStmt is alr Done http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1185 PS3, Line 1185: void addQueryMetadata > it seems that 'result' is only used as an argument here for storing the fin Done -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 18:45:40 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@73 PS3, Line 73:* Create a new profile, setting it as the current thread-lcoal profile for the spelling: thread-local http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); > Ya I think we should have some guard that prevents nested scopes. I doubt i I see now. I think maybe all that is needed is more comments, perhaps in createNewWIthScope(), explaining the existence of a stack of FrontendProfiles. Plus some exercise of the nestable scopes in the forthcoming unit test -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 16:13:27 + Gerrit-HasComments: Yes
[Impala-ASF-CR] WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11388 ) Change subject: WIP: IMPALA-7527: add fetch-from-catalogd cache info to profile .. Patch Set 3: (2 comments) http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java File fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/catalog/local/CatalogdMetaProvider.java@189 PS3, Line 189: private static final String DB_LIST_STATS_CATEGORY = "DatabaseList"; Is it worth having some sort of enum for these keys? Maybe it will make the code more untidy? http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java File fe/src/main/java/org/apache/impala/service/FrontendProfile.java: http://gerrit.cloudera.org:8080/#/c/11388/3/fe/src/main/java/org/apache/impala/service/FrontendProfile.java@151 PS3, Line 151: oldThreadLocalValue_ = THREAD_LOCAL.get(); Can you explain what is going on here? Is there a case where oldThreadLocalValue_ is not null? -- To view, visit http://gerrit.cloudera.org:8080/11388 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I808d55a225338912ebaebd0cf71c36db944b7276 Gerrit-Change-Number: 11388 Gerrit-PatchSet: 3 Gerrit-Owner: Todd Lipcon Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 06 Sep 2018 22:54:01 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. IMPALA-6568 add missing Query Compilation section to profiles. The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. I refactored the main createExecRequest() method to - try to make the flow clearer, - allow the timeline to be set in the TExecRequest in only one place. - to set "Planning finished" in all timelines TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" and "Planning finished" timelines appear in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 146 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/4 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. Patch Set 3: (5 comments) http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java File fe/src/main/java/org/apache/impala/service/Frontend.java: http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@a1069 PS3, Line 1069: > so was this the main culprit (early return) for omitting the timeline for a Yes, the early returns were not setting the timeline correctly, which is the source of the bug. Do you want me to split out the cleanup into a separate jira so it can be better reviewed? http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1034 PS3, Line 1034: t > nit: uppercase to getTExecRequest Done http://gerrit.cloudera.org:8080/#/c/11387/3/fe/src/main/java/org/apache/impala/service/Frontend.java@1035 PS3, Line 1035:StringBuilder explainString) > indentation here should be 4 spaces from the beginning of the line I think you mean 6 spaces from the beginning of the line? So 4 spaces indented from the method's indentation? I see I was doing this wrong everywhere so I fixed the other cases too. http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py@221 PS3, Line 221: def test_query_profile_contains_all_events(self): > use unique_database fixture here so that the various objects here are scope Thanks for making me learn about pytest :-) http://gerrit.cloudera.org:8080/#/c/11387/3/tests/query_test/test_observability.py@224 PS3, Line 224: self.hdfs_client.delete_file_dir('tmp/impala6568') : se > would it simplify things here if you used an existing file? I have tidied this up so it is more consistent with other tests -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 07 Sep 2018 23:40:06 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6568 add missing Query Compilation section to profiles.
Andrew Sherman has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/11387 ) Change subject: IMPALA-6568 add missing Query Compilation section to profiles. .. IMPALA-6568 add missing Query Compilation section to profiles. The profile command is used to display low-level information about the most recent query. When a client makes request for the Profile, it sends a GetRuntimeProfile request for the last queryId to to the server. The queryId is used to find the ClientRequestState, an object which tracks information about the execution, including Profile data which is stored in several RuntimeProfile objects. The reply to the GetRuntimeProfile message contains the Profile, pretty-printed as a string. When a query is sent to the Front End to be compiled, the TExecRequest that is returned from createExecRequest() in the JVM contains a Timeline, which is a named sequence of events with timing information. This Timeline is added to the Summary Profile in the ClientRequestState. In the following cases the Front End was not setting the Timeline in the TExecRequest: - All DDL operations - Load data statements - Set statements - Explain statements And this meant that the profile would not contain the "Query Compilation" timeline. I refactored the main createExecRequest() method to - try to make the flow clearer, - allow the timeline to be set in the TExecRequest in only one place. - to set "Planning finished" in all timelines TESTING: Add a new test to test_observability.py which checks that the "Query Compilation" and "Planning finished" timelines appear in the profile for various queries designed to exercise the new code paths in createExecRequest. Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 --- M fe/src/main/java/org/apache/impala/service/Frontend.java M tests/query_test/test_observability.py 2 files changed, 144 insertions(+), 59 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/87/11387/5 -- To view, visit http://gerrit.cloudera.org:8080/11387 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I869eaeb4be4291b6b938f91847f624c76ec90ea5 Gerrit-Change-Number: 11387 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 1: (5 comments) Another patch should be coming soon http://gerrit.cloudera.org:8080/#/c/11582/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11582/1//COMMIT_MSG@31 PS1, Line 31: All unit tests pass > I assume that you ran the e2e tests, and also that you checked that there i I did run e2e tests, and now I have also checked that they encode values of bit width 1 and 2. Bit width 1 is used when writing definition levels (i.e. when a value is null or not). Bit width 2 is used when encoding a dictionary with 3 or 4 entries. http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@208 PS1, Line 208: min_repeated_run_length > mention that it must be a multiple of 8, less than MAX_RUN_LENGTH_BUFFER Done http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@209 PS1, Line 209: which is the best choice > seems like from your DCHECK(TestInfo::is_test()) that this is the required Done http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@242 PS1, Line 242: MinRepeatedRunLength(bit_width) > should this be min_repeated_run_length_, in case its set to a non-default v I've changed this function again after thinking it through more carefully, and it no longer uses MinRepeatedRunLength() in this way. http://gerrit.cloudera.org:8080/#/c/11582/1/be/src/util/rle-encoding.h@380 PS1, Line 380: run_length > min_repeated_run_length_? Done -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 15 Oct 2018 23:52:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then print the analyzed query as text in the explain header to aid in diagnosis. This query text is generated from the analyzed statement tree, so it includes query rewriting. In addition implicit casts are printed, and numeric literals are printed with a cast so that their type is visible. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If a query in a .test file produces a diff when run by PlannerTest, then print the name of the .test file in the output. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=extended; EXPLAIN_LEVEL set to extended [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Query with implicit casts: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=2 (etc.) Note that the initial "Query:" that is printed originates from impala-shell (suppressed with --quiet), but for neatness I put the new "Query with implicit casts:" output at the top of the header. TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. The output of some Planner Tests in .test files has been updated to include the sql with implict casts that is printed when explain_level is at at least 'extended' level. QUESTIONS FOR REVIEWERS: Is "Query with implicit casts:" a good description? Is the EXPLAIN header the right place for this output? Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. Patch Set 1: (4 comments) Thanks for the review. What do you think about: Is "Query with implicit casts:" a good description? Is the EXPLAIN header the right place for this output? http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ParseNode.java File fe/src/main/java/org/apache/impala/analysis/ParseNode.java: http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@30 PS1, Line 30: /** :* Returns the SQL string corresponding to this node. :*/ > Can you explain the relation of the two toSql() functions in the comment? I Thanks fro thinking about this. This is a case where default parameters would help. http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java File fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java: http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@1 PS1, Line 1: package org.apache.impala.analysis; > Apache license is missing (also noticed by the jenkins job). Done http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java: http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@344 PS1, Line 344: TestToSqlWithImplicitCasts > Can you add a test (or maybe a few) to check that nested structures like su good idea http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/11719/1/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@546 PS1, Line 546: errorLog.append("\nTest file: " + testFile + ".test" : + "\n"); > nit: fits to one line Done -- To view, visit http://gerrit.cloudera.org:8080/11719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 Gerrit-Change-Number: 11719 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Thu, 18 Oct 2018 22:08:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 2: (7 comments) Thanks for the reviews http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20 PS2, Line 20: By default RleEncoder will now use run length encoding for runs of : length 24 for single bit values, and of length 16 for 2 bit wide values. : All other bit widths will use the existing length 8 runs. > About the solution in general: I agree with your calculations. I think sometimes min_run_length=24 is better at bit width=1 I’m going to use the notation of R16 for a repeated run of length 16, and L16 for a literal run of length 16. So your example (bit width 1) is: repeated lengths: R24 R16 R24 R16 R24 min_run_length 8 (old) 2 2 2 2 2 min_run_length 24 (new) 2 3 2 3 2 min_run_length 16 2 2 2 2 2 The case where min_run_length=24 is better is where we avoid interrupting a literal run L24 R16 L24 R16 L24 Old solution4 2 4 2 4 min_run_length 24 4 2 3 2 3 (one long literal run) min_run_length 16 4 2 4 2 4 The case with R8 interrupting a literal: L24 R8 L24 R8 L24 Old solution4 2 4 2 4 min_run_length 24 4 1 3 1 3 (one long literal run) min_run_length 16 4 1 3 1 3 (one long literal run) The cases for bit_width = 2 are similar. My thinking originally was that the case of a run interrupting a literal would be more common than alternating runs. But I think you are right that I should only change things where there is always a win. I will use min_run_length=16 for bit width 1, and that will be the only case where min_run_length != 8. http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@203 PS2, Line 203: the previous run is flushed out > I think that it would be helpful to update this comment with a brief discus Done http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@211 PS2, Line 211: mu;tiple > typo: multiple Done http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@348 PS2, Line 348: most values > most MAX_RUN_LENGTH_BUFFER values Done http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@350 PS2, Line 350: _ > nit: period, here and elsewhere Done http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@475 PS2, Line 475: // updates num_buffered_values > This comment does not seam true here. I think it is true, even when update_indicator_byte = false. But the comment is bad, should be // updates num_buffered_values_ (I am not used to that trailing underscore) http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@549 PS2, Line 549: we > typo Done -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 22 Oct 2018 23:24:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. Patch Set 2: > It seems good to me, but I would ping Greg Rahn in the Jira, maybe > he has some ideas. I have pinged him > I have one issue with the current output: as I saw in the .test > files, queries are printed in one line by default, which can make > complex queries very difficult to read. It would be much nicer to > break them at logical points, but I have no idea how to do it > easily. I agree that a clever printing would be nice, but that sounds tricky. I could wrap at say the last space before 80 columns with only small effort, but this will make the header harder to parse in tests. Should I try this? -- To view, visit http://gerrit.cloudera.org:8080/11719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 Gerrit-Change-Number: 11719 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Fri, 19 Oct 2018 17:04:34 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11719 Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then print the analyzed query as text in the explain header to aid in diagnosis. This query text is generated from the analyzed statement tree, so it includes query rewriting. In addition implicit casts are printed, and numeric literals are printed with a cast so that their type is visible. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If a query in a .test file produces a diff when run by PlannerTest, then print the name of the .test file in the output. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=extended; EXPLAIN_LEVEL set to extended [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Query with implicit casts: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=0B mem-reservation=0B thread-reservation=2 (etc.) Note that the initial "Query:" that is printed originates from impala-shell (suppressed with --quiet), but for neatness I put the new "Query with implicit casts:" output at the top of the header. TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. The output of some Planner Tests in .test files has been updated to include the sql with implict casts that is printed when explain_level is at at least 'extended' level. QUESTIONS FOR REVIEWERS: Is "Query with implicit casts:" a good description? Is the EXPLAIN header the right place for this output? Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So a run is encoded as 2 bytes, but the actual cost of a run of small bit width values in the middle of a literal is really 3 bytes, as a new indicator is needed after the run. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 24 for single bit values, and of length 16 for 2 bit wide values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 507 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/2 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-test.cc File be/src/util/rle-test.cc: http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-test.cc@148 PS2, Line 148: Rle > nit: RleTest Done, missed this before -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 23 Oct 2018 19:39:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 495 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/3 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 2: (5 comments) Thanks Thomas and Csaba for reviews so far http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20 PS2, Line 20: By default RleEncoder will now use run length encoding for runs of : length 24 for single bit values, and of length 16 for 2 bit wide values. : All other bit widths will use the existing length 8 runs. > >This could be avoided by using smaller I agree that this might be a better way, but I am happy with the simple improvement we get from this change. http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@266 PS3, Line 266: 1 byte for the repeated value > I prefer to phrase it as "never worse than 8". For alternating long repeate Done http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-encoding.h@267 PS3, Line 267: > Maybe "better than 16" would be better. Done http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/2/be/src/util/rle-encoding.h@203 PS2, Line 203: the previous run is flushed out > Done Done for real this time http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc File be/src/util/rle-test.cc: http://gerrit.cloudera.org:8080/#/c/11582/3/be/src/util/rle-test.cc@216 PS3, Line 216: c > Oops, I missed this one: Done -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 29 Oct 2018 21:14:39 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 500 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/4 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then enhance the output from the explain command. (1) Show the analyzed sql in the explain header, this is the rewritten sql, which includes implicit casts, and literals are printed with a cast so that their type is visible. (2) When predicates are shown in the plan these are shown in the same format. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The analyzed query text is wrapped at 80 characters. Note that the analyzed query cannot always be executed as queries rewritten to use LEFT SEMI JOIN are not legal sql. In addition some space characters may be removed from the query for prettier display. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=2; EXPLAIN_LEVEL set to 2 [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B thread-reservation=0 | 00:SCAN KUDU [functional_kudu.alltypestiny] predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) mem-estimate=4.88MB mem-reservation=0B thread-reservation=1 tuple-ids=0 row-size=97B cardinality=1 in pipelines: 00(GETNEXT) Fetched 16 row(s) in 0.03s TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. Add a unit test for the code which wraps text at 60 characters. The output of some Planner Tests in .test files has been updated to include the Analyzed sql that is printed when explain_level is at at least 'extended' level. Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 4: (2 comments) http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20 PS2, Line 20: length 16 for single bit values. All other bit widths will use the : existing length 8 runs. : > Yes, it looks good, but I think that the worst case (alternating L8 R16) wo OK http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@265 PS4, Line 265: return std::max(MinBufferSize(bit_width), literal_max_size); > I think that is still not always the worst case - the example you wrote for You are right, and I have updated MaxBufferSize (and added a test) in next patch -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 31 Oct 2018 23:36:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. Patch Set 4: (21 comments) Thanks for review comments http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG@51 PS4, Line 51: "" > This quote seems strange here - is it included in the output? This is what you get when you run 'impala-shell.sh -B' http://gerrit.cloudera.org:8080/#/c/11719/4//COMMIT_MSG@69 PS4, Line 69: 80 > Isn't it 60? yes http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java File fe/src/main/java/org/apache/impala/analysis/ParseNode.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@32 PS4, Line 32: sql > nit: pls use consistent case (SQL, sql, Sql) Thanks http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@32 PS4, Line 32: @param > pls omit these javadoc annotations. Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@39 PS4, Line 39:* This should return the same result as calling toSql(ToSqlOptions.DEFAULT). > would be good to do this via a default interface method. pls add a todo for Clever idea http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java File fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@25 PS4, Line 25: default, normal > "normal" makes sense if you know what is currently done. perhaps "original Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@38 PS4, Line 38: // This does have the consequence that the sql with implict casts may not pssibly fail > nit: spelling Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/ToSqlOptions.java@38 PS4, Line 38: may not pssibly fail : // to parse if resubmitted as > I found this difficult to understand. Did you mean that Sql produced in thi Yes, rewritten SQL is already not always valid SQL. EXISTS queries that are rewritten as semi-joins are not legal SQL. I clarified the comment slightly http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/TypeDef.java File fe/src/main/java/org/apache/impala/analysis/TypeDef.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/analysis/TypeDef.java@165 PS4, Line 165: return parsedType_.toSql(); > pass down here? It looks like it should, but the parsedType is not a ParseNode so I did not propagate ToSqlOptions to there. http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java File fe/src/main/java/org/apache/impala/common/PrintUtils.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@110 PS4, Line 110: 32 > what's this? Just some padding for if lines expand. Maybe this is too ugly. http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@115 PS4, Line 115: exiting > sp. existing? Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@424 PS4, Line 424: @pa > remove I don't really understand what is wrong with param? But done. http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/analysis/ExprRewriterTest.java@451 PS4, Line 451: // AnalyzesOk(stmt.toSql(true), ctx); > rm? Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java File fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@750 PS4, Line 750: @para > remove the comment annotations. Done http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/test/java/org/apache/impala/planner/PlannerTestBase.java@772 PS4, Line 772: if (line.contains("Analyzed query:")) { : builder.append(line).append("\n"); : inImplictCasts = true; : } else if (inImplictCasts) { : // Keep copying the query with implicit casts. : // This works because this is the last thing in the header. :
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then enhance the output from the explain command. (1) Show the analyzed sql in the explain header, this is the rewritten sql, which includes implicit casts, and literals are printed with a cast so that their type is visible. (2) When predicates are shown in the plan these are shown in the same format. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The analyzed query text is wrapped at 80 characters. Note that the analyzed query cannot always be executed as queries rewritten to use LEFT SEMI JOIN are not legal sql. In addition some space characters may be removed from the query for prettier display. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=2; EXPLAIN_LEVEL set to 2 [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B thread-reservation=0 | 00:SCAN KUDU [functional_kudu.alltypestiny] predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) mem-estimate=4.88MB mem-reservation=0B thread-reservation=1 tuple-ids=0 row-size=97B cardinality=1 in pipelines: 00(GETNEXT) Fetched 16 row(s) in 0.03s TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. Add a unit test for the code which wraps text at 80 characters. The output of some Planner Tests in .test files has been updated to include the Analyzed sql that is printed when explain_level is at at least 'extended' level. Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. Patch Set 4: I added a new patch because rebase was required -- To view, visit http://gerrit.cloudera.org:8080/11719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 Gerrit-Change-Number: 11719 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 30 Oct 2018 20:34:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 2: (4 comments) Thanks Csaba http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11582/2//COMMIT_MSG@20 PS2, Line 20: By default RleEncoder will now use run length encoding for runs of : length 24 for single bit values, and of length 16 for 2 bit wide values. : All other bit widths will use the existing length 8 runs. > Ok, the biggest win (50% size for alternating runs of 8 in the 1 bit case) So you want me to write something like this in the jira? If you know the structure of the data then better encodings are possible. For example with bit_width=1, using min_run_length=24 is better in the case where we avoid interrupting a literal run. Using the notation of 'RXX' for a repeated run of length XX (so R16 is a run of lngth 16), and 'LYY' for a literal run of length YY. L24 R16 L24 R16 L24 min_run_length 8 4 2 4 2 4 min_run_length 16 (new default) 4 2 4 2 4 min_run_length 244 2 3 2 3 (one long literal run) So it is possible to optimize by detecting this situation and avoiding breaking a long literal run for a run of length 16. http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@250 PS4, Line 250: iter > Can you simplify this expression? MAX_VALUES_PER_LITERAL_RUN must be divisi Good idea http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@264 PS4, Line 264: insertin > Aren't we double counting the indicator byte here? My assumption is that th I think this is correct as is but I am changing it to the clearer literal_max_size = num_runs * (1 + bytes_per_run) -- 1 is the indicator -- bytes_per_run is the encoded bytes http://gerrit.cloudera.org:8080/#/c/11582/4/be/src/util/rle-encoding.h@265 PS4, Line 265: / 1 byte for the encoded run length > This seems to assume that a single big literal run is the worsts case - can I think before this change a long literal was not always the worst case, but now it is. -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 30 Oct 2018 22:22:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 499 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/7 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 484 insertions(+), 256 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/6 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 510 insertions(+), 255 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/5 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then enhance the output from the explain command. (1) Show the analyzed sql in the explain header, this is the rewritten sql, which includes implicit casts, and literals are printed with a cast so that their type is visible. (2) When predicates are shown in the plan these are shown in the same format. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The analyzed query text is wrapped at 80 characters. Note that the analyzed query cannot always be executed as queries rewritten to use LEFT SEMI JOIN are not legal sql. In addition some space characters may be removed from the query for prettier display. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=2; EXPLAIN_LEVEL set to 2 [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B thread-reservation=0 | 00:SCAN KUDU [functional_kudu.alltypestiny] predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) mem-estimate=4.88MB mem-reservation=0B thread-reservation=1 tuple-ids=0 row-size=97B cardinality=1 in pipelines: 00(GETNEXT) Fetched 16 row(s) in 0.03s TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. Add a unit test for the code which wraps text at 80 characters. The output of some Planner Tests in .test files has been updated to include the Analyzed sql that is printed when explain_level is at at least 'extended' level. Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then enhance the output from the explain command. (1) Show the analyzed sql in the explain header, this is the rewritten sql, which includes implicit casts, and literals are printed with a cast so that their type is visible. (2) When predicates are shown in the plan these are shown in the same format. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The analyzed query text is wrapped at 80 characters. Note that the analyzed query cannot always be executed as queries rewritten to use LEFT SEMI JOIN are not legal sql. In addition some space characters may be removed from the query for prettier display. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=2; EXPLAIN_LEVEL set to 2 [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B thread-reservation=0 | 00:SCAN KUDU [functional_kudu.alltypestiny] predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) mem-estimate=4.88MB mem-reservation=0B thread-reservation=1 tuple-ids=0 row-size=97B cardinality=1 in pipelines: 00(GETNEXT) Fetched 16 row(s) in 0.03s TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. Add a unit test for the code which wraps text at 60 characters. The output of some Planner Tests in .test files has been updated to include the Analyzed sql that is printed when explain_level is at at least 'extended' level. Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. Patch Set 4: (1 comment) I agree that maintenance is a concern. 1) for ParseNode I could remove toSql() from the interface and force all callers to use toSql(DEFAULT). This is easy in terms of effort as IntelliJ will do it for me, but results in a lot of code changes, especially as many exception messages call toSql(). I did consider doing this originally but wanted to make this change as clear as possible. I can do this quickly in a follow-up jira. There are few more associated classes in analysis which would also be changed. 2) Do you want me to look at changes to toSql() that are not part of the ParseNode interface (or associated classes)? There is org.apache.impala.catalog.Type which has toSql() {return toSql(1)} where the argument controls how deeply we show nested Types. There is org.apache.impala.analysis.TableName.toSql() and org.apache.impala.analysis.PlanHint.toSql() Right now I would propose leaving these as is, but they could also be changed if we wanted to never have a naked toSql() call. http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java File fe/src/main/java/org/apache/impala/common/PrintUtils.java: http://gerrit.cloudera.org:8080/#/c/11719/4/fe/src/main/java/org/apache/impala/common/PrintUtils.java@110 PS4, Line 110: 32 > ok, just mention it in a brief comment. I prefer such inline constants to h I removed it. -- To view, visit http://gerrit.cloudera.org:8080/11719 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 Gerrit-Change-Number: 11719 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 01 Nov 2018 20:32:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So an encoded run always has an overhead of at least one byte. This means that for single bit values, encoding 8 values as a run is inefficient. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers (only) to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 16 for single bit values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All end-to-end and unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the Rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. The Overflow test is enhanced to be more exhaustive (while still completing in a second or two). Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 583 insertions(+), 256 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/9 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 9 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11582 ) Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. Patch Set 8: (4 comments) http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h File be/src/util/rle-encoding.h: http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h@259 PS8, Line 259: MaxBufferSize > I am not 100% sure here, but I think that this could be expressed in an eas OK, interesting. I think this is what you mean (see below). I'm saying this is not clearer to humans (but it was worth trying) static int MaxBufferSize(int bit_width, int num_values) { int data_bytes; int overhead; if (bit_width > 2) { // The largest encoded size is for all long literals with no repeated runs. int bytes_per_run = (bit_width * MAX_VALUES_PER_LITERAL_RUN) / 8; int num_runs = static_cast(BitUtil::Ceil(num_values, MAX_VALUES_PER_LITERAL_RUN)); data_bytes = bytes_per_run * num_runs; overhead = num_runs; } else { data_bytes = static_cast(BitUtil::Ceil(num_values * bit_width, 8)); if (bit_width == 1) { // Worst case is we use 4 bytes for every 3 of the input e.g. L8 R16 L8 R16 L8. overhead = 1 + static_cast(BitUtil::Ceil(data_bytes, 3)); } else { // bit_width == 2 // Worst case is we use 3 bytes for every 2 of the input e.g. L8 R8 L8 R8 L8. overhead = 1 + static_cast(BitUtil::Ceil(data_bytes, 2)); } } return std::max(MinBufferSize(bit_width), data_bytes + overhead); } http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-encoding.h@261 PS8, Line 261: encoded > Maybe repeated would be better here? Done http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc File be/src/util/rle-test.cc: http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc@636 PS8, Line 636: buffer_len > I think that this var could have a more descriptive name like expected_max_ Done http://gerrit.cloudera.org:8080/#/c/11582/8/be/src/util/rle-test.cc@642 PS8, Line 642: bug > typo: big :) :-( -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 05 Nov 2018 18:45:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5821: Add query with implicit casts to extended explain output.
Andrew Sherman has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/11719 ) Change subject: IMPALA-5821: Add query with implicit casts to extended explain output. .. IMPALA-5821: Add query with implicit casts to extended explain output. If explain_level is at 'extended' level or higher, then enhance the output from the explain command. (1) Show the analyzed sql in the explain header, this is the rewritten sql, which includes implicit casts, and literals are printed with a cast so that their type is visible. (2) When predicates are shown in the plan these are shown in the same format. The toSql() method can be called on a ParseNode tree to return the sql corresponding ot the tree. In the past toSQl() has been enhanced to print rewritten sql by partially overloading toSql() [with toSql(boolean)]. This current change requires changing toSQl() in many places as NumericLiteral can appear at different points in ia parse tree. To avoid many new fragile overloads of toSql() I added toSql(ToSqlOptions), where ToSqlOptions is an enum which controls the form of the Sql that is returned. This changes many files but is safer and means that any future options to toSql() can be added painlessly. If SHOW_IMPLICIT_CASTS is passed to toSql() then - in CastExpr print the implicit cast - in NumericLiteral print the literal with a cast to show the type Add a PlannerTestOption directive that will force the query text showing implicit casts to be included in the PLAN section of a .test file. The analyzed query text is wrapped at 80 characters. Note that the analyzed query cannot always be executed as queries rewritten to use LEFT SEMI JOIN are not legal sql. In addition some space characters may be removed from the query for prettier display. Documentation of this change will be done as IMPALA-7718 EXAMPLE OUTPUT: [localhost:21000] default> set explain_level=2; EXPLAIN_LEVEL set to 2 [localhost:21000] default> explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100; Query: explain select * from functional_kudu.alltypestiny where bigint_col < 1000 / 100 Max Per-Host Resource Reservation: Memory=0B Threads=2 Per-Host Resource Estimates: Memory=10MB Codegen disabled by planner Analyzed query: SELECT * FROM functional_kudu.alltypestiny WHERE CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) "" F00:PLAN FRAGMENT [UNPARTITIONED] hosts=1 instances=1 | Per-Host Resources: mem-estimate=4.88MB mem-reservation=0B thread-reservation=2 PLAN-ROOT SINK | mem-estimate=0B mem-reservation=0B thread-reservation=0 | 00:SCAN KUDU [functional_kudu.alltypestiny] predicates: CAST(bigint_col AS DOUBLE) < CAST(10 AS DOUBLE) mem-estimate=4.88MB mem-reservation=0B thread-reservation=1 tuple-ids=0 row-size=97B cardinality=1 in pipelines: 00(GETNEXT) Fetched 16 row(s) in 0.03s TESTING: All end-to-end tests pass. Added a new test in ExprRewriterTest which prints sql with implict casts for some interesting queries. Add a unit test for the code which wraps text at 60 characters. The output of some Planner Tests in .test files has been updated to include the Analyzed sql that is printed when explain_level is at at least 'extended' level. Change-Id: I55c3bdacc295137f66b2316a912fc347da30d6b0 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M
[Impala-ASF-CR] IMPALA-6658: improve Parquet RLE for low bit widths
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11582 Change subject: IMPALA-6658: improve Parquet RLE for low bit widths .. IMPALA-6658: improve Parquet RLE for low bit widths RleEncoder buffers values in its own cache to detect run lengths that can be efficiently encoded. When a run is detected it is written with an indicator byte which encodes the length of the run. So a run is encoded as 2 bytes, but the actual cost of a run of small bit width values in the middle of a literal is really 3 bytes, as a new indicator is needed after the run. Change RleEncoder to have the ability to use run lengths other than 8. A new parameter to the constructor (min_run_length) allows test callers to set the minimum run length. By default RleEncoder will now use run length encoding for runs of length 24 for single bit values, and of length 16 for 2 bit wide values. All other bit widths will use the existing length 8 runs. Internally RleEncoder must buffer more values so that the longer runs can be detected. The internal buffer “buffered_values_” is larger and is now a circular buffer so that the first 8 bytes of the buffer can be separately flushed to BitWriter. Testing: All unit tests pass The unit test rle-test is enhanced to run all tests against RleEncoders using all possible values of min_run_length. In Addition, rle-test is refactored so that the rle tests are in a class that inherits from ::testing::Test so that a SetUp() method can be used. Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 --- M be/src/util/rle-encoding.h M be/src/util/rle-test.cc 2 files changed, 497 insertions(+), 254 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/82/11582/1 -- To view, visit http://gerrit.cloudera.org:8080/11582 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I191a581d3f699b6669e48ac9dc39c76ed77c4a76 Gerrit-Change-Number: 11582 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11459 ) Change subject: IMPALA-589: Add sql function returning the impalad coordinator hostname. .. IMPALA-589: Add sql function returning the impalad coordinator hostname. In every execution of an Impala query, one of the impalad daemons acts as the coordinator node. In some cases, such as when using a proxy, a user cannot predict which host will act as the coordinator. To aid in diagnosis, we provide a sql function which returns the name of the host on which the coordinator is running. EXTERNAL DESCRIPTION: Add a builtin function called coordinator(), which returns the name of the host which is running the impalad that is acting as the coordinator for the current query. IMPLEMENTATION: All functions are passed a FunctionContext object which is the interface to the rest of the system for a UDF. From the FunctionContext we get the TQueryCtx (query context) which contains a TNetworkAddress for the coordinator. The hostname of the coordinator is copied from the TNetworkAddress. Can the TNetworkAddress for the coordinator be unitialized? In the thrift source the coord_address is marked optional, but my reading of the code says this is always set in a real impalad. To future-proof the code a null StringVal is returned if coord_address is unset. TESTING: - Added a basic unit test for the new function. - Added a unit test which simulates the case when coord_address is unset. - Hand tested in a development deployment. - Ran regression tests and got a clean run. LIMITATIONS: This change only adds the coordinator() function. It does not add the debug_url() function that was mentioned in the comments on the original Jira. Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 --- M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/exprs/utility-functions.h M common/function-registry/impala_functions.py M docs/topics/impala_misc_functions.xml 5 files changed, 60 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/2 -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11459 ) Change subject: IMPALA-589: Add sql function returning the impalad coordinator hostname. .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/11459/1/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/11459/1/be/src/exprs/expr-test.cc@4961 PS1, Line 4961: ObjectPool pool; > This does not seem to be used. Thanks -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 20 Sep 2018 00:38:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11459 ) Change subject: IMPALA-589: Add sql function returning the impalad coordinator hostname. .. IMPALA-589: Add sql function returning the impalad coordinator hostname. In every execution of an Impala query, one of the impalad daemons acts as the coordinator node. In some cases, such as when using a proxy, a user cannot predict which host will act as the coordinator. To aid in diagnosis, we provide a sql function which returns the name of the host on which the coordinator is running. EXTERNAL DESCRIPTION: Add a builtin function called coordinator(), which returns the name of the host which is running the impalad that is acting as the coordinator for the current query. TESTING: - Added a basic unit test for the new function. - Added a unit test which simulates the case when coord_address is unset. - Added a query that uses coordinator() to exprs.test - Hand tested in a development deployment. - Ran regression tests and got a clean run. Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 --- M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/exprs/utility-functions.h M common/function-registry/impala_functions.py M docs/topics/impala_misc_functions.xml M testdata/workloads/functional-query/queries/QueryTest/exprs.test 6 files changed, 66 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/3 -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-589: Add sql function returning the impalad coordinator hostname.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11459 ) Change subject: IMPALA-589: Add sql function returning the impalad coordinator hostname. .. Patch Set 2: (1 comment) Thanks for making me look at the *.test files http://gerrit.cloudera.org:8080/#/c/11459/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11459/2//COMMIT_MSG@7 PS2, Line 7: IMPALA-589: Add sql function returning the impalad coordinator hostname. > I appreciate your thoroughness, but we generally make our commit messages a Done -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 24 Sep 2018 16:09:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. IMPALA-7801: Remove toSql() from ParseNode interface. In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode, to allow options to be passed when generating SQL from a parse tree. This change was suggested as part of the review of IMPALA-5821. The old "toSql()" method is currently implemented everywhere by calling toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention could be accidentally ignored, To make the code more maintainable, remove the old "toSql()" method and have all callers call the new method instead. In addition, similarly replace "toSql()" in a few more associated classes: AnalyticWindow, LimitElement, Boundary, OrderByElement, SelectListItem. TESTING: No new tests are added as there are no functional changes. All end to end tests run clean. Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java M fe/src/main/java/org/apache/impala/analysis/KuduPartitionParam.java M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java M fe/src/main/java/org/apache/impala/analysis/LimitElement.java M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/OrderByElement.java M fe/src/main/java/org/apache/impala/analysis/ParseNode.java M fe/src/main/java/org/apache/impala/analysis/PartitionDef.java M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpecBase.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/RangePartition.java M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TypeDef.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/analysis/WithClause.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11942 Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. IMPALA-7801: Remove toSql() from ParseNode interface. In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode, to allow options to be passed when generating SQL from a parse tree. This change was suggested as part of the review of IMPALA-5821. The old "toSql()" method is currently implemented everywhere by calling toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention could be accidentally ignored, To make the code more maintainable, remove the old "toSql()" method and have all callers call the new method instead. In addition, similarly replace "toSql()" in a few more associated classes: AnalyticWindow, LimitElement, Boundary, OrderByElement, SelectListItem. TESTING: No new tests are added as there are no functional changes. All end to end tests run clean. Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java M fe/src/main/java/org/apache/impala/analysis/KuduPartitionParam.java M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java M fe/src/main/java/org/apache/impala/analysis/LimitElement.java M fe/src/main/java/org/apache/impala/analysis/LiteralExpr.java M fe/src/main/java/org/apache/impala/analysis/ModifyStmt.java M fe/src/main/java/org/apache/impala/analysis/NumericLiteral.java M fe/src/main/java/org/apache/impala/analysis/OrderByElement.java M fe/src/main/java/org/apache/impala/analysis/ParseNode.java M fe/src/main/java/org/apache/impala/analysis/PartitionDef.java M fe/src/main/java/org/apache/impala/analysis/PartitionKeyValue.java M fe/src/main/java/org/apache/impala/analysis/PartitionSet.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpec.java M fe/src/main/java/org/apache/impala/analysis/PartitionSpecBase.java M fe/src/main/java/org/apache/impala/analysis/PrivilegeSpec.java M fe/src/main/java/org/apache/impala/analysis/QueryStmt.java M fe/src/main/java/org/apache/impala/analysis/RangePartition.java M fe/src/main/java/org/apache/impala/analysis/SelectListItem.java M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java M fe/src/main/java/org/apache/impala/analysis/SlotDescriptor.java M fe/src/main/java/org/apache/impala/analysis/SlotRef.java M fe/src/main/java/org/apache/impala/analysis/StatementBase.java M fe/src/main/java/org/apache/impala/analysis/StmtRewriter.java M fe/src/main/java/org/apache/impala/analysis/Subquery.java M fe/src/main/java/org/apache/impala/analysis/TableRef.java M fe/src/main/java/org/apache/impala/analysis/TableSampleClause.java M fe/src/main/java/org/apache/impala/analysis/TimestampArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/ToSqlUtils.java M fe/src/main/java/org/apache/impala/analysis/TypeDef.java M fe/src/main/java/org/apache/impala/analysis/UnionStmt.java M fe/src/main/java/org/apache/impala/analysis/WithClause.java M fe/src/main/java/org/apache/impala/catalog/FeCatalogUtils.java M
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12142 Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. IMPALA-7468: Port CancelQueryFInstances() to KRPC. When the Coordinator needs to cancel a query (usually because a user has hit Control-C), it does this by sending a CancelQueryFInstances message to each fragment instance. This change switches this code to use KRPC. Add new protobuf definitions for the messages, and remove the old thrift definitions. Move the server-side implementation of Cancel() from ImpalaInternalService to ControlService. Rework the scheduler so that the FInstanceExecParams always contains the KRPC address of the fragment executors, this address can then be used if a query is to be cancelled. For now keep the KRPC calls to CancelQueryFInstances() as synchronous. While moving the client-side code, remove the fault injection code that was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with --fault_injection_rpc_exception_type=1) as this tickles code in client-cache.h which is now not used. TESTING: Ran all end-to-end tests. No new tests as test_cancellation.py provides good coverage. Checked in debugger that DebugAction style fault injection (triggered from test_cancellation.py) was working correctly. Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift 12 files changed, 165 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/1 -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/12142 ) Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. IMPALA-7468: Port CancelQueryFInstances() to KRPC. When the Coordinator needs to cancel a query (usually because a user has hit Control-C), it does this by sending a CancelQueryFInstances message to each fragment instance. This change switches this code to use KRPC. Add new protobuf definitions for the messages, and remove the old thrift definitions. Move the server-side implementation of Cancel() from ImpalaInternalService to ControlService. Rework the scheduler so that the FInstanceExecParams always contains the KRPC address of the fragment executors, this address can then be used if a query is to be cancelled. For now keep the KRPC calls to CancelQueryFInstances() as synchronous. While moving the client-side code, remove the fault injection code that was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with --fault_injection_rpc_exception_type=1) as this tickles code in client-cache.h which is now not used. TESTING: Ran all end-to-end tests. No new tests as test_cancellation.py provides good coverage. Checked in debugger that DebugAction style fault injection (triggered from test_cancellation.py) was working correctly. Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift 12 files changed, 164 insertions(+), 105 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/2 -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins
[Impala-ASF-CR] IMPALA-8047 Support .proto files in .clang-format
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12165 Change subject: IMPALA-8047 Support .proto files in .clang-format .. IMPALA-8047 Support .proto files in .clang-format The .proto file extension is used for the Google Protocol Buffers language. Impala uses this language to specify the format of messages used by KRPC. Add support for this language to .clang-format so that we can have consistent formatting. The proposed support is: Language: Proto BasedOnStyle: Google ColumnLimit: 90 This produces only a few diffs when run against the existing Impala code. I’m not proposing to make any changes to .proto files, this is just to show what clang-format will do. Apart from wrapping comments and code at 90 chars, the diffs are mostly of the form -syntax="proto2"; +syntax = "proto2"; - message Certificate {}; + message Certificate { + }; - optional bool client_timeout_defined = 4 [ default = false ]; + optional bool client_timeout_defined = 4 [default = false]; -UNKNOWN= 999; -NEGOTIATE = 1; -SASL_SUCCESS = 0; -SASL_INITIATE = 2; +UNKNOWN = 999; +NEGOTIATE = 1; +SASL_SUCCESS = 0; +SASL_INITIATE = 2; This last change can be configured using “AlignConsecutiveAssignments: true” but that creates a different set of diffs. Change-Id: I0c2dcfc21fc8f9206adb64166fbd05580516791f --- M .clang-format 1 file changed, 4 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/65/12165/2 -- To view, visit http://gerrit.cloudera.org:8080/12165 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I0c2dcfc21fc8f9206adb64166fbd05580516791f Gerrit-Change-Number: 12165 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12142 ) Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. Patch Set 1: (6 comments) Thanks for the helpful review http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h@294 PS1, Line 294: Status DoCancelQueryFInstancesRrpcWithRetry( > Nit: The prevailing style seems to place reference ('&') and pointer ('*') Up to now I have just been using what the clang-format rules dictate. In clang-format terms I think you are saying we should use 'DerivePointerAlignment: true' (which means: "analyze the formatted file for the most common alignment of & and *. Pointer and reference alignment styles are going to be updated according to the preferences found in the file. PointerAlignment is then used only as fallback"), is that the standard practice for Impala? http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/query-schedule.h@104 PS1, Line 104: const TNetworkAddress& krpc_host, int per_fragment_instance_idx, > Nit: I wonder if there are trailing whitespaces in these lines? I don't und There is no trailing whitespace I could find. I think this formatting is the result of the clang-format parameter: ConstructorInitializerAllOnOneLineOrOnePerLine: true which is splitting the initializers to one-per-line. http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.h File be/src/scheduling/scheduler.h: http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.h@433 PS1, Line 433: void ComputeFragmentExecParams(const BackendConfig executor_config, > Nit: Please retain existing placement of '&' and '*'. Thanks for catching my not using a reference for executor_config. http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.cc File be/src/scheduling/scheduler.cc: http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/scheduling/scheduler.cc@333 PS1, Line 333: ComputeFragmentExecParams(executor_config, plan_exec_info, > Nit: Trailing spaces? Yes there is something wrong here and I have reformatted http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto@163 PS1, Line 163: optional UniqueIdPB query_id = 1; > Don't understand how the query id can be optional... OK you made me think! I had just copied what other people had done in implementing krpc. But now I have investigated further and found that many people argue that having required fields is a bad idea, and the syntax is even removed in future protobuf implementations, see for example https://stackoverflow.com/questions/31801257/why-required-and-optional-is-removed-in-protocol-buffers-3 Overall this makes sense to me, so I think optional is OK here, are you persuaded? http://gerrit.cloudera.org:8080/#/c/12142/1/common/protobuf/control_service.proto@179 PS1, Line 179: // Cancellation is asynchronous. > Commit message says cancellation is synchronous. Do you want to change this When the coordinator cancels a query it sends CancelQueryFInstancesRequestPB messages to the hosts executing fragment instances for the query. These calls are synchronous, that is the coordinator waits for a response to each CancelQueryFInstances call before making the next one. This is what the commit message means by 'For now keep the KRPC calls to CancelQueryFInstances() as synchronous'. When a cancellation occurs on an impalad executing a fragment instance then the cancellation is implemented by setting flags saying the query is cancelled in various data structures, and by waking various sleeping threads. This should be enough to stop the query executing, but the CancelQueryFInstances call returns without doing any checking that all query execution has stopped. So in this sense cancel is asynchronous. I updated the comment to make this clearer. -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Thu, 03 Jan 2019 00:33:30 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12142 ) Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. IMPALA-7468: Port CancelQueryFInstances() to KRPC. When the Coordinator needs to cancel a query (usually because a user has hit Control-C), it does this by sending a CancelQueryFInstances message to each fragment instance. This change switches this code to use KRPC. Add new protobuf definitions for the messages, and remove the old thrift definitions. Move the server-side implementation of Cancel() from ImpalaInternalService to ControlService. Rework the scheduler so that the FInstanceExecParams always contains the KRPC address of the fragment executors, this address can then be used if a query is to be cancelled. For now keep the KRPC calls to CancelQueryFInstances() as synchronous. While moving the client-side code, remove the fault injection code that was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with --fault_injection_rpc_exception_type=1) as this tickles code in client-cache.h which is now not used. TESTING: Ran all end-to-end tests. No new tests as test_cancellation.py provides good coverage. Checked in debugger that DebugAction style fault injection (triggered from test_cancellation.py) was working correctly. Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift 12 files changed, 158 insertions(+), 111 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/3 -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-7985: Port RemoteShutdown() to KRPC.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12260 Change subject: IMPALA-7985: Port RemoteShutdown() to KRPC. .. IMPALA-7985: Port RemoteShutdown() to KRPC. The :shutdown command is used to shutdown a remote server. The common case is that a user specifies the impalad to shutdown by specifying a host e.g. :shutdown('host100'). If a user has more than one impalad on a remote host then the form :shutdown(':') can be used to specify the port by which the imapald can be contacted. Prior to IMPALA-7985 this port was the backend port, e.g. :shutdown('host100:22000'). With IMPALA-7985 the port to use is the KRPC port, e.g. :shutdown('host100:27000'). Shutdown is implemented by making an rpc call to the target impalad. This changes the implementation of this call to use KRPC. To aid the user in finding the KRPC port, the KRPC address is added to the /backends section of the debug web page. We attempt to detect the case where :shutdown is pointed at a thrift port (like the backend port) and print an informative message. Documentation of this change will be done in IMPALA-8098. For discussion of why it was chosen to implement this change in an incompatible way, see comments in https://issues.apache.org/jira/browse/IMPALA-7985. TESTING Ran all end-to-end tests. Add a test for /backends to test_web_pages.py. In test_restart_services.py add a call to the old backend port to the test. Some expected error messages were changed in line with what KRPC returns. Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-http-handler.cc M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift M tests/custom_cluster/test_restart_services.py M tests/webserver/test_web_pages.py M www/backends.tmpl 16 files changed, 219 insertions(+), 144 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/12260/1 -- To view, visit http://gerrit.cloudera.org:8080/12260 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4fd00ee4e638f5e71e27893162fd65501ef9e74e Gerrit-Change-Number: 12260 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12068 ) Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. Patch Set 3: I tested performance of the two scalar expressions by running queries on my desktop box. ValidTupleId is faster with the new codegen. To prove this I used a query with many 'sum(distinct column_name)' clauses. In this case the new code runs 2% faster. I think this would be undetectable in any normal query. I ran many queries trying to analyze whether IsNotEmptyPredicate is faster with the new codegen. In most queries using the predicate there was no detectable difference. I was able to construct a pathological query with 100 union all expressions, each of which read from a nested type. With this query runtime was 5% slower with the new codegen. I don't think this would be detectable in any normal query. Arguably we should still use this new codegen'd version as it fits with our vision of removing all calls to GetCodegendComputeFnWrapper(). -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Thu, 13 Dec 2018 00:21:19 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7183: Include remote host when logging unknown execution status.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12102 Change subject: IMPALA-7183: Include remote host when logging unknown execution status. .. IMPALA-7183: Include remote host when logging unknown execution status. ControlService::ReportExecStatus reports the execution status of a fragment to the coordinator. When the coordinator receives a ReportExecStatus call, it looks up the ClientRequestState for the query id. If this cannot be found, usually because the query is in the process of being canceled, then it logs a message before responding to the RPC. Add the remote host name to this message. We try to use the DNS reverse-lookup name, if that fails we use the IP address instead. Example output: "ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 9a437a10d83cfd86:3c961136 remote host=localhost" TESTING Ran end-to-end tests. Forced code to execute by hand to check output is correct. Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422 --- M be/src/service/control-service.cc 1 file changed, 9 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/12102/1 -- To view, visit http://gerrit.cloudera.org:8080/12102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422 Gerrit-Change-Number: 12102 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12068 ) Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. These two classes evaluate scalar expressions. Previously codegen was done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates a static method that calls the scalar expression evaluation method. Make this more efficient by using cross-compiled code which is customized at codegen time. Add cross-compiled files is-not-empty-predicate-ir.cc and valid-tuple-id-ir.cc which contain the scalar expression evaluation methods. At Codegen time generate a static method that calls the cross-compiled methods. This is done using a new common piece of code in ScalarExpr::GetCodegendCrossCompiledComputeFnWrapper. So the improvement in this change is that the evaluation method code is cross-compiled to ir. IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in TupleIsNullPredicate. In the current Impala code this method is never called. This is because TupleIsNullPredicate is always wrapped in an IfExpr. This is always codegen'd by IfExpr's GetCodegendComputeFnWrapper() method. There is a separate Jira IMPALA-7655 to improve codegen of IfExpr. Minor corrections: Correct the link to llvm tutorial in LlvmCodegen. Make a method private in TupleIsNullPredicate.java. PERFORMANCE: I tested performance on a local mini-cluster. I wrote some pathological queries to test the new code. The new codegen'd code is very similar in performance. ValidTupleIdExpr seems a bit faster, but IsNotEmptyPredicate seems a little slower. Overall these changes are not purely for performance but to move away from GetCodegendComputeFnWrapper. TESTING: The changed scalar expressions are well exercised by current tests. Ran exhaustive end-to-end tests (pending). Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/is-not-empty-predicate-ir.cc M be/src/exprs/is-not-empty-predicate.cc M be/src/exprs/is-not-empty-predicate.h M be/src/exprs/scalar-expr.cc M be/src/exprs/scalar-expr.h M be/src/exprs/slot-ref.cc A be/src/exprs/valid-tuple-id-ir.cc M be/src/exprs/valid-tuple-id.cc M be/src/exprs/valid-tuple-id.h M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java 14 files changed, 156 insertions(+), 53 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/4 -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7183: Include remote host when logging unknown execution status.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/12102 ) Change subject: IMPALA-7183: Include remote host when logging unknown execution status. .. IMPALA-7183: Include remote host when logging unknown execution status. ControlService::ReportExecStatus reports the execution status of a fragment to the coordinator. When the coordinator receives a ReportExecStatus call, it looks up the ClientRequestState for the query id. If this cannot be found, usually because the query is in the process of being canceled, then it logs a message before responding to the RPC. Add the IP address and port of the remote host to this message. Example output (from minicluster): "ReportExecStatus(): Received report for unknown query ID (probably closed or cancelled): 174dbb16aa00a5c2:382e5cc8 remote host=127.0.0.1:47112" TESTING Ran end-to-end tests. Forced code to execute by hand to check output is correct. Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422 --- M be/src/service/control-service.cc 1 file changed, 3 insertions(+), 1 deletion(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/12102/2 -- To view, visit http://gerrit.cloudera.org:8080/12102 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I79b223e22096e2d964b72654a34dc0d88e8c6422 Gerrit-Change-Number: 12102 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/12068 ) Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. These two classes evaluate scalar expressions. Previously codegen was done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates a static method that calls the scalar expression evaluation methods. Make this more efficient by using cross-compiled code which is customized at codegen time. Add cross-compiled files is-not-empty-predicate-ir.cc and valid-tuple-id-ir.cc These files contain the methods that will be specialized during code generation These methods have to be static to match the expected method signatures Refactor the existing scalar methods so that they call a new internal static method which contains the implementation of the scalar expression This static method is called from both the old member function and the new cross-compiled method. IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in TupleIsNullPredicate. In the current Impala code this method is never called. This is because TupleIsNullPredicate is always wrapped in an IfExpr. This is always codegen'd by IfExpr's GetCodegendComputeFnWrapper() method. There is a separate Jira IMPALA-7655 to improve codegen of IfExpr. Minor corrections: Correct the link to llvm tutorial in LlvmCodegen. Make a method private in TupleIsNullPredicate.java. TESTING: The changed scalar expressions are well exercised by current tests. Ran end-to-end tests. Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/is-not-empty-predicate-ir.cc M be/src/exprs/is-not-empty-predicate.cc M be/src/exprs/is-not-empty-predicate.h M be/src/exprs/slot-ref.cc A be/src/exprs/valid-tuple-id-ir.cc M be/src/exprs/valid-tuple-id.cc M be/src/exprs/valid-tuple-id.h M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java 12 files changed, 205 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/2 -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12068 Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. These two classes evaluate scalar expressions. Previously codegen was done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates a static method that calls the scalar expression evaluation methods. Make this more efficient by using cross-compiled code which is customized at codegen time. Add cross-compiled files is-not-empty-predicate-ir.cc and valid-tuple-id-ir.cc These files contain the methods that will be specialized during code generation These methods have to be static to match the expected method signatures Refactor the existing scalar methods so that they call a new internal static method which contains the implementation of the scalar expression This static method is called from both the old member function and the new cross-compiled method. IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in TupleIsNullPredicate. In the current Impala code this method is never called. This is because TupleIsNullPredicate is always wrapped in an IfExpr. This is always codegen'd by IfExpr's GetCodegendComputeFnWrapper() method. There is a separate Jira IMPALA-7655 to improve codegen of IfExpr. Minor corrections: Correct the link to llvm tutorial in LlvmCodegen. Make a method private in TupleIsNullPredicate.java. TESTING: The changed scalar expressions are well exercised by current tests. Ran end-to-end tests. Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/is-not-empty-predicate-ir.cc M be/src/exprs/is-not-empty-predicate.cc M be/src/exprs/is-not-empty-predicate.h M be/src/exprs/slot-ref.cc A be/src/exprs/valid-tuple-id-ir.cc M be/src/exprs/valid-tuple-id.cc M be/src/exprs/valid-tuple-id.h M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java 12 files changed, 205 insertions(+), 41 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/1 -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12068 ) Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. These two classes evaluate scalar expressions. Previously codegen was done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates a static method that calls the scalar expression evaluation methods. Make this more efficient by using cross-compiled code which is customized at codegen time. Add cross-compiled files is-not-empty-predicate-ir.cc and valid-tuple-id-ir.cc These files contain the methods that will be specialized during code generation These methods have to be static to match the expected method signatures Refactor the existing scalar methods so that they call a new internal static method which contains the implementation of the scalar expression This static method is called from both the old member function and the new cross-compiled method. IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in TupleIsNullPredicate. In the current Impala code this method is never called. This is because TupleIsNullPredicate is always wrapped in an IfExpr. This is always codegen'd by IfExpr's GetCodegendComputeFnWrapper() method. There is a separate Jira IMPALA-7655 to improve codegen of IfExpr. Minor corrections: Correct the link to llvm tutorial in LlvmCodegen. Make a method private in TupleIsNullPredicate.java. TESTING: The changed scalar expressions are well exercised by current tests. Ran end-to-end tests. Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/is-not-empty-predicate-ir.cc M be/src/exprs/is-not-empty-predicate.cc M be/src/exprs/is-not-empty-predicate.h M be/src/exprs/slot-ref.cc A be/src/exprs/valid-tuple-id-ir.cc M be/src/exprs/valid-tuple-id.cc M be/src/exprs/valid-tuple-id.h M fe/src/main/java/org/apache/impala/analysis/TupleIsNullPredicate.java 12 files changed, 206 insertions(+), 46 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/3 -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6924: Add child queries to profile in compute stats
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11977 ) Change subject: IMPALA-6924: Add child queries to profile in compute stats .. Patch Set 2: (2 comments) Looks like a useful change http://gerrit.cloudera.org:8080/#/c/11977/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11977/2//COMMIT_MSG@9 PS2, Line 9: COMPUTE STATS triggers to child queries which do the actual stats Should this be "triggers some child queries"? Or "triggers two child queries"? http://gerrit.cloudera.org:8080/#/c/11977/2/be/src/service/child-query.h File be/src/service/child-query.h: http://gerrit.cloudera.org:8080/#/c/11977/2/be/src/service/child-query.h@129 PS2, Line 129: RuntimeProfile* profile_; Do you want a /// comment here? -- To view, visit http://gerrit.cloudera.org:8080/11977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5006c3b366d381eed4687e550cdfc463be3d1350 Gerrit-Change-Number: 11977 Gerrit-PatchSet: 2 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 29 Nov 2018 01:07:12 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6924: Add child queries to profile in compute stats
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11977 ) Change subject: IMPALA-6924: Add child queries to profile in compute stats .. Patch Set 4: Code-Review+1 (5 comments) This all looks good to me. I have comments that are more like questions so I can learn things. http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/client-request-state.cc File be/src/service/client-request-state.cc: http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/client-request-state.cc@572 PS4, Line 572: child_queries.push_back(ChildQuery(compute_stats_params.tbl_stats_query, this, I know you didn't write this line, but I see some people advocating that emplace_back is better than push_back as it can avoid a copy? http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/impala-hs2-server.cc File be/src/service/impala-hs2-server.cc: http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/service/impala-hs2-server.cc@907 PS4, Line 907: return_val.__set_profile(ss.str()); So this covers both the string and base64 cases? http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile-test.cc File be/src/util/runtime-profile-test.cc: http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile-test.cc@93 PS4, Line 93: EXPECT_TRUE(deserialized_profile->GetCounter("Not there") == NULL); == nullptr http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile.cc File be/src/util/runtime-profile.cc: http://gerrit.cloudera.org:8080/#/c/11977/4/be/src/util/runtime-profile.cc@888 PS4, Line 888: uint32_t deserialized_len = result_len; Would there be any advantage to using a cast here? http://gerrit.cloudera.org:8080/#/c/11977/4/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/11977/4/tests/query_test/test_observability.py@265 PS4, Line 265: matches = re.findall("Query \(id=.{,33}\)", results.runtime_profile) So 33 is the length of a query id? That made me think. -- To view, visit http://gerrit.cloudera.org:8080/11977 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5006c3b366d381eed4687e550cdfc463be3d1350 Gerrit-Change-Number: 11977 Gerrit-PatchSet: 4 Gerrit-Owner: Thomas Marshall Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Tue, 04 Dec 2018 17:41:49 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has abandoned this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Abandoned Will let Paul Rogers fix (or at least think about) in a bigger change -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Patch Set 4: Thanks Csaba, hope you didn't waste too much time on this -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 20 Nov 2018 00:04:21 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. IMPALA-7801: Remove toSql() from ParseNode interface. In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode, to allow options to be passed when generating SQL from a parse tree. This change was suggested as part of the review of IMPALA-5821. The old "toSql()" method is currently implemented everywhere by calling toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention could be accidentally ignored, To make the code more maintainable, remove the old "toSql()" method and have all callers call the new method instead. In addition, similarly replace "toSql()" in a few more associated classes: AnalyticWindow, LimitElement, Boundary, OrderByElement, SelectListItem. Finally, rename ToSqlOptions to ToSqlOption anbd noive it to be part of the ParseNode interface, and, in ToSqlOption, rename DEFAULT to ORIGINAL. TESTING: No new tests are added as there are no functional changes. All end to end tests run clean. Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokeRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java M fe/src/main/java/org/apache/impala/analysis/IsNullPredicate.java M
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. IMPALA-7801: Remove toSql() from ParseNode interface. In IMPALA-5821 the method "toSql(ToSqlOptions)" was added to ParseNode, to allow options to be passed when generating SQL from a parse tree. This change was suggested as part of the review of IMPALA-5821. The old "toSql()" method is currently implemented everywhere by calling toSql(ToSqlOptions.DEFAULT), but this is fragile as this convention could be accidentally ignored, To make the code more maintainable, remove the old "toSql()" method and have all callers call the new method instead. In addition, similarly replace "toSql()" in a few more associated classes: AnalyticWindow, LimitElement, Boundary, OrderByElement, SelectListItem. Finally, rename ToSqlOptions to ToSqlOption anbd noive it to be part of the ParseNode interface, and, in ToSqlOption, rename DEFAULT to ORIGINAL. TESTING: No new tests are added as there are no functional changes. All end to end tests run clean. Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 --- M fe/src/main/java/org/apache/impala/analysis/AdminFnStmt.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java M fe/src/main/java/org/apache/impala/analysis/AggregateInfoBase.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddDropRangePartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAddPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableAlterColStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableDropPartitionStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterTableSetLocationStmt.java M fe/src/main/java/org/apache/impala/analysis/AlterViewStmt.java M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java M fe/src/main/java/org/apache/impala/analysis/AnalyticExpr.java M fe/src/main/java/org/apache/impala/analysis/AnalyticInfo.java M fe/src/main/java/org/apache/impala/analysis/AnalyticWindow.java M fe/src/main/java/org/apache/impala/analysis/Analyzer.java M fe/src/main/java/org/apache/impala/analysis/ArithmeticExpr.java M fe/src/main/java/org/apache/impala/analysis/BaseTableRef.java M fe/src/main/java/org/apache/impala/analysis/BetweenPredicate.java M fe/src/main/java/org/apache/impala/analysis/BinaryPredicate.java M fe/src/main/java/org/apache/impala/analysis/BoolLiteral.java M fe/src/main/java/org/apache/impala/analysis/CaseExpr.java M fe/src/main/java/org/apache/impala/analysis/CastExpr.java M fe/src/main/java/org/apache/impala/analysis/ColumnDef.java M fe/src/main/java/org/apache/impala/analysis/CompoundPredicate.java M fe/src/main/java/org/apache/impala/analysis/ComputeStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDbStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateDropRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateOrAlterViewStmtBase.java M fe/src/main/java/org/apache/impala/analysis/CreateTableAsSelectStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeFileStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableLikeStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateTableStmt.java M fe/src/main/java/org/apache/impala/analysis/CreateViewStmt.java M fe/src/main/java/org/apache/impala/analysis/DeleteStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DescribeTableStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDataSrcStmt.java M fe/src/main/java/org/apache/impala/analysis/DropDbStmt.java M fe/src/main/java/org/apache/impala/analysis/DropFunctionStmt.java M fe/src/main/java/org/apache/impala/analysis/DropStatsStmt.java M fe/src/main/java/org/apache/impala/analysis/DropTableOrViewStmt.java M fe/src/main/java/org/apache/impala/analysis/ExistsPredicate.java M fe/src/main/java/org/apache/impala/analysis/Expr.java M fe/src/main/java/org/apache/impala/analysis/ExprSubstitutionMap.java M fe/src/main/java/org/apache/impala/analysis/ExtractFromExpr.java M fe/src/main/java/org/apache/impala/analysis/FromClause.java M fe/src/main/java/org/apache/impala/analysis/FunctionArgs.java M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokePrivStmt.java M fe/src/main/java/org/apache/impala/analysis/GrantRevokeRoleStmt.java M fe/src/main/java/org/apache/impala/analysis/HdfsCachingOp.java M fe/src/main/java/org/apache/impala/analysis/InPredicate.java M fe/src/main/java/org/apache/impala/analysis/InlineViewRef.java M fe/src/main/java/org/apache/impala/analysis/InsertStmt.java M fe/src/main/java/org/apache/impala/analysis/IsNotEmptyPredicate.java M
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Patch Set 2: (6 comments) Thanks Paul for the interesting review and discussion http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java File fe/src/main/java/org/apache/impala/analysis/Expr.java: http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@395 PS2, Line 395: String sql = toSql(DEFAULT); > Let's think about the meaning of the options for expressions. Expr nodes wo I understand the point about Expr nodes but there is no simple solution. The ToSqlOption(s) does allow more complex behaviors by having different methods that can be called on a ToSqlOption object. Csaba also suggested having some sort of context object to be passed as well. Maybe that is closer to what you want. In this change I am just trying to remove the default toSql() methods. http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/Expr.java@1300 PS2, Line 1300: name, (printExpr) ? " '" + toSql(DEFAULT) + "'" : "", type_.toString())); > This comment applies in a zillion other places. Our general rule for error In this change, as requested by Vuk, I am removing the default toSql() method. He thought, and I think I agree, that forcing a ToSqlOption object parameter makes things more consistent. If we don't want that then maybe I should abandon this change. http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java File fe/src/main/java/org/apache/impala/analysis/ParseNode.java: http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/ParseNode.java@33 PS2, Line 33: String toSql(ToSqlOptions options); > A common practice in Java is to include the enum definition in the interfac Nice suggestion, done. http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java File fe/src/main/java/org/apache/impala/analysis/QueryStmt.java: http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/analysis/QueryStmt.java@217 PS2, Line 217: "OFFSET requires an ORDER BY clause: " + limitElement_.toSql(DEFAULT).trim()); > I wonder if we can solve a mess here with exceptions also. Rather than buil Nice idea, but IMHO beyond the scope of this change. http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java File fe/src/main/java/org/apache/impala/planner/SortNode.java: http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@44 PS2, Line 44: import static org.apache.impala.analysis.ToSqlOptions.DEFAULT; > Two suggestions. Rename is done in next patch. I am not sure I agree about importing the enum (i.e. not a static import of the needed enums). I think it is tidier to have just ORIGINAL in the code. As it is (now) a common idiom I believe people will get used to reading this correctly. http://gerrit.cloudera.org:8080/#/c/11942/2/fe/src/main/java/org/apache/impala/planner/SortNode.java@214 PS2, Line 214: output.append(info_.getSortExprs().get(i).toSql(DEFAULT) + " "); > This is an example of the point made elsewhere. It SEEMS we are asking the OK, thanks for explaining this. I will add another followup jira which will say something like: IMPALA-7801 changed the code so that all ParseNode trees can be asked to create sql by calling toSql(SqlOption.ORIGINAL) or toSql(SqlOption.REWRITTEN). Although an Expr node produces output when toSql(SqlOption.ORIGINAL) is called, it is a lie in that it is not the original sql. The current Expr may have replaced a tree of the original Expr objects during query rewrite. The replaced objects are gone, so we cannot reconstruct the "original" sql. A possible solution is to observe that the select-like ParseNodes and Expr ParseNodes have different semantics and to change the interface of toSql() to express this. -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Fri, 16 Nov 2018 23:46:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Patch Set 4: reviewers should ignore patch set 3 which is very wrong -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Sat, 17 Nov 2018 00:49:32 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has removed Vuk Ercegovac from this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Removed reviewer Vuk Ercegovac. -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: deleteReviewer Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7801: Remove toSql() from ParseNode interface.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11942 ) Change subject: IMPALA-7801: Remove toSql() from ParseNode interface. .. Patch Set 4: I'm not happy with using ORIGINAL as the default argument. This doesn't really solve the problems that Paul has identified with the semantics of this parameter. Using ORIGINAL really just makes things worse in this regard. So I am thinking I will abandon this change unless someone wants the change to use an explicit DEFAULT to describe the give-me-whatever-you-have case. -- To view, visit http://gerrit.cloudera.org:8080/11942 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I17025901838e9ffd753894a8087170123f9d8b33 Gerrit-Change-Number: 11942 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Paul Rogers Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Mon, 19 Nov 2018 17:38:24 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr.
Andrew Sherman has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12068 ) Change subject: IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. .. IMPALA-7657: Codegen IsNotEmptyPredicate and ValidTupleIdExpr. These two classes evaluate scalar expressions. Previously codegen was done by calling ScalarExpr::GetCodegendComputeFnWrapper which generates a static method that calls the scalar expression evaluation method. Make this more efficient by generating code which is customized using information available at codegen time. Add new cross-compiled files is-not-empty-predicate-ir.cc null-literal- ir.cc slot-ref-ir.cc valid-tuple-id-ir.cc. IsNotEmptyPredicate works by getting a CollectionVal object from the single child Expr node, and counting its tuples. As preparation for codegen, split out the logic that counts the tuples to a new cross- compiled method "IsNotEmpty()". At codegen time we know the type and value of the child node. Generate a call to a node-specific non-virtual cross-compiled method to get the CollectionVal object from the child. Then generate a call to return the result of calling IsNotNull() on the CollectionVal object. A ValidTupleIdExpr node contains a vector of tuple ids. It works by probing each row for the tuple ids in the vector to find a non-null tuple. As preparation for codegen, split out the logic that probes the row for a non-null tuple id into a new cross-compiled method "IsTupleFromRowNonNull()". At codegen time we know the vector of tuple ids. We unroll the loop through the tuple ids, generating code that calls "IsTupleFromRowNonNull()", and returning the tuple id if/when a non-null tuple is found. IMPALA-7657 also requests replacing GetCodegendComputeFnWrapper() in TupleIsNullPredicate. In the current Impala code this method is never called. This is because TupleIsNullPredicate is always wrapped in an IfExpr. This is always codegen'd by IfExpr's GetCodegendComputeFnWrapper() method. There is a separate Jira IMPALA-7655 to improve codegen of IfExpr. Minor corrections: Correct the link to llvm tutorial in LlvmCodegen. PERFORMANCE: I tested performance on a local mini-cluster. I wrote some pathological queries to test the new code. The new codegen'd code is very similar in performance. ValidTupleIdExpr seems a bit faster, but IsNotEmptyPredicate is only very slightly faster or identical. Overall these changes are not purely for performance but to move away from GetCodegendComputeFnWrapper. TESTING: The changed scalar expressions are well exercised by current tests. Ran exhaustive end-to-end tests. Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 --- M be/src/codegen/gen_ir_descriptions.py M be/src/codegen/impala-ir.cc M be/src/codegen/llvm-codegen.h M be/src/exprs/CMakeLists.txt A be/src/exprs/is-not-empty-predicate-ir.cc M be/src/exprs/is-not-empty-predicate.cc M be/src/exprs/is-not-empty-predicate.h A be/src/exprs/null-literal-ir.cc M be/src/exprs/null-literal.cc M be/src/exprs/null-literal.h M be/src/exprs/scalar-expr.h A be/src/exprs/slot-ref-ir.cc M be/src/exprs/slot-ref.cc M be/src/exprs/slot-ref.h A be/src/exprs/valid-tuple-id-ir.cc M be/src/exprs/valid-tuple-id.cc M be/src/exprs/valid-tuple-id.h 17 files changed, 378 insertions(+), 37 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/12068/5 -- To view, visit http://gerrit.cloudera.org:8080/12068 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ifb87b9e3b879c278ce8638d97bcb320a7555a6b3 Gerrit-Change-Number: 12068 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12142 ) Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. Patch Set 3: (11 comments) Thanks for the code reviews http://gerrit.cloudera.org:8080/#/c/12142/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12142/3//COMMIT_MSG@9 PS3, Line 9: (usually because a user has : hit Control-C) > Not that important, but is this really true? Some other cancellation situat Thanks good point, "usually" is definitely wrong. http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.h@239 PS3, Line 239: Execution backend. > Thrift execution backend? Done http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/12142/1/be/src/runtime/coordinator-backend-state.h@294 PS1, Line 294: Status DoCancelQueryFInstancesRrpcWithRetry( > I believe our current clang-format rules will tell you to put the symbols n Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc@427 PS3, Line 427: auto > We generally avoid the use of auto as it hurts readability. Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/runtime/coordinator-backend-state.cc@440 PS3, Line 440: DoCancelQueryFInstancesRrpcWithRetry > Is there a reasonable way to make this generic like DoRpcWithRetry()? We'll I'm doing retry using a lambda http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/scheduling/query-schedule.h File be/src/scheduling/query-schedule.h: http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/scheduling/query-schedule.h@88 PS3, Line 88: execution backend > Thrift execution backend? Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h File be/src/service/control-service.h: http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@41 PS3, Line 41: public: > Not your code, but while you're here would you mind fixing the spacing in t Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@63 PS3, Line 63: /// Cancel any executing fragment instances for the query id specified in 'req'. > nit: could you move this under ReportExecStatus() to keep all of the rpc ha Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.h@64 PS3, Line 64: void > virtual Done http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.cc File be/src/service/control-service.cc: http://gerrit.cloudera.org:8080/#/c/12142/3/be/src/service/control-service.cc@173 PS3, Line 173: "query" > Since this is a constant, its not really necessary to Substitute() it. Done http://gerrit.cloudera.org:8080/#/c/12142/3/common/protobuf/control_service.proto File common/protobuf/control_service.proto: http://gerrit.cloudera.org:8080/#/c/12142/3/common/protobuf/control_service.proto@181 PS3, Line 181: rpc CancelQueryFInstances(CancelQueryFInstancesRequestPB) returns (CancelQueryFInstancesResponsePB); > Long line. Done, I added IMPALA-8047 for the clang-format support -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 07 Jan 2019 18:36:44 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7468: Port CancelQueryFInstances() to KRPC.
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12142 ) Change subject: IMPALA-7468: Port CancelQueryFInstances() to KRPC. .. IMPALA-7468: Port CancelQueryFInstances() to KRPC. When the Coordinator needs to cancel a query (for example because a user has hit Control-C), it does this by sending a CancelQueryFInstances message to each fragment instance. This change switches this code to use KRPC. Add new protobuf definitions for the messages, and remove the old thrift definitions. Move the server-side implementation of Cancel() from ImpalaInternalService to ControlService. Rework the scheduler so that the FInstanceExecParams always contains the KRPC address of the fragment executors, this address can then be used if a query is to be cancelled. For now keep the KRPC calls to CancelQueryFInstances() as synchronous. While moving the client-side code, remove the fault injection code that was inserted with FAULT_INJECTION_SEND_RPC_EXCEPTION and FAULT_INJECTION_RECV_RPC_EXCEPTION (triggered by running impalad with --fault_injection_rpc_exception_type=1) as this tickles code in client-cache.h which is now not used. TESTING: Ran all end-to-end tests. No new tests as test_cancellation.py provides good coverage. Checked in debugger that DebugAction style fault injection (triggered from test_cancellation.py) was working correctly. Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 --- M be/src/runtime/backend-client.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/scheduling/query-schedule.h M be/src/scheduling/scheduler.cc M be/src/scheduling/scheduler.h M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/service/impala-internal-service.cc M be/src/service/impala-internal-service.h M common/protobuf/control_service.proto M common/thrift/ImpalaInternalService.thrift 12 files changed, 192 insertions(+), 139 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/12142/4 -- To view, visit http://gerrit.cloudera.org:8080/12142 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I625030c3f1068061aa029e6e242f016cadd84969 Gerrit-Change-Number: 12142 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga
[Impala-ASF-CR] IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11459 Change subject: IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes. .. IMPALA-589: Add a sql function to return the impalad coordinator hostname for diagnostic purposes. In every execution of an Impala query, one of the impalad daemons acts as the coordinator node. In some cases, such as when using a proxy, a user cannot predict which host will act as the coordinator. To aid in diagnosis, we provide a sql function which returns the name of the host on which the coordinator is running. EXTERNAL DESCRIPTION: Add a builtin function called coordinator(), which returns the name of the host which is running the impalad that is acting as the coordinator for the current query. IMPLEMENTATION: All functions are passed a FunctionContext object which is the interface to the rest of the system for a UDF. From the FunctionContext we get the TQueryCtx (query context) which contains a TNetworkAddress for the coordinator. The hostname of the coordinator is copied from the TNetworkAddress. Can the TNetworkAddress for the coordinator be unitialized? In the thrift source the coord_address is marked optional, but my reading of the code says this is always set in a real impalad. To future-proof the code a null StringVal is returned if coord_address is unset. TESTING: - Added a basic unit test for the new function. - Added a unit test which simulates the case when coord_address is unset. - Hand tested in a development deployment. - Ran regression tests and got a clean run. LIMITATIONS: This change only adds the coordinator() function. It does not add the debug_url() function that was mentioned in the comments on the original Jira. Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 --- M be/src/exprs/expr-test.cc M be/src/exprs/utility-functions-ir.cc M be/src/exprs/utility-functions.h M common/function-registry/impala_functions.py M docs/topics/impala_misc_functions.xml 5 files changed, 61 insertions(+), 0 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/59/11459/1 -- To view, visit http://gerrit.cloudera.org:8080/11459 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I94d6e2664ba659b48df53c5c06f67b502c533e47 Gerrit-Change-Number: 11459 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman
[Impala-ASF-CR] IMPALA-7492: Add support for DATE text parser/formatter
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11450 ) Change subject: IMPALA-7492: Add support for DATE text parser/formatter .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc File be/src/runtime/date-value.cc: http://gerrit.cloudera.org:8080/#/c/11450/2/be/src/runtime/date-value.cc@89 PS2, Line 89: } Do you want to set an informative string if the value is not valid? -- To view, visit http://gerrit.cloudera.org:8080/11450 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1eec00f22502c4c67c6807c4b51384419ea8b831 Gerrit-Change-Number: 11450 Gerrit-PatchSet: 2 Gerrit-Owner: Attila Jeges Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Csaba Ringhofer Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Zoltan Borok-Nagy Gerrit-Comment-Date: Mon, 17 Sep 2018 23:59:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7579: use the generic filesystem client to delete and create files in test query profile contains all events.
Andrew Sherman has uploaded this change for review. ( http://gerrit.cloudera.org:8080/11461 Change subject: IMPALA-7579: use the generic filesystem_client to delete and create files in test_query_profile_contains_all_events. .. IMPALA-7579: use the generic filesystem_client to delete and create files in test_query_profile_contains_all_events. This bug was introduced by IMPALA-6568 which added the test_query_profile_contains_all_events test. This test creates a file in the filesystem so that is can be used by 'load data inpath'. The test was using the hdfs_client object to do file system operations, but this client only works with hdfs. Switch to using the filesystem_client object which can work on other filesystems, including s3. Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95 --- M tests/query_test/test_observability.py 1 file changed, 2 insertions(+), 2 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/11461/1 -- To view, visit http://gerrit.cloudera.org:8080/11461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95 Gerrit-Change-Number: 11461 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-7579: fix test query profile contains all events on S3
Andrew Sherman has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/11461 ) Change subject: IMPALA-7579: fix test_query_profile_contains_all_events on S3 .. IMPALA-7579: fix test_query_profile_contains_all_events on S3 This bug was introduced by IMPALA-6568 which added the test_query_profile_contains_all_events test. This test creates a file in the filesystem so that it can be used by 'load data inpath'. The test was using the hdfs_client object to do file system operations, but this client only works with hdfs. Switch to using the filesystem_client object which can work on other filesystems, including s3. Also create the file that will be moved by 'load data inpath' under the unique database directory; this means the file can be created without having to check if it exists already and must be deleted. Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95 --- M tests/query_test/test_observability.py 1 file changed, 2 insertions(+), 3 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/61/11461/2 -- To view, visit http://gerrit.cloudera.org:8080/11461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95 Gerrit-Change-Number: 11461 Gerrit-PatchSet: 2 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-7579: fix test query profile contains all events on S3
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/11461 ) Change subject: IMPALA-7579: fix test_query_profile_contains_all_events on S3 .. Patch Set 1: (5 comments) http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-7579: use the generic filesystem_client to delete and create files in test_query_profile_contains_all_events. > nit: keep commit message lines to 72 characters (and don't wrap the title l Done http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@7 PS1, Line 7: IMPALA-7579: use the generic filesystem_client to delete and create files in test_query_profile_contains_all_events. > make this line less verbose... something like "fix S3 file ops in test_quer Done http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@11 PS1, Line 11: is > nit: it Done http://gerrit.cloudera.org:8080/#/c/11461/1//COMMIT_MSG@11 PS1, Line 11: > nit: get rid of extra ws Done http://gerrit.cloudera.org:8080/#/c/11461/1/tests/query_test/test_observability.py File tests/query_test/test_observability.py: http://gerrit.cloudera.org:8080/#/c/11461/1/tests/query_test/test_observability.py@224 PS1, Line 224: tmp > is this where the db is created? I thought it was under 'test-warehouse' Thanks for the suggestion -- To view, visit http://gerrit.cloudera.org:8080/11461 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I7c6e0899455dd4e12636b959fab4bde79f02fb95 Gerrit-Change-Number: 11461 Gerrit-PatchSet: 1 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 19 Sep 2018 00:33:13 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8371: Return appropriate error code for unified backend tests
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12885 ) Change subject: IMPALA-8371: Return appropriate error code for unified backend tests .. Patch Set 2: Code-Review+1 (2 comments) LGTM http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py File bin/gen-backend-test-script.py: http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py@21 PS2, Line 21: # 1: The file location to write the generated script Aren't these parameters now named (--gtest_filter,--test_script_output) rather than positional? http://gerrit.cloudera.org:8080/#/c/12885/2/bin/gen-backend-test-script.py@32 PS2, Line 32: # The script template requires substitutions to set the variables used by the body. Nit: Clearer maybe to say "the script header template". -- To view, visit http://gerrit.cloudera.org:8080/12885 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia146d026d42f76d5ea12d92798f299182de03eef Gerrit-Change-Number: 12885 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Joe McDonnell Gerrit-Comment-Date: Tue, 02 Apr 2019 16:38:59 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. DoRpcWithRetry now only sleeps if the remote service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M common/protobuf/rpc_test.proto M tests/custom_cluster/test_rpc_timeout.py 14 files changed, 268 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/8 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 5: (3 comments) Thanks Michael. Please see the upcoming Patch Set 8. http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@333 PS5, Line 333: // Call DoRpcWithRetry with no backoff on our busy service. : const int retries = 3; : impala::Status rpc_status = RpcMgr::DoRpcWithRetry(proxy, ::Ping, : request, , query_ctx, "ping failed", retries, MonoDelta::FromSeconds(100)); : EXPECT_ERROR(rpc_status, TErrorCode::GENERAL); : EXPECT_STR_CONTAINS(rpc_status.msg().msg(), : "dropped due to backpressure. The service queue contains 1 items out of a maximum " : "of 1; memory consumption is "); : ASSERT_EQ(overflow_metric->GetValue(), retries); : : // Call DoRpcWithRetry, but this time we do backoff on a busy service, and this waiting : // allows the outstanding aysnc calls to complete, so that then this call will succeed. : impala::Status rpc_status_backoff = : RpcMgr::DoRpcWithRetry(proxy, ::Ping, request, , : query_ctx, "ping failed", 10, MonoDelta::FromSeconds(100), 2); : ASSERT_OK(rpc_status_backoff); : ASSERT_GT(overflow_metric->GetValue(), retries); : : // Check that async calls did complete. : ASSERT_FALSE(async1_done.pending()); : ASSERT_FALSE(async2_done.pending()); > How about the following idea: I did think about this. So the nice thing about your suggestion (thanks) is that we can set up the full queue. Now we want to have a call to DoRpcWithRetry that will block (because queue is full) and then succeed (after retry). So concurrently with the call to DoRpcWithRetry we need to unblock the queue. We could start the call to DoRpcWithRetry in a thread, and then sleep for a while, and then unblock the queue. That would work but is functionally equivalent to what we have now in that it depends on a sleep call. So I understand the desire for a completely deterministic test but I still don't see a simple way to achieve that. I am happy to listen to more suggestions, or explanations of what I am not understanding. http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/7/be/src/rpc/rpc-mgr.h@186 PS7, Line 186: og' fun > timeout_ms Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@178 PS5, Line 178: // A no-op method that is used as part of overloading DoRpcWithRetry(). : static void logging_noop(){}; : : // The type of the log method passed to DoRpcWithRetry(). : typedef boost::function RpcLogFn; > They still seem to be in PS7 Done -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 02 Apr 2019 18:17:46 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 8: Reviewers please wait for a future Patch Ste with a simplified test -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 8 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 02 Apr 2019 21:15:30 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/12846/4/bin/impala-config.sh@37 PS4, Line 37: # parse command line options It seems like options to impala-config.sh are currently passed by environment variable, for example USE_KUDU_DEBUG_BUILD can be set before sourcing impala-config.sh. Is there a particular reason you chose to add arguments to impala-config.sh rather than using the existing mechanism? -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 4 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Mon, 25 Mar 2019 22:52:50 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. Allow callers to specify a logging function that is called for rpc failures. Reviewers should note that the following changes to retry parameters were made: Cancel: add a 3 second sleep if service is busy. RemoteShutdown: add a 3 second sleep if service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 309 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/3 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 3 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8345 : Add option to set up minicluster to use Hive 3
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12846 ) Change subject: IMPALA-8345 : Add option to set up minicluster to use Hive 3 .. Patch Set 7: Code-Review+1 (5 comments) a few nits but looks good http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12846/6//COMMIT_MSG@10 PS6, Line 10: the minicluster scripts to use Hive 3.1.0 instead of CDH Hive 2.1.1. Nit: to optionally use http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh File bin/create-test-configuration.sh: http://gerrit.cloudera.org:8080/#/c/12846/6/bin/create-test-configuration.sh@132 PS6, Line 132: # Certain configurations (like SentrySyncHMSNotificationsPostListener) does not work s/does not/do not/ http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh File bin/impala-config.sh: http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@163 PS6, Line 163: export CDH_BUILD_NUMBER=909265 Nit: I wonder if CDP_BUILD_NUMBER should go here so that the 2 build numbers are together http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@534 PS6, Line 534: export HIVE_HOME="$CDP_COMPONENTS_HOME/apache-hive-${IMPALA_HIVE_VERSION}-bin" Why in one case is the home apache-hive-xxx and in the other it is hive-xxx ? http://gerrit.cloudera.org:8080/#/c/12846/6/bin/impala-config.sh@748 PS6, Line 748: echo "CDP_BUILD_NUMBER= $CDP_BUILD_NUMBER" It could be confusing that you always echo CDH_BUILD_NUMBER but you only echo CDP_BUILD_NUMBER when doing a CDP build. -- To view, visit http://gerrit.cloudera.org:8080/12846 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icfed856c1f5429ed45fd3d9cb08a5d1bb96a9605 Gerrit-Change-Number: 12846 Gerrit-PatchSet: 7 Gerrit-Owner: Vihang Karajgaonkar Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Todd Lipcon Gerrit-Reviewer: Vihang Karajgaonkar Gerrit-Comment-Date: Tue, 26 Mar 2019 19:12:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#9). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. DoRpcWithRetry now only sleeps if the remote service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 274 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/9 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 9 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6216: Make PYTHON EGG CACHE configurable
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12911 ) Change subject: IMPALA-6216: Make PYTHON_EGG_CACHE configurable .. Patch Set 1: Code-Review+1 LGTM -- To view, visit http://gerrit.cloudera.org:8080/12911 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I695b2b31d9045eef1a53268f6516858935aed508 Gerrit-Change-Number: 12911 Gerrit-PatchSet: 1 Gerrit-Owner: Yongzhi Chen Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Laszlo Gaal Gerrit-Comment-Date: Thu, 04 Apr 2019 16:22:26 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 5: (20 comments) Thanks for the reviews. Please see patch set 7. http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.h File be/src/rpc/rpc-mgr-test.h: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.h@296 PS5, Line 296: tha > typo Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc File be/src/rpc/rpc-mgr-test.cc: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@214 PS5, Line 214: ASSERT_OK(static_cast(scan_mem_impl) : -> > Not sure if it's output of clang-tidy or something but I find it a bit hard Yes this is clang-format, but I see now how to avoid this getting "fixed". http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@263 PS5, Line 263: shoudl > typo Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@291 PS5, Line 291: NULL > nullptr; Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@298 PS5, Line 298: RpcController controller; > not used ? yes! http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@316 PS5, Line 316: sleep_ms = 100; > Instead of sharing "sleep_ms" between RPCs, it may be cleaner to pass the s Thanks, this is cleaner http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@322 PS5, Line 322: proxy.get()-> > proxy-> Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@333 PS5, Line 333: // Call DoRpcWithRetry with no backoff on our busy service. : const int retries = 3; : impala::Status rpc_status = RpcMgr::DoRpcWithRetry(proxy, ::Ping, : request, , query_ctx, "ping failed", retries, MonoDelta::FromSeconds(100)); : EXPECT_ERROR(rpc_status, TErrorCode::GENERAL); : EXPECT_STR_CONTAINS(rpc_status.msg().msg(), : "dropped due to backpressure. The service queue contains 1 items out of a maximum " : "of 1; memory consumption is "); : ASSERT_EQ(overflow_metric->GetValue(), retries); : : // Call DoRpcWithRetry, but this time we do backoff on a busy service, and this waiting : // allows the outstanding aysnc calls to complete, so that then this call will succeed. : impala::Status rpc_status_backoff = : RpcMgr::DoRpcWithRetry(proxy, ::Ping, request, , : query_ctx, "ping failed", 10, MonoDelta::FromSeconds(100), 2); : ASSERT_OK(rpc_status_backoff); : ASSERT_GT(overflow_metric->GetValue(), retries); : : // Check that async calls did complete. : ASSERT_FALSE(async1_done.pending()); : ASSERT_FALSE(async2_done.pending()); > These set of tests seem rather timing dependent. For instance, it may occur I reviewed the test and tried to make to more bulletproof. I think any test that really does fill up the queue and test that the retry works in this case will have some asynchrony. I don't like tests with sleep calls but I think this should be reliable. Let me know if you are not persuaded. http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr-test.cc@378 PS5, Line 378: // If we fail 2x and retry 2x, then the call fails. : unique_ptr failing_proxy2 = : make_unique(2); : : Status rpc_status_fail = : RpcMgr::DoRpcWithRetry(failing_proxy2, ::Ping, request, : , boost::bind(::log, this), query_ctx, "ping failed", 2, : MonoDelta::FromSeconds(10)); > Please see comments in rpc-mgr.inline.h that we probably should only retry Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h File be/src/rpc/rpc-mgr.h: http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@178 PS5, Line 178: // A no-op method that is used as part of overloading DoRpcWithRetry(). : static void logging_noop(){}; : : // The type of the log method passed to DoRpcWithRetry(). : typedef boost::function RpcLogFn; > Not needed as discussed in person. Removed http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@189 PS5, Line 189: class > typename Done http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@191 PS5, Line 191: rpc_call > It's important to highlight that this only works iff rpc_call is idempotent Thanks http://gerrit.cloudera.org:8080/#/c/12672/5/be/src/rpc/rpc-mgr.h@193 PS5, Line 193: const kudu::MonoDelta& timeout, : const int server_busy_backoff_s = 0 > It
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#7). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. DoRpcWithRetry now only sleeps if the remote service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M common/protobuf/rpc_test.proto M tests/custom_cluster/test_rpc_timeout.py 14 files changed, 274 insertions(+), 62 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/7 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 7 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-6826: Extend bootstrap system.sh to Ubuntu 18.04
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12893 ) Change subject: IMPALA-6826: Extend bootstrap_system.sh to Ubuntu 18.04 .. Patch Set 1: Code-Review+1 (2 comments) LGTM http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG@13 PS1, Line 13: - add 'ant' to the list of onstalled packages typo: installed http://gerrit.cloudera.org:8080/#/c/12893/1//COMMIT_MSG@17 PS1, Line 17: These changes enable bootstratp_system.sh to set up an Impala development typo: bootstrap -- To view, visit http://gerrit.cloudera.org:8080/12893 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Iad790f72ea6b62258aed2225eb7bdf79590c350f Gerrit-Change-Number: 12893 Gerrit-PatchSet: 1 Gerrit-Owner: Laszlo Gaal Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Sat, 30 Mar 2019 00:01:10 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. Allow callers to specify a logging function that is called for rpc failures. Reviewers should note that the following changes to retry parameters were made: Cancel: add a 3 second sleep if service is busy. RemoteShutdown: add a 3 second sleep if service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 303 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/4 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. Allow callers to specify a logging function that is called for rpc failures. Reviewers should note that the following changes to retry parameters were made: Cancel: add a 3 second sleep if service is busy. RemoteShutdown: add a 3 second sleep if service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 302 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/5 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 5 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/12672/4/be/src/rpc/rpc-mgr.inline.h File be/src/rpc/rpc-mgr.inline.h: http://gerrit.cloudera.org:8080/#/c/12672/4/be/src/rpc/rpc-mgr.inline.h@56 PS4, Line 56: const int times_to_try, const kudu::MonoDelta& timeout, const int server_busy_backoff_s, > line too long (92 > 90) Done -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 4 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Tue, 26 Mar 2019 21:51:11 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Improve error handling for validation of unified backend executable
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12895 ) Change subject: Improve error handling for validation of unified backend executable .. Patch Set 1: Code-Review+1 LGTM -- To view, visit http://gerrit.cloudera.org:8080/12895 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ia84cd074be79bc9d99b0621a1ae216f3debf5833 Gerrit-Change-Number: 12895 Gerrit-PatchSet: 1 Gerrit-Owner: Joe McDonnell Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 01 Apr 2019 16:14:20 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8330: Impala shell config file should use flag names
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12823 ) Change subject: IMPALA-8330: Impala shell config file should use flag names .. Patch Set 2: (2 comments) Looks good http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG@10 PS2, Line 10: both short and long flag names instead of dest names because dest names My life would be easier if you explained what a 'dest name' is here. http://gerrit.cloudera.org:8080/#/c/12823/2//COMMIT_MSG@24 PS2, Line 24: ; Flags can be repeated. Flags can be repeated. So if I have query_options=DEFAULT_FILE_FORMAT=parquet Q=DEFAULT_FILE_FORMAT=orc will it all work OK? -- To view, visit http://gerrit.cloudera.org:8080/12823 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ic43603c1b538af08fddcab1b2c1f6ad1af1a6cb9 Gerrit-Change-Number: 12823 Gerrit-PatchSet: 2 Gerrit-Owner: Fredy Wijaya Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Fredy Wijaya Gerrit-Reviewer: Impala Public Jenkins Gerrit-Comment-Date: Mon, 25 Mar 2019 18:51:20 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-7800: Reject new connections after --fe service threads
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12579 ) Change subject: IMPALA-7800: Reject new connections after --fe_service_threads .. Patch Set 8: (5 comments) All I have are nits and questions http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h File be/src/rpc/TAcceptQueueServer.h: http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h@97 PS8, Line 97: /// Name of the thrift server Add terminating period (nit). http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.h@113 PS8, Line 113: /// Number of connections rejected due to timeout Add terminating period (nit). http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.cpp File be/src/rpc/TAcceptQueueServer.cpp: http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/rpc/TAcceptQueueServer.cpp@220 PS8, Line 220: } catch (string s) { I know this isn't part of this change, but can we really ever get string exceptions? http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/12579/8/be/src/service/impala-server.cc@241 PS8, Line 241: "the queue before we time it out and reject the connection request. A value of 0 " In the flags that configure the accept queue we call it "the post-accept, pre-setup connection queue", which is rather verbose, but do we want to use the same wording for consistency? Also the description might be clearer if it was rewritten to not use "we" (this is a definite nit) http://gerrit.cloudera.org:8080/#/c/12579/8/tests/custom_cluster/test_frontend_connection_limit.py File tests/custom_cluster/test_frontend_connection_limit.py: http://gerrit.cloudera.org:8080/#/c/12579/8/tests/custom_cluster/test_frontend_connection_limit.py@89 PS8, Line 89: raise ImpalaBeeswaxException(e.message, e) I am confused about this. Why do we raise an exception here? I see that the test is marked xfail(raises=ImpalaBeeswaxException)but doesn't make the test fragile in the case where some other bit of code raises a ImpalaBeeswaxException? Maybe this is a python testing pattern I am unfamiliar with. -- To view, visit http://gerrit.cloudera.org:8080/12579 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Idb345c1d84cc2f691f54ded467f253e758f87e64 Gerrit-Change-Number: 12579 Gerrit-PatchSet: 8 Gerrit-Owner: Zoram Thanga Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Reviewer: Zoram Thanga Gerrit-Comment-Date: Mon, 25 Mar 2019 18:00:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8143: Enhance DoRpcWithRetry().
Andrew Sherman has uploaded a new patch set (#6). ( http://gerrit.cloudera.org:8080/12672 ) Change subject: IMPALA-8143: Enhance DoRpcWithRetry(). .. IMPALA-8143: Enhance DoRpcWithRetry(). Allow callers of RpcMgr::DoRpcWithRetry to specify a time to sleep if the remote service is busy. Allow callers to specify a logging function that is called for rpc failures. Reviewers should note that the following changes to retry parameters were made: Cancel: add a 3 second sleep if service is busy. RemoteShutdown: add a 3 second sleep if service is busy. TESTING: Ran all end-to-end tests. Add a new test to rpc-mgr-test.cc which tests RpcMgr::DoRpcWithRetry in two ways. Firstly we test the case where the remote service is busy (which we force by filling up the queue), and secondly we exercise DoRpcWithRetry by using a fake Proxy that simulates failures. Clean up unused values of FaultInjectionUtil.RpcCallType and match values with test_rpc_timeout.py. Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 --- M be/src/rpc/impala-service-pool.cc M be/src/rpc/impala-service-pool.h M be/src/rpc/rpc-mgr-test.cc M be/src/rpc/rpc-mgr-test.h M be/src/rpc/rpc-mgr.h M be/src/rpc/rpc-mgr.inline.h M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/query-state.h M be/src/service/client-request-state.cc M be/src/service/control-service.cc M be/src/service/control-service.h M be/src/testutil/fault-injection-util.h M tests/custom_cluster/test_rpc_timeout.py 13 files changed, 302 insertions(+), 58 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/72/12672/6 -- To view, visit http://gerrit.cloudera.org:8080/12672 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: Ia9693151c35e02235665b3c285a48c585973d390 Gerrit-Change-Number: 12672 Gerrit-PatchSet: 6 Gerrit-Owner: Andrew Sherman Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall
[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12624 ) Change subject: IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy() .. Patch Set 1: Oh I see, the service-pool.cc version does not reject messages because of lack of memory, doh. -- To view, visit http://gerrit.cloudera.org:8080/12624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98 Gerrit-Change-Number: 12624 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 27 Feb 2019 21:16:03 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12624 ) Change subject: IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy() .. Patch Set 1: (3 comments) Change looks basically good, I have a few nits. http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc File be/src/rpc/impala-service-pool.cc: http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@119 PS1, Line 119: "The service queue is full; it has $3 items; memory " Isn't the message "The service queue is full" misleading? http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@120 PS1, Line 120: "consumption is $4", Nit: needs terminating "." http://gerrit.cloudera.org:8080/#/c/12624/1/be/src/rpc/impala-service-pool.cc@125 PS1, Line 125: PrettyPrinter::Print(service_mem_tracker_->consumption(), TUnit::BYTES)); There's a similar message in service-pool.cc Is it right that you are not changing the message there because there is no service_mem_tracker in that class? -- To view, visit http://gerrit.cloudera.org:8080/12624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98 Gerrit-Change-Number: 12624 Gerrit-PatchSet: 1 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Wed, 27 Feb 2019 21:11:03 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy()
Andrew Sherman has posted comments on this change. ( http://gerrit.cloudera.org:8080/12624 ) Change subject: IMPALA-8256: Better error message for ImpalaServicePool::RejectTooBusy() .. Patch Set 3: Code-Review+1 LGTM -- To view, visit http://gerrit.cloudera.org:8080/12624 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If0297658acf2b23823dcb7d2bdff5d8e4475bb98 Gerrit-Change-Number: 12624 Gerrit-PatchSet: 3 Gerrit-Owner: Michael Ho Gerrit-Reviewer: Andrew Sherman Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Michael Ho Gerrit-Reviewer: Thomas Marshall Gerrit-Comment-Date: Fri, 01 Mar 2019 16:53:46 + Gerrit-HasComments: No