[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Hello Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10415 to look at the new patch set (#6). Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and applied to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is updated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test A tests/query_test/test_query_resource_limits.py 13 files changed, 375 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/6 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 6 Gerrit-Owner: Mostafa Mokhtar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Hello Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10415 to look at the new patch set (#5). Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and applied to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is updated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test A tests/query_test/test_query_resource_limits.py 13 files changed, 374 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/5 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 5 Gerrit-Owner: Mostafa Mokhtar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Impala Public Jenkins Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Hello Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10415 to look at the new patch set (#4). Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and applied to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is updated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test A tests/query_test/test_query_resource_limits.py 13 files changed, 376 insertions(+), 22 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/4 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 4 Gerrit-Owner: Mostafa Mokhtar Gerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/10415 ) Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. Patch Set 3: (22 comments) http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@10 PS3, Line 10: apply > nit: applied Done http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@14 PS3, Line 14: upated > updated Done http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@18 PS3, Line 18: > nit: extra whitespace Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@113 PS3, Line 113: fragments > nit: fragment Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@200 PS3, Line 200: /// used by Coordinator::BackendState::AggregateBackendStats > maybe mention units. (same for cpu_sys_) Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@201 PS3, Line 201: cpu_user_ > nit: how about cpu_user_time_ Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@207 PS3, Line 207: BYTES_READ_COUNTERs in profile_ > nit: Collection of BYTES_READ_COUNTERs of all the scan nodes in this fragme Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc File be/src/runtime/coordinator-backend-state.cc: http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@213 PS3, Line 213: void Coordinator::BackendState::GetBackendResourceUtilization( > We could just return BackendResourceUtilization by value and achieve the sa Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@341 PS3, Line 341: AggregateBackendStats > do we need this call here? are any of the instance stats used in AggregateB Coordinator::ComputeQuerySummary calls Coordinator::BackendState::UpdateExecStats before the final metrics are written to the query profile, otherwise stale info will be written. http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@527 PS3, Line 527: RuntimeProfile::Counter* profile_user_time_counter = profile_->GetCounter("TotalThreadsUserTime"); > nit: long line Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@533 PS3, Line 533: RuntimeProfile::Counter* profile_system_time_counter = profile_->GetCounter("TotalThreadsSysTime"); > nit: long line Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@638 PS3, Line 638: "peak_mem_consumption", resource_utilization_.peak_consumption, document->GetAllocator()); > nit:long line Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@173 PS3, Line 173: peak_consumption > nit: how about peak_mem_consumption Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@176 PS3, Line 176: backend_ > nit: we can probably get rid of this prefix now that all these counters are Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@185 PS3, Line 185: Aggregate CPU and bytes read metrics > update comment: Aggregate CPU, scanned bytes and peak memory consumption me Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@725 PS3, Line 725: info > unused variable Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@732 PS3, Line 732: peak_memory = backend_resource_utilization.peak_consumption; : backend_user_cpu = backend_resource_utilization.backend_cpu_user; : backend_sys_cpu = backend_resource_utilization.backend_cpu_sys; : backend_scanned_bytes = backend_resource_utilization.backend_scanned_bytes; > nit: can we directly use the values in the struct? Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc File be/src/service/impala-server.cc: http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1011 PS3, Line 1011: queries_by_timestamp_.emplace(ExpirationEvent{ > We only need to queue one expiration event if both max_cpu_time_s and max_s Done http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1909 PS3, Line 1909: // 4. Check and cancel queries with Cpu and scan bytes constraints if limit is exceeded > nit:long line Done
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/10415 ) Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. Patch Set 1: (31 comments) http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h File be/src/runtime/coordinator-backend-state.h: http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@115 PS1, Line 115: Cpu > Maybe clarify: These are methods returns total consumption per backend across fragment instances. http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@118 PS1, Line 118: /// Return total sys Cpu. > /// Return total system CPU consumption across all backends. Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@121 PS1, Line 121: /// Return total scanned bytes. > /// Return total scanned bytes (from HDFS or HDFS-compatible filesystem) ac Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@124 PS1, Line 124: void GetBackendResourceUsage(int64_t& user_cpu, int64_t& sys_cpu, int64_t& scanned_bytes, > It would be more readable if this returned a struct. 4 output arguments wit Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@125 PS1, Line 125: int64_t& peak_mem > nit: use pointers for output parameters Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@206 PS1, Line 206: /// total scan ranges complete across all scan nodes > maybe add "for this backend" This is a per fragment instance counter. http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@209 PS1, Line 209: /// total user cpu consumed > maybe add "for this backend" Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@213 PS1, Line 213: int64_t cpu_sys_ = 0; > maybe add "for this backend" Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@281 PS1, Line 281: /// total scan ranges complete across all scan nodes > maybe add "across all backends. Updated by AggregateBackendStats()." Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@284 PS1, Line 284: /// total user cpu consumed > maybe add "across all backends. Updated by AggregateBackendStats()." Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@287 PS1, Line 287: /// total system cpu consumed > maybe add "across all backends. Updated by AggregateBackendStats()." Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h File be/src/runtime/coordinator.h: http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169 PS1, Line 169: void > Maybe return a struct instead of multiple output arguments? Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169 PS1, Line 169: int64_t& > nit: use pointer for output parameters Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc File be/src/runtime/coordinator.cc: http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@721 PS1, Line 721: : int64_t total_cpu =0, total_scanned_bytes =0, backend_user_cpu = 0, : backend_sys_cpu = 0,backend_bytes_read = 0, peak_memory = 0; > nit: various whitespace inconsistencies Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@726 PS1, Line 726: > nit: don't need some of this vertical whitespace Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@751 PS1, Line 751: query_profile_->AddInfoString("Total CPU time", cpu_total_info.str()); > nit: capitalisation is a bit inconsistent - I think everything should be ti Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@755 PS1, Line 755: query_profile_->AddInfoString("Per Node User time", cpu_user_info.str()); > Any reason why we don't show the aggregate user and system CPU time across Check "Total CPU time" which aggregates user and system CPU across all backends. http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@802 PS1, Line 802: total_user_cpu += backend_state->GetUserCpu(); > Should we modify the BackendState functions so that we can get all three va Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h File be/src/service/impala-server.h: http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h@1041 PS1, Line 1041: int64_t max_cpu_time_ns; > I think we can avoid including these new fields in ExpirationEvent - the ex Done http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc File
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Hello Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10415 to look at the new patch set (#3). Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and apply to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is upated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test A tests/query_test/test_query_resource_limits.py 13 files changed, 372 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/3 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 3 Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Hello Tim Armstrong, Bikramjeet Vig, I'd like you to reexamine a change. Please visit http://gerrit.cloudera.org:8080/10415 to look at the new patch set (#2). Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and apply to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is upated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift A testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test A tests/query_test/test_query_resource_limits.py 13 files changed, 373 insertions(+), 21 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/2 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 2 Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6035: Add query options to limit thread reservation
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/10365 ) Change subject: IMPALA-6035: Add query options to limit thread reservation .. Patch Set 4: (1 comment) http://gerrit.cloudera.org:8080/#/c/10365/4/testdata/workloads/functional-query/queries/QueryTest/resource-limits.test File testdata/workloads/functional-query/queries/QueryTest/resource-limits.test: http://gerrit.cloudera.org:8080/#/c/10365/4/testdata/workloads/functional-query/queries/QueryTest/resource-limits.test@1 PS4, Line 1: Can you rename the file to "thread-limits.test"? Same for added tests that test the thread limit? -- To view, visit http://gerrit.cloudera.org:8080/10365 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5b5bbbdad5cd6b24442eb6c99a4d38c2ad710007 Gerrit-Change-Number: 10365 Gerrit-PatchSet: 4 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Bikramjeet Vig Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Mon, 21 May 2018 16:26:31 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG@13 PS12, Line 13: includes S3, ADLS, and local fs. > there are todo's in this change to extend to hdfs. pls see the comment in P Does this change the dependency on fs.s3a.block.size? Having to rely on that parameter for generate even scan ranges is error prone. https://www.cloudera.com/documentation/enterprise/5-9-x/topics/impala_s3.html#s3_performance -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 15 May 2018 23:15:35 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query
Mostafa Mokhtar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10415 Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query .. IMPALA-6034: Add Cpu and scanned bytes limits per query To protect Impala from runaway queries add per query limits for Cpu and scanned bytes, limits are enforced via query options and apply to the entire query, not per backend like mem_limit. If a query exceeds any of the limits it will get cancelled. Query profile is upated to include query wide and per backend metrics for Cpu and scanned bytes. Testing: Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 --- M be/src/runtime/coordinator-backend-state.cc M be/src/runtime/coordinator-backend-state.h M be/src/runtime/coordinator.cc M be/src/runtime/coordinator.h M be/src/service/impala-server.cc M be/src/service/impala-server.h M be/src/service/query-options-test.cc M be/src/service/query-options.cc M be/src/service/query-options.h M common/thrift/ImpalaInternalService.thrift M common/thrift/ImpalaService.thrift M tests/custom_cluster/test_query_expiration.py 12 files changed, 347 insertions(+), 13 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/1 -- To view, visit http://gerrit.cloudera.org:8080/10415 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20 Gerrit-Change-Number: 10415 Gerrit-PatchSet: 1 Gerrit-Owner: Mostafa Mokhtar
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 12: (1 comment) http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/12//COMMIT_MSG@13 PS12, Line 13: includes S3, ADLS, and local fs. What about erasure coded HDFS data? -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 12 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Tue, 15 May 2018 17:26:02 + Gerrit-HasComments: Yes
[Impala-ASF-CR] CDH-65183 Update scripts in benchmark folder to store workload and few minor updates
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/10100 ) Change subject: CDH-65183 Update scripts in benchmark folder to store workload and few minor updates .. Patch Set 5: (4 comments) http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/perf_result_datastore.py File tests/benchmark/perf_result_datastore.py: http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/perf_result_datastore.py@298 PS5, Line 298: insert into workloadmetrics Fix indentation http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/perf_result_datastore.py@338 PS5, Line 338: profile = profile.replace('\"', '') Would be good to add a comment explaining why this is happening? The quotes here " break line below, would single quotes work instead? http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/report_benchmark_results.py File tests/benchmark/report_benchmark_results.py: http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/report_benchmark_results.py@735 PS5, Line 735: if not first_exec_summary: Add comment. Metadata only queries don't contain a summary, this code is to handle that case. Metadata query is for something like "alter table foo recover partitions" http://gerrit.cloudera.org:8080/#/c/10100/5/tests/benchmark/report_benchmark_results.py@1058 PS5, Line 1058: if exec_summaries[0] is None: Same as comment above, please add a comment line explaining what this does. -- To view, visit http://gerrit.cloudera.org:8080/10100 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Ica7c00ad59963d466bae9e607a4692af0138962c Gerrit-Change-Number: 10100 Gerrit-PatchSet: 5 Gerrit-Owner: Nithya JanarthananGerrit-Reviewer: David Knupp Gerrit-Reviewer: Michael Brown Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Tue, 24 Apr 2018 21:28:37 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Add new queries to targeted-perf workload
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9979 ) Change subject: Add new queries to targeted-perf workload .. Patch Set 3: Please add a description to the change describing the high level goals behind the queries. -- To view, visit http://gerrit.cloudera.org:8080/9979 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I5c415924d0bb6da1b1f5df6cb16b95a1d2eaa3ab Gerrit-Change-Number: 9979 Gerrit-PatchSet: 3 Gerrit-Owner: Nithya JanarthananGerrit-Reviewer: David Knupp Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Wed, 18 Apr 2018 00:55:54 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6792: Fail status reporting if coordinator refuses connections
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9916 ) Change subject: IMPALA-6792: Fail status reporting if coordinator refuses connections .. Patch Set 2: (2 comments) http://gerrit.cloudera.org:8080/#/c/9916/2/be/src/runtime/query-state.cc File be/src/runtime/query-state.cc: http://gerrit.cloudera.org:8080/#/c/9916/2/be/src/runtime/query-state.cc@260 PS2, Line 260: if (i < 2) SleepForMs(100); I would feel more comfortable if the 100ms is under a flag as this might uncover false positives on large clusters with busy coordinators. http://gerrit.cloudera.org:8080/#/c/9916/2/be/src/runtime/query-state.cc@274 PS2, Line 274: << "Query " << PrintId(query_id()) << " may hang. See IMPALA-2990."; > Should we include the reason for the abort here? Otherwise the large number Can't connect to coordinator Vs. Can't report status? -- To view, visit http://gerrit.cloudera.org:8080/9916 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: If668838f99f78b5ffa713488178b2eb5799ba220 Gerrit-Change-Number: 9916 Gerrit-PatchSet: 2 Gerrit-Owner: Sailesh MukilGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 04 Apr 2018 02:41:34 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-5931: Generates scan ranges in planner for s3/adls
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8523 ) Change subject: IMPALA-5931: Generates scan ranges in planner for s3/adls .. Patch Set 11: (1 comment) http://gerrit.cloudera.org:8080/#/c/8523/11//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/8523/11//COMMIT_MSG@34 PS11, Line 34: - all core tests pass when configured with s3 Do the tests validate that an "even" number of bytes is assigned to each backend? For S3 scans Impala relies fs.s3a.block.size to ensure scan ranges are evenly distributed.a -- To view, visit http://gerrit.cloudera.org:8080/8523 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I326065adbb2f7e632814113aae85cb51ca4779a5 Gerrit-Change-Number: 8523 Gerrit-PatchSet: 11 Gerrit-Owner: Vuk ErcegovacGerrit-Reviewer: Dan Hecht Gerrit-Reviewer: Dimitris Tsirogiannis Gerrit-Reviewer: Lars Volker Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 28 Mar 2018 06:02:19 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6675: Default to --compact catalog topic=true.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9661 ) Change subject: IMPALA-6675: Default to --compact_catalog_topic=true. .. Patch Set 2: Ran small and large payload test cases against a 130 node cluster, speedup from compaction and reduction in network traffic are good. -- To view, visit http://gerrit.cloudera.org:8080/9661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39a2dd42a21ef448b85278a8cef3c1d0112b844f Gerrit-Change-Number: 9661 Gerrit-PatchSet: 2 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Fri, 16 Mar 2018 15:28:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6675: Default to --compact catalog topic=true.
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9661 ) Change subject: IMPALA-6675: Default to --compact_catalog_topic=true. .. Patch Set 2: Code-Review+2 -- To view, visit http://gerrit.cloudera.org:8080/9661 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I39a2dd42a21ef448b85278a8cef3c1d0112b844f Gerrit-Change-Number: 9661 Gerrit-PatchSet: 2 Gerrit-Owner: Alex BehmGerrit-Reviewer: Alex Behm Gerrit-Reviewer: Bharath Vissapragada Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Fri, 16 Mar 2018 15:27:18 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6461 : Micro-optimizations to DataStreamSender::Send
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9221 ) Change subject: IMPALA-6461 : Micro-optimizations to DataStreamSender::Send .. Patch Set 8: Code-Review+2 Carry +2 -- To view, visit http://gerrit.cloudera.org:8080/9221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a9dad531a29d4838a3537ab0e04320a69960d Gerrit-Change-Number: 9221 Gerrit-PatchSet: 8 Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 21 Feb 2018 18:29:22 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6461 : Micro-optimizations to DataStreamSender::Send
Mostafa Mokhtar has uploaded a new patch set (#8). ( http://gerrit.cloudera.org:8080/9221 ) Change subject: IMPALA-6461 : Micro-optimizations to DataStreamSender::Send .. IMPALA-6461 : Micro-optimizations to DataStreamSender::Send While analyzing performance of partition exchange operator, I noticed that there is dependency and a function call per row in the hot path. Optimizations in this change are: 1) Remove the data dependency between computing the hash and the channel 2) Inline DataStreamSender::Channel::AddRow 3) Save partition_exprs_.size() to save a couple of instructions This translates to improving CPI for DataStreamSender::Send by 10% Change-Id: I642a9dad531a29d4838a3537ab0e04320a69960d --- M be/src/runtime/data-stream-sender.cc M be/src/runtime/krpc-data-stream-sender.cc M be/src/runtime/row-batch.h 3 files changed, 110 insertions(+), 48 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/21/9221/8 -- To view, visit http://gerrit.cloudera.org:8080/9221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I642a9dad531a29d4838a3537ab0e04320a69960d Gerrit-Change-Number: 9221 Gerrit-PatchSet: 8 Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong
[Impala-ASF-CR] IMPALA-6461 : Micro-optimizations to DataStreamSender::Send
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9221 ) Change subject: IMPALA-6461 : Micro-optimizations to DataStreamSender::Send .. Patch Set 7: (4 comments) http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/data-stream-sender.cc File be/src/runtime/data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/data-stream-sender.cc@451 PS7, Line 451: (num_rows - batch_start) > nit: unnecessary parenthesis Done http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/data-stream-sender.cc@485 PS7, Line 485: (num_rows - batch_start) > Same here Done http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/krpc-data-stream-sender.cc File be/src/runtime/krpc-data-stream-sender.cc: http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/krpc-data-stream-sender.cc@668 PS7, Line 668: (num_rows - batch_start) > Same here. Done http://gerrit.cloudera.org:8080/#/c/9221/7/be/src/runtime/krpc-data-stream-sender.cc@702 PS7, Line 702: (num_rows - batch_start) > Same here. Done -- To view, visit http://gerrit.cloudera.org:8080/9221 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I642a9dad531a29d4838a3537ab0e04320a69960d Gerrit-Change-Number: 9221 Gerrit-PatchSet: 7 Gerrit-Owner: Mostafa MokhtarGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Wed, 21 Feb 2018 18:29:56 + Gerrit-HasComments: Yes
[Impala-ASF-CR] Force inlining of BloomFilter::MakeMask
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9214 ) Change subject: Force inlining of BloomFilter::MakeMask .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h File be/src/util/bloom-filter.h: http://gerrit.cloudera.org:8080/#/c/9214/2/be/src/util/bloom-filter.h@162 PS2, Line 162: __attribute__((__target__("avx2"))); Yes, I see it too in BucketFindAvx2 0.67 ? mov%rdi,%r12 ? 1.07 ? mov%esi,%ebx ? 0.56 ? mov%edx,%edi ? 1.32 ? shl$0x5,%rbx ? 0.72 ? sub$0x18,%rsp ? 1.73 ?? callq impala::BloomFilter::MakeMask(unsigned int) Wondering if the multi-versioning attribute is still needed since MakeMask only called from BucketFindAVX2 and BucketInsertAVX2 which also have (__target__("avx2")). -- To view, visit http://gerrit.cloudera.org:8080/9214 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I89282f6c315570bea5ad8a0f854cb6eea0592923 Gerrit-Change-Number: 9214 Gerrit-PatchSet: 2 Gerrit-Owner: Tim ArmstrongGerrit-Reviewer: Mostafa Mokhtar Gerrit-Comment-Date: Tue, 06 Feb 2018 16:50:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9125 ) Change subject: IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace() .. Patch Set 3: Code-Review+1 -- To view, visit http://gerrit.cloudera.org:8080/9125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2 Gerrit-Change-Number: 9125 Gerrit-PatchSet: 3 Gerrit-Owner: Michael HoGerrit-Reviewer: Michael Ho Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Thu, 25 Jan 2018 02:36:01 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace()
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9125 ) Change subject: IMPALA-6356: Reduce amount of logging from RpczStore::LogTrace() .. Patch Set 1: (1 comment) http://gerrit.cloudera.org:8080/#/c/9125/1/be/src/rpc/rpc-mgr.cc File be/src/rpc/rpc-mgr.cc: http://gerrit.cloudera.org:8080/#/c/9125/1/be/src/rpc/rpc-mgr.cc@60 PS1, Line 60: FLAGS_rpc_duration_too_long_ms = 15 * 60 * 1000; I think the default of 15 minutes is too long, since we have noticed queries taking that long to fail in case of network partitioning. -- To view, visit http://gerrit.cloudera.org:8080/9125 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I347b0dea641368e10ba84bc40ec250c26a4f43b2 Gerrit-Change-Number: 9125 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Wed, 24 Jan 2018 22:55:21 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-6395: Add a flag for data stream sender's buffer size
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/9026 ) Change subject: IMPALA-6395: Add a flag for data stream sender's buffer size .. Patch Set 1: Is it this accumulation limit existed for Shuffle but not for broadcast? or was that an oversight? -- To view, visit http://gerrit.cloudera.org:8080/9026 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I385f4b7a0671bb2d7872bee60d476c375680b5c2 Gerrit-Change-Number: 9026 Gerrit-PatchSet: 1 Gerrit-Owner: Michael HoGerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Sailesh Mukil Gerrit-Comment-Date: Tue, 16 Jan 2018 19:07:59 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-6364: Bypass file handle cache for ineligible files
Mostafa Mokhtar has posted comments on this change. ( http://gerrit.cloudera.org:8080/8945 ) Change subject: IMPALA-6364: Bypass file handle cache for ineligible files .. Patch Set 2: Code-Review+1 Validated the the regression is addressed with this change. -- To view, visit http://gerrit.cloudera.org:8080/8945 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I4ab52b0884a909a4faeb6692f32d45878ea2838f Gerrit-Change-Number: 8945 Gerrit-PatchSet: 2 Gerrit-Owner: Joe McDonnellGerrit-Reviewer: Joe McDonnell Gerrit-Reviewer: Mostafa Mokhtar Gerrit-Reviewer: Tim Armstrong Gerrit-Comment-Date: Fri, 05 Jan 2018 06:29:18 + Gerrit-HasComments: No