[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-18 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 11:

> I was talking to Pooja offline and realised that the CPU limit
 > implementation is deeply flawed because we don't update the CPU
 > usage at any regular cadence - e.g. scanner threads only update it
 > when they exit.
 >
 > I'm thinking about making cpu_limit_s a "DEVELOPMENT" query option
 > and leaving it undocumented until we can address that issue.
 >
 > The bytes read implementation works as expected.

Sounds good, can you also create a JIRA for tracking that issue


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 11
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Jul 2018 22:17:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-18 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 11:

I was talking to Pooja offline and realised that the CPU limit implementation 
is deeply flawed because we don't update the CPU usage at any regular cadence - 
e.g. scanner threads only update it when they exit.

I'm thinking about making cpu_limit_s a "DEVELOPMENT" query option and leaving 
it undocumented until we can address that issue.

The bytes read implementation works as expected.


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 11
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 18 Jul 2018 21:00:20 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-17 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 11: Code-Review+1

(3 comments)

http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG@14
PS9, Line 14:
> nit:long line
can you include an example too?


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator-backend-state.cc@600
PS9, Line 600:   avg_profile_->AddInfoString("num instances", 
lexical_cast(num_instances_));
 : }
> maybe add other members of resource_utilization_ as well here
can you address this too?


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/query-options-test.cc@146
PS9, Line 146:  {-1, I64_MAX}},
> I just ran it through clang-format :).
i agree, looks more consistent now



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 11
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Jul 2018 23:39:04 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 11:

I really liked the suggestions btw, I think it's getting cleaner.


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 11
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 17 Jul 2018 00:47:44 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#10) to the change originally 
created by Mostafa Mokhtar. ( http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..

IMPALA-6034: Add CPU and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for CPU and
scanned bytes, limits are specified via query options CPU_LIMIT_S and
SCAN_BYTES_LIMIT and applied to the entire query, not per backend like
mem_limit. If a query exceeds any of the limits it is terminated. The
checks are periodic so the query may go somewhat over the limits.

Query profile is updated to include query wide and per backend metrics
for CPU and scanned bytes.

Testing:
Added tests for various permutations for CPU_LIMIT_S and
SCAN_BYTES_LIMIT

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
M tests/query_test/test_cancellation.py
M tests/query_test/test_resource_limits.py
14 files changed, 353 insertions(+), 52 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/10
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 10
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-13 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 9:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG@8
PS9, Line 8:
maybe mention the query_option names added in the patch. I know its mentioned 
in the "testing" part but would still prefer that its mentioned explicitly in 
the description.


http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG@10
PS9, Line 10: scanned bytes, limits are enforced via query options and applied 
to the entire
nit: long line


http://gerrit.cloudera.org:8080/#/c/10415/9//COMMIT_MSG@14
PS9, Line 14: Query profile is updated to include query wide and per backend 
metrics for Cpu
nit:long line

also include an example


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator-backend-state.cc@500
PS9, Line 500:
 :   // Update resource utilization and apply delta.
 :   ResourceUtilization prev_utilization = resource_utilization_;
 :   RuntimeProfile::Counter* user_time = 
profile_->GetCounter("TotalThreadsUserTime");
 :   if (user_time != nullptr) resource_utilization_.cpu_user_ns = 
user_time->value();
 :
 :   RuntimeProfile::Counter* system_time = 
profile_->GetCounter("TotalThreadsSysTime");
 :   if (system_time != nullptr) resource_utilization_.cpu_sys_ns = 
system_time->value();
 :
 :   int total_bytes = 0;
 :   for (RuntimeProfile::Counter* c: bytes_read_counters_) 
total_bytes += c->value();
 :   resource_utilization_.bytes_read = total_bytes;
 :
 :   RuntimeProfile::Counter* peak_mem =
 :   
profile_->GetCounter(FragmentInstanceState::PER_HOST_PEAK_MEM_COUNTER);
 :   if (peak_mem != nullptr) 
resource_utilization_.peak_mem_consumption = peak_mem->value();
 :   
aggregate_utilization->Merge(resource_utilization_.Delta(prev_utilization));
I feel we should just defer this calculation to 
BackendState::GetResourceUtilization(). Seems like we are only concerned with 
the aggregate level ResourceUtilization if ComputeQuerySummary() is called 
(only called at a terminal state) or if  ComputeQueryResourceUtilization() is 
called (which only happens if there is a resource limit on the query). This 
deferred calculation will help get rid of BackendState::lock_ required in this 
method and the logic around delta updates of ResourceUtilization. Moreover, 
queries that dont have a resource limit wont have to pay for this 
locking/updating cost everytime an update is recieved.

NOTE: if we envision resource management to always set a default limit on 
resources then we can just stick with the current implementation as having 
resources limits would be the norm rather than the exception.


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator-backend-state.cc@600
PS9, Line 600:   value->AddMember("peak_mem_consumption", 
resource_utilization_.peak_mem_consumption,
 :   document->GetAllocator());
maybe add other members of resource_utilization_ as well here


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/runtime/coordinator.h@171
PS9, Line 171: for this query
nit: since this struct can hold info for either finstance, backend and query 
level, maybe clarify what we mean here.


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/impala-server.cc@2026
PS9, Line 2026: crs
love the variable name


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/impala-server.cc@2036
PS9, Line 2036: CheckResourceLimits
nit: how about we return a status from CheckResourceLimits(), and on a non-ok 
status we call ExpireQuery(). This way its sounds intuitive what the name 
CheckResourceLimits() is doing and its consistent with the how the rest of the 
method behaves.

Dont have a strong preference so feel free to ignore this comment.


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/impala-server.cc@2125
PS9, Line 2125: TIME_NS
we can just use seconds here instead as the cpu_limit_s can only we defined in 
seconds


http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/10415/9/be/src/service/query-options-test.cc@146
PS9, Line 146:
nit: not a huge deal but we can probably get this in 

[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-11 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 9:

Could you take another look, Bikram?


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 9
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 12 Jul 2018 00:23:17 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 6:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h@198
PS8, Line 198:
> For this fragment instance
Done


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h@207
PS8, Line 207: /// Collection of BYTES_READ_COUNTERs of all scan nodes in 
this fragment instance.
> peak_mem_counter_ was removed...
Done


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.cc@294
PS8, Line 294:   // TODO: We're losing this profile information. Call 
ReportQuerySummary only after
> I'm considering adjusting this to use an O(1) algorithm - Update() could re
Done


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator.h@177
PS8, Line 177:
> This isn't accurate, could be fragment instances, backend or query.
Done


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.h@135
PS8, Line 135:TQueryOptionLevel::ADVANCED)\
> Need to fix tabs
Done


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.cc@682
PS8, Line 682: // Parse the scan bytes limit and validate it.
> Useless comment.
Done



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 03 Jul 2018 23:40:55 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#9) to the change originally 
created by Mostafa Mokhtar. ( http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..

IMPALA-6034: Add CPU and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for CPU and
scanned bytes, limits are enforced via query options and applied to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is updated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for CPU_LIMIT_S and
SCAN_BYTES_LIMIT

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
M tests/query_test/test_cancellation.py
M tests/query_test/test_resource_limits.py
14 files changed, 351 insertions(+), 43 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/9
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 9
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 8:

(6 comments)

There's some more cleanup to do after my first pass...

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h@198
PS8, Line 198: backend
For this fragment instance


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.h@207
PS8, Line 207: /// and peak_mem_counter_ from profile_.
peak_mem_counter_ was removed...


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator-backend-state.cc@294
PS8, Line 294: 
resource_utilization_.Merge(entry.second->resource_utilization_);
I'm considering adjusting this to use an O(1) algorithm - Update() could return 
the delta in resource consumption and this could just be a Merge().


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/runtime/coordinator.h@177
PS8, Line 177: backend
This isn't accurate, could be fragment instances, backend or query.


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.h
File be/src/service/query-options.h:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.h@135
PS8, Line 135:TQueryOptionLevel::ADVANCED)\
Need to fix tabs


http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/10415/8/be/src/service/query-options.cc@682
PS8, Line 682: // Parse the scan bytes limit and validate it.
Useless comment.



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 8
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 03 Jul 2018 19:12:08 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..


Patch Set 7:

(5 comments)

Addressed the previous comments and did various cleanup work.

http://gerrit.cloudera.org:8080/#/c/10415/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/6//COMMIT_MSG@18
PS6, Line 18: Added tests for various permutations for CPU_LIMIT_S and
> Something I noticed after playing around: How about we rename this to have
Done


http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1940
PS6, Line 1940:   if (session_timeout <= 0) return;
Fixed the null coordinator dereference and added a cancellation test that 
exercises the new code path.


http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1970
PS6, Line 1970: is();
> This comment applies to my previous work here, but "expired" is actually ki
Done


http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1978
PS6, Line 1978: if (session_state.second->ref_count > 0) continue;
> We still need to erase the event - now we have two copies of it so we've en
Done


http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1979
PS6, Line 1979: // A session closed by other means is in the process of 
being removed, and it's
> nit: tabs
Done



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 7
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 03 Jul 2018 19:07:03 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add CPU and scanned bytes limits per query

2018-07-03 Thread Tim Armstrong (Code Review)
Tim Armstrong has uploaded a new patch set (#7) to the change originally 
created by Mostafa Mokhtar. ( http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add CPU and scanned bytes limits per query
..

IMPALA-6034: Add CPU and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for CPU and
scanned bytes, limits are enforced via query options and applied to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is updated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for CPU_LIMIT_S and
SCAN_BYTES_LIMIT

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
M tests/query_test/test_cancellation.py
M tests/query_test/test_resource_limits.py
14 files changed, 338 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/7
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 7
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-07-02 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6:

I'm going to take over this patch from Mostafa. Need to do some cleanup but 
I'll post a new patchset soon.


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Sailesh Mukil 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 03 Jul 2018 01:14:45 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6:

Following diff prevents it from crashing immediately

-  
query_state->GetCoordinator()->AggregateBackendsResourceUsage(_resource_usage);
+  Coordinator* coord = query_state->GetCoordinator();
+  if (coord == nullptr) {
+queries_by_timestamp_.emplace(ExpirationEvent{
+now + 1000L, expiration_event->query_id, 
ExpirationKind::RESOURCE_LIMIT});
+expiration_event = queries_by_timestamp_.erase(expiration_event);
+continue;
+  }
+  coord->AggregateBackendsResourceUsage(_resource_usage);


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Sat, 16 Jun 2018 00:37:14 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6:

(2 comments)

I ran a local stress test with avro/snappy and it crashed almost immediately:

(gdb) down
#9  0x030e0378 in impala::Coordinator::AggregateBackendsResourceUsage 
(this=0x0,
query_resource_utilization=0x7f2f1ddc1270) at 
be/src/runtime/coordinator.cc:823
823   for (BackendState* backend_state: backend_states_) {
(gdb) p this
$4 = (impala::Coordinator * const) 0x0


(gdb) bt
#0  0x7f2f8befb428 in __GI_raise (sig=sig@entry=6) at 
../sysdeps/unix/sysv/linux/raise.c:54
#1  0x7f2f8befd02a in __GI_abort () at abort.c:89
#2  0x7f2f8ee372c9 in ?? () from 
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#3  0x7f2f8efe7c77 in ?? () from 
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#4  0x7f2f8ee405af in JVM_handle_linux_signal ()
   from /usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#5  0x7f2f8ee34408 in ?? () from 
/usr/lib/jvm/java-8-openjdk-amd64/jre/lib/amd64/server/libjvm.so
#6  
#7  0x030e5912 in 
__gnu_cxx::__normal_iterator > >::__normal_iterator 
(this=0x7f2f1ddc11c0,
__i=@0x10: )
at 
/home/tarmstrong/Impala/incubator-impala/toolchain/gcc-4.9.2/include/c++/4.9.2/bits/stl_iterator.h:729
#8  0x030e3e59 in std::vector >::begin (this=0x10)
at 
/home/tarmstrong/Impala/incubator-impala/toolchain/gcc-4.9.2/include/c++/4.9.2/bits/stl_vector.h:548
#9  0x030e0378 in impala::Coordinator::AggregateBackendsResourceUsage 
(this=0x0,
query_resource_utilization=0x7f2f1ddc1270) at 
be/src/runtime/coordinator.cc:823
#10 0x01e0210f in impala::ImpalaServer::ExpireQueries (this=0xc611800) 
at be/src/service/impala-server.cc:2001
#11 0x01e322c5 in boost::_mfi::mf0::operator() (this=0x7f2f1ddc1ce8, p=0xc611800)
at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/mem_fn_template.hpp:49
#12 0x01e3095c in 
boost::_bi::list1 
>::operator(), boost::_bi::list0> 
(this=0x7f2f1ddc1cf8, f=..., a=...)
at /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:253
#13 0x01e2de65 in boost::_bi::bind_t, 
boost::_bi::list1 > >::operator() 
(this=0x7f2f1ddc1ce8)
at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#14 0x01e2a6e7 in 
boost::detail::function::void_function_obj_invoker0, 
boost::_bi::list1 > >, void>::invoke (
function_obj_ptr=...) at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:153
#15 0x01b92aca in boost::function0::operator() 
(this=0x7f2f1ddc1ce0)
at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/function/function_template.hpp:767
#16 0x01f865e9 in impala::Thread::SuperviseThread(std::string const&, 
std::string const&, boost::function, impala::ThreadDebugInfo const*, 
impala::Promise*) (name="query-expirer",
category="impala-server", functor=..., parent_thread_info=0x0, 
thread_started=0x7fff1e696170)
at be/src/util/thread.cc:356
#17 0x01f8e6c1 in boost::_bi::list5, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value, 
boost::_bi::value*> 
>::operator(), impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0>(boost::_bi::type, void 
(*&)(std::string const&, std::string const&, boost::function, 
impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list0&, int) (this=0xe6561c0,
f=@0xe6561b8: 0x1f86282 , impala::ThreadDebugInfo 
const*, impala::Promise*)>, a=...)
at /opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind.hpp:525
#18 0x01f8e5e5 in boost::_bi::bind_t, impala::ThreadDebugInfo const*, 
impala::Promise*), 
boost::_bi::list5, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value, 
boost::_bi::value*> > 
>::operator()() (
this=0xe6561b8) at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/bind/bind_template.hpp:20
#19 0x01f8e5a8 in boost::detail::thread_data, 
impala::ThreadDebugInfo const*, impala::Promise*), boost::_bi::list5, 
boost::_bi::value, boost::_bi::value >, 
boost::_bi::value, 
boost::_bi::value*> > > >::run() 
(this=0xe656000) at 
/opt/Impala-Toolchain/boost-1.57.0-p3/include/boost/thread/detail/thread.hpp:116
#20 0x031db81a in thread_proxy ()
#21 0x7f2f8c2976ba in start_thread (arg=0x7f2f1ddc2700) at 
pthread_create.c:333
#22 0x7f2f8bfcd41d in clone () at 
../sysdeps/unix/sysv/linux/x86_64/clone.S:109

http://gerrit.cloudera.org:8080/#/c/10415/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/6//COMMIT_MSG@18
PS6, Line 18: Added tests for various permutations for MAX_CPU_TIME_S and 
MAX_SCAN_BYTES
Something I noticed after playing around: How about we rename this to have a 
_limit suffix for consistency with other 

[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-15 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1978
PS6, Line 1978:   // Query didn't cross the resource limits, move to 
the end of the queue
We still need to erase the event - now we have two copies of it so we've end up 
with exponential growth.


http://gerrit.cloudera.org:8080/#/c/10415/6/be/src/service/impala-server.cc@1979
PS6, Line 1979:   queries_by_timestamp_.emplace(ExpirationEvent{
nit: tabs



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 15 Jun 2018 23:53:33 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6: Verified+1


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 15 Jun 2018 00:11:54 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Mostafa Mokhtar (Code Review)
Hello Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10415

to look at the new patch set (#6).

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and applied to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is updated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
A tests/query_test/test_query_resource_limits.py
13 files changed, 375 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/6
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2675/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 6
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Jun 2018 20:59:08 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 5: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2671/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 5
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Jun 2018 20:13:56 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 5:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2671/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 5
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Thu, 14 Jun 2018 19:10:15 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-14 Thread Mostafa Mokhtar (Code Review)
Hello Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10415

to look at the new patch set (#5).

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and applied to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is updated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
A tests/query_test/test_query_resource_limits.py
13 files changed, 374 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/5
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 5
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 4:

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2648/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 4
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 13 Jun 2018 02:47:10 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-12 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2648/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 4
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Tue, 12 Jun 2018 23:29:28 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 4: Verified-1

Build failed: https://jenkins.impala.io/job/gerrit-verify-dryrun/2626/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 4
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Jun 2018 04:54:58 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-07 Thread Impala Public Jenkins (Code Review)
Impala Public Jenkins has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 4:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2626/


--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 4
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Impala Public Jenkins 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 08 Jun 2018 01:38:21 +
Gerrit-HasComments: No


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-07 Thread Mostafa Mokhtar (Code Review)
Hello Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10415

to look at the new patch set (#4).

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and applied to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is updated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
A tests/query_test/test_query_resource_limits.py
13 files changed, 376 insertions(+), 22 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/4
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 4
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-06-07 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 3:

(22 comments)

http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@10
PS3, Line 10: apply
> nit: applied
Done


http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@14
PS3, Line 14: upated
> updated
Done


http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@18
PS3, Line 18:
> nit: extra whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@113
PS3, Line 113: fragments
> nit: fragment
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@200
PS3, Line 200: /// used by Coordinator::BackendState::AggregateBackendStats
> maybe mention units. (same for cpu_sys_)
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@201
PS3, Line 201: cpu_user_
> nit: how about cpu_user_time_
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@207
PS3, Line 207: BYTES_READ_COUNTERs in profile_
> nit: Collection of BYTES_READ_COUNTERs of all the scan nodes in this fragme
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@213
PS3, Line 213: void Coordinator::BackendState::GetBackendResourceUtilization(
> We could just return BackendResourceUtilization by value and achieve the sa
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@341
PS3, Line 341: AggregateBackendStats
> do we need this call here? are any of the instance stats used in AggregateB
Coordinator::ComputeQuerySummary calls  
Coordinator::BackendState::UpdateExecStats before the final metrics are written 
to the query profile, otherwise stale info will be written.


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@527
PS3, Line 527:   RuntimeProfile::Counter* profile_user_time_counter = 
profile_->GetCounter("TotalThreadsUserTime");
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@533
PS3, Line 533:   RuntimeProfile::Counter* profile_system_time_counter = 
profile_->GetCounter("TotalThreadsSysTime");
> nit: long line
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@638
PS3, Line 638:   "peak_mem_consumption", 
resource_utilization_.peak_consumption, document->GetAllocator());
> nit:long line
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@173
PS3, Line 173: peak_consumption
> nit: how about peak_mem_consumption
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@176
PS3, Line 176: backend_
> nit: we can probably get rid of this prefix now that all these counters are
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@185
PS3, Line 185: Aggregate CPU and bytes read metrics
> update comment: Aggregate CPU, scanned bytes and peak memory consumption me
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@725
PS3, Line 725: info
> unused variable
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@732
PS3, Line 732: peak_memory = backend_resource_utilization.peak_consumption;
 : backend_user_cpu = 
backend_resource_utilization.backend_cpu_user;
 : backend_sys_cpu = 
backend_resource_utilization.backend_cpu_sys;
 : backend_scanned_bytes = 
backend_resource_utilization.backend_scanned_bytes;
> nit: can we directly use the values in the struct?
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1011
PS3, Line 1011:   queries_by_timestamp_.emplace(ExpirationEvent{
> We only need to queue one expiration event if both max_cpu_time_s and max_s
Done


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1909
PS3, Line 1909: // 4. Check and cancel queries with Cpu and scan bytes 
constraints if limit is exceeded
> nit:long line
Done



[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-31 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@213
PS3, Line 213: void Coordinator::BackendState::GetBackendResourceUtilization(
We could just return BackendResourceUtilization by value and achieve the same 
thing (copying all the members into the caller's stack allocation) I.e.


  BackendResourceUtilization 
Coordinator::BackendState::GetBackendResourceUtilization() {
lock_guard l(lock_);
return resource_utilization_;
  }


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@815
PS3, Line 815: query_resource_utilization->peak_consumption = max(
If we add more counters like this at some point we should come up with some 
shared logic for doing the aggregation instead of stamping it out. That seems 
like overkill for now though...


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1011
PS3, Line 1011:   queries_by_timestamp_.emplace(ExpirationEvent{
We only need to queue one expiration event if both max_cpu_time_s and 
max_scan_bytes are set. I.e. it should be something like:

  if (max_cpu_time_s > 0 || max_scan_bytes > 0) {
if (max_cpu_time_s > 0) {
  VLOG_QUERY << "Query " << PrintId(query_id) << " has cpu limit of "
 << PrettyPrinter::Print(max_cpu_time_s, TUnit::TIME_S);
}
if (max_scan_bytes > 0) {
  VLOG_QUERY << "Query " << PrintId(query_id) << " has scan bytes limit of "
 << PrettyPrinter::Print(max_scan_bytes, TUnit::BYTES);
}
  queries_by_timestamp_.emplace(ExpirationEvent{
  now + 1000L, query_id, ExpirationKind::RESOURCE_LIMIT});
}


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1934
PS3, Line 1934: query_resouce_usage
type:query_resource_usage


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc@1972
PS3, Line 1972: }
> looks like we are not updating the deadline for RESOURCE_LIMIT type of expi
Ah, good catch. Yeah I think we should do something similar to the query 
timeouts where we re-enqueue it for a later time.


expiration_event = queries_by_timestamp_.erase(expiration_event);
queries_by_timestamp_.emplace(ExpirationEvent{
UnixMillis() + 1000L, query_id, 
ExpirationKind::RESOURCE_LIMIT});

I think otherwise we run into some potential fairness issues where we're always 
processing these resource limit checks before other time limits.



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 3
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Mostafa Mokhtar 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Fri, 01 Jun 2018 00:16:18 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-29 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 3:

(21 comments)

http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@10
PS3, Line 10: apply
nit: applied


http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@14
PS3, Line 14: upated
updated


http://gerrit.cloudera.org:8080/#/c/10415/3//COMMIT_MSG@18
PS3, Line 18:
nit: extra whitespace


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@113
PS3, Line 113: fragments
nit: fragment


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@200
PS3, Line 200: /// used by Coordinator::BackendState::AggregateBackendStats
maybe mention units. (same for cpu_sys_)


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@201
PS3, Line 201: cpu_user_
nit: how about cpu_user_time_

(same for cpu_sys_)


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.h@207
PS3, Line 207: BYTES_READ_COUNTERs in profile_
nit: Collection of BYTES_READ_COUNTERs of all the scan nodes in this fragment 
instance.


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc
File be/src/runtime/coordinator-backend-state.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@218
PS3, Line 218:   resource_utilization->backend_scanned_bytes = 
resource_utilization_.backend_scanned_bytes;
long line


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@341
PS3, Line 341: AggregateBackendStats
do we need this call here? are any of the instance stats used in 
AggregateBackendStats updated in this method?


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@527
PS3, Line 527:   RuntimeProfile::Counter* profile_user_time_counter = 
profile_->GetCounter("TotalThreadsUserTime");
nit: long line


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@533
PS3, Line 533:   RuntimeProfile::Counter* profile_system_time_counter = 
profile_->GetCounter("TotalThreadsSysTime");
nit: long line


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator-backend-state.cc@638
PS3, Line 638:   "peak_mem_consumption", 
resource_utilization_.peak_consumption, document->GetAllocator());
nit:long line


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@173
PS3, Line 173: peak_consumption
nit: how about peak_mem_consumption


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@176
PS3, Line 176: backend_
nit: we can probably get rid of this prefix now that all these counters are 
under a struct named "BackendResourceUtilization"


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@179
PS3, Line 179: cpu_user
nit: cpu_user_time_
(same below)


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.h@185
PS3, Line 185: Aggregate CPU and bytes read metrics
update comment: Aggregate CPU, scanned bytes and peak memory consumption 
metrics across all backends


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@725
PS3, Line 725: info
unused variable


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/runtime/coordinator.cc@732
PS3, Line 732: peak_memory = backend_resource_utilization.peak_consumption;
 : backend_user_cpu = 
backend_resource_utilization.backend_cpu_user;
 : backend_sys_cpu = 
backend_resource_utilization.backend_cpu_sys;
 : backend_scanned_bytes = 
backend_resource_utilization.backend_scanned_bytes;
nit: can we directly use the values in the struct?
If the objective of doing this was to make the names of the variables shorter 
for readability, perhaps we can shorten "backend_resource_utilization" to  just 
"resource_usage"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1015
PS1, Line 1015: _QUERY << "Qu
> maybe have a named constant for this or a default constructor specifically
can you address this too?


http://gerrit.cloudera.org:8080/#/c/10415/3/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-21 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 1:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@115
PS1, Line 115: Cpu
> Maybe clarify:
These are methods returns total consumption per backend across fragment 
instances.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@118
PS1, Line 118:   /// Return total sys Cpu.
> /// Return total system CPU consumption across all backends.
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@121
PS1, Line 121:   /// Return total scanned bytes.
> /// Return total scanned bytes (from HDFS or HDFS-compatible filesystem) ac
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@124
PS1, Line 124:   void GetBackendResourceUsage(int64_t& user_cpu, int64_t& 
sys_cpu, int64_t& scanned_bytes,
> It would be more readable if this returned a struct. 4 output arguments wit
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@125
PS1, Line 125: int64_t& peak_mem
> nit: use pointers for output parameters
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@206
PS1, Line 206: /// total scan ranges complete across all scan nodes
> maybe add "for this backend"
This is a per fragment instance counter.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@209
PS1, Line 209: /// total user cpu consumed
> maybe add "for this backend"
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@213
PS1, Line 213: int64_t cpu_sys_ = 0;
> maybe add "for this backend"
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@281
PS1, Line 281:   /// total scan ranges complete across all scan nodes
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284:   /// total user cpu consumed
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@287
PS1, Line 287:   /// total system cpu consumed
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: void
> Maybe return a struct instead of multiple output arguments?
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: int64_t&
> nit: use pointer for output parameters
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@721
PS1, Line 721:
 :   int64_t total_cpu =0, total_scanned_bytes =0, backend_user_cpu 
= 0,
 :   backend_sys_cpu = 0,backend_bytes_read = 0, peak_memory = 
0;
> nit: various whitespace inconsistencies
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@726
PS1, Line 726:
> nit: don't need some of this vertical whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@751
PS1, Line 751:   query_profile_->AddInfoString("Total CPU time", 
cpu_total_info.str());
> nit: capitalisation is a bit inconsistent - I think everything should be ti
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@755
PS1, Line 755:   query_profile_->AddInfoString("Per Node User time", 
cpu_user_info.str());
> Any reason why we don't show the aggregate user and system CPU time across
Check "Total CPU time" which aggregates user and system CPU across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@802
PS1, Line 802: total_user_cpu += backend_state->GetUserCpu();
> Should we modify the BackendState functions so that we can get all three va
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h@1041
PS1, Line 1041: int64_t max_cpu_time_ns;
> I think we can avoid including these new fields in ExpirationEvent - the ex
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc
File 

[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-21 Thread Mostafa Mokhtar (Code Review)
Hello Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10415

to look at the new patch set (#3).

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and apply to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is upated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for  MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
A tests/query_test/test_query_resource_limits.py
13 files changed, 372 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/3
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 3
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-21 Thread Mostafa Mokhtar (Code Review)
Hello Tim Armstrong, Bikramjeet Vig,

I'd like you to reexamine a change. Please visit

http://gerrit.cloudera.org:8080/10415

to look at the new patch set (#2).

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and apply to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is upated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for  MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
A 
testdata/workloads/functional-query/queries/QueryTest/query-resource-limits.test
A tests/query_test/test_query_resource_limits.py
13 files changed, 373 insertions(+), 21 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/2
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 2
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-16 Thread Bikramjeet Vig (Code Review)
Bikramjeet Vig has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 1:

(5 comments)

just a few more cleanup asks

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1015
PS1, Line 1015: now + (1000L)
maybe have a named constant for this or a default constructor specifically for 
resource expiration events with a comment that resources checks are done every 
second.


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@286
PS1, Line 286: Scan bytes limit, after which a query will be cancelled
nit: "See comment in ImpalaService.thrift."  would work too


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@289
PS1, Line 289: CPU time limit in seconds, after which a query will be cancelled
same here


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@179
PS1, Line 179: client.execute("SET MAX_CPU_TIME_S=1")
 : client.execute("SET MAX_SCAN_BYTES=10G")
can use this instead to set multiple options:
set_configuration(self, config_option_dict)


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@182
PS1, Line 182: client.execute("SET MAX_CPU_TIME_S=0")
 : client.execute("SET MAX_SCAN_BYTES=0")
similarly can use this instead:

clear_configuration()



--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 1
Gerrit-Owner: Mostafa Mokhtar 
Gerrit-Reviewer: Bikramjeet Vig 
Gerrit-Reviewer: Tim Armstrong 
Gerrit-Comment-Date: Wed, 16 May 2018 21:07:40 +
Gerrit-HasComments: Yes


[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-16 Thread Tim Armstrong (Code Review)
Tim Armstrong has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..


Patch Set 1:

(29 comments)

I have one major concern about the coordinator locking that we need to fix then 
a bunch of cleanup asks.

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@115
PS1, Line 115: Cpu
Maybe clarify:

  /// Return total user CPU consumption across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@118
PS1, Line 118:   /// Return total sys Cpu.
/// Return total system CPU consumption across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@121
PS1, Line 121:   /// Return total scanned bytes.
/// Return total scanned bytes (from HDFS or HDFS-compatible filesystem) across 
all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@124
PS1, Line 124:   void GetBackendResourceUsage(int64_t& user_cpu, int64_t& 
sys_cpu, int64_t& scanned_bytes,
It would be more readable if this returned a struct. 4 output arguments with 
the same type makes it easy to mess up argument order.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@125
PS1, Line 125: int64_t& peak_mem
nit: use pointers for output parameters


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@206
PS1, Line 206: /// total scan ranges complete across all scan nodes
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@209
PS1, Line 209: /// total user cpu consumed
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@213
PS1, Line 213: int64_t cpu_sys_ = 0;
maybe add "for this backend"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@281
PS1, Line 281:   /// total scan ranges complete across all scan nodes
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284:   /// total user cpu consumed
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@287
PS1, Line 287:   /// total system cpu consumed
maybe add "across all backends. Updated by AggregateBackendStats()."


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: int64_t&
nit: use pointer for output parameters


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: void
Maybe return a struct instead of multiple output arguments?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@721
PS1, Line 721:
 :   int64_t total_cpu =0, total_scanned_bytes =0, backend_user_cpu 
= 0,
 :   backend_sys_cpu = 0,backend_bytes_read = 0, peak_memory = 
0;
nit: various whitespace inconsistencies


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@726
PS1, Line 726:
nit: don't need some of this vertical whitespace


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@751
PS1, Line 751:   query_profile_->AddInfoString("Total CPU time", 
cpu_total_info.str());
nit: capitalisation is a bit inconsistent - I think everything should be title 
case, i.e. "Read", "Time"


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@755
PS1, Line 755:   query_profile_->AddInfoString("Per Node User time", 
cpu_user_info.str());
Any reason why we don't show the aggregate user and system CPU time across all 
backends? I guess people will look at the top-level number then only drill down 
in rare cases?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@802
PS1, Line 802: total_user_cpu += backend_state->GetUserCpu();
Should we modify the BackendState functions so that we can get all three values 
with a single lock acquisition, e.g. just have it return a 
BackendResourceUtilization struct with all three elements?


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h@1041
PS1, Line 1041: int64_t max_cpu_time_ns;
I think we 

[Impala-ASF-CR] IMPALA-6034: Add Cpu and scanned bytes limits per query

2018-05-15 Thread Mostafa Mokhtar (Code Review)
Mostafa Mokhtar has uploaded this change for review. ( 
http://gerrit.cloudera.org:8080/10415


Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
..

IMPALA-6034: Add Cpu and scanned bytes limits per query

To protect Impala from runaway queries add per query limits for Cpu and
scanned bytes, limits are enforced via query options and apply to the entire
query, not per backend like mem_limit.
If a query exceeds any of the limits it will get cancelled.

Query profile is upated to include query wide and per backend metrics for Cpu
and scanned bytes.

Testing:
Added tests for various permutations for  MAX_CPU_TIME_S and MAX_SCAN_BYTES

Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
---
M be/src/runtime/coordinator-backend-state.cc
M be/src/runtime/coordinator-backend-state.h
M be/src/runtime/coordinator.cc
M be/src/runtime/coordinator.h
M be/src/service/impala-server.cc
M be/src/service/impala-server.h
M be/src/service/query-options-test.cc
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaInternalService.thrift
M common/thrift/ImpalaService.thrift
M tests/custom_cluster/test_query_expiration.py
12 files changed, 347 insertions(+), 13 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/15/10415/1
--
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 1
Gerrit-Owner: Mostafa Mokhtar