Mostafa Mokhtar has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10415 )

Change subject: IMPALA-6034: Add Cpu and scanned bytes limits per query
......................................................................


Patch Set 1:

(31 comments)

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h
File be/src/runtime/coordinator-backend-state.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@115
PS1, Line 115: Cpu
> Maybe clarify:
These are methods returns total consumption per backend across fragment 
instances.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@118
PS1, Line 118:   /// Return total sys Cpu.
> /// Return total system CPU consumption across all backends.
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@121
PS1, Line 121:   /// Return total scanned bytes.
> /// Return total scanned bytes (from HDFS or HDFS-compatible filesystem) ac
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@124
PS1, Line 124:   void GetBackendResourceUsage(int64_t& user_cpu, int64_t& 
sys_cpu, int64_t& scanned_bytes,
> It would be more readable if this returned a struct. 4 output arguments wit
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@125
PS1, Line 125: int64_t& peak_mem
> nit: use pointers for output parameters
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@206
PS1, Line 206:     /// total scan ranges complete across all scan nodes
> maybe add "for this backend"
This is a per fragment instance counter.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@209
PS1, Line 209:     /// total user cpu consumed
> maybe add "for this backend"
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@213
PS1, Line 213:     int64_t cpu_sys_ = 0;
> maybe add "for this backend"
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@281
PS1, Line 281:   /// total scan ranges complete across all scan nodes
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@284
PS1, Line 284:   /// total user cpu consumed
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator-backend-state.h@287
PS1, Line 287:   /// total system cpu consumed
> maybe add "across all backends. Updated by AggregateBackendStats()."
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h
File be/src/runtime/coordinator.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: void
> Maybe return a struct instead of multiple output arguments?
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.h@169
PS1, Line 169: int64_t&
> nit: use pointer for output parameters
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc
File be/src/runtime/coordinator.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@721
PS1, Line 721:
             :   int64_t total_cpu =0, total_scanned_bytes =0, backend_user_cpu 
= 0,
             :       backend_sys_cpu = 0,backend_bytes_read = 0, peak_memory = 
0;
> nit: various whitespace inconsistencies
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@726
PS1, Line 726:
> nit: don't need some of this vertical whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@751
PS1, Line 751:   query_profile_->AddInfoString("Total CPU time", 
cpu_total_info.str());
> nit: capitalisation is a bit inconsistent - I think everything should be ti
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@755
PS1, Line 755:   query_profile_->AddInfoString("Per Node User time", 
cpu_user_info.str());
> Any reason why we don't show the aggregate user and system CPU time across
Check "Total CPU time" which aggregates user and system CPU across all backends.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/runtime/coordinator.cc@802
PS1, Line 802:     total_user_cpu += backend_state->GetUserCpu();
> Should we modify the BackendState functions so that we can get all three va
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h
File be/src/service/impala-server.h:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.h@1041
PS1, Line 1041:     int64_t max_cpu_time_ns;
> I think we can avoid including these new fields in ExpirationEvent - the ex
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc
File be/src/service/impala-server.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1934
PS1, Line 1934:           int64_t cpu_time_ns =0 , scan_bytes =0;
> nit: weird whitespace
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1943
PS1, Line 1943:                 "Query $0 expired due to cpu limit of $1",
> Should these messages include the amount of CPU consumed by the query?
This is intentional, to avoid error messages like:
  Query cancelled due consuming 500 Cpu seconds which exceeds limit of 100s.


There are cases where the runtime profiles come in slower than expected which 
results in a non-linear increase in reported Cpu time. Query can report 0 Cpu 
time for a couple of seconds then report a much higher number than the limit.

Also Cpu consumed so far is not a good indicator for how much higher the Cpu 
limit needs to be.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1957
PS1, Line 1957:                 "Query $0 expired due to scan bytes limit of 
$1",
> Should these messages include the bytes scanned by the query?
Scanned bytes are added to the query profile.
Check previous comment.


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/impala-server.cc@1967
PS1, Line 1967:
> I think we can get rid of this blank line and following comment.
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/query-options-test.cc
File be/src/service/query-options-test.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/query-options-test.cc@232
PS1, Line 232:       {MAKE_OPTIONDEF(exec_time_limit_s),              {0, 
I32_MAX}},
> Add the CPU time option here?
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/query-options.cc
File be/src/service/query-options.cc:

http://gerrit.cloudera.org:8080/#/c/10415/1/be/src/service/query-options.cc@678
PS1, Line 678:         LOG(INFO) << "Max Cpu time is : " << max_cpu_time_s << " 
seconds";
> Leftover log message?
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift
File common/thrift/ImpalaInternalService.thrift:

http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@286
PS1, Line 286: Scan bytes limit, after which a query will be cancelled
> nit: "See comment in ImpalaService.thrift."  would work too
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/common/thrift/ImpalaInternalService.thrift@289
PS1, Line 289: CPU time limit in seconds, after which a query will be cancelled
> same here
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@168
PS1, Line 168:   @CustomClusterTestSuite.with_args("--idle_query_timeout=0")
> I don't think this actually needs to be a custom cluster test.
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@179
PS1, Line 179: client.execute("SET MAX_CPU_TIME_S=10000")
             :     client.execute("SET MAX_SCAN_BYTES=10G")
> can use this instead to set multiple options:
Done


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@182
PS1, Line 182: client.execute("SET MAX_CPU_TIME_S=0")
             :     client.execute("SET MAX_SCAN_BYTES=0")
> similarly can use this instead:
Tried client.clear_configuration() and doesn't work as expected.


http://gerrit.cloudera.org:8080/#/c/10415/1/tests/custom_cluster/test_query_expiration.py@219
PS1, Line 219:     sleep(30)
> We should avoid long sleeps in tests, they make the tests slow and can mask
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10415
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I4c6015e21da684bb9f33e236d71309dd4c178a20
Gerrit-Change-Number: 10415
Gerrit-PatchSet: 1
Gerrit-Owner: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bikramjeet....@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com>
Gerrit-Comment-Date: Mon, 21 May 2018 22:05:11 +0000
Gerrit-HasComments: Yes

Reply via email to