Grant Henke has posted comments on this change. ( http://gerrit.cloudera.org:8080/16343 )

Change subject: KUDU-1587 part 2: reject write ops if apply queue is overloaded
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/16343/6//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16343/6//COMMIT_MSG@23
PS6, Line 23: new behavior is not yet enabled by default
What is the rationale for not enabling this by default? Is there some 
threshold_ms that is long enough to be a good/safe default without negatively 
impacting any existing workloads?


http://gerrit.cloudera.org:8080/#/c/16343/6/src/kudu/tablet/tablet_replica.cc
File src/kudu/tablet/tablet_replica.cc:

http://gerrit.cloudera.org:8080/#/c/16343/6/src/kudu/tablet/tablet_replica.cc@456
PS6, Line 456: size_factor
I am not sure I fully understand why we want to give preference to smaller 
batches. Can you explain this a bit more?


http://gerrit.cloudera.org:8080/#/c/16343/6/src/kudu/tablet/tablet_replica.cc@445
PS6, Line 445:   // If the apply queue is overloaded, reject the incoming operation with
             :   // some probability.
             :   MonoDelta queue_otime;
             :   MonoDelta threshold;
             :   if (apply_pool_->QueueOverloaded(&queue_otime, &threshold)) {
             :     auto overload_threshold_ms = threshold.ToMilliseconds();
             :     // The longer the queue has been in the overloaded state, the higher the
             :     // probability of rejection.
             :     auto time_factor = queue_otime.ToMilliseconds() / overload_threshold_ms + 1;
             :     // The heavier the request in terms of number of rows, the higher the
             :     // probability of rejecting it if the apply queue is overloaded.
             :     auto size_factor = op_state->request()->row_operations().rows().size();
             :     auto factor = 1 + size_factor * time_factor;
             :     if (!rng_.OneIn(factor)) {
             :       return Status::ServiceUnavailable("op apply queue is overloaded");
             :     }
             :   }
> Could we instead bake this into the TabletService endpoint? We already do s
+1 to this if it's not too complicated.



--
To view, visit http://gerrit.cloudera.org:8080/16343
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6d7688d6fa832e606b8efc4549568fa52dfa1931
Gerrit-Change-Number: 16343
Gerrit-PatchSet: 6
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Andrew Wong <[email protected]>
Gerrit-Reviewer: Attila Bukor <[email protected]>
Gerrit-Reviewer: Grant Henke <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Comment-Date: Tue, 25 Aug 2020 16:42:48 +0000
Gerrit-HasComments: Yes
