> We are not supporting explicit load balancing constraints for retries.
> The retry attempt or hedged RPC will be re-resolved through the
> load-balancer, so it's up to the service owner to ensure that this has a
> low-likelihood of issuing the request to the same backend.

That seems fairly difficult for any service with request-dependent routing
semantics. Let's use a DFS as an example: many DFSes maintain N replicas of
a given file block. If you send a hedged request for a block and the load
balancer picks among those replicas uniformly, there is a 1/N chance of
requerying the same DFS node, which might well have a slow disk. At least
for us using HDFS, N=3 most of the time, so there is a 33% chance of
requerying the same node. Even assuming a smart load balancing service that
removes poorly performing storage nodes from service, it still seems
desirable to ensure hedged requests go to a different node. Not having a
story for more informed load balancing seems like it makes a lot of use
cases more difficult than they need to be.
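
To make the 1/N figure concrete, here is a minimal sketch, assuming the
hedge is re-resolved by a picker that chooses uniformly at random among the
replicas (a real LB policy may well behave differently). The names
pick_uniform, pick_excluding and same_node_rate are hypothetical helpers for
illustration only, not gRPC APIs, and the node names are made up.

```python
import random

def pick_uniform(replicas, rng):
    """Pick any replica uniformly at random (assumed LB behaviour)."""
    return rng.choice(replicas)

def pick_excluding(replicas, tried, rng):
    """Pick a replica not already tried; fall back to any if all were tried."""
    candidates = [r for r in replicas if r not in tried]
    return rng.choice(candidates or replicas)

def same_node_rate(replicas, hedge_picker, trials=100_000, seed=0):
    """Estimate how often the hedge lands on the node the first attempt used."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        first = pick_uniform(replicas, rng)
        hedge = hedge_picker(replicas, first, rng)
        hits += (hedge == first)
    return hits / trials

replicas = ["dn1", "dn2", "dn3"]  # N = 3, as in the HDFS case above

# Hedge re-resolved with no knowledge of where the first attempt went.
uninformed = same_node_rate(
    replicas, lambda rs, first, rng: pick_uniform(rs, rng))

# Hedge told which node the first attempt used, so it can avoid it.
informed = same_node_rate(
    replicas, lambda rs, first, rng: pick_excluding(rs, {first}, rng))
```

Under these assumptions, uninformed comes out close to 1/3, while informed
is 0 because the hedge never revisits the first node. That gap is what a
more informed load balancing story would buy.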

Regards,
Michael

On Sunday, February 12, 2017 at 7:24:59 PM UTC-7, Eric Gribkoff wrote:
>
> Hi Michael,
>
> Thanks for the feedback. Responses to your questions (and Josh's follow-up 
> question on retry backoff times) are inline below.
>
> On Sat, Feb 11, 2017 at 1:57 PM, 'Michael Rose' via grpc.io <
> [email protected]> wrote:
>
>> A few questions:
>>
>> 1) Under this design, is it possible to add load balancing constraints 
>> for retried/hedged requests? Especially during hedging, I'd like to be able 
>> to try a different server since the original server might be garbage 
>> collecting or have otherwise collected a queue of requests such that a 
>> retry/hedge to this server will not be very useful. Or, perhaps the key I'm 
>> looking up lives on a specific subset of storage servers and therefore 
>> should be balanced to that specific subset. While that's the domain of a LB 
>> policy, what information will hedging/retries provide to the LB policy?
>>
>>
> We are not supporting explicit load balancing constraints for retries. The 
> retry attempt or hedged RPC will be re-resolved through the load-balancer, 
> so it's up to the service owner to ensure that this has a low-likelihood of 
> issuing the request to the same backend. This is part of a decision to keep 
> the retry design as simple as possible while satisfying the majority of use 
> cases. If your load-balancing policy has a high likelihood of sending 
> requests to the same server each time, hedging (and to some extent retries) 
> will be less useful regardless. There will be metadata attached to the call 
> indicating that it's a retry, but it won't include information about which 
> servers the previous requests went to.
>
>  
>
>> 2) "Clients cannot override retry policy set by the service config." -- 
>> is this intended for inside Google? How about gRPC users outside of Google 
>> which don't use the DNS mechanism to push configuration? It seems like 
>> having a client override for retry/hedging policy is pragmatic.
>>
>>
> In general, we don't want to support client specification of retry 
> policies. The necessary information about what methods are safe to retry or 
> hedge, the potential for increased load, etc., are really decisions that 
> should be left to the service owner. The retry policy will definitely be a 
> part of the service config. While there are still some security-related 
> discussions about the exact delivery mechanism for the service config and 
> retry policies, I think your concern here should be part of the service 
> config design discussion rather than something specific to retry support.
>  
>
>> 3) Retry backoff time -- if I'm reading it right, it will always retry in 
>> random(0, current_backoff) milliseconds. What's your feeling on this vs. a 
>> retry w/ configurable jitter parameter (e.g. linear 1000ms increase w/ 10% 
>> jitter). Is it OK if there's no minimum backoff?
>>
>>
> You are reading the backoff time correctly. There are a number of ways of 
> doing this (see https://www.awsarchitectureblog.com/2015/03/backoff.html), 
> but choosing a delay from random(0, current_backoff) is done intentionally 
> and should generally give the best results. We do not want a configurable 
> "jitter" parameter. Empirically, retries benefit from more varied backoff 
> times, and we also do not want to let service owners specify very low 
> values for jitter (e.g., 1% or even 0), as this would cluster all retries 
> tightly together and further contribute to server overloading.
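
As a rough illustration of the clustering concern above, here is a minimal
sketch comparing random(0, current_backoff) with a narrow configurable
jitter. The function names, the 1000 ms backoff, the 1% jitter fraction and
the client count are all illustrative assumptions, not values from the gRFC.

```python
import random

def full_jitter(current_backoff_ms, rng):
    """Delay drawn uniformly from [0, current_backoff_ms]."""
    return rng.uniform(0, current_backoff_ms)

def fixed_jitter(current_backoff_ms, jitter_fraction, rng):
    """Hypothetical alternative: backoff plus or minus a jitter fraction."""
    spread = current_backoff_ms * jitter_fraction
    return rng.uniform(current_backoff_ms - spread,
                       current_backoff_ms + spread)

def spread_ms(delays):
    """Width of the window in which all the retries fire."""
    return max(delays) - min(delays)

rng = random.Random(0)
clients = 1000      # clients that all failed at about the same moment
backoff_ms = 1000   # illustrative current_backoff

full = [full_jitter(backoff_ms, rng) for _ in range(clients)]
narrow = [fixed_jitter(backoff_ms, 0.01, rng) for _ in range(clients)]

print("full jitter window (ms):", round(spread_ms(full)))
print("1% jitter window (ms):  ", round(spread_ms(narrow)))
```

With 1% jitter every retry lands within a 20 ms window around the 1000 ms
mark, whereas random(0, current_backoff) spreads the same retries across
nearly the whole backoff interval.
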
>
> Best,
>
> Eric Gribkoff
>  
>
>> Regards,
>> Michael
>>
>> On Friday, February 10, 2017 at 5:31:01 PM UTC-7, [email protected] 
>> wrote:
>>>
>>> I've created a gRFC describing the design and implementation plan for 
>>> gRPC Retries.
>>>
>>> Take a look at the gRPC on GitHub 
>>> <https://github.com/grpc/proposal/pull/12>.
>>>
>>
>>
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"grpc.io" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To post to this group, send email to [email protected].
Visit this group at https://groups.google.com/group/grpc-io.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/grpc-io/ce59f63d-1dee-46ff-a3eb-c813d15fc2dc%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
