> If you want the in-flight requests to be reported by the server, then I 
don't see why we'd need another metric here

I see the confusion here. I don't want in-flight requests to be reported by 
the server. The client would keep track of *its own* in-flight requests. 
The weight could then be calculated from the *client-side* in-flight request 
count combined with the *server-side* reported qps and CPU utilization.
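
To make that concrete, here is a rough sketch of the kind of weight function 
I have in mind (purely illustrative; the names and the way the in-flight 
count is folded in are made up, not a concrete proposal for the formula):

    // weight combines the server-reported load (qps and CPU utilization,
    // as in the current design) with this client's own count of requests
    // currently in flight to the endpoint.
    func weight(qps, cpuUtilization float64, clientInFlight int) float64 {
        if cpuUtilization <= 0 {
            return 0 // no utilization signal yet
        }
        // Current design: weight is qps / CPU utilization.
        w := qps / cpuUtilization
        // Hypothetical extension: discount endpoints that this client
        // is already loading heavily with outstanding requests.
        return w / (1 + float64(clientInFlight))
    }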

> I think the cases you mention here are fundamentally different, because 
they involve request latency being driven primarily by things other than 
CPU utilization.  I don't think those use-cases are going to be able to be 
addressed by this design.  I suspect those are more likely to be addressed 
by something like a least_request policy

Makes sense, thanks! 


On Monday, February 13, 2023 at 11:01:47 PM UTC+1 Mark D. Roth wrote:

> On Mon, Feb 13, 2023 at 1:19 PM Tommy Ulfsparre <to...@ulfsparre.se> 
> wrote:
>
>>
>> > I don't think we'd want the client to do its own tracking of in-flight 
>> requests to each endpoint, because the endpoint may also be receiving 
>> requests from many other clients at the same time, and the client would 
>> not see those, so it could result in incorrect weights
>>
>> The client will see those because they're carried through server-side load 
>> reporting? In-flight requests would just be an added variable to the weight 
>> calculation function, which currently only consists of qps / CPU utilization. 
>> Or did I misunderstand what you meant here?
>>
>
> If you want the in-flight requests to be reported by the server, then I 
> don't see why we'd need another metric here.  The server could simply 
> choose to increment its qps metric at the start of each request rather than 
> at the end of each request, so the existing qps metric would also include 
> in-flight requests.  This design does not dictate how the server computes 
> the values it reports.
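>
> Just to illustrate (a hypothetical sketch, not part of the design; the 
> handler names are made up), that server-side choice could be as simple as 
> where the counter is bumped:
>
>     var requestCount atomic.Int64 // sync/atomic; basis for the reported qps
>
>     func handle(req Request) Response {
>         // Incrementing at the start of the request means the derived
>         // qps metric also reflects requests still in flight.
>         requestCount.Add(1)
>         return process(req)
>     }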
>  
>
>>
>> > I think it's both more correct and simpler to do this based solely on 
>> the metrics reported by the endpoint
>>
>> The current design would not penalize endpoints with higher latency, or 
>> other things that can cause the client to perceive higher latency, such as 
>> (stop-the-world) garbage collection or CPU throttling. If that is not the 
>> goal of this design, then opting for something simpler makes sense. 
>>
>
> The goal of this policy is to provide balanced CPU utilization over a set 
> of backends that are otherwise equivalent -- i.e., request latency is 
> expected to be driven primarily by CPU utilization, so balancing the CPU 
> utilization will also balance the request latency.
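>
> As a concrete example of that weighting (using the qps / CPU utilization 
> form discussed above): two backends each reporting 100 qps, one at 0.8 CPU 
> and one at 0.4 CPU, get weights 100/0.8 = 125 and 100/0.4 = 250, so the 
> less-loaded backend receives roughly twice the traffic until utilization 
> evens out.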
>
> I think the cases you mention here are fundamentally different, because 
> they involve request latency being driven primarily by things other than 
> CPU utilization.  I don't think those use-cases are going to be able to be 
> addressed by this design.  I suspect those are more likely to be addressed 
> by something like a least_request policy.
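>
> (For reference, and purely as a hypothetical sketch rather than anything 
> this design specifies: a least_request pick is typically something like 
> power-of-two-choices over client-side in-flight counts.)
>
>     // imports: math/rand, sync/atomic; names are made up
>     type endpoint struct {
>         inFlight atomic.Int64 // requests this client has outstanding
>     }
>
>     func pick(endpoints []*endpoint) *endpoint {
>         // Compare two random endpoints and take the less busy one.
>         a := endpoints[rand.Intn(len(endpoints))]
>         b := endpoints[rand.Intn(len(endpoints))]
>         if a.inFlight.Load() <= b.inFlight.Load() {
>             return a
>         }
>         return b
>     }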
>  
>
>> On Monday, February 13, 2023 at 9:04:00 PM UTC+1 Mark D. Roth wrote:
>>
>>> I don't think we'd want the client to do its own tracking of in-flight 
>>> requests to each endpoint, because the endpoint may also be receiving 
>>> requests from many other clients at the same time, and the client would 
>>> not see those, so it could result in incorrect weights.  I think it's both 
>>> more correct and simpler to do this based solely on the metrics reported by 
>>> the endpoint.
>>>
>>> On Mon, Feb 13, 2023 at 11:38 AM Tommy Ulfsparre <to...@ulfsparre.se> 
>>> wrote:
>>>
>>>> Hey Mark,
>>>>
>>>> I read the proposal, and my question was not about basing the weight on 
>>>> in-flight requests or network latency alone; rather, are there cases where 
>>>> you wouldn't want to include both? Meaning, could the existing design be 
>>>> improved by including the client's in-flight request count in addition to 
>>>> the server-reported CPU utilization and request rate in the final weight 
>>>> calculation? Does that make sense? 
>>>>
>>>> On Monday, February 13, 2023 at 7:50:09 PM UTC+1 Mark D. Roth wrote:
>>>>
>>>>> This design does not actually use any info about in-flight requests or 
>>>>> network latencies.  It weights backends purely by the CPU utilization and 
>>>>> request rate reported by the endpoint.
>>>>>
>>>>> It's certainly possible to write an LB policy that weights based on 
>>>>> in-flight requests or network latency, but that's not the goal of this 
>>>>> particular policy.
>>>>>
>>>>> On Tue, Feb 7, 2023 at 6:27 AM Tommy Ulfsparre <to...@ulfsparre.se> 
>>>>> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> Looking forward to seeing this proposal implemented!
>>>>>>
>>>>>> Are there cases where you wouldn't want to also include client-local 
>>>>>> observations (like in-flight requests) in the weight calculation? 
>>>>>>
>>>>>> How would WRR behave for a client that load balances over a set of 
>>>>>> endpoints where a subset of the endpoints has higher (network) latencies, 
>>>>>> meaning latencies that aren't observable server-side? Instead of choosing 
>>>>>> between least_request and WRR, could we get the benefits of both? 
>>>>>>
>>>>>>
>>>>>> On Monday, February 6, 2023 at 7:05:02 AM UTC+1 Yousuk Seung wrote:
>>>>>>
>>>>>>> This is the discussion thread for A58: Weighted Round Robin LB 
>>>>>>> Policy.
>>>>>>>
>>>>>>> https://github.com/grpc/proposal/pull/343
>>>>>>>
>>>>>>> Please share your comments.
>>>>>>>
>>>>>
>>>>>
>>>>> -- 
>>>>> Mark D. Roth <ro...@google.com>
>>>>> Software Engineer
>>>>> Google, Inc.
>>>>>
>>>
>>>
>>> -- 
>>> Mark D. Roth <ro...@google.com>
>>> Software Engineer
>>> Google, Inc.
>>>
>
>
> -- 
> Mark D. Roth <ro...@google.com>
> Software Engineer
> Google, Inc.
>
