> If you want the in-flight requests to be reported by the server, then I don't see why we'd need another metric here
I see the confusion here. I don't want in-flight requests to be reported by the server. The client will keep track of *its own* in-flight requests. The weight could then be calculated using the *client-side* in-flight request count plus the *server-side* reported qps and cpu utilization.

> I think the cases you mention here are fundamentally different, because they involve request latency being driven primarily by things other than CPU utilization. I don't think those use-cases are going to be able to be addressed by this design. I suspect those are more likely to be addressed by something like a least_request policy

Makes sense, thanks!
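To make the combination concrete, here is a rough Go sketch of the weight being discussed: the base weight is the server-reported qps divided by cpu utilization (illustrative only, not the exact formula gRFC A58 specifies), with a hypothetical inflightFactor knob standing in for the client-side extension proposed above.

    package main

    import "fmt"

    // endpointWeight sketches the weight calculation discussed in this
    // thread (illustrative only; not the exact gRFC A58 formula). The base
    // weight is the server-reported request rate divided by CPU
    // utilization, so a backend handling more qps per unit of CPU is
    // weighted higher.
    //
    // inflight and inflightFactor are the hypothetical client-side
    // extension proposed above: the client's own count of outstanding
    // requests to this endpoint discounts the weight, so endpoints that
    // look slow from this client's vantage point receive less traffic.
    func endpointWeight(qps, cpuUtilization float64, inflight int, inflightFactor float64) float64 {
        if qps <= 0 || cpuUtilization <= 0 {
            return 0 // no usable server report; a real policy would fall back to a default weight
        }
        baseWeight := qps / cpuUtilization
        // With inflightFactor == 0 this reduces to the pure server-metrics
        // weight that the design actually uses.
        return baseWeight / (1 + inflightFactor*float64(inflight))
    }

    func main() {
        // Two endpoints reporting identical server-side metrics, but the
        // client has far more requests outstanding to the second one.
        fmt.Println(endpointWeight(100, 0.5, 2, 0.1))  // 200 / 1.2 ~ 166.7
        fmt.Println(endpointWeight(100, 0.5, 20, 0.1)) // 200 / 3.0 ~ 66.7
    }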
On Monday, February 13, 2023 at 11:01:47 PM UTC+1 Mark D. Roth wrote:

> On Mon, Feb 13, 2023 at 1:19 PM Tommy Ulfsparre <to...@ulfsparre.se> wrote:
>
>> > I don't think we'd want the client to do its own tracking of in-flight requests to each endpoint, because the endpoint may also be receiving requests from many other endpoints at the same time, and the client would not see those, so it could result in incorrect weights
>>
>> The client will see those because it's carried through server-side load reporting? In-flight requests would just be an added variable in the weight calculation function, which currently only consists of qps / cpu utilization. Or did I misunderstand what you meant here?
>
> If you want the in-flight requests to be reported by the server, then I don't see why we'd need another metric here. The server could simply choose to increment its qps metric at the start of each request rather than at the end of each request, so the existing qps metric would also include in-flight requests. This design does not dictate how the server computes the values it reports.
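Concretely, a server following that suggestion would count each request when it is accepted rather than when it completes, so work still in flight is already part of the rate it reports. A minimal Go sketch under that assumption; the qpsCounter type is hypothetical and not part of gRPC's actual load-reporting (ORCA) API:

    package main

    import (
        "fmt"
        "sync/atomic"
        "time"
    )

    // qpsCounter is a hypothetical stand-in for however a server feeds
    // the metrics it reports to clients; it is not a real gRPC API.
    type qpsCounter struct {
        started atomic.Int64 // requests counted at start, per reporting interval
    }

    // handleRequest counts the request *before* running the handler, so a
    // request still executing when the next report is generated has
    // already been included in the reported qps.
    func (c *qpsCounter) handleRequest(handler func()) {
        c.started.Add(1)
        handler()
    }

    // reportQPS turns the interval's count into a rate and resets it; the
    // result would go into the server's load report for this period.
    func (c *qpsCounter) reportQPS(interval time.Duration) float64 {
        return float64(c.started.Swap(0)) / interval.Seconds()
    }

    func main() {
        var c qpsCounter
        for i := 0; i < 5; i++ {
            c.handleRequest(func() {}) // the empty handler stands in for serving an RPC
        }
        fmt.Println(c.reportQPS(time.Second)) // 5 requests / 1s = 5 qps
    }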
>> > I think it's both more correct and simpler to do this based solely on the metrics reported by the endpoint
>>
>> The current design would not penalize endpoints with higher latency, or other things that can cause the client to perceive higher latency, like (stop-the-world) garbage collection or CPU throttling. If that is not the goal of this design, then opting for something simpler makes sense.
>
> The goal of this policy is to provide balanced CPU utilization over a set of backends that are otherwise equivalent -- i.e., request latency is expected to be driven primarily by CPU utilization, so balancing the CPU utilization will also balance the request latency.
>
> I think the cases you mention here are fundamentally different, because they involve request latency being driven primarily by things other than CPU utilization. I don't think those use-cases are going to be able to be addressed by this design. I suspect those are more likely to be addressed by something like a least_request policy.
>
>> On Monday, February 13, 2023 at 9:04:00 PM UTC+1 Mark D. Roth wrote:
>>
>>> I don't think we'd want the client to do its own tracking of in-flight requests to each endpoint, because the endpoint may also be receiving requests from many other endpoints at the same time, and the client would not see those, so it could result in incorrect weights. I think it's both more correct and simpler to do this based solely on the metrics reported by the endpoint.
>>>
>>> On Mon, Feb 13, 2023 at 11:38 AM Tommy Ulfsparre <to...@ulfsparre.se> wrote:
>>>
>>>> Hey Mark,
>>>>
>>>> I read the proposal, and my question was not about using a weight based on in-flight requests or network latency; rather, are there cases where you wouldn't always want to include both? Meaning: could the existing design be improved by including the in-flight request count in addition to the server-reported CPU utilization and request rate in the final weight calculation? Does that make sense?
>>>>
>>>> On Monday, February 13, 2023 at 7:50:09 PM UTC+1 Mark D. Roth wrote:
>>>>
>>>>> This design does not actually use any info about in-flight requests or network latencies. It weights backends purely by the CPU utilization and request rate reported by the endpoint.
>>>>>
>>>>> It's certainly possible to write an LB policy that weights on in-flight requests or network latency, but that's not the goal of this particular policy.
>>>>>
>>>>> On Tue, Feb 7, 2023 at 6:27 AM Tommy Ulfsparre <to...@ulfsparre.se> wrote:
>>>>>
>>>>>> Hey,
>>>>>>
>>>>>> Looking forward to seeing this proposal implemented!
>>>>>>
>>>>>> Are there cases where you wouldn't want to also include client-local observations (like in-flight requests) in the weight calculation?
>>>>>>
>>>>>> How would WRR behave for a client that load balances over a set of endpoints where a subset of the endpoints has higher (network) latencies, meaning latencies that aren't observable server side? Instead of choosing between least_request and WRR, could we get the benefits of both?
>>>>>>
>>>>>> On Monday, February 6, 2023 at 7:05:02 AM UTC+1 Yousuk Seung wrote:
>>>>>>
>>>>>>> This is the discussion thread for A58: Weighted Round Robin LB Policy.
>>>>>>>
>>>>>>> https://github.com/grpc/proposal/pull/343
>>>>>>>
>>>>>>> Please share your comments.
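For the latency-driven cases set aside above, a least_request-style picker makes the decision from the client's own outstanding-request counts, so anything that slows an endpoint from the client's point of view (network latency, GC pauses, CPU throttling) automatically shifts traffic away from it. A rough Go sketch of the common power-of-two-choices variant; the types and names are hypothetical, not an actual gRPC policy implementation:

    package main

    import (
        "fmt"
        "math/rand"
        "sync/atomic"
    )

    // endpoint tracks this client's own outstanding requests to one backend.
    type endpoint struct {
        addr     string
        inflight atomic.Int64
    }

    // pickLeastRequest samples two endpoints at random and takes the one
    // with fewer client-observed in-flight requests (power of two
    // choices). Endpoints that are slow for any client-visible reason
    // accumulate in-flight requests and are picked less often.
    func pickLeastRequest(endpoints []*endpoint) *endpoint {
        a := endpoints[rand.Intn(len(endpoints))]
        b := endpoints[rand.Intn(len(endpoints))]
        if b.inflight.Load() < a.inflight.Load() {
            a = b
        }
        a.inflight.Add(1) // the caller must Add(-1) when the request completes
        return a
    }

    func main() {
        eps := []*endpoint{{addr: "10.0.0.1:50051"}, {addr: "10.0.0.2:50051"}}
        eps[0].inflight.Store(8) // pretend this endpoint is currently slow
        fmt.Println(pickLeastRequest(eps).addr) // usually prints 10.0.0.2:50051
    }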