bigprincipalkk commented on issue #15810: URL: https://github.com/apache/dubbo/issues/15810#issuecomment-3770769507
@nithin-cherry Thank you for your response. Let me first summarize my current understanding of the code: the real latency is updated only once, when the client receives the server's response, by refreshing the EWMA latency. Later, when an actual call is made, the current latency is predicted, either by deferring to the stored value or by applying a penalty, and then smoothed via EWMA.

This strikes me as incomplete for two reasons:

1. After every successful metric update, a penalty path is triggered that predicts the current latency as `2 × timeout`. When `timeout` is either too small or too large, this dilutes the influence of the true latency. Could we adjust it so that if the elapsed time since the last update is less than `timeout`, we simply use the EWMA latency, and otherwise let the predicted latency grow linearly up to `2 × timeout` (with that value as the hard cap)? A rough sketch of what I mean follows below.

2. The present EWMA strategy uses a count-based decay with a fixed `β = 0.5`. A time-based decay, i.e. `w_prev = exp(−timeDelta / τ)`, would work well for both high-QPS and low-QPS scenarios. Under the current approach, when QPS is very low the EWMA latency still reflects delays measured long ago, which is usually undesirable. A sketch of that variant follows the first one.

These are my questions; I may simply lack deeper insight and would appreciate your guidance.
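To make the first point concrete, here is a minimal self-contained sketch of the rule I have in mind. All names here (`LatencyPredictor`, `predict`, `onResponse`) are hypothetical and not taken from the Dubbo code base; the point is only to pin down the linear ramp between `timeout` and `2 × timeout`:

```java
/**
 * Hypothetical sketch of the proposed prediction rule; none of these
 * names come from Dubbo itself.
 */
public final class LatencyPredictor {

    private volatile double ewmaLatency;    // smoothed latency in ms
    private volatile long lastUpdateMillis; // time of the last real sample

    /** Called once per completed call; beta = 0.5 mirrors the current count-based decay. */
    public void onResponse(double latencyMillis, long nowMillis) {
        ewmaLatency = 0.5 * ewmaLatency + 0.5 * latencyMillis;
        lastUpdateMillis = nowMillis;
    }

    /**
     * Predict the current latency at call time. If the last real sample is
     * fresh (elapsed < timeout), trust the EWMA value as-is. Otherwise ramp
     * the prediction linearly with staleness instead of jumping straight to
     * the 2 * timeout penalty.
     */
    public double predict(long nowMillis, long timeoutMillis) {
        long elapsed = nowMillis - lastUpdateMillis;
        if (elapsed < timeoutMillis) {
            return ewmaLatency;
        }
        // Linear ramp: equals the EWMA value at elapsed == timeout and
        // saturates at the 2 * timeout hard cap once elapsed >= 2 * timeout.
        double frac = Math.min(1.0, (elapsed - timeoutMillis) / (double) timeoutMillis);
        double cap = 2.0 * timeoutMillis;
        return ewmaLatency + (cap - ewmaLatency) * frac;
    }
}
```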

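And for the second point, a minimal sketch of the time-based decay, again with hypothetical names, assuming `τ` is a tunable time constant (e.g. on the order of the timeout):

```java
/**
 * Hypothetical sketch of a time-decayed EWMA: the weight of the old value
 * is w = exp(-timeDelta / tau), so stale history fades by wall-clock time
 * rather than by sample count.
 */
public final class TimeDecayedEwma {

    private final double tauMillis; // decay time constant (assumed tunable)
    private double ewma;            // current smoothed latency in ms
    private long lastSampleMillis;
    private boolean initialized;

    public TimeDecayedEwma(double tauMillis) {
        this.tauMillis = tauMillis;
    }

    /** Fold a new latency sample in, weighting the previous value by exp(-dt / tau). */
    public synchronized double update(double latencyMillis, long nowMillis) {
        if (!initialized) {
            ewma = latencyMillis;
            initialized = true;
        } else {
            double w = Math.exp(-(nowMillis - lastSampleMillis) / tauMillis);
            ewma = w * ewma + (1 - w) * latencyMillis;
        }
        lastSampleMillis = nowMillis;
        return ewma;
    }
}
```

With this form the forgetting horizon depends only on staleness, so the same `τ` behaves consistently at 1 QPS and at 10k QPS, whereas the fixed `β = 0.5` count-based decay forgets much faster under high load than under low load.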