Peter,

Thanks very much for the comments!

I've had a look at this draft, and have a few questions / concerns. My apologies if this is covered somehow in the overload work (I am not following it closely), or if I missed some discussion going by. Basically, I think there are some potential problems with ambiguity, and also no good way to tell the client where (or if) to go somewhere else for service. Specifically:

o 500 would become overloaded to mean either "something bad (but unspecified) happened" or "go somewhere else (maybe try again later)", but it seems impossible to determine which case it is. The name of this code implies only the former.

The changes proposed in the draft should not affect the 500 response code. In what respect would 500 change?

o 503 changes its meaning to "I'm overloaded, back off (optionally don't try again for some time)". But this is not the name of this response -- it is Service Unavailable. The trouble is, there are many reasons why service may not be available, overload being just one. (Yes, I know this issue is raised as OPEN ISSUE 3 in the draft itself.)

RFC 3261 defines 503 to mean that the "server is temporarily unable to process the request due to a temporary overloading or maintenance of the server."

It seems that 503 works reasonably well for maintenance but not for overload. So the question is whether we can change the overload behavior without breaking the maintenance part. I believe that proposed changes 1 through 4 would achieve this if 503 with Retry-After continues to be used for maintenance as it is today, and 503 without Retry-After is used as described for proxies under overload.
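
To make that split concrete, here is a rough sketch of how a client could tell the two cases apart. It only encodes the reading given above (503 with Retry-After for maintenance, 503 without it for overload); the function name and the returned labels are made up for illustration and do not come from the draft or from RFC 3261.

```python
from typing import Optional


def classify_5xx(status: int, retry_after: Optional[int]) -> str:
    """Label a SIP 5xx response under the interpretation discussed above.

    Hypothetical helper: the labels are illustrative only.
    retry_after is the Retry-After value in seconds, or None if absent.
    """
    if status == 503:
        if retry_after is not None:
            # 503 with Retry-After: used as today, e.g. for maintenance.
            return "maintenance"
        # 503 without Retry-After: read here as the overload case.
        return "overload"
    if status == 500:
        # 500 keeps its meaning: something bad (but unspecified) happened.
        return "server-error"
    return "other"


print(classify_5xx(503, 120))   # maintenance
print(classify_5xx(503, None))  # overload
print(classify_5xx(500, None))  # server-error
```

What the client then does with each label (retry later, fail over, give up) is deliberately left out, since that is exactly what is under discussion.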

Proposals:

1. Can a new 5xx response code be defined specifically to handle overload scenarios, instead of overloading 503 / 500? This would have the explicit semantic of "I'm overloaded, back off (optionally for some time)". It could also have additional info on how the client should handle it (see next). 503 would keep its original meaning (go somewhere else), and 500 would keep its meaning (something bad happened).

I think this would be closer to a new SIP extension for overload control, which I think is useful, than to a correction of RFC 3261, which is the scope of this draft.

2. Add a new header, to explicitly list one or more servers to go get service from, usable (at least) in 503 and the new 5xx, maybe 500 also. With this, 503 could have the additional semantic of "go somewhere else, and here's where" (optionally with a time to retry here). New 5xx could have the additional semantic of "I'm overloaded, go here instead" (optionally with a time to retry here). I can certainly see cases, even in overload handling, where the current server may want to explicitly send the client somewhere else, effectively redirecting to a different server. This could be either to an explicit server (or list of them), or just "go try somewhere else, you figure it out". Also handy for maintenance operations and for load sharing, possibly others. Lack of an explicit way to send clients elsewhere for service seems like a weakness overall.

I can see that it might be useful in some scenarios to provide this kind of hint to the client. But a client is often able to find backup servers even without such hints, e.g., through DNS. The client might also know of other servers that the overloaded server is not aware of. With a redirection list, a server would need to know all of its peers that could possibly step in when it is overloaded.
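
As an illustration of the DNS point, a client that has already looked up SRV records for the domain could pick its own fallback targets roughly as follows. This is a simplified sketch: it assumes the records are already resolved, uses made-up example.com host names, and orders same-priority records by descending weight instead of doing the weighted random selection that SRV processing actually calls for.

```python
from typing import Iterable, List, NamedTuple, Set


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative preference within one priority
    target: str    # host name of the server


def fallback_candidates(records: Iterable[SrvRecord], failed: Set[str]) -> List[str]:
    """Return the remaining targets in the order a client might try them.

    Hypothetical helper; simplification: within one priority, higher
    weight goes first (no weighted random choice).
    """
    usable = [r for r in records if r.target not in failed]
    usable.sort(key=lambda r: (r.priority, -r.weight))
    return [r.target for r in usable]


# Made-up records for illustration only.
records = [
    SrvRecord(10, 60, "sip1.example.com"),
    SrvRecord(10, 40, "sip2.example.com"),
    SrvRecord(20, 0, "backup.example.com"),
]

# sip1 answered with an overload response, so the client skips it.
print(fallback_candidates(records, {"sip1.example.com"}))
# ['sip2.example.com', 'backup.example.com']
```

The overloaded server does not need to know about sip2 or backup at all here, which is the point made above.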

I think that being specific in a response about what went wrong (e.g. "I'm overloaded") is useful since it will help the client to react properly (e.g. not to resend the request).

Hopefully, the above (or something along those lines) could help both disambiguate, and help by acting as a redirection for those cases needing it. Minimally, it would at least make it easier to figure out (in the implementation) what the client is being told to do, and would make troubleshooting a whole lot easier. I do understand the desire to minimize changes. But it is also important to get the right behaviour, and not create situations open to too much interpretation. Yes, there would be lots of backwards compatibility issues, but any change in this stuff will have loads anyway.

I agree. However, as mentioned, this draft does not intend to propose a SIP extension for overload control.

Just throwing some ideas out there to see if the flame throwers come out. (Donning my flame-proof suit now ..... ;-)

This kind of input is highly appreciated! :)

Thanks,

Volker




_______________________________________________
Sip mailing list  https://www1.ietf.org/mailman/listinfo/sip
This list is for NEW development of the core SIP Protocol
Use [EMAIL PROTECTED] for questions on current sip
Use [EMAIL PROTECTED] for new developments on the application of sip