We've had quite a lot of trouble like this regarding dialog event
subscriptions.
The relevant semantics are in RFC 3265 section 3.2.2:
    A NOTIFY request is considered failed if the response times out, or a
    non-200 class response code is received which has no "Retry-After"
    header and no implied further action which can be taken to retry the
    request (e.g., "401 Authorization Required".)
    [...]
    If the NOTIFY request fails (as defined above) due to an error
    response, and the subscription was installed using a soft-state
    mechanism, the notifier MUST remove the corresponding subscription.
In particular, if the notifier receives a 500 response *without*
Retry-After, then the subscription is terminated, but if it *has*
Retry-After, the subscription is not terminated. SipSubscribeServer is
supposed to be coded to follow this rule, although of course there might
be a bug.
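As a rough illustration of that rule (this is a sketch, not the actual
SipSubscribeServer code), the notifier's decision on a final error
response to NOTIFY could look like the following. The function name, the
headers-as-dict representation, and the RETRY_IMPLIED set are all my own
assumptions; the RFC only names 401 as an example of a response with an
implied further action. The timeout case is not covered here.

```python
# Status codes that imply a further action the notifier can take to retry.
# ASSUMPTION: illustrative set only; RFC 3265 gives just 401 as an example.
RETRY_IMPLIED = {401, 407}


def must_remove_subscription(status, headers, soft_state=True):
    """Decide whether a final response to NOTIFY ends the subscription.

    status     -- final response code received for the NOTIFY
    headers    -- dict of response header names to values (hypothetical shape)
    soft_state -- True if the subscription was installed via SUBSCRIBE
    """
    if 200 <= status < 300:
        return False  # the NOTIFY succeeded
    has_retry_after = any(name.lower() == "retry-after" for name in headers)
    if has_retry_after or status in RETRY_IMPLIED:
        return False  # the NOTIFY is not "failed"; it can be retried
    # Failed due to an error response: remove only soft-state subscriptions.
    return soft_state


# A bare 500 terminates the subscription; a 500 with Retry-After does not.
print(must_remove_subscription(500, {}))
print(must_remove_subscription(500, {"Retry-After": "0"}))
```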
This rule shouldn't be problematic for the subscriber, because the
subscriber knows whether or not it sent a 500, and whether it contained
Retry-After. If it sent a 500-without-Retry-After on a subscription, it
knows that it needs to resubscribe. (Although I don't know whether
SipSubscribeClient implements this correctly.)
One of the situations that generates 500s is when a request is sent to a
UA using one transport, but there is some sort of delay in the request
or response, so the sender then sends the request using another
transport. The second request processed receives a 500 response because
its CSeq is lower than the expected next-request CSeq, and it is not a
duplicate of the first request because it has a different branch value.
So unfortunately, occasional 500 responses are unavoidable. (Arguably,
the SIP specification could be improved by changing this.)
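To make the two-transport case concrete, here is a small sketch of the
receiving UA's side. It is only a model under assumptions: the class and
method names are made up, branch matching stands in for full transaction
matching, and the comparison against the "expected next" CSeq follows the
description above rather than any particular phone's implementation.

```python
class DialogUas:
    """Toy model of in-dialog request handling on the receiving UA."""

    def __init__(self):
        self.expected_next = None   # no request seen on the dialog yet
        self.responses = {}         # branch value -> status already sent

    def receive(self, cseq, branch):
        # Same branch: a retransmission, so replay the earlier response.
        if branch in self.responses:
            return self.responses[branch]
        # New branch with a CSeq below the expected next value: out of order.
        if self.expected_next is not None and cseq < self.expected_next:
            status = 500
        else:
            status = 200
            self.expected_next = cseq + 1
        self.responses[branch] = status
        return status


ua = DialogUas()
# The same NOTIFY (CSeq 5) arrives first over one transport...
first = ua.receive(5, "z9hG4bK-a")
# ...and then again over another transport with a different branch.
second = ua.receive(5, "z9hG4bK-b")
print(first, second)
```

The second copy is not recognized as a duplicate because its branch
differs, so it falls through to the CSeq check and is rejected.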
With this background...
On Mon, 2010-02-08 at 13:54 -0500, Carolyn Beeton wrote:
> We have been having intermittent problems with SAA on our scstrial.ca,
> and I think the root cause is that we terminate subscriptions when we
> receive 500 Internal Server Error.
>
> The Polycom sets object to Cseqs being sent out of order, and we appear
> to do this from time to time. They send us 500 Internal Server Error
> but then continue on to OK all the (mixed-up) NOTIFYs. This appears to
> be correct, based on rfc3261:12.2.2:
> "If the remote sequence number was not empty, but the sequence number
> of the request is lower than the remote sequence number, the request
> is out of order and MUST be rejected with a 500 (Server Internal
> Error) response. "
>
> However, we terminate their subscription when we receive the 500
> response.
At this point, the phone should know its subscription has been
terminated, since it sent the 500.
> Things are then handicapped (the set will not receive
> line-seize NOTIFYs) until eventually they try to reSUBSCRIBE to us on
> that same subscription, which they consider to be up and running. We
> reject it with 481 Does Not Exist, and since the Polycom links the
> incoming and outgoing subscriptions, they internally cancel both
> subscriptions.
At this point, in regard to the "other" subscription, the phone should
send a NOTIFY with "Subscription-State: terminated". Indeed, it must
send such a NOTIFY even if the subscriber requested that the subscription
be terminated. See RFC 3265 section 3.3.4:
    A subscriber may send a SUBSCRIBE request with an "Expires" header
    of 0 in order to trigger the sending of such a NOTIFY request;
    however, for the purposes of subscription and dialog lifetime, the
    subscription is not considered terminated until the NOTIFY with a
    "Subscription-State" of "terminated" is sent.
If SipSubscribeClient received such a NOTIFY, it would immediately
resubscribe.
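For illustration only (I have not checked this against the real
SipSubscribeClient code), the subscriber-side check amounts to reading
the state token from the Subscription-State header of the NOTIFY. The
function names here are hypothetical, and the sketch ignores the "reason"
parameter entirely.

```python
def subscription_state(header_value):
    """Return the state token of a Subscription-State header value.

    Parameters such as ";expires=..." or ";reason=..." are dropped.
    """
    return header_value.split(";", 1)[0].strip().lower()


def should_resubscribe(header_value):
    """True if the NOTIFY reports that the subscription has ended."""
    return subscription_state(header_value) == "terminated"


print(should_resubscribe("terminated;reason=noresource"))
print(should_resubscribe("active;expires=3600"))
```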
> I vaguely remember an old discussion wherein Dale suggested that they
> put a Retry-After header in the 500 message, which we don't exactly
> handle, but at least we don't terminate the subscription if one is
> present. However, since the spec says that they MUST reject with 500
> and only MAY use the Retry-After header, is terminating a subscription
> on receipt of a 500 response without Retry-After too extreme?
See XTRN-10 "Suggest to Polycom adding "Retry-After: 0" header in 500
Out Of Order responses".
So a critical question is whether these 500 responses have a Retry-After
header. If not, it appears that the phone is not understanding the
consequences of its actions. If so, some part of our subscription
infrastructure is malfunctioning.
Dale