Hi:
The following is a review of draft-ietf-speechsc-mrcpv2-15 with
emphasis on S1-S5, S12 and S14. There are a few general comments
first; the rest of the comments follow in linear fashion with respect
to the sections enumerated above.
General comment 1:
------------------
What is the relationship of MRCPv2 to RTSP? Both seem to provide
some similar functionality, such as a recording resource. Also,
RFC4463, the precursor to MRCPv2, appears to discuss RTSP more than
this document does. In this
document, SIP essentially replaces RTSP as the preferred protocol to
establish a control session. This is fine, but it would seem to me
that the average reader familiar with RTSP may benefit from some
thoughts about the relationship of MRCPv2 to RTSP.
General comment 2:
------------------
Another general comment I have concerns the SIP message flow in S4.2.
There are a number of tactical omissions and implicit assumptions in
that call flow that are repeated in the call flows of other sections
as well. I will list these out for S4.2 and trust that the authors
will retrofit other sections based on this discussion.
1. On page 15, in the C->S request, the R-URI is shown as
sip:[EMAIL PROTECTED]. While this is not inaccurate,
what will typically happen in SIP is that the initial request is
targeted to a domain (i.e., sip:[EMAIL PROTECTED]) instead
of a specific server within that domain
(i.e., sip:[EMAIL PROTECTED]). Once the SIP dialog
has been established, subsequent requests will contain an R-URI
that is reflective of the specific server within that domain.
Thus, on the very first request of a dialog-forming method like
INVITE, you may want to consider changing the R-URI to reflect
the domain URI of "sip:[EMAIL PROTECTED]". When that server
replies with a 200 OK, the Contact URI of that response will
contain a FQDN (i.e., "sip:[EMAIL PROTECTED]") -- and
this is indeed the case in your call flow (see 200 OK on Page 16.)
Note that subsequent requests in the dialog will contain an R-URI
that corresponds to the FQDN of the server. Thus, in S4.2, at the
end of page 16, the R-URI of the server is accurately shown as
being "sip:[EMAIL PROTECTED]".
2. On page 15, the Contact URI of the initial INVITE request should
contain a specific client within that domain. That is, the Contact
URI must be "sip:[EMAIL PROTECTED]" and not
"sip:[EMAIL PROTECTED]".
3. In SIP, a LWS is used to denote continuation. Thus, instead of
writing:
Via:SIP/2.0/TCP client.atlanta.example.com:5060;
branch=z9hG4bK74bf9
you must instead write:
Via:SIP/2.0/TCP client.atlanta.example.com:5060;
 branch=z9hG4bK74bf9
Note the LWS on the second line above.
4. In SIP, when a server responds with an affirmative response,
it always adds a tag parameter to the From header. In the call
flow on page 16, the From header of the 200 OK does not contain
a tag parameter; it must. Interestingly enough, the subsequent
ACK request contains the tag in the From parameter (as indeed it
should.) Please add a tag parameter to the From header in the
200 OK, and ensure that it matches the one in the ACK request.
The tags are maintained for the duration of the lifetime of that
dialog.
Thus, in the re-INV that starts at the bottom of page 16, the
From tag must be maintained.
Aside: the second m= line in the 200 OK body should have one 0
(i.e., s/00/0).
5. In SIP, the CSeq number of the ACK is kept the same as the
preceding INVITE. Thus, on page 16 the CSeq number of the ACK
request must be 314161.
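Points 3 and 5 above are easy to check mechanically. The following is
only a minimal sketch in Python, not anything taken from the draft: the
helper names (unfold, cseq_number) and the request URIs in the sample
messages are hypothetical placeholders. It treats a line that begins
with a space or tab as a continuation of the previous header, and then
compares the CSeq number of the ACK with that of the INVITE.

```python
def unfold(raw: str) -> list[str]:
    """Join continuation lines (leading space or tab) onto the previous line."""
    lines: list[str] = []
    for line in raw.splitlines():
        if line[:1] in (" ", "\t") and lines:
            # Folded LWS is equivalent to a single space.
            lines[-1] += " " + line.strip()
        else:
            lines.append(line)
    return lines


def cseq_number(header_lines: list[str]) -> int:
    """Return the numeric part of the CSeq header."""
    for line in header_lines:
        name, _, value = line.partition(":")
        if name.strip().lower() == "cseq":
            return int(value.split()[0])
    raise ValueError("no CSeq header found")


# Hypothetical messages; the request URIs are placeholders.
invite = (
    "INVITE sip:server.example.com SIP/2.0\r\n"
    "Via:SIP/2.0/TCP client.atlanta.example.com:5060;\r\n"
    " branch=z9hG4bK74bf9\r\n"
    "CSeq:314161 INVITE\r\n"
)
ack = (
    "ACK sip:server.example.com SIP/2.0\r\n"
    "CSeq:314161 ACK\r\n"
)

via = unfold(invite)[1]
same_cseq = cseq_number(unfold(ack)) == cseq_number(unfold(invite))
print(via)
print(same_cseq)
```

Without the leading whitespace on the branch line, unfold() would leave
it as a separate (and malformed) header line, which is the problem
point 3 describes.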
General comment 3:
------------------
In S6.1.1, page 29, there is an example message:
MRCP/2.0 124 SET-PARAMS 543256
Channel-Identifier:[EMAIL PROTECTED]
Voice-gender:female
Voice-variant:3
The second token (124) refers to the message length of the
message, including the message body, if any. In the above
example, if I assume that each line is terminated by a '\n'
and a '\r\n' terminates the message itself, then I come up
with a message length of 115 octets, which does not match
the value shown above of 124. I would advise that, for the sake of
completeness, the message-length field matches the actual value
of the number of octets in the message for this example as well
as the rest of the examples in the draft, including the examples
in S14.
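To make the arithmetic repeatable, here is a rough sketch in Python of
how the message-length field could be checked against the actual octet
count. The function names are hypothetical, and the Channel-Identifier
value below is a placeholder of my own, so the number printed is not
expected to reproduce the 115 quoted above. The line terminator and
the message terminator are parameters because the count depends on
which ones are assumed.

```python
def declared_length(start_line: str) -> int:
    """Second token of the MRCPv2 start-line, i.e. the message-length field."""
    return int(start_line.split()[1])


def octet_length(lines: list[str], eol: str = "\r\n",
                 end: str = "\r\n", body: str = "") -> int:
    """Octets in the start-line and headers (each ended by eol), the
    terminating sequence end, and the optional message body."""
    text = "".join(line + eol for line in lines) + end + body
    return len(text.encode("utf-8"))


# Hypothetical message: the Channel-Identifier value is a placeholder.
example = [
    "MRCP/2.0 124 SET-PARAMS 543256",
    "Channel-Identifier:0123456789AB@speechsynth",
    "Voice-gender:female",
    "Voice-variant:3",
]

# Same assumption as in the text: '\n' per line, '\r\n' ends the message.
actual = octet_length(example, eol="\n", end="\r\n")
print(declared_length(example[0]), actual, declared_length(example[0]) == actual)
```

One thing to keep in mind when correcting the examples: the field is
itself part of the message, so a correction that changes the number of
digits in it also changes the total.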
As another example, take a look at the S->C message at the end
of page 42 in S7. I calculate the payload length to be 291
octets, which does not match the Content-Length field value of
269 in the example. I would advise that, for the sake of
completeness, the Content-Length field value matches the actual
value of the number of octets in the SIP message body for this
example as well as the rest of the examples in the draft.
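The same kind of check applies to Content-Length. As a sketch only,
assuming CRLF line endings and a blank line separating the headers
from the body, the following Python function (the name and the sample
message are hypothetical) returns the declared value next to the
number of octets actually present in the body.

```python
def content_length_check(message: bytes) -> tuple[int, int]:
    """Return (declared Content-Length, actual number of body octets)."""
    head, _, body = message.partition(b"\r\n\r\n")
    declared = None
    # Skip the start-line, then look for Content-Length (or its SIP
    # compact form "l") among the header lines.
    for line in head.split(b"\r\n")[1:]:
        name, _, value = line.partition(b":")
        if name.strip().lower() in (b"content-length", b"l"):
            declared = int(value.strip())
    if declared is None:
        raise ValueError("no Content-Length header found")
    return declared, len(body)


# Hypothetical message with a deliberately wrong Content-Length.
sample = (
    b"SIP/2.0 200 OK\r\n"
    b"Content-Type:application/sdp\r\n"
    b"Content-Length:269\r\n"
    b"\r\n"
    b"v=0\r\n"
)
declared, actual = content_length_check(sample)
print(declared, actual, declared == actual)
```

Running something like this over every example in the draft would
catch mismatches of the kind noted above.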
General comment 4:
------------------
The state machines in the draft are depicted as a time-flow
diagram -- this is interesting. While I have not seen a state machine
depicted as such -- all of the ones I am used to employ the
canonical vertex and edges between them to signify transitions --
if a time-flow diagram suffices for the population reading the draft,
then I am perfectly fine with leaving it in as is. I will depend on
your discretion to strike out this comment if the depiction of the
state machine is consistent with reader expectations.
Rest of my comments follow:
- Abstract: It is stated that MRCPv2 "relies on a session management
protocol such as the Session Initiation Protocol (SIP) to establish
the MRCPv2 control session between the client and the server, and
for rendezvous and capability discovery." True, SIP is a rendezvous
protocol, but in and of itself, it is not a (media) capability
discovery protocol; SDP is probably closer to the mark for (media)
capability discovery and description (incidentally, SDP is mentioned
in the immediate next sentence.) I would suggest a modest re-write,
something along these lines:
OLD:
- it relies on a session management protocol such as the Session
Initiation Protocol (SIP) to establish the MRCPv2 control session
between the client and the server, and for rendezvous and capability
discovery. It also depends on SIP and SDP ...
NEW:
- it relies on other protocols, such as Session Initiation Protocol
(SIP) to rendezvous MRCPv2 client and servers and manage sessions
between them, and the Session Description Protocol (SDP) to describe,
discover and exchange capabilities. It also depends on SIP and SDP
...
- S1, second paragraph: It is stated that "In addition, the client
can use a SIP re-INVITE method (an INVITE dialog sent within an
existing SIP Session) to change the characteristics of these
media and control session while maintaining the SIP dialog between
the client and server." It is true that you can use re-INV to do
what is mentioned. However, I am curious: did you consider the
UPDATE method? It also allows a client to update the parameters
of a session, but is more lightweight than the re-INV (for instance,
it does not require an ACK, and does not require the server to
keep retransmission state of the response until it gets an ACK.)
Now, using a re-INV for offer/answer is sometimes advantageous
in situations where, for instance, the sender wants to send a
re-INV but does not specify any offer in it. The receiver sends an
offer in the 2xx, and the sender answers that offer in the ACK.
But I do not see MRCPv2 using such an exchange in the call flows
present in the document. These call flows are of the normal pattern
of "offer in request, answer in response". I realize that call
flows do not normatively reflect the text, but if MRCPv2 can benefit
from some reduced complexity by using an UPDATE/200 to change the
session characteristics, then why not think about it? I will
leave the final decision to you.
- S1, third paragraph, first line: consider taking out the phrase
"client/server" from the sentence. In SIP, once a dialog is set
up, either end can act as a client.
- S2.1: There appear to be some formatting problems. The second
column invariably starts at a line below the first column.
S3.1: Same comment as above when describing the media processing
resources.
- S2.2: Given that the reader of this document has no idea (at this
point in the document) what PENDING, IN-PROGRESS, or COMPLETE
are, you may wish to consider moving this paragraph before the
first state diagram, by which time these states would have been
explained. Or alternatively, you may want to consider putting a
forward reference to S5.3, where these states are defined.
- S3, third paragraph: s/server using this/server on this/
- S3, third paragraph: You may want to put a forward reference to the
framing issue when doing I/O over a stream transport. When I was
reading this paragraph, it naturally occurred to me whether MRCPv2
provides framing (it does, and this is discussed in S5.)
Same comment as above for the opening paragraph of S4.
- S3.1, in the definition of a Recorder:
s/A resource capable of recording audio and saving it to a URI./A
resource capable of recording audio and saving it as a resource
accessible by a URI./
- S3.1, in the definition of a Recorder, second line:
s/end pointing//
- S3.2: the opening paragraph can best be re-written as follows:
The MRCPv2 server is a generic SIP server, and is thus addressed
by a SIP URI, for example: sip:[EMAIL PROTECTED]
This is more concise, and I believe sufficient. How this address is
resolved to reach the SIP server, and how that SIP server registers
with its registrar are all handled by rfc3261. I do not see any
advantage of discussing SIP registrars and Contact URIs here.
- S4.1, third line: s/SIP transactions/SIP requests/
- S4.3, page 21: s/and fork it/and give it/
- S4.3, last paragraph: I am always a proponent of better error
diagnostics. Thus, instead of putting the stock reason-phrase of
"Not Implemented" on the 501 status-code, why not be more
descriptive and put a reason-phrase of "Cannot mix/multimix or
fork media"? That allows the automata to consume the 501
status-code as intended, but also allows for a descriptive
reason-phrase to be displayed to the user.
- S5.1, page 22: In the ABNF for generic-message, the generic-
header production rule is not expanded until S6.2. I would
suggest that it be expanded in S5.1, along with the resource-
header production rule, or barring that, at least a forward
reference to S6.2 be maintained to let the reader quickly
find where generic-header is expanded.
- S5.1: It may be best to move the discussion of the message-length
field on page 23 to S5.2. It fits better there since the token
is described in S5.2. Otherwise, you straddle critical information
in between two sections and the reader has to go back and forth
to make sense (BTW, FWIW, I am glad you have this field; framing is
made much easier and code can be optimized by having it.)
Same comment for mrcp-version production rule; describe it in S5.2
as well.
- S5.2: I think it may be best if you describe the fields in the
order that they appear in the request-line production rule. So,
describe mrcp-version production rule first, followed by
message-length, and so on. And consider consolidating the
description of mrcp-version and message-length from S5.1 to this
section (see previous comment.)
As an added incentive to do this, note that in S5.3, the order
of description matches the order in which the tokens appear in
the response-line production rule.
- S5.4: I note that you have some sort of an asynchronous
notification system of responses (i.e., a PENDING or IN-PROGRESS
indicates further Event messages will be delivered.) Why not have
a specific status-code to that effect? In SIP, we have the
notion of a "202 Accepted" response, which indicates that a request
has been accepted but has yet to be fully processed.
Would a 202 response code be valuable to MRCPv2? I cannot say for
sure, but did want to point this out to you so you can make a
judgment on whether this is indeed the case. If you would like
to read up more on the semantics of the 202 response code in SIP,
it is defined in rfc3265.
- S5.5, first paragraph: s/The status value is COMPLETE/The request-
state value is COMPLETE/
Also, same comment as in S5.2 about making sure that the description
of the tokens in the event-line production rule match the order in
which the tokens appear in the production rule.
- S7: missing terminating period on the last sentence of the section
(end of page 42.)
- S12.3 - Does anything need to be said about how to key the SRTP
stream?
Thanks,
- vijay
--
Vijay K. Gurbani, Bell Laboratories, Alcatel-Lucent
2701 Lucent Lane, Rm. 9F-546, Lisle, Illinois 60532 (USA)
Email: [EMAIL PROTECTED],bell-labs.com,acm.org}
WWW: http://www.alcatel-lucent.com/bell-labs