Hi all,

Most of the issues that Scott pointed out are specific and clear.
We hope to update the draft and submit it soon.
Any other comments or advice would of course be very helpful to us.

Regards, Matsumoto

I think Scott has raised some issues here that should be addressed.
Can folks sort out what (if any) changes need to be made, then update
the draft or let me know no update is needed?

Thanks, Cullen

On Jun 24, 2008, at 5:47 AM, Scott Brim wrote:

I have been selected as the General Area Review Team (Gen-ART)
reviewer for this draft (for background on Gen-ART, please see
http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).

Please wait for direction from your document shepherd
or AD before posting a new version of the draft.

Document: draft-ietf-avt-rtp-atrac-family-16.txt
Reviewer: Scott Brim
Review Date: 24 June 2008
IESG Telechat date: 02 July 2008

Summary:

This draft is on the right track, but has open issues,
described in the review.

Comments:

This is being submitted as a proposed standard.  Therefore I am
asking that it be very clear.  My concerns are mainly with what I
see as some ambiguities and some possible errors in documenting
protocol behavior. There aren't many so I have left them in the
order they occur in the draft instead of categorizing them.


1. Introduction

   The need for real-time streaming of audio data has grown, and
   this document details our efforts in increasing the product and
   application space for the ATRAC family of codecs.
This is a draft for a proposed standard technical specification.
Whether it is motivated by a desire to increase product and
application space is irrelevant.  I would delete this.


4.5.2 Scalable Multi-Session Streaming

   While there may be alternative methods for synchronization of the
   layers, it is RECOMMENDED that the timestamp will be used for
   synchronizing the base layer with its enhancement. Applications
"It is RECOMMENDED" does not conform to RFC 2119.  This should be a
SHOULD, along with an explanation of the conditions under which it is
reasonable not to implement (so that implementors are not left
guessing).

   If the enhancement layer's session data cannot arrive until the
   presentation time, the decoder SHALL decode the Base layer
   session's data only, ignoring the enhancement layer's data.
Change SHALL to MUST globally.


5.1  Global Structure of Payload Format

   The structure of ATRAC Payload is illustrated in Figure 3.  The
   RTP payload following the RTP header contains three octet-aligned
   data sections.
Only two data sections are described.  Do you mean that the RTP
header plus ATRAC header section plus payload section form three
sections?


5.3.1 Usage of ATRAC Header Section

   Fragment Number (FrgNo): 3 bits
   In the event of data fragmentation, this value is one for the
   first packet, and increases sequentially for the remaining
   fragmented data packets. This value SHOULD be zero for an
   unfragmented frame.
Earlier it was said: "The ATRAC codec can handle very large frames.
As most IP networks have significantly smaller MTU sizes than the
frame sizes ATRAC can handle ...".  If there can be such a
significant difference -- and if you want to allow for larger frames
in the future -- is there special handling for when this 3-bit
counter rolls over (more than 7 fragments)?  If not, at least
mention that you do not expect it to roll over -- or that you expect
the receiver to be able to handle rollovers.
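
To make the rollover concern concrete, here is a rough Python sketch of
how many fragments a frame needs and whether FrgNo values 1..7 can
number them.  The helper names, frame sizes, and per-packet payload size
are made up for illustration and do not come from the draft.

```python
import math

FRGNO_BITS = 3                      # FrgNo field width, per section 5.3.1
MAX_FRGNO = (1 << FRGNO_BITS) - 1   # 7; zero marks an unfragmented frame


def fragments_needed(frame_size: int, max_payload: int) -> int:
    """Packets needed to carry one frame (hypothetical helper)."""
    return math.ceil(frame_size / max_payload)


def fits_without_rollover(frame_size: int, max_payload: int) -> bool:
    """True if FrgNo values 1..7 are enough to number every fragment."""
    return fragments_needed(frame_size, max_payload) <= MAX_FRGNO
```

With an assumed 1400-octet payload per packet, an 8192-octet frame needs
6 fragments and fits, while a 12000-octet frame needs 9 and would force
the counter past 7.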


5.3.2.2  Frame Fragmentation

   However, if even a single ATRAC frame will not fit into a
   complete RTP packet, the ATRAC frame SHOULD be fragmented.
What is the alternative to fragmenting it?  If there is no
alternative, make the SHOULD a MUST.  If there is an alternative, what
is it and under what conditions is it acceptable to do it?  For
example, you might say: ... "the ATRAC frame SHOULD be fragmented
unless the receiver is non-compliant and has indicated it is
incapable of receiving fragments, in which case the session MUST be
terminated."

   As subsequent packets do not contain any new frames, the Number
   of Frames field SHOULD be ignored.
Should this SHOULD be a MUST?  I would think so.  If not, under what
conditions is it acceptable NOT to ignore the Number of Frames field?


6.1  Example Multi-frame Packet

First, NFrames=5 means there are 6 frames in the packet but only 5
are shown.

Second, up in 4.5.1 you said: "In multiplexed streaming, the base
layer and enhancement layer are coupled together in each packet,
utilizing only one session as illustrated in Figure 1.  While the
packet may begin with either layer type, the two layer types MUST
interleave."  In this example you show 3 base layer frames, an
enhancement frame, and then a base layer frame.  Since

 - you have begun interleaving in the middle of a packet, and

 - interleaving can begin with either layer type, and

 - there are no frame numbers,

how can you tell that the enhancement layer frame is not the
_beginning_ of the interleaving, and that it is not associated with
the _following_ base layer frame?  There seem to be some implicit
assumptions that should be made explicit, so that implementors can
avoid incompatibility.
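
The ambiguity can be sketched as follows, in Python.  The two pairing
rules are my own hypothetical readings, not anything the draft states;
both look reasonable to a receiver, yet they disagree on the frame
sequence in the example.

```python
def pair_with_preceding(frames):
    """Attach each enhancement frame to the base frame just before it.

    Returns (base_index, enhancement_index) tuples.
    """
    pairs = []
    for i, kind in enumerate(frames):
        if kind == "E" and i > 0 and frames[i - 1] == "B":
            pairs.append((i - 1, i))
    return pairs


def pair_with_following(frames):
    """Attach each enhancement frame to the base frame just after it."""
    pairs = []
    for i, kind in enumerate(frames):
        if kind == "E" and i + 1 < len(frames) and frames[i + 1] == "B":
            pairs.append((i + 1, i))
    return pairs


# Frame order as I read the example: base, base, base, enhancement, base.
frames = ["B", "B", "B", "E", "B"]
```

Under the first rule the enhancement frame belongs to the third base
frame; under the second it belongs to the last one.  Without frame
numbers, nothing in the packet tells the receiver which rule applies.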


7.5.2  For Media subtype ATRAC-X

      The "baseLayer" parameter MUST be the first entry on this
      line.  It is RECOMMENDED that the "channelID" parameter be the
      next entry.
Again, make this a SHOULD, and explain under what conditions it is
acceptable not to do so.  Why are you allowing implementors NOT to
have channelID be second?  Why do you want them to?


7.5.3  For Media subtype ATRAC Advanced Lossless

   o  The Media subtype (payload format name) goes in SDP "a=rtpmap"
      as the encoding name.  This SHOULD be followed by the
      "sampleRate" (as the RTP clock rate), and then the actual
      number of channels regardless of the channelID parameter.
What is the problem if this order isn't followed?  If you have a
SHOULD, it's good to tell implementors under what conditions it is
acceptable for them not to do it.  Otherwise you get inconsistent
implementations.  Some just ignore all SHOULDs.

   It is RECOMMENDED
Make it a SHOULD, with explanation.

The same comment applies to the uses of RECOMMENDED that follow.  I'll
stop mentioning them.

7.6  Offer-Answer Model Considerations

   In order to establish an interoperable transmission framework, an
   Offer-Answer negotiation in SDP SHOULD observe the following
   considerations.
Under what conditions is it acceptable not to?


7.6.3  For Media subtype ATRAC-X

   o  When creating an offer with considerably high requirements
      (such as 8 channels at 96kHz), it is RECOMMENDED that the
      offer also contain a configuration with lower requirements
      (such as a stereo only option).  Although multiple alternative
      configurations may be offered, care SHOULD be taken not to
      offer too many payload types.
I'm not sure what this SHOULD means.  If this is just a general bit of
advice, make the SHOULD lower case should -- or perhaps just delete
it.  If this is an important guide to implementation, then should the
SHOULD be a MUST?  If so, what specifically do you mean by "too many"?
Is it possible for the offerer to know?  If it should be a SHOULD,
what is the impact of offering too many?  Under what conditions is it
acceptable to offer too many?  When the receiver's capabilities are
not known?

      For best performance, we suggest an answer SHALL NOT contain
      any values requiring further capabilities than the offer
      contains,
"suggest ... SHALL NOT".  Either they MUST NOT or SHOULD NOT, but I
wouldn't just "suggest" a requirement.  What happens if an answer
_does_ contain further capabilities?
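
If this becomes a MUST NOT, the check on the offerer's side could be as
simple as the Python sketch below.  It assumes that "further
capabilities" means a numerically larger channel count or sample rate,
which is my reading and not something the draft defines; the dictionary
layout and function name are likewise hypothetical.

```python
def answer_exceeds_offer(offer: dict, answer: dict) -> bool:
    """True if the answer asks for more than the offer in any compared field.

    Assumption: only "channels" and "sampleRate" are compared, and a
    larger number means a higher requirement.
    """
    return any(answer[key] > offer[key] for key in ("channels", "sampleRate"))


offer = {"channels": 8, "sampleRate": 96000}
stereo_answer = {"channels": 2, "sampleRate": 48000}
too_high_answer = {"channels": 8, "sampleRate": 192000}
```

Here stereo_answer stays within the offer and too_high_answer does not.
The draft still needs to say what the offerer does in the second case.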


Scott





_______________________________________________
Gen-art mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/gen-art
