I agree encoder latency should be specified directly in low latency testing. 
IMO, this has nothing to do with bitrates or rate control strategies. It is 
simply constraining the encoder to zero structural delay when encoding each 
frame. That is, frame in, frame out, no reordering or lookahead.
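
To make "zero structural delay" concrete, here is a minimal sketch that measures the encoder-side reordering delay implied by a given coding order. The function name `structural_delay` and the timing convention (input frame i arrives at time i, measured in frame intervals) are assumptions for illustration only. It captures reordering but not lookahead, which would add further delay.

```python
def structural_delay(coding_order):
    """Return the encoder-side reordering delay, in frame intervals.

    coding_order[k] is the display index of the k-th frame the encoder emits.
    Assumption: input frame i arrives at time i, and a coded frame cannot be
    emitted before every frame emitted ahead of it has arrived.
    """
    worst = 0
    latest_arrival = 0
    for display_index in coding_order:
        # The k-th emission cannot happen before the latest-arriving frame
        # among those emitted so far.
        latest_arrival = max(latest_arrival, display_index)
        worst = max(worst, latest_arrival - display_index)
    return worst


print(structural_delay([0, 1, 2, 3]))  # frame in, frame out
print(structural_delay([0, 2, 1, 3]))  # one frame coded out of order
```

Under these assumptions, a frame-in, frame-out coding order gives a delay of 0, and any out-of-order coding gives a nonzero delay.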

Rate control constraints may be necessary in some testing to reflect transport 
channel constraints. But this should not be conflated with low latency as in 
zero structural delay.

Mo (as individual)

On 7/22/15, 4:40 AM, Keith Winstein <[email protected]> wrote:

Hello Thomas,

It seems like constraining the bitrate variation by enforcing a limited 
receiver-side VBV buffer is an indirect way to get at what you really care 
about. Most receivers can probably buffer more than 5 megabytes (so more than 
5 seconds of video) with no problem. If the document wants to specify a model 
for low-latency video, I think the constraint might be better expressed in 
terms of latency directly.

Here would be my proposal for an idealized model for what video conferencing 
would require from the video coder:

(a) The source video arrives at 30 fps, and frame number n is given an "encode 
time" of n/30 seconds.
(b) After being encoded, the coded frames are appended to a sender-side FIFO.
(c) The sender-side FIFO is drained (and delivered to the receiver) at a 
particular link rate given in bits per second.
(d) At the receiver, each frame is given a "display time": the earliest moment 
it can be shown to the user, given when its coded representation finishes 
being delivered and any rules of the format about frame reordering (e.g. the 
presentation timestamp of MPEG). The receiver doesn't attempt isochronous 
presentation of the frames; it just shows each frame at the earliest possible 
moment.

For the first 8 seconds, the link rate "(c)" is 4 megabits/sec, and the coder 
is informed of this. At t=8 seconds, the link rate "(c)" instantaneously 
changes to 500 kilobits/sec. The coder learns of this change at t=8.25 seconds.

The figure of merit is a combination of (1) visual fidelity (e.g. PSNR, SSIM, 
etc.) of the coded frames relative to the source frames, and (2) the maximum 
difference between the "encode time" and "display time" of any frame, taken 
over all the frames in the video.
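
The latency half of this figure of merit can be sketched in a few lines. The following is only an illustration of the model in (a) through (d), under assumptions that are not part of the proposal: encoding takes zero time, the link has no propagation delay, there is no frame reordering, and the coder is a hypothetical naive one (`naive_frame_sizes`) that spends exactly the link rate it currently believes in, divided evenly over frames.

```python
FPS = 30
RATE_CHANGE_T = 8.0      # link rate drops at t = 8 s
CODER_LEARNS_T = 8.25    # coder learns of the drop at t = 8.25 s
HIGH_BPS = 4_000_000
LOW_BPS = 500_000


def delivery_finish(start, bits):
    """Time at which `bits` finish draining from the FIFO, starting at `start`."""
    if start < RATE_CHANGE_T:
        capacity = (RATE_CHANGE_T - start) * HIGH_BPS
        if bits <= capacity:
            return start + bits / HIGH_BPS
        # Remainder is sent at the lower rate after the change.
        return RATE_CHANGE_T + (bits - capacity) / LOW_BPS
    return start + bits / LOW_BPS


def max_latency(frame_bits, fps=FPS):
    """Maximum (display time - encode time) over all frames, in seconds."""
    link_free = 0.0   # time at which the link finishes the previous frame
    worst = 0.0
    for n, bits in enumerate(frame_bits):
        encode_time = n / fps
        start = max(encode_time, link_free)
        link_free = delivery_finish(start, bits)
        display_time = link_free  # assumption: shown as soon as delivered
        worst = max(worst, display_time - encode_time)
    return worst


def naive_frame_sizes(num_frames, fps=FPS):
    """Hypothetical coder: equal-size frames at the rate it believes in."""
    sizes = []
    for n in range(num_frames):
        believed = HIGH_BPS if n / fps < CODER_LEARNS_T else LOW_BPS
        sizes.append(believed / fps)
    return sizes


print(max_latency(naive_frame_sizes(15 * FPS)))
```

With this naive coder, the frames encoded between t=8 and t=8.25 are sized for the old rate, so the FIFO backs up and the maximum latency comes out at roughly 1.9 seconds. A real coder would be scored on how much better than that it does, jointly with visual fidelity.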

Best regards,
Keith

On Tue, Jul 21, 2015 at 4:11 AM, Thomas Daede <[email protected]> wrote:
At the Monday NETVC session, it was suggested that CBR mode with a fixed
buffer size was not a very good representation of what rate control for
video conferencing would do.

Are there suggestions of a better model? Keep in mind that this is only
for relatively short (~15s) clips.

Another option is just to not specify CBR bounds, so that this test
would be like the high-latency test, but with constraints on lookahead
and two-pass encoding.

_______________________________________________
video-codec mailing list
[email protected]
https://www.ietf.org/mailman/listinfo/video-codec
