I'll just add a couple of things to what Dave has said.

There are two sources of latency in a video codec. The first is
out-of-order encoding, which is what Dave was talking about. This
causes latency because the decoder has to wait to assemble all the
pieces it needs before it can display anything.

You do out-of-order coding because it's more efficient. For example, if
you have 3 frames

F1 F2 F3

you can code F1 without reference to anything ("Intra" coding), then
code F3 by predicting from F1, and then code F2 by predicting from _both_
F1 and F3. This causes delay, because in coded order you have

F1 F3 F2

and you have to wait for F2 _and_ F3 before you can decode and display
F2. But this is more efficient than making coded order the same as
display order, since being able to predict F2 from both sides makes the
number of bits for F2 _much_ smaller.
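A toy sketch of that reordering delay (the frame names and prediction dependencies are just the example above; the scheduling loop is illustrative, not Schro's actual code):

```python
# Illustrative sketch: when each frame becomes displayable, given coded
# order F1 F3 F2 and the prediction structure described above.

coded_order = ["F1", "F3", "F2"]          # order of arrival at the decoder
display_order = ["F1", "F2", "F3"]        # order required for display
deps = {"F1": [], "F2": ["F1", "F3"], "F3": ["F1"]}  # prediction references

received = set()
display_step = {}   # frame -> arrival step at which it can be displayed
pending = list(display_order)

for step, frame in enumerate(coded_order, start=1):
    received.add(frame)
    # Display the next frame(s) in display order once each frame and all
    # of its references have arrived.
    while (pending and pending[0] in received
           and all(d in received for d in deps[pending[0]])):
        display_step[pending[0]] = step
        pending.pop(0)

print(display_step)
```

This prints `{'F1': 1, 'F2': 3, 'F3': 3}`: even though F2 is second in display order, it cannot be shown until all three frames have arrived.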

So the low-latency Schro mode that Dave is talking about would code
everything in display order

F1 F2 F3

with Fn predicted from Fn-1. This has an efficiency cost of maybe 10% or
more compared with other ways of arranging the pictures.

There is a second cause of latency, though, and this is bit rate
variation. For example, the source pictures might arrive every 40ms, but
because the pictures are encoded to different sizes (because of the
different amounts of information they contain), in the coded stream they
might take

12ms, 62ms, 38ms ...

so you can't just display each picture as it arrives, or the stream will
be really juddery. The pictures have to be buffered so as to get a
smooth output stream. This is done by having a buffer model at the
encoder and decoder. This is a "leaky bucket" where bits enter the
bucket at the decoder at a constant rate, and pictures, corresponding
to variable-sized dollops, are pulled out.
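As a rough illustration (using the 40 ms picture interval and the 12/62/38 ms transmission times from the example above; this is not a real buffer-model implementation), the minimum buffering delay is the worst case by which a picture's arrival lags its display deadline:

```python
# Minimal sketch of why buffering is needed: pictures are due for display
# every 40 ms, but take variable amounts of time to arrive over a
# constant-rate channel (numbers from the example above).

interval_ms = 40
tx_ms = [12, 62, 38]          # time on the wire for each coded picture

# The decoder must delay the first display long enough that every picture
# has fully arrived by its display deadline.
arrival_ms = 0
startup_delay_ms = 0
for i, t in enumerate(tx_ms):
    arrival_ms += t                        # when picture i has fully arrived
    deadline_lag = arrival_ms - interval_ms * i
    startup_delay_ms = max(startup_delay_ms, deadline_lag)

print(startup_delay_ms)  # 34 -- ms of startup buffering for this example
```

Here the big 62 ms picture is what forces the delay: without 34 ms of buffering, it would miss its display slot.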

You can operate Schro in a Constant Bit Rate mode with this buffer
model. The smaller the buffer, the lower the delay, but the worse the
quality. 
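In CBR terms the tradeoff is direct: the worst-case buffering delay is just the buffer size divided by the bit rate. (The figures below are made up for illustration; they are not Schro defaults.)

```python
# Worst-case buffering delay in CBR operation = buffer size / bit rate.
# Assumed example figures, not Schro defaults.
bit_rate_bps = 2_000_000      # 2 Mbit/s channel
buffer_bits = 1_000_000       # 1 Mbit decoder buffer
worst_case_delay_s = buffer_bits / bit_rate_bps
print(worst_case_delay_s)  # 0.5 -- halve the buffer and you halve the delay
```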

So the long and the short of it is that lower delay means lower quality.
In typical video streaming scenarios, it's probably the buffering delay
that dominates actually - that's why video on YouTube says "Buffering
..." at the start. The good news is that in Schro's low delay mode ("P
only coding"), the amount of information in each encoded picture is
roughly constant, so a smallish buffer of around 0.5 sec or maybe
even less should be ok. Note that the default is actually much larger,
however: typical internet streaming services might have buffers 10s long
or even longer.

regards

Thomas

>-----Original Message-----
>From: David Schleef [mailto:d...@entropywave.com] 
>Sent: 27 March 2009 06:19
>To: Norbert Kubiec
>Cc: schrodinger-devel@lists.sourceforge.net
>Subject: Re: [Schrodinger-devel] Few questions about Dirac video codec
>
>On Thu, Mar 26, 2009 at 11:07:32PM +0100, Norbert Kubiec wrote:
>> The question of latency was not unfounded. Have You heard about 
>> OnLive? They use new interactive video compression 
>algorithm. Latency 
>> through the algorithm is just 1-ms instead of the 0.5- to 
>0.75-second 
>> lag inherent in conventional compression algorithms used in 
>corporate 
>> video conferencing solutions, for example.
>
>Glad to hear that you totally bought the marketing speak. :)
>
>Rather than respond to your questions directly, I'll talk 
>randomly about how low-latency video codecs work.
>
>One key point about low latency video encoding is that the 
>output bits that represent the pixel have to exist somewhere 
>in the bitstream between the time the encoder gets the pixel 
>from the camera, and N ms later, where N is the latency.
>
>One method of very low-latency compression works on a scanline basis.
>An example is the low-delay profile of Dirac.  A camera reads 
>out a few scan lines (say, 16), the encoder compresses them, 
>and then sends those bits out over ethernet or ASI or 
>whatever.  The latency is on the order of a few scan lines, 
>say 16*2 + a small number.  Why 16*2?
>Because it takes 16 lines to read in the 16 line chunk, then 
>spends the time that it takes to read in the next chunk to 
>encode the first chunk and send it out over the wire.  
>Simultaneously, the decoder reads in the data and decodes.  
>Then during the third set of 16 lines, the decoder scans out 
>the uncompressed lines.  So the decoder scans out line 0 as 
>the camera is scanning out line 32.  Real encoders need a bit 
>of extra time for synchronization, so 32 is ideal.  Of course, 
>in a real system there is network latency, but we'll make 
>someone else worry about that.
>32 lines works out to be about 1 ms for 1080p at 30 frames per 
>second, depending on exactly the system you're using.  
>Compression ratios are purposefully low, since you can't 
>spread around worst-case bits at all, and because this kind of 
>compression is only really useful for studio work.
>
>Note that cameras with a few-scanline latency start at USD 
>10,000 and an encoder/decoder pair for DiracPro is about USD 
>4,000, iirc.
>This is not the kind of technology you roll out in a consumer product.
>
>Another method is similar, but using an entire frame instead 
>of a few scan lines.  In this case, you get a theoretical 
>latency of 2 frames, or about 60 ms for 30 fps video.  I've 
>seen companies advertising encoder/decoder pairs that claim 70 
>ms latency (of course, without any network latency), and I can 
>pretty much believe this number.  Again, you can't get away 
>with cheap hardware -- my DV camera has an internal latency 
>somewhere between 90 and 120 ms, and HDV cameras are much worse.
>
>In a frame-based low-latency system, it's much more realistic 
>to use motion compensation, in which you use the previous one 
>or two frames as reference pictures.  Since the general point 
>of using motion compensation is to decrease the bit rate, this 
>causes compression artifacts immediately after scene changes 
>that clear up after a few frames, and is very characteristic 
>of the technique.  
>
>Due to the way that Dirac puts together pictures, the 
>non-low-delay profiles of Dirac have an approximate latency of 4 
>pictures for a simple implementation, although you can 
>decrease this to nearly 2 pictures with more complex 
>algorithms.  Schroedinger implements the simple algorithm, and 
>with suitable modifications (it does not do this by default) 
>you can get close to 4 frames latency.  Schro's implementation 
>of Low-Delay Profile is also 4 frames, since it uses the same code.
>
>Entropy Wave has implementations of the more complex algorithm 
>for Simple and Intra profiles, as well as an actual low delay 
>implementation of Low-Delay profile, with latencies that are 
>very near the theoretical latencies.  These are not open 
>source.  Unfortunately, since all the code that currently can 
>use these codecs is frame based, there's very little advantage 
>over Schroedinger unless you write a bunch of custom code.
>
>
>
>dave...
>
>
>---------------------------------------------------------------
>---------------
>_______________________________________________
>Schrodinger-devel mailing list
>Schrodinger-devel@lists.sourceforge.net
>https://lists.sourceforge.net/lists/listinfo/schrodinger-devel
>

