Dr. Werner Fink wrote:
> On Wed, Feb 21, 2007 at 11:12:43PM +0100, Reinhard Nissl wrote:
>> Actually, I don't know how this is done in the case of a FF card and
>> what the firmware has to do in this regard. A guess -- which could
>> explain the issues you see -- would be that sync is not maintained
>> continuously. So after having maintained sync for some time, audio and
>> video frames are simply taken out of some FIFOs at a constant rate and
>> presented to the user -- this should keep audio and video in sync as
>> originally maintained. But when then for example an audio frame gets
>> lost due to a lost TS packet, audio and video get out of sync as the
>> lost packet brakes filling the FIFOs at a constant rate. When you try to
>> reproduce this effect by seeking back in the recording, then sync is
>> maintained actively and you don't see this issue again at that position
>> in the recording.
> If the resulting MPEG audio stream is broken in such a way that
> the HW decoder runs into trouble from which it cannot recover,
> the audio HW buffer gets emptied very fast, which in fact
> results in a silent but very fast video sequence. For the next
> firmware I've added detection of such an unrecoverable audio
> decoder error so that the audio decoder is restarted as fast as
> possible.
> Btw: with xine and mplayer I hear a short noise and nothing
> happens with the running picture. Maybe the mplayer software
> decoder itself does some checking of the MPEG audio stream,
> and the AV synchronization does not depend on the audio PTS.
Well, in xine it works like this:

The demuxer thread reads the PES packets from the input stream,
determines whether a packet contains audio or video, gets an empty
buffer from the corresponding (audio or video) input buffer pool, fills
the buffer with the PES payload, transfers the PTS from the PES packet
to the buffer when one is present, and puts the buffer into the audio
or video input FIFO of the decoders.
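The demuxing step above could be sketched roughly like this (a minimal
illustration with made-up names and dict-based buffers; the real xine
demuxer is written in C and far more involved):

```python
# Minimal sketch of the demuxer loop: distribute PES payloads to the
# audio/video decoder input FIFOs, carrying the PTS along when present.
from queue import Queue

audio_fifo = Queue()  # decoder input FIFO for audio buffers
video_fifo = Queue()  # decoder input FIFO for video buffers

def demux(pes_packets):
    """Fill a buffer per PES packet and queue it for the right decoder."""
    for pkt in pes_packets:
        buf = {"payload": pkt["payload"],
               # transfer the PTS only when the PES packet carries one
               "pts": pkt.get("pts")}
        if pkt["stream"] == "audio":
            audio_fifo.put(buf)
        else:
            video_fifo.put(buf)

# usage: one audio packet with a PTS, one video packet without
demux([{"stream": "audio", "payload": b"\x00", "pts": 90000},
       {"stream": "video", "payload": b"\x01"}])
```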
Audio and video are decoded by separate threads. Each one takes a
buffer from its input FIFO and, when the buffer carries a PTS value,
stores it internally so that it is ready when the next access unit (an
audio or video frame) starts. The buffer's data is decoded into the
current access unit's output buffer. When a new access unit starts, the
current output buffer is put into the audio or video output FIFO and a
new output buffer is allocated from the corresponding output buffer
pool. If a stored PTS value is available, it is transferred into the
new access unit's output buffer. Otherwise, the PTS value of the
previous access unit's output buffer, incremented by that unit's
duration (the audio or video frame duration), is transferred instead.
As a result, every decoded audio or video frame has a PTS value
assigned to it (to be precise, transferring the PTS value translates it
into an internal representation, which is what is used from then on).
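The PTS bookkeeping described above can be sketched as follows
(function and parameter names are made up for illustration; xine's real
decoders are C and work on buffers, not lists):

```python
# Sketch of the per-access-unit PTS bookkeeping described above:
# each decoded frame either gets the PTS delivered with its data, or
# the previous frame's PTS extrapolated by one frame duration.
def assign_pts(delivered, frame_duration):
    """delivered: one entry per access unit -- a PTS value, or None
    when the PES packets carried no PTS for that unit."""
    result = []
    last_pts = 0
    for pts in delivered:
        if pts is not None:
            last_pts = pts                 # use the delivered PTS
        else:
            last_pts += frame_duration     # extrapolate from last unit
        result.append(last_pts)
    return result

# e.g. 25 fps video in 90 kHz PTS units: one frame lasts 3600 ticks
print(assign_pts([90000, None, None, 102600], 3600))
```

Note that every frame ends up with a usable PTS even when only some PES
packets carry one, which is what makes the presentation stage below
possible.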
Audio and video presentation is done by separate threads. Each one
takes an output buffer (an access unit, i.e. an audio or video frame)
from its associated decoder output FIFO, reads the PTS value and
compares it to the STC, which is provided by the so-called metronom. As
long as the PTS is larger than the STC, the video output thread simply
sleeps; once the PTS is less than or equal to the STC, it writes the
output buffer into the graphics card's video memory. The audio output
thread behaves similarly, except that it has to generate silence for
the sound card in cases where sleeping for a longer time would let the
sound card's input buffer underrun.
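The video side of this boils down to a wait-until-due loop. A
simplified sketch (the metronom/STC interface and frame layout here are
assumptions for illustration, not xine's actual API):

```python
# Simplified sketch of the video output thread described above:
# present each decoded frame once its PTS is no longer ahead of the STC.
import time
from queue import Queue

def present_video(output_fifo, get_stc, display):
    """Take frames from the decoder output FIFO and display each one
    as soon as its PTS is less than or equal to the current STC."""
    while True:
        frame = output_fifo.get()
        if frame is None:              # sentinel: end of stream
            break
        while frame["pts"] > get_stc():
            time.sleep(0.001)          # PTS > STC: not yet due, sleep
        display(frame)                 # PTS <= STC: present the frame

# usage with a fake STC that ticks forward on every query
fifo = Queue()
for pts in (3, 1, 5):
    fifo.put({"pts": pts})
fifo.put(None)

shown = []
ticks = iter(range(1000))
present_video(fifo, lambda: next(ticks), lambda f: shown.append(f["pts"]))
```

The audio thread would follow the same pattern, but instead of sleeping
indefinitely it feeds silence to the sound card so its input buffer
never underruns.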
So from the above explanation, I don't think it is true that AV
synchronization is independent of the audio PTS -- at least not in the
case of xine.
Dipl.-Inform. (FH) Reinhard Nissl
vdr mailing list