On Wednesday, 30 November 2005 at 12:54, Juergen Bausa wrote:
> I just got to know about dvbcut and think it is a very good concept.
> However, I know that synchronisation of audio and video is a very difficult
> task when processing transport streams. To my knowledge, projectx is up to
> now the only program that can handle this perfectly correctly. Does dvbcut
> use the same method as projectx? Will it also deliver perfect
> synchronisation?

Hello, Jürgen.

I'm not familiar with projectx, so I cannot comment on that, but I can give
you some explanation of the DVBCUT concepts. I've also read your post on the
Topfield board, by the way.

First of all, very generally: DVBCUT is not just a GUI which displays
pictures and copies parts of the MPEG file itself (like byte ranges) to a new
file. Quite some thought went into the concepts of parsing the input streams
and creating the output stream. I am well aware of presentation time stamps
(PTS) and things like that.

Internally, DVBCUT separates the input MPEG stream into video and audio
streams. You may call this "demuxing", but not in the sense that separate
video and audio files are created from the single input file. I just mean
that while reading the input file, the data of the different streams (video,
audio) are transferred to different buffers. These buffers not only store the
payload data of the streams (the elementary streams), but also keep track of
the presentation time stamps (PTS) included in the headers. From these
buffers, the access units (MPEG video pictures and MPEG audio frames
respectively) are given to libavformat for muxing.
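
As a rough illustration of the buffering idea described above (not DVBCUT's
actual code), the following Python sketch routes packets into one buffer per
stream and remembers at which payload offset each PTS was seen. The names
StreamBuffer, demux and the (stream_id, pts, data) packet tuples are
hypothetical, and PTS values are assumed to be in 90 kHz ticks.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class StreamBuffer:
    """Payload of one elementary stream plus the PTS seen in its headers."""
    payload: bytearray = field(default_factory=bytearray)
    # Each mark is (byte offset into payload, PTS in 90 kHz ticks).
    pts_marks: list = field(default_factory=list)

    def append(self, data, pts=None):
        # Not every packet carries a PTS; record it only when present.
        if pts is not None:
            self.pts_marks.append((len(self.payload), pts))
        self.payload.extend(data)


def demux(packets):
    """Route (stream_id, pts_or_None, data) tuples into per-stream buffers."""
    buffers = defaultdict(StreamBuffer)
    for stream_id, pts, data in packets:
        buffers[stream_id].append(data, pts)
    return dict(buffers)


# Made-up packets, purely for illustration.
packets = [
    ("video", 900000, b"\x00\x00\x01\x00vid0"),
    ("audio", 900264, b"\xff\xfdaud0"),
    ("video", None, b"vid0-cont"),
    ("audio", 902424, b"\xff\xfdaud1"),
]
buffers = demux(packets)
```

No files are written here: the streams only live in memory until their access
units are handed to the muxer.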

If the first picture to be written is not an I-frame (i.e. a key frame),
DVBCUT decodes the whole group of pictures and re-encodes the selected
pictures. From the first I-frame in your selected range onwards, the pictures
are passed to the muxer as they are. B-frames are handled specially: B-frames
that appear in the original video after the first I-frame within your range
and before the following I- or P-frame are not passed to the muxer. These
actually encode pictures that are displayed before the I-frame, and were
therefore already passed to the muxer as re-encoded pictures if they lie
within the selected START/STOP range. If the last picture(s) before your STOP
marker are B-frames, these pictures are re-encoded as well, because the
decoder wouldn't be able to decode them without the following I- or P-frame.
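
The rule above can be sketched in Python as follows. This is only a reading
of the description in this mail, not DVBCUT's implementation: plan_cut and
the (type, display number) tuples are hypothetical, frames are listed in
coding order, and the selected range is taken as start <= display number <
stop.

```python
def plan_cut(frames, start, stop):
    """Decide per selected picture whether to copy or re-encode it.

    frames: list of (frame_type, display_no) in coding order,
            frame_type being "I", "P" or "B".
    Returns a list of (display_no, action) sorted by display order.
    """
    # Display number of the first I-frame inside the selected range.
    in_range_i = [d for t, d in frames if t == "I" and start <= d < stop]
    first_i = min(in_range_i) if in_range_i else None

    plan = {}
    last_ref = None  # display number of the latest I/P frame in coding order
    for ftype, disp in frames:
        if ftype in ("I", "P"):
            last_ref = disp
        if not (start <= disp < stop):
            continue  # outside the selection: not written at all
        if first_i is None or disp < first_i:
            # Displayed before the first usable I-frame: must be re-encoded.
            plan[disp] = "recode"
        elif ftype == "B" and last_ref is not None and last_ref >= stop:
            # Trailing B-frame whose following I/P frame is cut away.
            plan[disp] = "recode"
        else:
            plan[disp] = "copy"
    return sorted(plan.items())


# Coding order of a typical stream; numbers are display positions.
frames = [("I", 0), ("P", 3), ("B", 1), ("B", 2), ("P", 6), ("B", 4),
          ("B", 5), ("I", 9), ("B", 7), ("B", 8), ("P", 12), ("B", 10),
          ("B", 11), ("P", 15), ("B", 13), ("B", 14)]
plan = plan_cut(frames, start=5, stop=14)
```

With this made-up stream, pictures 5 to 8 and the trailing B-frame 13 come
out as "recode", while pictures 9 to 12 are copied unchanged.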

In fact, I spent quite some time reading the relevant ISO-standards.

So, you worry about A/V synchronisation. As stated in the posting on the
Topfield board, this is a matter of the presentation time stamps (PTS). The
PTS written by DVBCUT are exactly the PTS taken from the input streams, minus
a constant offset. For the first START/STOP range (or if there is only one),
the number subtracted from the PTS is identical for all streams, audio and
video. That means that the synchronisation of the original data is
preserved.
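
A minimal sketch of why a common offset preserves synchronisation, with
made-up PTS values in 90 kHz ticks (the function name rebase and the sample
numbers are assumptions for illustration only):

```python
def rebase(pts_by_stream, offset):
    """Subtract the same constant offset from the PTS of every stream."""
    return {name: [pts - offset for pts in values]
            for name, values in pts_by_stream.items()}


# Hypothetical source PTS: video every 3600 ticks, audio every 2160 ticks.
source = {
    "video": [900000, 903600, 907200],
    "audio": [900264, 902424, 904584],
}
output = rebase(source, offset=900000)

# The audio/video distance is the same before and after rebasing.
gap_before = source["audio"][0] - source["video"][0]
gap_after = output["audio"][0] - output["video"][0]
```

Since every stream is moved by the same amount, any difference between an
audio PTS and a video PTS is unchanged.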

For all other START/STOP ranges (after the commercial breaks, for example)
there can be a tiny shift between audio and video, but never more than 12
milliseconds (assuming a video frame rate of 25 per second and a 48 kHz audio
sampling rate). The reason is that the smallest time unit for video is one
frame (1/25 second, or 40 milliseconds), while for audio it is one MPEG audio
frame (1152 audio samples, thus 24 milliseconds at a 48 kHz sampling rate).
In an MPEG stream, audio data is not stored in units that correspond to one
video picture. Instead you have video pictures (one every 40 ms) and audio
frames (one every 24 ms). DVBCUT does not do any audio recoding; it keeps the
original audio data. So the video and audio streams of DVBCUT's output do not
start absolutely simultaneously. Audio may start a few milliseconds before or
after the first video picture, but there is no shift (for the first
START/STOP range). For the same reason, audio streams may stop a few
milliseconds before or after the video stream.

When the next START/STOP ranges are appended, all streams should continue
without gaps. Therefore new PTS offsets are calculated such that the PTS of
the first part of the video are continued seamlessly by the next part. But
because audio and video of the first part can stop at different times, and
audio and video of the second part may start at different times, it is
sometimes (in fact most of the time) necessary to shift the audio a little
with respect to the video so that all output streams (audio and video) fit
together without gaps.

When speaking of "starting/stopping at different times" and "shifts", this is 
always only a matter of a few milliseconds (+/- 12 milliseconds). The shifts 
do not accumulate. So even the 100th part of your output file (if you have so 
many START/STOP-ranges) will have an audio/video shift of at most +/- 12 
milliseconds.
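
The 12 millisecond bound follows from the frame durations alone. The Python
sketch below works in 90 kHz PTS ticks and assumes, as a simplification, that
the audio can start on any frame of its 24 ms grid, so the required shift is
the distance to the nearest grid point. The function audio_shift and its
parameters are hypothetical, not taken from DVBCUT.

```python
from fractions import Fraction

TICKS_PER_SECOND = 90_000                        # PTS clock rate
VIDEO_FRAME = TICKS_PER_SECOND // 25             # 3600 ticks = 40 ms
AUDIO_FRAME = TICKS_PER_SECOND * 1152 // 48000   # 2160 ticks = 24 ms


def audio_shift(wanted_offset, grid_phase):
    """Smallest shift that lets audio start where the output needs it.

    wanted_offset: audio start relative to video start required for a
                   gapless join, in ticks.
    grid_phase:    where the source audio frames actually start relative
                   to the video start, modulo AUDIO_FRAME, in ticks.
    """
    shift = (wanted_offset - grid_phase) % AUDIO_FRAME  # 0 .. AUDIO_FRAME-1
    if shift > AUDIO_FRAME // 2:
        shift -= AUDIO_FRAME                            # prefer the nearer frame
    return shift


# Try many combinations of required offset and grid phase.
worst = max(abs(audio_shift(wanted, phase))
            for wanted in range(-AUDIO_FRAME, AUDIO_FRAME, 8)
            for phase in range(0, AUDIO_FRAME, 8))
worst_ms = Fraction(worst * 1000, TICKS_PER_SECOND)
```

The worst case is half an audio frame whatever offset is required, which is
why the shift of a later range does not depend on how many ranges came before
it.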

There is no way to circumvent this, apart from recoding the audio (that is,
decoding and then re-encoding MP2) or introducing gaps in the audio streams.
I dismiss both alternatives: the first because recoding needs significant CPU
time and also degrades quality, the second because gaps in the audio streams
may cause serious problems for some players. After all, 12 milliseconds are
negligible: you cannot really speak of A/V asynchronicity when the amount is
less than one third of the duration of one video picture.

And, by the way, DVBCUT reports on this while creating an MPEG stream. This is 
what DVBCUT writes in the report window:
 =====
Exporting 3177 pictures: 00:11:37.200/00 .. 00:13:44.280/00
Audio channel 1: starts 2.933 milliseconds after video
Audio channel 1: stops 5.067 milliseconds before video
Audio channel 1: delayed 0.000 milliseconds

Exporting 17119 pictures: 00:23:17.600/00 .. 00:34:42.360/00
Recoding 9 pictures
Audio channel 1: starts 5.067 milliseconds before video
Audio channel 1: stops 2.933 milliseconds after video
Audio channel 1: delayed -8.000 milliseconds

Saved 20296 pictures (00:13:31.840)
 =====
As you can see, the audio shift in the first part amounts to zero. Audio of
the first part stops 5.067 milliseconds early, therefore audio of the second
part also starts 5.067 milliseconds early.
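
The figures in this report can be cross-checked with a few lines of Python.
The sketch assumes 25 pictures per second, ignores the "/00" suffix of the
time codes, and assumes the millisecond values are rounded from 90 kHz ticks
(456 ticks for 5.067 ms, 720 ticks for 8.000 ms). The helper names are
hypothetical.

```python
def parse_time(text):
    """Convert 'HH:MM:SS.mmm/NN' to milliseconds (the '/NN' part is ignored)."""
    clock = text.split("/")[0]
    hours, minutes, seconds = clock.split(":")
    whole, millis = seconds.split(".")
    return ((int(hours) * 60 + int(minutes)) * 60 + int(whole)) * 1000 + int(millis)


def pictures(start, stop, fps=25):
    """Number of pictures between two time codes at the given frame rate."""
    return (parse_time(stop) - parse_time(start)) * fps // 1000


def ms(ticks):
    """90 kHz ticks to milliseconds, rounded like the report."""
    return round(ticks / 90, 3)


part1 = pictures("00:11:37.200/00", "00:13:44.280/00")
part2 = pictures("00:23:17.600/00", "00:34:42.360/00")
total_ms = (part1 + part2) * 1000 // 25

# Audio of part 1 stops 456 ticks before video, so a gapless join needs the
# audio of part 2 to start 456 ticks before video as well.
start_part2 = -456
delay_part2 = -720                                  # reported "delayed -8.000"
natural_start_part2 = start_part2 - delay_part2     # start without any shift
```

Under these assumptions the picture counts and the total duration match the
report, and without the shift the audio of the second part would have started
2.933 milliseconds after the video.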

I hope I could convince you a little that care was taken over these issues
while designing and coding DVBCUT. In fact, accuracy and standards compliance
are the main design goals of DVBCUT.

DVBCUT is not 100% stable yet, and surely it contains bugs. But I am working 
on it. More releases will follow.

So... have a nice day.

-- 
Sven Over
Stephanienstr. 9
76133 Karlsruhe
GERMANY

Telefon: 0721-9204199

http://www.svenover.de/



_______________________________________________
DVBCUT-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dvbcut-user
