On Tue, 2008-10-07 at 14:45 -0500, Carl Karsten wrote:
> > Very unlikely at this development speed. A lot of code rewrite, careful
> > planning of some architectural changes, and the design and implementation
> > of a better sync algorithm are the steps needed. None of them is easy or
> > quick :)
>
> Can you post a road map, along with what sorts of skills are needed? I'll
> try to go get some help if I know what kind of help is needed.
Yup, I'm setting up this page:
http://tcforge.berlios.de/articles/tasks/index.html

[...]

> >> I am really having a hard time imagining how it would work... but this
> >> is what comes to mind:
> >> Create input from scratch:
> >> Use the python PIL module to construct 2 images: one all black, one all
> >> white. Use some mystery module to generate a tone every second. Use
> >> something to construct a stream that just alternates between the 2
> >> images, and plays the tone during one.
> >
> > Yep, the key point is to be able to clearly identify the frames in the
> > stream (both audio and video); then we can assert the streams are in
> > sync if a given audio frame matches a given video frame (modulo a small
> > fixed error).
>
> Is there currently any way to test for sync errors? My understanding is
> that the resulting stream is valid, just not what is expected.

Well, I think such a way exists, granted the ability to identify frames. In
a nutshell, A/V sync means correct pairing between frames, and it can be
verified programmatically if the program can recognize the frames.

> >> Represent playback as data:
> >> given a stream, I want access to the frames as if it was a list, or at
> >> least an iterator. Video would be rendered as single image blobs. I am
> >> not really sure how the audio would be represented, but I need some
> >> way of detecting the transition.
> >
> > audio is just a different blob.
>
> Can you have a frame of audio?

Why not? It's just a matter of properly packing some samples. For example,
considering the PAL framerate, it's a matter of packing 40 ms of audio.
Moreover, audio frames don't need to have the same rate as the video
frames. Given 10 s of a PAL stream, it can be constituted by 250 video
frames and 250 audio frames (1 audio frame = 40 ms of audio), or 25 audio
frames (1 audio frame = 400 ms of audio), or even 1 audio frame, or any
other combination.

> >> Any idea how much of this has already been done, or where a good
> >> starting point is?
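The "black/white frames plus a tone" idea and the audio-frame arithmetic
above can be sketched in plain Python (no external modules). The 48 kHz
sample rate, the 1 kHz tone, and the frame geometry are arbitrary choices
for illustration, not anything transcode mandates:

```python
import math

PAL_FPS = 25
RATE = 48000          # assumed PCM sample rate for the sketch
FRAME_MS = 40         # one PAL video frame lasts 40 ms

def video_frame(white, width=64, height=48):
    """One raw grayscale frame: all 0x00 (black) or all 0xFF (white)."""
    return bytes([255 if white else 0]) * (width * height)

def audio_frame(index, frame_ms=FRAME_MS, rate=RATE):
    """One 'audio frame' of 16-bit mono PCM: a 1 kHz tone on even
    frames, silence on odd ones, matching the alternating video."""
    n = rate * frame_ms // 1000
    if index % 2:
        return bytes(2 * n)          # silence, all-zero samples
    return b"".join(
        int(12000 * math.sin(2 * math.pi * 1000 * t / rate))
            .to_bytes(2, "little", signed=True)
        for t in range(n)
    )

def frame_counts(seconds, frame_ms=FRAME_MS):
    """Video and audio frame counts for a PAL stream of the given length."""
    return seconds * PAL_FPS, seconds * 1000 // frame_ms
```

With 40 ms audio frames `frame_counts(10)` gives (250, 250); with 400 ms
ones, (250, 25), matching the combinations described above. A sync check
then reduces to asserting that video frame i (white/black) pairs with audio
frame i (tone/silence).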
> > Well, test frame generators (both audio and video) are quite common; we
> > have some examples in the TC codebase, in the ffmpeg one, and I guess in
> > many others. We can also just tag the frames using some kind of
> > watermark in order to make them recognizable.
>
> I think it will be easy enough to identify video frames. Draw text, OCR
> the text: fairly reliable and easy to use.

Yep, that's right; I'm stressing the watermark idea because using it will
make it easy to identify the audio as well (and in general the detection is
much more reliable). The only drawback is the need to mark the streams just
after generation/capture, but I think that is a reasonable price (the
watermark does NOT have to be unique, so I don't see any other drawback).

> I am planning on getting familiar with ctypes, swig and pyrex in the next
> few months.

Me too :)

(offtopic) I'd also like to code something along the lines of the seemingly
defunct pymedia; I already have some prototype code, maybe I'll start a new
project in the next weeks.

Bests,

-- 
Francesco Romani // Ikitt
http://fromani.exit1.org  ::: transcode homepage
http://tcforge.berlios.de ::: transcode experimental forge
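P.S. A rough sketch of the watermark idea using PIL. Instead of OCR-able
text it stamps the frame index into the image as black/white pixel blocks,
which makes the read-back trivial; the 8x8 block size and 16-bit layout are
invented for the example, not an existing transcode scheme:

```python
from PIL import Image

BITS = 16  # frame index encoded as 16 bits, hypothetical layout

def watermark(img, index, block=8):
    """Stamp the frame index into the top row as black/white blocks,
    least-significant bit first."""
    assert img.width >= BITS * block, "image too narrow for the mark"
    out = img.copy()
    for bit in range(BITS):
        val = 255 if (index >> bit) & 1 else 0
        out.paste(val, (bit * block, 0, (bit + 1) * block, block))
    return out

def read_watermark(img, block=8):
    """Recover the frame index by sampling the centre of each block."""
    index = 0
    for bit in range(BITS):
        x = bit * block + block // 2
        if img.getpixel((x, block // 2)) > 127:
            index |= 1 << bit
    return index
```

For example, `read_watermark(watermark(frame, 1234))` returns 1234 for any
grayscale frame at least 128 pixels wide; the same per-frame index could be
mixed into the matching audio frame to tie the two streams together.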