On Sun, 2009-01-11 at 19:40 +0000, Andrew Church wrote:
> >Sounds good.
> >So, I volunteer for maintaining the legacy code (the current src/*
> >stuff).
> 
> Then I guess I'm tapped for the new code, huh? (:  Well, I'll see how
> my work schedule looks, but I should be able to contribute something,
> at least.  Plus it looks like fun to try. (:

Don't get me wrong, I'm very happy to write the new one; I just
don't want transcode-legacy to be abandoned (...yet), so I'll
support it, that's all.

> Well, that's certainly good to start with (changing too many things at
> once will just lead to problems), but we probably ought to consider
> changes once the basic framework is solidified, just so we don't have
> too many round-peg-to-square-hole converters thrown into the code.

Yes indeed. The NMS is already showing some shortcomings, but I'd
like to collect more information before changing the API again.
That of course doesn't mean the API will not change; quite the
opposite :)

As a general rule, though, I believe in "special cases aren't
special enough to break the rules",
which roughly translates into
"make the common case easy and the special case possible".
In my own experience, some people (yes, sometimes myself too) try to
make _everything_ easy/simple, but the most common result is a general
increase in complexity and very questionable advantages.

So, if 99% of our modules have just one input and just one output, I feel
the API should make it comfortable to write _that kind_ of module. Of course
MIMO (MultiInputMultiOutput) modules should be possible to write, but if
they are a bit harder to write... hey, that's life.
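
To make the "common case easy, special case possible" idea a bit more
concrete, here is a minimal sketch. All names in it (Frame, Module,
module_process, offset_process) are hypothetical and made up for
illustration; this is not the NMS API, just one way such a split could look.
An ordinary module fills in only the single-input/single-output callback,
while a MIMO module has to provide the more verbose one.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a frame buffer; not a transcode type. */
typedef struct { int id; } Frame;

/* Common case: exactly one input frame, exactly one output frame. */
typedef int (*ProcessSingle)(void *self, Frame *in, Frame *out);

/* Special case: any number of input and output frames. */
typedef int (*ProcessMulti)(void *self, Frame **in, size_t n_in,
                            Frame **out, size_t n_out);

typedef struct {
    void          *self;    /* module private data */
    ProcessSingle  single;  /* set by ordinary modules */
    ProcessMulti   multi;   /* set only by MIMO modules */
} Module;

/* The framework would always call this one entry point. */
int module_process(Module *m, Frame **in, size_t n_in,
                   Frame **out, size_t n_out)
{
    assert(m != NULL);
    if (m->multi != NULL)
        return m->multi(m->self, in, n_in, out, n_out);
    if (m->single != NULL && n_in == 1 && n_out == 1)
        return m->single(m->self, in[0], out[0]);
    return -1; /* a SISO module wired to the wrong number of ports */
}

/* Example SISO module: copies the frame id and adds a fixed offset. */
int offset_process(void *self, Frame *in, Frame *out)
{
    out->id = in->id + *(int *)self;
    return 0;
}
```

The module author in the common case writes only something like
offset_process(); the array handling lives once, in the dispatcher.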

> >The point I'm blocked on is how to blend together real parallelism and
> >efficient frame passing. If we apply the strategy I exposed verbatim in
> >a multi-{cpu,core} environment, I'm afraid just one node will
> >effectively run at any given time, while all the others are blocked in
> >pulling. Of course this is no good.
> 
> I agree it's not easy (I gave it a little thought myself and eventually
> decided it wasn't worth worrying about in an initial implementation),
> but I still think we should keep the one-thread-per-node concept.  After
> all, it's much harder to add multithreading later to a single-threaded
> design than it is to design in from the beginning.

I completely agree.
I've thought a bit more about this, and quickly came up with the
following: a low-hanging fruit *seems to be* using double buffering for
the work buffers:

typedef struct tcprocessingnode_ TCProcessingNode;
struct tcprocessingnode_ {
        TCProcessingNode *upstream;
        TCModule         *module;
        TCFrameBuffer    *frames[2];
        int              work; /* index of the framebuffer to use */

        int              (*request_frame)(TCProcessingNode *P, TCFrameBuffer *frame);
        int              (*wait_frame)(TCProcessingNode *P, TCFrameBuffer *frame);
};

static int generic_get_frame_mt_friendly(TCProcessingNode *P, TCFrameBuffer *frame)
{
    int err = TC_OK;
    /* init postconditions: frames[2] point to valid framebuffers, work=0 */
    /* async, served in a separate thread: prefetch into the idle buffer */
    P->upstream->request_frame(P->upstream, P->frames[!P->work]);
    /* blocks until ready; hopefully 99% of the time it doesn't block at all */
    P->upstream->wait_frame(P->upstream, P->frames[P->work]);
    err = tc_module_process(P->module, P->frames[P->work], frame);
    P->work = !P->work; /* done with current buffer, so rotate it */
    return err;
}

Locking and blocking have to be sorted out properly (the code above simply
lacks them), but the core idea is to let the current step and its upstream
(and so on, down to the leaf nodes) work in parallel. Hope it's clear
enough, too :)
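
The buffer rotation itself can be checked in isolation. The sketch below is
a deliberately single-threaded simulation with hypothetical names (Frame,
Node, node_init, node_get_frame): the "request" fills the idle buffer
immediately instead of asynchronously, and "processing" just reads a
sequence number. It assumes the pipeline is primed with one request at init
time, which is my guess at how the first wait would find a frame. It shows
only that frames come out in order while the next one is always prefetched
into the other buffer; it says nothing about the locking.

```c
#include <assert.h>

/* Hypothetical stand-in for TCFrameBuffer. */
typedef struct { int seq; } Frame;

typedef struct {
    Frame frames[2];
    int   work;      /* index of the buffer being consumed */
    int   next_seq;  /* stands in for the upstream source */
} Node;

/* Stand-in for request_frame(): fills the buffer right away.
 * In the threaded design this would only queue the request. */
static void request_frame(Node *n, Frame *f)
{
    f->seq = n->next_seq++;
}

/* Prime the pipeline so the first pull has a frame to consume. */
void node_init(Node *n)
{
    n->work = 0;
    n->next_seq = 0;
    request_frame(n, &n->frames[n->work]);
}

/* One pull: prefetch into the idle buffer, consume the work buffer,
 * then swap the two roles. Returns the consumed sequence number. */
int node_get_frame(Node *n)
{
    int seq;
    assert(n->work == 0 || n->work == 1);
    request_frame(n, &n->frames[!n->work]); /* prefetch the next frame */
    seq = n->frames[n->work].seq;           /* "process" the current one */
    n->work = !n->work;                     /* rotate the buffers */
    return seq;
}
```

In a threaded version the prefetch would overlap with the processing of the
current frame, which is where the gain is supposed to come from.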

[...]
> i.e. thread 1 pulls from thread 2 and thread 3 at the same time, so it
> only has to wait for max(wait_2,wait_3), not (wait_2 + wait_3).
> Admittedly I'm not sure how well this will work in practice, since the
> demux stage will be single-threaded and not (yet) have any buffering,
> but at the least we should be able to get the video and audio streams
> running in parallel.

At the very least it's worth trying :)
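
As a back-of-the-envelope check of the max() versus sum argument quoted
above, here is a tiny model. The function names and the idea of expressing
waits as integer milliseconds are assumptions of mine, and the model ignores
the single-threaded demux stage, so it gives an upper bound on the gain
rather than a prediction.

```c
#include <assert.h>

/* Time a node spends waiting if it pulls its upstreams one after another. */
int serial_wait(const int *wait_ms, int n)
{
    int i, total = 0;
    for (i = 0; i < n; i++)
        total += wait_ms[i];
    return total;
}

/* Time a node spends waiting if it pulls all its upstreams at once:
 * it is only held up by the slowest one. */
int parallel_wait(const int *wait_ms, int n)
{
    int i, worst;
    assert(n > 0);
    worst = wait_ms[0];
    for (i = 1; i < n; i++)
        if (wait_ms[i] > worst)
            worst = wait_ms[i];
    return worst;
}
```

With, say, a 30 ms video pull and a 50 ms audio pull, the serial case waits
80 ms per frame and the parallel case 50 ms.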

Best,

-- 
Francesco Romani // Ikitt
http://fromani.exit1.org  ::: transcode homepage
http://tcforge.berlios.de ::: transcode experimental forge
