Re: [transcode-devel] psu mode problem with 1.1

Francesco Romani Sun, 04 Jan 2009 02:30:38 -0800

----------------------------------------------------------------------------
WARNING: long mail follows.
One-line summary:
HEAD broken. Sorry for that. I'm already testing the patch, stay tuned!
-----------------------------------------------------------------------------

On Tue, 2008-12-30 at 08:31 -0700, William Astle wrote:
> > This means a new RC out in a few weeks and (hopefully) the 1.1.0 final
> > a few weeks after.

Still working on that.
I've committed a preliminary and already quite big patch on both HEAD
and 1.1.0 branches. Looks like I'm on the right track (I consider the
new code better in any way with respect to old one, and I wrote both of
them, so this isn't geek pride ;))

But, everybody knows: new code, new bugs[0]. Found yet another one when
doing further testing. I've the code which fixes that (and improves
the code and the design again), but until I commit that

*BOTH* HEAD and 1.1.0 ARE BUGGY! <<======

(I'm sorry for tainting HEAD, that was not made by purpose)

Feel free to ask me the gory details if you like.
A short (!) summary follows.

+++

The root of the problem is the following:

the filter layer not guarantee the in-order deliver of frames to export
layer, because it's multithreaded and has not ordering-enforcing code
(yet).

Let's consider a four-frames long stream.

Import layer delivers frames 1-2-3-4.
Filter layer got them in frame order (import has one thread per stream
and intrinsecally work in FIFO discipline. And we're all very happy with
that).

Filter layer uses N threads per stream. The thread can finish in any
order, so they can deliver the frames to export in any order.
But export expects frames in-order!

An order as returned by filter layer can be 1-2-3-4, 4-3-2-1, 1-3-2-4,
any combination is legal.

Formerly, the linked list placed on the framebuffer was ensuring the
correct order. But that data model was ugly for a number of reasons,
first and foremost duplicating a significant amount of code (private
linked list implementation, using extra fields in frame structure).

Now, no more linked list. How to deal with out-of-order deliver?

I've initially thought (and I'm still convinced that is possible, that's
the most efficient solution and it's also quite elegant) that it is
possible to use a pseudo-sliding-window procedure
(http://en.wikipedia.org/wiki/Sliding_window), having implicit ordering.
The export buffers become a sort of rotating sliding window (so, a
sliding window implemented using a ring buffer),
no longer a plain old FIFO. So filter layer can deliver the completed
frames directly into the correct slot without hassle and, as said, with
implicit ordering. And maximum efficiency.

Uhm, clever! Probably too clever. That's hard to explain (so hard to
document, hard understand, hard to hack, and that's Bad. Very Bad.),
and most important the current implementation is the result of various
simplifications of the sliding window algorhytm... Which ends up to bugs
[1] :)

Solution (currently being tested on my local tree): step back, humilty,
drop all this supposed cleverness and implement an heap buffer 
(http://en.wikipedia.org/wiki/Binary_heap) for the export, and only for
the export.
Heap is simple, well known, easy to grok, efficient enough. Is not O(1)
for insertion and extraction as the sliding window idea was *supposed*
to be, but it's still fast for our common case, yet flexible enough for
future changes. And someday both FIFO and heap can be extracted,
generalized and put into libtc(util) with minimal effort.

Ok, that's all for the short summary :)

Discussion welcome.

+++

[0] I was not satisfied with the old framebuffer. The code was
redundant, crufty, too complex *AND* already buggy at very least for PSU
mode. Given the overall situation it's pointless to be on schedule but
with bugs like that. So, let's bite the bullet and let's do something
better.

[1] Which ultimately is:
current code places incoming frames into the export buffer in the
position corresponding to frame's bufid. The bufid is the internal ID of
a framebuffer, corresponding to the frame registration order. Nothing
else.
Nobody, except a series of circumstances, guarantees that 

frame_id % frame_ring_size = frame_bufid (hard constraint)

Or even that

the bufid sequence seen by export is monotonically increasing (soft
constraint)

But the root of the problem is on the algorythimic side: the buffer
handling code should make it's decisions on public fields, most
important being the frame sequence number (=id) in order to be robust
and flexible.

-- 
Francesco Romani // Ikitt
http://fromani.exit1.org  ::: transcode homepage
http://tcforge.berlios.de ::: transcode experimental forge

Re: [transcode-devel] psu mode problem with 1.1

Reply via email to