Hey folks,
I'm working on porting libmpeg2 to the IBM Cell processor and optimizing it
there, and am looking for some advice.
Brief overview of the Cell processor: it has one general-purpose Power-based
CPU (the PPE, "Power Processor Element") and 8 independent "Synergistic
Processing Elements" (SPEs), which are essentially stripped-down vector
co-processors/accelerators. Each SPE has its own memory space (256KB of
local storage), while the PPE has access to the full memory space of the
system, so DMA transfers are needed to move data between the PPE and the
SPEs.
I've done some searching and found a few brief mentions of slice-level
parallel decoding with libmpeg2 (specifically here
http://sourceforge.net/mailarchive/message.php?msg_id=2301783). That thread
mentions that something similar will be patched into the main tree, but I
haven't seen those changes or any example/explanation of how they're used.
From that thread I found TexMPEG, which seems to offer multithreaded decoding
of a stream, but it's based on an old version (0.2.2) and seems to add
several complexities that I don't need. Any pointers to other multithreaded
decoder ports would be great.
I would prefer to do as much of the parallelization as possible within the
library, with little change to the external interface. More importantly, I
want the majority of the calculations to be offloaded to the SPEs, since
that is where the bulk of the Cell's processing power is. My initial
thought for implementation was to maintain a work queue of individual slices
on the PPE and have each SPE take and decode a single slice at a time,
synchronizing as necessary (probably at the end of each frame). I figured I
would essentially offload mpeg2_slice() onto each SPE, and would start by
using just a single SPE.
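
To make the queue idea concrete, here is a minimal sketch in plain C of what
I have in mind. It is only an illustration under my own assumptions: the
names slice_job, slice_queue, worker_ctx and slice_worker are hypothetical
and not part of libmpeg2, and the spot where the slice would be DMA'd to an
SPE and handed to mpeg2_slice() is left as a placeholder comment. Each
worker claims the next unclaimed slice with an atomic counter, and the frame
is complete once every slice has been counted as done.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical descriptor for one slice of the current frame. */
typedef struct {
    const uint8_t *data;       /* first byte of the slice in PPE memory */
    size_t         size;       /* slice length in bytes */
    int            decoded_by; /* id of the worker that took it, -1 if pending */
} slice_job;

/* Hypothetical per-frame work queue living on the PPE. */
typedef struct {
    slice_job    *jobs;
    size_t        count;
    atomic_size_t next;        /* index of the next unclaimed slice */
    atomic_size_t done;        /* number of slices fully processed */
} slice_queue;

void slice_queue_init(slice_queue *q, slice_job *jobs, size_t count)
{
    assert(jobs != NULL || count == 0);
    q->jobs = jobs;
    q->count = count;
    atomic_init(&q->next, 0);
    atomic_init(&q->done, 0);
    for (size_t i = 0; i < count; i++)
        jobs[i].decoded_by = -1;
}

/* Claim the next slice, or return NULL when the frame has none left.
 * The atomic fetch-and-add guarantees no slice is handed out twice. */
slice_job *slice_queue_claim(slice_queue *q)
{
    size_t i = atomic_fetch_add(&q->next, 1);
    return i < q->count ? &q->jobs[i] : NULL;
}

/* One worker per SPE (assumption: a PPE-side thread drives each SPE). */
typedef struct {
    slice_queue *queue;
    int          id;
    size_t       handled;      /* slices processed by this worker */
} worker_ctx;

/* Signature matches a pthread start routine so it can run as a thread. */
void *slice_worker(void *arg)
{
    worker_ctx *w = arg;
    slice_job *job;

    while ((job = slice_queue_claim(w->queue)) != NULL) {
        /* Placeholder: transfer job->data to the SPE and decode it there. */
        job->decoded_by = w->id;
        w->handled++;
        atomic_fetch_add(&w->queue->done, 1);
    }
    return NULL;
}

/* End-of-frame synchronization point: true once every slice is done. */
int slice_queue_frame_complete(slice_queue *q)
{
    return atomic_load(&q->done) == q->count;
}
```

Starting with a single SPE would just mean running one slice_worker; adding
more workers should not require changing the queue itself.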
The conceptual problem I've run into so far (starting with the main source
tree) is determining where within libmpeg2's internal buffer a given slice
lies. This matters because all data used by an SPE must be transferred to it
manually. Is a structure maintained with the start and end of each slice, or
would this be something easy to implement?
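
For reference, this is roughly the kind of slice index I am picturing. It is
a sketch that works on a raw MPEG-2 video elementary-stream buffer and does
not use any libmpeg2 internals; slice_span, next_start_code and find_slices
are hypothetical names of my own. It relies on MPEG-2 video start codes being
the byte prefix 00 00 01 followed by a code byte, with code values 0x01 to
0xAF marking slices, and it treats a slice as running up to the next start
code of any kind (or the end of the buffer).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record: byte range of one slice inside a buffer. */
typedef struct {
    size_t  start;  /* offset of the 00 00 01 prefix */
    size_t  end;    /* offset one past the last byte of the slice */
    uint8_t code;   /* start code value, 0x01..0xAF for slices */
} slice_span;

/* Offset of the next complete start code (prefix plus code byte) at or
 * after pos, or len if there is none. */
static size_t next_start_code(const uint8_t *buf, size_t len, size_t pos)
{
    for (; pos + 3 < len; pos++) {
        if (buf[pos] == 0x00 && buf[pos + 1] == 0x00 && buf[pos + 2] == 0x01)
            return pos;
    }
    return len;
}

/* Fill out[] with up to max_out slice ranges found in buf; returns the
 * number of entries written. Non-slice start codes are skipped. */
size_t find_slices(const uint8_t *buf, size_t len,
                   slice_span *out, size_t max_out)
{
    size_t n = 0;
    size_t pos = next_start_code(buf, len, 0);

    assert(out != NULL || max_out == 0);

    while (pos < len && n < max_out) {
        uint8_t code = buf[pos + 3];
        /* Resume the search just past this start code's four bytes. */
        size_t next = next_start_code(buf, len, pos + 4);

        if (code >= 0x01 && code <= 0xAF) {
            out[n].start = pos;
            out[n].end = next;
            out[n].code = code;
            n++;
        }
        pos = next;
    }
    return n;
}
```

With a table like this, each entry's start and end offsets would give the
source address and length for the transfer to an SPE, though real DMA would
also have to respect the hardware's alignment and size constraints.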
I appreciate any advice that can be given.
Thanks,
Nick
_______________________________________________
Libmpeg2-devel mailing list
Libmpeg2-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libmpeg2-devel