On Feb 4, 2009, at 4:27 AM, Jim Klimov wrote:
Sorry for jumping in, but I am curious about this. Is the
compression being done in a separate thread on the server side?
No. The X server is not multithreaded.
DM> Ok, thanks! (I can certainly see how that'd be a BIG pain
to do)
I got curious about this thesis - why is it a big pain?
We've recently been doing tests concerning parallelized
compression (of files, via pbzip2 and pigz) and these
tasks fan out to multiple CPUs/strands pretty well.
In our case we had a 6-core T1 (24 strands) perform over
13 times better than its single strand. But this success
relied on there being no floating-point math.
...
I'm not suggesting that it'd be a pain to write a threaded X
server...that probably wouldn't be so bad. The big pain would be
REwriting the current X server to be threaded. I worked on some of
the internals of X11R4 and X11R5 years ago (bug hunting) and
discovered that it is a VERY large pile of code.
I believe that compression of screenbuffer can be chopped
to blocks which are compressed independently, JPEG-style
(seeing the artefacts sometimes, it is obvious that this
chopping occurs anyway). These blocks can be compressed
separately by different threads.
Yes, that'd be easy. (well, relatively)
From what I learned in CS, there's this neat OpenMP stuff
which allows programmers to "potentially-parallelize"
certain blocks of their code (like big loops) with C/C++
pragmas and some well-written code. This is a relatively
unintrusive method to make old linear code work in parallel
in parts where that's needed with no or minor (and already
patterned) rewrites of the logic.
So a single function call from "linear" Xsrv which goes
like CompressImage(buf) would return faster on multi-CPU
machines. No major changes to other code. Did I mention
that this function could also hide in a shared-object
library compiled with optimizations for certain CPUs'
instruction sets which are often aimed at multimedia?
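To make the idea concrete, here is a rough sketch of what such a call might look like. Everything in it is an assumption for illustration: the extended CompressImage() signature, the 64-byte BLOCK_SIZE, and the toy run-length encoder standing in for whatever codec the server really uses. The only point is that each block is compressed independently, so one OpenMP pragma can spread the loop over threads.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE 64   /* hypothetical block size in bytes */

/* Toy run-length encoder (a stand-in for a real codec): emits
 * (count, value) pairs. Worst case output is 2 * n bytes.
 * Returns the number of bytes written. */
static size_t rle_block(const unsigned char *in, size_t n, unsigned char *out)
{
    size_t o = 0, i = 0;
    while (i < n) {
        unsigned char v = in[i];
        size_t run = 1;
        while (i + run < n && in[i + run] == v && run < 255)
            run++;
        out[o++] = (unsigned char)run;
        out[o++] = v;
        i += run;
    }
    return o;
}

/* Hypothetical CompressImage(): compresses each block on its own.
 * out must provide 2 * BLOCK_SIZE bytes per block; out_len[b] receives
 * the compressed size of block b. */
void CompressImage(const unsigned char *buf, size_t len,
                   unsigned char *out, size_t *out_len)
{
    long nblocks = (long)((len + BLOCK_SIZE - 1) / BLOCK_SIZE);
    long b;

    assert(buf != NULL && out != NULL && out_len != NULL);

    /* Iterations share no state, so they may run on different threads.
     * Built without -fopenmp the pragma is ignored and the loop simply
     * runs serially with the same result. */
    #pragma omp parallel for schedule(dynamic)
    for (b = 0; b < nblocks; b++) {
        size_t start = (size_t)b * BLOCK_SIZE;
        size_t n = (len - start < BLOCK_SIZE) ? len - start : BLOCK_SIZE;
        out_len[b] = rle_block(buf + start, n,
                               out + (size_t)b * 2 * BLOCK_SIZE);
    }
}
```

The caller stays single-threaded: it hands in the buffer, gets back per-block lengths, and never sees the threads. With gcc this would be built with -fopenmp.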
OpenMP works in recent gcc-4.3.x and AFAIK in Sun Studio
compilers as well. I've seen that gcc's libgomp can be
linked in statically so it appears to ldd as only a
standard libpthread-dependent application.
Am I wrong? How? :)
I've messed with OpenMP a little bit and can't think of any reason
why it wouldn't be doable. Things like -xautopar (Sun Studio
autoparallelization) might also get you somewhere, but that's the
sort of thing in which a very small tweak to the code can make the
difference between parallelizable and nonparallelizable routines. It
would still be a potentially large amount of work to get any real
gain. (think of loop restructuring to eliminate inter-iteration
dependencies, etc)
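As a small illustration of that kind of restructuring (a made-up example, not code from the X server), consider a delta filter of the sort often run before compression. The function names here are hypothetical. In the first version the variable prev is carried from one iteration to the next, which blocks parallelization; in the second each iteration reads only the input array, so the loop can be split across threads.

```c
#include <assert.h>
#include <stddef.h>

/* Serial form: 'prev' is carried across iterations, so iteration i
 * depends on iteration i-1 having run first. */
void delta_serial(const unsigned char *in, unsigned char *out, size_t n)
{
    unsigned char prev = 0;
    size_t i;
    for (i = 0; i < n; i++) {
        out[i] = (unsigned char)(in[i] - prev);
        prev = in[i];
    }
}

/* Restructured form: each iteration fetches its predecessor straight
 * from the input, so no state flows between iterations. This requires
 * that 'in' and 'out' do not overlap. */
void delta_parallel(const unsigned char *in, unsigned char *out, size_t n)
{
    long i;

    assert(in != out);

    /* Ignored without -fopenmp; the loop then runs serially. */
    #pragma omp parallel for
    for (i = 0; i < (long)n; i++) {
        unsigned char prev = (i > 0) ? in[i - 1] : 0;
        out[i] = (unsigned char)(in[i] - prev);
    }
}
```

Both functions produce the same output for the same input. The change is tiny here, but finding and untangling every such carried variable in a large, old code base is where the work goes.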
-Dave
--
Dave McGuire
Port Charlotte, FL
_______________________________________________
SunRay-Users mailing list
http://www.filibeto.org/mailman/listinfo/sunray-users