It sounds like the granularity of parallelism is too fine.  That is, each
"task" is too short, and the overhead of task dispatching (your task queue
processing, the kernel's thread context switching, any IPC required, etc.) is
longer than the duration of a single task.

I hit the same problem a decade and a half ago when I worked on distributed
parallel ray-tracing systems for my postgraduate thesis. If each task is a
pixel, you may want to consider increasing it to a bundle of pixels of
configurable size.  Depending on the algorithm being parallelized, the bundle
may contain contiguous pixels (if processing each pixel requires
approximately uniform processor time) or a random set of pixels (if there
is, or can potentially be, significant variance in per-pixel processing
time).
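
As a rough illustration, here is a minimal pthreads sketch of the idea,
assuming a flat pixel buffer and a hypothetical process_pixel() routine
rather than GIMP's actual tile/task code: each worker claims a whole
bundle of pixels per dispatch, so the queue overhead is paid once per
bundle instead of once per pixel.

    /* Sketch only: bundled work units instead of one task per pixel.
     * process_pixel() and the buffer layout are stand-ins, not GIMP code. */
    #include <pthread.h>
    #include <stdio.h>

    #define NUM_PIXELS   (1024 * 1024)
    #define BUNDLE_SIZE  4096          /* configurable: pixels per task */
    #define NUM_WORKERS  2

    static float pixels[NUM_PIXELS];

    static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
    static long next_bundle = 0;       /* index of the next unclaimed bundle */

    /* Hypothetical per-pixel work; stands in for the real filter kernel. */
    static void process_pixel (long i)
    {
      pixels[i] = pixels[i] * 0.5f + 1.0f;
    }

    static void *worker (void *arg)
    {
      (void) arg;
      for (;;)
        {
          long bundle, start, end, i;

          /* Claim one whole bundle per lock acquisition, so the dispatch
           * overhead is amortized over BUNDLE_SIZE pixels. */
          pthread_mutex_lock (&queue_lock);
          bundle = next_bundle++;
          pthread_mutex_unlock (&queue_lock);

          start = bundle * BUNDLE_SIZE;
          if (start >= NUM_PIXELS)
            break;

          end = start + BUNDLE_SIZE;
          if (end > NUM_PIXELS)
            end = NUM_PIXELS;

          for (i = start; i < end; i++)
            process_pixel (i);
        }
      return NULL;
    }

    int main (void)
    {
      pthread_t threads[NUM_WORKERS];
      int t;

      for (t = 0; t < NUM_WORKERS; t++)
        pthread_create (&threads[t], NULL, worker, NULL);
      for (t = 0; t < NUM_WORKERS; t++)
        pthread_join (threads[t], NULL);

      printf ("done\n");
      return 0;
    }

For the "random set of pixels" variant, the same loop works if the bundle
index is mapped through a precomputed permutation of pixel indices.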


-Dave

-----Original Message-----
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED]] On Behalf Of Daniel Egger
Sent: Monday, 21 February 2005 9:13 AM
To: [EMAIL PROTECTED] (Marc A. Lehmann)
Cc: Sven Neumann; Developer gimp-devel
Subject: Re: [Gimp-developer] GIMP and multiple processors

On 20.02.2005, at 23:47, <[EMAIL PROTECTED]> (Marc A. Lehmann) wrote:

> Linux will not keep two threads running on a single cpu if both are
> ready and nothing else is running, regardless of locality etc., as the
> kernel lacks the tools to effectively decide whether threads should
> stay on a cpu or not.

Yes and no. I just figured out that the tools I was
looking for are called schedutils; they can be used to
change the affinity settings of a process, i.e. pin
it to one CPU or let it migrate, as the kernel
decides, within a set of CPUs.

Forcing the NPTL implementation to degrade to legacy
pthreads means that one thread equals one process and
thus can be controlled with taskset.
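
For reference, the same pinning can be done directly from code; this is
a minimal sketch assuming Linux and glibc's sched_setaffinity() (which is
what taskset uses underneath), not anything GIMP-specific:

    /* Sketch: pin the calling process to one CPU, like `taskset` does. */
    #define _GNU_SOURCE
    #include <sched.h>
    #include <stdio.h>
    #include <stdlib.h>

    int main (int argc, char **argv)
    {
      cpu_set_t set;
      int cpu = (argc > 1) ? atoi (argv[1]) : 0;

      CPU_ZERO (&set);
      CPU_SET (cpu, &set);

      /* pid 0 means "the calling process"; with legacy one-process-per-
       * thread pthreads each thread has its own pid and can be pinned the
       * same way. */
      if (sched_setaffinity (0, sizeof (set), &set) != 0)
        {
          perror ("sched_setaffinity");
          return 1;
        }

      printf ("pinned to CPU %d\n", cpu);
      /* ... run the actual work here ... */
      return 0;
    }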

Oh yes, and I just noticed that this isn't even
necessary anymore, because for some reason the kernel
now migrates one of the pthread processes to the other
CPU automatically after a short while of processing.

> (I mean, it's of course bad to interleave operations on a per-pixel
> basis instead of e.g. a per-tile basis, but the kernel will run the
> threads concurrently whether or not it gets slower).

Certainly. Opterons are bandwidth monsters, but that
doesn't mean they'll be forgiving of stupid
algorithms.

> That's quite possible, but IFF the kernel indeed keeps the two threads
> on a single cpu then it means that both aren't ready at the same time,
> e.g. due to lock contention or other things.

I can force it to use both CPUs now, but even at
200% utilization this stupid microbenchmark runs 2s
slower than on one CPU without threads.

Servus,
       Daniel

_______________________________________________
Gimp-developer mailing list
Gimp-developer@lists.xcf.berkeley.edu
http://lists.xcf.berkeley.edu/mailman/listinfo/gimp-developer
