On Mon, 2005-02-28 at 00:25 +0100, Sven Neumann wrote:
Hi,
Jay Cox [EMAIL PROTECTED] writes:
Now that this race condition is eliminated I might look into adding
hooks to the pixel-processor to allow initialisation of per-thread
data, like for example a GRand.
I think that is the
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
The first goal should be to reduce complexity. Today I looked
into parallelizing some *really* slow code: supersampling.
Activating it slows down the processing *much* more than just
the factor of the supersampling depth. And hey, the code is
uglee
On 28.02.2005, at 21:15, Sven Neumann wrote:
Since supersampling is in libgimpcolor, it will probably be difficult
to improve without breaking backwards compatibility. But of course
no one forces us to actually use the functionality in libgimpcolor...
But, and this is one of many upsides of
Hi,
Jay Cox [EMAIL PROTECTED] writes:
The dither code is way too complex. It looks like it should boil
down to: color.{r,g,b,a} += g_rand_int()/RAND_MAX.
We shouldn't need 32 bits of random data per component. 8 bits
should do, so we only need one call to g_rand_int per pixel.
Good
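A minimal sketch of the "one random word per pixel" idea quoted above. It assumes channel values on a 0..255 scale held as doubles; the Color struct and dither_pixel() are hypothetical names, not GIMP code, and rnd stands in for the result of a single g_rand_int() call.

```c
#include <stdint.h>

/* Hypothetical RGBA pixel; channels are on a 0..255 scale. */
typedef struct { double r, g, b, a; } Color;

/* Add dither noise in [0, 1) to each channel, taking one 8-bit slice
 * of a single 32-bit random word per component. `rnd` stands in for
 * the value of one g_rand_int() call. */
void dither_pixel(Color *c, uint32_t rnd)
{
    c->r += (double)( rnd        & 0xFF) / 256.0;
    c->g += (double)((rnd >>  8) & 0xFF) / 256.0;
    c->b += (double)((rnd >> 16) & 0xFF) / 256.0;
    c->a += (double)((rnd >> 24) & 0xFF) / 256.0;
}
```

Truncating each channel to an integer afterwards rounds it up with a probability equal to its fractional part, which is what breaks up the banding.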
Daniel Egger wrote:
Maybe it would be best if someone could come up with an example of what
supersampling might be useful for in the gradient blend code, except for
slowing down everything...
For an A-B linear blend it probably doesn't noticeably help
much (unless, possibly, it's a long colour range in a small
On 27.02.2005, at 01:04, Sven Neumann wrote:
I have eliminated last_visited from the gradient struct. Instead the
caller of gimp_gradient_get_color_at() may now do the same
optimization without any caching in the gradient itself. I very much
doubt that this makes any difference though. Perhaps if
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
Your change dramatically changed the picture.
That surprises me, but there is obviously a lot I have to learn about
threads. Thanks for testing my changes.
However dithering on is still losing quite a bit on this SMP
machine
Dithering makes
On 27.02.2005, at 14:31, Sven Neumann wrote:
Dithering makes heavy use of GRand and as long as the random number
generator is shared between the threads... I wonder if it actually
makes sense to add the overhead of per-thread data initialization or
if we should rather replace the use of a random
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
Since the randomness doesn't play a big role here (like in a
security environment) I wonder if it wouldn't be easiest to
employ a per-thread pseudo-RNG seeded with a different
random number. One global RNG would be nice for this...
GRand is such a
On 27.02.2005, at 15:19, Sven Neumann wrote:
It is called once per tile. Your approach probably makes sense as long
as you don't use g_rand_new() but g_rand_new_with_seed(). g_rand_new()
initializes the random number generator from /dev/urandom which is
probably too expensive to be done once per
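A rough sketch of the per-thread seeding scheme discussed here, written without GLib so it stands alone. The Rng type is a small xorshift32 generator standing in for GRand, and rng_seed(), rng_next() and seed_workers() are hypothetical names. With GLib the equivalent step would presumably be g_rand_new_with_seed() fed from g_rand_int() on one global generator.

```c
#include <stdint.h>

/* Minimal xorshift32 generator; a stand-in for GRand. */
typedef struct { uint32_t state; } Rng;

/* Scramble the seed before use so that generators seeded from
 * consecutive outputs of another Rng do not produce shifted copies
 * of the same sequence. */
void rng_seed(Rng *rng, uint32_t seed)
{
    uint32_t x = seed;
    x ^= x >> 16;  x *= 0x85EBCA6Bu;
    x ^= x >> 13;  x *= 0xC2B2AE35u;
    x ^= x >> 16;
    rng->state = x ? x : 1u;   /* xorshift state must be non-zero */
}

uint32_t rng_next(Rng *rng)
{
    uint32_t x = rng->state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    rng->state = x;
    return x;
}

/* Seed one private generator per worker from a single global one.
 * Only this setup step touches the global generator, so the workers
 * never contend for it while processing tiles. */
void seed_workers(Rng *global, Rng *workers, int n_workers)
{
    for (int i = 0; i < n_workers; i++)
        rng_seed(&workers[i], rng_next(global));
}
```

Seeding from a number is cheap, so this could run once per tile if needed; reading /dev/urandom each time is the cost the message above warns about.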
On 27.02.2005, at 17:24, Sven Neumann wrote:
Incidentally this is exactly what I'm testing right now. ;=)
Incidentally that is what the code in CVS is doing. Looks like we were
working on the same code. We should perhaps start using mutexes on the
source code ;)
Heh, you did not only do the same
On 27.02.2005, at 17:19, [EMAIL PROTECTED] (Marc A. Lehmann) wrote:
As for your claim, dithering is completely invisible, try this image,
which is done with gimp's blend and no dithering:
http://data.plan9.de/d0.png
That image features quite visible banding effects - you will have
On Sun, 2005-02-27 at 01:04 +0100, Sven Neumann wrote:
Hi again,
Jay Cox [EMAIL PROTECTED] writes:
Clearly the gradient code could use some tuning. A linear blend
shouldn't take much more than 1/2 a second even with dithering.
Could we improve this by preparing a reasonably fine
Hi,
Jay Cox [EMAIL PROTECTED] writes:
Now that this race condition is eliminated I might look into adding
hooks to the pixel-processor to allow initialisation of per-thread
data, like for example a GRand.
I think that is the correct way to do it. It should be done generally
enough so that
On 28.02.2005, at 00:25, Sven Neumann wrote:
The histogram code does already use the threaded pixel-processor. You
think we could simplify the code? IMO the current solution isn't all
that bad. But I haven't benchmarked it so I don't really know...
I tried to introduce the per-thread
Hi,
Jay Cox [EMAIL PROTECTED] writes:
Here are some results for you:
Dual 2.5ghz g5 mac, mac os x 10.3.8
CVS gimp Changelog revision 1.10539
Linear Gradient blend on a 3000x3000 pixel image (Dithering on)
1 processor:  7.98 seconds  (1x)
2 processors: 5.20 seconds  (1.5x)
3
On 26.02.2005, at 02:44, Jay Cox wrote:
Dual 2.5ghz g5 mac, mac os x 10.3.8
CVS gimp Changelog revision 1.10539
Linear Gradient blend on a 3000x3000 pixel image (Dithering on)
1 processor:  7.98 seconds  (1x)
2 processors: 5.20 seconds  (1.5x)
3 processors: 5.23 seconds  (1.5x)
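To put the numbers above in perspective, here is a small sketch that turns two timings into a speedup and an estimate of the parallelizable fraction. It assumes Amdahl's law is a fair model for this workload, which nobody in the thread has claimed; both function names are made up for illustration.

```c
/* Speedup of the n-processor run relative to the single-processor run. */
double speedup(double t1, double tn)
{
    return t1 / tn;
}

/* Estimate the parallelizable fraction p from Amdahl's law,
 *     tn = t1 * ((1 - p) + p / n),
 * solved for p. Requires n > 1. */
double parallel_fraction(double t1, double tn, int n)
{
    return (1.0 - tn / t1) / (1.0 - 1.0 / (double)n);
}
```

With 7.98 s on one processor and 5.20 s on two, this gives a speedup of about 1.53 and p of about 0.70. Under that model roughly 30% of the run is effectively serialized.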
Hi again,
Jay Cox [EMAIL PROTECTED] writes:
Clearly the gradient code could use some tuning. A linear blend
shouldn't take much more than 1/2 a second even with dithering.
Could we improve this by preparing a reasonably fine array
of samples and picking from these samples instead of calling
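The sample-array idea could look roughly like the sketch below. ColorAtFunc is a hypothetical callback standing in for gimp_gradient_get_color_at(), whose real signature differs, and linear_gray() is only a toy gradient for demonstration.

```c
#include <stddef.h>

typedef struct { double r, g, b, a; } Color;

/* Hypothetical stand-in for the expensive per-position lookup. */
typedef Color (*ColorAtFunc)(double pos, void *data);

/* Toy gradient for demonstration: black to white, fully opaque. */
Color linear_gray(double pos, void *data)
{
    (void)data;
    Color c = { pos, pos, pos, 1.0 };
    return c;
}

/* Evaluate the gradient once at n evenly spaced positions in [0, 1].
 * Requires n >= 1. */
void gradient_fill_samples(Color *samples, size_t n,
                           ColorAtFunc color_at, void *data)
{
    for (size_t i = 0; i < n; i++) {
        double pos = (n > 1) ? (double)i / (double)(n - 1) : 0.0;
        samples[i] = color_at(pos, data);
    }
}

/* Per-pixel lookup: clamp pos to [0, 1] and return the nearest sample. */
Color gradient_sample(const Color *samples, size_t n, double pos)
{
    if (pos < 0.0) pos = 0.0;
    if (pos > 1.0) pos = 1.0;
    size_t i = (size_t)(pos * (double)(n - 1) + 0.5);
    return samples[i];
}
```

The table is filled once per blend, so the per-pixel cost drops to a clamp and an array index. How fine the array has to be before the result is indistinguishable from direct evaluation would need testing.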
On Wed, 2005-02-16 at 22:42 +0100, Sven Neumann wrote:
Hi,
I couldn't resist and changed the PixelProcessor to use a thread pool.
Main motivation was to make the progress callback work for the threaded
case. So there's now a variant of pixel_regions_process_parallel()
that takes progress
Hi,
David Bonnell [EMAIL PROTECTED] writes:
If each thread obtains an (exclusive) lock on the pixel region then
the tasks will effectively be serialized and overall execution time
will increase compared to a non-threaded implementation due to the
threading overheads. (Queue manipulation,
Hi,
John Cupitt [EMAIL PROTECTED] writes:
FWIW, vips works by having a thread pool (rather than a tile queue)
and a simple for(;;) loop over tiles. At each tile, the for() loop
waits for a thread to become free, then assigns it a tile to work
on.
It would be trivial to change the GIMP code
On Sun, Feb 20, 2005 at 11:52:16PM +, Adam D. Moss wrote:
I can force it to use both CPUs now, but even with
200% utilization it is 2s slower to run this stupid
ubenchmark than on 1 CPU without threads.
Just a vague guess, but the multiprocessor GIMP pixel
work scheduler *might* farm
On 21.02.2005, at 03:14, [EMAIL PROTECTED] (Marc A. Lehmann) wrote:
Forcing the NPTL implementation to degrade to legacy
BTW, is this 2.4 or 2.6?
Linux ulli 2.6.9-k8 #31 SMP Wed Nov 3 10:58:29 CET 2004 x86_64 GNU/Linux
32bit userland from Debian unstable.
Servus,
Daniel
of memory at the
same time where possible.
-Dave
-Original Message-
From: [EMAIL PROTECTED]
[mailto:[EMAIL PROTECTED] On Behalf Of Tino
Schwarze
Sent: Monday, 21 February 2005 6:14 PM
To: gimp-developer@lists.xcf.berkeley.edu
Subject: Re: [Gimp-developer] GIMP and multiple processors
On Mon, Feb 21, 2005 at 09:14:13AM +0100, Tino Schwarze [EMAIL PROTECTED]
wrote:
I can force it to use both CPUs now, but even with
200% utilization it is 2s slower to run this stupid
ubenchmark than on 1 CPU without threads.
Just a vague guess, but the multiprocessor GIMP pixel
work
On Mon, 21 Feb 2005 02:00:39 +0100, Sven Neumann [EMAIL PROTECTED] wrote:
It sounds like the granularity of parallelism is too fine. That is,
each task is too short and the overhead of task dispatching (your
task queue processing, the kernel's thread context switching, any IPC
required,
On Sun, 2005-02-20 at 23:52 +, Adam D. Moss wrote:
Daniel Egger wrote:
I can force it to use both CPUs now, but even with
200% utilization it is 2s slower to run this stupid
ubenchmark than on 1 CPU without threads.
Just a vague guess, but the multiprocessor GIMP pixel
work
On 21.02.2005, at 19:07, Jay Cox wrote:
I'm not sure what system the benchmark is being run on, but the cache
line size on a P4 is 128 bytes (most other systems have smaller cache
line sizes). A simple test to see if this is the problem would be to change
the tile allocation code to allocate an
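The message is cut off before it says what the allocation change would be. As one possible experiment in that spirit, the sketch below allocates tile buffers on cache-line boundaries so that no two tiles share a line. tile_alloc_aligned() and CACHE_LINE are hypothetical names, the 128-byte figure is the P4 value quoted above, and aligned_alloc() requires a C11 library.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 128   /* P4 figure quoted above; smaller on most other CPUs */

/* Allocate `size` bytes starting on a cache-line boundary. The size is
 * rounded up to a whole number of cache lines, as aligned_alloc()
 * requires, so the buffer's last line is not shared with a neighbour.
 * Returns NULL for size 0 or on failure. Release with free(). */
void *tile_alloc_aligned(size_t size)
{
    if (size == 0)
        return NULL;
    size_t rounded = (size + CACHE_LINE - 1) / CACHE_LINE * CACHE_LINE;
    return aligned_alloc(CACHE_LINE, rounded);
}
```

If the slowdown came from two CPUs writing to the same cache line, timings should improve with buffers allocated this way. If they do not change, false sharing between tiles is probably not the cause.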
] GIMP and multiple processors
...
This is unlikely, as the gimp has no say in the physical layout of the
memory it gets from the kernel.
___
Gimp-developer mailing list
Gimp-developer@lists.xcf.berkeley.edu
http://lists.xcf.berkeley.edu/mailman/listinfo
On Mon, 2005-02-21 at 21:34 +0100, Daniel Egger wrote:
On 21.02.2005, at 19:07, Jay Cox wrote:
I'm not sure what system the benchmark is being run on, but the cache
line size on a P4 is 128 bytes (most other systems have smaller cache
line sizes). A simple test to see if this is the
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
Hm, there's still no idea floating around how to benchmark.
There are very clear ideas on how to do it. Someone just needs to sit
down and write the script (or dig out script-fu-bench.scm which is
what we used to use a long time ago).
I'd rather
On 20.02.2005, at 14:09, Sven Neumann wrote:
There are very clear ideas on how to do it.
Hm, must have missed that...
Someone just needs to sit down and write the script (or dig out
script-fu-bench.scm which is what we used to use a long time ago).
You'd still do me a favor if you would try
Sven Neumann wrote:
Daniel Egger [EMAIL PROTECTED] writes:
Hm, there's still no idea floating around how to benchmark.
There are very clear ideas on how to do it. Someone just needs to sit
down and write the script (or dig out script-fu-bench.scm which is
what we used to use a long time
On 20.02.2005, at 14:09, Sven Neumann wrote:
You'd still do me a favor if you would try current CVS and told me
whether it feels faster or not.
It's slower, measurably and reproducibly slower.
As a benchmark I used a gradient fill in a 3000x3000px (68.8M)
image. I consistently get times of 8s for
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
It's slower, measurably and reproducibly slower.
As a benchmark I used a gradient fill in a 3000x3000px (68.8M)
image. I consistently get times of 8s for 1 thread and between
9.2s and 9.6s for 2 threads. With a running application, after a
restart
On 20.02.2005, at 21:55, Sven Neumann wrote:
As a benchmark I used a gradient fill in a 3000x3000px (68.8M)
image. I consistently get times of 8s for 1 thread and between
9.2s and 9.6s for 2 threads. With a running application, after a
restart -- doesn't matter.
What is strange though, is that it
Hi,
Daniel Egger [EMAIL PROTECTED] writes:
What is strange though, is that it only seems to use one CPU
for both threads; maybe a stupid gthread implementation?
Since gthread is just a very thin wrapper around pthreads, that would
mean that it's a stupid pthread implementation. To me this
On Sun, Feb 20, 2005 at 10:55:18PM +0100, Sven Neumann [EMAIL PROTECTED]
wrote:
mean that it's a stupid pthread implementation. To me this looks like
the kernel believes that it would be better to keep the threads local
than to move one to the other CPU.
Linux will not keep two threads
On 20.02.2005, at 22:55, Sven Neumann wrote:
Since gthread is just a very thin wrapper around pthreads, that would
mean that it's a stupid pthread implementation. To me this looks like
the kernel believes that it would be better to keep the threads local
than to move one to the other CPU. I wonder
On 20.02.2005, at 23:47, [EMAIL PROTECTED] (Marc A. Lehmann) wrote:
Linux will not keep two threads running on a single cpu if both are
ready and nothing else is running, regardless of locality etc., as the
kernel lacks the tools to effectively decide whether threads should stay
on a cpu or
Daniel Egger wrote:
I can force it to use both CPUs now, but even with
200% utilization it is 2s slower to run this stupid
ubenchmark than on 1 CPU without threads.
Just a vague guess, but the multiprocessor GIMP pixel
work scheduler *might* farm alternating tiles to alternating
CPUs. These are
Egger
Sent: Monday, 21 February 2005 9:13 AM
To: [EMAIL PROTECTED] (Marc A. Lehmann)
Cc: Sven Neumann; Developer gimp-devel
Subject: Re: [Gimp-developer] GIMP and multiple processors
On 20.02.2005, at 23:47, [EMAIL PROTECTED] (Marc A. Lehmann) wrote:
Linux will not keep two threads
Hi,
David Bonnell [EMAIL PROTECTED] writes:
It sounds like the granularity of parallelism is too fine. That is,
each task is too short and the overhead of task dispatching (your
task queue processing, the kernel's thread context switching, any IPC
required, etc.) is longer than the duration
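If granularity is the problem, one simple remedy is to hand each task a contiguous run of tiles instead of a single tile. The helper below is only a sketch with a made-up name, chunk_range(). It says nothing about how the GIMP pixel processor actually queues work.

```c
/* Compute the half-open range [*first, *end) of tile indices for task
 * `k` when `n_tiles` tiles are split into `n_tasks` contiguous chunks
 * whose sizes differ by at most one tile.
 * Requires n_tasks > 0 and 0 <= k < n_tasks. */
void chunk_range(int n_tiles, int n_tasks, int k, int *first, int *end)
{
    int base  = n_tiles / n_tasks;
    int extra = n_tiles % n_tasks;   /* the first `extra` chunks get one more tile */

    *first = k * base + (k < extra ? k : extra);
    *end   = *first + base + (k < extra ? 1 : 0);
}
```

Fewer, larger tasks mean the dispatch overhead is paid once per chunk rather than once per tile. The cost is coarser load balancing when some tiles take longer than others.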
-developer] GIMP and multiple processors
Hi,
David Bonnell [EMAIL PROTECTED] writes:
It sounds like the granularity of parallelism is too fine. That is,
each task is too short and the overhead of task dispatching (your
task queue processing, the kernel's thread context switching, any IPC
On 16.02.2005, at 22:42, Sven Neumann wrote:
It would be interesting to know if this actually gives a noticeable
speedup on SMP systems. Would be nice if one of you could give this
some testing. Please try to do gradient blends (w/o adaptive
supersampling!) on large images. Changing the number of
Hi,
I couldn't resist and changed the PixelProcessor to use a thread pool.
Main motivation was to make the progress callback work for the threaded
case. So there's now a variant of pixel_regions_process_parallel()
that takes a progress function and data. This allows us to parallelize
some of the slower
On Sun, 13 Feb 2005 22:10:21 +0100, Sven Neumann [EMAIL PROTECTED] wrote:
Also tried to port the code to GThreadPool but it turned out to be not
as trivial as I expected. The current code blocks until all threads
are returned and it is not trivial to implement this behaviour with
GThreadPool.
On Tue, 15 Feb 2005 16:29:00 -0500, Nathan Summers [EMAIL PROTECTED] wrote:
if (g_thread_pool_get_num_threads(synch->pool) == 1)
correction: I meant if (g_thread_pool_get_num_threads(synch->pool) == 1 &&
g_thread_pool_unprocessed(synch->pool) == 0)
although the
Hi,
Nathan Summers [EMAIL PROTECTED] writes:
Here's a solution that would work: (note, untested code, error
checking and uninteresting stuff omitted for clarity, etc)
Yes, that would most probably work.
struct SynchStuff {
GThreadPool *pool;
GCond *cond;
GMutex *mutex;
};
Hi,
a quick followup to myself...
I couldn't resist and spent some time on the threaded pixel-processor
code. The first part of the TODO I posted yesterday is done, the code
has been ported to gthread. This makes the thread functionality
available on all platforms supported by gthread-2.0.
I
On 13.02.2005, at 22:10, Sven Neumann wrote:
I couldn't resist and spent some time on the threaded pixel-processor
code. The first part of the TODO I posted yesterday is done, the code
has been ported to gthread. This makes the thread functionality
available on all platforms supported by
On 13.02.2005, at 22:43, Sven Neumann wrote:
Doing benchmarks is exactly what would have to be done at this
point. The main problem is probably what to use as a benchmark.
One could write a script-fu and run that using batch-mode.
I've been using the coffee stain effect fu in the past but
this is
Hi,
since hyperthreading and multi-core CPUs are becoming more and more
common, I think we should put a little more focus on making use of
these features. For that reason, I have made --enable-mp the default
in CVS HEAD and also changed the gimprc preference so that GIMP uses 2
processors. This