Re: [UFRaw-Devel] Severe performance issues when rendering previews in UFRaw 0.15

Bruce Guenter Fri, 16 Oct 2009 09:52:01 -0700

On Thu, Oct 15, 2009 at 12:50:08AM +0200, Andreas Sandberg wrote:
> I guess one possibility would be to find the areas prior to starting the
> parallel region.


That's possible, but tricky.

> Don't know if there is any noticeable improvement in
> the user experience from this, in fact I hardly notice any difference
> between running with 1 thread compared to 4 threads (I'm running on a
> Core2 Quad Q9450).

I did measure it and found a small improvement in responsiveness, but I
don't recall how much.  It would likely show up most at 100% zoom (which
only existed in the devel version when I started writing this reply).

> Doesn't the 'omp parallel for' in develop() create additional threads
> when called from render_preview_image()?

It shouldn't.  The tutorial I read indicated that nested directives are
handled intelligently -- if there are already the maximum number of
threads operating, no more are started.  However, some of this may be
implementation defined.
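
One way to check this on a particular compiler is to count the threads an inner region actually receives. The following is a minimal sketch, not UFRaw code: inner_team_size() and max_inner_team_size() are hypothetical stand-ins for develop() and render_preview_image(). It assumes an OpenMP 3.1 compiler (for the max reduction); built without OpenMP, the pragmas are ignored and it simply reports 1.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for develop(): opens its own parallel region and
 * returns the number of threads that region was given. */
static int inner_team_size(void)
{
    int size = 1;
#ifdef _OPENMP
    #pragma omp parallel
    {
        /* One thread of the inner team records the team size. */
        #pragma omp single
        size = omp_get_num_threads();
    }
#endif
    return size;
}

/* Hypothetical stand-in for render_preview_image(): calls the inner
 * function from an outer parallel loop and returns the largest inner
 * team observed. A non-zero cap_nesting limits OpenMP to one active
 * level, so inner regions nested in an active outer region run serially. */
int max_inner_team_size(int n_tiles, int cap_nesting)
{
    int largest = 1;
    (void)cap_nesting; /* unused when built without OpenMP */
#ifdef _OPENMP
    if (cap_nesting)
        omp_set_max_active_levels(1);
#endif
    #pragma omp parallel for reduction(max:largest)
    for (int i = 0; i < n_tiles; i++) {
        int s = inner_team_size();
        if (s > largest)
            largest = s;
    }
    return largest;
}
```

A result of 1 means the inner regions ran on the calling thread only; anything larger means the runtime did start extra threads for the nested region.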

> It seems like ufraw_write_image_data() parallelizes over the rows of the
> image. If ufraw_write_image_data() and render_preview_image() are
> both parallelized, what is the reason for having a 'parallel for' in the
> develop() procedure?

It isn't needed any more, unless you are running on a massively
multi-core system (in which case the memory transfer overhead probably
swamps the speedups).  I think it was initially put there before the
parallelism in ufraw_write_image_data and render_preview_image was
complete.

Given the concerns about nested OpenMP directives, you could take them
out and test if there is any improvement.
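
A possible middle ground between keeping and removing the inner directive is the standard OpenMP 'if' clause combined with omp_in_parallel(). The sketch below is illustrative only: scale_rows() is a hypothetical stand-in for the loop in develop(), not the real function, and the per-pixel work is a placeholder.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/* Hypothetical stand-in for the per-pixel loop in develop(). */
void scale_rows(float *pixels, int count, float gain)
{
    /* Start a new team only when the caller is not already inside an
     * active parallel region; otherwise the loop runs on the calling
     * thread. Without OpenMP the pragma is ignored entirely. */
    #pragma omp parallel for if(!omp_in_parallel())
    for (int i = 0; i < count; i++)
        pixels[i] *= gain;
}
```

Called from a parallelized caller, the loop stays serial; called from serial code, it still gets its own team, so the directive does not have to be removed outright to run the comparison.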

> I did some measurements on a batch conversion of 19 raw images (6 MP).
> And got the following data (best time from two runs, CVS HEAD):
> Threads  Run time (s)  Speedup
>       1         46.66     1.00
>       2         33.53     1.39
>       3         26.14     1.79
>       4         26.86     1.74
> 
> I would expect the speedup to decrease further when going beyond 4
> threads, but I haven't got any easy access to systems with more cores at
> the moment.

Actually, due to scheduler quirks, you can still get speedups when going
past one thread per core.

> Haven't really had time to analyze where the bottlenecks
> are, but I guess that makes for a nice project for a rainy day... :)

There are some parts of the file writing that simply must happen in
sequence (for example, writing out the JPEG file), so it will never
scale perfectly.
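
To put a rough number on that, the quoted timings can be run through Amdahl's law. This is a sketch under the usual simplifying assumptions, and the function names are made up for illustration. serial_fraction() is the Karp-Flatt estimate, which lumps true serial code together with overheads such as memory contention, so it is an upper-level indicator rather than a measurement of the JPEG writer alone. With the batch numbers above it comes out at roughly 0.34 to 0.44.

```c
/* Speedup of a run on p threads relative to the single-thread time. */
double speedup(double t1, double tp)
{
    return t1 / tp;
}

/* Karp-Flatt estimate of the serial fraction from measured times:
 * e = (1/S - 1/p) / (1 - 1/p), where S = t1/tp. Requires p > 1. */
double serial_fraction(double t1, double tp, int p)
{
    double inv_s = tp / t1;
    double inv_p = 1.0 / p;
    return (inv_s - inv_p) / (1.0 - inv_p);
}

/* Amdahl's law: best-case speedup on p threads when a fraction e of
 * the work cannot be parallelized. */
double amdahl_speedup(double e, int p)
{
    return 1.0 / (e + (1.0 - e) / p);
}
```

For example, serial_fraction(46.66, 26.14, 3) uses the 3-thread row, and feeding the result into amdahl_speedup() with a larger p shows how quickly the curve flattens.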

-- 
Bruce Guenter <[email protected]>                http://untroubled.org/
