There is no guarantee that any given graphics card performs faster than a CPU. The opposite may be true if you have a rather fast rig and a low-performing GPU.
I do not know the GeForce 610; however, judging from my experience with a 660 Ti, I expect the Nvidia 6xx series to be relatively weak at OpenCL for its price. I suggest you have a look at the various GPU benchmark figures you can find on the web. Pay close attention to benchmarks that deal with GPU compute rather than gaming. One possible source might be http://www.videocardbenchmark.net/

That said: be assured that darktable does not process anything twice on CPU and GPU. It is in the nature of how the profiling data are collected that you see figures for the OpenCL kernels as well as for the time spent in the modules. The most reliable figure is the total time spent in the pixelpipe.

In your case it seems that your GPU heavily underperforms in profiled denoise, namely in the non-local means processing. Interesting, but probably linked to your graphics card. Knowing the algorithm used, I assume that the card has a really low memory bandwidth.

Best wishes

Ulrich

On 10.04.2014 11:16, Hans Petter Birkeland wrote:
> I have now got a new graphics card, an Nvidia GeForce 610 with 2 GB
> memory. It is capable of using OpenCL, so I thought I would get a real
> performance boost. But the fact is that I see no improvement at all.
> Instead it seems to slow things down. I started Darktable with
> "darktable -d opencl -d perf" and then I opened an image with certain
> things done to it, among them Profiled Denoise. Simply zooming the
> image to 100% (middle click) takes around 6.5 seconds with OpenCL, and
> 2.5 seconds without! This looks very weird to me. I also had a look in
> the manual, in the section about optimizing OpenCL. Most things there
> seem to be about AMD cards, and the things I tried did not have any effect.
>
> Here is the terminal output for the zoom with OpenCL enabled:
>
> ---
> [dev] took 0,000 secs (0,000 CPU) to load the image.
> [pixelpipe_process] [full] using device 0
> [dev_pixelpipe] took 0,003 secs (0,004 CPU) initing base buffer [full]
> [dev_pixelpipe] took 0,005 secs (0,000 CPU) processing `white balance' [full]
> [dev_pixelpipe] took 0,004 secs (0,000 CPU) processing `highlight reconstruction' [full]
> [dev_pixelpipe] took 0,032 secs (0,020 CPU) processing `demosaic' [full]
> [dev_pixelpipe] took 6,114 secs (2,636 CPU) processing `denoise (profiled)' [full]
> [dev_pixelpipe] took 0,256 secs (0,152 CPU) processing `lens correction' [full]
> [dev_pixelpipe] took 0,018 secs (0,004 CPU) processing `base curve' [full]
> [dev_pixelpipe] took 0,010 secs (0,008 CPU) processing `input color profile' [full]
> [dev_pixelpipe] took 0,033 secs (0,004 CPU) processing `sharpen' [full]
> [dev_pixelpipe] took 0,027 secs (0,016 CPU) processing `output color profile' [full]
> [dev_pixelpipe] took 0,005 secs (0,000 CPU) processing `overexposed' [full]
> [dev_pixelpipe] took 0,013 secs (0,016 CPU) processing `gamma' [full]
> [opencl_profiling] spent 0,0004 seconds in [Write Image (from host to device)]
> [opencl_profiling] spent 0,0024 seconds in whitebalance_1ui
> [opencl_profiling] spent 0,0027 seconds in highlights_1f
> [opencl_profiling] spent 0,0075 seconds in ppg_demosaic_green
> [opencl_profiling] spent 0,0177 seconds in ppg_demosaic_redblue
> [opencl_profiling] spent 0,0057 seconds in border_interpolate
> [opencl_profiling] spent 0,0081 seconds in denoiseprofile_precondition
> [opencl_profiling] spent 0,0032 seconds in denoiseprofile_init
> [opencl_profiling] spent 0,9002 seconds in denoiseprofile_dist
> [opencl_profiling] spent 0,3109 seconds in denoiseprofile_horiz
> [opencl_profiling] spent 3,3197 seconds in denoiseprofile_vert
> [opencl_profiling] spent 1,4256 seconds in denoiseprofile_accu
> [opencl_profiling] spent 0,0108 seconds in denoiseprofile_finish
> [opencl_profiling] spent 0,0075 seconds in [Copy Image (on device)]
> [opencl_profiling] spent 0,0120 seconds in [Write Buffer (from host to device)]
> [opencl_profiling] spent 0,1920 seconds in lens_distort_lanczos3
> [opencl_profiling] spent 0,0160 seconds in basecurve
> [opencl_profiling] spent 0,0075 seconds in colorin
> [opencl_profiling] spent 0,0107 seconds in sharpen_hblur
> [opencl_profiling] spent 0,0082 seconds in sharpen_vblur
> [opencl_profiling] spent 0,0118 seconds in sharpen_mix
> [opencl_profiling] spent 0,0247 seconds in colorout
> [opencl_profiling] spent 0,0081 seconds in [Read Image (from device to host)]
> [opencl_profiling] spent 6,3135 seconds totally in command queue (with 0 events missing)
> [dev_process_image] pixel pipeline processing took 6,621 secs (2,972 CPU)
> ---
>
> And here is the same without OpenCL:
>
> ---
> [dev] took 0,000 secs (0,000 CPU) to load the image.
> [pixelpipe_process] [full] using device -1
> [dev_pixelpipe] took 0,004 secs (0,008 CPU) initing base buffer [full]
> [dev_pixelpipe] took 0,001 secs (0,000 CPU) processing `white balance' [full]
> [dev_pixelpipe] took 0,001 secs (0,000 CPU) processing `highlight reconstruction' [full]
> [dev_pixelpipe] took 0,018 secs (0,036 CPU) processing `demosaic' [full]
> [dev_pixelpipe] took 2,312 secs (4,556 CPU) processing `denoise (profiled)' [full]
> [dev_pixelpipe] took 0,222 secs (0,420 CPU) processing `lens correction' [full]
> [dev_pixelpipe] took 0,008 secs (0,016 CPU) processing `base curve' [full]
> [dev_pixelpipe] took 0,013 secs (0,016 CPU) processing `input color profile' [full]
> [dev_pixelpipe] took 0,028 secs (0,040 CPU) processing `sharpen' [full]
> [dev_pixelpipe] took 0,023 secs (0,028 CPU) processing `output color profile' [full]
> [dev_pixelpipe] took 0,004 secs (0,004 CPU) processing `overexposed' [full]
> [dev_pixelpipe] took 0,004 secs (0,008 CPU) processing `gamma' [full]
> [dev_process_image] pixel pipeline processing took 2,741 secs (5,236 CPU)
> ---
>
> To me it looks like when using OpenCL, everything is first done in the
> CPU, then redone in the GPU? That can't be very effective. Any ideas? Am
> I doing something wrong?
>
> Hans Petter
>
> http://hpbirkeland.com
>
>
> 2014-04-07 10:52 GMT+02:00 Hans Petter Birkeland <[email protected]>:
>
> Ok, thank you. Then I'll probably go for this, and I might also
> consider getting a better graphics card.
>
> On 7 Apr 2014 10:49, "Rob Z. Smith" <[email protected]> wrote:
>
> I used to run a 1920x1200 screen on dt on Core 2 Duo and it was
> absolutely fine for speed. I did have a mid-level graphics card
> in the box though, which would have speeded things up. I've
> since swapped the processor out for a quad core, which ran
> marginally but not noticeably faster, and a modern graphics card,
> which made a larger difference, but it was plenty quick enough in
> its original state at the larger resolution.
>
> Rgds,
>
> Rob.
>
> *From:* Hans Petter Birkeland [mailto:[email protected]]
> *Sent:* 06 April 2014 10:05
> *To:* [email protected]
> *Cc:* darktable-users
> *Subject:* Re: [Darktable-users] Screen resolution and Darktable
>
> But then there is no point in a bigger screen anyway...
>
> On 6 Apr 2014 01:26, <[email protected]> wrote:
>
> You can limit the maximum displayed (and processed, as far as I
> know) pixel dimensions in the global options, so no need to
> limit your screen size for that reason. :)
>
> On 2014-04-05 [email protected] wrote:
> > Hi all,
> > just a quick question. When working in Darktable, how much of a
> > picture is processed every time one adjusts something? Is it the
> > whole file or just the visible pixels? I mean, will it be slower on
> > a bigger screen with higher resolution?
> >
> > I use a somewhat old computer, with an Intel Core 2 Duo processor and
> > 1366x768 screen resolution.
> > Darktable is working all right on it, but I can't risk it being
> > slower. I am thinking about getting a 1920x1080 screen, but if that
> > means more data is being processed and things are going slower then
> > maybe I shouldn't. What do you think?
> >
> > Hans Petter
> >
> > http://hpbirkeland.com

_______________________________________________
Darktable-users mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/darktable-users
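
For anyone who wants to compare two such runs without reading the logs by eye, here is a rough Python sketch of the approach suggested in the reply: take the pixelpipe total as the figure to trust, and look at the per-module wall times only to find the bottleneck. It assumes each log entry sits on a single line (not wrapped by the mail client) and that the format is exactly as in the logs quoted in this thread, including the decimal comma. The names `parse_perf_log` and `slowest_module` are made up for this illustration and are not part of darktable.

```python
import re

# Patterns written against the log lines quoted in this thread; other
# darktable versions or locales may print a different format (assumption).
MODULE_RE = re.compile(
    r"\[dev_pixelpipe\] took (\d+[.,]\d+) secs \((\d+[.,]\d+) CPU\) "
    r"processing `([^']+)'"
)
TOTAL_RE = re.compile(
    r"\[dev_process_image\] pixel pipeline processing took (\d+[.,]\d+) secs"
)


def to_seconds(text):
    """Convert '6,114' (decimal comma) or '6.114' to a float."""
    return float(text.replace(",", "."))


def parse_perf_log(log):
    """Return ({module name: wall seconds}, pixelpipe total or None)."""
    modules = {}
    total = None
    for line in log.splitlines():
        m = MODULE_RE.search(line)
        if m:
            modules[m.group(3)] = to_seconds(m.group(1))
            continue
        t = TOTAL_RE.search(line)
        if t:
            total = to_seconds(t.group(1))
    return modules, total


def slowest_module(modules):
    """Return (name, seconds) of the module with the largest wall time."""
    return max(modules.items(), key=lambda item: item[1])


# A few lines copied from the two runs in this thread, unwrapped.
OPENCL_LOG = """\
[dev_pixelpipe] took 0,032 secs (0,020 CPU) processing `demosaic' [full]
[dev_pixelpipe] took 6,114 secs (2,636 CPU) processing `denoise (profiled)' [full]
[dev_pixelpipe] took 0,256 secs (0,152 CPU) processing `lens correction' [full]
[dev_process_image] pixel pipeline processing took 6,621 secs (2,972 CPU)
"""

CPU_LOG = """\
[dev_pixelpipe] took 0,018 secs (0,036 CPU) processing `demosaic' [full]
[dev_pixelpipe] took 2,312 secs (4,556 CPU) processing `denoise (profiled)' [full]
[dev_pixelpipe] took 0,222 secs (0,420 CPU) processing `lens correction' [full]
[dev_process_image] pixel pipeline processing took 2,741 secs (5,236 CPU)
"""

if __name__ == "__main__":
    gpu_modules, gpu_total = parse_perf_log(OPENCL_LOG)
    cpu_modules, cpu_total = parse_perf_log(CPU_LOG)
    print("slowest with OpenCL:", slowest_module(gpu_modules))
    print("slowest on CPU only:", slowest_module(cpu_modules))
    print("OpenCL total / CPU total: %.2f" % (gpu_total / cpu_total))
```

The `[opencl_profiling]` lines are deliberately ignored here: as explained in the reply, they cover the same work as the module timings, so adding them to the module times would count the work twice.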
