Hi,

the reason you got better performance with a higher opencl_memory_headroom
is that your old value was too low. darktable tried to allocate GPU memory
and failed (because the rest of your system was already using it), so the
image was processed on the CPU instead. Now that you have increased the
headroom, the allocation no longer fails and darktable uses the GPU. The
only time a lower value should increase performance is when it frees just
enough memory that darktable no longer needs to tile (e.g. when the
pipeline was only ~100MB short).
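To illustrate the arithmetic (a rough sketch; the VRAM figure is taken
from the nvidia-smi output further down, the headroom value is just an
example): with 5941MiB of VRAM and a headroom of 400, darktable leaves
400MiB to the rest of the system and treats only the remainder as
available for its own buffers. The relevant lines in darktablerc look
roughly like this:

  opencl=TRUE
  opencl_memory_headroom=400

If other processes (Xorg, the window manager, ...) occupy more than the
headroom, allocations fail and processing falls back to the CPU.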
happy holidays,
Holger

On December 24, 2019 4:39:20 PM GMT+01:00, Jochen Keil <[email protected]> wrote:
> Hi,
>
> I just updated my nvidia drivers to 435 from 390. This didn't make a big
> difference for either 2.6.3 or 3.1.0~git9.962bc9ae3.
>
> So I got curious. My first attempt at 3.1.0 was to copy my config
> directory to /tmp and run darktable using the `--configdir` parameter.
>
> Darktable converted my config from 2.6.3 to 3.1.0, and that's what I
> used to run my benchmark with.
>
> Since the driver update showed no improvement, I figured I might as well
> try with an empty config directory and let darktable create a vanilla
> config.
>
> And guess what, all of a sudden tiling works on my GPU:
>
> [..]
> 5,941991 [default_process_tiling_cl_ptp] use tiling on module
> 'denoiseprofile' for image with full size 7967 x 4979
> 5,941999 [default_process_tiling_cl_ptp] (2 x 1) tiles with max
> dimensions 7096 x 4979 and overlap 128
> 5,942001 [default_process_tiling_cl_ptp] tile (0, 0) with 7096 x 4979 at
> origin [0, 0]
> 6,843400 [default_process_tiling_cl_ptp] tile (1, 0) with 1127 x 4979 at
> origin [6840, 0]
> 7,856860 [dev_pixelpipe] took 1,916 secs (3,798 CPU) processed `denoise
> (profiled)' on GPU with tiling, blended on CPU [export]
> [..]
>
> This brings the export time for this particular file down to ~20s from
> previously over 50s! That's amazing work, a massive thank you to all the
> developers!
>
> But what changed? I diff'ed the darktablerc files and changed the
> differing opencl parameters one by one until I found the parameter that
> made the tiling work.
>
> It turned out to be the `opencl_memory_headroom` parameter.
>
> In my old config it was set to `300`, in the new vanilla config it is
> set to `400`. Setting it to `400` in my old config finally did the
> trick.
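> For reference, the comparison boiled down to something like this (the
> paths are just illustrative, not my actual directories):
>
>   $ diff /tmp/dt-old/darktablerc ~/.config/darktable/darktablerc | grep headroom
>   < opencl_memory_headroom=300
>   > opencl_memory_headroom=400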
> I've read in a different mail that one should actually lower the value
> to see performance improvements. However, for me, the opposite seems to
> be the case. Maybe the increase was necessary to trigger the tiling?
>
> I've also noticed two other things:
>
> The demosaic module still runs on the CPU. Is that by design, i.e. is it
> not possible to move this module to the GPU?
>
> The local contrast module also still runs on the CPU. Same question
> here. I really like this module and use it a lot; any plans of bringing
> it to the GPU?
>
> Thank you and happy holidays,
>
>   Jochen
>
> On Mon, Dec 23, 2019 at 8:14 AM Ulrich Pegelow
> <[email protected]> wrote:
>
>> On 22.12.19 at 23:10, Аl Воgnеr wrote:
>> >> You might need to adjust parameter opencl_memory_headroom in
>> >> darktablerc.
>> >
>> > Thanks for reminding me to do this. Could you please tell me the
>> > exact name of the variable(s) to change?
>>
>> As previously written: opencl_memory_headroom
>>
>> > $ nvidia-smi
>> > Sun Dec 22 23:04:06 2019
>> > +-----------------------------------------------------------------------------+
>> > | NVIDIA-SMI 435.21       Driver Version: 435.21       CUDA Version: 10.1     |
>> > |-------------------------------+----------------------+----------------------+
>> > | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
>> > | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
>> > |===============================+======================+======================|
>> > |   0  GeForce GTX 1660    Off  | 00000000:2F:00.0  On |                  N/A |
>> > |  0%   39C    P0    23W / 130W |    676MiB /  5941MiB |      1%      Default |
>> > +-------------------------------+----------------------+----------------------+
>> >
>> > +-----------------------------------------------------------------------------+
>> > | Processes:                                                       GPU Memory |
>> > |  GPU       PID   Type   Process name                             Usage      |
>> > |=============================================================================|
>> > |    0      2482      G   /usr/lib/xorg/Xorg                           586MiB |
>> > |    0      3298      G   xfwm4                                          4MiB |
>> > |    0      4933      C   /usr/bin/darktable                            73MiB |
>> > +-----------------------------------------------------------------------------+
>>
>> Your output of nvidia-smi could indicate leaking memory. About 700MB of
>> VRAM is used, which seems high to me. However, I don't know the
>> requirements of xfwm4; maybe it's normal. If this memory usage is
>> stable you will need to set opencl_memory_headroom to 700 or 800.
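>> A quick way to see whether the usage is stable over time is
>> nvidia-smi's query mode, for example (field names may vary by driver
>> version; see nvidia-smi --help-query-gpu):
>>
>>   $ nvidia-smi --query-gpu=memory.used,memory.total --format=csv -l 5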
