On 12/17/19 12:38 AM, Michael Rasmussen wrote:
CPU: AMD Ryzen 1700 overclocked to 3.8 GHz
Memory: 32 GB DDR 4 at 3200 MHz

So using a less capable GPU than you I am able to do
39,681519 [dev_process_export] pixel pipeline processing took 38,916
secs (360,590 CPU)

And you are able to do
53,785899 [dev_process_export] pixel pipeline processing took 53,032
secs (130,516 CPU)

That is more or less 20% faster than you. The only reason for this is
that you are using a very old version of the Nvidia driver (390.116)
while I am using 430.64. I suspect that if you used the same Nvidia
driver version as me, you would be able to cut between 20 and 30
seconds off your processing time.

Hi,

I disagree with your conclusion, at least the driver part. You have an 8-core CPU, as does Al, who also sees massive improvements in speed.

If the driver played such a massive role, my performance should be better: I need 35-40 s, and I am using version 440.36. The relevant lines from my run are below.

Memory might be an issue, but perhaps only because the GPU does not allow all of its memory to be allocated (in my case the limit is 1482MB)? If I calculated correctly, we would need about 1900MB for these three modules. This would underline the importance of the CPU.
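
One rough way to put numbers on the CPU side is to divide the CPU seconds by the wall-clock seconds in each "pixel pipeline processing took" line, which gives the average number of cores kept busy during the export. The sketch below only uses the figures quoted in this thread; the labels and the helper name busy_cores are mine, and it says nothing about why the runs differ.

```python
# (wall seconds, CPU seconds) taken from the "pixel pipeline processing took"
# lines quoted in this thread. The labels are descriptive only.
runs = {
    "Michael (driver 430.64)": (38.916, 360.590),
    "run Michael compares against (driver 390.116)": (53.032, 130.516),
    "Holger (driver 440.36)": (35.754, 191.942),
}


def busy_cores(wall, cpu):
    """Average number of CPU cores kept busy: CPU time divided by wall time."""
    return cpu / wall


ratios = {name: busy_cores(wall, cpu) for name, (wall, cpu) in runs.items()}

for name, ratio in ratios.items():
    wall, _ = runs[name]
    print(f"{name}: {wall:.1f} s wall, {ratio:.2f} cores busy on average")
```

The slowest of the three runs is also the one that keeps the fewest cores busy, which is at least consistent with the CPU playing a large part.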


Maybe someone with more OpenCL knowledge can explain this memory allocation limit to me?
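
For what it is worth, a quick check of the numbers in my log: the reported limit is almost exactly a quarter of the global memory. If I read the OpenCL 1.2 specification correctly, the per-allocation limit (CL_DEVICE_MAX_MEM_ALLOC_SIZE) is only required to be at least a quarter of CL_DEVICE_GLOBAL_MEM_SIZE, so this may simply be that limit. That darktable's message reports this value, and that the 1900MB would have to fit in a single allocation, are both assumptions on my part.

```python
# Values copied from the opencl_init output in this mail.
global_mem_mb = 5931   # GLOBAL_MEM_SIZE
max_alloc_mb = 1482    # "allows GPU memory allocations of up to 1482MB"

# My rough estimate for the three modules (see above), not a measured value.
needed_mb = 1900

# How large is the allocation limit relative to the total device memory?
fraction = max_alloc_mb / global_mem_mb

# One quarter of global memory, rounded down to whole MB.
quarter_mb = global_mem_mb // 4

# Would the estimate fit into one allocation of the maximum size?
fits_in_one_allocation = needed_mb <= max_alloc_mb

print(f"limit / global memory = {fraction:.3f}")
print(f"quarter of global memory = {quarter_mb} MB")
print(f"estimate fits in one allocation: {fits_in_one_allocation}")
```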


Regards,

Holger


My run:

0.035386 [opencl_init] device 0 `GeForce RTX 2060' has sm_20 support.
0.035507 [opencl_init] device 0 `GeForce RTX 2060' supports image sizes of 32768 x 32768
0.035511 [opencl_init] device 0 `GeForce RTX 2060' allows GPU memory allocations of up to 1482MB
[opencl_init] device 0: GeForce RTX 2060
     GLOBAL_MEM_SIZE:          5931MB
     MAX_WORK_GROUP_SIZE:      1024
     MAX_WORK_ITEM_DIMENSIONS: 3
     MAX_WORK_ITEM_SIZES:      [ 1024 1024 64 ]
     DRIVER_VERSION:           440.36
     DEVICE_VERSION:           OpenCL 1.2 CUDA

36.494093 [opencl_profiling] profiling device 0 ('GeForce RTX 2060'):
36.494098 [opencl_profiling] spent  0.3877 seconds in [Write Image (from host to device)]
36.494102 [opencl_profiling] spent  0.0014 seconds in rawprepare_1f
36.494105 [opencl_profiling] spent  0.0015 seconds in whitebalance_1f
36.494108 [opencl_profiling] spent  0.0014 seconds in highlights_1f_clip
36.494111 [opencl_profiling] spent  1.2340 seconds in [Read Image (from device to host)]
36.494114 [opencl_profiling] spent  0.0108 seconds in denoiseprofile_precondition
36.494117 [opencl_profiling] spent  0.3176 seconds in denoiseprofile_decompose
36.494120 [opencl_profiling] spent  0.0206 seconds in denoiseprofile_reduce_first
36.494123 [opencl_profiling] spent  0.0001 seconds in denoiseprofile_reduce_second
36.494126 [opencl_profiling] spent  0.0000 seconds in [Read Buffer (from device to host)]
36.494129 [opencl_profiling] spent  0.0697 seconds in denoiseprofile_synthesize
36.494133 [opencl_profiling] spent  0.0667 seconds in [Copy Image (on device)]
36.494136 [opencl_profiling] spent  0.0094 seconds in denoiseprofile_backtransform
36.494139 [opencl_profiling] spent  0.0013 seconds in blendop_set_mask
36.494142 [opencl_profiling] spent  0.0455 seconds in blendop_rgb
36.494145 [opencl_profiling] spent  0.0204 seconds in exposure
36.494148 [opencl_profiling] spent  0.0151 seconds in blendop_mask_rgb
36.494151 [opencl_profiling] spent  0.0068 seconds in colorin_unbound
36.494154 [opencl_profiling] spent  0.0046 seconds in vibrance
36.494157 [opencl_profiling] spent  0.0059 seconds in filmic
36.494159 [opencl_profiling] spent  0.0063 seconds in colisa
36.494162 [opencl_profiling] spent  0.0310 seconds in blendop_mask_Lab
36.494165 [opencl_profiling] spent  0.0403 seconds in blendop_Lab
36.494168 [opencl_profiling] spent  0.0263 seconds in tonecurve
36.494172 [opencl_profiling] spent  0.0065 seconds in colorcorrection
36.494174 [opencl_profiling] spent  0.0070 seconds in sharpen_hblur
36.494177 [opencl_profiling] spent  0.0053 seconds in sharpen_vblur
36.494180 [opencl_profiling] spent  0.0088 seconds in sharpen_mix
36.494183 [opencl_profiling] spent  0.0085 seconds in colorout
36.494186 [opencl_profiling] spent  0.0064 seconds in velvia
36.494189 [opencl_profiling] spent  2.3669 seconds totally in command queue (with 0 events missing)
36.494200 [dev_process_export] pixel pipeline processing took 35.754 secs (191.942 CPU)
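
To show how small the GPU share of this export is, here is a minimal sketch that parses a few of the profiling lines above and compares the command queue total with the wall-clock time. The regular expressions are my own guess at the line format based only on this log, not anything taken from darktable, and the sample is a subset of the output, not the full log.

```python
import re

# A few lines copied from the profiling output above (not the full log).
log = """\
36.494098 [opencl_profiling] spent  0.3877 seconds in [Write Image (from host to device)]
36.494111 [opencl_profiling] spent  1.2340 seconds in [Read Image (from device to host)]
36.494126 [opencl_profiling] spent  0.0000 seconds in [Read Buffer (from device to host)]
36.494189 [opencl_profiling] spent  2.3669 seconds totally in command queue (with 0 events missing)
36.494200 [dev_process_export] pixel pipeline processing took 35.754 secs (191.942 CPU)
"""

# Assumed line formats, derived from the log in this mail.
spent = re.compile(r"spent\s+([\d.]+) seconds (totally in|in) (.+)")
took = re.compile(r"processing took ([\d.]+) secs")

transfer = 0.0      # time in host <-> device copies
queue_total = 0.0   # total time in the OpenCL command queue
wall = 0.0          # wall-clock time of the whole pixel pipeline

for line in log.splitlines():
    m = spent.search(line)
    if m:
        seconds, kind, what = float(m.group(1)), m.group(2), m.group(3)
        if kind == "totally in":
            queue_total = seconds
        elif "host" in what:
            transfer += seconds
    m = took.search(line)
    if m:
        wall = float(m.group(1))

gpu_share = queue_total / wall
transfer_share = transfer / queue_total

print(f"command queue: {queue_total:.4f} s of {wall:.3f} s wall ({gpu_share:.1%})")
print(f"host/device transfers: {transfer:.4f} s ({transfer_share:.1%} of queue time)")
```

In my run the command queue accounts for only about 2.4 s of the roughly 35.8 s export, and most of that is spent copying images between host and device, so the bulk of the time is spent elsewhere.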


____________________________________________________________________________
darktable user mailing list
to unsubscribe send a mail to [email protected]
