On 12/17/19 12:38 AM, Michael Rasmussen wrote:
CPU: AMD Ryzen 1700 overclocked to 3.8 GHz
Memory: 32 GB DDR 4 at 3200 MHz
So using a less capable GPU than you I am able to do
39,681519 [dev_process_export] pixel pipeline processing took 38,916 secs (360,590 CPU)
And you are able to do
53,785899 [dev_process_export] pixel pipeline processing took 53,032 secs (130,516 CPU)
That is more or less 20% faster than you. The only reason for this is
that you are using a very old version of the Nvidia driver (390.116)
while I am using 430.64. I suspect that if you used the same Nvidia
driver version as me, you would be able to cut between 20 and 30
seconds off your processing time.
Hi,
I would disagree with your conclusion, at least with the driver part.
You have an 8-core CPU, as does Al, who also sees massive improvements
in speed.
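
As a rough check on the CPU argument, the sketch below divides the CPU
seconds by the wall-clock seconds of the two quoted runs. The result is
only an approximation of how many CPU threads were busy on average, and
the run labels are made up for this example; the timings are copied from
the quoted [dev_process_export] lines with decimal commas rewritten as
points.

```python
# Timings from the two quoted [dev_process_export] lines.
# Each entry is (wall-clock seconds, CPU seconds); labels are illustrative.
runs = {
    "quoted run, driver 430.64": (38.916, 360.590),
    "quoted run, driver 390.116": (53.032, 130.516),
}


def cpu_per_wall(wall, cpu):
    """CPU seconds per wall-clock second, i.e. average busy threads (approx.)."""
    return cpu / wall


for label, (wall, cpu) in runs.items():
    print(f"{label}: {wall:.3f} s wall, "
          f"about {cpu_per_wall(wall, cpu):.1f} threads busy on average")

fast_wall = runs["quoted run, driver 430.64"][0]
slow_wall = runs["quoted run, driver 390.116"][0]
saving = 1 - fast_wall / slow_wall
print(f"wall-clock time saved by the faster run: {saving:.0%}")
```

The faster run keeps roughly nine threads busy against roughly two and a
half for the slower one, so the two machines differ in far more than the
driver version.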
If the driver played such a massive role, my performance should be
better: I need 35-40 s and am using version 440.36. Below are the
relevant lines from my run.
Memory might be an issue, but perhaps only because the GPU doesn't allow
all of its memory to be allocated (in my case the limit is 1482MB). If I
calculated correctly, we would need about 1900MB for these three
modules. This would underline the importance of the CPU.
Maybe someone with more OpenCL knowledge can explain this memory
allocation limit to me?
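
One observation that may help: the 1482MB figure is almost exactly a
quarter of the 5931MB GLOBAL_MEM_SIZE in the log below. That would be
consistent with a cap on the size of a single allocation (OpenCL has
such a per-device limit, CL_DEVICE_MAX_MEM_ALLOC_SIZE) rather than a cap
on total GPU memory use. That darktable prints exactly this value is an
assumption here, and the 1900MB is the rough estimate from above, not a
measurement. A minimal sketch of the arithmetic:

```python
# Values taken from the [opencl_init] lines of the log in this message.
global_mem_mb = 5931   # GLOBAL_MEM_SIZE
max_alloc_mb = 1482    # "allows GPU memory allocations of up to 1482MB"
needed_mb = 1900       # rough estimate from the message, not measured

# Ratio of the reported allocation limit to total device memory.
ratio = max_alloc_mb / global_mem_mb
print(f"allocation limit / global memory = {ratio:.3f}")

# Does the estimate fit in device memory as a whole, and in one allocation?
fits_in_global = needed_mb <= global_mem_mb
fits_in_one_alloc = needed_mb <= max_alloc_mb
print(f"estimate fits in global memory:  {fits_in_global}")
print(f"estimate fits in one allocation: {fits_in_one_alloc}")
```

If the limit really is per allocation, the estimated 1900MB would only
be a problem if it had to live in a single buffer.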
Regards,
Holger
My run:
0.035386 [opencl_init] device 0 `GeForce RTX 2060' has sm_20 support.
0.035507 [opencl_init] device 0 `GeForce RTX 2060' supports image sizes of 32768 x 32768
0.035511 [opencl_init] device 0 `GeForce RTX 2060' allows GPU memory allocations of up to 1482MB
[opencl_init] device 0: GeForce RTX 2060
GLOBAL_MEM_SIZE: 5931MB
MAX_WORK_GROUP_SIZE: 1024
MAX_WORK_ITEM_DIMENSIONS: 3
MAX_WORK_ITEM_SIZES: [ 1024 1024 64 ]
DRIVER_VERSION: 440.36
DEVICE_VERSION: OpenCL 1.2 CUDA
36.494093 [opencl_profiling] profiling device 0 ('GeForce RTX 2060'):
36.494098 [opencl_profiling] spent 0.3877 seconds in [Write Image (from host to device)]
36.494102 [opencl_profiling] spent 0.0014 seconds in rawprepare_1f
36.494105 [opencl_profiling] spent 0.0015 seconds in whitebalance_1f
36.494108 [opencl_profiling] spent 0.0014 seconds in highlights_1f_clip
36.494111 [opencl_profiling] spent 1.2340 seconds in [Read Image (from device to host)]
36.494114 [opencl_profiling] spent 0.0108 seconds in denoiseprofile_precondition
36.494117 [opencl_profiling] spent 0.3176 seconds in denoiseprofile_decompose
36.494120 [opencl_profiling] spent 0.0206 seconds in denoiseprofile_reduce_first
36.494123 [opencl_profiling] spent 0.0001 seconds in denoiseprofile_reduce_second
36.494126 [opencl_profiling] spent 0.0000 seconds in [Read Buffer (from device to host)]
36.494129 [opencl_profiling] spent 0.0697 seconds in denoiseprofile_synthesize
36.494133 [opencl_profiling] spent 0.0667 seconds in [Copy Image (on device)]
36.494136 [opencl_profiling] spent 0.0094 seconds in denoiseprofile_backtransform
36.494139 [opencl_profiling] spent 0.0013 seconds in blendop_set_mask
36.494142 [opencl_profiling] spent 0.0455 seconds in blendop_rgb
36.494145 [opencl_profiling] spent 0.0204 seconds in exposure
36.494148 [opencl_profiling] spent 0.0151 seconds in blendop_mask_rgb
36.494151 [opencl_profiling] spent 0.0068 seconds in colorin_unbound
36.494154 [opencl_profiling] spent 0.0046 seconds in vibrance
36.494157 [opencl_profiling] spent 0.0059 seconds in filmic
36.494159 [opencl_profiling] spent 0.0063 seconds in colisa
36.494162 [opencl_profiling] spent 0.0310 seconds in blendop_mask_Lab
36.494165 [opencl_profiling] spent 0.0403 seconds in blendop_Lab
36.494168 [opencl_profiling] spent 0.0263 seconds in tonecurve
36.494172 [opencl_profiling] spent 0.0065 seconds in colorcorrection
36.494174 [opencl_profiling] spent 0.0070 seconds in sharpen_hblur
36.494177 [opencl_profiling] spent 0.0053 seconds in sharpen_vblur
36.494180 [opencl_profiling] spent 0.0088 seconds in sharpen_mix
36.494183 [opencl_profiling] spent 0.0085 seconds in colorout
36.494186 [opencl_profiling] spent 0.0064 seconds in velvia
36.494189 [opencl_profiling] spent 2.3669 seconds totally in command queue (with 0 events missing)
36.494200 [dev_process_export] pixel pipeline processing took 35.754 secs (191.942 CPU)
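
To put the profiling figures above in proportion, the sketch below
compares the time spent in the OpenCL command queue with the total
pipeline time of this run. All numbers are copied from the log; reading
the remainder as mostly CPU-side work is an interpretation, not
something the log states directly.

```python
# Figures copied from the profiling output of this run.
queue_seconds = 2.3669   # "spent ... seconds totally in command queue"
wall_seconds = 35.754    # pixel pipeline wall-clock time
cpu_seconds = 191.942    # CPU time reported on the same line
write_image = 0.3877     # [Write Image (from host to device)]
read_image = 1.2340      # [Read Image (from device to host)]

# Share of the wall-clock time that was profiled as OpenCL queue work.
gpu_share = queue_seconds / wall_seconds

# Share of the queue time spent moving images between host and device.
transfer_share = (write_image + read_image) / queue_seconds

# CPU seconds per wall-clock second (approximate average busy threads).
busy_threads = cpu_seconds / wall_seconds

print(f"OpenCL queue time:        {gpu_share:.1%} of the pipeline time")
print(f"host/device transfers:    {transfer_share:.1%} of the queue time")
print(f"CPU seconds per wall sec: {busy_threads:.1f}")
```

Under these numbers the GPU queue accounts for under a tenth of the
export time, and most of that is image transfer rather than kernel
execution, which fits the point that the CPU matters more here than the
driver version.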