Hi,

> I am interested in your experience, both in terms of automatic detection
> of the best suited profile and in terms of overall performance. Please
> note that this is all about system latency and perceived system
> responsiveness in the darkroom view. Calling darktable with '-d perf'
> will only give you limited insights so you need to mostly rely on your
> own judgement.
A bit late, but today I had some time to test this more thoroughly. In general,
detection seems to work well, and the "very fast GPU" profile works nicely
with a single GPU.

My hardware:
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
0       'GeForce GTX 1060 6GB'

On startup the profile was set to "very fast GPU". Switching between images is
faster because the preview renders faster than with the "default" profile.
This is fine so far.

Then I plugged in my old GT 640 as well.
1       'GeForce GT 640'

On startup the profile was set to "multiple GPUs", so detection works.
Unfortunately the GT 640 is relatively slow, and the full pipeline often gets
processed on this device:
[pixelpipe_process] [thumbnail] using device 0
[pixelpipe_process] [full] using device 1
[pixelpipe_process] [preview] using device 0

If the full pipe runs on a slow GPU, switching between images is much slower
than before for larger history stacks, especially with denoising active.

So I set opencl_device_priority as written in the manual to:
opencl_device_priority=!GeForce GT 640,*/!GeForce GTX 1060 6GB,*/GeForce GTX 1060 6GB,*/GeForce GTX 1060 6GB,*
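
To make explicit what I expect this setting to do, here is a small Python
sketch that reads the value the way I understand the manual: four fields
separated by "/" (image, preview, export, thumbnail), each a comma-separated
list where "!" excludes a device and "*" stands for all remaining devices.
This is only my reading of the syntax, not darktable's actual parser; the
function names and the exact-name matching are assumptions made for
illustration.

```python
# Illustrative sketch only: NOT darktable's parser, just one reading of the
# opencl_device_priority syntax described in the manual.

PIPES = ["image", "preview", "export", "thumbnail"]


def lookup_device(token, devices):
    """Map a token (device number or exact device name) to a device index."""
    if token.isdigit():
        idx = int(token)
        return idx if idx < len(devices) else None
    for idx, name in enumerate(devices):
        if name == token:  # exact match is an assumption of this sketch
            return idx
    return None


def parse_device_priority(spec, devices):
    """Return, per pixelpipe, the ordered list of device indices to try."""
    fields = spec.split("/")
    if len(fields) != len(PIPES):
        raise ValueError("expected %d fields, got %d" % (len(PIPES), len(fields)))
    result = {}
    for pipe, field in zip(PIPES, fields):
        excluded = set()
        order = []
        for token in (t.strip() for t in field.split(",")):
            if not token:
                continue
            if token.startswith("!"):
                # "!" removes a device from this pixelpipe entirely
                idx = lookup_device(token[1:].strip(), devices)
                if idx is not None:
                    excluded.add(idx)
            elif token == "*":
                # "*" adds every device not yet listed and not excluded
                for idx in range(len(devices)):
                    if idx not in excluded and idx not in order:
                        order.append(idx)
            else:
                idx = lookup_device(token, devices)
                if idx is not None and idx not in excluded and idx not in order:
                    order.append(idx)
        result[pipe] = order
    return result


if __name__ == "__main__":
    devices = ["GeForce GTX 1060 6GB", "GeForce GT 640"]
    spec = ("!GeForce GT 640,*/!GeForce GTX 1060 6GB,*/"
            "GeForce GTX 1060 6GB,*/GeForce GTX 1060 6GB,*")
    for pipe, order in parse_device_priority(spec, devices).items():
        print(pipe, order)
```

Under this reading the image (full) pipe would only ever be offered device 0
and the preview pipe only device 1, while export and thumbnail would try
device 0 first and then device 1.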

Now the full pipe should not run on device 1 anymore, but it still does run on 
device 1 if I switch between images:
[pixelpipe_process] [thumbnail] using device 0
[pixelpipe_process] [full] using device 1
[pixelpipe_process] [preview] using device 0

Zooming after switching works correctly on device 0.
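
In case someone wants to check this over a longer session rather than by
eye, the "[pixelpipe_process]" lines can be tallied per pixelpipe and device.
A minimal Python sketch, assuming the log lines look exactly like the ones
quoted above; the function name is made up for illustration:

```python
import re
from collections import Counter

# Matches lines such as "[pixelpipe_process] [full] using device 1".
LINE_RE = re.compile(r"\[pixelpipe_process\] \[(\w+)\] using device (-?\d+)")


def tally_devices(log_text):
    """Count how often each (pixelpipe, device) pair appears in the log."""
    counts = Counter()
    for match in LINE_RE.finditer(log_text):
        counts[(match.group(1), int(match.group(2)))] += 1
    return counts


if __name__ == "__main__":
    sample = """\
[pixelpipe_process] [thumbnail] using device 0
[pixelpipe_process] [full] using device 1
[pixelpipe_process] [preview] using device 0
"""
    for (pipe, device), n in sorted(tally_devices(sample).items()):
        print(pipe, "device", device, "x", n)
```

Feeding it the captured debug output shows at a glance whether the full pipe
ever lands on device 1.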

darktable -d opencl reports this:
[opencl_update_scheduling_profile] scheduling profile set to multiple GPUs
[opencl_priorities] these are your device priorities:
[opencl_priorities]             image   preview export  thumbnail
[opencl_priorities]             0       0       0       0
[opencl_priorities]             1       1       1       1
[opencl_priorities] show if opencl use is mandatory for a given pixelpipe:
[opencl_priorities]             image   preview export  thumbnail
[opencl_priorities]             0       0       0       0

What does this output mean? Are my opencl_device_priority settings being
rejected? Maybe this is a corner case, because leaving a second, slow GPU in
a system does not make much sense.

All the best,
Christian

Here are all my OpenCL settings from darktablerc:
opencl=TRUE
opencl_async_pixelpipe=TRUE
opencl_avoid_atomics=FALSE
opencl_checksum=2128616438
opencl_device_priority=!GeForce GT 640,*/!GeForce GTX 1060 6GB,*/GeForce GTX 1060 6GB,*/GeForce GTX 1060 6GB,*
opencl_enable_markesteijn=true
opencl_library=
opencl_mandatory_timeout=200
opencl_memory_headroom=0
opencl_memory_requirement=768
opencl_micro_nap=0
opencl_number_event_handles=175
opencl_omit_whitebalance=
opencl_runtime=
opencl_scheduling_profile=very fast GPU
opencl_size_roundup=16
opencl_synch_cache=false
opencl_use_cpu_devices=false
opencl_use_events=FALSE
opencl_use_pinned_memory=FALSE

On Saturday, 8 April 2017, 14:29:18 CEST, Ulrich Pegelow wrote:
> Hi,
> 
> I added a bit more flexibility concerning OpenCL device scheduling into
> master. There is a new selection box in preferences (core options) that
> allows to choose among a few typical presets.
> 
> The main target are modern systems with very fast GPUs. By default and
> "traditionally" darktable distributes work between CPU and GPU in the
> darkroom: the GPU processes the center (full) view and the CPU is
> responsible for the preview (navigation) panel. Now that GPUs get faster
> and faster there are systems where the GPU so strongly outperforms the
> CPU that it makes more sense to process preview and full pixelpipe on
> the GPU sequentially.
> 
> For that reason the "OpenCL scheduling profile" parameter has three options:
> 
> * "default" describes the old behavior: work is split between GPU and
> CPU and works best for systems where CPU and GPU performance are on a
> similar level.
> 
> * "very fast GPU" tackles the case described above: in darkroom view
> both pixelpipes are sequentially processed by the GPU. This is meant for
> GPUs which strongly outperform the CPU on that system.
> 
> * "multiple GPUs" is meant for systems with more than one OpenCL device
> so that the full and the preview pixelpipe get processed by separate GPUs.
> 
> At first startup darktable tries to find the best suited profile based
> on some benchmarking. You may at any time change the profile, this takes
> effect immediately.
> 
> I am interested in your experience, both in terms of automatic detection
> of the best suited profile and in terms of overall performance. Please
> note that this is all about system latency and perceived system
> responsiveness in the darkroom view. Calling darktable with '-d perf'
> will only give you limited insights so you need to mostly rely on your
> own judgement.
> 
> Best wishes
> 
> Ulrich
> 
> 
> ___________________________________________________________________________
> darktable developer mailing list
> to unsubscribe send a mail to darktable-dev+unsubscr...@lists.darktable.org


