Thanks for sharing. Yours is a good example of an OpenCL system that is
not limited by host<->device memory transfers. In a typical export job
your system spends about 30% of its time on memory transfers; the rest is
pure computation. In that situation pinned memory gives no advantage and
may even slow things down a bit.
Others have systems that are limited almost entirely by memory transfers.
We have reports of insane cases where over 95% of the time in the OpenCL
pixelpipe is spent on memory transfers. Those are the systems where
opencl_use_pinned_memory makes a real difference.
Ulrich
On 15.09.2016 at 22:11, KOVÁCS István wrote:
Hi,
Core2-Duo E6550 @ 2.33GHz + Nvidia GeForce GTX 650 / 2 GB, driver
361.42, OpenCL 1.2 CUDA, darktable 2.0.6 from PPA.
With pinned memory, performance is slightly (about 10%?) worse.
There are lines like
[opencl_profiling] spent 0,3774 seconds in [Map Buffer]
that are only seen in the 'pinned' log.
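For anyone who wants to check how much of their own run goes into transfers, here is a rough sketch that totals the [opencl_profiling] lines of the form quoted above. It only understands that one line shape (including the decimal comma). The function names are made up for this example, and which labels count as "memory transfer" is left to the caller, since only [Map Buffer] appears in this mail; anything else you pass in is your own assumption.

```python
import re
from collections import defaultdict

# Matches lines shaped like the one quoted in this mail:
#   [opencl_profiling] spent 0,3774 seconds in [Map Buffer]
# Other profiling line shapes are deliberately ignored.
PROFILE_RE = re.compile(
    r"\[opencl_profiling\] spent\s+([0-9]+(?:[.,][0-9]+)?) seconds in \[(.+)\]\s*$"
)


def time_per_label(lines):
    """Sum the reported seconds per label (hypothetical helper)."""
    totals = defaultdict(float)
    for line in lines:
        match = PROFILE_RE.search(line)
        if match:
            # Logs written under some locales use a decimal comma.
            seconds = float(match.group(1).replace(",", "."))
            totals[match.group(2)] += seconds
    return dict(totals)


def transfer_share(totals, transfer_labels):
    """Fraction of the summed time spent in the labels the caller
    considers memory transfers. The label set is an assumption made
    by the caller, not something taken from darktable itself."""
    total = sum(totals.values())
    if total == 0:
        return 0.0
    moved = sum(t for label, t in totals.items() if label in transfer_labels)
    return moved / total
```

Typical use would be to read the log file line by line, call time_per_label on it, and then call transfer_share with the set of labels you regard as transfers, for example {"Map Buffer"}.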
One notable difference after exporting 114 photos:
pinned = false gives
[opencl_summary_statistics] device 'GeForce GTX 650': 8960 out of 8960
events were successful and 0 events lost
pinned = true gives
[opencl_summary_statistics] device 'GeForce GTX 650': 9933 out of 9933
events were successful and 0 events lost
as one of the last lines in the output.
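To put a number on that difference, here is a small sketch that parses the two summary lines as quoted above and computes the relative increase in events. The helper names are invented for this example, and the pattern only covers the line shape shown in this mail. With 8960 and 9933 events it comes out at roughly 11% more events with pinned memory, which by itself says nothing about where the extra time goes.

```python
import re

# Matches the summary line as quoted in this mail. The line may be
# wrapped by the mail client, so whitespace before "events" is flexible.
SUMMARY_RE = re.compile(
    r"\[opencl_summary_statistics\] device '(.+?)': (\d+) out of (\d+)\s+"
    r"events were successful and (\d+) events lost"
)


def parse_summary(text):
    """Return the counts from one summary line, or None if absent."""
    match = SUMMARY_RE.search(text)
    if match is None:
        return None
    return {
        "device": match.group(1),
        "successful": int(match.group(2)),
        "total": int(match.group(3)),
        "lost": int(match.group(4)),
    }


def relative_increase(baseline, other):
    """Relative change of `other` with respect to `baseline`."""
    return (other - baseline) / baseline


unpinned = parse_summary(
    "[opencl_summary_statistics] device 'GeForce GTX 650': 8960 out of 8960\n"
    "events were successful and 0 events lost"
)
pinned = parse_summary(
    "[opencl_summary_statistics] device 'GeForce GTX 650': 9933 out of 9933\n"
    "events were successful and 0 events lost"
)
increase = relative_increase(unpinned["total"], pinned["total"])
print(f"{increase:.1%} more events with pinned memory")
```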
My opencl-related darktablerc entries:
opencl=TRUE
opencl_async_pixelpipe=false
opencl_avoid_atomics=false
opencl_checksum=2684983341
opencl_device_priority=*/!0,*/*/*
opencl_library=
opencl_memory_headroom=300
opencl_memory_requirement=768
opencl_micro_nap=1000
opencl_number_event_handles=25
opencl_omit_whitebalance=
opencl_size_roundup=16
opencl_synch_cache=false
opencl_use_cpu_devices=false
opencl_use_pinned_memory=false
The logs are at:
http://tech.kovacs-telekes.org/files/darktable-opencl-pinned-memory/
Thanks,
Kofa