Hi,

I just pushed an update to darktable's OpenCL tiling code. On some
devices (namely AMD/ATI) I found a severe performance penalty for
host<->device memory transfers. This mainly hits our tiling code, where
we keep the full input and output images in host memory and repeatedly
process small chunks on the GPU.

OpenCL offers several ways to transfer data between host and device.
Currently we use a direct transfer between an arbitrary position of the
host image in memory and the GPU.

With the recent changes, an alternative transfer via so-called "pinned
memory" is used if the configuration variable opencl_use_pinned_memory
is set to TRUE. The default is FALSE.
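
For those who want a picture of the difference: below is a rough sketch
in plain C of the host-side copy that a pinned-memory route implies. It
assumes the tile is first gathered from its position in the full image
into one contiguous staging buffer, which in a real OpenCL path would be
the mapped pinned buffer. There are no OpenCL calls here, and the names
tile_to_staging and staging_to_tile are made up for illustration. This
is not the code in git master.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not darktable code): gather one tile from a large
   row-major host image into a contiguous staging buffer. In a real
   pinned-memory path, `staging` would be the mapped pinned buffer. */
static void tile_to_staging(const float *image, size_t image_width,
                            size_t x0, size_t y0,
                            size_t tile_width, size_t tile_height,
                            float *staging)
{
  /* the tile must fit inside one image row */
  assert(x0 + tile_width <= image_width);
  for(size_t row = 0; row < tile_height; row++)
  {
    /* rows of the tile are not adjacent in the full image */
    const float *src = image + (y0 + row) * image_width + x0;
    memcpy(staging + row * tile_width, src, tile_width * sizeof(float));
  }
}

/* Hypothetical inverse: scatter a processed tile from the staging buffer
   back to its position in the full output image. */
static void staging_to_tile(const float *staging, float *image,
                            size_t image_width, size_t x0, size_t y0,
                            size_t tile_width, size_t tile_height)
{
  assert(x0 + tile_width <= image_width);
  for(size_t row = 0; row < tile_height; row++)
  {
    float *dst = image + (y0 + row) * image_width + x0;
    memcpy(dst, staging + row * tile_width, tile_width * sizeof(float));
  }
}
```

The point of the sketch is only the access pattern: the device sees one
contiguous block per tile, at the cost of an extra copy on the host.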

On my system with an HD7950 this speeds up the export of large images
(e.g. 10k x 10k panoramas) by a factor of 2 to 3. The gain is only
visible for images that are large relative to your GPU memory, where
OpenCL tiling comes into play.

It would be great if people who use OpenCL give the current code in git 
master a try and report any problems. Please have a look at the 
performance effect when the parameter is switched on or off. Please also 
have a look at the correctness of your output image and report any issues.

Thanks

Ulrich
