Hi,

I just pushed an update to darktable's OpenCL tiling code. On some devices (notably AMD/ATI) I found a severe performance penalty for host<->device memory transfers. This mainly hits our tiling code, where we keep the full input and output images in host memory and repeatedly process small chunks on the GPU.
OpenCL offers several ways to transfer data between host and device. Currently we use a direct transfer between an arbitrary position of the host image in memory and the GPU. With the recent changes, an alternative transfer via so-called "pinned memory" is used if the configuration variable opencl_use_pinned_memory is set to TRUE. The default is FALSE.

On my system with an HD7950 this speeds up the export of large images (e.g. panoramas of 10k x 10k) by a factor of 2 to 3. The effect is only visible for images that are large relative to your GPU memory, where OpenCL tiling plays a role.

It would be great if people who use OpenCL could give the current code in git master a try and report any problems. Please compare performance with the parameter switched on and off. Please also check the correctness of your output image and report any issues.

Thanks

Ulrich
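
As a rough illustration of the host-side step that a pinned-memory transfer implies, the sketch below copies one tile from an arbitrary position in a large host image into a contiguous staging buffer. This is not darktable's actual code: the function name copy_tile_to_staging and all of its parameters are hypothetical. In a real OpenCL program the staging buffer would typically be a pinned allocation (commonly obtained by creating a buffer with CL_MEM_ALLOC_HOST_PTR and mapping it with clEnqueueMapBuffer) that is then transferred to the device; here it is ordinary memory so the example runs without a GPU.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not taken from darktable: copy one tile out of a
 * large host image into a contiguous staging buffer, row by row.
 *
 * Assumption: in the pinned-memory path the staging buffer would be the
 * mapped pinned allocation; here it is plain host memory.
 *
 * image       full host image, image_width pixels per row
 * x0, y0      top-left corner of the tile inside the image
 * tile_width, tile_height   tile size in pixels
 * channels    floats per pixel
 * staging     destination, at least tile_width*tile_height*channels floats */
void copy_tile_to_staging(const float *image, size_t image_width,
                          size_t x0, size_t y0,
                          size_t tile_width, size_t tile_height,
                          size_t channels, float *staging)
{
  /* The tile must not run past the right edge of the image. */
  assert(x0 + tile_width <= image_width);

  const size_t row_floats = tile_width * channels;
  for(size_t row = 0; row < tile_height; row++)
  {
    /* Start of this tile row inside the full image. */
    const float *src = image + ((y0 + row) * image_width + x0) * channels;
    /* Rows are packed back to back in the staging buffer. */
    memcpy(staging + row * row_floats, src, row_floats * sizeof(float));
  }
}
```

The extra copy costs host time, but the subsequent host-to-device transfer then starts from a single contiguous block rather than from scattered rows of the full image. Whether that pays off depends on the device and driver, which is why the behaviour is left switchable.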
