for 2), the answer is probably that with opencl enabled the image is
processed faster, but it has to be copied between main memory and the GPU,
which takes quite a long time.

when opencl-enabled modules are next to each other in the pipe we don't need
to copy the image around, but when there is a non-opencl module, we have to
bring the result back to main memory to process it.
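
to make the copying argument concrete, here is a rough sketch. it is a toy model, not darktable code: the function name count_transfers and the module order are made up for illustration, and the opencl flags are assumptions taken from this thread (watermark has no opencl path; atrous and tonecurve show up in the opencl log). it only counts how often the image has to cross between main memory and the GPU.

```python
# Toy model, NOT darktable code: count host<->GPU image copies for a pipe.
# The module order below is hypothetical; the has_opencl flags are assumptions
# based on this thread (watermark is rendered on the CPU via librsvg).

def count_transfers(pipe):
    """Return the number of image copies between main memory and the GPU.

    `pipe` is a list of (module_name, has_opencl) pairs in processing order.
    The image starts in main memory and must end up there again.
    """
    transfers = 0
    on_gpu = False
    for _name, has_opencl in pipe:
        if has_opencl and not on_gpu:
            transfers += 1   # upload before the first module of a GPU run
            on_gpu = True
        elif not has_opencl and on_gpu:
            transfers += 1   # download so the CPU module can read the result
            on_gpu = False
    if on_gpu:
        transfers += 1       # final download of the finished image
    return transfers


adjacent = [("atrous", True), ("tonecurve", True), ("watermark", False)]
interleaved = [("atrous", True), ("watermark", False), ("tonecurve", True)]

print(count_transfers(adjacent))     # 2
print(count_transfers(interleaved))  # 4
```

in this model a single CPU-only module sitting between two opencl modules doubles the number of copies, and that extra time is what may show up in the per-module timings.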


On Wed, Jun 5, 2013 at 9:59 AM, Christian Iuga <roum...@gmail.com> wrote:

> Hi,
>
> OK, thanks for the answer, but there are still two things I don't
> understand:
>
> 1) The watermark module is about 3 times slower in the dev version than in
> the stable version:
>
>  stable darktable, module watermark with opencl : 0.476 secs
>  dev    darktable, module watermark with opencl : 1.351 secs
>
> So it is about 3 times slower with the current dev version.
>
> 2) If opencl is not used for this module, why is the watermark module
> faster with "--disable-opencl"?
>
> With opencl (stable version):
> gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing `watermark' [export]
> Without opencl (stable version):
> cpu [dev_pixelpipe] took 0.318 secs (0.310 CPU) processing `watermark' [export]
>
>
> Regards
>
>
> 2013/6/5 johannes hanika <hana...@gmail.com>
>
>>
>>
>>
>> On Wed, Jun 5, 2013 at 1:08 AM, Roumano <roum...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> Only now, with the new ati-drivers 13.6, can I enable opencl in
>>> darktable:
>>>
>>> [opencl_init] device 0 `Tahiti' supports image sizes of 16384 x 16384
>>> [opencl_init] device 0 `Tahiti' allows GPU memory allocations of up to 1024MB
>>> [opencl_init] device 0: Tahiti
>>>      GLOBAL_MEM_SIZE:          1845MB
>>>      MAX_WORK_GROUP_SIZE:      256
>>>      MAX_WORK_ITEM_DIMENSIONS: 3
>>>      MAX_WORK_ITEM_SIZES:      [ 256 256 256 ]
>>>      DRIVER_VERSION:           1214.3 (VM)
>>>      DEVICE_VERSION:           OpenCL 1.2 AMD-APP (1214.3)
>>> ...
>>> discarding CPU device 1 `Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz' as it will not deliver any performance gain.
>>>
>>>
>>> The result is really great. For a test picture, exporting it (with a lot
>>> of heavy processing like sady/denoise):
>>> before : 29.837 secs (202.660 CPU)
>>> after  : 8.412  secs (10.730 CPU)
>>>
>>> So we can really see an improvement in speed and responsiveness (as it
>>> uses much less CPU).
>>>
>>>
>>> 1) But some plugins are slower on the GPU than on the CPU, and this is
>>> not a fallback to the CPU:
>>> [opencl_summary_statistics] device 'Tahiti': 1468 out of 1468 events were successful and 0 events lost
>>>
>>>
>>> Test with 1.2:
>>> cpu [dev_pixelpipe] took 0.347 secs (1.789 CPU) processing `raw denoise' [export]
>>> gpu [dev_pixelpipe] took 0.432 secs (1.852 CPU) processing `raw denoise' [export]
>>> cpu [dev_pixelpipe] took 0.318 secs (0.310 CPU) processing `watermark' [export]
>>> gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing `watermark' [export]
>>> cpu [dev_pixelpipe] took 0.029 secs (0.205 CPU) processing `gamma' [export]
>>> gpu [dev_pixelpipe] took 0.033 secs (0.232 CPU) processing `gamma' [export]
>>> Devel:
>>> cpu [dev_pixelpipe] took 0.317 secs (1.638 CPU) processing `raw denoise' [export]
>>> gpu [dev_pixelpipe] took 0.371 secs (1.642 CPU) processing `raw denoise' [export]
>>> cpu [dev_pixelpipe] took 0.205 secs (0.199 CPU) processing `watermark' [export]
>>> gpu [dev_pixelpipe] took 1.351 secs (2.157 CPU) processing `watermark' [export]
>>> cpu [dev_pixelpipe] took 0.024 secs (0.175 CPU) processing `gamma' [export]
>>> gpu [dev_pixelpipe] took 0.024 secs (0.186 CPU) processing `gamma' [export]
>>>
>>>
>>> 1.1) Especially for the watermark (where there is a big difference
>>> between 1.2 and the devel version)
>>>
>>
>> watermarks are rendered using librsvg, i think even on a single core,
>> there is definitely no opencl for this.
>>
>> j.
>>
>>
>>>
>>>
>>> 2) Also, I needed to change the parameter opencl_memory_headroom to
>>> 350 (default 300) to prevent this kind of error when using the equalizer
>>> module:
>>>
>>> [default_process_tiling_cl_ptp] use tiling on module 'atrous' for image with full size 3374 x 5064
>>> [default_process_tiling_cl_ptp] (1 x 3) tiles with max dimensions 3374 x 2726 and overlap 256
>>> [default_process_tiling_cl_ptp] tile (0, 0) with 3374 x 2726 at origin [0, 0]
>>> [opencl_atrous] couldn't enqueue kernel! -4
>>> [default_process_tiling_opencl_ptp] couldn't run process_cl() for module 'atrous' in tiling mode: 0
>>> [opencl_pixelpipe] failed to run module 'atrous'. fall back to cpu path
>>> [dev_pixelpipe] took 7.192 secs (37.128 CPU) processing `equalizer' [export]
>>> [opencl_pixelpipe] couldn't copy image to opencl device for module tonecurve
>>> [opencl_pixelpipe] failed to run module 'tonecurve'. fall back to cpu path
>>> [opencl_pixelpipe (b)] late opencl error detected while copying back to cpu buffer: -4
>>>
>>> 3) To prevent regressions and catch performance drops, we could imagine a
>>> test script which enables nearly all modules, exports one file (and does
>>> the same without opencl), then reports the results on a graph?
>>> I don't know how, or whether it is really hard to do that... (lua?)
>>> For this, I was thinking of the work done by firefox:
>>>
>>> http://arewefastyet.com/
>>> https://areweslimyet.com/
>>>
>>>
>>>
>>> Regards
>>>
>>>
>>>
>>>
>>> ------------------------------------------------------------------------------
>>> How ServiceNow helps IT people transform IT departments:
>>> 1. A cloud service to automate IT design, transition and operations
>>> 2. Dashboards that offer high-level views of enterprise services
>>> 3. A single system of record for all IT processes
>>> http://p.sf.net/sfu/servicenow-d2d-j
>>> _______________________________________________
>>> darktable-devel mailing list
>>> darktable-devel@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/darktable-devel
>>>
>>
>>
>
>