Hi,

OK, thanks for the answer, but there are still two things I don't understand:

1) The watermark module is about 3 times slower in the dev version than in
the stable version:

 stable darktable, module watermark with opencl : 0.476 secs
 dev    darktable, module watermark with opencl : 1.351 secs

That is a factor of about 2.8 with the current dev version.

2) If OpenCL is not used for this module, why is the watermark module faster
with "--disable-opencl"?

With opencl (stable version):
gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing `watermark' [export]
Without opencl (stable version):
cpu [dev_pixelpipe] took 0.318 secs (0.310 CPU) processing `watermark' [export]
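
To make the comparison reproducible, here is a small sketch in Python that reads timing lines in the form pasted in this thread and computes how much slower the GPU run is per module. It assumes each log entry is on a single unwrapped line, and that the leading "cpu"/"gpu" tag is the annotation I added by hand (it is not part of darktable's output). The names parse_timings and slowdown are only illustrative.

```python
import re

# Matches lines in the form quoted in this thread, for example:
#   gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing `watermark' [export]
# Assumption: the leading "cpu"/"gpu" tag was added by hand to mark the run.
LINE_RE = re.compile(
    r"^(cpu|gpu) \[dev_pixelpipe\] took ([0-9.]+) secs \(([0-9.]+) CPU\) "
    r"processing `([^']+)'"
)


def parse_timings(text):
    """Return {(path, module): wall_seconds} for every matching line."""
    timings = {}
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m:
            path, wall, _cpu, module = m.groups()
            timings[(path, module)] = float(wall)
    return timings


def slowdown(timings, module):
    """GPU wall time divided by CPU wall time (a value above 1 means GPU is slower)."""
    return timings[("gpu", module)] / timings[("cpu", module)]


# Timings copied from this thread.
STABLE = """\
cpu [dev_pixelpipe] took 0.318 secs (0.310 CPU) processing `watermark' [export]
gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing `watermark' [export]
"""
DEV = """\
cpu [dev_pixelpipe] took 0.205 secs (0.199 CPU) processing `watermark' [export]
gpu [dev_pixelpipe] took 1.351 secs (2.157 CPU) processing `watermark' [export]
"""

for name, log in (("stable", STABLE), ("dev", DEV)):
    ratio = slowdown(parse_timings(log), "watermark")
    print(f"{name}: watermark gpu/cpu wall-time ratio = {ratio:.2f}")
```

Running the same parser over a full export log with and without "--disable-opencl" would give one ratio per module, which is the kind of number a regression graph could track.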


Regards

2013/6/5 johannes hanika <hana...@gmail.com>

>
>
>
> On Wed, Jun 5, 2013 at 1:08 AM, Roumano <roum...@gmail.com> wrote:
>
>> Hi,
>>
>> Only now, with the new ati-drivers 13.6, can I enable OpenCL in
>> darktable:
>>
>> [opencl_init] device 0 `Tahiti' supports image sizes of 16384 x 16384
>> [opencl_init] device 0 `Tahiti' allows GPU memory allocations of up to
>> 1024MB
>> [opencl_init] device 0: Tahiti
>>      GLOBAL_MEM_SIZE:          1845MB
>>      MAX_WORK_GROUP_SIZE:      256
>>      MAX_WORK_ITEM_DIMENSIONS: 3
>>      MAX_WORK_ITEM_SIZES:      [ 256 256 256 ]
>>      DRIVER_VERSION:           1214.3 (VM)
>>      DEVICE_VERSION:           OpenCL 1.2 AMD-APP (1214.3)
>> ...
>> discarding CPU device 1 `Intel(R) Core(TM) i7-2600K CPU @ 3.40GHz' as it
>> will not deliver any performance gain.
>>
>>
>> The result is really great. For a test picture, exporting it (with a lot
>> of heavy processing like sady/denoise) took:
>> before : 29.837 secs (202.660 CPU)
>> after  : 8.412  secs (10.730 CPU)
>>
>> So we can really see an improvement in speed and responsiveness (as it
>> uses much less CPU).
>>
>>
>> 1) But some plugins are slower on the GPU than on the CPU, and it is not
>> a fallback to the CPU:
>> [opencl_summary_statistics] device 'Tahiti': 1468 out of 1468 events
>> were successful and 0 events lost
>>
>>
>> Test with
>> 1.2 :
>> cpu [dev_pixelpipe] took 0.347 secs (1.789 CPU) processing `raw
>> denoise' [export]
>> gpu [dev_pixelpipe] took 0.432 secs (1.852 CPU) processing `raw
>> denoise' [export]
>> cpu [dev_pixelpipe] took 0.318 secs (0.310 CPU) processing
>> `watermark' [export]
>> gpu [dev_pixelpipe] took 0.476 secs (0.582 CPU) processing
>> `watermark' [export]
>> cpu [dev_pixelpipe] took 0.029 secs (0.205 CPU) processing
>> `gamma' [export]
>> gpu [dev_pixelpipe] took 0.033 secs (0.232 CPU) processing
>> `gamma' [export]
>> Devel :
>> cpu [dev_pixelpipe] took 0.317 secs (1.638 CPU) processing `raw
>> denoise' [export]
>> gpu [dev_pixelpipe] took 0.371 secs (1.642 CPU) processing `raw
>> denoise' [export]
>> cpu [dev_pixelpipe] took 0.205 secs (0.199 CPU) processing
>> `watermark' [export]
>> gpu [dev_pixelpipe] took 1.351 secs (2.157 CPU) processing
>> `watermark' [export]
>> cpu [dev_pixelpipe] took 0.024 secs (0.175 CPU) processing
>> `gamma' [export]
>> gpu [dev_pixelpipe] took 0.024 secs (0.186 CPU) processing
>> `gamma' [export]
>>
>>
>> 1.1) Especially for the watermark (where there is a big difference
>> between 1.2 and the devel version)
>>
>
> watermarks are rendered using librsvg, i think even on a single core,
> there is definitely no opencl for this.
>
> j.
>
>
>>
>>
>> 2) Also, I needed to change the parameter opencl_memory_headroom to
>> 350 (default 300) to prevent this kind of error when using the equalizer
>> module:
>>
>> [default_process_tiling_cl_ptp] use tiling on module 'atrous' for image
>> with full size 3374 x 5064
>> [default_process_tiling_cl_ptp] (1 x 3) tiles with max dimensions 3374 x
>> 2726 and overlap 256
>> [default_process_tiling_cl_ptp] tile (0, 0) with 3374 x 2726 at origin
>> [0, 0]
>> [opencl_atrous] couldn't enqueue kernel! -4
>> [default_process_tiling_opencl_ptp] couldn't run process_cl() for module
>> 'atrous' in tiling mode: 0
>> [opencl_pixelpipe] failed to run module 'atrous'. fall back to cpu path
>> [dev_pixelpipe] took 7.192 secs (37.128 CPU) processing
>> `equalizer' [export]
>> [opencl_pixelpipe] couldn't copy image to opencl device for module
>> tonecurve
>> [opencl_pixelpipe] failed to run module 'tonecurve'. fall back to cpu
>> path
>> [opencl_pixelpipe (b)] late opencl error detected while copying back to
>> cpu buffer: -4
>>
>> 3) To prevent regressions and catch performance drops, we could imagine a
>> test script which enables nearly all modules, exports one file (and
>> does the same without OpenCL), then reports the results on a graph?
>> I don't know how, or whether it is really hard to do (Lua?).
>> For this, I was thinking of the work done by Firefox:
>>
>> http://arewefastyet.com/
>> https://areweslimyet.com/
>>
>>
>>
>> Regards
>>
>>
>>
>>
>> ------------------------------------------------------------------------------
>> How ServiceNow helps IT people transform IT departments:
>> 1. A cloud service to automate IT design, transition and operations
>> 2. Dashboards that offer high-level views of enterprise services
>> 3. A single system of record for all IT processes
>> http://p.sf.net/sfu/servicenow-d2d-j
>> _______________________________________________
>> darktable-devel mailing list
>> darktable-devel@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/darktable-devel
>>
>
>
