Hi all,

I'm trying to make my transform run as fast as possible. I have a
1920x1080 RGB image that is being transformed from sRGB to the display
profile. My development box has a quad-core processor and no shaders
or GPU to lean on.

I figured the fastest way to do this would be to set up a thread pool
with max_threads = 4. Then I have a few choices:

* pop a thread from the pool for every line of the image, creating
local state with p_in, p_out, width and stride
* pop a thread from the pool for every n lines of the image, creating
local state with p_in, p_out, width, stride and rows_to_process (where
n = height / max_threads)
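
To make the second option concrete, here is a minimal sketch of the
banding, assuming plain pthreads instead of a real thread pool and an
8-bit RGB buffer. The names band_job, band_worker, transform_in_bands
and invert_row are hypothetical, and invert_row only stands in for the
real per-row call. With lcms2 that call would presumably be
cmsDoTransform(xform, in, out, width), where the last argument is a
pixel count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-row callback; a stand-in for the real transform. */
typedef void (*row_fn)(const uint8_t *in, uint8_t *out, size_t width_px);

/* Local state for one band, as described in the second bullet. */
typedef struct {
    const uint8_t *p_in;
    uint8_t *p_out;
    size_t width;            /* pixels per row */
    size_t stride;           /* bytes per row, including padding */
    size_t rows_to_process;
    row_fn transform_row;
} band_job;

enum { MAX_BANDS = 64 };     /* arbitrary upper bound for this sketch */

/* Placeholder transform: inverts each byte of an 8-bit RGB row. */
static void invert_row(const uint8_t *in, uint8_t *out, size_t width_px)
{
    for (size_t i = 0; i < width_px * 3; ++i)
        out[i] = (uint8_t)(255 - in[i]);
}

/* Thread entry point: walks the rows of one band. */
static void *band_worker(void *arg)
{
    band_job *job = arg;
    for (size_t y = 0; y < job->rows_to_process; ++y)
        job->transform_row(job->p_in + y * job->stride,
                           job->p_out + y * job->stride,
                           job->width);
    return NULL;
}

/* Splits the image into num_threads bands of roughly height / num_threads
 * rows each and runs one thread per band. Returns 0 on success, -1 on a
 * bad thread count or if a thread could not be created. */
static int transform_in_bands(const uint8_t *in, uint8_t *out,
                              size_t width, size_t height, size_t stride,
                              unsigned num_threads, row_fn fn)
{
    pthread_t threads[MAX_BANDS];
    band_job jobs[MAX_BANDS];
    assert(in != NULL && out != NULL && fn != NULL);
    if (num_threads == 0 || num_threads > MAX_BANDS)
        return -1;

    size_t base = height / num_threads;
    size_t extra = height % num_threads;  /* first 'extra' bands get one more row */
    size_t row = 0;
    unsigned started = 0;
    int rc = 0;

    for (unsigned i = 0; i < num_threads; ++i) {
        size_t rows = base + (i < extra ? 1 : 0);
        jobs[i] = (band_job){ in + row * stride, out + row * stride,
                              width, stride, rows, fn };
        if (pthread_create(&threads[i], NULL, band_worker, &jobs[i]) != 0) {
            rc = -1;
            break;
        }
        ++started;
        row += rows;
    }
    /* Wait for every band that was actually started. */
    for (unsigned i = 0; i < started; ++i)
        pthread_join(threads[i], NULL);
    return rc;
}
```

For the image above this would be called as
transform_in_bands(in, out, 1920, 1080, stride, 4, invert_row), with
invert_row swapped for a wrapper around the real transform. The first
option is the same loop with rows_to_process = 1 for every job.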

I figured 4 threads should be ~4x faster than using 1 thread (in the
second case we should only have 4 threads, so not much overhead), but
no matter the value of max_threads or 'n' I can only achieve a ~1.9x
speed-up. I've tried with and without cmsFLAGS_NOCACHE. Any pointers
would be very welcome.

Thanks,

Richard

_______________________________________________
Lcms-user mailing list
Lcms-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lcms-user
