Hi all, I'm trying to make my transform go fast. I've got a 1920x1080 RGB image being transformed from sRGB to the display profile. I've got a quad core processor on my development box, no shaders or GPU, and I'm trying to do the transform as quickly as possible.
I figured the fastest way to do this would be to set up a threadpool with max_threads = 4. Then I have a few choices:

* pop a thread from the pool for every line of the image, creating local state with p_in, p_out, width and stride
* pop a thread from the pool for every n lines of the image, creating local state with p_in, p_out, width, stride and rows_to_process (where n = height / max_threads)

I figured 4 threads should be ~4x faster than using 1 thread (in the second case we should only have 4 threads, so not much overhead), but no matter the value of max_threads or 'n' I can only achieve a ~1.9x speed-up. I've tried with and without cmsFLAGS_NOCACHE.

Any pointers very welcome.

Thanks,

Richard

_______________________________________________
Lcms-user mailing list
Lcms-user@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/lcms-user
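For anyone wanting something concrete to compare against, here is a minimal sketch of the second option (one band of n lines per thread). It is not the original poster's code. It assumes plain POSIX threads created per call rather than a real threadpool, 8-bit RGB rows, and a hypothetical transform_row() placeholder that just inverts bytes; in a real program that body would be a per-row call such as cmsDoTransform(xform, in, out, width). The names band_t, band_worker and transform_banded are made up for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Per-thread local state, mirroring the fields listed in the post. */
typedef struct {
    const uint8_t *p_in;      /* first input row of this band */
    uint8_t       *p_out;     /* first output row of this band */
    size_t         width;     /* pixels per row */
    size_t         stride;    /* bytes per row */
    size_t         rows_to_process;
} band_t;

/* Hypothetical placeholder for the colour transform of one RGB row.
 * It only inverts each byte so the sketch runs without lcms; real code
 * would call cmsDoTransform(xform, in, out, width) here instead. */
static void transform_row(const uint8_t *in, uint8_t *out, size_t width)
{
    for (size_t i = 0; i < width * 3; i++)
        out[i] = (uint8_t)(255 - in[i]);
}

/* Thread entry point: walk the rows of one band. */
static void *band_worker(void *arg)
{
    const band_t *b = arg;
    for (size_t y = 0; y < b->rows_to_process; y++)
        transform_row(b->p_in + y * b->stride,
                      b->p_out + y * b->stride,
                      b->width);
    return NULL;
}

/* Split the image into n_threads contiguous bands and process them in
 * parallel. Leftover rows (height % n_threads) are spread over the
 * first bands. Returns 0 on success, -1 if a thread could not start. */
int transform_banded(const uint8_t *in, uint8_t *out,
                     size_t width, size_t height, size_t stride,
                     size_t n_threads)
{
    pthread_t tid[MAX_THREADS];
    band_t    band[MAX_THREADS];
    size_t    started = 0, row = 0;
    int       rc = 0;

    assert(n_threads > 0 && n_threads <= MAX_THREADS);

    size_t base  = height / n_threads;
    size_t extra = height % n_threads;

    for (size_t t = 0; t < n_threads; t++) {
        size_t rows = base + (t < extra ? 1 : 0);
        band[t] = (band_t){ in + row * stride, out + row * stride,
                            width, stride, rows };
        row += rows;
        if (pthread_create(&tid[t], NULL, band_worker, &band[t]) != 0) {
            rc = -1;
            break;
        }
        started++;
    }

    /* Wait for every band that was actually started. */
    for (size_t t = 0; t < started; t++)
        pthread_join(tid[t], NULL);

    return rc;
}
```

For the 1920x1080 case this would be called as transform_banded(in, out, 1920, 1080, 1920 * 3, 4), giving each of the 4 threads a band of 270 rows. Timing this with the placeholder and then with the real transform call would at least show whether the banding itself scales on the machine in question.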