On 13.05.2013 23:17, Bob Friesenhahn wrote:
> Are you sharing the same transform (created by one thread), or are you 
> creating an independent transform for each thread (ideally created by the 
> thread which uses it)?  Creating the transform can consume considerable time 
> so it can be useful to parallelize (even though it "wastes" CPU) and it may
> help things work better given whatever NUMA characteristics pertain to your
> hardware.

Separate transforms would likely allocate their memory in a more NUMA-friendly 
way. For optimal performance, though, the whole transform should fit into the 
L2 cache (and then the access time of the backing memory is no longer so 
important). For instance, a 16-bit RGB -> RGB device link LUT with 33 grid 
points per axis needs 33^3*3*2 = 215622 bytes, which just fits into the 256k 
L2 cache of a Core i7 (but leaves little L2 cache for other stuff). I.e. if 
the grid resolution is kept at moderate levels, there is a chance that the 
transform can be kept in L2 cache. AFAIK, L3 cache access time is about twice 
the L2 cache access time, and memory access time is about twice (or more) the 
L3 cache access time. Keeping the grid small is of course a trade-off against 
transform quality...
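
To make the arithmetic above easy to repeat for other grid sizes, here is a 
minimal sketch in C. The function names clut_bytes() and fits_in_cache() are 
made up for this illustration and are not part of lcms. The sketch counts only 
the raw table data, not any bookkeeping the CMM keeps around it.

```c
#include <assert.h>
#include <stddef.h>

/* Raw size in bytes of a colour lookup table with `grid_points` nodes per
   input axis, `in_channels` input axes, `out_channels` values per node and
   `bytes_per_sample` bytes per value. Hypothetical helper, not an lcms call. */
size_t clut_bytes(size_t grid_points, unsigned in_channels,
                  size_t out_channels, size_t bytes_per_sample)
{
    assert(grid_points > 0);

    size_t nodes = 1;
    for (unsigned i = 0; i < in_channels; i++)
        nodes *= grid_points;              /* grid_points ^ in_channels */

    return nodes * out_channels * bytes_per_sample;
}

/* Nonzero if a table of `table_bytes` fits into a cache of `cache_bytes`.
   This ignores everything else competing for the same cache. */
int fits_in_cache(size_t table_bytes, size_t cache_bytes)
{
    return table_bytes <= cache_bytes;
}
```

clut_bytes(33, 3, 3, 2) gives the 215622 bytes quoted above, which is below 
256 * 1024 = 262144. With 65 grid points per axis the same call gives 1647750 
bytes, which no longer fits.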

> Cache-line effects can be significant if there is accidental cache-line 
> sharing (two cores sharing data in the same cache line).
> Padding structures to prevent false-sharing or using an aligned memory 
> allocator can help surmount such problems. Cache line issues can be very 
> hardware/OS specific and mysterious.

Profiling with e.g. OProfile or Intel VTune Amplifier, which make use of the 
various performance counters of modern CPUs, is IMO essential in order to 
locate such issues/bottlenecks and to optimize the code (provided there is 
still room for improvement).
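
As for the padding Bob mentions, a minimal C11 sketch could look like the one 
below. The 64-byte line size is an assumption (typical for current x86 CPUs, 
but it should be checked for the target hardware), and struct worker_slot and 
its field are hypothetical names, not anything from lcms.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumed cache line size in bytes; verify this for the target CPU. */
#define CACHE_LINE 64

/* Per-thread state. Aligning the first member to a full cache line raises
   the alignment of the whole struct, so the compiler pads its size up to
   CACHE_LINE and adjacent array elements never share a line. */
struct worker_slot {
    alignas(CACHE_LINE) unsigned long pixels_done;
};

/* Fail at compile time if the layout is not what we expect. */
static_assert(sizeof(struct worker_slot) == CACHE_LINE,
              "worker_slot should occupy exactly one cache line");

/* One slot per thread; thread i only ever writes slots[i]. */
#define MAX_WORKERS 8
struct worker_slot slots[MAX_WORKERS];
```

Each thread then updates only its own slots[i].pixels_done, so writes from 
different cores do not invalidate each other's cache lines. Whether this 
actually helps is something only profiling on the real hardware can tell.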

Best Regards,
Gerhard

