... by the way, the code changes from Stuart Nixon are being incorporated. Some of the modifications are already applied in CVS. Hopefully the next version will include all of the changes.

Regards,
--
Marti Maria
The littlecms project.
www.littlecms.com


----- Original Message -----
From: "Dirk Str�ker" <[EMAIL PROTECTED]>
To: <[email protected]>
Sent: Friday, March 25, 2005 4:20 PM
Subject: Re: [Lcms-user] CMM and Performance



Hi,

I just received this mail and I think it should be posted here:



[Dirk, email bounced when I tried to send this to the lcms list, so I'm
emailing you directly with my comments. Feel free to forward this to the list]



> > we work with littleCMS and are very happy about the results. But besides
> > the color quality I'm interested in the performance. We measured a
> > calculation time of around 400 ms using a 1 Mega Pixel picture.
> > Knowing

Some time ago I did some performance testing and tuning on lcms. My thoughts on the matter (all of this related to 16 bit in/out, device-link constructed 3D transformations at high quality):

- The standard lcms 1.12 library takes about 450ns/pixel on a "typical"
PC from 2 years ago.

- Very minor performance tuning for the common 3 in/3 out, 16 bit
in/out per channel situation boosted the library speed
to 150ns/pixel, i.e. nearly 3x faster. I believe this
figure to be as good as or better than any other CMS I tested,
as it results in about 7 Megapixels/second throughput for full 16 bit
data (i.e. about 40 Megabytes/second).
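
The conversion from a per-pixel time to the throughput figures above can be sketched as follows. The function names are made up for illustration, and 1 Megabyte is taken as 10^6 bytes.

```c
#include <assert.h>

/* Illustrative helper (not part of lcms): per-pixel cost in nanoseconds
 * converted to megapixels per second. */
static double mpixels_per_second(double ns_per_pixel)
{
    assert(ns_per_pixel > 0.0);
    /* 1e9 ns per second divided by 1e6 pixels per megapixel */
    return 1000.0 / ns_per_pixel;
}

/* Illustrative helper: data rate in megabytes per second (1 MB = 1e6 bytes)
 * for one direction, e.g. the input side only. */
static double mbytes_per_second(double ns_per_pixel, int channels,
                                int bytes_per_channel)
{
    return mpixels_per_second(ns_per_pixel) * channels * bytes_per_channel;
}
```

With 150ns/pixel, 3 channels and 2 bytes per channel this gives roughly 6.7 Megapixels/second and 40 Megabytes/second. By the same arithmetic, the 400 ms for a 1 Mega Pixel picture quoted above is about 400ns/pixel.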

- The tuning involved mainly:
a) using a fast IEEE to int conversion macro at a critical point
(standard IEEE to int performance is notoriously slow on Intel).
b) Unrolling the core 3D LUT interpolation routine.
In all less than 100 lines of code changes or so.
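
As an illustration of point (a), here is a minimal sketch of one widely known conversion trick of this kind. It is not the actual change made to lcms 1.12, and the name quick_round is hypothetical. It assumes IEEE 754 doubles, the default round-to-nearest mode, two's complement integers, and an input that fits in 32 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round a double to the nearest 32-bit integer by
 * adding a "magic" constant and reading the result's bit pattern, instead
 * of using a plain C cast. */
static int32_t quick_round(double x)
{
    const double magic = 6755399441055744.0;   /* 2^52 + 2^51 */
    double shifted;
    uint64_t bits;

    assert(x > -2147483648.0 && x < 2147483647.0);

    /* At this magnitude one unit in the last place equals 1.0, so the
     * addition itself rounds x to an integer, which ends up in the low
     * bits of the mantissa. */
    shifted = x + magic;
    memcpy(&bits, &shifted, sizeof bits);

    /* Keep the low 32 bits; negative results rely on two's complement. */
    return (int32_t)(uint32_t)bits;
}
```

Note that this rounds to nearest rather than truncating like a C cast, and ties go to the even integer, so quick_round(2.5) gives 2. Code that needs truncation would have to adjust for that.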

- I did try lcms 1.14 and applied the above changes, but speed
was much slower (~400ns/pixel). I have yet to look into why this
is; however, it should be easy enough to retune 1.14 to get 150ns/pixel
speed out of it. This is why I have not submitted the changes back
to Marti for the library (although I did send him the 1.12 changes
a couple of years ago).

- My view is that it is very hard to get below 100ns/pixel
for a full 3D transformation, because the bottleneck becomes
memory access for the 3D LUT array. The 3D LUT is too large
(for a high quality transformation) to fit into L2, let alone
L1, cache. With memory typically cycling at 50ns for a random
read, main memory is about 100x slower than the CPU these days
(i.e. you can do about 100 CPU instructions in the time it takes
to do one main-memory access). This difference between
CPU and RAM speed is in my view the critical factor in
performance tuning these days. It is not the CPU load
(i.e. instructions executed) that matters; instead the
data flow load becomes the vital factor.
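
To make the memory load concrete, here is a minimal sketch of a plain trilinear 3D LUT lookup. It is not the lcms routine (which may use a different interpolation scheme), and the Lut3D layout and all names are assumptions made for illustration. The point is that every pixel reads 8 neighbouring nodes, and only the two nodes along the last axis are adjacent in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: grid*grid*grid nodes, 3 output channels per node,
 * with the third input axis varying fastest. */
typedef struct {
    int grid;               /* nodes per axis, at least 2 */
    const uint16_t *table;  /* grid^3 * 3 entries */
} Lut3D;

/* Size of the table in bytes, i.e. the working set the lookups roam over. */
static size_t lut_bytes(int grid)
{
    return (size_t)grid * grid * grid * 3 * sizeof(uint16_t);
}

/* Index of the first channel of node (x, y, z). */
static size_t node_offset(const Lut3D *lut, int x, int y, int z)
{
    return (((size_t)x * lut->grid + y) * lut->grid + z) * 3;
}

/* Trilinear interpolation of one 16-bit, 3-channel pixel. */
static void lut_eval(const Lut3D *lut, const uint16_t in[3], uint16_t out[3])
{
    int cell[3];
    double frac[3];

    assert(lut->grid >= 2);
    for (int a = 0; a < 3; a++) {
        double pos = in[a] * (double)(lut->grid - 1) / 65535.0;
        cell[a] = (int)pos;
        if (cell[a] > lut->grid - 2)
            cell[a] = lut->grid - 2;        /* keep cell+1 inside the grid */
        frac[a] = pos - cell[a];
    }
    for (int c = 0; c < 3; c++) {
        double acc = 0.0;
        for (int corner = 0; corner < 8; corner++) {   /* 8 node reads */
            int dx = corner >> 2, dy = (corner >> 1) & 1, dz = corner & 1;
            double w = (dx ? frac[0] : 1.0 - frac[0])
                     * (dy ? frac[1] : 1.0 - frac[1])
                     * (dz ? frac[2] : 1.0 - frac[2]);
            acc += w * lut->table[node_offset(lut, cell[0] + dx,
                                              cell[1] + dy,
                                              cell[2] + dz) + c];
        }
        out[c] = (uint16_t)(acc + 0.5);     /* round to nearest */
    }
}
```

In this layout the neighbours along the first and second axes sit grid*grid*3 and grid*3 entries apart, so consecutive pixels with unrelated colours land in unrelated parts of the table. lut_bytes() shows how quickly the table grows with the grid size.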

- Given the above RAM/CPU disparity, it might be possible
to make improvements by moving away from 3D LUT
device link transformations (which have a heavy RAM load)
to a description of the curve in a mathematical sense
(such as a piece-wise set of polynomials). This would
enable everything to fit into the main CPU cache, and could
potentially give speeds in the 10ns/pixel to 50ns/pixel range.
This would also give a greater performance boost on multi-core
machines (as the formulae can sit in L1 cache).
Although this is an area I've worked on for other
pipelines, I have not looked into this in much detail for
CMS work, as most of my effort currently is on spectral based
profiling & transformation (this is for digital cameras).
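
The piece-wise polynomial idea can be sketched in one dimension as follows. This is only an illustration of the evaluation step: a real device link is 3 in/3 out, whether such a fit would be accurate enough is not addressed here, and the PiecewiseCubic type and pw_eval name are hypothetical.

```c
#include <assert.h>

/* Hypothetical representation: the input range [0,1] is split into n equal
 * segments. Segment i stores cubic coefficients c0..c3 for
 * c0 + c1*t + c2*t^2 + c3*t^3, where t in [0,1] is the position inside
 * the segment. */
typedef struct {
    int n;
    const double (*coef)[4];
} PiecewiseCubic;

static double pw_eval(const PiecewiseCubic *p, double x)
{
    double pos, t;
    const double *c;
    int seg;

    assert(p->n > 0);
    if (x < 0.0) x = 0.0;               /* clamp to the valid domain */
    if (x > 1.0) x = 1.0;

    pos = x * p->n;
    seg = (int)pos;
    if (seg >= p->n)
        seg = p->n - 1;                 /* x == 1.0 belongs to the last segment */
    t = pos - seg;

    c = p->coef[seg];
    return ((c[3] * t + c[2]) * t + c[1]) * t + c[0];   /* Horner's rule */
}
```

The coefficient table here is only n * 4 doubles (512 bytes for 16 segments), so it stays in L1 cache, and each evaluation is a handful of multiply-adds with no scattered table reads.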

- Speaking of which, it seems that recent products on the
market are doing major 'cheats' to get high speed. In
effect, they are bypassing true CMS work and doing
simple hacks (I think - I've not looked at their code).
In short, the emphasis now seems to be on speed not
on accuracy. If you are trying to write products that
compete or compare with some of the photo editing products,
this is worth keeping in mind - many products now are
at best doing simple matrix transformations, and at
worst not even doing that.


My $0.02 anyway.

Regards,

Stuart


_______________________________________________
Lcms-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/lcms-user






