Hi,

 I just received this mail and I think it should be posted here:



[Dirk, email bounced when I tried to send this to the lcms list, so I'm
emailing you directly with my comments. Feel free to forward this to the list]



> > > we work with littleCMS and are very happy about the results. But besides
> > > the color quality I'm interested in the performance. We measured a
> > > calculation time of around 400 ms using a 1 Mega Pixel picture. Knowing

Some time ago I did some performance testing and tuning
on lcms. My thoughts on the matter (all of this related to
16 bit in/out, device-link constructed 3D transformations at
high quality):

-       The standard lcms 1.12 library takes about 450ns/pixel on a "typical"
        PC from 2 years ago.

-       Very minor performance tuning for the common 3 in/3 out, 16 bit
        in/out per channel situation enabled the library speed to be boosted
        to 150ns/pixel, i.e. about 3x faster. I believe this
        figure to be as good as or better than any other CMS I tested,
        as it results in about 7 Megapixels/s throughput for full 16 bit
        data (i.e. about 40 Megabytes/s throughput).
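
The throughput figures follow directly from the per-pixel cost. A minimal sketch of the arithmetic, assuming 3 channels of 2 bytes each (the function names are made up for illustration and are not part of lcms):

```c
#include <assert.h>

/* Pixels per second for a given per-pixel cost in nanoseconds. */
static double pixels_per_second(double ns_per_pixel)
{
    assert(ns_per_pixel > 0.0);
    return 1e9 / ns_per_pixel;
}

/* Bytes per second for a given per-pixel cost and pixel layout. */
static double bytes_per_second(double ns_per_pixel, int channels,
                               int bytes_per_channel)
{
    return pixels_per_second(ns_per_pixel) * channels * bytes_per_channel;
}
```

With 150ns/pixel this gives roughly 6.7 million pixels per second, and at 6 bytes per pixel about 40 million bytes per second. The 400 ms quoted above for a 1 Megapixel picture corresponds to 400ns/pixel.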

-       The tuning involved mainly:
        a) using a fast IEEE to int conversion macro at a critical point
        (standard IEEE to int performance is notoriously slow on Intel).
        b) Unrolling the core 3D LUT interpolation routine.
        In all, fewer than 100 lines of code changes or so.
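
For item a), one well-known way to avoid the slow float-to-int path is the "magic number" trick sketched below. This is not the actual change made to lcms 1.12 (that code is not shown here); it is a generic illustration that assumes IEEE 754 doubles and the default round-to-nearest mode. The name fast_round_to_int is hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Round a double to the nearest int32 without a float-to-int instruction.
 * Assumptions: IEEE 754 binary64, default round-to-nearest-even mode,
 * two's complement integers. */
static inline int32_t fast_round_to_int(double x)
{
    /* 2^52 + 2^51: adding it pushes the integer part of x into the
     * low bits of the mantissa. */
    const double magic = 6755399441055744.0;
    double biased;
    uint64_t bits;

    assert(x >= -2147483648.0 && x <= 2147483647.0);

    biased = x + magic;
    memcpy(&bits, &biased, sizeof bits);   /* well-defined type pun */
    return (int32_t)(uint32_t)bits;        /* keep the low 32 bits */
}
```

Note that this rounds to nearest (ties to even) rather than truncating, so 2.5 becomes 2 and 2.7 becomes 3. Whether it still pays off depends on the compiler and CPU; on targets with a cheap hardware conversion, a plain cast may be just as fast.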

-       I did try lcms 1.14 and applied the above changes, but speed
        was much slower (~400ns/pixel). I have yet to look into why this
        is, however it should be easy enough to retune 1.14 to get 150ns/pixel
        speed out of it. This is why I have not yet sent the changes
        back to Marti for the library (although I did send him the
        1.12 changes a couple of years ago).

-       My view is that it is very hard to get below 100ns/pixel
        for a full 3D transformation, because the bottleneck becomes
        memory access for the 3D LUT array. The 3D LUT is too large
        (for a high quality transformation) to fit into L2, let alone
        L1 cache. With memory typically cycling at 50ns for a random
        read, main memory is about 100x slower than the CPU these days
        (i.e. you can do about 100 CPU instructions in the time it takes
        to do one main-memory access). This difference between
        CPU and RAM speed is in my view the critical factor in
        performance tuning these days. It is not the CPU load
        (i.e. instructions executed) that matters; instead the
        data flow load becomes the vital factor.
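
To make the memory traffic concrete, here is a minimal sketch of a 3 in/3 out, 16 bit 3D LUT lookup using trilinear interpolation. It is not the lcms routine; the table layout lut[((r*n + g)*n + b)*3 + c] and all names are assumptions chosen for illustration. Each pixel reads eight grid corners scattered across the table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes used by a 3 in/3 out, 16 bit LUT with n grid points per axis. */
static size_t lut3d_bytes(size_t n)
{
    return n * n * n * 3 * sizeof(uint16_t);
}

static double lerp_d(double a, double b, double t)
{
    return a + (b - a) * t;
}

/* Trilinear lookup; assumed layout: lut[((r*n + g)*n + b)*3 + c]. */
static void lut3d_eval(const uint16_t *lut, size_t n,
                       const uint16_t in[3], uint16_t out[3])
{
    size_t idx[3];
    double frac[3];
    int k, c;

    assert(n >= 2);
    for (k = 0; k < 3; k++) {
        double pos = in[k] * (double)(n - 1) / 65535.0;
        size_t i = (size_t)pos;
        if (i > n - 2)
            i = n - 2;              /* keep i + 1 inside the grid */
        idx[k] = i;
        frac[k] = pos - (double)i;
    }

    /* Strides in uint16_t elements for the b, g and r axes. */
    const size_t sb = 3, sg = 3 * n, sr = 3 * n * n;
    const uint16_t *c000 = lut + idx[0] * sr + idx[1] * sg + idx[2] * sb;
    const uint16_t *c010 = c000 + sg;
    const uint16_t *c100 = c000 + sr;
    const uint16_t *c110 = c100 + sg;

    for (c = 0; c < 3; c++) {
        /* Interpolate along b, then g, then r. */
        double x00 = lerp_d(c000[c], c000[sb + c], frac[2]);
        double x01 = lerp_d(c010[c], c010[sb + c], frac[2]);
        double x10 = lerp_d(c100[c], c100[sb + c], frac[2]);
        double x11 = lerp_d(c110[c], c110[sb + c], frac[2]);
        double x0 = lerp_d(x00, x01, frac[1]);
        double x1 = lerp_d(x10, x11, frac[1]);
        out[c] = (uint16_t)(lerp_d(x0, x1, frac[0]) + 0.5);
    }
}
```

With this layout a 33-point grid occupies 215,622 bytes and a 65-point grid 1,647,750 bytes. The arithmetic per pixel is small; the cost is dominated by where the eight corner reads land in the memory hierarchy.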

-       Given the above RAM/CPU disparity, it might be possible
        to make improvements by moving away from a 3D LUT
        device link transformation (which has a heavy RAM load),
        to a description of the curve in a mathematical sense
        (such as a piece-wise set of polynomials). This would
        enable everything to fit into main CPU cache, and could
        potentially give throughput in the 10ns/pixel to 50ns/pixel range.
        This also gives a greater performance boost on multi-core
        machines (as the formulae can sit in L1 cache).
        Although this is an area I've worked on for other
        pipelines, I have not looked into this in much detail for
        CMS work, as most of my effort currently is on spectral based
        profiling & transformation (this is for digital cameras).
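
As a rough illustration of the piece-wise polynomial idea, the sketch below evaluates a one-dimensional piecewise cubic with Horner's rule. A real colour transformation is three-dimensional and would need a multivariate form, so this is only a simplified stand-in; the Piece type and piecewise_eval are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define PIECE_DEGREE 3

typedef struct {
    double x0;                       /* left edge of the segment */
    double coef[PIECE_DEGREE + 1];   /* coef[k] multiplies (x - x0)^k */
} Piece;

/* Evaluate a piecewise polynomial; pieces must be sorted by x0. */
static double piecewise_eval(const Piece *pieces, size_t count, double x)
{
    size_t i = 0;
    double t, y;
    int k;

    assert(count >= 1);

    /* Find the last segment whose left edge is <= x. */
    while (i + 1 < count && x >= pieces[i + 1].x0)
        i++;

    /* Horner's rule on the local coordinate t = x - x0. */
    t = x - pieces[i].x0;
    y = pieces[i].coef[PIECE_DEGREE];
    for (k = PIECE_DEGREE - 1; k >= 0; k--)
        y = y * t + pieces[i].coef[k];
    return y;
}
```

The point of such a representation is its size: a handful of segments with four coefficients each is tiny compared with a 3D table, so the working set can stay in cache while the CPU does a little more arithmetic per pixel.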

-       Speaking of which, it seems that recent products on the
        market are doing major 'cheats' to get high speed. In
        effect, they are bypassing true CMS work and doing
        simple hacks (I think - I've not looked at their code).
        In short, the emphasis now seems to be on speed not
        on accuracy. If you are trying to write products that
        compete or compare with some of the photo editing products,
        this is worth keeping in mind - many products now are
        at best doing simple matrix transformations, and at
        worst not even doing that.


My $0.02 anyway.

Regards,

Stuart


_______________________________________________
Lcms-user mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/lcms-user