Re: [magick-users] Optimizing the compare function

Anthony Thyssen Mon, 31 Jul 2006 21:52:59 -0700

surimask on  wrote...
| Hello all,
| 
| I'm fairly new to ImageMagick so apologies if this is not an appropriate
| question for this group.  I couldn't locate the answer anywhere else.  Here
| is my situation:
| 
| I need to run the compare function on an image against about 3000 other
| images.  If the images are determined to be the same (in my case, that means
| a -metric MAE value of 50 or less).  All images are between 4000-4500 bytes
| in size.  Other than the -metric switch, I am using no other options.  What
| I am trying to figure out is:
| 
| What is the fastest way to accomplish this efficiently?
| Would I be better off using PerlMagick for this (there are some web requests
| involved to obtain the images, for which I am using perl)?
| Is there perhaps a way to cache the images in-memory increase performance?
| 
| Any advice is very much appreciated.  Thanks,


Comparing large volumes of data, be it text files or images (I have
done both), is a very big task.

The problem is that if you have  N  files to compare
you will need  N*(N-1)/2  comparisons to compare them exhaustively.

That is   A gets compared to B,C,D,E,F,...
Then      B gets compared to C,D,E,F,...
and so on.
You do not need to compare B to A as you already compared A to B.

In other words, for 3 files you need 3 comparisons,
for 4 you need 6 comparisons, for 5 files you need 10 comparisons,
and for 3000 files you need 4498500 comparisons.
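
Those figures can be checked with a minimal Python sketch. The function
name pair_count is made up for illustration; it is not part of
ImageMagick or PerlMagick.

```python
from itertools import combinations

def pair_count(n):
    """Number of unordered pairs among n files: n*(n-1)/2."""
    return n * (n - 1) // 2

# Enumerate the pairs explicitly for a small set: A-B, A-C, ... but never B-A.
files = ["A", "B", "C", "D"]
pairs = list(combinations(files, 2))
assert len(pairs) == pair_count(len(files))

for n in (3, 4, 5, 3000):
    print(n, "files need", pair_count(n), "comparisons")
```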

In one of my own comparison runs I compared 32580 files!
As the comparisons took a long time, it was estimated that it would
have taken about 3 years to compare them exhaustively.


The trick is to preprocess the files to
  1/  reduce the time it takes to compare
  2/  sort the files into sets, so you only need to compare the files in
      the same set.

For images you can do the first one by pre-reading the images in as
smaller thumbnails.  Only if these show a high degree of matching do you
then compare the full sized files for a positive match.
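
A rough sketch of that two-stage test, in plain Python.  It assumes the
images are already decoded into equal-sized grayscale arrays (lists of
rows); the function names and the two limits are hypothetical, and the
values are not on the same scale as the numbers "compare -metric MAE"
prints.

```python
def mae(a, b):
    """Mean absolute error between two equal-sized grayscale images."""
    total = sum(abs(pa - pb) for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))
    return total / (len(a) * len(a[0]))

def thumbnail(img, factor):
    """Shrink by averaging factor x factor blocks (sizes assumed divisible)."""
    h, w = len(img), len(img[0])
    return [
        [
            sum(img[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]

def same_image(a, b, thumb_a, thumb_b, coarse_limit, full_limit):
    """Cheap thumbnail test first; full comparison only for likely matches."""
    if mae(thumb_a, thumb_b) > coarse_limit:
        return False  # thumbnails differ too much, skip the full compare
    return mae(a, b) <= full_limit

# Tiny 4x4 example: b is a with one pixel nudged, c is a mirrored.
a = [[0, 0, 255, 255] for _ in range(4)]
b = [row[:] for row in a]
b[0][0] = 8
c = [[255, 255, 0, 0] for _ in range(4)]

ta, tb, tc = (thumbnail(img, 2) for img in (a, b, c))
print(same_image(a, b, ta, tb, 10, 10))
print(same_image(a, c, ta, tc, 10, 10))
```

The thumbnails are computed once per image and kept, so the expensive
full compare only runs for the few pairs that pass the cheap test.  With
block-averaged thumbnails like these, the thumbnail error can never be
larger than the full-size error, so using the same limit for both stages
does not throw away real matches.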

The second one is more tricky...

As image comparison can depend on what type of images you have, you can
initially group images into mostly white or black figures, grey scale,
cartoon like limited color images, and full color images.
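
One possible way to do that initial grouping, sketched in Python.  It
assumes each image is available as a flat list of (r, g, b) tuples with
values 0-255.  The class names, the 32/223 cut-offs, the 90% share and
the 64 color palette limit are all arbitrary assumptions to be tuned for
your own images.

```python
def classify(pixels, max_palette=64):
    """Rough image class from a flat list of (r, g, b) tuples."""
    if all(r == g == b for r, g, b in pixels):
        # Pure grey values: is it nearly all very dark or very light pixels?
        extremes = sum(1 for r, _, _ in pixels if r < 32 or r > 223)
        if extremes >= 0.9 * len(pixels):
            return "black-and-white"
        return "grayscale"
    if len(set(pixels)) <= max_palette:
        return "limited-color"  # cartoon-like, few distinct colors
    return "full-color"

line_art = [(0, 0, 0)] * 9 + [(255, 255, 255)]
grey_ramp = [(i, i, i) for i in range(0, 256, 5)]
cartoon = [(255, 0, 0), (0, 0, 255)] * 10
photo = [(i, (i * 7) % 256, (i * 13) % 256) for i in range(256)]

for name, px in [("line_art", line_art), ("grey_ramp", grey_ramp),
                 ("cartoon", cartoon), ("photo", photo)]:
    print(name, classify(px))
```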

Another idea is to figure out some 'metric' about the image so that
only images with similar metrics need to be compared.  For example an
average color of the center part of the image could be used.  Of course
if some images were modified (such as by 'spam text' or watermarks) then
your metric may also make the images look too different to be compared.
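
A minimal Python sketch of sorting images into sets by such a metric and
then only pairing up images within a set.  It assumes grayscale images
held as lists of rows; center_mean, candidate_pairs and the bucket width
of 16 are hypothetical choices for illustration.

```python
from collections import defaultdict
from itertools import combinations

def center_mean(img):
    """Average value of the central half of a grayscale image."""
    h, w = len(img), len(img[0])
    rows = img[h // 4 : h - h // 4]
    vals = [p for row in rows for p in row[w // 4 : w - w // 4]]
    return sum(vals) / len(vals)

def candidate_pairs(images, bucket_width=16):
    """Yield only pairs of names whose center means share a bucket."""
    buckets = defaultdict(list)
    for name, img in images.items():
        buckets[int(center_mean(img) // bucket_width)].append(name)
    for names in buckets.values():
        yield from combinations(sorted(names), 2)

images = {
    "a": [[10] * 4 for _ in range(4)],
    "b": [[12] * 4 for _ in range(4)],
    "c": [[200] * 4 for _ in range(4)],
}
print(list(candidate_pairs(images)))
```

Here only a and b end up being compared; c is never paired with either.
Note that two similar images can fall on either side of a bucket
boundary, so in practice you would probably also want to compare
neighbouring buckets.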


You also may have to watch out for things like: extra text and
watermarking (spam/copyright text),  extra borders, or cropping of one
of the images, color and gamma changes, JPEG color distortions, and so
on.

I have placed some information on this in IM Examples, Image Comparing
  http://www.cit.gu.edu.au/~anthony/

If you or anyone else would like to add more, please mail me and I'll add it.


  Anthony Thyssen ( System Programmer )    <[EMAIL PROTECTED]>
 -----------------------------------------------------------------------------
   I go to bed at night with a smile on my face, dream beautiful dreams
   and wake up in the morning with an even bigger smile because I am
   lucky enough to be on this planet sharing this point in history...
                       -- Bruce Ressia - Griffith University Academic Support
 -----------------------------------------------------------------------------
     Anthony's Home is his Castle     http://www.cit.gu.edu.au/~anthony/