Re: ImSim: Image Similarity
On Mon, Mar 7, 2011 at 6:30 AM, n00m n...@narod.ru wrote:
> Remind me this piece of humor:
> One man entered a lift cabin at the 1st floor, lift goes to the 3rd floor, opens and ... it's empty!
> Physicist, Chemist and Mathematician were asked: what happened to the man?
> Physicist: he was squashed to the floor by acceleration!
> Chemist: he was vaporized by some acid gases!
> Mathematician: hmm... Let's call a lift *empty* if there is inside of it no more than *1* man.

That reminds me of another joke I once heard. A logician, a biologist, and a mathematician are watching an empty house. They observe two people going into the house, and then some time later three people emerge.

The logician observes, "Our premise must have been incorrect. The house was not initially empty."

The biologist speculates, "Perhaps the two people whom we saw go into the house have reproduced while inside, resulting in the third."

Finally, the mathematician concludes, "I don't know about that, but I do know that if one more person goes into that house, it will be empty!"

This has nothing to do with Python, though.
--
http://mail.python.org/mailman/listinfo/python-list
Re: ImSim: Image Similarity
On Mar 6, 7:54 pm, n00m n...@narod.ru wrote:
> If someone will encounter 2 apparently unrelated pics but for which ImSim gives value of their mutual diff. *** less than 20% *** please email them to me.

Never mind, people. I've found such a pair of images in my .zipped project. It's sky1.jpg and lake1.jpg, with sim. value 15%.

sky1.jpg

    sky1.jpg    0.00
    sky2.jpg    0.77
   lake1.jpg   14.28-
  bears2.jpg   23.29
  bears3.jpg   26.60
  roses2.jpg   29.41
  roses1.jpg   31.36
     ff1.jpg   33.47
  bears1.jpg   36.60
     ff2.jpg   39.52
  water1.jpg   40.11

But funny thing takes place. At first thought it's a false-positive: some modern South East Asian town and a lake somewhere in Russia, more than 100 years ago. Nothing similar in them? On both pics we see:
-- a lot of water on foreground;
-- a lot of blue sky at sunny mid-day;
-- a bit of light white clouds in the sky;

In short, the notion of similarity can be speculated about just endlessly.
Re: ImSim: Image Similarity
Just admit that your algorithm doesn't work that well already :-) Or give a solid formal definition of similarity and prove that your algo works with that definition.

On Mon, Mar 7, 2011 at 4:22 PM, n00m n...@narod.ru wrote:
> In short, the notion of similarity can be speculated about just endlessly.
Re: ImSim: Image Similarity
On Mar 7, 2:54 pm, Grigory Javadyan grigory.javad...@gmail.com wrote:
> Just admit that your algorithm doesn't work that well already :-) Or give a solid formal definition of similarity and prove that your algo works with that definition.
>
> On Mon, Mar 7, 2011 at 4:22 PM, n00m n...@narod.ru wrote:
>> In short, the notion of similarity can be speculated about just endlessly.

Remind me this piece of humor:
One man entered a lift cabin at the 1st floor, lift goes to the 3rd floor, opens and ... it's empty!
Physicist, Chemist and Mathematician were asked: what happened to the man?
Physicist: he was squashed to the floor by acceleration!
Chemist: he was vaporized by some acid gases!
Mathematician: hmm... Let's call a lift *empty* if there is inside of it no more than *1* man.
Re: ImSim: Image Similarity
So, my current very strict definition of similarity is:
---
2 pics are similar if my script gives for them a value < 20%, otherwise the pics are not similar.
---
It is left to study possible transitivity of similarity.

== LOL
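A minimal sketch of why transitivity is doubtful for a threshold rule of this kind. The difference values below are made up for illustration and are not ImSim output; `similar` and `transitivity_violations` are hypothetical helper names.

```python
from itertools import permutations

THRESHOLD = 20.0  # the empirical cutoff (in %) quoted in the thread

# Hypothetical pairwise difference scores; not taken from ImSim.
DIFFS = {
    ("a", "b"): 15.0,
    ("b", "c"): 15.0,
    ("a", "c"): 30.0,
}

def diff(x, y):
    """Look up a symmetric difference score."""
    return DIFFS.get((x, y), DIFFS.get((y, x)))

def similar(x, y, threshold=THRESHOLD):
    """Threshold rule: similar when the difference is below the cutoff."""
    return diff(x, y) < threshold

def transitivity_violations(names):
    """Triples (x, y, z) where x~y and y~z hold but x~z does not."""
    return [
        (x, y, z)
        for x, y, z in permutations(names, 3)
        if similar(x, y) and similar(y, z) and not similar(x, z)
    ]

violations = transitivity_violations(["a", "b", "c"])
print(violations)
```

With these numbers "a" is similar to "b" and "b" to "c", yet "a" and "c" are 30% apart, so the relation is not transitive in general.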
Re: ImSim: Image Similarity
n00m wrote:
> But funny thing takes place. At first thought it's a false-positive: some modern South East Asian town and a lake somewhere in Russia, more than 100 years ago. Nothing similar in them? On both pics we see:
> -- a lot of water on foreground;
> -- a lot of blue sky at sunny mid-day;
> -- a bit of light white clouds in the sky;
>
> In short, the notion of similarity can be speculated about just endlessly.

Exactly. That's the kind of similarity I would call valid. That's what my algorithms, if I ever finished writing any, would be looking for.

Mel.
Re: ImSim: Image Similarity
@all, just in case: also see my TiRG project (since 2011-01-31):
http://sourceforge.net/projects/tirg/

It's for detecting and localizing text areas in raster graphics. Among its files there is a Python script -- absolutely working. Feel free to do with it whatever you like -- it's in the public domain.

And again, this theme is a rich field for speculating about the question "What on earth is text?" Are 50 letters 'o' in a row some text, or just a part of some fancy Indian ornament? And so on, etc.

PS Don't confuse it with OCR. They are 2 different beasts.

===
Re: ImSim: Image Similarity
Obviously, if we'd use it in practice (in a web-museum?), all pics' matrices should be precalculated only once and stored in a table with forty fields v00 ... v93, like:

---
pic_title     v00    v01    v02   ...   v93
---
bears2.jpg   1234   4534   8922   ...   333
...
---

Then the SQL query will look like this:

select top 3 pic_title
from table
order by abs(v00 - w[0][0]) + abs(v01 - w[0][1]) + ... + abs(v93 - w[9][3])

Here w[][] is the matrix of a newly entering picture.

P.S. If someone will encounter 2 apparently unrelated pics but for which ImSim gives value of their mutual diff. *** less than 20% *** please email them to me.
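The table-and-query idea above can be sketched with Python's standard sqlite3 module. This is only an illustration under assumptions: the table name `pics` and the helpers `create_table`, `add_pic` and `most_similar` are invented here, and SQLite's `LIMIT` stands in for `top 3`, which is SQL Server style syntax.

```python
import sqlite3

ROWS, COLS = 10, 4  # 10 histogram bands x 4 brightness levels = 40 fields
FIELDS = ["v%d%d" % (i, j) for i in range(ROWS) for j in range(COLS)]  # v00 ... v93

def flatten(w):
    """Turn the 10x4 matrix w[][] into a flat list ordered like FIELDS."""
    return [w[i][j] for i in range(ROWS) for j in range(COLS)]

def create_table(conn):
    cols = ", ".join("%s INTEGER" % f for f in FIELDS)
    conn.execute("CREATE TABLE pics (pic_title TEXT PRIMARY KEY, %s)" % cols)

def add_pic(conn, title, w):
    """Store one precalculated matrix."""
    marks = ", ".join("?" * (len(FIELDS) + 1))
    conn.execute("INSERT INTO pics VALUES (%s)" % marks, [title] + flatten(w))

def most_similar(conn, w, limit=3):
    """Titles ordered by the sum of absolute differences to w, closest first."""
    distance = " + ".join("abs(%s - ?)" % f for f in FIELDS)
    sql = "SELECT pic_title FROM pics ORDER BY %s LIMIT ?" % distance
    return [row[0] for row in conn.execute(sql, flatten(w) + [limit])]

def uniform(value):
    """Made-up matrix with the same value in every cell (demo data only)."""
    return [[value] * COLS for _ in range(ROWS)]

conn = sqlite3.connect(":memory:")
create_table(conn)
add_pic(conn, "a.jpg", uniform(10))
add_pic(conn, "b.jpg", uniform(50))
add_pic(conn, "c.jpg", uniform(13))
nearest = most_similar(conn, uniform(11), limit=2)
print(nearest)
```

The matrix values are passed as bound parameters rather than pasted into the SQL text, so the same statement can be reused for every newly entering picture.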
Re: ImSim: Image Similarity
n00m n...@narod.ru writes:
> http://www.nga.gov/search/index.shtm
> http://deyoung.famsf.org/search-collections
> etc
>
> Seems they all offer search only by keywords and this kind.
> What about to submit e.g. roses2.jpg (copy) and to find its original?
> Assume we don't know its author neither its title

Title: TinEye, author: http://ideeinc.com/
Search: http://www.tineye.com/
Example: http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

Notice how it finds modified images.

--
John Bokma j3b
Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl Python Development: http://castleamber.com/
Re: ImSim: Image Similarity
On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:
> n00m n...@narod.ru writes:
>> http://www.nga.gov/search/index.shtm
>> http://deyoung.famsf.org/search-collections
>> etc
>>
>> Seems they all offer search only by keywords and this kind.
>> What about to submit e.g. roses2.jpg (copy) and to find its original?
>> Assume we don't know its author neither its title
>
> Title: TinEye, author: http://ideeinc.com/
> Search: http://www.tineye.com/
> Example: http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/
>
> Notice how it finds modified images.
>
> --
> John Bokma j3b
> Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
> Freelance Perl Python Development: http://castleamber.com/

It's for kids. Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see his message)
Re: ImSim: Image Similarity
On Mar 6, 10:17 pm, n00m n...@narod.ru wrote:
> On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:
>> n00m n...@narod.ru writes:
>>> http://www.nga.gov/search/index.shtm
>>> http://deyoung.famsf.org/search-collections
>>> etc
>>>
>>> Seems they all offer search only by keywords and this kind.
>>> What about to submit e.g. roses2.jpg (copy) and to find its original?
>>> Assume we don't know its author neither its title
>>
>> Title: TinEye, author: http://ideeinc.com/
>> Search: http://www.tineye.com/
>> Example: http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/
>>
>> Notice how it finds modified images.
>>
>> --
>> John Bokma j3b
>> Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
>> Freelance Perl Python Development: http://castleamber.com/
>
> It's for kids. Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see his message)

Even his algo will be an overhead. Comparing meta-data/EXIF of image files will be enough in 99% cases.
Re: ImSim: Image Similarity
n00m n...@narod.ru writes:
> On Mar 6, 10:17 pm, n00m n...@narod.ru wrote:
>> On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:
>>> n00m n...@narod.ru writes:
>>>> http://www.nga.gov/search/index.shtm
>>>> http://deyoung.famsf.org/search-collections
>>>> etc
>>>>
>>>> Seems they all offer search only by keywords and this kind.
>>>> What about to submit e.g. roses2.jpg (copy) and to find its original?
>>>> Assume we don't know its author neither its title
>>>
>>> Title: TinEye, author: http://ideeinc.com/
>>> Search: http://www.tineye.com/
>>> Example: http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/
>>>
>>> Notice how it finds modified images.
>>>
>>> --
>>> John Bokma j3b
>>> Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
>>> Freelance Perl Python Development: http://castleamber.com/
>>
>> It's for kids. Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see his message)
>
> Even his algo will be an overhead. Comparing meta-data/EXIF of image files will be enough in 99% cases.

Yes, yes, we get it. You're so much smarter (but not smart enough to not quote a signature...). Anyway, I guess that's the reason big names use tineye and not your algorithm...

--
John Bokma j3b
Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl Python Development: http://castleamber.com/
Re: ImSim: Image Similarity
As for proper quoting: I read/post to this group via my web-browser. And for me everything looks OK. I don't even quite understand what exactly you mean by your remark. I'm not a facebookie/forumish/twitterish thing.

Btw I don't know what Twitter is. I don't need it, neither to know nor to use it. Oh... Pres. Medvedev knows what Twitter is and uses it.
Re: ImSim: Image Similarity
n00m n...@narod.ru writes:
> As for proper quoting: I read/post to this group via my web-browser. And for me everything looks OK. I don't even quite understand what exactly you mean by your remark. I'm not a facebookie/forumish/twitterish thing.

Exactly. It's Usenet, something I've been using for, oh, just over 20 years now, and even then it was not new. You know, before the web thing you're talking about...

--
John Bokma j3b
Blog: http://johnbokma.com/ Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl Python Development: http://castleamber.com/
Re: ImSim: Image Similarity
At least you could've tried to make the script more usable by adding the possibility to supply command line arguments, instead of editing the source every time you want to compare a couple of images.

On Sat, Mar 5, 2011 at 11:23 AM, n00m n...@narod.ru wrote:
> Let me present my newborn project (in Python) ImSim:
> http://sourceforge.net/projects/imsim/
>
> Its README.txt:
> ---
> ImSim is a python script for finding the most similar pic(s) to a given one among a set/list/db of your pics. The script is very short and very easy to follow and understand. Its sample output looks like this:
>
> bears2.jpg
>
>   bears2.jpg    0.00
>   bears3.jpg   55.33
>   bears1.jpg   68.87
>     sky1.jpg   83.84
>     sky2.jpg   84.41
>      ff1.jpg   91.35
>    lake1.jpg   95.14
>   water1.jpg   96.94
>      ff2.jpg  102.36
>   roses1.jpg  115.02
>   roses2.jpg  130.02
> Done!
>
> The *less* numeric value -- the *more similar* this pic is to the tested pic. If this value > 70 almost for sure these pictures are absolutely different (from totally different domains, so to speak).
>
> What is similarity and how can/could/should it be estimated -- this point I'm leaving for your consideration/contemplation/arguing etc.
>
> Several sample pics (*.jpg) are included into .zip.
>
> And of course the stuff requires PIL (Python Imaging Library), see:
> Home-page: http://www.pythonware.com/products/pil
> Download-URL: http://effbot.org/zone/pil-changes-116.htm
Re: ImSim: Image Similarity
I uploaded a new version of the subject with a VERY MINOR correction in it. Namely, in line #55:

    print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)

instead of

    print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)

I.e. I normalized it to base = 100.
Now the values of similarity can't be greater than 100 and can be treated as some regular percents (%%).

Also, due to this change, the *empirical* threshold of system alarmity moved down from number 70 to 20%.

bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   15.37
  bears1.jpg   19.13
    sky1.jpg   23.29
    sky2.jpg   23.45
     ff1.jpg   25.37
   lake1.jpg   26.43
  water1.jpg   26.93
     ff2.jpg   28.43
  roses1.jpg   31.95
  roses2.jpg   36.12
Done!
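A small worked check of where the 3600.0 can come from, assuming the structure shown in the refactored script elsewhere in this thread (a 300x300 working image, 5 row bands plus 5 column bands, 4 brightness levels). Each pixel is counted once in a row band and once in a column band, so the raw sum of absolute differences between two images cannot exceed 4 * 300 * 300 = 360000, and 360000 / 3600.0 = 100. The helper names are illustrative, and the two single-level images are a hypothetical extreme used only to reach the bound; whether real images can get there is not shown.

```python
W, H = 300, 300        # assumed working size
BANDS, LEVELS = 5, 4   # 5 row bands + 5 column bands, 4 brightness levels

def uniform_histograms(level):
    """Histograms of a hypothetical image whose every pixel has one level."""
    per_band = (W * H) // BANDS           # pixels landing in each band
    hist = [0] * LEVELS
    hist[level] = per_band
    # Row-band histograms followed by column-band histograms.
    return [list(hist) for _ in range(2 * BANDS)]

def raw_difference(a, b):
    """Sum of absolute differences over all histogram cells."""
    return sum(abs(x - y) for ha, hb in zip(a, b) for x, y in zip(ha, hb))

worst = raw_difference(uniform_histograms(0), uniform_histograms(3))
print(worst, worst / 3600.0)
```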
Re: ImSim: Image Similarity
n00m wrote:
> I uploaded a new version of the subject with a VERY MINOR correction in it. Namely, in line #55:
>
>     print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)
>
> instead of
>
>     print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)
>
> I.e. I normalized it to base = 100.
> Now the values of similarity can't be greater than 100 and can be treated as some regular percents (%%).
>
> Also, due to this change, the *empirical* threshold of system alarmity moved down from number 70 to 20%.
>
> bears2.jpg
>
>   bears2.jpg    0.00
>   bears3.jpg   15.37
>   bears1.jpg   19.13
>     sky1.jpg   23.29
>     sky2.jpg   23.45
>      ff1.jpg   25.37
>    lake1.jpg   26.43
>   water1.jpg   26.93
>      ff2.jpg   28.43
>   roses1.jpg   31.95
>   roses2.jpg   36.12

I'd like to see a *lot* more structure in there, with modularization, so the internal functions could be used from another program. Once I'd figured out what it was doing, I had this:

from PIL import Image
from PIL import ImageStat

def row_column_histograms (file_name):
    '''Reduce the image to a 5x5 square of b/w brightness levels 0..3
    Return two brightness histograms across Y and X
    packed into a 10-item list of 4-item histograms.'''
    im = Image.open (file_name)
    im = im.convert ('L')       # convert to 8-bit b/w
    w, h = 300, 300
    im = im.resize ((w, h))
    imst = ImageStat.Stat (im)
    sr = imst.mean[0]           # average pixel level in layer 0
    sr_low, sr_mid, sr_high = (sr*2)/3, sr, (sr*4)/3
    def foo (t):
        if t < sr_low: return 0
        if t < sr_mid: return 1
        if t < sr_high: return 2
        return 3
    im = im.point (foo)         # reduce to brightness levels 0..3
    yhist = [[0]*4 for i in xrange(5)]
    xhist = [[0]*4 for i in xrange(5)]
    for y in xrange (h):
        for x in xrange (w):
            k = im.getpixel ((x, y))
            yhist[y / 60][k] += 1
            xhist[x / 60][k] += 1
    return yhist + xhist

def difference_ranks (test_histogram, sample_histograms):
    '''Return a list of difference ranks between the test histograms
    and each of the samples.'''
    result = [0]*len (sample_histograms)
    for k, s in enumerate (sample_histograms):  # for each image
        for i in xrange(10):                    # for each histogram slot
            for j in xrange(4):                 # for each brightness level
                result[k] += abs (s[i][j] - test_histogram[i][j])
    return result

if __name__ == '__main__':
    import getopt, sys
    opts, args = getopt.getopt (sys.argv[1:], '', [])
    if not args:
        args = [
            'bears1.jpg', 'bears2.jpg', 'bears3.jpg',
            'roses1.jpg', 'roses2.jpg',
            'ff1.jpg', 'ff2.jpg',
            'sky1.jpg', 'sky2.jpg',
            'water1.jpg', 'lake1.jpg',
        ]
        test_pic = 'bears2.jpg'
    else:
        test_pic, args = args[0], args[1:]

    z = [row_column_histograms (a) for a in args]
    test_z = row_column_histograms (test_pic)

    file_ranks = zip (difference_ranks (test_z, z), args)
    file_ranks.sort()

    print '%12s' % (test_pic,)
    print ''
    for r in file_ranks:
        print '%12s %7.2f' % (r[1], r[0] / 3600.0,)

(omitting a few comments that wrapped around.) The test-case still agrees with your archived version:

mwilson@tecumseth:~/sandbox/im_sim$ python image_rank.py bears2.jpg *.jpg
  bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   15.37
  bears1.jpg   19.20
    sky1.jpg   23.20
    sky2.jpg   23.37
     ff1.jpg   25.30
   lake1.jpg   26.38
  water1.jpg   26.98
     ff2.jpg   28.43
  roses1.jpg   32.01

I'd vaguely wanted to do something like this for a while, but I never dug far enough into PIL to even get started. An additional kind of ranking that takes colour into account would also be good -- that's the first one I never did.

Cheers, Mel.
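For readers on Python 3, a rough sketch of how the histogram and difference steps above might be ported. It is not Mel's code: to stay runnable without PIL it works on an already quantized square grid of levels 0..3, and the names `band_histograms` and `difference` are illustrative. The points that change in Python 3 are `//` for the band index (plain `/` would give a float), `range` instead of `xrange`, `print` as a function, and `sorted(zip(...))` in place of calling `.sort()` on the result of `zip`.

```python
def band_histograms(levels, bands=5, n_levels=4):
    """Row-band and column-band histograms of a square grid of levels.

    Assumes the grid side is divisible by the number of bands.
    """
    size = len(levels)
    band = size // bands                      # integer band height/width
    yhist = [[0] * n_levels for _ in range(bands)]
    xhist = [[0] * n_levels for _ in range(bands)]
    for y, row in enumerate(levels):
        for x, k in enumerate(row):
            yhist[y // band][k] += 1          # count pixel in its row band
            xhist[x // band][k] += 1          # and in its column band
    return yhist + xhist

def difference(a, b):
    """Sum of absolute differences over all histogram cells."""
    return sum(abs(p - q) for ha, hb in zip(a, b) for p, q in zip(ha, hb))

# Made-up 10x10 test grids: dark top half vs. dark left half.
top_dark = [[0] * 10 if y < 5 else [3] * 10 for y in range(10)]
left_dark = [[0] * 5 + [3] * 5 for _ in range(10)]

score = difference(band_histograms(top_dark), band_histograms(left_dark))
print(score)
```

The two grids contain the same number of dark and bright pixels, so a single global histogram would not tell them apart; the row and column bands do.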
Re: ImSim: Image Similarity
On Mar 5, 7:10 pm, Mel mwil...@the-wire.com wrote:
> I'd like to see a *lot* more structure in there, with modularization, so the internal functions could be used from another program. Once I'd figured out what it was doing, I had this:
>
> [refactored script and test output snipped]
>
> I'd vaguely wanted to do something like this for a while, but I never dug far enough into PIL to even get started. An additional kind of ranking that takes colour into account would also be good -- that's the first one I never did.
>
> Cheers, Mel.

Very nice, Mel.

As for using color info... my current strong opinion is: the colors must be forgot for good. Paradoxically but profound elaboration and detailization can/will spoil/undermine the whole thing. Just my current imo.

===
Vitali
Re: ImSim: Image Similarity
On Sat, 2011-03-05, Grigory Javadyan wrote:
> At least you could've tried to make the script more usable by adding the possibility to supply command line arguments, instead of editing the source every time you want to compare a couple of images.
>
> On Sat, Mar 5, 2011 at 11:23 AM, n00m n...@narod.ru wrote:
>> Let me present my newborn project (in Python) ImSim:
>> http://sourceforge.net/projects/imsim/
>>
>> Its README.txt:
>> ---
>> ImSim is a python script for finding the most similar pic(s) to a given one among a set/list/db of your pics. The script is very short and very easy to follow and understand. Its sample output looks like this:
>> ...
>> The *less* numeric value -- the *more similar* this pic is to the tested pic. If this value > 70 almost for sure these pictures are absolutely different (from totally different domains, so to speak).
>>
>> What is similarity and how can/could/should it be estimated -- this point I'm leaving for your consideration/contemplation/arguing etc.

So basically you're saying you won't tell the users what the program *does*. I don't get that.

Is it better than this?
- scale each image to 100x100
- go black and white in such a way that half the pixels are black
- XOR the images and count the mismatches

That takes care of JPEG quality, scaling and possibly gamma correction, but not cropping or rotation. I'm sure there are better, well-known algorithms.

/Jorgen

--
// Jorgen Grahn grahn@ Oo o. . .
\X/ snipabacken.se O o .
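The three steps Jorgen lists can be sketched in plain Python roughly as follows. This is an illustration under assumptions, not anyone's actual implementation: the scaling step is taken as already done (with PIL one would typically use `convert('L')` and `resize((100, 100))` and read the values out with `getdata()`), each image is a flat list of grayscale values of equal length, and the median is used as the "half the pixels are black" cutoff. The names `binarize` and `mismatches` are hypothetical.

```python
def binarize(pixels):
    """Map grayscale values to 0/1 so that about half the pixels are black (0)."""
    ranked = sorted(pixels)
    median = ranked[len(ranked) // 2]
    return [0 if p < median else 1 for p in pixels]

def mismatches(pixels_a, pixels_b):
    """XOR the two binarized images and count the differing pixels."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must be scaled to the same size first")
    bits_a, bits_b = binarize(pixels_a), binarize(pixels_b)
    return sum(a ^ b for a, b in zip(bits_a, bits_b))

# Made-up 8-pixel "images" for demonstration only.
original = [10, 200, 30, 180, 90, 120, 60, 150]
brighter = [p + 40 for p in original]    # uniform brightness shift
inverted = [255 - p for p in original]   # photographic negative

print(mismatches(original, brighter), mismatches(original, inverted))
```

Because the cutoff is relative to each image's own median, any change that keeps the brightness ordering of the pixels (a uniform shift, and plausibly a gamma curve) leaves the bit pattern alone, which fits the remark about gamma correction; cropping or rotation moves pixels around and is not handled.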
Re: ImSim: Image Similarity
> Is it better than this?
> - scale each image to 100x100
> - go black and white in such a way that half the pixels are black
> - XOR the images and count the mismatches

It's *much* better but I'm not *much* about to prove it.

> I'm sure there are better, well-known algorithms.

The best well-known algorithm is to hire a man with good eyesight to do the job of comparing, ranking and selecting the pictures.
Re: ImSim: Image Similarity
PS For some reason they don't update the link to the last version. It's _20110306, here:
http://sourceforge.net/projects/imsim/files/

I use Python 2.5 and PIL for Python 2.5.
Re: ImSim: Image Similarity
n00m wrote:
> As for using color info... my current strong opinion is: the colors must be forgot for good. Paradoxically but profound elaboration and detailization can/will spoil/undermine the whole thing. Just my current imo.

Yeah. I guess including color info cubes the complexity of the answer. Might be too complicated to know what to do with an answer like that.

Mel.
Re: ImSim: Image Similarity
On Mar 6, 6:10 am, Mel mwil...@the-wire.com wrote:
> n00m wrote:
>> As for using color info... my current strong opinion is: the colors must be forgot for good. Paradoxically but profound elaboration and detailization can/will spoil/undermine the whole thing. Just my current imo.
>
> Yeah. I guess including color info cubes the complexity of the answer. Might be too complicated to know what to do with an answer like that.
>
> Mel.

Uhmm, Mel. Totally agree with you.

+ I included roses1.jpg and roses2.jpg on purpose: the 1st one is a painting by Abbott Handerson Thayer, the 2nd is its copy by some obscure Russian painter. But it's of course a creative, revamped copy. In a strict sense they are 2 different images (look at their colors etc.); on the other hand, they are closely related to each other. Plus, we can't tell *in principle* what is the original and what is the copy, what colors are right/good and what colors are wrong/bad.
Re: ImSim: Image Similarity
http://www.nga.gov/search/index.shtm
http://deyoung.famsf.org/search-collections
etc

Seems they all offer search only by keywords and this kind.
What about to submit e.g. roses2.jpg (copy) and to find its original?
Assume we don't know its author neither its title
ImSim: Image Similarity
Let me present my newborn project (in Python) ImSim:
http://sourceforge.net/projects/imsim/

Its README.txt:
---
ImSim is a python script for finding the most similar pic(s) to a given one among a set/list/db of your pics. The script is very short and very easy to follow and understand. Its sample output looks like this:

bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   55.33
  bears1.jpg   68.87
    sky1.jpg   83.84
    sky2.jpg   84.41
     ff1.jpg   91.35
   lake1.jpg   95.14
  water1.jpg   96.94
     ff2.jpg  102.36
  roses1.jpg  115.02
  roses2.jpg  130.02
Done!

The *less* numeric value -- the *more similar* this pic is to the tested pic. If this value > 70 almost for sure these pictures are absolutely different (from totally different domains, so to speak).

What is similarity and how can/could/should it be estimated -- this point I'm leaving for your consideration/contemplation/arguing etc.

Several sample pics (*.jpg) are included into .zip.

And of course the stuff requires PIL (Python Imaging Library), see:
Home-page: http://www.pythonware.com/products/pil
Download-URL: http://effbot.org/zone/pil-changes-116.htm