Re: ImSim: Image Similarity

2011-03-08 Thread Ian Kelly
On Mon, Mar 7, 2011 at 6:30 AM, n00m n...@narod.ru wrote:
 Reminds me of this piece of humor:

 A man enters a lift cabin on the 1st floor,
 the lift goes to the 3rd floor, opens and ... it's empty!
 A Physicist, a Chemist and a Mathematician were asked:
 what happened to the man?

 Physicist: he was squashed to the floor by acceleration!

 Chemist: he was vaporized by some acid gases!

 Mathematician: hmm... Let's call a lift *empty* if
 there is no more than *1* man inside of it.

That reminds me of another joke I once heard.

A logician, a biologist, and a mathematician are watching an empty house.
They observe two people going into the house, and then some time later
three people emerge.

The logician observes, "Our premise must have been incorrect.  The
house was not initially empty."

The biologist speculates, "Perhaps the two people whom we saw go into
the house have reproduced while inside, resulting in the third."

Finally, the mathematician concludes, "I don't know about that, but I
do know that if one more person goes into that house, it will be
empty!"

This has nothing to do with Python, though.
-- 
http://mail.python.org/mailman/listinfo/python-list


Re: ImSim: Image Similarity

2011-03-07 Thread n00m
On Mar 6, 7:54 pm, n00m n...@narod.ru wrote:
 If someone encounters 2 apparently unrelated pics
 for which ImSim nevertheless gives a mutual diff. value
 *** less than 20% ***, please email them to me.

Never mind, people.
I've found such a pair of images in my .zipped project.
It's sky1.jpg and lake1.jpg, with sim. value < 15%.

    sky1.jpg

    sky1.jpg    0.00
    sky2.jpg    0.77
   lake1.jpg   14.28 <-
  bears2.jpg   23.29
  bears3.jpg   26.60
  roses2.jpg   29.41
  roses1.jpg   31.36
     ff1.jpg   33.47
  bears1.jpg   36.60
     ff2.jpg   39.52
  water1.jpg   40.11

But a funny thing happens.
At first glance it's a false positive: some modern South East
Asian town and a lake somewhere in Russia, more than 100 years
ago. Nothing similar in them?

Yet in both pics we see:
-- a lot of water in the foreground;
-- a lot of blue sky at sunny mid-day;
-- a few light white clouds in the sky.

In short,
one can speculate about the notion of similarity just endlessly.


Re: ImSim: Image Similarity

2011-03-07 Thread Grigory Javadyan
Just admit that your algorithm doesn't work that well already :-)
Or give a solid formal definition of similarity and prove that your
algo works with that definition.

On Mon, Mar 7, 2011 at 4:22 PM, n00m n...@narod.ru wrote:

 In short,
 the notion of similarity can be speculated about just endlessly.



Re: ImSim: Image Similarity

2011-03-07 Thread n00m
On Mar 7, 2:54 pm, Grigory Javadyan grigory.javad...@gmail.com
wrote:
 Just admit that your algorithm doesn't work that well already :-)
 Or give a solid formal definition of similarity and prove that your
 algo works with that definition.

 On Mon, Mar 7, 2011 at 4:22 PM, n00m n...@narod.ru wrote:

  In short,
  the notion of similarity can be speculated about just endlessly.




Reminds me of this piece of humor:

A man enters a lift cabin on the 1st floor,
the lift goes to the 3rd floor, opens and ... it's empty!
A Physicist, a Chemist and a Mathematician were asked:
what happened to the man?

Physicist: he was squashed to the floor by acceleration!

Chemist: he was vaporized by some acid gases!

Mathematician: hmm... Let's call a lift *empty* if
there is no more than *1* man inside of it.



Re: ImSim: Image Similarity

2011-03-07 Thread n00m
So, my current very strict definition of similarity is:

---
2 pics are similar if my script gives a value < 20% for them,
otherwise the pics are not similar.
---

It remains to study the possible transitivity of similarity.
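
A small sketch of why transitivity cannot be taken for granted with a cut-off
definition like this one. It assumes the difference value behaves like a sum of
absolute differences (as in the SQL-style formula elsewhere in this thread).
The three vectors and the names `l1_diff`, `similar` and `THRESHOLD` are made
up for illustration and are not taken from the ImSim script.

```python
THRESHOLD = 20.0  # the "< 20%" cut-off from the definition above


def l1_diff(u, v):
    """Sum of absolute differences between two equal-length vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def similar(u, v):
    """Thresholded similarity: True when the difference is below THRESHOLD."""
    return l1_diff(u, v) < THRESHOLD


# Hypothetical feature vectors for three pictures A, B and C.
pic_a = [0.0, 0.0]
pic_b = [15.0, 0.0]
pic_c = [30.0, 0.0]

print(similar(pic_a, pic_b))  # True:  diff is 15
print(similar(pic_b, pic_c))  # True:  diff is 15
print(similar(pic_a, pic_c))  # False: diff is 30
```

So A is similar to B and B to C, yet A is not similar to C. What a sum of
absolute differences does guarantee, by the triangle inequality, is only that
diff(A, C) stays below 40 in such a case.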


==
LOL


Re: ImSim: Image Similarity

2011-03-07 Thread Mel
n00m wrote:

 But funny thing takes place.
 At first thought it's a false-positive: some modern South East
 Asian town and a lake somewhere in Russia, more than 100 years
 ago. Nothing similar in them?
 
 On both pics we see:
 -- a lot of water on foreground;
 -- a lot of blue sky at sunny mid-day;
 -- a bit of light white clouds in the sky;
 
 In short,
 the notion of similarity can be speculated about just endlessly.

Exactly.  That's the kind of similarity I would call valid.  That's what my 
algorithms, if I ever finished writing any, would be looking for.

Mel.



Re: ImSim: Image Similarity

2011-03-07 Thread n00m
@all, just in case:
also see my TiRG project (since 2011-01-31):
http://sourceforge.net/projects/tirg/
It's for detecting and localizing text areas in raster graphics.
Among its files there is a Python script -- fully working.
Feel free to do whatever you like with it -- it's in the public domain.

And again, this theme is a rich field for speculation
over the question "What on earth is text?"
Are 50 letters 'o' in a row some text, or just a part of
some fancy Indian ornament? And so on, etc.

PS
Don't confuse it with OCR. They're 2 different beasts.
===


Re: ImSim: Image Similarity

2011-03-06 Thread n00m
Obviously, if we used it in practice (in a web museum?),
all pics' matrices should be precalculated only once and
stored in a table with forty fields v00 ... v93, like:

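---------------------------------------------
pic_title     v00    v01    v02   ...   v93
---------------------------------------------
bears2.jpg   1234   4534   8922   ...   333
...
...
---------------------------------------------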

Then the SQL query will look like this:

select top 3 pic_title from table
order by
abs(v00 - w[0][0]) +
abs(v01 - w[0][1]) +
... +
abs(v93 - w[9][3])

Here w[][] is the matrix of a newly entered picture.
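
A minimal sketch of that lookup, using Python's built-in sqlite3 module and an
in-memory database. The table name `pics`, the helper names and the sample
values are assumptions made for illustration. SQLite has no `top 3`, so the
sketch uses `LIMIT` instead.

```python
import sqlite3

BANDS, LEVELS = 10, 4  # 10 x 4 matrix -> forty fields v00 ... v93
COLUMNS = ["v%d%d" % (i, j) for i in range(BANDS) for j in range(LEVELS)]


def make_db():
    """Create an in-memory table with one row of 40 integers per picture."""
    conn = sqlite3.connect(":memory:")
    cols = ", ".join("%s INTEGER" % c for c in COLUMNS)
    conn.execute("CREATE TABLE pics (pic_title TEXT, %s)" % cols)
    return conn


def flatten(w):
    """Turn the 10 x 4 matrix w[][] into a flat list in v00 ... v93 order."""
    return [w[i][j] for i in range(BANDS) for j in range(LEVELS)]


def add_pic(conn, title, w):
    """Store the precalculated matrix of one picture."""
    marks = ", ".join(["?"] * (len(COLUMNS) + 1))
    conn.execute("INSERT INTO pics VALUES (%s)" % marks, [title] + flatten(w))


def most_similar(conn, w, top=3):
    """Return the titles of the `top` pictures closest to matrix w."""
    distance = " + ".join("abs(%s - ?)" % c for c in COLUMNS)
    sql = "SELECT pic_title FROM pics ORDER BY %s LIMIT ?" % distance
    return [row[0] for row in conn.execute(sql, flatten(w) + [top])]


def constant_matrix(value):
    """Hypothetical test data: a matrix filled with a single value."""
    return [[value] * LEVELS for _ in range(BANDS)]


conn = make_db()
add_pic(conn, "a.jpg", constant_matrix(10))
add_pic(conn, "b.jpg", constant_matrix(20))
add_pic(conn, "c.jpg", constant_matrix(100))
print(most_similar(conn, constant_matrix(12), top=2))
```

The values of the new picture are passed as query parameters rather than
pasted into the SQL text, so the same statement works for any w[][].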


P.S.
If someone encounters 2 apparently unrelated pics
for which ImSim nevertheless gives a mutual diff. value
*** less than 20% ***, please email them to me.



Re: ImSim: Image Similarity

2011-03-06 Thread John Bokma
n00m n...@narod.ru writes:

 http://www.nga.gov/search/index.shtm
 http://deyoung.famsf.org/search-collections
 etc
 Seems they all offer search only by keywords and this kind.
 What about to submit e.g. roses2.jpg (copy) and to find its
 original? Assume we don't know its author neither its title

Title: TinEye, author: http://ideeinc.com/
Search: http://www.tineye.com/

Example: 
  http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

Notice how it finds modified images.

-- 
John Bokma   j3b

Blog: http://johnbokma.com/        Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/


Re: ImSim: Image Similarity

2011-03-06 Thread n00m
On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:
 n00m n...@narod.ru writes:
 http://www.nga.gov/search/index.shtm
 http://deyoung.famsf.org/search-collections
  etc
  Seems they all offer search only by keywords and this kind.
  What about to submit e.g. roses2.jpg (copy) and to find its
  original? Assume we don't know its author neither its title

 Title: TinEye, author:http://ideeinc.com/
 Search:http://www.tineye.com/

 Example:
  http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

 Notice how it finds modified images.

 --
 John Bokma                                                               j3b

 Blog:http://johnbokma.com/   Facebook:http://www.facebook.com/j.j.j.bokma
     Freelance Perl  Python Development:http://castleamber.com/


It's for kids.
Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
his message)


Re: ImSim: Image Similarity

2011-03-06 Thread n00m
On Mar 6, 10:17 pm, n00m n...@narod.ru wrote:
 On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:



  n00m n...@narod.ru writes:
  http://www.nga.gov/search/index.shtm
  http://deyoung.famsf.org/search-collections
   etc
   Seems they all offer search only by keywords and this kind.
   What about to submit e.g. roses2.jpg (copy) and to find its
   original? Assume we don't know its author neither its title

  Title: TinEye, author:http://ideeinc.com/
  Search:http://www.tineye.com/

  Example:
   http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

  Notice how it finds modified images.

  --
  John Bokma                                                               j3b

  Blog:http://johnbokma.com/  Facebook:http://www.facebook.com/j.j.j.bokma
      Freelance Perl  Python Development:http://castleamber.com/

 It's for kids.
 Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
 his message)


Even his algo would be overkill.
Comparing the metadata/EXIF of the image files will be enough in 99% of cases.


Re: ImSim: Image Similarity

2011-03-06 Thread John Bokma
n00m n...@narod.ru writes:

 On Mar 6, 10:17 pm, n00m n...@narod.ru wrote:
 On Mar 6, 8:55 pm, John Bokma j...@castleamber.com wrote:



  n00m n...@narod.ru writes:
  http://www.nga.gov/search/index.shtm
  http://deyoung.famsf.org/search-collections
   etc
   Seems they all offer search only by keywords and this kind.
   What about to submit e.g. roses2.jpg (copy) and to find its
   original? Assume we don't know its author neither its title

  Title: TinEye, author:http://ideeinc.com/
  Search:http://www.tineye.com/

  Example:
   http://www.tineye.com/search/2b3305135fa4c59311ed58b41da5d07f213e4d47/

  Notice how it finds modified images.

  --
  John Bokma                                                               
  j3b

  Blog:http://johnbokma.com/  Facebook:http://www.facebook.com/j.j.j.bokma
      Freelance Perl  Python Development:http://castleamber.com/

 It's for kids.
 Such trifles can easily be cracked by e.g. Jorgen Grahn's algo (see
 his message)


 Even his algo will be an overhead.
 Comparing meta-data/EXIF of image files will be enough in 99% cases.

Yes, yes, we get it. You're so much smarter (but not smart enough to not
quote a signature...). Anyway, I guess that's the reason big names use
tineye and not your algorithm...

-- 
John Bokma   j3b

Blog: http://johnbokma.com/        Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/


Re: ImSim: Image Similarity

2011-03-06 Thread n00m

As for proper quoting: I read/post to this group via my web browser,
and for me everything looks OK. I don't even quite understand what
exactly you mean by your remark. I'm not a facebookie/forumish/twitterish
thing.
Btw, I don't know what Twitter is. I need neither to know it nor
to use it. Oh... Pres. Medvedev knows what Twitter is and uses it.


Re: ImSim: Image Similarity

2011-03-06 Thread John Bokma
n00m n...@narod.ru writes:

 As for proper quoting: I read/post to this group via my web-browser.
 And for me everything looks OK. I don't even quite understand what
 exactly
 do you mean by your remark. I'm not a facebookie/forumish/twitterish
 thing.

Exactly. It's Usenet, something I've been using for, oh, just over 20
years now, and even then it was not new. You know, before the web thing
you're talking about...

-- 
John Bokma   j3b

Blog: http://johnbokma.com/        Facebook: http://www.facebook.com/j.j.j.bokma
Freelance Perl & Python Development: http://castleamber.com/


Re: ImSim: Image Similarity

2011-03-05 Thread Grigory Javadyan
At least you could've tried to make the script more usable by adding
the possibility to supply command line arguments, instead of editing
the source every time you want to compare a couple of images.
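
A minimal sketch of what such a command line could look like, using the
standard argparse module (which needs a newer Python than the 2.5 mentioned
elsewhere in this thread). The argument names `test_pic` and `samples` are
made up for illustration and do not come from the ImSim script.

```python
import argparse


def parse_args(argv=None):
    """Parse 'test picture' and 'pictures to rank' from the command line."""
    parser = argparse.ArgumentParser(
        description="Rank pictures by similarity to a test picture.")
    parser.add_argument("test_pic", help="picture to compare against")
    parser.add_argument("samples", nargs="+",
                        help="pictures to rank against test_pic")
    return parser.parse_args(argv)


if __name__ == "__main__":
    # Example: python imsim.py bears2.jpg bears1.jpg sky1.jpg
    options = parse_args()
    print(options.test_pic, options.samples)
```

Passing an explicit list to `parse_args` instead of relying on `sys.argv`
makes the function easy to try out from an interactive session.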

On Sat, Mar 5, 2011 at 11:23 AM, n00m n...@narod.ru wrote:
 Let me present my newborn project (in Python) ImSim:

 http://sourceforge.net/projects/imsim/

 Its README.txt:
 -
 ImSim is a python script for finding the most similar pic(s) to
 a given one among a set/list/db of your pics.
 The script is very short and very easy to follow and understand.
 Its sample output looks like this:

  bears2.jpg
 
  bears2.jpg    0.00
  bears3.jpg   55.33
  bears1.jpg   68.87
    sky1.jpg   83.84
    sky2.jpg   84.41
     ff1.jpg   91.35
   lake1.jpg   95.14
  water1.jpg   96.94
     ff2.jpg  102.36
  roses1.jpg  115.02
  roses2.jpg  130.02

 Done!

 The *less* numeric value -- the *more similar* this pic is to the
 tested pic. If this value > 70 almost for sure these pictures are
 absolutely different (from totally different domains, so to speak).

 What is similarity and how can/could/should it be estimated this
 point I'm leaving for your consideration/contemplation/arguing etc.

 Several sample pics (*.jpg) are included into .zip.
 And of course the stuff requires PIL (Python Imaging Library), see:
 Home-page: http://www.pythonware.com/products/pil
 Download-URL: http://effbot.org/zone/pil-changes-116.htm



Re: ImSim: Image Similarity

2011-03-05 Thread n00m

I uploaded a new version of ImSim with a
VERY MINOR correction in it. Namely, in line #55:

print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)

instead of

print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)

I.e. I normalized it to base = 100.
Now the similarity values can't be greater than 100
and can be treated as ordinary percentages (%).

Also, due to this change, the *empirical* alarm
threshold moved down from 70 to 20%.

  bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   15.37
  bears1.jpg   19.13
    sky1.jpg   23.29
    sky2.jpg   23.45
     ff1.jpg   25.37
   lake1.jpg   26.43
  water1.jpg   26.93
     ff2.jpg   28.43
  roses1.jpg   31.95
  roses2.jpg   36.12

Done!
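
A quick check of why dividing by 3600 gives a base of 100. It assumes the
script works the way Mel's refactoring elsewhere in this thread describes it:
a 300x300 image, 5 row bands plus 5 column bands, each with a 4-level
brightness histogram, compared by summing absolute differences. The variable
names are illustrative.

```python
SIDE = 300             # images are resized to SIDE x SIDE (assumption)
PIXELS = SIDE * SIDE   # 90000 pixels, each counted once per histogram set

# Within one histogram set (row bands or column bands) two images differ most
# when every pixel falls into a different brightness bin: each pixel is then
# counted once as "missing" in one bin and once as "extra" in another.
max_diff_per_set = 2 * PIXELS
max_raw_diff = 2 * max_diff_per_set   # row bands + column bands

old_scale = max_raw_diff * 0.001      # largest value under the old scaling
new_scale = max_raw_diff / 3600.0     # largest value under the new scaling

old_threshold = 70.0
# Same raw difference, expressed on the new scale.
new_threshold = old_threshold / 0.001 / 3600.0

print(max_raw_diff, old_scale, new_scale, round(new_threshold, 2))
```

Under these assumptions the largest possible raw difference is 360000, which
maps to 100 after dividing by 3600, and the old threshold of 70 corresponds to
roughly 19.4 on the new scale, in line with the rounded 20%.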



Re: ImSim: Image Similarity

2011-03-05 Thread Mel
n00m wrote:

 
 I uploaded a new version of the subject with a
 VERY MINOR correction in it. Namely, in line #55:
 
 print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)
 
 instead of
 
 print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)
 
 I.e. I normalized it to base = 100.
 Now the values of similarity can't be greater than 100
 and can be treated as some regular percents (%%).
 
 Also, due to this change, the *empirical* threshold of
 system alarmity moved down from number 70 to 20%.
 
   bears2.jpg
 
   bears2.jpg    0.00
   bears3.jpg   15.37
   bears1.jpg   19.13
     sky1.jpg   23.29
     sky2.jpg   23.45
      ff1.jpg   25.37
    lake1.jpg   26.43
   water1.jpg   26.93
      ff2.jpg   28.43
   roses1.jpg   31.95
   roses2.jpg   36.12

I'd like to see a *lot* more structure in there, with modularization, so the 
internal functions could be used from another program.  Once I'd figured out 
what it was doing, I had this:


from PIL import Image
from PIL import ImageStat

def row_column_histograms (file_name):
    '''Reduce the image to a 5x5 square of b/w brightness levels 0..3
    Return two brightness histograms across Y and X
    packed into a 10-item list of 4-item histograms.'''
    im = Image.open (file_name)
    im = im.convert ('L')       # convert to 8-bit b/w
    w, h = 300, 300
    im = im.resize ((w, h))
    imst = ImageStat.Stat (im)
    sr = imst.mean[0]   # average pixel level in layer 0
    sr_low, sr_mid, sr_high = (sr*2)/3, sr, (sr*4)/3
    def foo (t):
        if t < sr_low: return 0
        if t < sr_mid: return 1
        if t < sr_high: return 2
        return 3
    im = im.point (foo) # reduce to brightness levels 0..3
    yhist = [[0]*4 for i in xrange(5)]
    xhist = [[0]*4 for i in xrange(5)]
    for y in xrange (h):
        for x in xrange (w):
            k = im.getpixel ((x, y))
            yhist[y / 60][k] += 1
            xhist[x / 60][k] += 1
    return yhist + xhist


def difference_ranks (test_histogram, sample_histograms):
    '''Return a list of difference ranks between the test histograms and
    each of the samples.'''
    result = [0]*len (sample_histograms)
    for k, s in enumerate (sample_histograms):  # for each image
        for i in xrange(10):    # for each histogram slot
            for j in xrange(4): # for each brightness level
                result[k] += abs (s[i][j] - test_histogram[i][j])
    return result


if __name__ == '__main__':
    import getopt, sys
    opts, args = getopt.getopt (sys.argv[1:], '', [])
    if not args:
        args = [
            'bears1.jpg',
            'bears2.jpg',
            'bears3.jpg',
            'roses1.jpg',
            'roses2.jpg',
            'ff1.jpg',
            'ff2.jpg',
            'sky1.jpg',
            'sky2.jpg',
            'water1.jpg',
            'lake1.jpg',
        ]
        test_pic = 'bears2.jpg'
    else:
        test_pic, args = args[0], args[1:]

    z = [row_column_histograms (a) for a in args]
    test_z = row_column_histograms (test_pic)

    file_ranks = zip (difference_ranks (test_z, z), args)
    file_ranks.sort()

    print '%12s' % (test_pic,)
    print ''
    for r in file_ranks:
        print '%12s %7.2f' % (r[1], r[0] / 3600.0,)



(omitting a few comments that wrapped around.)  The test-case still agrees 
with your archived version:

mwilson@tecumseth:~/sandbox/im_sim$ python image_rank.py bears2.jpg *.jpg
  bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   15.37
  bears1.jpg   19.20
    sky1.jpg   23.20
    sky2.jpg   23.37
     ff1.jpg   25.30
   lake1.jpg   26.38
  water1.jpg   26.98
     ff2.jpg   28.43
  roses1.jpg   32.01


I'd vaguely wanted to do something like this for a while, but I never dug 
far enough into PIL to even get started.  An additional kind of ranking that 
takes colour into account would also be good -- that's the first one I never 
did.

Cheers, Mel.



Re: ImSim: Image Similarity

2011-03-05 Thread n00m
On Mar 5, 7:10 pm, Mel mwil...@the-wire.com wrote:
 n00m wrote:

  I uploaded a new version of the subject with a
  VERY MINOR correction in it. Namely, in line #55:

      print '%12s %7.2f' % (db[k][1], db[k][0] / 3600.0,)

  instead of

      print '%12s %7.2f' % (db[k][1], db[k][0] * 0.001,)

  I.e. I normalized it to base = 100.
  Now the values of similarity can't be greater than 100
  and can be treated as some regular percents (%%).

  Also, due to this change, the *empirical* threshold of
  system alarmity moved down from number 70 to 20%.

    bears2.jpg
  
    bears2.jpg    0.00
    bears3.jpg   15.37
    bears1.jpg   19.13
      sky1.jpg   23.29
      sky2.jpg   23.45
       ff1.jpg   25.37
     lake1.jpg   26.43
    water1.jpg   26.93
       ff2.jpg   28.43
    roses1.jpg   31.95
    roses2.jpg   36.12

 I'd like to see a *lot* more structure in there, with modularization, so the
 internal functions could be used from another program.  Once I'd figured out
 what it was doing, I had this:

 from PIL import Image
 from PIL import ImageStat

 def row_column_histograms (file_name):
     '''Reduce the image to a 5x5 square of b/w brightness levels 0..3
     Return two brightness histograms across Y and X
     packed into a 10-item list of 4-item histograms.'''
     im = Image.open (file_name)
     im = im.convert ('L')       # convert to 8-bit b/w
     w, h = 300, 300
     im = im.resize ((w, h))
     imst = ImageStat.Stat (im)
     sr = imst.mean[0]   # average pixel level in layer 0
     sr_low, sr_mid, sr_high = (sr*2)/3, sr, (sr*4)/3
     def foo (t):
         if t < sr_low: return 0
         if t < sr_mid: return 1
         if t < sr_high: return 2
         return 3
     im = im.point (foo) # reduce to brightness levels 0..3
     yhist = [[0]*4 for i in xrange(5)]
     xhist = [[0]*4 for i in xrange(5)]
     for y in xrange (h):
         for x in xrange (w):
             k = im.getpixel ((x, y))
             yhist[y / 60][k] += 1
             xhist[x / 60][k] += 1
     return yhist + xhist

 def difference_ranks (test_histogram, sample_histograms):
     '''Return a list of difference ranks between the test histograms and
 each of the samples.'''
     result = [0]*len (sample_histograms)
     for k, s in enumerate (sample_histograms):  # for each image
         for i in xrange(10):    # for each histogram slot
             for j in xrange(4): # for each brightness level
                 result[k] += abs (s[i][j] - test_histogram[i][j])      
     return result

 if __name__ == '__main__':
     import getopt, sys
     opts, args = getopt.getopt (sys.argv[1:], '', [])
     if not args:
         args = [
             'bears1.jpg',
             'bears2.jpg',
             'bears3.jpg',
             'roses1.jpg',
             'roses2.jpg',
             'ff1.jpg',
             'ff2.jpg',
             'sky1.jpg',
             'sky2.jpg',
             'water1.jpg',
             'lake1.jpg',
         ]
         test_pic = 'bears2.jpg'
     else:
         test_pic, args = args[0], args[1:]

     z = [row_column_histograms (a) for a in args]
     test_z = row_column_histograms (test_pic)

     file_ranks = zip (difference_ranks (test_z, z), args)      
     file_ranks.sort()

     print '%12s' % (test_pic,)
     print ''
     for r in file_ranks:
         print '%12s %7.2f' % (r[1], r[0] / 3600.0,)

 (omitting a few comments that wrapped around.)  The test-case still agrees
 with your archived version:

 mwilson@tecumseth:~/sandbox/im_sim$ python image_rank.py bears2.jpg *.jpg
   bears2.jpg
 
   bears2.jpg    0.00
   bears3.jpg   15.37
   bears1.jpg   19.20
     sky1.jpg   23.20
     sky2.jpg   23.37
      ff1.jpg   25.30
    lake1.jpg   26.38
   water1.jpg   26.98
      ff2.jpg   28.43
   roses1.jpg   32.01

 I'd vaguely wanted to do something like this for a while, but I never dug
 far enough into PIL to even get started.  An additional kind of ranking that
 takes colour into account would also be good -- that's the first one I never
 did.

         Cheers,         Mel.


Very nice, Mel.

As for using color info...
my current strong opinion is: the colors must be forgotten for good.
Paradoxically, deeper elaboration and more detail can/will
spoil/undermine the whole thing. Just my current opinion.


===
Vitali



Re: ImSim: Image Similarity

2011-03-05 Thread Jorgen Grahn
On Sat, 2011-03-05, Grigory Javadyan wrote:
 At least you could've tried to make the script more usable by adding
 the possibility to supply command line arguments, instead of editing
 the source every time you want to compare a couple of images.

 On Sat, Mar 5, 2011 at 11:23 AM, n00m n...@narod.ru wrote:
 Let me present my newborn project (in Python) ImSim:

 http://sourceforge.net/projects/imsim/

 Its README.txt:
 -
 ImSim is a python script for finding the most similar pic(s) to
 a given one among a set/list/db of your pics.
 The script is very short and very easy to follow and understand.
 Its sample output looks like this:
...
 The *less* numeric value -- the *more similar* this pic is to the
 tested pic. If this value > 70 almost for sure these pictures are
 absolutely different (from totally different domains, so to speak).

 What is similarity and how can/could/should it be estimated this
 point I'm leaving for your consideration/contemplation/arguing etc.

So basically you're saying you won't tell the users what the program
*does*. I don't get that.

Is it better than this?
- scale each image to 100x100
- go black&white in such a way that half the pixels are black
- XOR the images and count the mismatches

That takes care of JPEG quality, scaling and possibly gamma
correction, but not cropping or rotation. I'm sure there are better,
well-known algorithms.
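
The three steps above can be sketched as follows. To stay self-contained, the
sketch skips the scaling step and assumes each image has already been reduced
to a flat list of grayscale values of the same length (for example the pixels
of a resized 'L'-mode PIL image). The function names are made up for
illustration.

```python
def to_black_and_white(pixels):
    """Threshold grayscale values at their median.

    Values below the median become 0 (black), the rest 1 (white), so about
    half the pixels end up black. With many pixels equal to the median the
    split is only approximate.
    """
    ordered = sorted(pixels)
    median = ordered[len(ordered) // 2]
    return [0 if p < median else 1 for p in pixels]


def mismatches(pixels_a, pixels_b):
    """XOR the two black/white images and count the differing pixels."""
    if len(pixels_a) != len(pixels_b):
        raise ValueError("images must have the same number of pixels")
    bw_a = to_black_and_white(pixels_a)
    bw_b = to_black_and_white(pixels_b)
    return sum(a ^ b for a, b in zip(bw_a, bw_b))


# Hypothetical 2x2 "images" as flat lists of 8-bit grayscale values.
original = [10, 20, 200, 220]
brighter = [p + 30 for p in original]      # same picture, brightened
mirrored = list(reversed(original))        # dark and light halves swapped

print(mismatches(original, brighter))  # 0
print(mismatches(original, mirrored))  # 4
```

Because the threshold is the image's own median, any brightness change that
keeps the ordering of the pixel values leaves the black/white image intact,
which is why this approach tolerates gamma-like corrections.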

/Jorgen

-- 
  // Jorgen Grahn grahn@  Oo  o.   .  .
\X/ snipabacken.se   O  o   .


Re: ImSim: Image Similarity

2011-03-05 Thread n00m

 Is it better than this?
 - scale each image to 100x100
 - go black&white in such a way that half the pixels are black
 - XOR the images and count the mismatches


It's *much* better but I'm not *much* about to prove it.



 I'm sure there are better,
 well-known algorithms.


The best well-known algorithm is to hire a man with good eyesight
to do the job of comparing, ranking and selecting the pictures.


Re: ImSim: Image Similarity

2011-03-05 Thread n00m

PS

For some reason they don't update the link to the latest version.

It's _20110306, here: http://sourceforge.net/projects/imsim/files/

I use Python 2.5 & PIL for Python 2.5



Re: ImSim: Image Similarity

2011-03-05 Thread Mel
n00m wrote:

 As for using color info...
 my current strong opinion is: the colors must be forgot for good.
 Paradoxically but profound elaboration and detailization can/will
 spoil/undermine the whole thing. Just my current imo.

Yeah.  I guess including color info cubes the complexity of the answer.  
Might be too complicated to know what to do with an answer like that.

Mel.



Re: ImSim: Image Similarity

2011-03-05 Thread n00m
On Mar 6, 6:10 am, Mel mwil...@the-wire.com wrote:
 n00m wrote:
  As for using color info...
  my current strong opinion is: the colors must be forgot for good.
  Paradoxically but profound elaboration and detailization can/will
  spoil/undermine the whole thing. Just my current imo.

 Yeah.  I guess including color info cubes the complexity of the answer.  
 Might be too complicated to know what to do with an answer like that.

         Mel.

Uhmm, Mel. Totally agree with you.
+
I included roses1.jpg & roses2.jpg on purpose:
the 1st one is a painting by Abbott Handerson Thayer,
the 2nd is its copy by some obscure Russian painter.
But it's of course a creative & revamped copy.

In the strict sense they are 2 different images (look at their colors etc.);
on the other hand they are closely related to each other.
Plus, we can't tell *in principle* which is the original and which is the copy,
which colors are right/good and which colors are wrong/bad.



Re: ImSim: Image Similarity

2011-03-05 Thread n00m
http://www.nga.gov/search/index.shtm
http://deyoung.famsf.org/search-collections
etc.
It seems they all offer search only by keywords and the like.
What about submitting e.g. roses2.jpg (the copy) to find its
original? Assume we know neither its author nor its title.


ImSim: Image Similarity

2011-03-04 Thread n00m
Let me present my newborn project (in Python) ImSim:

http://sourceforge.net/projects/imsim/

Its README.txt:
-
ImSim is a python script for finding the most similar pic(s) to
a given one among a set/list/db of your pics.
The script is very short and very easy to follow and understand.
Its sample output looks like this:

  bears2.jpg

  bears2.jpg    0.00
  bears3.jpg   55.33
  bears1.jpg   68.87
    sky1.jpg   83.84
    sky2.jpg   84.41
     ff1.jpg   91.35
   lake1.jpg   95.14
  water1.jpg   96.94
     ff2.jpg  102.36
  roses1.jpg  115.02
  roses2.jpg  130.02

Done!

The *lower* the numeric value, the *more similar* this pic is to the
tested pic. If this value > 70, almost for sure these pictures are
absolutely different (from totally different domains, so to speak).

What similarity is, and how it can/could/should be estimated:
this point I'm leaving for your consideration/contemplation/arguing etc.

Several sample pics (*.jpg) are included in the .zip.
And of course the stuff requires PIL (Python Imaging Library), see:
Home-page: http://www.pythonware.com/products/pil
Download-URL: http://effbot.org/zone/pil-changes-116.htm
