The problem: I'm trying to find duplicate files in a database (Apple Aperture), and have found a checksum column that covers all the image files.

I've had a go at writing a handler to find the dupes and it does OK, but I wondered whether the bright sparks on the list have any advice on speeding it up...

The handler:

====================

  put the milliseconds into tt
  put ijwAPLIB_getAllChecksums() into tList -- this returns the list of checksums, 10k in my sample DB, over 40k in the 'real' DB
  put number of lines of tList into tNumLines
  sort tList -- sorting puts identical checksums on adjacent lines
  put empty into tRet
  put 0 into x
  repeat tNumLines times
    add 1 to x
    if last char of x is 1 then set the cursor to busy -- removing this speeds it up by roughly 10%
    put line x of tList into tCheck
    if tCheck is empty then next repeat
    put x + 1 into y
    -- compare the following lines until the checksum changes
    repeat (tNumLines - x) times
      put line y of tList into tOther
      if tCheck is tOther then
        put x & tab & y & tab & tCheck & return after tRet
      else
        put y - 1 into x -- the outer loop adds 1, so line y is checked next rather than skipped
        exit repeat
      end if
      add 1 to y
    end repeat
  end repeat
  put the milliseconds - tt & return & "number of files:" && tNumLines & return & return & tRet

====================

Sample results:

9804
number of files: 8708

116     117     027351c1bed597af774536af8e982363
119     120     0292d175c04d790f50246a5ee043a599
162     163     03d6313ee21a91ed0b0343f339c583e4
185     186     046ddab379a8f44955f1d5605c294605
230     231     05a77db5e76eb02f8d439e13286d3620
245     246     065474aa9bba7e2f24c7435863f5f2ff
314     315     0884f4b24b5bd99ddefdb100fde58a31
333     334     0918ce2135933d6c8f0ee2860837b5f9
360     361     0a2525bef1a46a329b7e902981ef94e2
360     362     0a2525bef1a46a329b7e902981ef94e2
360     363     0a2525bef1a46a329b7e902981ef94e2
360     364     0a2525bef1a46a329b7e902981ef94e2
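
One possible speedup, offered as an untested sketch rather than a definitive fix: each "line x of tList" has to count lines from the start of the list, so the nested loops above get slower as the list grows. Walking the sorted list once with "repeat for each line" and remembering the previous checksum avoids that. The handler name findDuplicateChecksums and its variable names are made up for illustration; the output format is the same as in the sample results.

====================

function findDuplicateChecksums pList
  -- pList: one checksum per line (hypothetical parameter name)
  sort lines of pList -- identical checksums end up on adjacent lines
  put 0 into tLineNum
  put 0 into tFirst
  put empty into tPrev
  put empty into tRet
  repeat for each line tCheck in pList
    add 1 to tLineNum
    if tCheck is not empty and tCheck is tPrev then
      -- same checksum as the first line of the current run: report the pair
      put tFirst & tab & tLineNum & tab & tCheck & return after tRet
    else
      -- a new checksum starts a new run
      put tCheck into tPrev
      put tLineNum into tFirst
    end if
  end repeat
  return tRet
end findDuplicateChecksums

====================

It could be called as: put findDuplicateChecksums(ijwAPLIB_getAllChecksums()) into tRet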

Ian