Ken Krugler wrote:
common case. Thus it could be somewhat computationally expensive
(e.g. a winnowing a la
http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf).
Interesting paper, thanks for the pointer - I always wondered what
criteria to use to reduce the number of shingles, and this
winnowing is a simple enough recipe for creating page signatures.
I may be tempted to implement it ;)
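
For readers following along, here is a minimal sketch of the winnowing recipe discussed above, not anyone's actual implementation. It hashes every k-gram, slides a window of w hashes, and keeps the minimum of each window (the rightmost one on ties) as part of the signature. The function names `kgram_hashes` and `winnow`, the use of truncated MD5 instead of a rolling hash, and character-level k-grams are all assumptions made for illustration.

```python
import hashlib


def kgram_hashes(text, k):
    """Hash every k-character substring of text (hypothetical helper).

    Truncated MD5 is an assumption chosen for simplicity; a rolling
    hash would be cheaper on long pages.
    """
    hashes = []
    for i in range(len(text) - k + 1):
        digest = hashlib.md5(text[i:i + k].encode("utf-8")).digest()
        hashes.append(int.from_bytes(digest[:4], "big"))
    return hashes


def winnow(hashes, w):
    """Return (hash, position) fingerprints selected by winnowing.

    For each window of w consecutive hashes, pick the minimum; on ties
    pick the rightmost occurrence. A position is recorded only once.
    Inputs shorter than w yield no fingerprints.
    """
    fingerprints = []
    last_pos = -1
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # Offset of the rightmost occurrence of the minimum in this window.
        offset = w - 1 - window[::-1].index(smallest)
        pos = start + offset
        # Selected positions never move backwards, so comparing with the
        # previous one is enough to avoid duplicates.
        if pos != last_pos:
            fingerprints.append((smallest, pos))
            last_pos = pos
    return fingerprints


if __name__ == "__main__":
    page = "a do run run run, a do run run"
    signature = winnow(kgram_hashes(page, k=5), w=4)
    print(len(signature), "fingerprints selected")
```

Two pages could then be compared by the overlap of their fingerprint hash sets; the values of k and w above are arbitrary and would need tuning for real page content.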
I took a quick scan through the public code and didn't find
anything that looked appropriate for this. One more potentially
useful paper is here:
http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf
This URL looks similar to the one you mentioned before ... probably
a case of near-duplicate *chuckle* ...
Sorry about that - I can't really claim I was checking your manual
dedup support. The real URL is:
http://www1.cs.columbia.edu/~cs6998/final_reports/ca2269-report.pdf
-- Ken
--
Ken Krugler
Krugle, Inc.
+1 530-210-6378
"If you can't find it, you can't fix it"