Ken Krugler wrote:

common case. Thus it could be somewhat computationally expensive (e.g. a winnowing à la http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf).

Interesting paper, thanks for the pointer - I always wondered what criteria to use to reduce the number of shingles, and this winnowing is a simple enough recipe for creating page signatures. I may be tempted to implement it ;)
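
For the record, here is roughly how I read the winnowing recipe, as a minimal sketch and not code taken from the paper or from any existing project. Hash every k-gram of the normalized text, slide a window of w consecutive hashes, and keep the minimum of each window (the rightmost one on ties). The names winnow, signature and resemblance, the values k=5 and w=4, and the use of zlib.crc32 in place of a rolling hash are all my own assumptions, chosen only for illustration.

```python
import zlib


def winnow(text, k=5, w=4):
    """Return a set of (hash, position) fingerprints picked by winnowing.

    k is the k-gram length and w the window size; both defaults are
    illustrative assumptions, not recommended values.
    """
    # Normalize: drop whitespace and punctuation, lowercase the rest.
    s = "".join(c.lower() for c in text if c.isalnum())
    # Hash every k-gram. crc32 is a deterministic stand-in for the
    # rolling hash a real implementation would probably use.
    hashes = [zlib.crc32(s[i:i + k].encode("utf-8"))
              for i in range(len(s) - k + 1)]
    fingerprints = set()
    # Texts with fewer than w k-grams produce no windows, hence no fingerprints.
    for start in range(len(hashes) - w + 1):
        window = hashes[start:start + w]
        smallest = min(window)
        # On ties, take the rightmost occurrence of the minimum.
        offset = w - 1 - window[::-1].index(smallest)
        fingerprints.add((smallest, start + offset))
    return fingerprints


def signature(text, k=5, w=4):
    """Page signature: the selected hash values, ignoring positions."""
    return {h for h, _ in winnow(text, k, w)}


def resemblance(a, b, k=5, w=4):
    """Jaccard overlap of two signatures (0.0 if both are empty)."""
    sa, sb = signature(a, k, w), signature(b, k, w)
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0
```

Two pages would then be compared with resemblance(page_a, page_b), and the threshold above which they count as near-duplicates would have to be tuned on real data.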

I took a quick scan through the public code and didn't find anything that looked appropriate for this. One more potentially useful paper is here:

http://theory.stanford.edu/~aiken/publications/papers/sigmod03.pdf

This URL looks similar to the one you mentioned before ... probably a case of near-duplicate *chuckle* ...


--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
