Re: near duplicates

2006-10-24 Thread Beto Siless
Hi Karl! I'm interested in near duplicate detection based on termFreqVectos. Now I'm comparing all documents with each other (calculating the angle)... Is there a way to avoid that? Thanks! Beto karl wettin wrote: 17 okt 2006 kl. 17.54 skrev Find Me: How to eliminate near duplicates

Re: near duplicates

2006-10-24 Thread Beto Siless
Hi Andrej! I'm taking a look to fuzzy signatures for near duplicate detection and and I have seen your TextProfileSignature. The question is: If I index the documents with their text signature, is there a way to filter near duplicates at search time without comparing each document with all