BELLINI ADAM wrote:

Hi,

Dedup doesn't work for me.
I have read that duplicates have either the same contents (via MD5 hash) or
the same URL.
In my case I don't have the same URLs, but the content behind those URLs is
the same.
Here is an example: I have three URLs that have the same content:

1- www.domaine/folder/
2- www.domaine/folder/index.html
3- www.domaine/folder/index.html?lang=fr

But I find all of them in my index :(
I was expecting dedup to delete 1 and 2.
Dedup doesn't work correctly!

Please check the value of the Signature field for all the above urls in your crawldb.
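
As a rough illustration of why the Signature values matter (a sketch only, not Nutch's actual code): if the signature is an MD5 hash of the page content, a single differing byte gives a different signature, and the pages are then not treated as duplicates. The function names and the page contents below are hypothetical.

```python
import hashlib


def content_signature(content: bytes) -> str:
    # Hypothetical stand-in for a content signature: MD5 over the raw bytes.
    return hashlib.md5(content).hexdigest()


def group_by_signature(pages: dict[str, bytes]) -> dict[str, list[str]]:
    # URLs that share a signature are duplicate candidates.
    groups: dict[str, list[str]] = {}
    for url, content in pages.items():
        groups.setdefault(content_signature(content), []).append(url)
    return groups


# Made-up contents: the third page differs only by a trailing newline.
pages = {
    "www.domaine/folder/": b"<html><body>Hello</body></html>",
    "www.domaine/folder/index.html": b"<html><body>Hello</body></html>",
    "www.domaine/folder/index.html?lang=fr": b"<html><body>Hello</body></html>\n",
}

groups = group_by_signature(pages)
for signature, urls in groups.items():
    print(signature, urls)
```

If the three URLs show different signatures in the crawldb, the pages are not byte-identical (for example a language marker or a timestamp in the page), which would explain why none of them is removed.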

--
Best regards,
Andrzej Bialecki     <><
 ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com
