Run bin/nutch dedup segments dedup.tmp
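
The four URLs quoted below differ only in the domain (.org vs .com) and in the capitalization of the path (RootPages vs rootpages), so the index treats them as four separate pages even though the content is identical. As far as I understand it, dedup removes index entries that share either a URL or a content hash. The sketch below only illustrates the content-hash idea and is not Nutch's actual code: the class DedupSketch and its methods are hypothetical names, and keeping the first URL seen is an arbitrary choice made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of content-hash deduplication; not part of Nutch.
public class DedupSketch {

    // Returns the MD5 digest of the page content as a lowercase hex string.
    public static String md5Hex(String content) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            byte[] digest = md.digest(content.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder();
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required to be present on every Java platform.
            throw new IllegalStateException(e);
        }
    }

    // Takes a map of URL -> page content and keeps only the first URL
    // seen for each distinct content hash (assumption: first one wins).
    public static List<String> keepOnePerContent(Map<String, String> pages) {
        Map<String, String> firstUrlByHash = new LinkedHashMap<>();
        for (Map.Entry<String, String> e : pages.entrySet()) {
            firstUrlByHash.putIfAbsent(md5Hex(e.getValue()), e.getKey());
        }
        return new ArrayList<>(firstUrlByHash.values());
    }

    public static void main(String[] args) {
        // Placeholder body standing in for the identical page content.
        String body = "ArGo Software Design Homepage";
        Map<String, String> pages = new LinkedHashMap<>();
        pages.put("http://www.argosoft.org/RootPages/Default.aspx", body);
        pages.put("http://www.argosoft.com/rootpages/Default.aspx", body);
        pages.put("http://www.argosoft.com/RootPages/Default.aspx", body);
        pages.put("http://www.argosoft.org/rootpages/Default.aspx", body);
        System.out.println(keepOnePerContent(pages));
    }
}
```

With the four URLs from the quoted search result all mapped to the same body, only one URL survives. A URL-only comparison would not catch these, because the strings differ.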

Dima Mazmanov wrote:
Hi all!! I'm running nutch-0.7.1.

Here is the result of my search.

ArGo Software Design Homepage [html] - 30.2 k - ... Look of our Web Site Our web site has new look and ... link on the ...
http://www.argosoft.org/RootPages/Default.aspx (Cached)

ArGo Software Design Homepage [html] - 30.2 k - ... Look of our Web Site Our web site has new look and ... link on the ...
http://www.argosoft.com/rootpages/Default.aspx (Cached)

ArGo Software Design Homepage [html] - 30.2 k - ... Look of our Web Site Our web site has new look and ... link on the ...
http://www.argosoft.com/RootPages/Default.aspx (Cached)

ArGo Software Design Homepage [html] - 30.2 k - ... Look of our Web Site Our web site has new look and ... link on the ...
http://www.argosoft.org/rootpages/Default.aspx (Cached)

As you can see, one result is shown multiple times.
Why is that? What is the difference between these links? I don't see any.
So, how can I avoid this problem?
Thanks, Regards, Dima
