Hi Andrzej,

Thank you for your reply. Could you please confirm whether invertlinks supports more than
5 parts?

According to the tutorial (http://wiki.apache.org/nutch/NutchTutorial):

"Step-by-Step: Indexing

Before indexing we first invert all of the links, so that we may index incoming anchor text with the pages.

bin/nutch invertlinks crawl/linkdb crawl/segments  "


This step failed when we had more than 5 parts. Has anyone successfully executed invertlinks with more than 5 parts on 0.8? I would appreciate any confirmation. All I want to find out is whether there is a possible bug in invertlinks or whether something else is wrong. It all comes down to the invertlinks command.

Thank you again for your help.

Olive


From: Andrzej Bialecki <[EMAIL PROTECTED]>
Reply-To: [email protected]
To: [email protected]
Subject: Re: please help!! inverlinks not work properly with more than 5 input parts (0.8)
Date: Thu, 06 Apr 2006 14:43:29 +0200

Olive g wrote:
Hi gurus,

I posted questions on how to do incremental crawls on 0.8 a few days ago, and thank you all for your help. However, when I tried the workaround (see http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html), invertlinks crashed when there were more than 5 input parts.


You should understand very clearly that what you are doing is NOT supported and very non-standard. It might (or might not) have worked as a one-time workaround to get you out of trouble.

Nutch DOES support incremental crawling and indexing, and the way it does so is described in the tutorial (http://wiki.apache.org/nutch/NutchTutorial). Please follow the section of the tutorial on "Step-by-Step or Whole-web Crawling" - you will save yourself (and us) a lot of grief.

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com






_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
