Hi Andrzej,
Thank you for your reply. Could you please confirm whether invertlinks
supports more than 5 parts?
According to the tutorial (http://wiki.apache.org/nutch/NutchTutorial):
"Step-by-Step: Indexing
Before indexing we first invert all of the links, so that we may index
incoming anchor text with the pages.
bin/nutch invertlinks crawl/linkdb crawl/segments "
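For readers unfamiliar with the term, here is a minimal sketch of what
"inverting links" means conceptually. It is not Nutch's implementation
(Nutch does this as a job over segment data); the function name
invert_links and the example URLs are hypothetical, chosen only for
illustration. The input maps each page to its outgoing links with anchor
text, and the output maps each page to its incoming links with anchor text,
which is what lets the indexer attach incoming anchor text to a page.

```python
from collections import defaultdict


def invert_links(outlinks):
    """Turn {source: [(target, anchor), ...]} into {target: [(source, anchor), ...]}."""
    inlinks = defaultdict(list)
    for source, links in outlinks.items():
        for target, anchor in links:
            # Record the link under the page it points TO, not the page it comes from.
            inlinks[target].append((source, anchor))
    return dict(inlinks)


# Hypothetical crawl data: each page lists (target URL, anchor text) pairs.
outlinks = {
    "http://a.example/": [("http://c.example/", "C home")],
    "http://b.example/": [
        ("http://c.example/", "see C"),
        ("http://a.example/", "A"),
    ],
}

inlinks = invert_links(outlinks)
for target, sources in inlinks.items():
    print(target, sources)
```

A page that nothing links to (http://b.example/ here) simply has no entry
in the inverted map.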
The invertlinks step failed when we had more than 5 parts. Has anyone
successfully executed invertlinks with more than 5 parts on 0.8? I would
appreciate any confirmation. All I want to find out is whether there is a
possible bug in invertlinks or whether something else is wrong. It all comes
down to the invertlinks command.
Thank you again for your help.
Olive
From: Andrzej Bialecki <[EMAIL PROTECTED]>
Reply-To: [email protected]
To: [email protected]
Subject: Re: please help!! inverlinks not work properly with more than 5
input parts (0.8)
Date: Thu, 06 Apr 2006 14:43:29 +0200
Olive g wrote:
Hi gurus,
I posted questions on how to do incremental crawls on 0.8 a few days ago,
and thank you all for your help. However, when I tried the workaround (see
http://www.mail-archive.com/nutch-user%40lucene.apache.org/msg04111.html),
invertlinks crashed when there were more than 5 input parts.
You should understand very clearly that what you are doing is NOT supported
and very non-standard. It might (or might not) have worked as a one time
workaround to get you out of trouble.
Nutch DOES support incremental crawling and indexing, and the way it does
so is described in the tutorial (http://wiki.apache.org/nutch/NutchTutorial).
Please follow the section of the tutorial titled "Step-by-Step or Whole-web
Crawling" - you will save yourself (and us) a lot of grief.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general