From a total noob standpoint (I just installed my first LAMP box over the 
last month), realizing that I needed to look in the runtime folder after 
extracting the tar.gz file was a HUGE step. 

Then we all run the crawl command at least to make sure things work.  The main 
tutorial was missing the [-solr] part of the crawl command line, which is what 
gets it to index.  It wasn't until someone helped me here and pointed me to the 
actual documentation that I found it.
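
For anyone else who hits this, here's roughly what the one-shot command looks like with the [-solr] part included (this is a sketch, run from runtime/local; the Solr URL, depth, and topN values are just placeholder examples, not from the tutorial):

```shell
# Crawl the seed URLs and push the results straight into Solr (Nutch 1.3)
# -depth and -topN control how many rounds / pages; tune to taste
bin/nutch crawl urls -dir crawl -depth 3 -topN 50 -solr http://localhost:8983/solr/
```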

Those were the 2 big things for me as a total noob; otherwise I'm really happy 
to have at least that part working.  Now, my stupid CentOS install only has 
libxml2 2.6.15 and I need 2.6.17 for PHP, and I'm a few revisions off on libcurl 
too.  I have NO idea how to go back and fix that.  Not sure if I should just 
try to upgrade to php53 and hope for the best or what.  But that's more of a 
Solr / PHP question than a Nutch question, I think.
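
For what it's worth, a few commands can at least confirm what's actually installed versus what PHP was built against (assuming a typical CentOS box with rpm and the PHP CLI; adjust package names as needed):

```shell
# What the package manager thinks is installed (CentOS/RHEL)
rpm -q libxml2 curl

# Which libxml version PHP itself was compiled against
php -r 'echo LIBXML_DOTTED_VERSION, "\n";'

# Which libcurl version the PHP curl extension sees
php -r '$v = curl_version(); echo $v["version"], "\n";'
```

If the rpm version and the PHP-reported version disagree, PHP was likely built against a different (bundled or older) copy of the library, which narrows down what actually needs upgrading.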


-----Original Message-----
From: Markus Jelsma [mailto:[email protected]] 
Sent: Monday, July 11, 2011 3:19 PM
To: [email protected]
Cc: lewis john mcgibbney
Subject: Re: Nutch Gotchas as of release 1.3

Well, now that I'm thinking of it: yes.

- there were three people (incl. myself) mentioning the problem described in 
NUTCH-1016;
- a few users don't seem to catch the part of the tutorial telling them to add 
their robot name to the config;
- missing crawl-urlfilter;
- mails about a missing solrUrl.

I think quite a few users still rely on the crawl command instead of running a 
script.
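
A rough sketch of what one iteration of such a script looks like with the individual commands, rather than the all-in-one crawl command (run from runtime/local; the paths and Solr URL are placeholders, not from this thread):

```shell
# Seed the crawldb, generate and fetch a segment, then index into Solr
bin/nutch inject crawl/crawldb urls
bin/nutch generate crawl/crawldb crawl/segments
SEGMENT=`ls -d crawl/segments/2* | tail -1`
bin/nutch fetch $SEGMENT
bin/nutch parse $SEGMENT
bin/nutch updatedb crawl/crawldb $SEGMENT
bin/nutch invertlinks crawl/linkdb -dir crawl/segments
bin/nutch solrindex http://localhost:8983/solr/ crawl/crawldb crawl/linkdb $SEGMENT
```

Running the steps separately makes it much easier to see which stage failed, which is probably why the script approach is recommended over the monolithic crawl command.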

> Hello list,
> 
> Do we have any suggestions we wish to discuss regarding the above?
> 
> thanks
