Hi Vacuum

I hope the Nutch wiki will help you :)
http://wiki.apache.org/nutch/


Regards
/Jack

On 7/6/05, Vacuum Joe <[EMAIL PROTECTED]> wrote:
> Hello Nutch-gurus,
> 
> I have some very straightforward and yet totally
> newbie questions which I hope some kind person would
> answer.
> 
> First of all, what is a db?  It seems like I have to
> inject links into the db to get the process started.
> So the links are in the db, and then I run fetch on
> them.  That brings me to the next question: what's a
> segment?  I notice that it creates timestamped segment
> directories.  What's in them?  Does the running Nutch
> web application automatically pick up new segment
> files when they are added, or do I have to restart it?
> 
> I'm trying to figure this out because I want to get
> started with automated crawling, so I'll have one or
> two machines crawling all the time, and then have a
> cluster of web server machines.  I assume that the web
> server front-end machines need the segments and the
> crawlers need the db, but I'm not sure exactly what
> the functions of these are.
> 
> Thanks for your help and thanks for the awesome piece
> of software.  Hopefully as we do some work on it,
> we'll have some code to return to the source.
> 
> 
