Hi Vacuum,

I hope the Nutch wiki will help you :)
http://wiki.apache.org/nutch/

Regards
/Jack

On 7/6/05, Vacuum Joe <[EMAIL PROTECTED]> wrote:
> Hello Nutch-gurus,
>
> I have some very straightforward and yet totally
> newbie questions which I hope some kind person would
> answer.
>
> First of all, what is a db? It seems like I have to
> inject links into the db to get the process started.
> So the links are in the db, and then I run fetch on
> them. That brings me to the next question: what's a
> segment? I notice that it creates timestamped segment
> directories. What's in them? Does the running Nutch
> web application automatically pick up new segment
> files when they are added, or do I have to restart it?
>
> I'm trying to figure this out because I want to get
> started with automated crawling, so I'll have one or
> two machines crawling all the time, and then have a
> cluster of web server machines. I assume that the web
> server front-end machines need the segments and the
> crawlers need the db, but I'm not sure exactly what
> the functions of these are.
>
> Thanks for your help and thanks for the awesome piece
> of software. Hopefully as we do some work on it,
> we'll have some code to return to the source.