I'm not specifying the delay... just bin/nutch crawl urls -dir crawl.test -depth 10 >& crawl.log
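If no delay is set explicitly, the fetcher presumably falls back to the fetcher.server.delay default in conf/nutch-default.xml (5.0 seconds in the 0.7-era configs, if memory serves). A minimal sketch of overriding it in conf/nutch-site.xml, assuming that property name; the 2.0 value here is just an example:

  <!-- conf/nutch-site.xml: override the per-host politeness delay.
       fetcher.server.delay is the number of seconds the fetcher waits
       between successive requests to the same server. -->
  <property>
    <name>fetcher.server.delay</name>
    <value>2.0</value>
  </property>

At the default 5 seconds per request, a single-host crawl covers only about 720 pages per hour, so an 18-hour run works out to roughly 13,000 fetches; the politeness delay alone can explain most of the running time.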
On 7/27/05, ir <[EMAIL PROTECTED]> wrote:
> What is your delay between requests to the same server?
>
> On 7/27/05, thomas delnoij <[EMAIL PROTECTED]> wrote:
> > I don't think it is atypical, because I had similar effects with
> > crawl depth = 10.
> >
> > Rgrds, Thomas
> >
> > --- blackwater dev <[EMAIL PROTECTED]> wrote:
> > > Over a gig now, 18 hours running and still going... might just
> > > have to kill it unless this is typical.
> > >
> > > On 7/27/05, blackwater dev <[EMAIL PROTECTED]> wrote:
> > > > I am just curious how long it typically takes to crawl a site?
> > > > I am crawling server side, which I realize is big. I also
> > > > changed the urlfilter file to accept urls with ? so I could
> > > > grab all the stuff from the forums, and am doing a depth of 10,
> > > > but I started crawling around 9 last night and now, 13 hours
> > > > later, it is still going. The crawl directory is up to about
> > > > 627 meg. I can't imagine how long it would take if I tried to
> > > > crawl the web.
> > > >
> > > > Thanks!
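For reference, the urlfilter change mentioned in the quoted thread is presumably the query-string rule in conf/crawl-urlfilter.txt (the filter the crawl command reads in Nutch 0.7); a sketch, assuming the stock file:

  # Stock rule: skip URLs containing characters that usually mark
  # queries, session ids, etc. -- this is what drops forum URLs with '?':
  # -[?*!@=]
  # Relaxed rule: keep '?' and '=' so forum query-string URLs are
  # fetched, while still skipping '*', '!', and '@':
  -[*!@]

Note that letting '?' URLs through is exactly what makes forum crawls balloon: every thread view, sort order, and pagination variant becomes a distinct URL, which fits the 627 meg / 18 hour numbers above.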
