Yes Cam, if you use a depth of 1 you will crawl only the first document. With a depth of 2 you will crawl the first document and all the links found in that document. With a depth of 3, you will crawl the first one, its links, and all the links found in cycle 2. And so on. Increasing your depth will increase the size of your WebDB too. Try it ;)
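
To make the cycles concrete, here is a minimal Python sketch of the idea. It is not Nutch code: the `crawl` function, the `LINKS` graph and the file names beyond file01.html are made up for illustration, and it only models that each cycle fetches the pages discovered in the previous one.

```python
# Simplified model of depth-limited crawling (an illustration, not Nutch itself).
# LINKS is a hypothetical link graph: page -> pages it links to.
LINKS = {
    "http://localhost/file01.html": [
        "http://localhost/file02.html",
        "http://localhost/file03.html",
    ],
    "http://localhost/file02.html": ["http://localhost/file04.html"],
    "http://localhost/file04.html": ["http://localhost/file05.html"],
}


def crawl(seeds, links, depth):
    """Return the pages fetched after `depth` cycles, in fetch order."""
    fetched = []
    seen = set(seeds)        # every URL discovered so far
    frontier = list(seeds)   # URLs to fetch in the current cycle
    for _ in range(depth):
        next_frontier = []
        for url in frontier:
            fetched.append(url)
            # Outlinks found now are only fetched in the next cycle.
            for out in links.get(url, []):
                if out not in seen:
                    seen.add(out)
                    next_frontier.append(out)
        frontier = next_frontier
        if not frontier:
            break  # nothing new was discovered, so stop early
    return fetched


for d in (1, 2, 3):
    print(d, len(crawl(["http://localhost/file01.html"], LINKS, d)))
```

In this toy graph a depth of 1 fetches only file01.html, a depth of 2 adds file02.html and file03.html, and a depth of 3 also reaches file04.html. If your urls file already lists every page, a low depth may be enough; if the pages are only reachable through links, you need more cycles.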
Regards

On 8/31/06, Sandy Polanski <[EMAIL PROTECTED]> wrote:
Cam, try increasing the depth and see what happens. Logic would seem to say that the files are all on the same directory depth/level; however, just give it a try, because I ran into a similar problem and, if I'm not mistaken, that fixed it.

--- Cam Bazz <[EMAIL PROTECTED]> wrote:
> Hello,
>
> I have a problem. I tried to index some local files with nutch.
>
> What I have done is put them (html files) on a local apache server
> and create a urls file that contains http://localhost/file01.html etc.
>
> then I do a nutch crawl urls . -dir crawl -depth 1
>
> but the crawl stalls after a while, and nothing happens.
>
> I also tried -topN 10000
>
> Is there not a more convenient way of indexing from the file system?
>
> Best regards,
> -C.B.
--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]
