Hi all!

I am having a problem updating my WebDB. I have a page, test.htm, that has 4
links to 4 PDF documents. I run the crawler, and then when I execute this
command:

bin/nutch readdb Mydir/db -stats

I get this output:

Number of pages: 5
Number of links: 4

That is fine. The problem is when I add 4 more links to test.htm. I want a
script that re-crawls or updates my WebDB without my having to delete the
Mydir folder. I hope I am being clear.
I found some shell scripts that are supposed to do this, but they do not do
what I want: I always get the same number of pages and links.
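The scripts I found look roughly like the sketch below. The paths and the
-adddays value are my own guesses, and the command set assumes a Nutch
0.7-style installation (generate / fetch / updatedb), so please correct me if
any invocation is wrong:

```shell
#!/bin/sh
# Rough sketch of a recrawl script (hypothetical paths; assumes
# Nutch 0.7-style commands: generate / fetch / updatedb / readdb).

DB=Mydir/db              # the WebDB created by the first crawl
SEGMENTS=Mydir/segments  # where fetch segments live
ADDDAYS=31               # age pages so recently fetched ones are due again

if [ ! -x bin/nutch ]; then
    # not inside a Nutch installation; nothing to do
    echo "bin/nutch not found: run this from the Nutch install directory" >&2
else
    # select the pages due for fetching; without -adddays, pages fetched
    # recently may be skipped because their refetch interval has not expired
    bin/nutch generate "$DB" "$SEGMENTS" -adddays "$ADDDAYS"

    # generate creates a new timestamped segment; pick the newest one
    segment=$(ls -d "$SEGMENTS"/* | tail -1)

    bin/nutch fetch "$segment"           # fetch the selected pages
    bin/nutch updatedb "$DB" "$segment"  # merge results back into the WebDB
    bin/nutch readdb "$DB" -stats        # check whether the counts changed
fi
```

My understanding is that generate only selects pages whose refetch interval
has expired, which might be why my counts never change, but I am not sure
that is the real cause.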

Can anyone help me?

--
Lourival Junior
Universidade Federal do Pará
Curso de Bacharelado em Sistemas de Informação
http://www.ufpa.br/cbsi
Msn: [EMAIL PROTECTED]
_______________________________________________
Nutch-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-developers
