Hi
I have some questions because a few things are not clear to me (newbie :P).
I'm currently using Nutch 0.8.1.
First:
bin/nutch inject crawl/crawldb testurl/
bin/nutch generate crawl/crawldb crawl/segments -topN 50 -numFetchers 10
The first one injects the seed URLs into the crawl database (crawl/crawldb).
second
Hi,
I have written two different plugins for Nutch. Both of them work
individually when tested using bin/nutch plugin.
Call the plugins A and B. I need to use plugin A in plugin B.
When I import plugin A in B, I get an error that package A is not
found. I
Hello!
In short:
Is it possible to tell Nutch to follow links throughout one larger
namespace, but only index (add to its database) the content of links that are
in a sub-namespace of it?
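To make the question concrete, here is a rough Python sketch of the behaviour I am after. It is not Nutch code and does not use any Nutch API; the host blog.example.com, the /posts/ prefix, and the function names should_follow and should_index are all made up for illustration.

```python
import re

# Hypothetical patterns, for illustration only; this is not Nutch configuration.
# Larger namespace: pages here are fetched and their links are followed.
FOLLOW_PATTERN = re.compile(r"^https?://blog\.example\.com/")
# Sub-namespace: only pages here should have their content indexed.
INDEX_PATTERN = re.compile(r"^https?://blog\.example\.com/posts/")


def should_follow(url: str) -> bool:
    """Return True if the crawler may fetch this URL and extract its links."""
    return FOLLOW_PATTERN.match(url) is not None


def should_index(url: str) -> bool:
    """Return True if the page content should be added to the index."""
    # Only pages inside the crawled namespace can be indexed at all.
    return should_follow(url) and INDEX_PATTERN.match(url) is not None


if __name__ == "__main__":
    for url in (
        "http://blog.example.com/archive/2006/",
        "http://blog.example.com/posts/hello-world",
        "http://other.example.org/",
    ):
        print(url, should_follow(url), should_index(url))
```

So an archive page would be crawled for its links but left out of the index, while a page under /posts/ would be both crawled and indexed.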
The background:
I have started to experiment with crawling my blog with Nutch. The problem
is that