I noticed in the documentation that you can do both whole-web crawling
and intranet crawling.  My question: can you combine a database built
by crawling a provided set of URLs with the database you created
through whole-web crawling?

For example, here you create a directory crawl.test, which contains a database:

bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log

bin/nutch admin new/db -create

Here, I am creating a database in the directory 'new'.  Can I add the
two databases together?  For example, say I run whole-web crawling and
then want to crawl a set of URLs; can I add those to the index?

bin/nutch crawl urls -dir new -depth 3 >& crawl.log

Would the command above do that?
