I noticed in the documentation that you can do both whole-web crawling and intranet crawling. My question: can you combine a database built by crawling a provided set of URLs with the database you created for whole-web crawling?
For example, this creates a directory crawl.test, which contains a database:

  bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log

Here, I create a database in the directory 'new':

  bin/nutch admin new/db -create

Can I add the two databases together? For example, say I run whole-web crawling and then want to crawl a set of URLs. Can I add those to the index with a command like this?

  bin/nutch crawl urls -dir new -depth 3 >& crawl.log
