I noticed in the documentation that you can do both whole-web crawling
and intranet crawling.  My question: can you combine a database built by
crawling a provided set of URLs with a database you created through
whole-web crawling?

For example, this command creates a directory crawl.test, which contains a database:

bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log

bin/nutch admin new/db -create

Here, I am creating a database in the directory 'new'.  Can I add the
two databases together?  For example, say I run whole-web crawling and
then want to crawl a set of URLs: can I add those to the index, like
this?

bin/nutch crawl urls -dir new -depth 3 >& crawl.log

?


