Yes, I really do want to use Nutch, because Nutch has an API and wget doesn't. I want to create a module that accepts jobs for importing sites: it takes requests to download web sites and handles the import. I chose Nutch over writing my own crawler because it is open source and solid, whereas I would expect a lot of bugs in a crawler I wrote myself. If I chose wget, I would have to create a process for every import request, which is not what I want.
--
View this message in context: http://lucene.472066.n3.nabble.com/Using-Nutch-for-Web-Site-Mirroring-tp3986066p3986086.html
Sent from the Nutch - User mailing list archive at Nabble.com.

