Yes, it is possible. I have seen Nutch run in the following ways:
1. Running the command line scripts without Hadoop, using the default
configuration. This uses only a single JVM and is usually started
from a wrapper script.
2. Calling the individual programs directly from another Java
driver program, for instance calling the Fetcher programmatically
and passing in its arguments.
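
As a rough sketch of option 2, a driver can look up a tool class by
name and invoke its static main(String[]) inside the current JVM. The
class InProcessToolRunner and its runTool method are hypothetical names
made up for this illustration, not part of Nutch, and the sketch assumes
the tool exposes a standard public static main method.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical driver (not part of Nutch): runs a tool's main() in this JVM.
public class InProcessToolRunner {

    /** Invokes the public static main(String[]) of the named class. */
    public static void runTool(String className, String... args) throws Exception {
        // Prefer the context class loader; in an app server it usually
        // sees the web application's jars and configuration files.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = InProcessToolRunner.class.getClassLoader();
        }
        Class<?> toolClass = Class.forName(className, true, loader);
        Method main = toolClass.getMethod("main", String[].class);
        try {
            // Cast to Object so the array is passed as one argument.
            main.invoke(null, (Object) args);
        } catch (InvocationTargetException e) {
            // Surface the tool's own exception rather than the wrapper.
            Throwable cause = e.getCause();
            if (cause instanceof Exception) {
                throw (Exception) cause;
            }
            throw e;
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length == 0) {
            System.out.println("Usage: InProcessToolRunner <class> [tool args...]");
            return;
        }
        runTool(args[0], Arrays.copyOfRange(args, 1, args.length));
    }
}
```

With the Nutch jars and configuration on the classpath, a call might
look like runTool("org.apache.nutch.fetcher.Fetcher", segmentDir,
"-threads", "10"), where segmentDir is a placeholder for an existing
segment directory. One caveat to check against the source of your Nutch
version: if a tool's main calls System.exit, it would shut down the host
JVM, so inside an app server it may be safer to call the tool's instance
methods instead.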
I have also seen Nutch deployed inside an app server. The main problem
area is the classpath: making sure that plugins, jars, and configuration
files are on the classpath and get deployed correctly.
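
One way to catch classpath problems early is to check, at startup, that
the expected resources are visible to the class loader the crawler will
use. The sketch below is only an illustration: ClasspathCheck and
missingResources are hypothetical names, and the two file names assume
the usual Nutch 1.x configuration files; adjust the list to whatever
your deployment needs.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Nutch): reports resources that the
// given class loader cannot see.
public class ClasspathCheck {

    /** Returns the subset of names that cannot be found via the loader. */
    public static List<String> missingResources(ClassLoader loader, String... names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClasspathCheck.class.getClassLoader();
        }
        // Assumed file names; change these to match your deployment.
        List<String> missing =
            missingResources(loader, "nutch-default.xml", "nutch-site.xml");
        if (missing.isEmpty()) {
            System.out.println("All expected resources found");
        } else {
            System.out.println("Missing from classpath: " + missing);
        }
    }
}
```

Running this from inside the web application, for example from a
startup listener, shows what that application's class loader can see,
which may differ from what the server's own class loader sees.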
Dennis
On 05/21/2010 04:32 AM, Hannes Carl Meyer wrote:
Hi,
is it possible to run Nutch in a single virtual machine for intranet
crawling? Even inside a Java Application Server?
Normally I use custom Nutch crawl scripts and start them from the OS
command line via cron. In a new project it is required to use a running
Virtual Machine for deployment and invocation of crawler tasks.
Does anybody have experience deploying Nutch in such a scenario?
Kind Regards
Hannes