Yes, it is possible.  I have seen Nutch called in the following ways:

  1. Running the command line scripts without Hadoop, using the default
     configuration.  This uses only a single JVM and is usually
     called via a script.
  2. Calling the individual programs directly through another Java
     driver program.  For instance, calling the Fetcher programmatically
     and passing in arguments.
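
A minimal sketch of option 2, assuming the tool you want to call exposes a standard public static main(String[]) method. ToolDriver and runMain are hypothetical names made up for this illustration, and the class name and arguments are whatever you pass in; nothing here is Nutch-specific. It is worth checking whether the tool's main calls System.exit, since that would take down the host JVM, in which case calling the tool's own run-style API (if it has one) would be the safer route.

```java
import java.lang.reflect.Method;
import java.util.Arrays;

// Hypothetical driver: runs another program's main() inside the current JVM.
public class ToolDriver {

    /** Looks up className and invokes its public static main(String[]) with args. */
    public static void runMain(String className, String... args) throws Exception {
        Class<?> tool = Class.forName(className);
        Method main = tool.getMethod("main", String[].class);
        // Cast to Object so the array is passed as one argument, not spread as varargs.
        main.invoke(null, (Object) args);
    }

    public static void main(String[] argv) throws Exception {
        if (argv.length < 1) {
            System.err.println("usage: ToolDriver <class> [args...]");
            return;
        }
        // First argument is the class to run; the rest are handed to it unchanged.
        runMain(argv[0], Arrays.copyOfRange(argv, 1, argv.length));
    }
}
```

From application code you would call ToolDriver.runMain with the fully qualified class name of the tool followed by the same arguments you would give it on the command line.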

I have also seen Nutch deployed inside of an app server. The main problem area is the classpath: you have to make sure the plugins, jars, and configuration files are all on the classpath and get deployed correctly.
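
One way to catch classpath problems early is to check at startup that the configuration files are actually visible to the class loader. The following is only a sketch: ClasspathCheck and missingResources are hypothetical names, and the file names nutch-default.xml and nutch-site.xml are the usual Nutch configuration files, which I am assuming your deployment uses. It does not verify the plugin directories.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical startup check: reports resources the class loader cannot see.
public class ClasspathCheck {

    /** Returns the subset of names that loader cannot resolve as resources. */
    public static List<String> missingResources(ClassLoader loader, String... names) {
        List<String> missing = new ArrayList<>();
        for (String name : names) {
            if (loader.getResource(name) == null) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Inside an app server the context class loader is normally the webapp's loader.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        List<String> missing =
                missingResources(loader, "nutch-default.xml", "nutch-site.xml");
        if (missing.isEmpty()) {
            System.out.println("All configuration files found on the classpath.");
        } else {
            System.out.println("Not on the classpath: " + missing);
        }
    }
}
```

Running the check from the same class loader that will later run the crawl matters, because an app server typically gives each deployed application its own loader.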

Dennis

On 05/21/2010 04:32 AM, Hannes Carl Meyer wrote:
Hi,

is it possible to run Nutch in a single virtual machine for intranet
crawling? Even inside a Java Application Server?

Normally I'm using custom Nutch crawl scripts and start them from the OS
command line by cron. In a new project it is required to use a running
Virtual Machine for deployment and invocation of crawler tasks.

Does anybody have experience deploying Nutch in such a scenario?

Kind Regards

Hannes
