First of all, thanks to everybody involved in Nutch. It looks wonderful and
I can't wait to apply what you've done.
Is it possible to run and control Nutch entirely from within Tomcat 5.0.28
on Java 1.4.2, without using the command line at all?
In other words, I'd like to avoid using the command line and instead call
the java classes directly on a scheduled or user-controlled basis from
Tomcat. From what I see in bin/nutch I should be able to replace the
command:
bin/nutch crawl urls -dir crawl.test -depth 3 >& crawl.log
with something like:
// ">& crawl.log" is shell redirection, not a CrawlTool argument,
// so it is left out of the argument array.
String[] args = new String[] {
    "urls",
    "-dir", "crawl.test",
    "-depth", "3"
};
// main() is static, so no CrawlTool instance is needed.
net.nutch.tools.CrawlTool.main(args);
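One thing I'm unsure about is the log file. As far as I understand, the ">& crawl.log" part is handled by the shell and would mean nothing to CrawlTool itself, so from Java I would presumably have to redirect the output myself. Below is a minimal sketch of what I have in mind. The class name RedirectedRun and the methods runTool and runWithLog are names I made up for illustration; runTool is only a stand-in for the place where net.nutch.tools.CrawlTool.main(args) would be called, and the sketch assumes the tool writes through System.out and System.err.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.PrintStream;

public class RedirectedRun {

    // Hypothetical stand-in for the real tool. With Nutch on the classpath,
    // this is where net.nutch.tools.CrawlTool.main(args) would be called.
    static void runTool(String[] args) {
        System.out.println("stdout: started with " + args.length + " args");
        System.err.println("stderr: nothing to report");
    }

    // Runs the tool with System.out and System.err both sent to logFile,
    // roughly what ">& logFile" does in the shell, then restores the streams.
    public static void runWithLog(String[] args, File logFile) throws IOException {
        PrintStream oldOut = System.out;
        PrintStream oldErr = System.err;
        PrintStream log = new PrintStream(new FileOutputStream(logFile), true);
        try {
            System.setOut(log);
            System.setErr(log);
            runTool(args);
        } finally {
            // Always restore the original streams, even if the tool throws.
            System.setOut(oldOut);
            System.setErr(oldErr);
            log.close();
        }
    }

    public static void main(String[] argv) throws IOException {
        String[] args = new String[] { "urls", "-dir", "crawl.test", "-depth", "3" };
        runWithLog(args, new File("crawl.log"));
    }
}
```

My worry with this approach is that System.setOut and System.setErr apply to the whole JVM, so inside Tomcat they would also capture output from every other thread and webapp while the crawl is running.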
Is this possible? Is this smart? What sort of issues will arise if I try
to run everything from Tomcat/Java?
Thanks,
Joe Reger