Hello,

We're currently developing a search engine application on top of the Lucene API, and one of its components parses several file formats. For this component we were hoping to reuse several of the Nutch plug-ins, so we have written classes in our own application that build a map of Parsers covering the file formats relevant to us.
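To illustrate what I mean by a map of Parsers, here is a minimal sketch of the idea, not our actual code. The `ParserRegistry` class and the `DocumentParser` interface are hypothetical names used only for this example; in particular, `DocumentParser` is a stand-in and not Nutch's own Parser interface.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

public class ParserRegistry {

    /** Hypothetical stand-in for a parser; NOT Nutch's own Parser interface. */
    public interface DocumentParser {
        String parse(byte[] content);
    }

    // Parsers keyed by lower-cased file extension, e.g. "pdf" or "html".
    private final Map<String, DocumentParser> parsersByExtension = new HashMap<>();

    /** Registers a parser for the given file extension (case-insensitive). */
    public void register(String extension, DocumentParser parser) {
        parsersByExtension.put(normalise(extension), parser);
    }

    /** Returns the parser for the file's extension, or null if none is registered. */
    public DocumentParser lookup(String fileName) {
        int dot = fileName.lastIndexOf('.');
        if (dot < 0 || dot == fileName.length() - 1) {
            return null; // no extension to look up
        }
        return parsersByExtension.get(normalise(fileName.substring(dot + 1)));
    }

    private static String normalise(String extension) {
        return extension.toLowerCase(Locale.ROOT);
    }
}
```

In this sketch each entry would wrap one of the Nutch parse plug-ins, and the application would call `lookup("report.pdf")` to pick the parser for a given file.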
Everything works fine when we run our unit tests: all of Nutch's plug-ins load successfully and we can parse all the file formats we want. However, we run into a problem when we deploy the application on our application server. We have decided to use Glassfish, and after deployment the Nutch plug-ins cannot be configured. It seems the PluginManifestParser cannot find the plugin folder that is bundled in our WAR, and the ParsePluginsReader cannot find the parse-plugins.xml file either. See the warnings below:

[#|2006-09-12T14:01:49.944+0200|INFO|sun-appserver-ee9.1|javax.enterprise.system.stream.out|_ThreadID=10;_ThreadName=main;|WARN - PluginManifestParser.getPluginFolder(126) | Plugins: directory not found: /WEB-INF/lib/plugins

[#|2006-09-12T14:01:49.959+0200|INFO|sun-appserver-ee9.1|javax.enterprise.system.stream.out|_ThreadID=10;_ThreadName=main;|WARN - ParsePluginsReader.parse(115) | Unable to parse [null].Reason is [java.net.MalformedURLException]

On Glassfish we've so far just deployed the WAR by dropping it in the \autodeploy directory. It gets deployed in the j2ee-modules folder, where the paths to the various Nutch files are as follows (I'm using the Glassfish <glassfish-domain> folder as the root of the directory structure):

<glassfish-domain>
- applications
--- j2ee-modules
------ <app-context-root>
--------- conf (we've put the Nutch configuration files from $NUTCH_HOME\conf both in this directory and in WEB-INF below)
--------- WEB-INF (we've put the Nutch configuration files from $NUTCH_HOME\conf both in this directory and in conf above)
------------ lib
--------------- plugins (here are all the Nutch plug-in folders, e.g. parse-html and parse-pdf, and all dependent folders for the plug-ins we utilise)
- autodeploy (we drop the WAR here and it gets deployed into the <app-context-root> folder above)
- lib

I feel I have exhausted all combinations of putting the libs and configuration files in various folders, but the ParsePluginsReader never seems able to find the parse-plugins.xml file.

Has anyone got experience deploying on Glassfish, or general tips on how we can configure our application to use the plug-ins?

Thanking you in anticipation,
Trym

--
Trym Asserson

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
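
P.S. In case it helps anyone reproduce this, a small check along the following lines could narrow down where the lookups go wrong. It is only a sketch and assumes that a name such as /WEB-INF/lib/plugins or parse-plugins.xml is resolved either as a filesystem path or as a classpath resource; I have not confirmed that this is exactly what Nutch does internally. The class name `ResourceLocationCheck` and the `describe` method are made up for this example.

```java
import java.io.File;
import java.net.URL;

public class ResourceLocationCheck {

    /** Reports how a name resolves as a filesystem path and as a classpath resource. */
    public static String describe(String name, ClassLoader loader) {
        File asFile = new File(name);
        // ClassLoader resource names are not expected to start with '/'.
        String resourceName = name.startsWith("/") ? name.substring(1) : name;
        URL asResource = loader.getResource(resourceName);
        return "name=" + name
                + " | file exists=" + asFile.exists()
                + " (absolute=" + asFile.getAbsolutePath() + ")"
                + " | classpath URL=" + asResource;
    }

    public static void main(String[] args) {
        // Inside a web application this is normally the class loader of the WAR.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        // Both names are taken from the warnings quoted in this mail.
        System.out.println(describe("/WEB-INF/lib/plugins", loader));
        System.out.println(describe("parse-plugins.xml", loader));
    }
}
```

Calling `describe` from inside the deployed application (for example from a servlet's init method) rather than from `main` would show what the server's class loader actually sees; a `classpath URL=null` result would mean the name is not visible on the classpath at all.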
