Hello,

We're developing a search engine on top of the Lucene API, and one
component of the application parses several file formats. For that
component we want to reuse a number of the Nutch plug-ins, so we have
written classes in our own application that build a map of Parsers,
one for each of the file formats that are relevant to us.

Everything works fine when we run our unit tests: all of the Nutch
plug-ins load successfully and we can parse every file format we need.
The problem appears when we deploy the application on our application
server, Glassfish. After deployment the Nutch plug-ins cannot be
configured. The PluginManifestParser does not seem to find the plugin
folder that is bundled in our WAR, and the ParsePluginsReader cannot
find the parse-plugins.xml file. See the log output below:

[#|2006-09-12T14:01:49.944+0200|INFO|sun-appserver-ee9.1|javax.enterprise.system.stream.out|_ThreadID=10;_ThreadName=main;|WARN - PluginManifestParser.getPluginFolder(126) | Plugins: directory not found: /WEB-INF/lib/plugins

[#|2006-09-12T14:01:49.959+0200|INFO|sun-appserver-ee9.1|javax.enterprise.system.stream.out|_ThreadID=10;_ThreadName=main;|WARN - ParsePluginsReader.parse(115) | Unable to parse [null].Reason is [java.net.MalformedURLException]

So far we have deployed the WAR on Glassfish simply by dropping it into
the \autodeploy directory. It then gets deployed under the j2ee-modules
folder, where the various Nutch files end up laid out as follows (the
Glassfish <glassfish-domain> folder is the root of the structure
shown):

<glassfish-domain>
 - applications
 --- j2ee-modules
 ------ <app-context-root>
 --------- conf  (Nutch configuration files from $NUTCH_HOME\conf; we've put them both here and in WEB-INF below)
 --------- WEB-INF  (Nutch configuration files from $NUTCH_HOME\conf; we've put them both here and in conf above)
 ------------ lib
 --------------- plugins  (all the Nutch plug-in folders, e.g. parse-html, parse-pdf, and all dependent folders for the plug-ins we utilise)
 - autodeploy  (we drop the WAR here and it gets deployed into the <app-context-root> folder above)
 - lib

I feel I have exhausted all combinations of putting the libs and
configuration files in various folders but the ParsePluginsReader never
seems able to find the parse-plugins.xml file.
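
One way to narrow this down might be to check how the names from the
log resolve through the class loader inside the deployed application.
The sketch below is only a diagnostic under assumptions: it assumes
(not confirmed against the Nutch source here) that the plug-in folder
and parse-plugins.xml are looked up as class-loader resources rather
than as paths inside the WAR, and that "plugins" is the folder name to
try besides the one in the warning. ResourceProbe, Outcome, probe and
probeUnderRoot are hypothetical names, not part of Nutch or Glassfish.

```java
import java.io.File;
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;

/** Hypothetical diagnostic helper; not part of Nutch or Glassfish. */
final class ResourceProbe {

    /** How a resource name resolved through a class loader. */
    enum Outcome { NOT_FOUND, NOT_A_FILE_URL, FOUND_DIRECTORY, FOUND_FILE }

    /** Looks up name via the given loader and classifies the result. */
    static Outcome probe(ClassLoader loader, String name) {
        URL url = loader.getResource(name);
        if (url == null) {
            return Outcome.NOT_FOUND;        // not visible on this loader's classpath
        }
        if (!"file".equals(url.getProtocol())) {
            return Outcome.NOT_A_FILE_URL;   // e.g. a jar: URL, which is not a plain directory
        }
        File file;
        try {
            file = new File(url.toURI());
        } catch (Exception e) {
            file = new File(url.getPath());  // fallback when the URL is not a valid file URI
        }
        return file.isDirectory() ? Outcome.FOUND_DIRECTORY : Outcome.FOUND_FILE;
    }

    /** Probes name as if root were a classpath root (for example WEB-INF/classes). */
    static Outcome probeUnderRoot(File root, String name) {
        try {
            URL[] roots = { root.toURI().toURL() };
            // Parent is null, so only the bootstrap loader and root are searched.
            try (URLClassLoader loader = new URLClassLoader(roots, null)) {
                return probe(loader, name);
            }
        } catch (IOException e) {
            throw new IllegalArgumentException("Cannot use as classpath root: " + root, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: these are the names worth probing; the second is taken from the warning.
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        for (String name : new String[] { "plugins", "/WEB-INF/lib/plugins", "parse-plugins.xml" }) {
            System.out.println(name + " -> " + probe(loader, name));
        }
    }
}
```

To be meaningful, probe would need to be called from code running
inside the web application (a servlet or a startup listener, say) so
that it sees the same context class loader as the Nutch classes do. If
a name comes back NOT_FOUND there, that would suggest the folder or
file is not on the web application's classpath, whatever its location
inside the WAR.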

Does anyone have experience deploying the Nutch plug-ins on Glassfish,
or general tips on how we can configure our application so that the
plug-ins are found?


Thanking you in anticipation,

Trym

--
Trym Asserson

_______________________________________________
Nutch-general mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-general
