Hello Uli,
> I'm trying to use SimpleServer to create a Website, that uses NLP services.
> Unfortunately I can't get past the PEAR installation process of the
> SimpleServerServlet. For the tokenizer I use a resource binding which is
> located in root/resources and I just can't get the PackageInstaller to find
> this file no matter what I write into the CLASSPATH. Running the
> component from the installed PEAR file is no problem.
> I would appreciate it greatly if someone could point me in the right
> direction.
> The error log:
>
> UIMA Simple Service configuaration failed
> org.apache.uima.pear.tools.PackageInstallerException: The following error
> occurred during the installation verification of component ScienceDays:
> org.apache.uima.resource.ResourceInitializationException: Error initializing
> "org.apache.uima.resource.impl.DataResource_impl" from descriptor
> file:/Users/uliheld/Daten/eclipse-workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/work/Catalina/localhost/NLPipe-Server/ScienceDays/desc/annotators/TokenizerDescriptor.xml.
> at
> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
> ...
> Caused by: org.apache.uima.resource.ResourceInitializationException: Could
> not access the resource data at file:german_abbreviations.txt.
> at
> org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:126)
> at
> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
> ... 45 more
From this I gather that TokenizerDescriptor.xml configures an external
resource with the URL "file:german_abbreviations.txt". You seem to be running
this in Eclipse using WTP on OS X, so the execution directory is likely to be
"/Users/eclipse/Eclipse.app/Contents/MacOS". Since the URL is relative, UIMA
would then try to load the data from
"/Users/eclipse/Eclipse.app/Contents/MacOS/german_abbreviations.txt".
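To check which directory a relative file name actually resolves against in
your setup, a small sketch like the following may help. It assumes, as
described above, that the relative URL ends up being resolved against the
JVM's working directory; the class and method names are made up for
illustration and are not part of UIMA.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;

public class RelativeResourceDemo {

    /** Resolves a relative file name against the given base directory. */
    static Path resolveAgainst(String baseDir, String relativeName) {
        return Paths.get(baseDir).resolve(relativeName).normalize();
    }

    public static void main(String[] args) {
        // The JVM's working directory; relative file paths resolve against it.
        String workingDir = System.getProperty("user.dir");
        System.out.println("Working directory: " + workingDir);

        // Where a relative name such as the one in the descriptor would end up.
        System.out.println("Resolved manually: "
                + resolveAgainst(workingDir, "german_abbreviations.txt"));

        // java.io.File resolves relative paths against the same directory.
        System.out.println("Resolved by File:  "
                + new File("german_abbreviations.txt").getAbsolutePath());
    }
}
```

Running this inside the servlet container (for example from a servlet's
init method) instead of from main should show the directory that matters
for the PEAR installation.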
I don't like PEARs too much, because they work with hard-coded paths...
You could change the descriptor so that it loads from an absolute URL, like
"file:/Users/uliheld/Daten/eclipse-workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/work/Catalina/localhost/NLPipe-Server/ScienceDays/resources/german_abbreviations.txt".
Alternatively, you could create your descriptor programmatically at runtime
and fill in the correct absolute path, resolved in whatever manner suits you,
e.g. by looking it up using getClass().getResource(). This can be done with
plain UIMA or, more conveniently, with uimaFIT [1]. An example might look
something along these lines:
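// Build a descriptor (not an instantiated engine) so that a resource can
// still be bound to it.
AnalysisEngineDescription tokenizer =
    AnalysisEngineFactory.createPrimitiveDescription(Tokenizer.class);
// Bind the abbreviations file under the key "abbreviations", resolving it
// from the classpath to an absolute URL at runtime.
ExternalResourceFactory.bindResource(tokenizer, "abbreviations",
    AbbreviationsResource.class,
    getClass().getResource("/resources/german_abbreviations.txt").toString());
SimplePipeline.runPipeline(reader, tokenizer, consumer);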
Mind that this is written from memory and untested. It assumes you have
configured uimaFIT's automatic type system detection; otherwise the type
system needs to be loaded and passed as an additional parameter.
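One thing to watch with the getClass().getResource() lookup mentioned above
is that it returns null when the file is not on the classpath, which would
only show up later as a less obvious error. A minimal sketch of a helper that
fails early instead; the class and method names are hypothetical and not part
of UIMA or uimaFIT:

```java
import java.net.URL;

public class ResourceLocator {

    /**
     * Looks up a classpath resource and returns its absolute URL as a string.
     * Fails early with a readable message if the resource is not found.
     */
    static String absoluteUrlOf(Class<?> context, String resourcePath) {
        URL url = context.getResource(resourcePath);
        if (url == null) {
            throw new IllegalStateException(
                    "Resource not found on classpath: " + resourcePath);
        }
        return url.toExternalForm();
    }

    public static void main(String[] args) {
        // Path as used in the example above; adjust it to wherever the file
        // actually lives on your classpath.
        try {
            System.out.println(absoluteUrlOf(ResourceLocator.class,
                    "/resources/german_abbreviations.txt"));
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The returned string could then be used wherever an absolute URL for the
resource is needed, for instance when binding it to the descriptor.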
Best,
Richard
[1] http://code.google.com/p/uimafit/
--
-------------------------------------------------------------------
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab
FB 20 Computer Science Department
Technische Universität Darmstadt
Hochschulstr. 10, D-64289 Darmstadt, Germany
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
www.ukp.tu-darmstadt.de
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
-------------------------------------------------------------------