Hello Uli,

> I'm trying to use SimpleServer to create a Website, that uses NLP services. 
> Unfortunately I can't get past the PEAR installation process of the 
> SimpleServerServlet. For the tokenizer I use a resource binding which is 
> located in root/resources and I just can't get the PackageInstaller to find 
> this file no matter what I write into the CLASSPATH. Running the 
> component from the installed PEAR file is no problem.
> I would appreciate it greatly if someone could point me in the right 
> direction.

> The error log:
> 
> UIMA Simple Service configuaration failed
> org.apache.uima.pear.tools.PackageInstallerException: The following error 
> occurred during the installation verification of component ScienceDays: 
> org.apache.uima.resource.ResourceInitializationException: Error initializing 
> "org.apache.uima.resource.impl.DataResource_impl" from descriptor 
> file:/Users/uliheld/Daten/eclipse-workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/work/Catalina/localhost/NLPipe-Server/ScienceDays/desc/annotators/TokenizerDescriptor.xml.
>       at 
> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
> ...
> Caused by: org.apache.uima.resource.ResourceInitializationException: Could 
> not access the resource data at file:german_abbreviations.txt.
>       at 
> org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:126)
>       at 
> org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
>       ... 45 more

From this I gather that you have an external resource configured in 
the TokenizerDescriptor.xml with the URL "file:german_abbreviations.txt". You 
seem to run this in Eclipse using WTP on OS X, so the execution directory is 
likely to be "/Users/eclipse/Eclipse.app/Contents/MacOS". Since the URL is 
relative, this would result in UIMA trying to load the data from 
"/Users/eclipse/Eclipse.app/Contents/MacOS/german_abbreviations.txt".
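
To illustrate the point about relative locations, here is a minimal sketch in 
plain Java. It assumes that the relative "file:" URL ends up being opened like 
an ordinary relative file path, which the JVM resolves against its working 
directory (the "user.dir" system property). The class name 
RelativeResourceDemo and the helper resolve() are made up for this example 
and are not part of UIMA.

```java
import java.io.File;

public class RelativeResourceDemo {

    // Resolves a path the way java.io.File does: a relative path is
    // interpreted against the JVM working directory ("user.dir").
    static File resolve(String path) {
        return new File(path).getAbsoluteFile();
    }

    public static void main(String[] args) {
        // Under Eclipse/WTP this is usually not the web application directory.
        System.out.println("working directory: " + System.getProperty("user.dir"));
        System.out.println("resolved location: " + resolve("german_abbreviations.txt"));
    }
}
```

Running this from within the server should show which directory the relative 
name is actually resolved against.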

I don't like PEARs too much, because they work with hard-coded paths... 

You could change the descriptor so that it loads from an absolute URL, like 
"file:/Users/uliheld/Daten/eclipse-workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp0/work/Catalina/localhost/NLPipe-Server/ScienceDays/resources/german_abbreviations.txt".

Alternatively, you could create your descriptor programmatically at runtime 
and fill in the correct absolute path, resolved in whatever way suits you, 
e.g. by looking it up using getClass().getResource(). This can be done with 
plain UIMA or, more conveniently, with uimaFIT [1]. An example might look 
something like this:

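  // createPrimitiveDescription() returns a description that resources can be bound to
  AnalysisEngineDescription tokenizer =
      AnalysisEngineFactory.createPrimitiveDescription(Tokenizer.class);
  // bind the resource under the key "abbreviations" using its absolute URL
  ExternalResourceFactory.bindResource(tokenizer, "abbreviations",
      AbbreviationsResource.class,
      getClass().getResource("/resources/german_abbreviations.txt").toString());
  SimplePipeline.runPipeline(reader, tokenizer, consumer);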

Mind that this is written from memory and I did not test it. It assumes that 
you have configured uimaFIT's automatic type system detection; otherwise the 
type system needs to be loaded and provided as an additional parameter.
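
The lookup step on its own can be sketched with the JDK alone. 
getResource() returns null when the resource is not on the classpath, so it 
is worth checking for that before handing the URL to the descriptor. The 
class ResourceLocator and the method absoluteUrlOf() are hypothetical names 
used only for this sketch. The example looks up a class file that exists in 
every JVM so that it runs anywhere.

```java
import java.net.URL;

public class ResourceLocator {

    // Looks up a classpath resource relative to the given class and returns
    // its absolute URL as a string. Fails loudly if the resource is missing,
    // because Class.getResource() returns null in that case.
    static String absoluteUrlOf(Class<?> anchor, String resourcePath) {
        URL url = anchor.getResource(resourcePath);
        if (url == null) {
            throw new IllegalArgumentException(
                "Not found on classpath: " + resourcePath);
        }
        return url.toExternalForm();
    }

    public static void main(String[] args) {
        // Stand-in resource; in the tokenizer case the path would be
        // "/resources/german_abbreviations.txt".
        System.out.println(absoluteUrlOf(String.class, "/java/lang/String.class"));
    }
}
```

The returned string could then be passed as the URL argument when binding 
the resource, provided the abbreviations file really is on the classpath of 
the web application.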

Best,

Richard

[1] http://code.google.com/p/uimafit/

-- 
------------------------------------------------------------------- 
Richard Eckart de Castilho
Technical Lead
Ubiquitous Knowledge Processing Lab 
FB 20 Computer Science Department      
Technische Universität Darmstadt 
Hochschulstr. 10, D-64289 Darmstadt, Germany 
phone [+49] (0)6151 16-7477, fax -5455, room S2/02/B117
www.ukp.tu-darmstadt.de 
Web Research at TU Darmstadt (WeRC) www.werc.tu-darmstadt.de
------------------------------------------------------------------- 



