SimpleServer configuration with Sofas

Ben Morgan Tue, 07 Dec 2010 14:35:02 -0800

Hey folks,

I've got a problem with the UIMA SimpleServer[1][5] not being able to correctly run an aggregate analysis engine[6]. The aggregate AE works as expected, however, when I test it with the "UIMA CAS Visual Debugger", the "UIMA Run AE" and the "UIMA Document Analyzer".

The analysis engine[2] is relatively simple (as of yet). It is composed of the following components:


    AE PDF Text Extractor[3]
        :: gets a URL as the "initial view" and downloads
           the file, extracts the text and puts it in a new
           view by the name of "extractedText".
        -> Input Sofa: urlString
        -> Output Sofa: extractedText
    AE Email Annotator[4]
        :: simple annotator, just annotates email addresses.

When I run the aggregate analysis engine, it terminates with an error before giving any results (taken from the Tomcat log file):


    SEVERE: Exception occurred
    org.apache.uima.analysis_engine.AnalysisEngineProcessException:
        Annotator processing failed.
    ...
    Caused by: org.apache.uima.cas.CASRuntimeException:
        No sofaFS with name plainText found.
    ...

"plainText" is the Sofa in the aggregate analysis engine which is linked to the output of the PDF Text Extractor, "extractedText".
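
As a rough illustration of the mapping described above, here is a toy model in plain Java. It is not the UIMA API: the class SofaMappingSketch and its methods are made up for illustration, and the assumption that the Email Annotator reads its default view through a mapping to "plainText" is mine, not something stated in the descriptors. It only shows how a lookup by aggregate-level name fails when no component has created that view yet, and it says nothing about why the SimpleServer behaves this way.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of sofa name mapping; hypothetical names, not the UIMA API. */
final class SofaMappingSketch {
    /** Views in the shared CAS, keyed by aggregate-level sofa name. */
    private final Map<String, String> views = new HashMap<>();

    /** Stores text under the aggregate-level name the component name maps to. */
    void createView(Map<String, String> mapping, String componentName, String text) {
        views.put(mapping.getOrDefault(componentName, componentName), text);
    }

    /** Resolves the component name through the mapping and fetches the view. */
    String getView(Map<String, String> mapping, String componentName) {
        String aggregateName = mapping.getOrDefault(componentName, componentName);
        String text = views.get(aggregateName);
        if (text == null) {
            throw new IllegalStateException(
                "No sofa with name " + aggregateName + " found.");
        }
        return text;
    }

    public static void main(String[] args) {
        // Component-level name -> aggregate-level name (assumed mappings).
        Map<String, String> extractorMapping = Map.of("extractedText", "plainText");
        Map<String, String> annotatorMapping = Map.of("_InitialView", "plainText");

        SofaMappingSketch cas = new SofaMappingSketch();

        // The annotator runs before any view named "plainText" exists.
        try {
            cas.getView(annotatorMapping, "_InitialView");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }

        // Once the extractor has created its output view, the lookup succeeds.
        cas.createView(extractorMapping, "extractedText", "contact: someone@example.org");
        System.out.println(cas.getView(annotatorMapping, "_InitialView"));
    }
}
```

In this model the first lookup fails because nothing has written to "plainText" yet, and the second succeeds once the extractor step has created its output view under the mapped name.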

I took the aggregate analysis engine apart piece by piece, starting with the Email Annotator AE. That worked fine with the SimpleServer.

Then I tested the PDF Text Extractor (I changed the input view to _InitialView). When I tested a URL, the result came back as XML, but only with the initial view and not with the extracted text. In fact, when I test the text extractor any other way, it takes around 3 seconds to download the PDF file, while the SimpleServer sent back its results immediately (so what is that all about? Does it not even run the code in the function process()?).

That's my problem, and I wonder if there is something special you need to do when there are views or different output sofas. I cannot for the life of me figure out what is wrong and why it does not work.


Thanks for your help,
Ben Morgan

_______________________________________________________________________________

1: http://uima.apache.org/downloads/sandbox/simpleServerUserGuide/simpleServerUserGuide.html

2: Aggregate AE descriptor: https://github.com/cassava/bibrefext/blob/uima/UIMA/workspace/ReferenceAnnotator/desc/referenceAnnotatorDescriptor.xml

3: PDF Extractor descriptor: https://github.com/cassava/bibrefext/blob/uima/UIMA/workspace/ReferenceAnnotator/desc/PDFTextExtractorDescriptor.xml
   PDF Extractor Java source: https://github.com/cassava/bibrefext/blob/uima/UIMA/workspace/ReferenceAnnotator/src/de/uniwue/informatik/bibrefext/pdf/TextExtractor.java

4: Email Annotator descriptor: https://github.com/cassava/bibrefext/blob/uima/UIMA/workspace/ReferenceAnnotator/desc/EmailAnnotatorDescriptor.xml

5: SimpleServer web.xml: https://github.com/cassava/bibrefext/blob/uima/UIMA/workspace/ReferenceWebService/WebContent/WEB-INF/web.xml

6: Complete WAR file: https://github.com/downloads/cassava/bibrefext/bibrefext.war
