Not sure this is going to help, but here goes.  I run
into this kind of situation quite often, so I don't use
a CPE at all; I just run an aggregate analysis engine
under my own control.

Create a UIMA app with something like this:

import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.cas.CAS;
import org.apache.uima.resource.ResourceSpecifier;
import org.apache.uima.util.XMLInputSource;

public class UimaApp {

  // args[0] is the path to the (aggregate) analysis engine descriptor
  public static void main(String[] args) {
    try {
      // Get Resource Specifier from XML file
      XMLInputSource in = new XMLInputSource(args[0]);
      ResourceSpecifier specifier =
          UIMAFramework.getXMLParser().parseResourceSpecifier(in);
      // Instantiate analysis engine
      AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(specifier);
      // Create CAS (created once, then reused for every document)
      CAS cas = ae.newCAS();

      // Now process documents (in a loop, hooked up to a queue or
      // whatever is convenient...)

      // Reset CAS before processing
      cas.reset();
      // Set document text and do other initialization
      cas.setDocumentText("Document text goes here");
      // Run ae on CAS
      ae.process(cas);
      // If necessary, get results out of CAS...
      System.out.println("Number of annotations: "
          + cas.getAnnotationIndex().size());

    } catch (Exception e) {
      e.printStackTrace();
    }
  }

}

So if the only thing you need the CPE for is the collection
reader, this is an alternative for you.
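
In case it helps, here is a rough sketch of what the "hooked up to a
queue" part could look like. It is only an illustration and uses plain
JDK classes: the class name QueueDrivenLoop, the drain() helper and the
POISON_PILL sentinel are made up for this example and are not part of
UIMA. The processor callback stands in for the reset/setDocumentText/
process calls shown above. The point is that take() blocks while the
queue is empty, so the loop waits for new documents instead of ending.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

class QueueDrivenLoop {

  // Hypothetical sentinel value that tells the worker to shut down.
  static final String POISON_PILL = "__STOP__";

  // Takes documents off the queue one at a time and hands each to the
  // processor. Returns the number of documents processed once the
  // sentinel is seen or the thread is interrupted.
  static int drain(BlockingQueue<String> queue, Consumer<String> processor) {
    int processed = 0;
    try {
      while (true) {
        String text = queue.take(); // blocks while the queue is empty
        if (POISON_PILL.equals(text)) {
          return processed;
        }
        processor.accept(text);
        processed++;
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt(); // preserve the interrupt flag
      return processed;
    }
  }

  public static void main(String[] args) throws Exception {
    BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    List<String> seen = new ArrayList<>();

    // Assumption: in the real application the processor would do
    //   cas.reset(); cas.setDocumentText(text); ae.process(cas);
    // Here it only records the text so the sketch runs without UIMA.
    Thread worker = new Thread(() -> drain(queue, seen::add));
    worker.start();

    // The other application hands documents over whenever they arrive.
    queue.put("first document");
    queue.put("second document");
    queue.put(POISON_PILL); // ask the worker to stop
    worker.join();

    System.out.println("Processed " + seen.size() + " documents");
  }
}
```

The CAS is only touched from the single worker thread here, so one CAS
can be reset and reused for every document, as in the app above.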

--Thilo

Christoph Büscher wrote:
Hi,

So far I've always used UIMA CPEs to read whole collections of documents, e.g. from a source directory. In a new application it will be necessary to run a CPE on new documents being passed to it by another application (outside UIMA). It would be nice to be able to simply hand single documents over to a collection reader and then "run/wake up" the CPE to process them.

My idea was to put the incoming documents into a waiting queue, register the queue with a custom collection reader, and then let the hasNext()/getNext() methods simply ask the queue whether there is work to do. But when hasNext() in the collection reader returns false, the CPE stops execution.

Is it possible to put a reader or the whole CPE into a "waiting" mode, or is the only solution to restart the whole CPE each time new documents arrive to be processed? Has anybody dealt with a similar situation and has any "best practices" to share? How do you handle it?

Thanks,

