As you have already figured out, the "container" is simply a subtype of the Annotation type (i.e., an annotation with begin and end offsets). If you check the OpenNLP code, you will see that all you need to do is tell the SentenceDetector the fully qualified name of your type as a String value, say "org.myorg.tcas.MyAnnotationType".
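
As an illustration of what such a container type could look like, here is a minimal sketch of a UIMA type system descriptor. It only assumes the example type name used above; the description text is made up, and a real type system would usually carry more metadata (name, version, vendor).

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Sketch only: declares the example container type as a subtype of
     uima.tcas.Annotation, so it inherits the begin/end offsets. -->
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <!-- Example name taken from the text above -->
      <name>org.myorg.tcas.MyAnnotationType</name>
      <!-- Hypothetical description, for illustration -->
      <description>Span of text that should be sentence-segmented</description>
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>
```

Whatever component runs before the SentenceDetector (for example a Collection Reader or an upstream annotator) would then have to add annotations of this type to the CAS, covering the regions to be segmented.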
This type name string can be set as a parameter in the descriptor of the SentenceDetector annotator; the parameter is (or should be) named:

opennlp.uima.ContainerType

If set, the SentenceDetector will only split text into sentences inside the spans annotated with this "ContainerType".

Hope these instructions were clear enough - if you need more details, let me know. Here is the link to the XML descriptor for the OpenNLP SentenceDetector annotator once more:

https://svn.apache.org/repos/asf/opennlp/trunk/opennlp-uima/descriptors/SentenceDetector.xml

Cheers,
Florian

On 17 Aug 2012, at 15:42, Andreas Niekler wrote:

> Hello,
>
> sorry for the new reply! But I forgot to ask how I can pass this type to the
> annotator then.
>
> Thanks again
>
> Andreas
>
> On 17.08.2012 15:34, Andreas Niekler wrote:
>> Hello,
>>
>> thanks a lot. But how exactly can I define a container type, which should
>> be an AnnotationFS, I guess? And how do I pass the container information
>> to the annotator then? Because of the missing documentation for the OpenNLP
>> UIMA wrapper, I get your point but don't know how to implement a
>> Collection Reader that can create such containers.
>>
>>
>> On 17.08.2012 12:41, Florian Leitner wrote:
>>> As far as the OpenNLP philosophy goes, you'd use a container type that
>>> would determine which part of the SOFA is a title, subtitle, document,
>>> or any other content you are interested in sentence-segmenting and
>>> only process text within that particular container type, while the
>>> default is to process the entire content if no container type is set;
>>
>>
>> Thanks a lot
>>
>
> --
> Andreas Niekler, Dipl. Ing. (FH)
> NLP Group | Department of Computer Science
> University of Leipzig
> Johannisgasse 26 | 04103 Leipzig
>
> mail: [email protected]

--
Florian Leitner, PhD <[email protected]>
Structural Biology and BioComputing Programme
Spanish National Cancer Research Centre (CNIO)
Address: C/ Melchor Fernandez Almagro 3; E-28029 Madrid
Phone: +34 91 732 8000
Fax: +34 91 224 6980
Internet: http://www.cnio.es
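
For reference, a rough sketch of how the opennlp.uima.ContainerType parameter might be declared and set in the SentenceDetector descriptor. This is a fragment, not a complete descriptor: it assumes the two elements sit inside the descriptor's analysisEngineMetaData section next to the existing parameters, and it reuses the example type name from above. Check it against the linked SentenceDetector.xml before relying on it.

```xml
<!-- Fragment (assumed to live inside <analysisEngineMetaData>) -->
<configurationParameters>
  <!-- Declaration of the optional container parameter -->
  <configurationParameter>
    <name>opennlp.uima.ContainerType</name>
    <type>String</type>
    <multiValued>false</multiValued>
    <mandatory>false</mandatory>
  </configurationParameter>
</configurationParameters>

<configurationParameterSettings>
  <!-- Value: fully qualified name of the container annotation type
       (example name, replace with your own type) -->
  <nameValuePair>
    <name>opennlp.uima.ContainerType</name>
    <value>
      <string>org.myorg.tcas.MyAnnotationType</string>
    </value>
  </nameValuePair>
</configurationParameterSettings>
```

With a setting like this in place, only text covered by annotations of the named type would be split into sentences; leaving the parameter unset keeps the default of processing the entire content.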
