As you have already figured out, the "container" is simply a subtype of the 
UIMA Annotation type (i.e., a type with offsets). If you check the OpenNLP 
code, you will see that all you need to do is tell the SentenceDetector the 
fully qualified name of your type as a String value, say 
"org.myorg.tcas.MyAnnotationType".
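
As a rough sketch (not taken from the OpenNLP sources), declaring such a 
container type in a UIMA type system descriptor might look like the fragment 
below. The type name is just the example name used above, and the description 
text is made up for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<typeSystemDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <types>
    <typeDescription>
      <!-- Example name only; use your own fully qualified type name. -->
      <name>org.myorg.tcas.MyAnnotationType</name>
      <description>Span of text that should be sentence-segmented.</description>
      <!-- Inheriting from uima.tcas.Annotation provides the begin/end offsets. -->
      <supertypeName>uima.tcas.Annotation</supertypeName>
    </typeDescription>
  </types>
</typeSystemDescription>
```

Your Collection Reader (or an earlier annotator) would then add annotations 
of this type over the spans you want segmented.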

This type name string can be set as a parameter in the descriptor of the 
SentenceDetector annotator; the parameter is/should be named:

opennlp.uima.ContainerType

If set, the SentenceDetector will only split text into sentences that are 
inside the spans annotated by this "ContainerType".
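
For illustration, setting the parameter in the annotator descriptor could 
look roughly like this. This is only a sketch of the relevant parts inside 
the analysisEngineMetaData element, assuming the parameter is a single 
optional String; check the actual descriptor to see whether the declaration 
is already there before adding it, and replace the example type name with 
your own:

```xml
<!-- Declaration of the parameter (may already exist in the descriptor). -->
<configurationParameters>
  <configurationParameter>
    <name>opennlp.uima.ContainerType</name>
    <type>String</type>
    <multiValued>false</multiValued>
    <mandatory>false</mandatory>
  </configurationParameter>
</configurationParameters>

<!-- The value: the fully qualified name of your container type. -->
<configurationParameterSettings>
  <nameValuePair>
    <name>opennlp.uima.ContainerType</name>
    <value>
      <string>org.myorg.tcas.MyAnnotationType</string>
    </value>
  </nameValuePair>
</configurationParameterSettings>
```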

Hope these instructions were clear enough - if you need more details, let me 
know. Here is the link to the XML descriptor for the OpenNLP SentenceDetector 
annotator once more:

https://svn.apache.org/repos/asf/opennlp/trunk/opennlp-uima/descriptors/SentenceDetector.xml

Cheers,
Florian

On 17 Aug 2012, at 15:42, Andreas Niekler wrote:

> Hello,
> 
> sorry for the extra reply, but I forgot to ask how I can pass this type to 
> the annotator then.
> 
> Thanks again
> 
> Andreas
> 
> Am 17.08.2012 15:34, schrieb Andreas Niekler:
>> Hello,
>> 
>> thanks a lot. But how exactly can I define a container type, which should
>> be an AnnotationFS, I guess? And how do I then pass the container
>> information to the annotator? Given the missing documentation for the
>> OpenNLP UIMA wrapper, I get your point but don't know how to implement a
>> Collection Reader that can create such containers.
>> 
>> 
>> Am 17.08.2012 12:41, schrieb Florian Leitner:
>>> As far as the OpenNLP philosophy goes, you'd use a container type that
>>> would determine which part of the SOFA is a title, subtitle, document,
>>> or any other content you are interested in sentence-segmenting and
>>> only process text within that particular container type, while the
>>> default is to process the entire content if no container type is set;
>> 
>> 
>> Thanks a lot
>> 
> 
> -- 
> Andreas Niekler, Dipl. Ing. (FH)
> NLP Group | Department of Computer Science
> University of Leipzig
> Johannisgasse 26 | 04103 Leipzig
> 
> mail: [email protected]

-- 
Florian Leitner, PhD <[email protected]>

Structural Biology and BioComputing Programme
Spanish National Cancer Research Centre (CNIO)

Address: C/ Melchor Fernandez Almagro 3; E-28029 Madrid
Phone: +34 91 732 8000
Fax: +34 91 224 6980
Internet: http://www.cnio.es
