[ 
https://issues.apache.org/jira/browse/OPENNLP-479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543963#comment-13543963
 ] 

adithya renduhcintala commented on OPENNLP-479:
-----------------------------------------------

Hi, I am trying to add short forms and abbreviations to my sentence detector 
(using the Java API), but my SD still does not detect abbreviations and splits 
sentences where it should not.

Is there a code snippet showing how to use the abbreviation dictionary when 
training a sentence detector?

                
> Features related to abbreviation dictionary are not properly collected by 
> DefaultSDContextGenerator
> ---------------------------------------------------------------------------------------------------
>
>                 Key: OPENNLP-479
>                 URL: https://issues.apache.org/jira/browse/OPENNLP-479
>             Project: OpenNLP
>          Issue Type: Bug
>          Components: Sentence Detector
>    Affects Versions: tools-1.5.3
>            Reporter: William Colen
>            Assignee: William Colen
>             Fix For: tools-1.5.3
>
>
> The documentation is not clear about whether the entries in the abbreviation 
> dictionary should include the EOS character, for example "mr" or "mr.". Also, 
> part of the collector code expects the dictionary entries to include the EOS 
> character, while other parts do not.
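
To make the ambiguity described in the issue concrete, here is a minimal sketch in plain Java. It is not OpenNLP code: the class AbbreviationLookup and its methods are hypothetical names invented for illustration, and the EOS characters are passed in as an assumption. It shows one way a lookup could accept dictionary entries written either as "mr" or as "mr." by stripping a single trailing EOS character on both the entry side and the query side.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper, NOT part of OpenNLP: illustrates a lookup that
// treats "mr" and "mr." as the same abbreviation entry.
final class AbbreviationLookup {
    private final Set<String> entries = new HashSet<>();
    private final char[] eosChars;

    AbbreviationLookup(char[] eosChars, String... abbreviations) {
        this.eosChars = eosChars.clone();
        for (String abbreviation : abbreviations) {
            // Store every entry in normalized form (lowercase, no trailing EOS).
            entries.add(normalize(abbreviation));
        }
    }

    // Lowercases the token and drops at most one trailing EOS character.
    String normalize(String token) {
        String t = token.toLowerCase(Locale.ROOT);
        if (!t.isEmpty() && isEos(t.charAt(t.length() - 1))) {
            t = t.substring(0, t.length() - 1);
        }
        return t;
    }

    private boolean isEos(char c) {
        for (char eos : eosChars) {
            if (eos == c) {
                return true;
            }
        }
        return false;
    }

    // True if the token matches an entry, with or without the EOS character.
    boolean contains(String token) {
        return entries.contains(normalize(token));
    }
}
```

With a lookup built from the entries "mr" and "Dr.", both contains("Mr.") and contains("dr") return true, so the outcome no longer depends on which spelling the dictionary author chose. Whether the actual fix for this issue normalizes entries this way is not stated in this message.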

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
