Joern Kottmann created OPENNLP-1185:
---------------------------------------

             Summary: Tokenizers should be able to output a new line token
                 Key: OPENNLP-1185
                 URL: https://issues.apache.org/jira/browse/OPENNLP-1185
             Project: OpenNLP
          Issue Type: Improvement
          Components: Tokenizer
            Reporter: Joern Kottmann
            Assignee: Peter Thygesen


Some use cases need the tokenizers to also output new line tokens. This is
needed, e.g., by cTakes to process clinical notes, or by the name finder to
process lists of names where each name is written on its own line. It also
helps the name finder process news articles.

To fix this issue, add an option to all three tokenizers to emit new line tokens.
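
As a rough illustration of the requested behaviour, here is a minimal,
standalone sketch of a whitespace tokenizer with such an option. It does not
use or reflect the OpenNLP API: the class name
NewlineAwareWhitespaceTokenizer and the keepNewLines flag are hypothetical
names chosen for this example, and the real option may look different.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this class is not part of OpenNLP.
final class NewlineAwareWhitespaceTokenizer {

    private NewlineAwareWhitespaceTokenizer() {
    }

    /**
     * Splits text on whitespace. When keepNewLines is true (an assumed
     * option name), each '\n' is emitted as its own token; all other
     * whitespace, including '\r', is dropped.
     */
    static List<String> tokenize(String text, boolean keepNewLines) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the token being built, if there is one.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
                // Optionally surface the line break as a token of its own.
                if (keepNewLines && c == '\n') {
                    tokens.add("\n");
                }
            } else {
                current.append(c);
            }
        }

        // Flush the last token if the text does not end in whitespace.
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }
}
```

With the flag enabled, a list of names written one per line keeps its line
boundaries in the token stream, so a downstream component such as a name
finder can tell where one entry ends and the next begins. With the flag
disabled, the output matches plain whitespace tokenization.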



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
