[ https://issues.apache.org/jira/browse/OPENNLP-1185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zemerick reopened OPENNLP-1185:
------------------------------------
> Tokenizers should be able to output a new line token
> ----------------------------------------------------
>
> Key: OPENNLP-1185
> URL: https://issues.apache.org/jira/browse/OPENNLP-1185
> Project: OpenNLP
> Issue Type: Improvement
> Components: Tokenizer
> Reporter: Jörn Kottmann
> Priority: Major
> Labels: ctakes
> Fix For: 1.9.5
>
>
> Some use cases need the tokenizers to also output new line tokens. This is
> needed, e.g., by cTAKES to process clinical notes, or by the name finder to
> process lists of names where each name is written on its own line. It also
> helps the name finder to process news articles.
> To fix this issue, add an option to all three tokenizers to emit new line
> tokens.
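
For illustration only, here is a minimal sketch of what such an option could look like for a whitespace-style tokenizer. It is not the OpenNLP implementation: the class name NewlineAwareWhitespaceTokenizer and the keepNewLines parameter are hypothetical, and the sketch assumes that a line break is reported as a single "\n" token while "\r" is treated as ordinary whitespace.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-alone sketch; not part of the OpenNLP API.
final class NewlineAwareWhitespaceTokenizer {

    // Splits text on whitespace. When keepNewLines is true, every '\n'
    // is additionally emitted as its own "\n" token.
    static List<String> tokenize(String text, boolean keepNewLines) {
        List<String> tokens = new ArrayList<>();
        int start = -1; // start index of the current token, -1 if none is open

        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace closes the token that is currently open, if any.
                if (start >= 0) {
                    tokens.add(text.substring(start, i));
                    start = -1;
                }
                // Assumption: only '\n' becomes a token; '\r' is dropped
                // like any other whitespace character.
                if (keepNewLines && c == '\n') {
                    tokens.add("\n");
                }
            } else if (start < 0) {
                start = i; // first character of a new token
            }
        }

        // Flush the last token when the text does not end in whitespace.
        if (start >= 0) {
            tokens.add(text.substring(start));
        }
        return tokens;
    }
}
```

With keepNewLines enabled, the input "Jane Doe\nJohn Smith" yields the tokens "Jane", "Doe", "\n", "John", "Smith", so a downstream component such as a name finder can tell that the two names sit on separate lines. With the option disabled, the "\n" token is simply absent.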