[
https://issues.apache.org/jira/browse/UIMA-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671206#comment-13671206
]
Armin Wegner edited comment on UIMA-2947 at 5/31/13 6:50 AM:
-------------------------------------------------------------
Old dictionaries will no longer work. Therefore a dictionary converter should be
provided.
was (Author: ithohd9u):
Old dictionaries will no long work. Therefore a dictionary convert should
be provided.
> Improve format of multi-word entries in dictionary files
> --------------------------------------------------------
>
> Key: UIMA-2947
> URL: https://issues.apache.org/jira/browse/UIMA-2947
> Project: UIMA
> Issue Type: Improvement
> Components: Sandbox-DictionaryAnnotator
> Environment: Linux
> Reporter: Armin Wegner
> Labels: XML, dictionary
>
> Using a single character to separate tokens in a Dictionary Annotator's
> dictionary file is not XML-like. It looks like a remnant from old
> comma-separated-value days. So remove multiWordSeparator from
> dictionaryMetaData and let an entry look like
> <entry><key><token>AOL</token><token>Mail</token></key></entry> or
> <entry><key><token>azbuz</token><token>.</token><token>com</token></key></entry>.
> By the way, what is <key> good for? Do we need it?
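
The converter asked for in the comment could look roughly like the following. This is only a sketch under stated assumptions, not the DictionaryAnnotator's actual tooling: it assumes the old format keeps the whole multi-word entry as plain text inside <key>, that the separator is stored in a multiWordSeparator attribute on <dictionaryMetaData>, and that the file uses no XML namespace. The function name convert_dictionary and the fallback separator "|" are hypothetical.

```python
import xml.etree.ElementTree as ET


def convert_dictionary(old_xml: str, default_separator: str = "|") -> str:
    """Rewrite separator-delimited <key> text as nested <token> elements."""
    root = ET.fromstring(old_xml)

    # Assumption: the separator is an attribute of <dictionaryMetaData>.
    # It is removed from the metadata, as the issue proposes.
    separator = default_separator
    meta = root.find(".//dictionaryMetaData")
    if meta is not None:
        separator = meta.attrib.pop("multiWordSeparator", default_separator)
    if not separator:
        raise ValueError("multiWordSeparator must not be empty")

    # Materialize the list first so the tree is not modified while iterating.
    for key in list(root.iter("key")):
        if len(key) > 0:
            continue  # already contains child elements, leave untouched
        text = key.text or ""
        key.text = None
        for part in text.split(separator):
            part = part.strip()
            if part:
                ET.SubElement(key, "token").text = part

    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical old-format input, reduced to the parts the sketch touches.
    old = (
        '<dictionary><dictionaryMetaData multiWordSeparator="|"/>'
        "<entries>"
        "<entry><key>AOL|Mail</key></entry>"
        "<entry><key>azbuz|.|com</key></entry>"
        "</entries></dictionary>"
    )
    print(convert_dictionary(old))
```

Entries whose <key> already contains child elements are skipped, so running the sketch twice on the same file should not change it further.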