[ 
https://issues.apache.org/jira/browse/UIMA-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13671206#comment-13671206
 ] 

Armin Wegner edited comment on UIMA-2947 at 5/31/13 6:54 AM:
-------------------------------------------------------------

This would be a major change. Old dictionaries will no longer work. Therefore, 
a dictionary converter should be provided.
                
      was (Author: ithohd9u):
    Old dictionaries will no longer work. Therefore, a dictionary converter 
should be provided.
                  
> Improve format of multi-word entries in dictionary files
> --------------------------------------------------------
>
>                 Key: UIMA-2947
>                 URL: https://issues.apache.org/jira/browse/UIMA-2947
>             Project: UIMA
>          Issue Type: Improvement
>          Components: Sandbox-DictionaryAnnotator
>         Environment: Linux
>            Reporter: Armin Wegner
>              Labels: XML,, dictionary
>
> Using a single character to separate tokens in a Dictionary Annotator's 
> dictionary file is not XML-like. It looks like a remnant of the old 
> comma-separated-value days. So remove multiWordSeparator from 
> dictionaryMetaData and let an entry look like 
> <entry><key><token>AOL</token><token>Mail</token></key></entry> or 
> <entry><key><token>azbuz</token><token>.</token><token>com</token></key></entry>.
> By the way, what is <key> good for? Do we need it?
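
A rough sketch of what the converter mentioned in the comment might do for a
single entry, written in Python for brevity. It assumes that an old-style
entry keeps all tokens in the text of <key>, joined by the multiWordSeparator
character (here "|", e.g. <entry><key>AOL|Mail</key></entry>), and that the
file uses no XML namespace. Both points are assumptions for illustration, and
convert_entry is a hypothetical helper, not part of the DictionaryAnnotator.

```python
import xml.etree.ElementTree as ET


def convert_entry(entry, separator):
    """Rewrite one <entry> so that each token of its <key> becomes a <token> child.

    Assumes the old format stores the tokens as plain text inside <key>,
    joined by `separator` (an assumption, see the note above).
    """
    key = entry.find("key")
    if key is None or key.text is None:
        return entry  # nothing to convert
    tokens = key.text.split(separator)
    key.text = None  # drop the old separator-joined text
    for tok in tokens:
        ET.SubElement(key, "token").text = tok
    return entry


old = ET.fromstring("<entry><key>AOL|Mail</key></entry>")
new = convert_entry(old, "|")
print(ET.tostring(new, encoding="unicode"))
# prints <entry><key><token>AOL</token><token>Mail</token></key></entry>
```

A full converter would presumably also read the separator from
dictionaryMetaData, remove that attribute, and loop over every entry in the
file; those steps are left out here.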

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
