Hugues de Mazancourt created UIMA-5680:
------------------------------------------
Summary: Special characters in MARKFAST dictionaries mask entries
Key: UIMA-5680
URL: https://issues.apache.org/jira/browse/UIMA-5680
Project: UIMA
Issue Type: Bug
Components: Ruta
Affects Versions: 2.6.1ruta
Reporter: Hugues de Mazancourt
Attachments: Slash.ruta, dict.txt, text.txt
It seems that when two entries in a MARKFAST dictionary differ only by a
special character, MARKFAST ignores some of them.
My script is:
{{DECLARE AndOr;
Document{->MARKFAST(AndOr, 'dict.txt', true)};
}}
My dict.txt contains:
{{and/or
and or}}
Given the text "knowledge of java and/or php and or Groovy is a plus",
only the second occurrence, "and or" (without the slash), is marked. If I
remove the "unslashed" entry from the dict.txt file, "and/or" is marked correctly.
This also happens with other separators, such as "+" or ".", and even when two
entries share the same prefix. For example, if you add "and/or php" to
dict.txt, it won't be marked.
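
To make the expected result concrete, here is a minimal Python sketch of a naive dictionary matcher. It is not how Ruta implements MARKFAST; the function name mark_entries and its ignore_case parameter are made up for illustration, and it does no token-boundary checking. It only shows which spans I would expect to be marked for the text and entries above.

```python
def mark_entries(text, entries, ignore_case=True):
    """Return (begin, end, covered_text) for each dictionary entry found in text.

    Hypothetical reference matcher, not Ruta code: at every position it takes
    the longest entry that matches literally, special characters included.
    """
    haystack = text.lower() if ignore_case else text
    needles = [e.lower() if ignore_case else e for e in entries]
    # Try longer entries first so that "and/or php" wins over "and/or".
    needles.sort(key=len, reverse=True)

    matches = []
    pos = 0
    while pos < len(haystack):
        for needle in needles:
            if needle and haystack.startswith(needle, pos):
                end = pos + len(needle)
                matches.append((pos, end, text[pos:end]))
                pos = end  # continue after the match, no overlaps
                break
        else:
            pos += 1  # no entry starts here, move on one character
    return matches


text = "knowledge of java and/or php and or Groovy is a plus"

# Both entries should be found, slash or no slash.
print([m[2] for m in mark_entries(text, ["and/or", "and or"])])

# With the longer entry added, it should be preferred over its prefix.
print([m[2] for m in mark_entries(text, ["and/or", "and or", "and/or php"])])
```

With the two-entry dictionary I would expect both "and/or" and "and or" to be marked, and with "and/or php" added I would expect "and/or php" and "and or".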