I've stumbled upon a problem with UIMA Ruta Workbench 2.3.1 in Eclipse
Luna 4.4.2. Whenever a WORDLIST or WORDTABLE contains an entry that
shares a common prefix with another entry, one of the two is not
recognized and therefore not annotated.
Consider this minimal example:
WORDLIST "Keywords.txt" in the resources directory with the following entries:
Bill Clinton
Billy
Input file in input directory with the following contents:
Billy wished he was president, just like Bill Clinton once was.
Main.ruta script in scripts directory:
WORDLIST list = 'Keywords.txt';
DECLARE president;
Document {->MARKFAST(president, list)};
Upon execution, only Bill Clinton is annotated, while Billy is
ignored.
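To make the expected result concrete, here is a minimal Python sketch of the behavior I would expect from the lookup. It is not how Ruta implements MARKFAST; the function name mark_entries is hypothetical and only serves to illustrate that both entries should be found even though they share the prefix "Bill".

```python
import re


def mark_entries(text, entries):
    """Return (begin, end, covered_text) for every occurrence of every entry.

    Hypothetical helper for illustration only; not part of UIMA Ruta.
    """
    spans = []
    for entry in entries:
        # Word boundaries keep an entry from matching inside a longer word.
        pattern = r"\b" + re.escape(entry) + r"\b"
        for match in re.finditer(pattern, text):
            spans.append((match.start(), match.end(), match.group()))
    # Sort by begin offset so the result follows document order.
    return sorted(spans)


entries = ["Bill Clinton", "Billy"]
text = "Billy wished he was president, just like Bill Clinton once was."

annotations = mark_entries(text, entries)
for begin, end, covered in annotations:
    print(begin, end, covered)
```

With the two entries above, I would expect both Billy and Bill Clinton to show up in the result, which is what this sketch produces and what the Ruta script does not.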
Any help/hints/comments appreciated!
Best regards,
Ronny