Andreas Thiel created UIMA-5723:
-----------------------------------

             Summary: MARKTABLE fails to assign feature for single word entry in first CSV column
                 Key: UIMA-5723
                 URL: https://issues.apache.org/jira/browse/UIMA-5723
             Project: UIMA
          Issue Type: Bug
          Components: Ruta
    Affects Versions: 2.6.1ruta
            Reporter: Andreas Thiel

When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like this

{code:xml}
WAZ;WAZELF
Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
{code}

and a corresponding Ruta script containing these lines

{code:java}
WORDTABLE LawNameTable = 'nl_law_names.csv';
Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
{code}

the text {{WAZ}} appears to be detected, but the {{WetIdentifier}} feature of the resulting annotation is not filled with the string following the semicolon. Instead, it remains empty. (Note: the _WetNaam_ annotation type is defined elsewhere via a type system description.)

In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering zelfstandigen}} is detected and processed as expected, with the feature {{WetIdentifier}} set to {{WAZELF}} after annotating.

Could it be that problems arise when the first column of the CSV file contains only a single word (i.e. no spaces, and no uppercase letters following lowercase characters)? Or is it a matter of configuration? We also experimented with the optional arguments of MARKTABLE regarding uppercase/lowercase distinction, but to no avail.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)