[ https://issues.apache.org/jira/browse/UIMA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16369052#comment-16369052 ]
Andreas Thiel commented on UIMA-5723:
-------------------------------------

Thank you Peter for looking into the issue. With a reduced system, I now also get the "correct" behavior. But I have not yet figured out why the full system behaves as reported above. I will try to investigate further and report back.

I would also be very interested in more details on the alternative to MARKLIST and MARKTABLE you mentioned, although it is not yet part of UIMA/Ruta.

> MARKTABLE fails to assign feature for single word entry in first CSV column
> ---------------------------------------------------------------------------
>
>                 Key: UIMA-5723
>                 URL: https://issues.apache.org/jira/browse/UIMA-5723
>             Project: UIMA
>          Issue Type: Bug
>          Components: Ruta
>    Affects Versions: 2.6.1ruta
>            Reporter: Andreas Thiel
>            Assignee: Peter Klügl
>            Priority: Major
>
> When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like
> this
> {code:xml}
> WAZ;WAZELF
> Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
> {code}
> and a corresponding Ruta script containing these lines
> {code:java}
> WORDTABLE LawNameTable = 'nl_law_names.csv';
> Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
> {code}
> it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature
> of the resulting annotation is not filled by the string following the
> semicolon. Instead, it remains empty.
> (Note: _WetNaam_ annotation is defined elsewhere via type system description)
> In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering
> zelfstandigen}} is detected and processed as expected with feature
> WetIdentifier = WAZELF after annotating.
> Could it be that problems arise when only a single word (i.e. no spaces or
> uppercase letters following lowercase chars) is present in the first column
> in the CSV file? Or is it a matter of configuration?
> We also experimented with the optional arguments of MARKTABLE regarding
> uppercase/lowercase distinction, but to no avail.



--
This message was sent by Atlassian JIRA (v7.6.3#76005)