[
https://issues.apache.org/jira/browse/UIMA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367288#comment-16367288
]
Peter Klügl commented on UIMA-5723:
-----------------------------------
I was not able to reproduce the problem. I used the CSV file itself as input
and both annotations had feature values. On which input did you observe the
problem? Did you change the visibility settings?
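For reference, the result I would expect on that input can be sketched outside of Ruta. The snippet below is only an illustration in Python, not how MARKTABLE is implemented: the helper names build_lookup and expected_annotations are made up for this example, and it uses a plain substring search that ignores tokenization, visibility and filtered characters.

```python
import csv
import io

# The two rows from the issue description (semicolon-separated).
CSV_TEXT = (
    "WAZ;WAZELF\n"
    "Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF\n"
)


def build_lookup(csv_text, key_column=1, feature_column=2):
    """Map each key_column entry to its feature_column value (1-based columns).

    Hypothetical helper that mirrors the column indexes used in the script.
    """
    reader = csv.reader(io.StringIO(csv_text), delimiter=";")
    return {row[key_column - 1]: row[feature_column - 1] for row in reader if row}


def expected_annotations(document_text, lookup):
    """Return (covered text, WetIdentifier) pairs for entries found in the text.

    Hypothetical helper: plain substring search, no Ruta matching semantics.
    """
    return [(entry, value) for entry, value in lookup.items() if entry in document_text]


lookup = build_lookup(CSV_TEXT)
# Use the CSV file itself as the input document, as in the reproduction attempt.
results = expected_annotations(CSV_TEXT, lookup)
for covered, identifier in results:
    print(f"{covered!r} -> WetIdentifier = {identifier}")
```

Both entries, the abbreviation and the full name, should end up with the same feature value from the second column.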
The problem normally occurs if the visibility settings or the filtered
characters are modified during the lookup. The feature assignment performs an
additional lookup step in which not all functionality is available, which is
suboptimal at best.
I personally do not use WORDLISTs and WORDTABLEs anymore, but I have not yet
contributed the alternative to UIMA Ruta. I really should catch up on that.
> MARKTABLE fails to assign feature for single word entry in first CSV column
> ---------------------------------------------------------------------------
>
> Key: UIMA-5723
> URL: https://issues.apache.org/jira/browse/UIMA-5723
> Project: UIMA
> Issue Type: Bug
> Components: Ruta
> Affects Versions: 2.6.1ruta
> Reporter: Andreas Thiel
> Assignee: Peter Klügl
> Priority: Major
>
> When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like
> this
> {code:xml}
> WAZ;WAZELF
> Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
> {code}
> and corresponding Ruta script containing these lines
> {code:java}
> WORDTABLE LawNameTable = 'nl_law_names.csv';
> Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
> {code}
> it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature
> of the resulting annotation is not filled by the string following the
> semicolon. Instead, it remains empty.
> (Note: _WetNaam_ annotation is defined elsewhere via type system description)
> In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering
> zelfstandigen}} is detected and processed as expected with feature
> WetIdentifier = WAZELF after annotating.
> Could it be that problems arise when only a single word (i.e. no spaces or
> uppercase letters following lowercase chars) is present in the first column
> in the CSV file? Or is it a matter of configuration?
> We experimented also with the optional arguments of MARKTABLE regarding
> uppercase/lowercase distinction, but to no avail.