[ 
https://issues.apache.org/jira/browse/UIMA-5723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16370252#comment-16370252
 ] 

Andreas Thiel commented on UIMA-5723:
-------------------------------------

I was finally able to pin down the factor that caused the misbehavior. 
MARKTABLE started to behave as expected once I created the CAS with the 
{{TypePriorities}} argument of {{CasCreationUtils.createCas}} set to the value 
returned by the standard {{TypePrioritiesFactory.createTypePriorities()}}. 
Previously, this argument had been {{null}}, probably because the code was 
copied from somewhere on the internet without understanding the role of the 
arguments. To be honest, I still don't understand the role of _TypePriorities_ 
or why they alter the outcome of the feature assignment in MARKTABLE; maybe 
you can explain that?
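
For reference, UIMA's built-in annotation index sorts annotations by begin 
offset (ascending), then end offset (descending), then type priority, so 
priorities only decide the order of annotations that cover exactly the same 
span. The sketch below is a simplified plain-Java model of that ordering, not 
UIMA code: the class {{AnnotationOrderSketch}}, the {{Span}} helper and the 
{{Token}} and {{Sentence}} type names are made up for illustration, and placing 
unlisted types after listed ones is an assumption of the model. Whether this 
ordering is what changes the MARKTABLE result is not established here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Simplified model of the annotation index ordering; NOT UIMA's implementation.
class AnnotationOrderSketch {

    // Hypothetical stand-in for an annotation: a type name plus character offsets.
    static final class Span {
        final String type;
        final int begin;
        final int end;

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Position of a type in the priority list; unlisted types rank last
    // (an assumption of this model).
    static int rank(List<String> priorities, String type) {
        int i = priorities.indexOf(type);
        return i < 0 ? priorities.size() : i;
    }

    // Begin ascending, end descending, then position in the priority list.
    static Comparator<Span> indexOrder(List<String> priorities) {
        return (a, b) -> {
            if (a.begin != b.begin) {
                return Integer.compare(a.begin, b.begin);
            }
            if (a.end != b.end) {
                return Integer.compare(b.end, a.end);
            }
            return Integer.compare(rank(priorities, a.type), rank(priorities, b.type));
        };
    }

    // Returns the type names in the order the modelled index would yield them.
    static List<String> sortedTypes(List<Span> spans, List<String> priorities) {
        List<Span> copy = new ArrayList<>(spans);
        copy.sort(indexOrder(priorities));
        List<String> types = new ArrayList<>();
        for (Span s : copy) {
            types.add(s.type);
        }
        return types;
    }

    public static void main(String[] args) {
        List<Span> spans = Arrays.asList(
                new Span("Token", 0, 3),
                new Span("WetNaam", 0, 3),
                new Span("Sentence", 0, 20));
        // Only the two annotations sharing the span [0, 3) swap places.
        System.out.println(sortedTypes(spans, Arrays.asList("WetNaam", "Token")));
        System.out.println(sortedTypes(spans, Arrays.asList("Token", "WetNaam")));
    }
}
```

In this model the longer {{Sentence}} span always comes first, while the two 
annotations covering {{WAZ}} change places depending on the priority list, 
which is the only situation in which priorities make a difference.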

If you think that is the normal and expected behavior, I would opt for closing 
this ticket.

Regarding the possible replacement of wordtables: our system now relies 
heavily on assigning feature values taken from the table, so whatever the 
replacement turns out to be, please make sure that this capability is retained 
in some form.

> MARKTABLE fails to assign feature for single word entry in first CSV column
> ---------------------------------------------------------------------------
>
>                 Key: UIMA-5723
>                 URL: https://issues.apache.org/jira/browse/UIMA-5723
>             Project: UIMA
>          Issue Type: Bug
>          Components: Ruta
>    Affects Versions: 2.6.1ruta
>            Reporter: Andreas Thiel
>            Assignee: Peter Klügl
>            Priority: Major
>
> When using Ruta's MARKTABLE action with a CSV file {{nl_law_names.csv}} like 
> this
> {code:xml}
> WAZ;WAZELF
> Wet arbeidsongeschiktheidsverzekering zelfstandigen;WAZELF
> {code}
> and corresponding Ruta script containing these lines
> {code:java}
> WORDTABLE LawNameTable = 'nl_law_names.csv';
> Document{->MARKTABLE(WetNaam, 1, LawNameTable, "WetIdentifier" = 2)};
> {code}
> it seems that the text {{WAZ}} is detected, but the {{WetIdentifier}} feature 
> of the resulting annotation is not filled by the string following the 
> semicolon. Instead, it remains empty.
> (Note: _WetNaam_ annotation is defined elsewhere via type system description)
> In contrast, the fully written name {{Wet arbeidsongeschiktheidsverzekering 
> zelfstandigen}} is detected and processed as expected with feature 
> WetIdentifier = WAZELF after annotating.
> Could it be that problems arise when only a single word (i.e. no spaces or 
> uppercase letters following lowercase chars) is present in the first column 
> in the CSV file? Or is it a matter of configuration?
> We experimented also with the optional arguments of MARKTABLE regarding 
> uppercase/lowercase distinction, but to no avail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
