[
https://issues.apache.org/jira/browse/UIMA-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Klügl resolved UIMA-5775.
-------------------------------
Resolution: Fixed
> Performance problem MARKTABLE when matching case insensitive
> ------------------------------------------------------------
>
> Key: UIMA-5775
> URL: https://issues.apache.org/jira/browse/UIMA-5775
> Project: UIMA
> Issue Type: Bug
> Components: Ruta
> Affects Versions: 2.6.1ruta
> Reporter: Jasper Huzen
> Assignee: Peter Klügl
> Priority: Major
> Fix For: 2.6.2ruta
>
> Attachments: UIMA-5775.patch
>
>
> Hi,
> We encounter a performance issue (or possibly an infinite loop) when we use
> the MARKTABLE action with case-insensitive value lists.
> The call in our script is:
> {code:java}
> ADDRETAINTYPE(WS);
> MARKTABLE(LawName, 1, 'nl_law_names.ignorecase.csv', true, 0, "", 0,
> "lawIdentifier" = 2);{code}
> The following input fragment results in a timeout exception after 1
> minute.
> {code:java}
> Groenboek COM(2006) 105 definitief een Europese strategie voor duurzame,
> concurrerende en continu geleverde energie voor Europa {SEC(2006)317}{code}
> That complete name is a Dutch law name and is also an entry in the
> _nl_law_names.csv_ file.
> When we try to match it with the ignoreCase flag set to false, matching is
> fast and causes no problem. If we set that flag to true (case is ignored),
> matching is really slow or even hangs in an infinite loop.
> I debugged the code, which pointed me to the _TreeWordList_ class. The
> recursive method _recursiveContains_ has a potential bug.
> I think the problem occurs when the item contains a special character that
> is identical in uppercase and lowercase. The recursive method will then
> look/fork twice on the same tree item.
> I made a fix that checks whether the uppercase character is the same as the
> lowercase character, and in that case it only does the recursive call once.
> That solved the (performance) issue, but I'm not sure whether this is really
> the main problem and whether the current fix is the best fix for it.
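
The forking described above can be illustrated with a minimal, self-contained sketch. This is not the code from TreeWordList or from the attached patch: the class CaselessForkSketch, its Node type, and the methods containsNaive and containsDeduped are hypothetical names used only for illustration. It assumes a plain character trie in which a case-insensitive lookup tries both the lowercase and the uppercase child. For characters without case (digits, brackets, spaces), both lookups return the same node, so every failed match is repeated and the work can double per such character.

```java
import java.util.HashMap;
import java.util.Map;

public class CaselessForkSketch {

    /** Minimal trie node; a hypothetical stand-in, not Ruta's own node class. */
    static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        boolean wordEnd;
    }

    /** Counts recursive calls so the two variants can be compared. */
    static long calls;

    static void add(Node root, String word) {
        Node node = root;
        for (char c : word.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.wordEnd = true;
    }

    /** Case-insensitive lookup that always tries the lower- and uppercase child. */
    static boolean containsNaive(Node node, String text, int index) {
        calls++;
        if (index == text.length()) {
            return node.wordEnd;
        }
        char c = text.charAt(index);
        Node lower = node.children.get(Character.toLowerCase(c));
        Node upper = node.children.get(Character.toUpperCase(c));
        boolean found = lower != null && containsNaive(lower, text, index + 1);
        // For a caseless character, upper is the same node as lower, so a
        // failed lookup is repeated on the identical subtree.
        return found || (upper != null && containsNaive(upper, text, index + 1));
    }

    /** Same lookup, but recurses only once when both case variants are equal. */
    static boolean containsDeduped(Node node, String text, int index) {
        calls++;
        if (index == text.length()) {
            return node.wordEnd;
        }
        char lowerChar = Character.toLowerCase(text.charAt(index));
        char upperChar = Character.toUpperCase(text.charAt(index));
        Node lower = node.children.get(lowerChar);
        if (lower != null && containsDeduped(lower, text, index + 1)) {
            return true;
        }
        if (upperChar == lowerChar) {
            return false; // caseless character: the subtree was already searched
        }
        Node upper = node.children.get(upperChar);
        return upper != null && containsDeduped(upper, text, index + 1);
    }

    /** Runs one lookup from the root and returns the number of recursive calls. */
    static long countCalls(boolean deduped, Node root, String text) {
        calls = 0;
        if (deduped) {
            containsDeduped(root, text, 0);
        } else {
            containsNaive(root, text, 0);
        }
        return calls;
    }

    public static void main(String[] args) {
        Node root = new Node();
        add(root, "SEC(2006)317");
        String miss = "SEC(2006)318"; // differs only in the last character
        System.out.println("naive calls:   " + countCalls(false, root, miss));
        System.out.println("deduped calls: " + countCalls(true, root, miss));
    }
}
```

In this sketch the near-miss input contains eight caseless characters before the mismatch, so the naive variant needs far more calls than the deduplicated one. A long entry such as the law name above has many more spaces, digits, and brackets, which would be consistent with the reported timeout, although the sketch does not prove that this is the only cause in Ruta.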
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)