[jira] [Updated] (UIMA-2207) Reimplementation of the TextMarker engine

JIRA Thu, 22 Sep 2011 04:57:54 -0700

     [ 
https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Peter Klügl updated UIMA-2207:
------------------------------

    Attachment: UIMA-2207-1.patch

First patch added with the new implementation of the rule inference. It took me 
longer than expected, and there were also some delays.

Most parts are implemented and remaining bugs will be fixed when more test 
cases are added or old TextMarker projects are migrated to the new 
implementation.

What is still missing:
- Dynamic anchoring is not done yet and therefore deactivated.
- Seeding and the separation of inference annotations are not done yet.
- Tooling support for the new language elements is not available yet; e.g., the 
formatter will delete composed rule elements.
- The explanation component is not yet adapted to the new features.
- The modifier engine has not been extracted yet.

I tested the rule inference on a TextMarker project with about a thousand rules: 
the memory consumption is still too high, but the processing time is a bit 
faster than before, even though no optimizations are included yet.

The issue will be closed when the missing parts are added with additional 
patches or are moved to separate issues.


> Reimplementation of the TextMarker engine
> -----------------------------------------
>
>                 Key: UIMA-2207
>                 URL: https://issues.apache.org/jira/browse/UIMA-2207
>             Project: UIMA
>          Issue Type: Improvement
>          Components: TextMarker
>            Reporter: Peter Klügl
>            Assignee: Peter Klügl
>         Attachments: UIMA-2207-1.patch
>
>
> Some severe flaws need to be removed from the TextMarker rule inference. This 
> requires some major refactoring and reimplementation of internal parts of the 
> engine project.
> This improvement covers:
> - fixes the flaw with two annotations of the same type starting at the 
> same offset.
> - introduces quantifiers for sequences of rule elements.
> - eases the usage of own seed information.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
