[
https://issues.apache.org/jira/browse/UIMA-2207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Peter Klügl updated UIMA-2207:
------------------------------
Attachment: UIMA-2207-1.patch
The first patch with the new implementation of the rule inference is attached.
It took longer than expected, and there were also some delays.
Most parts are implemented. Remaining bugs will be fixed as more test
cases are added or as old TextMarker projects are migrated to the new
implementation.
What is still missing:
- Dynamic anchoring is not done yet and therefore deactivated.
- Seeding and the separation of inference annotations are not done yet.
- Tooling support for the new language elements is not available yet, e.g.,
the formatter will delete composed rule elements.
- The explanation component has not yet been adapted to the new features.
- Extraction of the modifier engine is still pending.
I tested the rule inference on a TextMarker project with about a thousand
rules. Memory consumption is still too high, but processing is a bit
faster than before, even though no optimizations are included yet.
The issue will be closed once the missing items have been added in additional
patches or moved to separate issues.
> Reimplementation of the TextMarker engine
> -----------------------------------------
>
> Key: UIMA-2207
> URL: https://issues.apache.org/jira/browse/UIMA-2207
> Project: UIMA
> Issue Type: Improvement
> Components: TextMarker
> Reporter: Peter Klügl
> Assignee: Peter Klügl
> Attachments: UIMA-2207-1.patch
>
>
> Some severe flaws need to be removed from the TextMarker rule inference. This
> requires some major refactoring and reimplementation of internal parts of the
> engine project.
> This improvement:
> - fixes the flaw with two annotations of the same type starting at the
> same offset.
> - introduces quantifiers for sequences of rule elements.
> - eases the use of custom seed information.