The Apache UIMA team is pleased to announce the release of
Apache UIMA Ruta (Rule-based Text Annotation), version 2.7.0.

The Unstructured Information Management Architecture (UIMA) is a
component framework supporting development, discovery, composition, and
deployment of multi-modal analytics tasked with the analysis of
unstructured information.

Apache UIMA is an Apache-licensed open source implementation of the UIMA
specification, which is being developed by a technical committee within
OASIS, a standards organization. The implementation comprises an SDK and
tooling for composing and running analytic components written in Java
and C++, with some support for Perl, Python and Tcl.

Apache UIMA Ruta is a rule-based script language supported by
Eclipse-based tooling. The language is designed to enable rapid
development of text processing applications within UIMA. A special focus
lies on the intuitive and flexible domain-specific language for defining
patterns of annotations. The Eclipse-based tooling, called the Apache
UIMA Ruta Workbench, was created to support the user and to facilitate
every step of writing rules. Both the rule language and the workbench
integrate smoothly with Apache UIMA.

Major Changes in this Release

UIMA Ruta Language and Analysis Engine:

- Requires Java 8
- New language feature: label expressions at actions for directly
assigning/reusing newly created annotations. Example: Document{-> a:T1,
CREATE(T2, "ref" = a)};
- New language feature: new type of rule element for completely optional
match which does not require an existing annotation and therefore also
works at the boundary of a window/document. Example: NUM _{-PARTOF(CW)};
- Type lists can be used as a matching condition.
- Initial default value of string and annotation variables is now null.
- Comparisons of annotations and annotation lists are now supported.
- New configuration parameter 'inferenceVisitors'.
- New configuration parameter 'maxRuleMatches'.
- New configuration parameter 'maxRuleElementMatches'.
- New configuration parameter 'rulesScriptName'.
- Inlined rules used as conditions are only evaluated if the rule
element match was successful.
- Multiple inlined rule blocks are allowed at one rule element.
- String features with allowed values are supported.
- PlainTextAnnotator supports vertical tabs.
- Various improvements for WORDTABLE.
- Thrown exceptions include script name.
- Fixed values of label for failed matches.
- Fixed inlined rules as condition at wildcards.
- Fixed resetting of annotation-based variables.
- Fixed various bugs of wildcards.
- Fixed CONTAINS condition for annotations overlapping the window.
- Fixed COUNT condition.
- Fixed setting variables by configuration parameter.

UIMA Ruta Workbench:

- Query View supports more CAS formats.
- Fixed order of scripts in Applied Rules view.
- Fixed reporting of non-existing problems in editor.


For a full list of the changes, please refer to Jira:
http://uima.apache.org/d/ruta-2.7.0/issuesFixed/jira-report.html

More information about UIMA Ruta can be found here:
http://uima.apache.org/ruta.html

 - Peter Klügl, for the Apache UIMA development team







