The Apache UIMA team is pleased to announce the release of Apache UIMA Ruta (Rule-based Text Annotation), version 2.7.0.

The Unstructured Information Management Architecture (UIMA) is a component framework supporting development, discovery, composition, and deployment of multi-modal analytics tasked with the analysis of unstructured information. Apache UIMA is an Apache-licensed open source implementation of the UIMA specification, which is being developed by a technical committee within OASIS, a standards organization. The implementation comprises an SDK and tooling for composing and running analytic components written in Java and C++, with some support for Perl, Python and TCL.

Apache UIMA Ruta is a rule-based script language supported by Eclipse-based tooling. The language is designed to enable rapid development of text processing applications within UIMA. A special focus lies on the intuitive and flexible domain-specific language for defining patterns of annotations. The Eclipse-based tooling, called the Apache UIMA Ruta Workbench, was created to support the user and to facilitate every step when writing rules. Both the rule language and the workbench integrate smoothly with Apache UIMA.

Major Changes in this Release

UIMA Ruta Language and Analysis Engine:
- Requires Java 8.
- New language feature: label expressions at actions for directly assigning/reusing newly created annotations.
  Example: Document{-> a:T1, CREATE(T2, "ref" = a)};
- New language feature: new type of rule element for a completely optional match, which does not require an existing annotation and therefore also works at the boundary of a window/document.
  Example: NUM _{-PARTOF(CW)};
- Type lists can be used as matching conditions.
- Initial default value of string and annotation variables is now null.
- Comparison of annotations and annotation lists is now supported.
- New configuration parameter 'inferenceVisitors'.
- New configuration parameter 'maxRuleMatches'.
- New configuration parameter 'maxRuleElementMatches'.
- New configuration parameter 'rulesScriptName'.
- Inlined rules as conditions are only evaluated if the rule element match was successful.
- Multiple inlined rule blocks are allowed at one rule element.
- String features with allowed values are supported.
- PlainTextAnnotator supports vertical tabs.
- Various improvements for WORDTABLE.
- Thrown exceptions include the script name.
- Fixed values of labels for failed matches.
- Fixed inlined rules as conditions at wildcards.
- Fixed resetting of annotation-based variables.
- Fixed various bugs of wildcards.
- Fixed CONTAINS condition for annotations overlapping the window.
- Fixed COUNT condition.
- Fixed setting variables by configuration parameter.

UIMA Ruta Workbench:
- Query View supports more CAS formats.
- Fixed order of scripts in Applied Rules view.
- Fixed reporting of non-existing problems in editor.

For a full list of the changes, please refer to Jira:
http://uima.apache.org/d/ruta-2.7.0/issuesFixed/jira-report.html

More information about UIMA Ruta can be found here:
http://uima.apache.org/ruta.html

- Peter Klügl, for the Apache UIMA development team