Author: pkluegl
Date: Mon Aug 19 13:28:13 2013
New Revision: 1515405

URL: http://svn.apache.org/r1515405
Log:
UIMA-3115
- added some documentation about inlined rules

Modified:
    uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.syntax.xml
    uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml
    uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.overview.xml

Modified: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.syntax.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.syntax.xml?rev=1515405&r1=1515404&r2=1515405&view=diff
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.syntax.xml (original)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.syntax.xml Mon Aug 19 13:28:13 2013
@@ -93,20 +93,17 @@ GroupAssignment        -> TypeExpression
 RuleElements           -> RuleElement+
 RuleElement            -> RuleElementType | RuleElementLiteral
                         | RuleElementComposed | RuleElementWildCard
-RuleElementType        ->  TypeExpression QuantifierPart?
-                                         ("{" Conditions?  Actions? "}")?
-RuleElementWithCA      ->  TypeExpression QuantifierPart?
-                                            "{" Conditions?  Actions? "}"
-RuleElementLiteral     ->  SimpleStringExpression QuantifierPart?
-                                          ("{" Conditions?  Actions? "}")?
-RuleElementComposed    -> ( RuleElement ("&" RuleElement)+
-                          | RuleElement ("|" RuleElement)+
-                          | "(" RuleElements ")") 
-                          QuantifierPart? ("{" Conditions?  Actions? "}")?
-RuleElementDisjunctive -> "(" (TypeExpression | SimpleStringExpression)
-                        ("|" (TypeExpression | SimpleStringExpression) )+
-                        (")" QuantifierPart? "{" Conditions?  Actions? }")?
-RuleElementWildCard    -> "#"("{" Conditions?  Actions? }")?
+RuleElementType        ->  TypeExpression OptionalRuleElementPart
+RuleElementWithCA      ->  TypeExpression OptionalRuleElementPart
+RuleElementLiteral     ->  SimpleStringExpression OptionalRuleElementPart
+RuleElementComposed    -> ( "(" RuleElement ("&" RuleElement)+ ")"
+                          | "(" RuleElement ("|" RuleElement)+ ")"
+                          | "(" RuleElements ")" )
+                          OptionalRuleElementPart
+OptionalRuleElementPart-> QuantifierPart? ("{" Conditions?  Actions? "}")?
+                          InlinedRules?
+InlinedRules           -> ( "<-" | "->" ) "{" SimpleStatement+ "}"
+RuleElementWildCard    -> "#"("{" Conditions?  Actions? "}")? InlinedRules?
 QuantifierPart         -> "*" | "*?" | "+" | "+?" | "?" | "??"
                         | "[" NumberExpression "," NumberExpression "]"
                         | "[" NumberExpression "," NumberExpression "]?"

Modified: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml?rev=1515405&r1=1515404&r2=1515405&view=diff
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml (original)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.language.xml Mon Aug 19 13:28:13 2013
@@ -343,6 +343,47 @@ Document{->CALL(MyScript.countNumberOfTy
     </section>
 
   </section>
+  
+  <section id="ugr.tools.ruta.language.inlined">
+    <title>Inlined rules</title>
+    <para>
+      A rule element can have a few optional parts, e.g., the quantifier or the curly brackets with conditions and actions.
+      After the part with the conditions and actions, the rule element can also contain an optional part with inlined rules.
+      These rules are applied in the context of the rule element, similar to the rules within a block construct: the rules
+      try to match within the window specified by the current match of the rule element. There are two types of inlined rules.
+      If the curly brackets are preceded by the symbol <quote><![CDATA[->]]></quote>, the inlined rules are only applied for successful matches of the surrounding rule.
+      This behavior is very similar to the block construct. However, there are also some differences, e.g., inlined rules do not specify a
+      namespace, may not contain declarations, and cannot be called by other rules.
+      If the curly brackets are preceded by the symbol <quote><![CDATA[<-]]></quote>,
+      then the inlined rules are interpreted as conditions. The surrounding rule only matches if at least one of the inlined rules was successfully applied.
+      The functionality introduced by inlined rules is illustrated with a few examples:
+    </para>
+    <programlisting><![CDATA[Sentence{} -> {NUM{-> NumBeforeWord} W;};
+Sentence{-> SentenceWithNumBeforeWord} <- {NUM W;};
+]]></programlisting>
+    <para>
+      The first rule in this example matches on each <quote>Sentence</quote> annotation and applies the inlined rule within each matched sentence. The inlined rule
+      matches on numbers followed by a word and annotates the number with an annotation of the type <quote>NumBeforeWord</quote>. The second rule matches on each sentence
+      and applies the inlined rule within each sentence. Note that the inlined rule contains no actions. The rule only matches on a sentence if at least one of the inlined rules was
+      successfully applied. In this case, the sentence is annotated with an annotation of the type <quote>SentenceWithNumBeforeWord</quote> only if the
+      sentence contains a number followed by a word.
+    </para>
+
+    <programlisting><![CDATA[Document.language == "en"{} -> {
+  PERIOD #{} <- {
+      COLON COLON % COMMA COMMA;
+    }
+    PERIOD{-> SpecialPeriod};
+}    
+]]></programlisting>
+    <para>
+      This example combines both types of inlined rules. First, the rule matches on document annotations with the language feature set to <quote>en</quote>. Only for those documents,
+      the first inner rule is applied. The inner rule matches on everything between two periods, but only if the text span between the periods fulfills two conditions: There must be two
+      successive colons and two successive commas within the window of the matched part of the wildcard. Only if these constraints are fulfilled is the last period annotated with the type
+      <quote>SpecialPeriod</quote>.
+    </para>  
+  </section>
+  
   <section id="ugr.tools.ruta.language.score">
     <title>Heuristic extraction using scoring rules</title>
     <para>

Modified: uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.overview.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.overview.xml?rev=1515405&r1=1515404&r2=1515405&view=diff
==============================================================================
--- uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.overview.xml (original)
+++ uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.overview.xml Mon Aug 19 13:28:13 2013
@@ -406,8 +406,9 @@ NUM{PARSE(moneyAmount)} SPECIAL{REGEXP("
     </para>
     
     <programlisting><![CDATA[DECLARE LessThan;
-    MoneyAmount.currency=="€"{-> MoneyAmount.currency="Euro"};
-    MoneyAmount{(MoneyAmount.amount<=100), MoneyAmount.currency=="Euro" -> LessThan};]]></programlisting>
+MoneyAmount.currency=="€"{-> MoneyAmount.currency="Euro"};
+MoneyAmount{(MoneyAmount.amount<=100), 
+    MoneyAmount.currency=="Euro" -> LessThan};]]></programlisting>
 
     <para>
       UIMA Ruta script files with many rules can quickly confuse the reader. The UIMA Ruta language, therefore, allows to import other script files in order to increase
@@ -484,6 +485,27 @@ BLOCK(ForEach) Sentence{} {
     </para>
 
     <para>
+      There are two more language constructs (<quote><![CDATA[->]]></quote> and <quote><![CDATA[<-]]></quote>) that allow rules to be applied within a certain context. These rules can be attached to an arbitrary rule element
+      and are called inlined rules. The first one (<quote><![CDATA[->]]></quote>) interprets the inlined rules as actions. They are executed only if the surrounding rule was able to match,
+      which makes this construct very similar to the block statement.
+    </para>
+
+    <programlisting><![CDATA[DECLARE SentenceWithNoLeadingNP;
+Sentence{}->{
+    Document{-STARTSWITH(NP) -> SentenceWithNoLeadingNP};
+};
+]]></programlisting>
+
+    <para>
+      The second one (<quote><![CDATA[<-]]></quote>) interprets the inlined rules as conditions. The surrounding rule can only match if at least one inlined rule was successfully applied.
+      In the following example, a sentence is annotated with the type SentenceWithNPNP if there are two successive NP annotations within this sentence.
+    </para>
+    <programlisting><![CDATA[DECLARE SentenceWithNPNP;
+Sentence{-> SentenceWithNPNP}<-{
+    NP NP;
+};
+]]></programlisting>
+    <para>
       Let us take a closer look on what exactly the UIMA Ruta rules match. The following rule matches on a word followed by another word:
     </para>
     <programlisting><![CDATA[W W;]]></programlisting>
@@ -858,7 +880,7 @@ ae.process(cas);]]></programlisting>
                   <entry>
                     <link linkend='ugr.tools.ruta.ae.basic.parameter.createdBy'>createdBy</link>
                   </entry>
-                  <entry>Option to add additional information, which rule created a annotation.
+                  <entry>Option to add additional information, which rule created an annotation.
                   </entry>
                   <entry>Single Boolean</entry>
                 </row>
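
Note on the semantics documented in this revision: the difference between the two inlined-rule forms ("->" as actions, "<-" as conditions) can be sketched with a small Python model. This is only an illustration under stated assumptions, not Ruta code and not the UIMA Ruta implementation: sentences are modeled as plain token lists, and the names num_before_word, apply_as_actions, and apply_as_condition are hypothetical helpers invented for this sketch.

```python
from typing import Callable, List, Sequence

# Hypothetical model: an inlined rule scans one window (a list of tokens)
# and returns the start indices of its matches; an empty list means no match.
InlinedRule = Callable[[Sequence[str]], List[int]]


def num_before_word(window: Sequence[str]) -> List[int]:
    """Toy stand-in for the inlined rule `NUM W;` (a number followed by a word)."""
    return [
        i
        for i in range(len(window) - 1)
        if window[i].isdigit() and window[i + 1].isalpha()
    ]


def apply_as_actions(
    windows: Sequence[Sequence[str]], rules: Sequence[InlinedRule]
) -> List[List[List[int]]]:
    """Models `->`: every window matched by the outer rule is kept, and the
    inner rules are simply run inside each window."""
    return [[rule(window) for rule in rules] for window in windows]


def apply_as_condition(
    windows: Sequence[Sequence[str]], rules: Sequence[InlinedRule]
) -> List[Sequence[str]]:
    """Models `<-`: the outer match survives only if at least one inner rule
    matched somewhere inside the window."""
    return [window for window in windows if any(rule(window) for rule in rules)]


sentences = [["I", "have", "3", "apples"], ["No", "digits", "here"]]
actions_result = apply_as_actions(sentences, [num_before_word])
condition_result = apply_as_condition(sentences, [num_before_word])
print(actions_result)
print(condition_result)
```

In this model, the "->" form visits both sentences and records an inner match only in the first one, while the "<-" form drops the second sentence entirely because no inner rule matched inside it.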

