Author: pkluegl
Date: Wed Jun 12 09:14:01 2013
New Revision: 1492121

URL: http://svn.apache.org/r1492121
Log:
UIMA-2704
- added section about textruler usage

Modified:
    
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.workbench.textruler.xml

Modified: 
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.workbench.textruler.xml
URL: 
http://svn.apache.org/viewvc/uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.workbench.textruler.xml?rev=1492121&r1=1492120&r2=1492121&view=diff
==============================================================================
--- 
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.workbench.textruler.xml
 (original)
+++ 
uima/sandbox/ruta/trunk/ruta-docbook/src/docbook/tools.ruta.workbench.textruler.xml
 Wed Jun 12 09:14:01 2013
@@ -38,7 +38,8 @@ under the License.
   <para>
     This section gives a short introduction about the included features and 
learners, and how to use the framework to learn UIMA Ruta rules. First, the 
     available rule learning algorithms are introduced in <xref 
linkend="section.tools.ruta.workbench.textruler.learner"/>. Then, 
-    the user interface and the usage is explained in <xref 
linkend="section.tools.ruta.workbench.textruler.ui"/> using an exemplary UIMA 
Ruat project.
+    the user interface and its usage are explained in <xref 
linkend="section.tools.ruta.workbench.textruler.ui"/> and 
+    <xref linkend="section.tools.ruta.workbench.textruler.example"/> 
illustrates the usage with an exemplary UIMA Ruta project.
   </para>
    <section id="section.tools.ruta.workbench.textruler.learner">
     <title>Included rule learning algorithms</title>
@@ -176,10 +177,10 @@ under the License.
       <para>
         The name of the rule learner KEP (knowledge engineering patterns) is 
derived from the idea that humans use different engineering patterns 
         to write annotation rules. This algorithms implements simple rule 
induction methods for some patterns, such as boundary detection 
-        or annotation-based restriction of the window. The results are then 
combined in order to take adavantage of the combination of 
+        or annotation-based restriction of the window. The results are then 
combined in order to take advantage of the combination of 
         the different kinds of induced rules. Since the single rules are 
constructed according to how humans engineer the annotations rules, 
         the resulting rule set should resemble more a handcrafted rule set. 
Furthermore, by exploiting the synergy of the patterns, solutions for 
-        some annotation are much simplier. The following parameters are 
available. For a more detailed description of the parameters, 
+        some annotations are much simpler. The following parameters are 
available. For a more detailed description of the parameters, 
         please refer to the implementation.
       </para>
       <para>
@@ -197,6 +198,23 @@ under the License.
    <section id="section.tools.ruta.workbench.textruler.ui">
    <title>The TextRuler view</title>
       <para> 
+        The TextRuler view is normally located in the lower center of the UIMA 
Ruta perspective and is the main
+        user interface to configure and start the rule learning algorithms. 
The view consists of four parts (cf. <xref 
linkend="figure.tools.ruta.workbench.textruler.main"/>): 
+        The toolbar contains buttons for starting (green button) and stopping 
(red button) the learning process, 
+        and one button for opening the preference page (blue gears) for 
configuring the rule induction algorithms (cf. <xref 
linkend="figure.tools.ruta.workbench.textruler.pref"/>).
+        The upper part of the view contains text fields for defining the set 
of utilized documents. <quote>Training Data</quote>
+        points to the absolute location of the folder containing the gold 
standard documents. <quote>Additional Data</quote> points
+        to the absolute location of documents that can be additionally used by 
the algorithms. These documents are currently only needed
+        by the TraBal algorithm, which tries to learn correction rules for the 
errors in those documents. <quote>Test Data</quote> is not yet available.
+        Finally, <quote>Preprocess Script</quote> points to the absolute 
location of a UIMA Ruta script, which contains all necessary types and can be 
applied
+        to the documents before the algorithms start in order to add 
additional annotations as learning features. The preprocessing can be skipped.
+        All text fields support drag and drop: the user can drag a file from the 
script explorer and drop it into the respective text field.
+        In the center of the view, the target types, for which rules should be 
induced, can be specified in the <quote>Information Types</quote> list.
+        The list <quote>Featured Feature Types</quote> specifies the filtering 
settings, but changing these settings is discouraged. The user is able to 
drop
+        a simple text file, which contains a type with complete namespace in 
each line, to the <quote>Information Types</quote> list in order to add all 
those types.
+        The lower part of the view contains the list of available algorithms. 
All checked algorithms are started when the start button in the toolbar of 
the view is pressed.
+        When the algorithms are started, they display their current action 
after their name, and a result view with the currently induced rules is 
displayed 
+        in the right part of the perspective.
       </para>
       <figure id="figure.tools.ruta.workbench.textruler.main">
       <title>The UIMA Ruta TextRuler framework
@@ -232,6 +250,40 @@ under the License.
         </textobject>
       </mediaobject>
     </figure>
-    
    </section>
+   
+   <section id="section.tools.ruta.workbench.textruler.example">
+   <title>Example</title>
+      <para> 
+      This section gives a short example of how the TextRuler framework is 
applied in order to induce annotation rules. We refer to the screenshot in 
<xref linkend="figure.tools.ruta.workbench.textruler.main"/>
+      for the configuration and are using the exemplary UIMA Ruta project 
<quote>TextRulerExample</quote>, which is part of the source release of UIMA 
Ruta.
+      </para>
+      <para> 
+        In this example, we are using the <quote>KEP</quote> algorithm for 
learning annotation rules for identifying BibTeX entries in the reference 
section of scientific publications:
+        <orderedlist>
+        <listitem>
+          <para>Select the folder <quote>single</quote> and drag and drop it 
to the <quote>Training Data</quote> text field. This folder contains one file 
with 
+          correct annotations and serves as gold standard data in our 
example.</para>
+        </listitem>
+        <listitem>
+          <para>Select the file <quote>Feature.ruta</quote> and drag and drop 
it to the <quote>Preprocess Script</quote> text field. This UIMA Ruta script 
knows all necessary types, especially the types
+          of the annotations we try to learn rules for, and additionally it 
contains rules that create useful annotations, which can be used by the 
algorithm in order to learn better rules.</para>
+        </listitem>
+        <listitem>
+          <para>Select the file <quote>InfoTypes.txt</quote> and drag and drop 
it to the <quote>Information Types</quote> list. This specifies the goal of the 
learning process, 
+          that is, which types of annotations should be created by the induced rules.</para>
+        </listitem>
+        <listitem>
+          <para>Check the checkbox of the <quote>KEP</quote> algorithm and 
press the start button in the toolbar of the view.</para>
+        </listitem>
+        <listitem>
+          <para>The algorithm now tries to induce rules for the targeted 
types. The current result is displayed in the view <quote>KEP Results</quote> 
in the right part of the perspective.</para>
+        </listitem>
+        <listitem>
+          <para>After the algorithm has finished the learning process, create a 
new UIMA Ruta file in the <quote>uima.ruta.example</quote> package and copy the 
content of the result view
+          to the new file. Now, the induced rules can be applied as a normal 
UIMA Ruta script file.</para>
+        </listitem>
+      </orderedlist>
+      </para>
+    </section>
 </section>

