Author: pkluegl
Date: Wed Jan 30 15:46:43 2013
New Revision: 1440482

URL: http://svn.apache.org/viewvc?rev=1440482&view=rev
Log:
UIMA-2619
- added note about the actual method for determining the filtering settings
- extended information for FILTERTYPE and RETAINTYPE and also added the note to these actions

Modified:
    uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.actions.xml
    uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml

Modified: uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.actions.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.actions.xml?rev=1440482&r1=1440481&r2=1440482&view=diff
==============================================================================
--- uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.actions.xml (original)
+++ uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.actions.xml Wed Jan 30 15:46:43 2013
@@ -383,6 +383,18 @@ Document{->EXEC(NamedEntities)};]]></pro
 ignored by rules. Expressions are not yet supported. This action is
 related to RETAINTYPE (see <xref linkend='ugr.tools.tm.language.actions.retaintype' />).
 </para>
+ <note>
+ <para>
+ The visibility of types is calculated using three lists:
+ A list <quote>default</quote> for the initially filtered types,
+ which is specified in the configuration parameters of the analysis engine, the list <quote>filtered</quote>, which is
+ specified by the FILTERTYPE action, and the list <quote>retained</quote>, which is specified by the RETAINTYPE action.
+ For determining the actual visibility of types, list <quote>filtered</quote> is added to list <quote>default</quote>
+ and then all elements of list <quote>retained</quote> are removed. The annotations of the types in the resulting list are not visible.
+ Please note that the actions FILTERTYPE and RETAINTYPE replace all elements of the respective lists and that RETAINTYPE
+ overrides FILTERTYPE.
+ </para>
+ </note>
 <section>
 <title>
 <emphasis role="bold">Definition:</emphasis>
@@ -402,6 +414,12 @@ Document{->EXEC(NamedEntities)};]]></pro
 This rule filters all small written words in the input document. They are
 further ignored by every rule.
 </para>
+ <para>
+ <programlisting><![CDATA[Document{->FILTERTYPE};]]></programlisting>
+ </para>
+ <para>
+ Here, the the action (without parentheses) specifies that no additional types should be filtered.
+ </para>
 </section>
 </section>
 
@@ -983,6 +1001,18 @@ Document{-> MARKTABLE(Struct, 1, TestTab
 are now not ignored by rules. This action is
 related to FILTERTYPE (see <xref linkend='ugr.tools.tm.language.actions.filtertype' />).
 </para>
+ <note>
+ <para>
+ The visibility of types is calculated using three lists:
+ A list <quote>default</quote> for the initially filtered types,
+ which is specified in the configuration parameters of the analysis engine, the list <quote>filtered</quote>, which is
+ specified by the FILTERTYPE action, and the list <quote>retained</quote>, which is specified by the RETAINTYPE action.
+ For determining the actual visibility of types, list <quote>filtered</quote> is added to list <quote>default</quote>
+ and then all elements of list <quote>retained</quote> are removed. The annotations of the types in the resulting list are not visible.
+ Please note that the actions FILTERTYPE and RETAINTYPE replace all elements of the respective lists and that RETAINTYPE
+ overrides FILTERTYPE.
+ </para>
+ </note>
 <section>
 <title>
 <emphasis role="bold">Definition:</emphasis>
@@ -1001,6 +1031,12 @@ Document{-> MARKTABLE(Struct, 1, TestTab
 <para>
 Here, all spaces are retained and can be matched by rules.
 </para>
+ <para>
+ <programlisting><![CDATA[Document{->RETAINTYPE};]]></programlisting>
+ </para>
+ <para>
+ Here, the the action (without parentheses) specifies that no types should be retained.
+ </para>
 </section>
 </section>
 

Modified: uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml
URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml?rev=1440482&r1=1440481&r2=1440482&view=diff
==============================================================================
--- uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml (original)
+++ uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml Wed Jan 30 15:46:43 2013
@@ -56,18 +56,30 @@ under the License.
 unexpected markup. The TextMarker System enables the knowledge
 engineer to filter and to hide all possible combinations of
 predefined and new types of annotations. The
- visibility of tokens and
- annotations is modified by the actions of
- rule elements and can be
- conditioned using the complete
+ visibility of tokens and annotations is modified by the actions of
+ rule elements and can be conditioned using the complete
 expressiveness of the language.
 Therefore the TextMarker system supports a
 robust approach to information
 extraction and simplifies the
 creation of new rules since the
 knowledge engineer can focus on
- important textual features. If no
- rule action changed the
+ important textual features.
+ </para>
+ <note>
+ <para>
+ The visibility of types is calculated using three lists:
+ A list <quote>default</quote> for the initially filtered types,
+ which is specified in the configuration parameters of the analysis engine, the list <quote>filtered</quote>, which is
+ specified by the FILTERTYPE action, and the list <quote>retained</quote>, which is specified by the RETAINTYPE action.
+ For determining the actual visibility of types, list <quote>filtered</quote> is added to list <quote>default</quote>
+ and then all elements of list <quote>retained</quote> are removed. The annotations of the types in the resulting list are not visible.
+ Please note that the actions FILTERTYPE and RETAINTYPE replace all elements of the respective lists and that RETAINTYPE
+ overrides FILTERTYPE.
+ </para>
+ </note>
+ <para>
+ If no rule action changed the
 configuration of the filtering settings, then the default filtering
 configuration ignores
 whitespaces and markup.
@@ -115,6 +127,7 @@ Dr.JoachimBaumeister
 since the second rule uses the filtered type PERIOD and is therefore
 not executed.
 </para>
+
 </section>
 <section id="ugr.tools.tm.language.blocks">
 <title>Blocks</title>
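
The visibility rule stated in the added note amounts to simple set arithmetic, which can be sketched in a few lines of Python. This is only an illustration of the described calculation, not TextMarker code: the class name FilterState, its method names, and the choice of SPACE and MARKUP as the default list are assumptions made for this sketch (the documentation only says that the defaults ignore whitespace and markup).

```python
class FilterState:
    """Toy model of the three lists from the note (hypothetical, not a TextMarker API)."""

    def __init__(self, default):
        self.default = set(default)   # from the analysis engine configuration
        self.filtered = set()         # set by FILTERTYPE
        self.retained = set()         # set by RETAINTYPE

    def filtertype(self, *types):
        # FILTERTYPE replaces the whole "filtered" list; no arguments clears it.
        self.filtered = set(types)

    def retaintype(self, *types):
        # RETAINTYPE replaces the whole "retained" list; no arguments clears it.
        self.retained = set(types)

    def hidden(self):
        # (default + filtered) minus retained: these types are not visible.
        return (self.default | self.filtered) - self.retained


state = FilterState(default={"SPACE", "MARKUP"})  # assumed default list

state.filtertype("SW")           # like Document{->FILTERTYPE(SW)};
print(sorted(state.hidden()))    # ['MARKUP', 'SPACE', 'SW']

state.retaintype("SPACE")        # like Document{->RETAINTYPE(SPACE)};
print(sorted(state.hidden()))    # ['MARKUP', 'SW']

state.filtertype()               # like Document{->FILTERTYPE};
print(sorted(state.hidden()))    # ['MARKUP']

state.retaintype()               # like Document{->RETAINTYPE};
print(sorted(state.hidden()))    # ['MARKUP', 'SPACE']
```

Because the retained list is subtracted last, a type that appears in both lists stays visible, which is what the note means by RETAINTYPE overriding FILTERTYPE.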