Author: pkluegl
Date: Wed Jan 16 17:03:42 2013
New Revision: 1434039

URL: http://svn.apache.org/viewvc?rev=1434039&view=rev
Log:
UIMA-2507
- added test for modifier
- added parameter to modifier and updated documentation

Added: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/engine/TextMarkerModifierTest.java uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/resources/org/apache/uima/textmarker/engine/TextMarkerModifierTest.tm Modified: uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.overview.xml uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/java/org/apache/uima/textmarker/engine/TextMarkerModifier.java uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/resources/org/apache/uima/textmarker/engine/Modifier.xml uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/AllTests.java uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/TextMarkerTestUtils.java Modified: uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml (original) +++ uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.language.xml Wed Jan 16 17:03:42 2013 @@ -403,22 +403,15 @@ Headline{SCORE(5,10)->LOG("Maybe a headl <title>Modification</title> <para> There are different actions that can modify the input document, - like DEL, - COLOR and REPLACE. However, the input document itself can not be - modified - directly. A separate engine, the Modifier.xml, has to be - called in - order to create another CAS view with the name "modified". - In that - document, all modifications are executed. + like DEL, COLOR and REPLACE. 
However, the input document itself can not be + modified directly. A separate engine, the Modifier.xml, has to be + called in order to create another CAS view with the (default) name "modified". + In that document, all modifications are executed. </para> <para> The following example shows how to import and call the - Modifier.xml - engine. - The example is explained in detail in - <xref linkend='ugr.tools.tm.overview.examples' /> - . + Modifier.xml engine. The example is explained in detail in + <xref linkend='ugr.tools.tm.overview.examples' />. </para> <programlisting><![CDATA[ENGINE utils.Modifier; Date{-> DEL}; Modified: uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.overview.xml URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.overview.xml?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.overview.xml (original) +++ uima/sandbox/TextMarker/trunk/uima-docbook-textmarker/src/docbook/tools.textmarker.overview.xml Wed Jan 16 17:03:42 2013 @@ -1018,7 +1018,7 @@ ae.process(cas);]]></programlisting> <section id="ugr.tools.tm.ae.modifier"> <title>Modifier</title> <para> - The Modifier Analysis Engine can be used to create an additional view <quote>modified</quote>, which contains all textual modifications and HTML highlightings that + The Modifier Analysis Engine can be used to create an additional view, which contains all textual modifications and HTML highlightings that were specified by the executed rules. This Analysis Engine can be applied, e.g., for anonymization where all annotations of persons are replaced by the string <quote>Person</quote>. Furthermore, the content of the new view can optionally be stored in a new HTML file. 
@@ -1044,10 +1044,18 @@ ae.process(cas);]]></programlisting> <section id="ugr.tools.tm.ae.modifier.parameter.outputLocation"> <title>outputLocation</title> <para> - This string parameter specifies the absolute path of the resulting file named <quote>output.modified.html</quote>. However, if an annotation of the + This optional string parameter specifies the absolute path of the resulting file named <quote>output.modified.html</quote>. However, if an annotation of the type <quote>org.apache.uima.examples.SourceDocumentInformation</quote> is given, then the value of this parameter is interpreted to be relative to the URI stored in the annotation and the name of the file will be adapted to the name of the source file. The TextMarker IDE automatically adds - the SourceDocumentInformation annotation when the user launches a script file. The default value of this parameter is <quote>/../</quote>. + the SourceDocumentInformation annotation when the user launches a script file. The default value of this parameter is empty. + In this case no additional html file will be created. + </para> + </section> + <section id="ugr.tools.tm.ae.modifier.parameter.outputView"> + <title>outputView</title> + <para> + This string parameter specifies the name of the view, which will contain the modified document. A view of this name must not yet exist. + The default value of this parameter is <quote>modified</quote>. 
</para> </section> </section> Modified: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/java/org/apache/uima/textmarker/engine/TextMarkerModifier.java URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/java/org/apache/uima/textmarker/engine/TextMarkerModifier.java?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/java/org/apache/uima/textmarker/engine/TextMarkerModifier.java (original) +++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/java/org/apache/uima/textmarker/engine/TextMarkerModifier.java Wed Jan 16 17:03:42 2013 @@ -28,6 +28,7 @@ import java.util.Iterator; import java.util.List; import java.util.Map; +import org.apache.commons.lang3.StringUtils; import org.apache.uima.UimaContext; import org.apache.uima.analysis_component.JCasAnnotator_ImplBase; import org.apache.uima.analysis_engine.AnalysisEngineProcessException; @@ -44,32 +45,32 @@ import org.apache.uima.tools.stylemap.St import org.apache.uima.util.FileUtils; public class TextMarkerModifier extends JCasAnnotator_ImplBase { - public static final String MODIFIED_SOFA = "modified"; + public static final String DEFAULT_MODIFIED_VIEW = "modified"; - private static final String OUTPUT_LOCATION = "outputLocation"; + public static final String OUTPUT_LOCATION = "outputLocation"; + + public static final String OUTPUT_VIEW = "outputView"; private StyleMapFactory styleMapFactory; private String styleMapLocation; - private UimaContext context; - private String[] descriptorPaths; private String outputLocation; + private String modifiedViewName; + @Override public void initialize(UimaContext aContext) throws ResourceInitializationException { super.initialize(aContext); - if (aContext == null && context != null) { - aContext = context; - } styleMapLocation = (String) aContext.getConfigParameterValue(StyleMapCreator.STYLE_MAP); 
descriptorPaths = (String[]) aContext .getConfigParameterValue(TextMarkerEngine.DESCRIPTOR_PATHS); outputLocation = (String) aContext.getConfigParameterValue(TextMarkerModifier.OUTPUT_LOCATION); styleMapFactory = new StyleMapFactory(); - this.context = aContext; + modifiedViewName = (String) aContext.getConfigParameterValue(TextMarkerModifier.OUTPUT_VIEW); + modifiedViewName = StringUtils.isBlank(modifiedViewName) ? DEFAULT_MODIFIED_VIEW : modifiedViewName; } @Override @@ -83,7 +84,7 @@ public class TextMarkerModifier extends Iterator<?> viewIterator = cas.getViewIterator(); while (viewIterator.hasNext()) { JCas each = (JCas) viewIterator.next(); - if (each.getViewName().equals(MODIFIED_SOFA)) { + if (each.getViewName().equals(modifiedViewName)) { modifiedView = each; break; } @@ -91,12 +92,12 @@ public class TextMarkerModifier extends if (modifiedView == null) { try { - modifiedView = cas.createView(MODIFIED_SOFA); + modifiedView = cas.createView(modifiedViewName); } catch (Exception e) { - modifiedView = cas.getView(MODIFIED_SOFA); + modifiedView = cas.getView(modifiedViewName); } } else { - modifiedView = cas.getView(MODIFIED_SOFA); + modifiedView = cas.getView(modifiedViewName); } String locate = TextMarkerEngine.locate(styleMapLocation, descriptorPaths, ".xml", true); try { @@ -108,11 +109,13 @@ public class TextMarkerModifier extends String documentText = modifiedView.getDocumentText(); if (documentText != null) { - try { - File outputFile = getOutputFile(cas.getCas()); - FileUtils.saveString2File(documentText, outputFile); - } catch (IOException e) { - throw new AnalysisEngineProcessException(e); + File outputFile = getOutputFile(cas.getCas()); + if (outputFile != null) { + try { + FileUtils.saveString2File(documentText, outputFile); + } catch (IOException e) { + throw new AnalysisEngineProcessException(e); + } } } @@ -123,8 +126,11 @@ public class TextMarkerModifier extends } private File getOutputFile(CAS cas) { - Type sdiType = 
cas.getTypeSystem().getType(TextMarkerEngine.SOURCE_DOCUMENT_INFORMATION); + if (StringUtils.isBlank(outputLocation)) { + return null; + } + Type sdiType = cas.getTypeSystem().getType(TextMarkerEngine.SOURCE_DOCUMENT_INFORMATION); String filename = "output.modified.html"; File file = new File(outputLocation, filename); if (sdiType != null) { Modified: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/resources/org/apache/uima/textmarker/engine/Modifier.xml URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/resources/org/apache/uima/textmarker/engine/Modifier.xml?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/resources/org/apache/uima/textmarker/engine/Modifier.xml (original) +++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/main/resources/org/apache/uima/textmarker/engine/Modifier.xml Wed Jan 16 17:03:42 2013 @@ -42,7 +42,13 @@ </configurationParameter> <configurationParameter> <name>outputLocation</name> - <description>Location of the modified document in HTML</description> + <description></description> + <type>String</type> + <multiValued>false</multiValued> + <mandatory>false</mandatory> + </configurationParameter> + <configurationParameter> + <name>outputView</name> <type>String</type> <multiValued>false</multiValued> <mandatory>false</mandatory> @@ -50,9 +56,9 @@ </configurationParameters> <configurationParameterSettings> <nameValuePair> - <name>outputLocation</name> + <name>outputView</name> <value> - <string>/../</string> + <string>modified</string> </value> </nameValuePair> </configurationParameterSettings> Modified: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/AllTests.java URL: 
http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/AllTests.java?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/AllTests.java (original) +++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/AllTests.java Wed Jan 16 17:03:42 2013 @@ -23,6 +23,7 @@ import org.apache.uima.textmarker.condit import org.apache.uima.textmarker.condition.PartOfTest; import org.apache.uima.textmarker.condition.PositionTest; import org.apache.uima.textmarker.engine.HtmlAnnotatorTest; +import org.apache.uima.textmarker.engine.TextMarkerModifierTest; import org.apache.uima.textmarker.seed.DefaultSeederTest; import org.apache.uima.textmarker.verbalizer.ActionVerbalizerTest; import org.apache.uima.textmarker.verbalizer.ConditionVerbalizerTest; @@ -38,7 +39,7 @@ import org.junit.runners.Suite.SuiteClas RuleInferenceTest2.class, RuleInferenceTest3.class, AllActionsTest.class, AllConditionsTest.class, CurrentCountTest.class, PartOfTest.class, PositionTest.class, DefaultSeederTest.class, ConditionVerbalizerTest.class, ActionVerbalizerTest.class, - ExpressionVerbalizerTest.class, HtmlAnnotatorTest.class, EmptyDocumentTest.class }) + ExpressionVerbalizerTest.class, HtmlAnnotatorTest.class, EmptyDocumentTest.class, TextMarkerModifierTest.class }) public class AllTests { } Modified: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/TextMarkerTestUtils.java URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/TextMarkerTestUtils.java?rev=1434039&r1=1434038&r2=1434039&view=diff ============================================================================== --- 
uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/TextMarkerTestUtils.java (original) +++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/TextMarkerTestUtils.java Wed Jan 16 17:03:42 2013 @@ -64,7 +64,7 @@ public class TextMarkerTestUtils { } } - private static final String TYPE = "org.apache.uima.T"; + public static final String TYPE = "org.apache.uima.T"; public static CAS process(String ruleFileName, String textFileName, int amount) throws URISyntaxException, IOException, InvalidXMLException, Added: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/engine/TextMarkerModifierTest.java URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/engine/TextMarkerModifierTest.java?rev=1434039&view=auto ============================================================================== --- uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/engine/TextMarkerModifierTest.java (added) +++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/java/org/apache/uima/textmarker/engine/TextMarkerModifierTest.java Wed Jan 16 17:03:42 2013 @@ -0,0 +1,88 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. 
See the License for the + * specific language governing permissions and limitations + * under the License. + */ + +package org.apache.uima.textmarker.engine; + +import static org.junit.Assert.assertEquals; + +import java.net.URL; +import java.util.ArrayList; +import java.util.Collection; + +import org.apache.uima.UIMAFramework; +import org.apache.uima.analysis_engine.AnalysisEngine; +import org.apache.uima.analysis_engine.AnalysisEngineDescription; +import org.apache.uima.cas.CAS; +import org.apache.uima.resource.ResourceSpecifier; +import org.apache.uima.resource.metadata.TypeSystemDescription; +import org.apache.uima.textmarker.TextMarkerTestUtils; +import org.apache.uima.util.CasCreationUtils; +import org.apache.uima.util.XMLInputSource; +import org.junit.Test; + +public class TextMarkerModifierTest { + + @Test + public void test() throws Exception { + String namespace = this.getClass().getPackage().getName().replaceAll("\\.", "/"); + URL url = HtmlAnnotator.class.getClassLoader().getResource("Modifier.xml"); + if (url == null) { + url = HtmlAnnotator.class.getClassLoader().getResource( + "org/apache/uima/textmarker/engine/Modifier.xml"); + } + XMLInputSource in = new XMLInputSource(url); + ResourceSpecifier specifier = UIMAFramework.getXMLParser().parseResourceSpecifier(in); + AnalysisEngineDescription aed = (AnalysisEngineDescription) specifier; + + TypeSystemDescription basicTypeSystem = aed.getAnalysisEngineMetaData().getTypeSystem(); + for (int i = 1; i <= 20; i++) { + basicTypeSystem.addType(TextMarkerTestUtils.TYPE + i, "Type for Testing", "uima.tcas.Annotation"); + } + Collection<TypeSystemDescription> tsds = new ArrayList<TypeSystemDescription>(); + tsds.add(basicTypeSystem); + TypeSystemDescription mergeTypeSystems = CasCreationUtils.mergeTypeSystems(tsds); + aed.getAnalysisEngineMetaData().setTypeSystem(mergeTypeSystems); + AnalysisEngine ae = UIMAFramework.produceAnalysisEngine(aed); + ae.setConfigParameterValue(TextMarkerModifier.OUTPUT_LOCATION, 
"");
+    String viewName = "modified_for_testing";
+    ae.setConfigParameterValue(TextMarkerModifier.OUTPUT_VIEW, viewName);
+    ae.reconfigure();
+
+    String scriptName = this.getClass().getSimpleName();
+    CAS cas = null;
+    try {
+      cas = TextMarkerTestUtils.process(namespace + "/" + scriptName + ".tm", namespace + "/test.html", 50);
+    } catch (Exception e) {
+      e.printStackTrace();
+      assert (false);
+    }
+    ae.process(cas);
+
+    CAS modifiedView = cas.getView(viewName);
+    String text = modifiedView.getDocumentText();
+
+    assertEquals("start of bodynormal BOLDend of body" , text);
+
+
+    cas.release();
+    ae.destroy();
+  }
+
+
+}

Added: uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/resources/org/apache/uima/textmarker/engine/TextMarkerModifierTest.tm
URL: http://svn.apache.org/viewvc/uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/resources/org/apache/uima/textmarker/engine/TextMarkerModifierTest.tm?rev=1434039&view=auto
==============================================================================
--- uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/resources/org/apache/uima/textmarker/engine/TextMarkerModifierTest.tm (added)
+++ uima/sandbox/TextMarker/trunk/uimaj-textmarker/src/test/resources/org/apache/uima/textmarker/engine/TextMarkerModifierTest.tm Wed Jan 16 17:03:42 2013
@@ -0,0 +1,10 @@
+PACKAGE org.apache.uima;
+
+DECLARE T1,T2,T3,T4,T5,T6,T7,T8,T9,T10,T11,T12,T13,T14,T15,T16,T17,T18,T19,T20,T21,T22,T23,T24,T25;
+
+Document{-> RETAINTYPE(MARKUP, SPACE, BREAK)};
+
+MARKUP{ -> DEL}; BREAK{-> DEL};
+
+SW {REGEXP("bold") -> REPLACE("BOLD")};
+
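
Note on the new parameter handling: the changes to TextMarkerModifier.java above make both parameters optional. A blank outputView falls back to "modified", and a blank outputLocation means no HTML file is written. The following is a minimal, UIMA-free sketch of just that fallback logic, not the committed code. The class name ModifierConfigSketch and the methods resolveViewName and resolveOutputFile are hypothetical, isBlank is a hand-written stand-in for the StringUtils.isBlank call used in the commit, and the SourceDocumentInformation branch of getOutputFile is deliberately left out.

```java
import java.io.File;

// Hypothetical illustration class; it is not part of the commit.
public class ModifierConfigSketch {

  // Mirrors the DEFAULT_MODIFIED_VIEW constant introduced by the commit.
  static final String DEFAULT_MODIFIED_VIEW = "modified";

  // Stand-in for org.apache.commons.lang3.StringUtils.isBlank:
  // true for null, the empty string, or whitespace-only input.
  static boolean isBlank(String s) {
    if (s == null) {
      return true;
    }
    for (int i = 0; i < s.length(); i++) {
      if (!Character.isWhitespace(s.charAt(i))) {
        return false;
      }
    }
    return true;
  }

  // A blank outputView value falls back to the default view name.
  static String resolveViewName(String configured) {
    return isBlank(configured) ? DEFAULT_MODIFIED_VIEW : configured;
  }

  // A blank outputLocation yields null, which the caller treats as
  // "do not write an HTML file". The real getOutputFile additionally
  // adapts the path when a SourceDocumentInformation annotation exists;
  // that part is omitted from this sketch.
  static File resolveOutputFile(String outputLocation) {
    if (isBlank(outputLocation)) {
      return null;
    }
    return new File(outputLocation, "output.modified.html");
  }

  public static void main(String[] args) {
    System.out.println(resolveViewName(null));
    System.out.println(resolveViewName("modified_for_testing"));
    System.out.println(resolveOutputFile(""));
  }
}
```

This matches how the new test configures the engine: OUTPUT_LOCATION is set to the empty string so that no file is produced, and OUTPUT_VIEW is set to "modified_for_testing" so that the result can be read back from that view.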