Actually, one update: after doing a Maven update project, the lines

UIMAFIT org.apache.uima.ruta.engine.PlainTextAnnotator;
TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;

generate a different exception - basically it can't find BasicTypeSystem.xml

Exception in thread "main" org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.uima.ruta.engine.RutaEngine" failed. (Descriptor: file:/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/target/classes/ecClassifierRulesEngine.xml)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:264)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:169)
    at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
    at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
    at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
    at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:371)
    at org.apache.uima.ruta.engine.Ruta.wrapAnalysisEngine(Ruta.java:95)
    at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:123)
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.ruta.engine.RutaEngine.initialize(RutaEngine.java:519)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:262)
    ... 7 more
Caused by: org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.uima.ruta.engine.RutaEngine.initializeScript(RutaEngine.java:767)
    at org.apache.uima.ruta.engine.RutaEngine.initialize(RutaEngine.java:517)
    ... 8 more
Caused by: org.apache.uima.resource.ResourceInitializationException
    at org.apache.uima.fit.internal.MetaDataUtil.resolve(MetaDataUtil.java:106)
    at org.apache.uima.fit.internal.MetaDataUtil.scanDescriptors(MetaDataUtil.java:170)
    at org.apache.uima.fit.factory.TypeSystemDescriptionFactory.scanTypeDescriptors(TypeSystemDescriptionFactory.java:131)
    at org.apache.uima.fit.factory.TypeSystemDescriptionFactory.createTypeSystemDescription(TypeSystemDescriptionFactory.java:102)
    at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription(AnalysisEngineFactory.java:967)
    at org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine(AnalysisEngineFactory.java:278)
    at org.apache.uima.ruta.engine.RutaEngine.initializeScript(RutaEngine.java:763)
    ... 9 more
Caused by: java.io.FileNotFoundException: class path resource [classpath*:resources/BasicTypeSystem.xml] cannot be resolved to URL because it does not exist
    at org.springframework.core.io.ClassPathResource.getURL(ClassPathResource.java:177)
    at org.apache.uima.fit.internal.MetaDataUtil.resolve(MetaDataUtil.java:101)

On Thu, Jun 23, 2016 at 9:21 AM, Bonnie MacKellar <[email protected]> wrote:

> Hi,
>
> Thanks.
>
> I am not using SourceDocumentInformation in my Ruta script. There is no
> dependency there - in the version that is in a regular Ruta Workbench
> project, I can remove it and everything is fine. I believe, from looking
> at the exception, that the dependency is in UimaFit - it seems to be coming
> from SimplePipeline.runPipeline. I have tried adding it in UimaFit
> fashion, listing it in
> src/main/resources/META-INF/org.apache.uima.fit/types.txt, but I cannot
> seem to get UimaFit to find this file in the Maven version of this project,
> even though it works fine in the non-Maven project. I just cannot figure
> out why this is happening.
>
> I also don't understand this
> "Changing the imports to something like: UIMAFIT
> org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
> need also to adapt the TYPESYSTEM import). Then the script does not
> depend on the project structure."
>
> Change which imports? Is this something in the pom file? UIMAFIT brings in
> additional UimaFit annotation engines to the Ruta script, right? I am not
> calling or using any UimaFit annotation engines in my Ruta script. I am
> just trying to bring in PlainTextAnnotator. That isn't a UimaFit annotator
> - it is something built in to Ruta.
> I tried changing the lines in the script to
> ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;
> TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;
>
> but that doesn't work - I get a
> "org.apache.uima.ruta.engine.PlainTextAnnotator not found " on the line
> ENGINE org.apache.uima.ruta.engine.PlainTextAnnotator;
>
> I then tried changing to
> UIMAFIT org.apache.uima.ruta.engine.PlainTextAnnotator;
> TYPESYSTEM org.apache.uima.ruta.engine.PlainTextTypeSystem;
>
> No compile error, but when I run the script, I get
> Found no script/block: PlainTextAnnotator
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.uima.ruta.engine.RutaEngine.batchProcessComplete(RutaEngine.java:1122)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.batchProcessComplete(PrimitiveAnalysisEngine_impl.java:321)
>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.batchProcessComplete(AnalysisEngineImplBase.java:447)
>     at org.apache.uima.ruta.ide.launching.RutaLauncher.main(RutaLauncher.java:133)
>
> Clearly it isn't finding PlainTextAnnotator - but that is the crux of my
> problem. Where do I put it?
>
> I think my problem is that I don't understand what these plugins are all
> doing or how they affect each other: ruta-maven-plugin,
> jcasgen-maven-plugin, and uimafit-maven-plugin. They all seem to copy
> and/or generate different things to target/classes and
> target/generated-sources, but it is hard to tell exactly which files each
> one is responsible for. I don't have a good mental model of the process!
>
> thanks,
> Bonnie MacKellar
>
> On Thu, Jun 23, 2016 at 5:07 AM, Peter Klügl <[email protected]>
> wrote:
>
>> Hi,
>>
>> sorry, here's just a short reply since I am currently travelling. If
>> the problem still exists I will try to reproduce it and reply with more
>> details next week.
>>
>> Yes, in simple UIMA Ruta projects, these descriptors are copied to
>> descriptor/utils when you create the project. The descriptor folder is
>> listed in the buildpath as a "descriptor" folder, where imported
>> descriptors are searched for.
>>
>> UIMA Ruta currently supports two ways to find the descriptors: the
>> absolute paths specified in the descriptorPaths configuration parameter
>> and the classpath. Thus, the simplest way for you would be to use the
>> classpath to find the descriptor instead of the descriptorPaths (which
>> points to the descriptor folder of your ruta project).
>>
>> Changing the imports to something like: UIMAFIT
>> org.apache.uima.ruta.engine.PlainTextAnnotator should do the trick (you
>> need also to adapt the TYPESYSTEM import). Then the script does not
>> depend on the project structure.
>>
>> If you use the SourceDocumentInformation type system in your ruta
>> script, then you need to include it separately. In some situations, the
>> Ruta Workbench does that automatically for you. However, it is not
>> mentioned in types.txt in ruta-core. So you need to add it there in your
>> maven project so that the typesystem scanning of uimaFIT finds it.
>>
>> If you create the analysis engine (descriptor) for a ruta script
>> programmatically, there are sometimes additional configuration
>> parameters that need to be set. In your use case, you import an additional
>> analysis engine in your script.
>> These need to be mentioned in the
>> corresponding configuration parameters, e.g., PARAM_ADDITIONAL_ENGINES
>> or PARAM_ADDITIONAL_UIMAFIT_ENGINES. Since there are several parameters
>> that are rather technical, I normally use the generated descriptor in
>> the uimaFIT factory.
>>
>> Best,
>>
>> Peter
>>
>> On 22.06.2016 at 21:55, Bonnie MacKellar wrote:
>> > I am still trying to figure out how to count Ruta annotations across a
>> > bunch of input files. There doesn't seem to be any Workbench way to do it.
>> > So now I am trying to call Ruta from UimaFit so I can do the job in Java.
>> >
>> > However, I am having serious configuration problems, plus I have a question
>> > on how to bring in PlainTextAnnotator.
>> >
>> > I am using Maven, with the jcasgen-maven-plugin, the ruta-maven-plugin, and
>> > the uimafit-maven-plugin. I will include the pom file at the end of this
>> > post.
>> >
>> > I want my Java code to be aware of the types declared in the Ruta script -
>> > that is the whole point - I want to count those annotations.
>> >
>> > My Ruta script also uses PlainTextAnnotator. The problem with this is that
>> > I can't figure out where to put it. In a Workbench based Ruta project,
>> > PlainTextAnnotator.xml and PlainTextAnnotatorTypeSystem get put
>> > automatically into descriptor/utils, along with a number of other
>> > descriptors that seem to be built into Ruta. But when I create a project
>> > using maven, there is no such location, and these descriptors do not get
>> > put anywhere. I tried a number of places but could not get my script to see
>> > the type system for PlainTextAnnotator. Finally, I hit on putting the files
>> > in target/generated-sources/ruta/descriptor/utils, and finally my script is
>> > able to see the types and I can run it. This is good because at that point,
>> > the ruta-maven-plugin does its job and generates the descriptors for my
>> > script. However, I suspect this is not a good place to put the
>> > PlainTextAnnotator files since doing a clean overwrites them. Where should
>> > they go? Is there any entry in the pom file that is needed?
>> >
>> > The second problem is that although my Ruta script works nicely on its own,
>> > the Java code fails. I get the following exception
>> > Exception in thread "main" org.apache.uima.cas.CASRuntimeException: JCas
>> > type "org.apache.uima.examples.SourceDocumentInformation" used in Java
>> > code, but was not declared in the XML type descriptor.
>> >     at org.apache.uima.jcas.impl.JCasImpl.getTypeInit(JCasImpl.java:435)
>> >     at org.apache.uima.jcas.impl.JCasImpl.getType(JCasImpl.java:408)
>> >     at org.apache.uima.jcas.cas.TOP.<init>(TOP.java:96)
>> >     at org.apache.uima.jcas.cas.AnnotationBase.<init>(AnnotationBase.java:66)
>> >     at org.apache.uima.jcas.tcas.Annotation.<init>(Annotation.java:54)
>> >     at org.apache.uima.examples.SourceDocumentInformation.<init>(SourceDocumentInformation.java:80)
>> >     at org.apache.uima.examples.cpe.FileSystemCollectionReader.getNext(FileSystemCollectionReader.java:162)
>> >     at org.apache.uima.fit.pipeline.SimplePipeline.runPipeline(SimplePipeline.java:149)
>> >     at PipelineSystem.<init>(PipelineSystem.java:59)
>> >     at PipelineSystem.main(PipelineSystem.java:73)
>> >
>> > I am guessing that I need to put some other descriptor somewhere but I
>> > can't figure out what it might be.
>> > Here is the code that causes the problem
>> > ------------------------------------------------------------------------
>> > import java.io.IOException;
>> > import java.util.Iterator;
>> >
>> > import org.apache.uima.UIMAException;
>> > import org.apache.uima.analysis_engine.AnalysisEngine;
>> > import org.apache.uima.analysis_engine.AnalysisEngineDescription;
>> > import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
>> > import org.apache.uima.cas.Type;
>> > import org.apache.uima.cas.TypeSystem;
>> > import org.apache.uima.collection.CollectionReaderDescription;
>> > import org.apache.uima.examples.cpe.FileSystemCollectionReader;
>> > import org.apache.uima.fit.component.CasDumpWriter;
>> > import org.apache.uima.fit.factory.AnalysisEngineFactory;
>> > import org.apache.uima.fit.factory.CollectionReaderFactory;
>> > import org.apache.uima.fit.pipeline.SimplePipeline;
>> > import org.apache.uima.jcas.JCas;
>> > import org.apache.uima.resource.ResourceInitializationException;
>> > import org.apache.uima.ruta.engine.RutaEngine;
>> >
>> > public class PipelineSystem {
>> >   public PipelineSystem() throws IOException, UIMAException
>> >   {
>> >     try {
>> >       CollectionReaderDescription readerDesc =
>> >           CollectionReaderFactory.createReaderDescription(
>> >               FileSystemCollectionReader.class,
>> >               FileSystemCollectionReader.PARAM_INPUTDIR,
>> >               "/home/bonnie/Research/eclipse-uima-projects/PipeLineWithRuta/input",
>> >               FileSystemCollectionReader.PARAM_ENCODING, "UTF-8",
>> >               FileSystemCollectionReader.PARAM_LANGUAGE, "English");
>> >       AnalysisEngine rae =
>> >           AnalysisEngineFactory.createEngine(RutaEngine.class,
>> >               RutaEngine.PARAM_MAIN_SCRIPT,
>> >               "ecClassifierRules");
>> >       AnalysisEngineDescription rutaEngineDesc =
>> >           AnalysisEngineFactory.createEngineDescription(RutaEngine.class,
>> >               RutaEngine.PARAM_MAIN_SCRIPT,
>> >               "ecClassifierRules");
>> >       AnalysisEngineDescription writerDesc =
>> >           AnalysisEngineFactory.createEngineDescription(CasDumpWriter.class,
>> >               CasDumpWriter.PARAM_OUTPUT_FILE, "dump.txt");
>> >       JCas jCas = rae.newJCas();
>> >       SimplePipeline.runPipeline(readerDesc, rutaEngineDesc);
>> >       displayRutaResults(jCas);
>> >     } catch (ResourceInitializationException e) {
>> >       // TODO Auto-generated catch block
>> >       e.printStackTrace();
>> >     } catch (AnalysisEngineProcessException e) {
>> >       // TODO Auto-generated catch block
>> >       e.printStackTrace();
>> >     }
>> >   }
>> >
>> >   public static void main(String[] args) throws IOException, UIMAException {
>> >     PipelineSystem p = new PipelineSystem();
>> >   }
>> >
>> >   public void displayRutaResults(JCas jCas)
>> >   {
>> >     System.out.println("in display ruta results");
>> >     TypeSystem ts = jCas.getTypeSystem();
>> >     Iterator<Type> typeItr = ts.getTypeIterator();
>> >     while (typeItr.hasNext()) {
>> >       Type type = (Type) typeItr.next();
>> >       if (type.getName().equals("INCL")) {
>> >         System.out.println("INCL was found");
>> >       }
>> >     }
>> >   }
>> > }
>> > ------------------------------------------------------------------------
>> >
>> > Yes, I know the code doesn't actually count annotations yet - this is
>> > strictly a test of the configuration. The type INCL is declared in the
>> > script
>> >
>> > ENGINE utils.PlainTextAnnotator;
>> > TYPESYSTEM utils.PlainTextTypeSystem;
>> > Document{-> RETAINTYPE(BREAK)};
>> > Document{-> EXEC(PlainTextAnnotator, {Line})};
>> >
>> > DECLARE INCL;
>> > "INCLUSION" -> INCL;
>> >
>> > And finally, here is the pom file. I note that the ruta plugin and the
>> > jcasgen plugin are correctly generating the descriptor files for the
>> > script and the Java classes for the types.
>> > I have this set up so that the
>> > jcasgen plugin reads the type descriptors from the folder that is generated
>> > by the ruta-maven-plugin (I saw this in one of the examples mentioned
>> > elsewhere on this mailing list).
>> > However, the uimafit plugin does not generate anything.
>> >
>> > thanks for any help. It is really hard to figure out all these moving parts.
>> >
>> > Bonnie MacKellar
>> >
>> > ------------------------------------------------------------------------
>> >
>> > <project xmlns="http://maven.apache.org/POM/4.0.0"
>> >     xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
>> >     xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
>> >   <modelVersion>4.0.0</modelVersion>
>> >   <groupId>PipeLineWithRuta</groupId>
>> >   <artifactId>PipeLineWithRuta</artifactId>
>> >   <version>0.0.1-SNAPSHOT</version>
>> >   <packaging>jar</packaging>
>> >   <name>PipeLineWithRuta</name>
>> >   <url>http://maven.apache.org</url>
>> >   <properties>
>> >     <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
>> >   </properties>
>> >   <build>
>> >     <sourceDirectory>src/main/java</sourceDirectory>
>> >     <resources>
>> >       <resource>
>> >         <directory>src/main/ruta</directory>
>> >       </resource>
>> >       <resource>
>> >         <directory>src/desc</directory>
>> >       </resource>
>> >     </resources>
>> >     <plugins>
>> >       <plugin>
>> >         <artifactId>maven-compiler-plugin</artifactId>
>> >         <version>3.3</version>
>> >         <configuration>
>> >           <source>1.8</source>
>> >           <target>1.8</target>
>> >         </configuration>
>> >       </plugin>
>> >       <plugin>
>> >         <groupId>org.apache.uima</groupId>
>> >         <artifactId>jcasgen-maven-plugin</artifactId>
>> >         <version>2.4.1</version> <!-- change this to the latest version -->
>> >         <executions>
>> >           <execution>
>> >             <goals>
>> >               <goal>generate</goal>
>> >             </goals> <!-- this is the only goal -->
>> >             <!-- runs in phase process-resources by default -->
>> >             <configuration>
>> >               <!-- REQUIRED -->
>> >               <typeSystemIncludes>
>> >                 <!-- one or more ant-like file patterns identifying top level descriptors -->
>> >                 <typeSystemInclude>target/generated-sources/ruta/descriptor/ecClassifierRulesTypeSystem.xml</typeSystemInclude>
>> >               </typeSystemIncludes>
>> >               <!-- OPTIONAL -->
>> >               <!-- a sequence of ant-like file patterns to exclude from the above include list -->
>> >               <typeSystemExcludes>
>> >               </typeSystemExcludes>
>> >               <!-- OPTIONAL -->
>> >               <!-- where the generated files go -->
>> >               <!-- default value: ${project.build.directory}/generated-sources/jcasgen" -->
>> >               <outputDirectory>
>> >               </outputDirectory>
>> >               <!-- true or false, default = false -->
>> >               <!-- if true, then although the complete merged type system will be created internally,
>> >                    only those types whose definition is contained within this maven project
>> >                    will be generated. The others will be presumed to be available via other
>> >                    projects. -->
>> >               <!-- OPTIONAL -->
>> >               <limitToProject>true</limitToProject>
>> >             </configuration>
>> >           </execution>
>> >         </executions>
>> >       </plugin>
>> >       <plugin>
>> >         <groupId>org.apache.uima</groupId>
>> >         <artifactId>ruta-maven-plugin</artifactId>
>> >         <version>2.3.1</version>
>> >         <configuration>
>> >           <scriptPaths>
>> >             <scriptPath>src/main/ruta/</scriptPath>
>> >           </scriptPaths>
>> >           <!-- Descriptor paths of the generated analysis engine descriptor. -->
>> >           <!-- default value: none -->
>> >           <descriptorPaths>
>> >             <descriptorPath>${project.build.directory}/generated-sources/ruta/descriptor</descriptorPath>
>> >           </descriptorPaths>
>> >           <!-- Resource paths of the generated analysis engine descriptor. -->
>> >           <!-- default value: none -->
>> >           <resourcePaths>
>> >             <resourcePath>${project.build.directory}/generated-sources/ruta/resources/</resourcePath>
>> >           </resourcePaths>
>> >           <analysisEngineSuffix>Engine</analysisEngineSuffix>
>> >           <typeSystemSuffix>TypeSystem</typeSystemSuffix>
>> >           <!-- Type of type system imports. false = import by location. -->
>> >           <!-- default value: false -->
>> >           <importByName>false</importByName>
>> >           <!-- Option to resolve imports while building. -->
>> >           <!-- default value: false -->
>> >           <resolveImports>false</resolveImports>
>> >           <!-- List of packages with language extensions -->
>> >           <!-- default value: none -->
>> >           <extensionPackages>
>> >             <extensionPackage>org.apache.uima.ruta</extensionPackage>
>> >           </extensionPackages>
>> >           <!-- Add UIMA Ruta nature to .project -->
>> >           <!-- default value: false -->
>> >           <addRutaNature>true</addRutaNature>
>> >           <!-- Buildpath of the UIMA Ruta Workbench (IDE) for this project -->
>> >           <!-- default value: none -->
>> >           <buildPaths>
>> >             <buildPath>script:src/main/ruta/</buildPath>
>> >             <buildPath>descriptor:target/generated-sources/ruta/descriptor/</buildPath>
>> >             <buildPath>resources:src/main/resources/</buildPath>
>> >           </buildPaths>
>> >         </configuration>
>> >         <executions>
>> >           <execution>
>> >             <id>default</id>
>> >             <phase>process-classes</phase>
>> >             <goals>
>> >               <goal>generate</goal>
>> >             </goals>
>> >           </execution>
>> >         </executions>
>> >       </plugin>
>> >       <plugin>
>> >         <groupId>org.apache.uima</groupId>
>> >         <artifactId>uimafit-maven-plugin</artifactId>
>> >         <version>2.2.0</version> <!-- change to latest version -->
>> >         <configuration>
>> >           <!-- OPTIONAL -->
>> >           <!-- Path where the generated resources are written. -->
>> >           <outputDirectory>${project.build.directory}/generated-sources/uimafit</outputDirectory>
>> >           <!-- OPTIONAL -->
>> >           <!-- Skip generation of META-INF/org.apache.uima.fit/components.txt -->
>> >           <skipComponentsManifest>false</skipComponentsManifest>
>> >           <!-- OPTIONAL -->
>> >           <!-- Source file encoding. -->
>> >           <encoding>${project.build.sourceEncoding}</encoding>
>> >         </configuration>
>> >         <executions>
>> >           <execution>
>> >             <id>default</id>
>> >             <phase>process-classes</phase>
>> >             <goals>
>> >               <goal>generate</goal>
>> >             </goals>
>> >           </execution>
>> >         </executions>
>> >       </plugin>
>> >     </plugins>
>> >   </build>
>> >   <dependencies>
>> >     <dependency>
>> >       <groupId>org.apache.uima</groupId>
>> >       <artifactId>uimafit-core</artifactId>
>> >       <version>2.2.0</version>
>> >     </dependency>
>> >     <dependency>
>> >       <groupId>org.apache.uima</groupId>
>> >       <artifactId>uimaj-core</artifactId>
>> >       <version>2.8.1</version>
>> >     </dependency>
>> >     <dependency>
>> >       <groupId>org.apache.uima</groupId>
>> >       <artifactId>ruta-maven-plugin</artifactId>
>> >       <version>2.3.1</version>
>> >     </dependency>
>> >     <dependency>
>> >       <groupId>org.apache.uima</groupId>
>> >       <artifactId>uimaj-cpe</artifactId>
>> >       <version>2.8.1</version>
>> >     </dependency>
>> >     <dependency>
>> >       <groupId>org.apache.uima</groupId>
>> >       <artifactId>uimaj-examples</artifactId>
>> >       <version>2.8.1</version>
>> >     </dependency>
>> >   </dependencies>
>> > </project>
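
Since both failures in this thread come down to a file that cannot be found on the classpath (the types.txt under META-INF/org.apache.uima.fit, and resources/BasicTypeSystem.xml in the latest exception), a small plain-Java check may help narrow things down. This is only a rough diagnostic sketch, not part of UIMA, uimaFIT, or Ruta: the class name ClasspathCheck and its locate method are made up for illustration, and the two default resource names are simply copied from the messages above. It lists every classpath location that holds a resource with the given name, which only approximates what uimaFIT does for a non-wildcard classpath*: entry.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

/**
 * Diagnostic sketch (hypothetical helper, not a UIMA/uimaFIT class):
 * reports where a named resource is visible on the runtime classpath.
 */
public class ClasspathCheck {

    /** Returns the URL of every classpath location that holds the named resource. */
    public static List<URL> locate(String resourceName) {
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resourceName));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Default names are taken from the thread; pass other names as arguments.
        String[] names = args.length > 0 ? args : new String[] {
            "META-INF/org.apache.uima.fit/types.txt",
            "resources/BasicTypeSystem.xml"
        };
        for (String name : names) {
            List<URL> hits = locate(name);
            if (hits.isEmpty()) {
                System.out.println("NOT FOUND: " + name);
            } else {
                for (URL url : hits) {
                    System.out.println("found: " + name + " at " + url);
                }
            }
        }
    }
}
```

The check is only meaningful when run with the same classpath as PipelineSystem (for example, the same Eclipse launch configuration). One thing that may be worth checking if the project's own types.txt does not show up: the pom above declares explicit <resources> entries (src/main/ruta and src/desc), and as far as I know an explicit <resources> list replaces Maven's default src/main/resources, so files placed there would no longer be copied to target/classes.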
