Hi all,

I am new to the UIMA framework and am having trouble running the aggregate AE from the HmmTagger project in the UIMA sandbox. When I load HmmtaggerAggregate.xml as the AE in CVD, I get the following exception:

org.apache.uima.resource.ResourceInitializationException: Error initializing "org.apache.uima.resource.impl.DataResource_impl" from the descriptor file:/C:/Users/Abhik/workspace/Tagger/desc/HmmTagger.xml.

I am running my project in Eclipse.
I am pasting below the contents of the 4 .xml descriptor files in my project. These are largely the same as the ones put up on the SVN server for the HmmTagger code in UIMA sandbox:

HmmtaggerAggregate.xml:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>false</primitive>
  <delegateAnalysisEngineSpecifiers>
    <delegateAnalysisEngine key="SimpleTokenAndSentenceAnnotator">
      <import location="WhitespaceTokenizer.xml" />
    </delegateAnalysisEngine>
    <delegateAnalysisEngine key="HmmTagger">
      <import location="HmmTagger.xml" />
    </delegateAnalysisEngine>
  </delegateAnalysisEngineSpecifiers>
  <analysisEngineMetaData>
    <name>HmmTaggerTAE</name>
    <description />
    <version />
    <vendor />
    <configurationParameters searchStrategy="language_fallback" />
    <configurationParameterSettings />
    <flowConstraints>
      <fixedFlow>
        <node>SimpleTokenAndSentenceAnnotator</node>
        <node>HmmTagger</node>
      </fixedFlow>
    </flowConstraints>
    <typePriorities />
    <fsIndexCollection />
    <capabilities />
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
  <resourceManagerConfiguration />
</analysisEngineDescription>

----------------------------------------------------------------------

This is the text in my WhitespaceTokenizer.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>true</primitive>
  <annotatorImplementationName>org.apache.uima.annotator.WhitespaceTokenizer</annotatorImplementationName>
  <analysisEngineMetaData>
    <name>WhitespaceTokenizer</name>
    <description>creates token and sentence annotations for whitespace separated languages</description>
    <version>1.0</version>
    <vendor>The Apache Software Foundation</vendor>
    <configurationParameters>
      <configurationParameter>
        <name>SofaNames</name>
        <description>The Sofa names the annotator should work on. If no names are specified, the annotator works on the default sofa.</description>
        <type>String</type>
        <multiValued>true</multiValued>
        <mandatory>false</mandatory>
      </configurationParameter>
    </configurationParameters>
    <configurationParameterSettings/>
    <typeSystemDescription>
      <types>
        <typeDescription>
          <name>org.apache.uima.TokenAnnotation</name>
          <description>Single token annotation</description>
          <supertypeName>uima.tcas.Annotation</supertypeName>
          <features>
            <featureDescription>
              <name>tokenType</name>
              <description>token type</description>
              <rangeTypeName>uima.cas.String</rangeTypeName>
            </featureDescription>
            <featureDescription>
              <name>posTag</name>
              <description/>
              <rangeTypeName>uima.cas.String</rangeTypeName>
            </featureDescription>
          </features>
        </typeDescription>
        <typeDescription>
          <name>org.apache.uima.SentenceAnnotation</name>
          <description>sentence annotation</description>
          <supertypeName>uima.tcas.Annotation</supertypeName>
        </typeDescription>
      </types>
    </typeSystemDescription>
    <fsIndexCollection/>
    <capabilities>
      <capability>
        <inputs/>
        <outputs>
          <type>org.apache.uima.TokenAnnotation</type>
          <feature>org.apache.uima.TokenAnnotation:tokentype</feature>
          <type>org.apache.uima.SentenceAnnotation</type>
        </outputs>
        <languagesSupported>
          <language>x-unspecified</language>
        </languagesSupported>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
</analysisEngineDescription>

----------------------------------------------------------------------

This is the text in my HmmTagger.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>true</primitive>
  <annotatorImplementationName>org.apache.uima.examples.tagger.HMMTagger</annotatorImplementationName>
  <analysisEngineMetaData>
    <name>Hidden Markov Model - Part of Speech Tagger</name>
    <description>A configuration of the HmmTaggerAnnotator that looks for parts of speech of identified tokens within existing Sentence and Token annotations. See also WhitespaceTokenizer.xml.</description>
    <version>1.0</version>
    <vendor>The Apache Software Foundation</vendor>
    <configurationParameters>
      <configurationParameter>
        <name>NGRAM_SIZE</name>
        <type>Integer</type>
        <multiValued>false</multiValued>
        <mandatory>true</mandatory>
      </configurationParameter>
    </configurationParameters>
    <configurationParameterSettings>
      <nameValuePair>
        <name>NGRAM_SIZE</name>
        <value>
          <integer>3</integer>
        </value>
      </nameValuePair>
    </configurationParameterSettings>
    <typeSystemDescription>
      <types>
        <typeDescription>
          <name>org.apache.uima.TokenAnnotation</name>
          <description>Single token annotation</description>
          <supertypeName>uima.tcas.Annotation</supertypeName>
          <features>
            <featureDescription>
              <name>posTag</name>
              <description>contains part-of-speech of a corresponding token</description>
              <rangeTypeName>uima.cas.String</rangeTypeName>
            </featureDescription>
          </features>
        </typeDescription>
        <typeDescription>
          <name>org.apache.uima.SentenceAnnotation</name>
          <description>sentence annotation</description>
          <supertypeName>uima.tcas.Annotation</supertypeName>
        </typeDescription>
      </types>
    </typeSystemDescription>
    <typePriorities/>
    <fsIndexCollection/>
    <capabilities>
      <capability>
        <inputs>
          <type>org.apache.uima.TokenAnnotation</type>
          <type allAnnotatorFeatures="true">org.apache.uima.SentenceAnnotation</type>
          <feature>org.apache.uima.TokenAnnotation:end</feature>
          <feature>org.apache.uima.TokenAnnotation:begin</feature>
        </inputs>
        <outputs>
          <type>org.apache.uima.TokenAnnotation</type>
          <feature>org.apache.uima.TokenAnnotation:posTag</feature>
          <feature>org.apache.uima.TokenAnnotation:end</feature>
          <feature>org.apache.uima.TokenAnnotation:begin</feature>
        </outputs>
        <languagesSupported/>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>true</modifiesCas>
      <multipleDeploymentAllowed>true</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
  <externalResourceDependencies>
    <externalResourceDependency>
      <key>Model</key>
      <description>HMM Tagger model file</description>
      <interfaceName>org.apache.uima.examples.tagger.IModelResource</interfaceName>
      <optional>false</optional>
    </externalResourceDependency>
  </externalResourceDependencies>
  <resourceManagerConfiguration>
    <externalResources>
      <externalResource>
        <name>ModelFile</name>
        <description>HMM Tagger model file</description>
        <fileResourceSpecifier>
          <fileUrl>file:english/BrownModel.dat</fileUrl>
        </fileResourceSpecifier>
        <implementationName>org.apache.uima.examples.tagger.ModelResource</implementationName>
      </externalResource>
    </externalResources>
    <externalResourceBindings>
      <externalResourceBinding>
        <key>Model</key>
        <resourceName>ModelFile</resourceName>
      </externalResourceBinding>
    </externalResourceBindings>
  </resourceManagerConfiguration>
</analysisEngineDescription>

----------------------------------------------------------------------

And finally, this is the text in my HmmModelTrainer.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<analysisEngineDescription xmlns="http://uima.apache.org/resourceSpecifier">
  <frameworkImplementation>org.apache.uima.java</frameworkImplementation>
  <primitive>true</primitive>
  <annotatorImplementationName>org.apache.uima.examples.tagger.HMMModelTrainer</annotatorImplementationName>
  <analysisEngineMetaData>
    <name>HMMModelTrainer</name>
    <description>This analysis engine trains an N-gram model for the HMM tagger. It uses a training corpus as reference. This corpus must contain annotations on words with an attribute corresponding of the POS value to be learned. The configuration of this analysis engine is done through several parameters:
      <ul>
        <li>View: - the view from which the tokens will be extracted</li>
        <li>ModelExportFile: - the path where the model will be written</li>
        <li>FeaturePathPOS: - feature path to the value of the POS to be learned. The annotation should exactly cover a "word".</li>
      </ul>
      <b>BEWARE: this analysis engine does not allow multiple deployment !</b>
      <i>NB. At the moment: both bi and trigram statistics are saved in one model file.</i></description>
    <version>1.0</version>
    <vendor/>
    <configurationParameters>
      <configurationParameter>
        <name>View</name>
        <description>The view from which the tokens will be extracted.</description>
        <type>String</type>
        <multiValued>false</multiValued>
        <mandatory>true</mandatory>
      </configurationParameter>
      <configurationParameter>
        <name>ModelExportFile</name>
        <description>The path where the model will be written.</description>
        <type>String</type>
        <multiValued>false</multiValued>
        <mandatory>true</mandatory>
      </configurationParameter>
      <configurationParameter>
        <name>FeaturePathPOS</name>
        <description>Feature path to the value of the POS to be learnt. The annotation should exactly cover a "word".</description>
        <type>String</type>
        <multiValued>false</multiValued>
        <mandatory>true</mandatory>
      </configurationParameter>
    </configurationParameters>
    <configurationParameterSettings>
      <nameValuePair>
        <name>View</name>
        <value>
          <string>_InitialView</string>
        </value>
      </nameValuePair>
      <nameValuePair>
        <name>ModelExportFile</name>
        <value>
          <string>hmmtagger_model.dat</string>
        </value>
      </nameValuePair>
      <nameValuePair>
        <name>FeaturePathPOS</name>
        <value>
          <string>org.apache.uima.TokenAnnotation:posTag</string>
        </value>
      </nameValuePair>
    </configurationParameterSettings>
    <typeSystemDescription/>
    <typePriorities/>
    <fsIndexCollection/>
    <capabilities>
      <capability>
        <inputs/>
        <outputs/>
        <languagesSupported/>
      </capability>
    </capabilities>
    <operationalProperties>
      <modifiesCas>false</modifiesCas>
      <multipleDeploymentAllowed>false</multipleDeploymentAllowed>
      <outputsNewCASes>false</outputsNewCASes>
    </operationalProperties>
  </analysisEngineMetaData>
  <resourceManagerConfiguration/>
</analysisEngineDescription>

I understand that I have ended up writing a huge mail as a query, but I am an absolute newbie to the UIMA framework and shall be extremely grateful to anyone who can help me out here. Thanks a lot for your help!

Regards, Abhik
