On Mar 15, 2013, at 9:26 PM, "Pei Chen (JIRA)" <[email protected]> wrote:
> If you have spare time, do you want to also try adding the relation extractor 
> aggregate to the regression test?  That way this (the pipeline as well as the 
> XML descriptor configuration) would be tested automatically in the future.
> It should be as simple as adding a CPE to the directory.
> 
> /ctakes-regression-test/desc/collection_processing_engine/
> Take a look at 
> http://svn.apache.org/repos/asf/incubator/ctakes/trunk/ctakes-regression-test/desc/collection_processing_engine/CoreferenceCPETest.xml
> For example:
> 1)    Just clone it and point the CPE to 
> ../../../ctakes-relation-extractor/desc/analysis_engine/RelationExtractorAggregate.xml
> instead.
> 2)    Run mvn test once (it should probably fail because there is nothing to 
> compare with, but just collect the generated results).
> 3)    Copy the results from generatedoutput/{NameofCPEFilename}/ into 
> expectedoutput/{NameofCPEFilename}
> 4)    Check the expectedoutput into SVN.
> 5)    Now every time mvn test is run, that CPE will be executed and the results 
> compared automatically.

First, a general comment about the regression test, and then some details about 
where I'm currently stuck.

(1) Is it really a good idea to be asserting that the XML files generated by 
cTAKES components should always be identical? Particularly if the current 
components make some mistakes, shouldn't we only be asserting the things that 
they get right? Something more along the lines of 
org.apache.ctakes.relationextractor.ae.RelationExtractorAnnotatorsTest, where 
we have individual assertions for each thing the relation extractor should have 
found?
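
To make the difference concrete, here is a minimal sketch in plain Java (no 
UIMA or JUnit). The class name, the method names, and the relation strings are 
all hypothetical and only stand in for real cTAKES annotations. It assumes the 
extractor gets two relations right and also emits one spurious relation:

```java
import java.util.Arrays;
import java.util.List;

public class TargetedAssertionSketch {

    // Whole-output comparison: any difference, including a changed order or
    // one extra relation, counts as a failure.
    public static boolean identicalOutput(List<String> expected, List<String> generated) {
        return expected.equals(generated);
    }

    // Targeted comparison: only the relations we are sure about must be
    // present; anything else the extractor emits is ignored.
    public static boolean foundAll(List<String> mustFind, List<String> generated) {
        return generated.containsAll(mustFind);
    }

    public static void main(String[] args) {
        // Hypothetical relations, written as plain strings for illustration.
        List<String> known = Arrays.asList(
                "location_of(pain, abdomen)",
                "degree_of(pain, severe)");
        List<String> generated = Arrays.asList(
                "degree_of(pain, severe)",
                "location_of(pain, abdomen)",
                "location_of(severe, abdomen)"); // a spurious extra relation

        System.out.println("identical: " + identicalOutput(known, generated)); // identical: false
        System.out.println("found all: " + foundAll(known, generated));        // found all: true
    }
}
```

With the whole-output check, fixing the spurious relation later would break the 
test until the expected files are regenerated; with the targeted check it would 
keep passing.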

(2) In trying to add the CPETest, I got stuck trying to get 
ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml 
to work. (This descriptor is referenced by 
ctakes-relation-extractor/desc/analysis_engine/RelationExtractorPreprocessor.xml.)
Here's the error I'm getting:

org.apache.uima.resource.ResourceInitializationException: Initialization of CAS Processor with name "RelationExtractorCPETest" failed.
        at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:83)
        ...
Caused by: org.apache.uima.resource.ResourceConfigurationException: Initialization of CAS Processor with name "RelationExtractorCPETest" failed.
        at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1104)
        ...
Caused by: org.apache.uima.resource.ResourceInitializationException
        at org.apache.ctakes.core.resource.LuceneIndexReaderResourceImpl.load(LuceneIndexReaderResourceImpl.java:80)
        ...
Caused by: java.io.FileNotFoundException: org/apache/ctakes/dictionary/lookup/rxnorm_index
        at org.apache.ctakes.core.resource.FileLocator.locateExplicitly(FileLocator.java:69)
        at org.apache.ctakes.core.resource.FileLocator.locateFile(FileLocator.java:44)
        at org.apache.ctakes.core.resource.LuceneIndexReaderResourceImpl.load(LuceneIndexReaderResourceImpl.java:58)
        ... 53 more

I assume this is because the UMLS indexes aren't in SVN anymore. What's the 
proper way to reference these now, and should DictionaryLookupAnnotatorUMLS.xml 
be updated appropriately?
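
For anyone who wants to check whether the index is simply absent rather than 
misconfigured, a small stand-alone probe along these lines might help. It is 
only a sketch: the class and method names are made up, and it assumes the path 
from the stack trace would be resolved either from the classpath or relative to 
the working directory, which may not match what FileLocator actually does.

```java
import java.io.File;
import java.net.URL;

public class ResourceCheckSketch {

    // Returns a description of where the path was found, or null if it is
    // neither on the classpath nor on the file system.
    public static String locate(String path) {
        URL url = ResourceCheckSketch.class.getClassLoader().getResource(path);
        if (url != null) {
            return "classpath: " + url;
        }
        File file = new File(path);
        if (file.exists()) {
            return "file: " + file.getAbsolutePath();
        }
        return null;
    }

    public static void main(String[] args) {
        // Default to the path reported in the FileNotFoundException.
        String path = args.length > 0
                ? args[0]
                : "org/apache/ctakes/dictionary/lookup/rxnorm_index";
        String where = locate(path);
        System.out.println(where == null ? "not found: " + path : where);
    }
}
```

To be meaningful, it would need to run with the same classpath and working 
directory as the regression test.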

Thanks,

Steve
