If you can run your process in a debugger such as Eclipse, you can suspend execution during those 12 minutes and inspect the stack to see what is happening.
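
If attaching a debugger is not convenient, a rough alternative is to dump every thread's stack from inside the process while it appears stuck. The sketch below is only an illustration: the class name StackDump and the method dumpAllThreads are made up for this example, and the only API it relies on is the standard Thread.getAllStackTraces(). The JDK's jstack tool gives a similar dump from outside the process.

```java
import java.util.Map;

public class StackDump {

    /** Builds a text dump of the current stack of every live thread. */
    public static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            // Header line: thread name and its current state (RUNNABLE, WAITING, ...)
            out.append('"').append(thread.getName()).append('"')
               .append(" state=").append(thread.getState()).append('\n');
            // One line per frame, innermost call first
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() a few times from a background thread during the slow period, and comparing the frames of the "main" thread between dumps, should show whether the time is going into the dictionary lookup or somewhere else.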
When I experienced similar behavior, the Dictionary Lookup was reading the database files from a .jar file in my .m2 (Maven) repository. The easiest way I found to avoid this was to delete or rename that file in my .m2 directory. This is annoying, because rebuilding re-downloads the file and I have to do it all over again. (If there is a better way, I would love to hear about it.)

IMAT Solutions <http://imatsolutions.com>
Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547
[email protected]

On Sat, Jun 27, 2015 at 9:08 PM, Jeff Headley <[email protected]> wrote:

> I was able to get by the error by modifying my installation's
> DictionaryLookupAnnotatorUMLS.xml file. I changed:
>
> <fileUrl>file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
>
> to
>
> <fileUrl>file:resources/org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml</fileUrl>
>
> and that seemed to work.
>
> I saw only a slight performance improvement however. Would anyone be able
> to tell me what is going on between these two log statements that takes
> about 12 minutes?
>
> 2015-06-27 22:45:02.374 INFO 8972 --- [ main] .a.c.d.l.a.UmlsDictionaryLookupAnnotator : process(JCas)
> 2015-06-27 22:57:39.385 INFO 8972 --- [ main] o.a.c.c.parser.MaxentParserWrapper : Started processing: null
>
> On Sat, Jun 27, 2015 at 12:45 PM, Jeff Headley <[email protected]>
> wrote:
>
>> I have changed my cTAKES dependencies in my pom back to
>> <scope>provided</scope> and I think I have the classpath set correctly as
>> it seems to start out ok but eventually gets this new error. I'm hoping
>> maybe someone has seen this before and can help me out. I believe my cTAKES
>> is installed correctly. I followed the guide and can use the CVD. The
>> analysis engine I'm attempting to load is
>> desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml.
>>
>> 2015-06-27 12:36:07.425 DEBUG 10332 --- [ main] o.a.ctakes.core.ae.OverlapAnnotator : Overlap bitset: {3}
>> 2015-06-27 12:36:07.453 INFO 10332 --- [ main] o.a.c.d.p.ae.ClearNLPDependencyParserAE : using Morphy analysis? true
>> Loading configuration.
>> Loading feature templates.
>> Loading lexica.
>> Loading model:
>> ........................................................................................
>> 2015-06-27 12:36:16.930 INFO 10332 --- [ main] org.apache.ctakes.chunker.ae.Chunker : Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
>> 2015-06-27 12:36:17.952 INFO 10332 --- [ main] c.c.a.ContextDependentTokenizerAnnotator : Finite state machines loaded.
>> 2015-06-27 12:36:17.959 INFO 10332 --- [ main] o.a.c.c.parser.ae.ConstituencyParser : Initializing parser...
>> 2015-06-27 12:36:20.616 INFO 10332 --- [ main] o.a.ctakes.necontexts.ContextAnnotator : SCOPE ORDER: [1, 3]
>> 2015-06-27 12:36:20.619 INFO 10332 --- [ main] o.a.c.n.n.NegationContextAnalyzer : initBoundaryData() called for ContextInitializer
>> 2015-06-27 12:36:20.758 INFO 10332 --- [ main] org.apache.ctakes.postagger.POSTagger : POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip
>> 2015-06-27 12:36:21.061 ERROR 10332 --- [ main] c.e.c.processors.CommandLineProcessor : ResourceInitializationException:
>>
>> org.apache.uima.resource.ResourceInitializationException: Error
>> initializing "org.apache.uima.resource.impl.DataResource_impl" from
>> descriptor
>> file:/D:/java/apache-ctakes-3.2.2/desc/ctakes-dictionary-lookup/desc/analysis_engine/DictionaryLookupAnnotatorUMLS.xml.
>>     at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:144)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:243)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:565)
>>     at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
>>     at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
>>     at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
>>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
>>     at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
>>     at
>> .
>> .
>> .
>> Caused by: org.apache.uima.resource.ResourceInitializationException:
>> Could not access the resource data at
>> file:org/apache/ctakes/dictionary/lookup/LookupDesc_Db.xml.
>>     at org.apache.uima.resource.impl.DataResource_impl.initialize(DataResource_impl.java:127)
>>     at org.apache.uima.util.SimpleResourceFactory.produceResource(SimpleResourceFactory.java:123)
>>     ... 35 common frames omitted
>>
>> On Fri, Jun 26, 2015 at 9:46 AM, Bruce Tietjen <[email protected]> wrote:
>>
>>> I'm sorry I don't have any current numbers for running that pipeline
>>> because we need more than just entity recognition. We also need polarity,
>>> certainty, etc.
>>>
>>> We have done a lot of optimization work in the more expensive parts of
>>> the pipeline and have made modifications to some areas to make them thread
>>> safe to enable running multiple pipelines concurrently within the same
>>> process. We have also made changes so most of the models that are loaded
>>> can be shared across multiple pipelines.
>>>
>>> We have not had time and resources to share these changes with the
>>> community yet, but intend to make our changes available to the community as
>>> soon as we feel they are ready.
>>>
>>> IMAT Solutions <http://imatsolutions.com>
>>> Bruce Tietjen
>>> Senior Software Engineer
>>> Mobile: 801.634.1547
>>> [email protected]
>>>
>>> On Thu, Jun 25, 2015 at 11:43 PM, Sai Anuroop <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>> I am presently working with developer version of cTAKES in Windows
>>>> through eclipse.
>>>> @Jeff:Thanks for your reply.
>>>> @Lance:I am new to cTAKES and Java.So please Can you give me the code
>>>> which runs cTAKES CPE in background without opening the CUI and produces
>>>> XML output.If the code given does the same then can you please tell where
>>>> to create above java class(in which project).
>>>> @Bruce:Thanks for your posts.Can you tell What is the average and best
>>>> time of cTAKES analyzing say a 20 line discharge report
>>>> using AggregatePlaintextFastUMLSProcessor.
>>>>
>>>> Regards,
>>>>
>>>> Vetsa Sai Anuroop
