Thank you very much, Lance. Once I removed LVG from desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml, things went much better. I appreciate the help.
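For anyone who would rather script that edit than do it by hand, here is a minimal sketch using only the JDK's XML classes. The class name LvgStripper and the method removeDelegate are made up for illustration and are not part of cTAKES or UIMA. It assumes the descriptor uses unprefixed element names, as in the snippet quoted below, and it only removes the delegateAnalysisEngine entry and the matching flow node. If your descriptor mentions the key anywhere else, that would still need a manual look.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of cTAKES or UIMA.
class LvgStripper {

    /** Returns the descriptor XML without the delegate and flow node for the given key. */
    static String removeDelegate(String descriptorXml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(
                            descriptorXml.getBytes(StandardCharsets.UTF_8)));

            // Collect first, remove afterwards: the NodeLists are live views.
            List<Element> toRemove = new ArrayList<>();

            // <delegateAnalysisEngine key="..."> entries
            NodeList delegates = doc.getElementsByTagName("delegateAnalysisEngine");
            for (int i = 0; i < delegates.getLength(); i++) {
                Element e = (Element) delegates.item(i);
                if (key.equals(e.getAttribute("key"))) {
                    toRemove.add(e);
                }
            }

            // <node>...</node> entries in the flow
            NodeList nodes = doc.getElementsByTagName("node");
            for (int i = 0; i < nodes.getLength(); i++) {
                Element e = (Element) nodes.item(i);
                if (key.equals(e.getTextContent().trim())) {
                    toRemove.add(e);
                }
            }

            for (Element e : toRemove) {
                e.getParentNode().removeChild(e);
            }

            StringWriter out = new StringWriter();
            TransformerFactory.newInstance().newTransformer()
                    .transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception ex) {
            throw new IllegalStateException("Could not rewrite descriptor", ex);
        }
    }

    public static void main(String[] args) {
        // Tiny made-up descriptor fragment, just to exercise the method.
        String sample = "<root><delegateAnalysisEngine key=\"LvgAnnotator\">"
                + "<import location=\"LvgAnnotator.xml\"/></delegateAnalysisEngine>"
                + "<fixedFlow><node>LvgAnnotator</node><node>Other</node></fixedFlow></root>";
        System.out.println(removeDelegate(sample, "LvgAnnotator"));
    }
}
```

In practice you would read the descriptor file into a string, pass it through removeDelegate with "LvgAnnotator", and write the result to a copy of the descriptor rather than overwriting the original.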
On Thu, Jun 18, 2015 at 12:24 PM, Lance Eason <[email protected]> wrote:

> Make sure you have a version of the resources unpacked (not in a jar)
> first thing on the classpath. See the instructions about installing the
> resources here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide
>
> My classpath looks like:
>
> $APP_HOME/desc:$APP_HOME/resources:[all the other jars]
>
> 'desc' is where the pipeline definitions live; 'resources' is where a
> bunch of miscellaneous resources (dictionaries, various ML models, etc.)
> live.
>
> Also, I notice the specific error you're getting occurs while trying to
> load LVG. I'd strongly recommend removing LVG from your pipeline,
> especially if you're doing multi-threaded runs. It's the only component
> in the standard pipeline that isn't thread-safe, and it's a huge
> performance sink to boot for not much value add.
>
> You can remove it by editing the pipeline XML and removing:
>
> <delegateAnalysisEngine key="LvgAnnotator">
>   <import location="../../../ctakes-lvg/desc/analysis_engine/LvgAnnotator.xml"/>
> </delegateAnalysisEngine>
>
> and:
>
> <node>LvgAnnotator</node>
>
> On Wed, Jun 17, 2015 at 8:58 PM, Jeff Headley <[email protected]> wrote:
>
>> Thank you for posting this code. I too am trying to run cTAKES from
>> within a Java application. It works fine until the line:
>>
>> AnalysisEngine analysisEngine =
>>     UIMAFramework.produceAnalysisEngine(pipelineSpecifier, threadCount, 0);
>>
>> From there it throws the error below. My cTAKES installation is
>> 3.2.2 and I have set up UMLS credentials, etc. Any ideas what is wrong?
>>
>> java.lang.IllegalArgumentException: URI is not hierarchical
>>   at java.io.File.<init>(File.java:418)
>>   at org.apache.ctakes.lvg.resource.LvgCmdApiResourceImpl.load(LvgCmdApiResourceImpl.java:65)
>>   at org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:603)
>>   at org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
>>   at org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
>>   at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>>   at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
>>   at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>   at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>   at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>   at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
>>   at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
>>   at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
>>   at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
>>   at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
>>   at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>   at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>   at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>   at org.apache.uima.internal.util.ResourcePool.fillPool(ResourcePool.java:243)
>>   at org.apache.uima.internal.util.ResourcePool.<init>(ResourcePool.java:100)
>>   at org.apache.uima.internal.util.AnalysisEnginePool.<init>(AnalysisEnginePool.java:91)
>>   at org.apache.uima.analysis_engine.impl.MultiprocessingAnalysisEngine_impl.initialize(MultiprocessingAnalysisEngine_impl.java:118)
>>   at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>>   at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>>   at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>>   at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:475)
>>
>> Thank you,
>> Jeff
>>
>> On Tue, Jun 16, 2015 at 10:36 AM, Lance Eason <[email protected]> wrote:
>>
>>> Sai, here's an example from what I'm using. I'm using multiple threads
>>> to process documents concurrently; if you're not interested in that you
>>> can ignore the CasPool stuff and just instantiate a CAS directly. You
>>> *do* want to re-use CAS instances though; they're very expensive to
>>> create.
>>>
>>> import java.io.File;
>>> import org.apache.uima.UIMAFramework;
>>> import org.apache.uima.analysis_engine.AnalysisEngine;
>>> import org.apache.uima.cas.CAS;
>>> import org.apache.uima.cas.FSIterator;
>>> import org.apache.uima.cas.FeatureStructure;
>>> import org.apache.uima.cas.Type;
>>> import org.apache.uima.resource.ResourceSpecifier;
>>> import org.apache.uima.util.CasPool;
>>> import org.apache.uima.util.XMLInputSource;
>>>
>>> // the name of the analysis engine xml file
>>> String pipelineFileName =
>>>     "./desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml";
>>>
>>> // the number of simultaneous pipelines to support
>>> int threadCount = 3;
>>>
>>> // load the pipeline specifier
>>> XMLInputSource input = new XMLInputSource(new File(pipelineFileName));
>>> ResourceSpecifier pipelineSpecifier =
>>>     UIMAFramework.getXMLParser().parseResourceSpecifier(input);
>>>
>>> // create the analysis engine for the pipeline and allocate some CAS
>>> AnalysisEngine analysisEngine =
>>>     UIMAFramework.produceAnalysisEngine(pipelineSpecifier, threadCount, 0);
>>> CasPool casPool = new CasPool(threadCount, analysisEngine);
>>>
>>> // for each document...
>>> CAS cas = casPool.getCas();
>>> try
>>> {
>>>     // load the document into the CAS
>>>     cas.reset();
>>>     cas.setDocumentLanguage("en");
>>>     cas.setDocumentText(textToAnalyze);
>>>
>>>     // run the pipeline over it
>>>     analysisEngine.process(cas);
>>>
>>>     // then consume the assertions of whatever type you're interested in
>>>     Type eventType =
>>>         cas.getTypeSystem().getType("org.apache.ctakes.typesystem.type.textsem.EventMention");
>>>
>>>     FSIterator<FeatureStructure> iter =
>>>         cas.getIndexRepository().getAllIndexedFS(eventType);
>>>     while (iter.hasNext())
>>>     {
>>>         FeatureStructure fs = iter.next();
>>>
>>>         // extract information from the assertion
>>>     }
>>> }
>>> finally
>>> {
>>>     casPool.releaseCas(cas);
>>> }
>>>
>>> On Tue, Jun 16, 2015 at 2:37 AM, Sai Anuroop <[email protected]> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I want to run the cTAKES CPE by choosing a Collection Reader, AE and
>>>> CAS Consumer directly from Java, so that I can reduce the time taken
>>>> to process text documents. Can anyone explain how to do this with
>>>> example Java code, or point me to any resources?
>>>>
>>>> Regards,
>>>>
>>>> Vetsa Sai Anuroop
>>>
>>> --
>>> .........................................................
>>> *Lance Eason*
>>> Iodine Software
>>> Vice President of Engineering
>>> [email protected]
>>> 512.785.5195 office | 801.203.8987 fax
>>> .........................................................
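A note for anyone who finds this thread by searching for the "URI is not hierarchical" message in the trace above: that exact exception is what java.io.File's URI constructor throws when it is handed an opaque URI, and a jar: URI is opaque. That fits the advice about keeping the resources unpacked on the classpath, although I have not checked the cTAKES source, so treat the link to LvgCmdApiResourceImpl as an assumption. The small sketch below reproduces the JDK behaviour on its own. The class name and both paths are hypothetical.

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class; it only exercises JDK behaviour, no cTAKES code.
class HierarchicalUriDemo {

    /** Returns the message thrown by new File(uri), or null if the conversion succeeds. */
    static String fileConversionError(String uriText) {
        try {
            new File(URI.create(uriText));
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Made-up locations: one resource unpacked on disk, one inside a jar.
        String unpacked = "file:/opt/ctakes/resources/lvg.properties";
        String insideJar = "jar:file:/opt/ctakes/lib/resources.jar!/lvg.properties";

        System.out.println(unpacked + " -> " + fileConversionError(unpacked));
        System.out.println(insideJar + " -> " + fileConversionError(insideJar));
    }
}
```

The file: URI converts cleanly, while the jar: URI is rejected, which is why a resource that resolves to a location inside a jar cannot be opened through java.io.File.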
