Re: Running cTAKES through Java

Jeff Headley Thu, 18 Jun 2015 20:26:41 -0700

Thank you very much, Lance. Once I removed LVG from
desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml
things started to go much better. I appreciate the help.
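
For anyone who would rather script the edit than do it by hand, here is a minimal sketch using the JDK's DOM parser. It assumes the descriptor refers to LVG only through a <delegateAnalysisEngine key="LvgAnnotator"> element and a <node>LvgAnnotator</node> flow entry, as quoted below. The class and method names are made up for illustration, and reading and writing the descriptor file is left to the caller.

```java
import java.io.ByteArrayInputStream;
import java.io.StringWriter;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of cTAKES or UIMA.
class LvgStripper {

    /** Returns the descriptor XML with the delegate and flow node for the given key removed. */
    static String stripDelegate(String descriptorXml, String key) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new ByteArrayInputStream(descriptorXml.getBytes(StandardCharsets.UTF_8)));

        // <delegateAnalysisEngine key="..."> is matched on its key attribute,
        // <node>...</node> on its text content.
        removeMatching(doc.getElementsByTagName("delegateAnalysisEngine"), key, true);
        removeMatching(doc.getElementsByTagName("node"), key, false);

        StringWriter out = new StringWriter();
        TransformerFactory.newInstance().newTransformer()
                .transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    private static void removeMatching(NodeList nodes, String key, boolean byKeyAttribute) {
        // The NodeList is live, so walk it backwards while removing elements.
        for (int i = nodes.getLength() - 1; i >= 0; i--) {
            Element element = (Element) nodes.item(i);
            String value = byKeyAttribute
                    ? element.getAttribute("key")
                    : element.getTextContent().trim();
            if (key.equals(value)) {
                element.getParentNode().removeChild(element);
            }
        }
    }

    public static void main(String[] args) throws Exception {
        // Tiny stand-in descriptor; a real one has many more elements.
        String xml = "<d><delegateAnalysisEngine key=\"LvgAnnotator\"><import location=\"x.xml\"/>"
                + "</delegateAnalysisEngine><fixedFlow><node>LvgAnnotator</node>"
                + "<node>Chunker</node></fixedFlow></d>";
        System.out.println(stripDelegate(xml, "LvgAnnotator"));
    }
}
```

Keep a backup of the original descriptor; the serializer may change whitespace and the XML declaration even where the content is untouched.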


On Thu, Jun 18, 2015 at 12:24 PM, Lance Eason <[email protected]>
wrote:

> Make sure you have a version of the resources unpacked (not in a jar)
> first thing on the classpath.  See the instructions about installing the
> resources here:
> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.1+User+Install+Guide
>
> My classpath looks like:
>
> $APP_HOME/desc:$APP_HOME/resources:[all the other jars]
>
> 'desc' is where the pipeline definitions live, 'resources' is where a
> bunch of miscellaneous resources (dictionaries, various ML models, etc.)
> live.
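
A note on why the unpacked copy matters: "URI is not hierarchical" is the message java.io.File's URI constructor gives when it is handed an opaque URI such as a jar: URL. The stack trace further down suggests the LVG resource loader does something along those lines, though that is an inference from the trace and not a reading of the cTAKES source. A minimal sketch of the JDK behaviour, with made-up paths and a made-up class name:

```java
import java.io.File;
import java.net.URI;

// Hypothetical demo class; only java.io.File and java.net.URI are real APIs here.
class HierarchicalUriDemo {

    /** Tries to turn the URI into a File and reports what happened. */
    static String tryAsFile(String uri) {
        try {
            return "ok: " + new File(URI.create(uri)).getPath();
        } catch (IllegalArgumentException e) {
            return "failed: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A resource found in an unpacked directory resolves to a file: URI.
        System.out.println(tryAsFile("file:/opt/ctakes/resources/example.properties"));

        // The same resource found inside a jar resolves to an opaque jar: URI,
        // which File(URI) rejects.
        System.out.println(tryAsFile("jar:file:/opt/ctakes/lib/resources.jar!/example.properties"));
    }
}
```

With the unpacked resources directory ahead of the jars on the classpath, the class loader finds the file: copy first, so code that insists on a File keeps working.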
>
> Also I notice the specific error you're getting is trying to load LVG.
> I'd strongly recommend removing LVG from your pipeline especially if you're
> doing multi-threaded runs.  It's the only component in the standard
> pipeline that isn't thread-safe and it's a huge performance sink to boot
> for not much value add.
>
> You can remove it by editing the pipeline XML and removing:
>
>     <delegateAnalysisEngine key="LvgAnnotator">
>       <import
> location="../../../ctakes-lvg/desc/analysis_engine/LvgAnnotator.xml"/>
>     </delegateAnalysisEngine>
>
> and:
>
>     <node>LvgAnnotator</node>
>
> On Wed, Jun 17, 2015 at 8:58 PM, Jeff Headley <[email protected]> wrote:
>
>> Thank you for posting this code. I too am trying to run cTAKES from
>> within a Java application. It works fine until the line:
>> AnalysisEngine analysisEngine = 
>> UIMAFramework.produceAnalysisEngine(pipelineSpecifier,
>> threadCount, 0);
>>
>> From there it throws the error below. My cTAKES installation is
>> 3.2.2 and I have set up UMLS credentials, etc. Any ideas what is wrong?
>>
>> java.lang.IllegalArgumentException: URI is not hierarchical
>> at java.io.File.<init>(File.java:418)
>> at
>> org.apache.ctakes.lvg.resource.LvgCmdApiResourceImpl.load(LvgCmdApiResourceImpl.java:65)
>> at
>> org.apache.uima.resource.impl.ResourceManager_impl.registerResource(ResourceManager_impl.java:603)
>> at
>> org.apache.uima.resource.impl.ResourceManager_impl.initializeExternalResources(ResourceManager_impl.java:442)
>> at
>> org.apache.uima.resource.Resource_ImplBase.initialize(Resource_ImplBase.java:153)
>> at
>> org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.initialize(AnalysisEngineImplBase.java:157)
>> at
>> org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:123)
>> at
>> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>> at
>> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>> at
>> org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:387)
>> at
>> org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:254)
>> at
>> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:431)
>> at
>> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:375)
>> at
>> org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:185)
>> at
>> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>> at
>> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>> at
>> org.apache.uima.internal.util.ResourcePool.fillPool(ResourcePool.java:243)
>> at
>> org.apache.uima.internal.util.ResourcePool.<init>(ResourcePool.java:100)
>> at
>> org.apache.uima.internal.util.AnalysisEnginePool.<init>(AnalysisEnginePool.java:91)
>> at
>> org.apache.uima.analysis_engine.impl.MultiprocessingAnalysisEngine_impl.initialize(MultiprocessingAnalysisEngine_impl.java:118)
>> at
>> org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>> at
>> org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>> at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:269)
>> at
>> org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:475)
>>
>> Thank you,
>> Jeff
>>
>> On Tue, Jun 16, 2015 at 10:36 AM, Lance Eason <[email protected]>
>> wrote:
>>
>>> Sai, here's an example from what I'm using.  I'm using multiple threads
>>> to process documents concurrently.  If you're not interested in that, you
>>> can ignore the CasPool stuff and just instantiate a CAS directly.  You *do*
>>> want to re-use CAS instances though; they're very expensive to create.
>>>
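>>> // imports needed by this snippet
>>> import java.io.File;
>>> import org.apache.uima.UIMAFramework;
>>> import org.apache.uima.analysis_engine.AnalysisEngine;
>>> import org.apache.uima.cas.CAS;
>>> import org.apache.uima.cas.FSIterator;
>>> import org.apache.uima.cas.FeatureStructure;
>>> import org.apache.uima.cas.Type;
>>> import org.apache.uima.resource.ResourceSpecifier;
>>> import org.apache.uima.util.CasPool;
>>> import org.apache.uima.util.XMLInputSource;
>>>
>>> // the name of the analysis engine xml file
>>> String pipelineFileName =
>>> "./desc/ctakes-clinical-pipeline/desc/analysis_engine/AggregatePlaintextUMLSProcessor.xml";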
>>>
>>> // the number of simultaneous pipelines to support
>>> int threadCount = 3;
>>>
>>> // load the pipeline specifier
>>> XMLInputSource input = new XMLInputSource(new File(pipelineFileName));
>>> ResourceSpecifier pipelineSpecifier =
>>> UIMAFramework.getXMLParser().parseResourceSpecifier(input);
>>>
>>> // create the analysis engine for the pipeline and allocate some CAS
>>> AnalysisEngine analysisEngine =
>>> UIMAFramework.produceAnalysisEngine(pipelineSpecifier, threadCount, 0);
>>> CasPool casPool = new CasPool(threadCount, analysisEngine);
>>>
>>>
>>>
>>> // for each document...
>>> CAS cas = casPool.getCas();
>>> try
>>> {
>>>     // process the document
>>>     cas.reset();
>>>     cas.setDocumentLanguage("en");
>>>     cas.setDocumentText(textToAnalyze);
>>>
>>>     // then consume the assertions of whatever type you're interested in
>>>     Type eventType =
>>> cas.getTypeSystem().getType("org.apache.ctakes.typesystem.type.textsem.EventMention");
>>>
>>>     FSIterator<FeatureStructure> iter =
>>> cas.getIndexRepository().getAllIndexedFS(eventType);
>>>     while (iter.hasNext())
>>>     {
>>>         FeatureStructure fs = iter.next();
>>>
>>>         // extract information from the assertion
>>>     }
>>> }
>>> finally
>>> {
>>>     casPool.releaseCas(cas);
>>> }
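
The snippet above shows the per-document work but not the threads that drive it. Here is a minimal sketch of one way to do that with a fixed thread pool, written against the JDK only so it runs without UIMA on the classpath. A StringBuilder stands in for the CAS and a BlockingQueue stands in for CasPool; both are placeholders chosen for illustration, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in: StringBuilder plays the role of an expensive, reusable CAS.
class PooledWorkers {

    /** Processes every document on threadCount threads and returns one result per document, in order. */
    static List<Integer> processAll(List<String> docs, int threadCount) throws Exception {
        // Create the expensive objects once, up front, like new CasPool(threadCount, engine).
        BlockingQueue<StringBuilder> pool = new ArrayBlockingQueue<>(threadCount);
        for (int i = 0; i < threadCount; i++) {
            pool.add(new StringBuilder());
        }

        ExecutorService executor = Executors.newFixedThreadPool(threadCount);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String doc : docs) {
                futures.add(executor.submit(() -> {
                    StringBuilder scratch = pool.take();   // like casPool.getCas()
                    try {
                        scratch.setLength(0);              // like cas.reset()
                        scratch.append(doc);               // like cas.setDocumentText(...)
                        return scratch.length();           // placeholder for real analysis
                    } finally {
                        pool.put(scratch);                 // like casPool.releaseCas(cas)
                    }
                }));
            }

            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } finally {
            executor.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(processAll(List.of("first note", "second note"), 2));
    }
}
```

Sizing the pool to match the thread count means a worker never has to wait for a free instance, and releasing in finally keeps the pool from draining when a document fails.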
>>>
>>> On Tue, Jun 16, 2015 at 2:37 AM, Sai Anuroop <[email protected]>
>>> wrote:
>>>
>>>> Hi All,
>>>>
>>>> I want to run the cTAKES CPE by choosing a Collection Reader, AE and CAS
>>>> Consumer directly from Java so that I can reduce the time taken to
>>>> process text documents. Could anyone explain how to do this with some
>>>> example Java code, or point me to any resources?
>>>>
>>>> Regards,
>>>>
>>>> Vetsa Sai Anuroop
>>>>
>>>>
>>>>
>>>
>>>
>>> --
>>> .........................................................
>>> *Lance Eason*
>>> Iodine Software
>>> Vice President of Engineering
>>> [email protected]
>>> 512.785.5195 office | 801.203.8987 fax
>>> .........................................................
>>>
>>>
>>
>
