Hi Timothy,
I fixed the password issues and ran the AggregatePlainTextProcessor AE with
-Xms6g -Xmx6g, but it still takes a long time (more than 2 hours) for a
single 2 MB file. I checked the memory consumption of the process and it
never goes above 4.5 GB, so I am not sure that memory is the issue.
The AggregatePlainTextProcessor AE processes a 2 KB file in ~11 seconds,
but most of our files are several MB in size, so a processing time of more
than 2 hours per file is not feasible.
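As a cross-check on the memory figures above, here is a minimal sketch that prints the heap limits the JVM itself reports. It uses only the standard java.lang.Runtime API; the class name HeapReport is made up for illustration and is not part of cTAKES or UIMA.

```java
import java.util.Locale;

class HeapReport {
    /** Converts a byte count to gibibytes. */
    static double toGiB(long bytes) {
        return bytes / (1024.0 * 1024.0 * 1024.0);
    }

    /** One-line summary of the heap limit, committed heap and used heap. */
    static String summary() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return String.format(Locale.ROOT,
                "max=%.2f GiB, committed=%.2f GiB, used=%.2f GiB",
                toGiB(rt.maxMemory()), toGiB(rt.totalMemory()), toGiB(used));
    }

    public static void main(String[] args) {
        // Run with the same flags as the pipeline, e.g. -Xms6g -Xmx6g.
        System.out.println(summary());
    }
}
```

Started with the same -Xms6g -Xmx6g flags, the reported maximum should be close to 6 GiB if the flags are taking effect.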
Could you please suggest something that may improve the performance? Below are
the logs from processing the 2 MB file with AggregatePlainTextProcessor:
Logs:
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0>java -cp
"C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\desc\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\lib\*"
-Dlog4j.configuration=file:\C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\config\log4j.xml
-Xms6g -Xmx6g org.apache.uima.tools.cpm.CpmFrame
Dec 14, 2017 9:40:25 AM java.util.prefs.WindowsPreferences <init>
WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs at root
0x80000002. Windows RegCreateKeyEx(...) returned error code 5.
log4j: reset attribute= "false".
log4j: Threshold ="null".
log4j: Retreiving an instance of org.apache.log4j.Logger.
log4j: Setting [ProgressAppender] additivity to [false].
log4j: Level value for ProgressAppender is [INFO].
log4j: ProgressAppender level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%m].
log4j: Adding appender named [noEolAppender] to category [ProgressAppender].
log4j: Retreiving an instance of org.apache.log4j.Logger.
log4j: Setting [ProgressDone] additivity to [false].
log4j: Level value for ProgressDone is [INFO].
log4j: ProgressDone level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%m%n].
log4j: Adding appender named [eolAppender] to category [ProgressDone].
log4j: Level value for root is [INFO].
log4j: root level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy HH:mm:ss} %5p
%c{1} - %m%n].
log4j: Adding appender named [consoleAppender] to category [root].
14 Dec 2017 09:42:09 INFO Chunker - Chunker model file:
org/apache/ctakes/chunker/models/chunker-model.zip
14 Dec 2017 09:42:10 INFO TokenizerAnnotatorPTB - Initializing
org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
14 Dec 2017 09:42:10 INFO ContextDependentTokenizerAnnotator - Finite state
machines loaded.
14 Dec 2017 09:42:10 INFO AbstractJCasTermAnnotator - Using dictionary lookup
window type: org.apache.ctakes.typesystem.type.textspan.Sentence
14 Dec 2017 09:42:10 INFO AbstractJCasTermAnnotator - Exclusion tagset loaded:
CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN VBP VBZ WDT
WP WPS WRB
14 Dec 2017 09:42:10 INFO AbstractJCasTermAnnotator - Using minimum term text
span: 3
14 Dec 2017 09:42:10 INFO AbstractJCasTermAnnotator - Using Dictionary
Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
14 Dec 2017 09:42:10 INFO DictionaryDescriptorParser - Parsing dictionary
specifications:
14 Dec 2017 09:42:10 INFO UmlsUserApprover - Checking UMLS Account at
https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser for user harish1234:
.14 Dec 2017 09:42:11 INFO UmlsUserApprover - UMLS Account at
https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser for user harish1234 has been
validated
14 Dec 2017 09:42:11 INFO JdbcConnectionFactory - Connecting to
jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab:
14 Dec 2017 09:42:11 INFO ENGINE - open start - state not modified
..................
14 Dec 2017 09:42:17 INFO JdbcConnectionFactory - Database connected
14 Dec 2017 09:42:17 INFO JdbcRareWordDictionary - Connected to cui and term
table CUI_TERMS
14 Dec 2017 09:42:17 INFO JdbcConceptFactory - Connected to concept table TUI
with class TUI
14 Dec 2017 09:42:17 INFO JdbcConceptFactory - Connected to concept table
RXNORM with class LONG
14 Dec 2017 09:42:17 INFO JdbcConceptFactory - Connected to concept table
PREFTERM with class PREFTERM
14 Dec 2017 09:42:17 INFO JdbcConceptFactory - Connected to concept table
SNOMEDCT_US with class LONG
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using left , right scope sizes:
10 , 10
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using scope order: LEFT,RIGHT
14 Dec 2017 09:42:17 INFO ContextAnnotator - SCOPE ORDER: [1, 3]
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context analyzer:
org.apache.ctakes.necontexts.status.StatusContextAnalyzer
14 Dec 2017 09:42:17 INFO StatusContextAnalyzer - initBoundaryData() called
for ContextInitializer
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context consumer:
org.apache.ctakes.necontexts.status.StatusContextHitConsumer
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using lookup window type:
org.apache.ctakes.typesystem.type.textspan.Sentence
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using focus type:
org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context type:
org.apache.ctakes.typesystem.type.syntax.BaseToken
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using left , right scope sizes: 7
, 7
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using scope order: LEFT,RIGHT
14 Dec 2017 09:42:17 INFO ContextAnnotator - SCOPE ORDER: [1, 3]
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context analyzer:
org.apache.ctakes.necontexts.negation.NegationContextAnalyzer
14 Dec 2017 09:42:17 INFO NegationContextAnalyzer - initBoundaryData() called
for ContextInitializer
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context consumer:
org.apache.ctakes.necontexts.negation.NegationContextHitConsumer
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using lookup window type:
org.apache.ctakes.typesystem.type.textspan.Sentence
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using focus type:
org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation
14 Dec 2017 09:42:17 INFO ContextAnnotator - Using context type:
org.apache.ctakes.typesystem.type.syntax.BaseToken
14 Dec 2017 09:42:17 INFO SentenceDetector - Sentence detector model file:
org/apache/ctakes/core/sentdetect/sd-med-model.zip
14 Dec 2017 09:42:17 INFO POSTagger - POS tagger model file:
org/apache/ctakes/postagger/models/mayo-pos.zip
14 Dec 2017 09:42:18 INFO LvgCmdApiResourceImpl - Loading NLM Norm and Lvg
with config file =
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\org\apache\ctakes\lvg\data\config\lvg.properties
14 Dec 2017 09:42:18 INFO LvgCmdApiResourceImpl - config file absolute path
=
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\org\apache\ctakes\lvg\data\config\lvg.properties
14 Dec 2017 09:42:18 INFO LvgCmdApiResourceImpl - cwd =
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0
14 Dec 2017 09:42:18 INFO LvgCmdApiResourceImpl - cd
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\org\apache\ctakes\lvg\
14 Dec 2017 09:42:18 INFO ENGINE - open start - state not modified
14 Dec 2017 09:42:18 INFO ENGINE - dataFileCache open start
14 Dec 2017 09:42:18 INFO ENGINE - dataFileCache open end
14 Dec 2017 09:42:18 INFO LvgCmdApiResourceImpl - cd
C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0
14 Dec 2017 09:42:18 INFO DrugMentionAnnotator - Finite state machines loaded.
14 Dec 2017 09:42:23 INFO ClearNLPDependencyParserAE - using Morphy analysis?
true
Loading configuration.
Loading feature templates.
Loading lexica.
Loading model:
........................................................................................
Loading configuration.
Loading feature templates.
Loading model:
.
Loading configuration.
Loading feature templates.
Loading lexica.
Loading model:
...
<various Loading model>
.
Loading configuration.
Loading feature templates.
Loading lexica.
Loading model:
................................
Loading model:
.............................
14 Dec 2017 09:42:32 INFO ConstituencyParser - Initializing parser...
14 Dec 2017 09:42:33 INFO SentenceDetector - Starting processing.
14 Dec 2017 09:42:34 INFO TokenizerAnnotatorPTB - process(JCas) in
org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
14 Dec 2017 09:42:36 INFO LvgAnnotator - process(JCas)
14 Dec 2017 09:42:55 INFO ContextDependentTokenizerAnnotator - process(JCas)
14 Dec 2017 09:42:58 INFO POSTagger - process(JCas)
14 Dec 2017 09:43:10 INFO Chunker - process(JCas)
14 Dec 2017 09:43:46 INFO ChunkAdjuster - process(JCas)
14 Dec 2017 09:43:47 INFO ChunkAdjuster - process(JCas)
14 Dec 2017 09:43:48 INFO AbstractJCasTermAnnotator - Starting processing
14 Dec 2017 09:43:54 INFO AbstractJCasTermAnnotator - Finished processing
14 Dec 2017 09:43:54 INFO DrugMentionAnnotator - process(JCas)
14 Dec 2017 09:45:32 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:32 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:32 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:32 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:33 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:33 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:33 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:34 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:38 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:39 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:42 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:43 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:45 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:48 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:48 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:50 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:50 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:53 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:54 INFO DrugMentionAnnotator -
14 Dec 2017 09:45:59 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:00 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:04 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:04 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:05 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:06 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:08 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:09 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:09 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:11 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:16 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:24 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:27 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:30 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:32 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:35 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:38 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:45 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:46 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:46 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:53 INFO DrugMentionAnnotator -
14 Dec 2017 09:46:54 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:02 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:22 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:24 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:28 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:29 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:34 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:38 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:46 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:49 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:54 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:54 INFO DrugMentionAnnotator -
14 Dec 2017 09:47:58 INFO DrugMentionAnnotator -
14 Dec 2017 09:48:45 INFO MaxentParserWrapper - Started processing:
idd_secondTrial.txt
14 Dec 2017 10:20:19 INFO MaxentParserWrapper - Done parsing:
idd_secondTrial.txt
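To see which step dominates, the gaps between consecutive timestamps in the log above can be measured. In this log the longest gap, about 31 minutes, lies between the MaxentParserWrapper "Started processing" and "Done parsing" lines. The sketch below is one way to automate that check. LogGaps is a hypothetical helper, not part of cTAKES, and it assumes each gap belongs to the component that logged the line before it.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogGaps {
    // Timestamp, level and logger name, as written by the log4j pattern
    // "%d{dd MMM yyyy HH:mm:ss} %5p %c{1} - %m%n" shown in the log output.
    private static final Pattern ENTRY = Pattern.compile(
            "^(\\d{2} \\w{3} \\d{4} \\d{2}:\\d{2}:\\d{2})\\s+\\w+\\s+(\\S+)");
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("dd MMM yyyy HH:mm:ss", Locale.ENGLISH);

    /**
     * Returns "<logger>: <seconds>s" for the longest gap between two
     * consecutive timestamped lines, or null if there are fewer than two.
     * Assumption: the gap is charged to the logger of the earlier line.
     */
    static String longestGap(List<String> lines) {
        LocalDateTime previousTime = null;
        String previousLogger = null;
        long longestSeconds = -1;
        String longestLogger = null;
        for (String line : lines) {
            Matcher m = ENTRY.matcher(line);
            if (!m.find()) {
                continue; // continuation line or progress dots: no timestamp
            }
            LocalDateTime time = LocalDateTime.parse(m.group(1), STAMP);
            if (previousTime != null) {
                long seconds = Duration.between(previousTime, time).getSeconds();
                if (seconds > longestSeconds) {
                    longestSeconds = seconds;
                    longestLogger = previousLogger;
                }
            }
            previousTime = time;
            previousLogger = m.group(2);
        }
        return longestLogger == null ? null : longestLogger + ": " + longestSeconds + "s";
    }

    public static void main(String[] args) {
        // A few lines copied from the log above (intermediate lines omitted).
        List<String> sample = Arrays.asList(
                "14 Dec 2017 09:43:10 INFO Chunker - process(JCas)",
                "14 Dec 2017 09:43:46 INFO ChunkAdjuster - process(JCas)",
                "14 Dec 2017 09:48:45 INFO MaxentParserWrapper - Started processing:",
                "idd_secondTrial.txt",
                "14 Dec 2017 10:20:19 INFO MaxentParserWrapper - Done parsing:");
        System.out.println(longestGap(sample));
    }
}
```

On the sample lines this reports the MaxentParserWrapper gap of 1894 seconds. To analyse a whole run, the console output could be saved to a file and its lines passed to longestGap instead of the sample list.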
Regards,
Harish.
-----Original Message-----
From: Miller, Timothy [mailto:[email protected]]
Sent: Thursday, December 14, 2017 9:16 AM
To: [email protected]
Subject: Re: Slowness in processing files [EXTERNAL]
Do not try to use AggregatePlainTextProcessor; it is just slow.
Use the fast version and debug the password issues.
Make sure you have your UMLS credentials set in:
$CTAKES_ROOT/resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
in two different places.
Tim
On Thu, 2017-12-14 at 02:36 +0000, Yadav, Harish wrote:
> Hi James,
>
> Thanks for responding.
>
> A single 2 MB file is taking ~5 hours to process with
> AggregatePlainTextProcessor. This is the command line, including the
> JVM memory arguments:
>
> C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0>java
> -Dctakes.umlsuser="XXXXXXX" -Dctakes.umlspw="XXXXXXXX" -cp
> "C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\desc\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\lib\*"
> -Dlog4j.configuration=file:\C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\config\log4j.xml
> -Xms512M -Xmx3g org.apache.uima.tools.cpm.CpmFrame
>
> Also, just now I tried to process the file with the AE
> AggregatePlaintextFastUMLSProcessor but ran into a different problem:
> an authentication error with the same username and password that I
> use with AggregatePlainTextProcessor.
>
> I can run AggregatePlaintextFastUMLSProcessor with -Xms5g and -Xmx5g.
> Could you please let me know how it is possible that
> AggregatePlainTextProcessor runs fine with the above username and
> password, while AggregatePlaintextFastUMLSProcessor gives the exception
> below with the same username and password?
>
> Exception:
>
> C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0>java
> -Dctakes.umlsuser="XXXXXXX" -Dctakes.umlspw="XXXXXX" -cp
> "C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\desc\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\resources\;C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\lib\*"
> -Dlog4j.configuration=file:\C:\New_Drive\apache-ctakes-4.0.0-bin\apache-ctakes-4.0.0\config\log4j.xml
> -Xms512M -Xmx3g org.apache.uima.tools.cpm.CpmFrame
> Dec 13, 2017 9:01:20 PM java.util.prefs.WindowsPreferences <init>
> WARNING: Could not open/create prefs root node Software\JavaSoft\Prefs at root 0x80000002. Windows RegCreateKeyEx(...) returned error code 5.
> log4j: attributes....
> 13 Dec 2017 21:04:58 INFO Chunker - Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
> 13 Dec 2017 21:05:00 INFO TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
> 13 Dec 2017 21:05:00 INFO ContextDependentTokenizerAnnotator - Finite state machines loaded.
> 13 Dec 2017 21:05:00 INFO AbstractJCasTermAnnotator - Using dictionary lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence
> 13 Dec 2017 21:05:00 INFO AbstractJCasTermAnnotator - Exclusion tagset loaded: CC CD DT EX IN LS MD PDT POS PP PP$ PRP PRP$ RP TO VB VBD VBG VBN VBP VBZ WDT WP WPS WRB
> 13 Dec 2017 21:05:00 INFO AbstractJCasTermAnnotator - Using minimum term text span: 3
> 13 Dec 2017 21:05:00 INFO AbstractJCasTermAnnotator - Using Dictionary Descriptor: org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab.xml
> 13 Dec 2017 21:05:00 INFO DictionaryDescriptorParser - Parsing dictionary specifications:
> 13 Dec 2017 21:05:00 INFO UmlsUserApprover - Checking UMLS Account at https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser for user harish1234-ß:
> ....13 Dec 2017 21:05:02 ERROR UmlsUserApprover - UMLS Account at https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser is not valid for user XXXXXXX-ß with XXXXXXX
> org.apache.uima.resource.ResourceInitializationException: Initialization of CAS Processor with name "AggregatePlaintextFastUMLSProcessor" failed.
>     at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:81)
>     at org.apache.uima.impl.UIMAFramework_impl._produceCollectionProcessingEngine(UIMAFramework_impl.java:420)
>     at org.apache.uima.UIMAFramework.produceCollectionProcessingEngine(UIMAFramework.java:918)
>     at org.apache.uima.tools.cpm.CpmPanel.startProcessing(CpmPanel.java:573)
>     at org.apache.uima.tools.cpm.CpmPanel.access$000(CpmPanel.java:105)
>     at org.apache.uima.tools.cpm.CpmPanel$1.run(CpmPanel.java:713)
> Caused by: org.apache.uima.resource.ResourceConfigurationException: Initialization of CAS Processor with name "AggregatePlaintextFastUMLSProcessor" failed.
>     at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1101)
>     at org.apache.uima.collection.impl.cpm.container.CPEFactory.getCasProcessors(CPEFactory.java:547)
>     at org.apache.uima.collection.impl.cpm.BaseCPMImpl.init(BaseCPMImpl.java:253)
>     at org.apache.uima.collection.impl.cpm.BaseCPMImpl.<init>(BaseCPMImpl.java:127)
>     at org.apache.uima.collection.impl.CollectionProcessingEngine_impl.initialize(CollectionProcessingEngine_impl.java:73)
>     ... 5 more
> Caused by: org.apache.uima.resource.ResourceInitializationException: Initialization of annotator class "org.apache.ctakes.dictionary.lookup2.ae.DefaultJCasTermAnnotator" failed. (Descriptor: file:/C:/New_Drive/apache-ctakes-4.0.0-bin/apache-ctakes-4.0.0/desc/ctakes-dictionary-lookup-fast/desc/analysis_engine/UmlsLookupAnnotator.xml)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:271)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initialize(PrimitiveAnalysisEngine_impl.java:170)
>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:407)
>     at org.apache.uima.analysis_engine.asb.impl.ASB_impl.setup(ASB_impl.java:256)
>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initASB(AggregateAnalysisEngine_impl.java:429)
>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initializeAggregateAnalysisEngine(AggregateAnalysisEngine_impl.java:373)
>     at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.initialize(AggregateAnalysisEngine_impl.java:186)
>     at org.apache.uima.impl.AnalysisEngineFactory_impl.produceResource(AnalysisEngineFactory_impl.java:94)
>     at org.apache.uima.impl.CompositeResourceFactory_impl.produceResource(CompositeResourceFactory_impl.java:62)
>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:279)
>     at org.apache.uima.UIMAFramework.produceResource(UIMAFramework.java:331)
>     at org.apache.uima.UIMAFramework.produceAnalysisEngine(UIMAFramework.java:448)
>     at org.apache.uima.collection.impl.cpm.container.CPEFactory.produceIntegratedCasProcessor(CPEFactory.java:1085)
>     ... 9 more
> Caused by: org.apache.uima.resource.ResourceInitializationException: MESSAGE LOCALIZATION FAILED: Can't find resource for bundle java.util.PropertyResourceBundle, key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary
>     at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:131)
>     at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.initializeAnalysisComponent(PrimitiveAnalysisEngine_impl.java:266)
>     ... 24 more
> Caused by: org.apache.uima.analysis_engine.annotator.AnnotatorContextException: MESSAGE LOCALIZATION FAILED: Can't find resource for bundle java.util.PropertyResourceBundle, key Could not construct org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.java:199)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionaries(DictionaryDescriptorParser.java:156)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDescriptor(DictionaryDescriptorParser.java:128)
>     at org.apache.ctakes.dictionary.lookup2.ae.AbstractJCasTermAnnotator.initialize(AbstractJCasTermAnnotator.java:129)
>     ... 25 more
> Caused by: java.lang.reflect.InvocationTargetException
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
>     at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
>     at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
>     at java.lang.reflect.Constructor.newInstance(Unknown Source)
>     at org.apache.ctakes.dictionary.lookup2.dictionary.DictionaryDescriptorParser.parseDictionary(DictionaryDescriptorParser.java:196)
>     ... 28 more
> Caused by: java.sql.SQLException: Invalid User for UMLS dictionary sno_rx_16abTerms
>     at org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary.<init>(UmlsJdbcRareWordDictionary.java:29)
>     ... 33 more
>
>
>
> From: James Masanz [mailto:[email protected]]
> Sent: Wednesday, December 13, 2017 8:56 PM
> To: [email protected]
> Subject: Re: Slowness in processing files
>
> Using AggregatePlaintextFastUMLSProcessor is much faster than
> AggregatePlainTextProcessor, so I suggest that to start with you just
> use AggregatePlaintextFastUMLSProcessor.
>
> Do you mean it is taking ~5 hours for a single file to be processed at
> times, or is that for a set of files?
>
> If your JVM heap space is not set large enough, you can get very slow
> results.
> Try increasing it to 5G (or more) using the JVM parameter -Xmx5G. For
> faster startup, you can also set -Xms to the same value as -Xmx, or
> something close to it.
>
> -- James
>
> On Wed, Dec 13, 2017 at 7:04 PM, Yadav, Harish <[email protected]>
> wrote:
> Hi All,
>
> When the medical records are run with
> AggregatePlaintextFastUMLSProcessor or AggregatePlainTextProcessor as
> the AE, the processing is very slow. It is pretty fast when smaller
> files (~2 KB) are fed as input, but with bigger files, say 2 MB, it is
> very slow and the files take ~5 hours to process. Any pointers would
> be of great help.
>
> Regards,
> Harish.
>