Hi all,

I'm using the cTAKES user installation to process CDA documents with AggregateCdaProcessor.xml and AggregateCdaUMLSProcessor.xml, located in /desc/ctakes-clinical-pipeline/desc/analysis_engine/

My script to call this is:

java -Dctakes.umlsuser= -Dctakes.umlspw= -cp $CTAKES_HOME/lib/*:$CTAKES_HOME/desc/:$CTAKES_HOME/resources/ -Dlog4j.configuration=file:$CTAKES_HOME/config/log4j.xml -Xms2g -Xmx3g org.apache.ctakes.core.cpe.CmdLineCpeRunner $CTAKES_HOME/desc/ctakes-clinical-pipeline/desc/collection_processing_engine/test_cda_masoud.xml

test_cda_masoud.xml has the proper paths to the CDA input and output. I'm using the two CDA files that come with the cTAKES package (testpatient_cn_2.xml and testpatient_cn_1.xml, compatible with NotesIIST_RTF.DTD).

Unfortunately, it seems that CdaCasInitializer cannot run, and I get the attached errors. I get the same errors when using the GUI with the AggregateCdaProcessor AE.

- Am I missing something obvious?
- Does the cTAKES *User* installation handle CDA documents?
- Is org.apache.ctakes.core.cpe.CmdLineCpeRunner an appropriate pipeline for CdaCasInitializer?

Thank you so much for your help in advance.

Masoud

On 11/8/19, 8:30 AM, "Finan, Sean" <sean.fi...@childrens.harvard.edu> wrote:

Hi Masoud,

I think that the CdaCasInitializer is at least 10 years old. I would not expect it to conform to any recent standards.

Does anybody else have a reader or transformer that can handle HL7 CDA r2?

Sean

p.s. If anybody is involved with HL7 International, you may want to get some movement on addressing the typo on the page header(s): Section 1a: Clinical Document Architcture (CDA®)

________________________________________
From: Masoud Rouhizadeh <m...@jhu.edu>
Sent: Thursday, November 7, 2019 5:59 PM
To: dev@ctakes.apache.org
Subject: cTAKES handling HL7 CDA Level 1 [EXTERNAL]

Dear cTAKES developer mailing list,

We have been working on a project at Hopkins for converting Epic-generated RTF notes into Clinical Document Architecture Level One. We have been using HL7 CDA® Release 2 Schema, and now we plan to use cTAKES for concept extraction from those documents.
The CDA Schema and examples can be found here: https://www.hl7.org/implement/standards/product_brief.cfm?product_id=7

In the cTAKES documentation, I see that CdaCasInitializer "does not handle all CDA documents. The CDA document must conform to the DTD resources/cda/NotesIIST_RTF.DTD."

Has anyone tested and evaluated cTAKES' ability to consume HL7 CDA Level 1 Release 2 documents?

Thank you,
Masoud

----
Masoud Rouhizadeh, PhD
Faculty - Division of Health Science Informatics (DHSI)
NLP Lead - Institute for Clinical and Translational Research (ICTR)
Johns Hopkins University School of Medicine
https://www.cs.jhu.edu/~mrou/
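For anyone comparing their own files against what CdaCasInitializer expects, a quick first check is to look at the root element. HL7 CDA Release 2 documents use a ClinicalDocument root element in the urn:hl7-org:v3 namespace, so a file shaped like that is following the HL7 schema; whether it also satisfies NotesIIST_RTF.DTD is a separate question that this check does not answer. The sketch below is only an illustration using the JDK's built-in XML parser. The names CdaRootCheck, describeRoot and looksLikeCdaR2 are made up for this example and are not part of cTAKES.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical helper for illustration only; not part of cTAKES.
final class CdaRootCheck {

    // Namespace and root element name used by HL7 CDA Release 2 documents.
    static final String HL7_V3_NS = "urn:hl7-org:v3";
    static final String CDA_ROOT = "ClinicalDocument";

    /** Parses the XML text and returns its root element. */
    static Element rootOf(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            // Do not try to fetch an external DTD named in a DOCTYPE declaration
            // (a Xerces feature that the JDK's built-in parser also supports).
            factory.setFeature(
                "http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
            return factory.newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalArgumentException("Text could not be parsed as XML", e);
        }
    }

    /** Returns the root element as "{namespace}localName", handy for logging. */
    static String describeRoot(String xml) {
        Element root = rootOf(xml);
        String ns = root.getNamespaceURI();
        return "{" + (ns == null ? "" : ns) + "}" + root.getLocalName();
    }

    /** True when the root element is ClinicalDocument in the HL7 v3 namespace. */
    static boolean looksLikeCdaR2(String xml) {
        Element root = rootOf(xml);
        return HL7_V3_NS.equals(root.getNamespaceURI())
                && CDA_ROOT.equals(root.getLocalName());
    }

    public static void main(String[] args) {
        String cdaR2 = "<ClinicalDocument xmlns=\"urn:hl7-org:v3\"><title>t</title></ClinicalDocument>";
        String other = "<note><text>t</text></note>";
        System.out.println(describeRoot(cdaR2) + " -> " + looksLikeCdaR2(cdaR2));
        System.out.println(describeRoot(other) + " -> " + looksLikeCdaR2(other));
    }
}
```

Running describeRoot on one of the bundled test files and on one of the converted Epic documents would show at a glance whether the two are even the same kind of document.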
log4j: reset attribute= "false".
log4j: Threshold ="null".
log4j: Retreiving an instance of org.apache.log4j.Logger.
log4j: Setting [ProgressAppender] additivity to [false].
log4j: Level value for ProgressAppender is [INFO].
log4j: ProgressAppender level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%m].
log4j: Adding appender named [noEolAppender] to category [ProgressAppender].
log4j: Retreiving an instance of org.apache.log4j.Logger.
log4j: Setting [ProgressDone] additivity to [false].
log4j: Level value for ProgressDone is [INFO].
log4j: ProgressDone level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%m%n].
log4j: Adding appender named [eolAppender] to category [ProgressDone].
log4j: Level value for root is [INFO].
log4j: root level set to INFO
log4j: Class name: [org.apache.log4j.ConsoleAppender]
log4j: Parsing layout of class: "org.apache.log4j.PatternLayout"
log4j: Setting property [conversionPattern] to [%d{dd MMM yyyy HH:mm:ss} %5p %c{1} - %m%n].
log4j: Adding appender named [consoleAppender] to category [root].
18 Dec 2019 13:24:09 INFO CdaCasInitializer - Hyphen dictionary: org/apache/ctakes/preprocessor/tokenizer/hyphenated.txt
18 Dec 2019 13:24:09 INFO CdaCasInitializer - DTD: org/apache/ctakes/preprocessor/cda/NotesIIST_RTF.DTD
18 Dec 2019 13:24:09 INFO Chunker - Chunker model file: org/apache/ctakes/chunker/models/chunker-model.zip
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using left , right scope sizes: 7 , 7
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using scope order: LEFT,RIGHT
18 Dec 2019 13:24:11 INFO ContextAnnotator - SCOPE ORDER: [1, 3]
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context analyzer: org.apache.ctakes.necontexts.negation.NegationContextAnalyzer
18 Dec 2019 13:24:11 INFO NegationContextAnalyzer - initBoundaryData() called for ContextInitializer
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context consumer: org.apache.ctakes.necontexts.negation.NegationContextHitConsumer
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using focus type: org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context type: org.apache.ctakes.typesystem.type.syntax.BaseToken
18 Dec 2019 13:24:11 INFO SentenceDetector - Sentence detector model file: org/apache/ctakes/core/sentdetect/sd-med-model.zip
18 Dec 2019 13:24:11 INFO TokenizerAnnotatorPTB - Initializing org.apache.ctakes.core.ae.TokenizerAnnotatorPTB
18 Dec 2019 13:24:11 INFO POSTagger - POS tagger model file: org/apache/ctakes/postagger/models/mayo-pos.zip
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using left , right scope sizes: 10 , 10
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using scope order: LEFT,RIGHT
18 Dec 2019 13:24:11 INFO ContextAnnotator - SCOPE ORDER: [1, 3]
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context analyzer: org.apache.ctakes.necontexts.status.StatusContextAnalyzer
18 Dec 2019 13:24:11 INFO StatusContextAnalyzer - initBoundaryData() called for ContextInitializer
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context consumer: org.apache.ctakes.necontexts.status.StatusContextHitConsumer
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using lookup window type: org.apache.ctakes.typesystem.type.textspan.Sentence
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using focus type: org.apache.ctakes.typesystem.type.textsem.IdentifiedAnnotation
18 Dec 2019 13:24:11 INFO ContextAnnotator - Using context type: org.apache.ctakes.typesystem.type.syntax.BaseToken
18 Dec 2019 13:24:11 INFO LvgCmdApiResourceImpl - Loading NLM Norm and Lvg with config file = /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/lvg/data/config/lvg.properties
18 Dec 2019 13:24:11 INFO LvgCmdApiResourceImpl - config file absolute path = /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/lvg/data/config/lvg.properties
18 Dec 2019 13:24:11 INFO LvgCmdApiResourceImpl - cwd = /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0
18 Dec 2019 13:24:11 INFO LvgCmdApiResourceImpl - cd /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/lvg/
18 Dec 2019 13:24:11 INFO ENGINE - open start - state not modified
18 Dec 2019 13:24:11 INFO ENGINE - dataFileCache open start
18 Dec 2019 13:24:12 INFO LvgCmdApiResourceImpl - cd /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0
18 Dec 2019 13:24:12 INFO ContextDependentTokenizerAnnotator - Finite state machines loaded.
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - indexDir=org/apache/ctakes/dictionary/lookup/drug_index exists.
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loading Lucene Index into memory: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/drug_index
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loaded Lucene Index, # docs=5
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - indexDir=org/apache/ctakes/dictionary/lookup/OrangeBook exists.
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loading Lucene Index into memory: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/OrangeBook
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loaded Lucene Index, # docs=18889
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - indexDir=org/apache/ctakes/dictionary/lookup/snomed-like_sample exists.
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loading Lucene Index into memory: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/snomed-like_sample
18 Dec 2019 13:24:12 INFO LuceneIndexReaderResourceImpl - Loaded Lucene Index, # docs=12
18 Dec 2019 13:24:12 INFO DictionaryLookupAnnotator - Parsing descriptor: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/LookupDesc.xml
18 Dec 2019 13:24:12 INFO FirstTokenPermLookupInitializerImpl - Exclusion tagset loaded: [cc, pp, cd, pdt, prp$, vbn, vbp, pp$, wdt, wrb, ls, vb, vbz, prp, dt, ex, pos, md, vbd, wp, vbg, to, wps, rp]
18 Dec 2019 13:24:12 INFO UmlsToSnomedLuceneConsumerImpl - Using lucene index: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/dictionary/lookup/snomed-like_codes_sample
18 Dec 2019 13:24:12 INFO UmlsToSnomedLuceneConsumerImpl - Loaded Lucene index with 15 entries.
18 Dec 2019 13:24:12 INFO FirstTokenPermLookupInitializerImpl - Exclusion tagset loaded: [cc, pp, cd, pdt, vbn, vbp, pp$, wdt, wrb, ls, vb, vbz, dt, ex, pos, md, vbd, wp, vbg, to, wps, rp]
18 Dec 2019 13:24:12 INFO AssertionAnalysisEngine - scope model file: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/assertion/models/scope.model
18 Dec 2019 13:24:12 INFO AssertionAnalysisEngine - cue model file: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/assertion/models/cue.model
18 Dec 2019 13:24:12 INFO AssertionAnalysisEngine - pos model file: /Users/mrou/CCDA-Mac/Tools/apache-ctakes-4.0.0/resources/org/apache/ctakes/assertion/models/pos.model
18 Dec 2019 13:24:12 WARN BatchRunner - This class cannot be used until CTAKES-76 is implemented.
18 Dec 2019 13:24:12 WARN JarafeMEDecoder - This class cannot be used until CTAKES-76 is implemented.
Dec 18, 2019 1:24:12 PM org.apache.uima.resource.impl.ResourceManager_impl initializeExternalResources
WARNING: The external resource named assertionModelResourceImpl has been declared multiple times with different definitions. The definition of the resource in component /AggregateCdaProcessor/AssertionAnnotator/assertionAnalysisEngine/ will be used. The definition in component /AggregateCdaProcessor/AssertionAnnotator/conceptConverterAnalysisEngine/ will be ignored.
Dec 18, 2019 1:24:12 PM org.apache.uima.resource.impl.ResourceManager_impl initializeExternalResources
WARNING: The external resource named scopeModelResourceImpl has been declared multiple times with different definitions. The definition of the resource in component /AggregateCdaProcessor/AssertionAnnotator/assertionAnalysisEngine/ will be used. The definition in component /AggregateCdaProcessor/AssertionAnnotator/conceptConverterAnalysisEngine/ will be ignored.
Dec 18, 2019 1:24:12 PM org.apache.uima.resource.impl.ResourceManager_impl initializeExternalResources
WARNING: The external resource named cueModelResourceImpl has been declared multiple times with different definitions. The definition of the resource in component /AggregateCdaProcessor/AssertionAnnotator/assertionAnalysisEngine/ will be used. The definition in component /AggregateCdaProcessor/AssertionAnnotator/conceptConverterAnalysisEngine/ will be ignored.
Dec 18, 2019 1:24:12 PM org.apache.uima.resource.impl.ResourceManager_impl initializeExternalResources
WARNING: The external resource named enabledFeaturesResourceImpl has been declared multiple times with different definitions. The definition of the resource in component /AggregateCdaProcessor/AssertionAnnotator/assertionAnalysisEngine/ will be used. The definition in component /AggregateCdaProcessor/AssertionAnnotator/conceptConverterAnalysisEngine/ will be ignored.
18 Dec 2019 13:24:12 INFO ClearNLPDependencyParserAE - using Morphy analysis? true
Loading configuration.
Loading feature templates.
Loading lexica.
Loading model: ........................................................................................
CPM Initialization Complete
18 Dec 2019 13:24:24 INFO CdaCasInitializer - process(JCas)
[Fatal Error] :1:1: Content is not allowed in prolog.
Dec 18, 2019 1:24:24 PM org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl callAnalysisComponentProcess(430)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:230)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:895)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.ctakes.preprocessor.ClinicalNotePreProcessor.process(ClinicalNotePreProcessor.java:175)
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:168)
    ... 10 more
Dec 18, 2019 1:24:24 PM org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl processAndOutputNewCASes(273)
SEVERE: Exception occurred
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:230)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:895)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.ctakes.preprocessor.ClinicalNotePreProcessor.process(ClinicalNotePreProcessor.java:175)
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:168)
    ... 10 more
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:230)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:895)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.ctakes.preprocessor.ClinicalNotePreProcessor.process(ClinicalNotePreProcessor.java:175)
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:168)
    ... 10 more
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
SEVERE: The container AggregateCdaProcessor returned the following error message: null (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit maybeLogSevereException(2500)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message: null
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:230)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:895)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.ctakes.preprocessor.ClinicalNotePreProcessor.process(ClinicalNotePreProcessor.java:175)
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:168)
    ... 10 more
org.apache.uima.analysis_engine.AnalysisEngineProcessException
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:230)
    at org.apache.uima.analysis_component.JCasAnnotator_ImplBase.process(JCasAnnotator_ImplBase.java:48)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.callAnalysisComponentProcess(PrimitiveAnalysisEngine_impl.java:396)
    at org.apache.uima.analysis_engine.impl.PrimitiveAnalysisEngine_impl.processAndOutputNewCASes(PrimitiveAnalysisEngine_impl.java:314)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.processUntilNextOutputCas(ASB_impl.java:570)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl$AggregateCasIterator.<init>(ASB_impl.java:412)
    at org.apache.uima.analysis_engine.asb.impl.ASB_impl.process(ASB_impl.java:344)
    at org.apache.uima.analysis_engine.impl.AggregateAnalysisEngine_impl.processAndOutputNewCASes(AggregateAnalysisEngine_impl.java:265)
    at org.apache.uima.analysis_engine.impl.AnalysisEngineImplBase.process(AnalysisEngineImplBase.java:269)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:895)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 1; Content is not allowed in prolog.
    at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
    at org.apache.ctakes.preprocessor.ClinicalNotePreProcessor.process(ClinicalNotePreProcessor.java:175)
    at org.apache.ctakes.preprocessor.ae.CdaCasInitializer.process(CdaCasInitializer.java:168)
    ... 10 more
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.container.ProcessingContainer_Impl process
SEVERE: The CPM stopped because the configured error threshold 0 was exceeded. (Thread Name: [Procesing Pipeline#1 Thread]::) Component Name: AggregateCdaProcessor
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
SEVERE: The CPM is terminating. The current component is AggregateCdaProcessor. (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit process
WARNING: The CPM cannot be stopped by force. The current container is AggregateCdaProcessor. (Thread Name: [Procesing Pipeline#1 Thread]::) Reason: The CAS processor AggregateCdaProcessor is configured to stop the CPM when excessive errors are encountered. (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.ProcessingUnit maybeLogSevereException(2500)
SEVERE: Thread: [Procesing Pipeline#1 Thread]::, message: org.apache.uima.collection.base_cpm.AbortCPMException:
    at org.apache.uima.collection.impl.cpm.container.ProcessingContainer_Impl.incrementCasProcessorErrors(ProcessingContainer_Impl.java:795)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.processNext(ProcessingUnit.java:1039)
    at org.apache.uima.collection.impl.cpm.engine.ProcessingUnit.run(ProcessingUnit.java:575)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The collection reader thread state is: 1004 (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The CPM processing unit is 0 and processing state 2003. (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The CAS consumer thread state is 2001. (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The application stopped the CPM. (Thread Name: [Procesing Pipeline#1 Thread]::)
Dec 18, 2019 1:24:24 PM org.apache.uima.collection.impl.cpm.engine.CPMEngine process
INFO: The CPM engine is stopping. An end-of-file token is added to the worker queue. (Thread Name: [Procesing Pipeline#1 Thread]::)
Forced stop: true
Aborted
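
On the SAXParseException itself: "Content is not allowed in prolog" at line 1, column 1 is what an XML parser reports when the first thing it sees is text that is neither markup nor whitespace. Two common triggers are a byte-order mark left at the start of the decoded text, or input that is not XML at all (for example, plain note text reaching the initializer). The log does not say which, if either, applies here, so the following is only a rough sketch of a pre-check one might run on the document text before it reaches CdaCasInitializer. PrologCheck, Start and classify are hypothetical names, not cTAKES or UIMA APIs.

```java
// Hypothetical helper for illustration only; not part of cTAKES or UIMA.
final class PrologCheck {

    enum Start { EMPTY, BOM, NOT_XML, OK }

    /**
     * Classifies the beginning of the text that would be handed to the XML parser.
     * OK only means the text starts like markup; parsing can still fail later.
     */
    static Start classify(String text) {
        if (text == null || text.isEmpty()) {
            return Start.EMPTY;
        }
        // A byte-order mark that survived decoding shows up as U+FEFF.
        if (text.charAt(0) == '\uFEFF') {
            return Start.BOM;
        }
        // Skip leading whitespace and look at the first real character.
        int i = 0;
        while (i < text.length() && Character.isWhitespace(text.charAt(i))) {
            i++;
        }
        if (i == text.length()) {
            return Start.EMPTY;
        }
        return text.charAt(i) == '<' ? Start.OK : Start.NOT_XML;
    }

    public static void main(String[] args) {
        System.out.println(classify("<?xml version=\"1.0\"?><doc/>"));       // OK
        System.out.println(classify("\uFEFF<?xml version=\"1.0\"?><doc/>")); // BOM
        System.out.println(classify("Patient seen today for follow-up."));   // NOT_XML
    }
}
```

If the text handed to the initializer comes back as BOM or NOT_XML, the problem lies in how the file is read or which reader feeds the pipeline, not in the CDA content itself.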