Thank you again, Tim. This worked very nicely!

One more question on a somewhat unrelated subject, if I may: while this setup found most clinical concepts, it did not find a few that I'd think would be fairly standard. For example, it did not find "myeloblastoma", which is definitely a UMLS term. I tried using AggregatePlaintextUMLSProcessor in conjunction with DictionaryLookupAnnotatorUMLS, but no luck: it only found "myeloblastoma" as a WordToken. Are there different versions of the UMLS dictionaries that could potentially be used in cTAKES?
Thank you,
Natalia

On Mon, Jul 21, 2014 at 2:52 PM, Miller, Timothy <[email protected]> wrote:

> You may need to modify test_plaintext.xml to use the UMLS-based pipeline
> if you haven't already. I think the line:
>
> <import location="../analysis_engine/AggregatePlaintextProcessor.xml"/>
>
> needs to be changed to use:
>
> AggregatePlaintextUMLSProcessor.xml
>
> I believe you can also make that change in the CPE GUI.
>
> Tim
>
> On 07/21/2014 02:43 PM, Natalia Connolly wrote:
>
> Thanks Tim. This worked in the sense that it did not crash; however, the
> output does not seem to have any actual annotations of diagnoses,
> medications, etc. The input text contains a number of such concepts that
> had indeed been flagged by CVD, but when I grep for "concept" or "medfacts"
> or "cui" in the CPE output there is nothing there. Would you have any
> suggestions for how to "synchronize" the outputs of CVD and CPE? Both
> scripts contain the -Dctakes.umlsuser/umlspw options, so both should have
> access to UMLS.
>
> Thank you,
>
> Natalia
>
> On Mon, Jul 21, 2014 at 1:36 PM, Miller, Timothy <[email protected]> wrote:
>
>> It looks to me like you want test_plaintext.xml rather than test1.xml.
>> test1.xml seems to expect CDA-formatted input while test_plaintext.xml can
>> read text files like you have.
>>
>> Tim
>>
>> On 07/21/2014 01:30 PM, Natalia Connolly wrote:
>>
>> Hello,
>>
>> I am new to cTAKES. I am using cTAKES 3.1. I've been able to run
>> the visual debugger without any trouble, but now I am stuck on running the
>> CPE version, which is what I will really need as I have a large number of
>> clinical documents to process.
>>
>> I loaded test1.xml as the descriptor, and made sure both the input
>> and the output directories exist. My single input file in the input
>> directory is just plain text, similar to the "Dr. Nutritious" example.
>> However, I am getting the following error:
>>
>> org.apache.uima.analysis_engine.AnalysisEngineProcessException
>> CausedBy: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 2;
>> Content is not allowed in prolog.
>>
>> Does this mean that the input file has to be in XML format? If so,
>> how do I convert plain text into the format that cTAKES expects?
>>
>> Thank you.
>>
>> Natalia Connolly
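
For anyone comparing CVD and CPE results on a larger batch: the grep check described in the thread (looking for "concept", "medfacts" or "cui" in the CPE output) can be scripted. The following is only a minimal Python sketch and is not part of cTAKES. It assumes the CPE writes one text-readable file (for example XML) per document into a single output directory; the function names count_terms and files_without_hits are made up for illustration.

```python
from pathlib import Path

# Search strings taken from the thread; matching is case-insensitive.
TERMS = ("concept", "medfacts", "cui")


def count_terms(output_dir, terms=TERMS):
    """Return {file name: {term: count}} for every regular file in output_dir.

    Assumption: each output file can be read as text (e.g. XML).
    """
    results = {}
    for path in sorted(Path(output_dir).iterdir()):
        if not path.is_file():
            continue
        # Undecodable bytes are replaced rather than raising an error.
        text = path.read_text(encoding="utf-8", errors="replace").lower()
        results[path.name] = {t: text.count(t.lower()) for t in terms}
    return results


def files_without_hits(results):
    """List the files in which none of the search strings occurred."""
    return [name for name, counts in results.items()
            if not any(counts.values())]
```

Calling files_without_hits(count_terms("path/to/cpe/output")) lists the documents in which none of the strings appear. If every file is listed, that would be consistent with the non-UMLS aggregate still being imported in the descriptor, as suggested above. A plain substring match is crude (for example, "concept" also matches inside longer names), so treat the counts as a quick sanity check only.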
