Hi Paula,

There is a separate DictionaryLookupAnnotatorCSV.xml descriptor for using the delimited file that you found. You would have to update the aggregate to refer to DictionaryLookupAnnotatorCSV.xml instead of DictionaryLookupAnnotator.xml in order to use the delimited files directly.
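
A minimal sketch of that descriptor change, written as a small Python helper that rewrites the import inside an aggregate descriptor. The sample fragment, its `key` value, and the import path are hypothetical placeholders; check the real aggregate descriptor in your install for the actual import element and path before applying anything like this.

```python
import re

# Matches an import whose location ends in exactly DictionaryLookupAnnotator.xml.
# It does not match DictionaryLookupAnnotatorUMLS.xml or DictionaryLookupAnnotatorCSV.xml.
_LOOKUP_IMPORT = re.compile(
    r'(location="(?:[^"]*/)?)DictionaryLookupAnnotator\.xml"'
)


def use_csv_lookup(descriptor_xml: str) -> str:
    """Return the descriptor text with the lookup import pointed at the CSV variant."""
    return _LOOKUP_IMPORT.sub(r'\1DictionaryLookupAnnotatorCSV.xml"', descriptor_xml)


# Hypothetical aggregate fragment; the key and path are placeholders, not the real ones.
SAMPLE_AGGREGATE = """\
<delegateAnalysisEngine key="DictionaryLookupAnnotator">
  <import location="../hypothetical/path/DictionaryLookupAnnotator.xml"/>
</delegateAnalysisEngine>
"""

if __name__ == "__main__":
    print(use_csv_lookup(SAMPLE_AGGREGATE))
```

Editing the descriptor by hand in a text editor achieves the same thing; the helper only makes the single changed attribute explicit.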
Alternatively, to create a Lucene index to replace the one used by DictionaryLookupAnnotator.xml, there are some posts on the old forum that talk about creating dictionaries. You could take a look at those:

https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=80&start=20#p1459
https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=423&p=1465

I think a better alternative would be to update the SQL dictionaries: delete all the data and replace it with what you want, and remove the check for the UMLS user ID.

Hope that helps.

-- James

From: user-return-350-Masanz.James=mayo....@ctakes.apache.org On Behalf Of digital paula
Sent: Monday, November 18, 2013 8:48 PM
To: user@ctakes.apache.org
Subject: RE: Dictionary and Assertion rules for the Aggregate Plaintext Processor

Thanks James for the response. I'd like to update that tiny dictionary and see the changes take effect. I looked in the dictionary-lookup folders and found a dictionary1.csv file with these entries; I added 'elbow' to it (just typed it in and saved the file).

ankle|ankle
aspirin|aspirin
cm|cm
cm|cm is a synonym of a UMLS term Cutaneous Mastocytosis which is in SNOMED but cm is not by itself
hyperlipidemia|hyperlipidemia
knee|knee
knee|knee pain
ld|ld
ld|ld SNOMED procedure C0011911
medical|medical nutrition therapy
nutrition|nutrition
nutrition|nutrition therapy
pain|pain
pain|pain, chronic
weight|weight gain
elbow|elbow

I typed the following text into CVD, using the plaintext processor that uses the DictionaryLookupAnnotator.xml descriptor:

"patient has knee pain and gained significant weight gain due to injury. Past history of elbow pain. family history includes hyperlipidemia. ."

There was no annotation for 'elbow' or 'weight' after executing the AggregatePlaintext AE processor. Does something else have to be done to get it to annotate? Basically, what are the steps to add terms to the tiny dictionary?

Thanks.

Regards,
Paula

> From: masanz.ja...@mayo.edu
> To: user@ctakes.apache.org
> Subject: RE: Dictionary and Assertion rules for the Aggregate Plaintext Processor
> Date: Mon, 18 Nov 2013 20:17:32 +0000
>
> The "pain" and "aspirin" terms are from a tiny dictionary that has just a handful of terms with made-up codes/CUIs. (For the record, they are in a Lucene index instead of an HSQL database.) I think "knee" is also in that tiny dictionary.
>
> As far as I know, the assertion component doesn't work any differently whether you use the UMLS dictionary or the tiny, sample, made-up dictionary.
>
> Which dictionary is used is determined by which Dictionary Lookup analysis engine is included in the aggregate:
> DictionaryLookupAnnotator.xml for the tiny one
> DictionaryLookupAnnotatorUMLS.xml for the one that has real UMLS terms and CUIs
>
> -- James
>
> From: user-return-348-Masanz.James=mayo....@ctakes.apache.org On Behalf Of digital paula
> Sent: Monday, November 18, 2013 1:37 PM
> To: user@ctakes.apache.org
> Subject: RE: Dictionary and Assertion rules for the Aggregate Plaintext Processor
>
> Oh, gosh....I meant 'Hello' again.
>
> ________________________________________
> From: cybersat...@hotmail.com
> To: user@ctakes.apache.org
> Subject: Dictionary and Assertion rules for the Aggregate Plaintext Processor
> Date: Mon, 18 Nov 2013 14:36:13 -0500
>
> Hell again cTakes Community,
>
> I think this will be an easy question. Okay, I've decided to start simple by first exploring the Aggregate Plaintext Processor without UMLS. Since it's not using UMLS, where are the dictionary and the assertion rules defined? Can these be modified easily? For example, I see that 'pain' and 'aspirin' get annotated in text, and if the text states 'family history' this is noted as well. Can someone enlighten me as to how cTakes generates this and whether it's easily customizable?
>
> Thanks.
>
> Regards,
> Paula
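
For readers who want to inspect the delimited dictionary file quoted earlier in this thread, here is a rough Python sketch that parses a few of its entries and checks them against the test sentence. It assumes each line has the form `first word|full term`, which is an interpretation of the quoted file and not something confirmed in the thread. It also uses naive token matching and does not reproduce how cTAKES performs lookup; the function names are made up for illustration.

```python
import re
from collections import defaultdict

# A subset of the entries quoted in the thread, including the added 'elbow' line.
SAMPLE_DICTIONARY = """\
hyperlipidemia|hyperlipidemia
knee|knee
knee|knee pain
pain|pain
pain|pain, chronic
weight|weight gain
elbow|elbow
"""


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_delimited_dictionary(text, delimiter="|"):
    """Map each first-word key to the token lists of the terms filed under it."""
    entries = defaultdict(list)
    for line in text.splitlines():
        if delimiter not in line:
            continue  # skip blank or malformed lines
        key, term = line.split(delimiter, 1)
        entries[key.strip().lower()].append(tokenize(term))
    return dict(entries)


def find_terms(sentence, entries):
    """Return every dictionary term whose tokens appear consecutively in the sentence."""
    words = tokenize(sentence)
    found = []
    for i, word in enumerate(words):
        for term_tokens in entries.get(word, []):
            if words[i:i + len(term_tokens)] == term_tokens:
                found.append(" ".join(term_tokens))
    return found


if __name__ == "__main__":
    text = ("patient has knee pain and gained significant weight gain due to "
            "injury. Past history of elbow pain. family history includes "
            "hyperlipidemia.")
    print(find_terms(text, load_delimited_dictionary(SAMPLE_DICTIONARY)))
```

Under that assumed layout, 'weight' on its own is not a term; only 'weight gain' is. As James notes, none of these entries affect the output unless the aggregate uses DictionaryLookupAnnotatorCSV.xml, because DictionaryLookupAnnotator.xml reads the Lucene index and not this file.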