Further to my previous message, Sean, I was wondering if you could tell me whether this answer you gave in 2015 is still the right way to do things in cTAKES 4.x?
permalink: http://markmail.org/message/s3ztinppusvsciss
Subject: RE: How to update cTAKES so that new top level categories come out based on local dictionary?
From: Finan, Sean (sean...@childrens.harvard.edu)
Date: Oct 6, 2015 2:04:56 pm
List: org.apache.incubator.ctakes-dev

Regards
Peter

From: Peter Abramowitsch <pabramowit...@hearst.com>
Date: Thursday, January 4, 2018 at 12:50 PM
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Subject: How to use external CSV or BSV in addition to FastUMLS

Can someone point me to any up-to-date how-tos on including external CSV/BSV resources that add synonyms and other terms for dictionary lookup, to augment the FAST UMLS resources that come out of the box?

Perhaps I have missed something, but looking at the CTakesDictionaryCreator UI, it seems designed only to choose subsets of the UMLS data set rather than allowing one to bring in completely new information sources. I scoured the Marklogic cTAKES user archive, but so many of the entries are old that I'm not sure they describe the current way of doing things.

The only approach I could see would be to use the AggregateEngine description and have it point to the CSV annotator, creating a completely new AE. But this would build other types of annotation, whereas what I have in mind is creating identified mentions, such as a DiseaseDisorderMention, based on finding an acronym that the UMLS resource doesn't know about, even though the concept in its full textual form is there.

I'm sure this is not a unique request, and I apologize in advance if it has already been answered somewhere.

- Peter