Hi Larry,

dictionaries/dictionary/metaFields
These determine which fields are available for setting a property within lookupInitializer or lookupConsumer.

dictionaries/dictionary/excludeList
Anything listed in the excludeList is ignored during dictionary lookup. This is used when the dictionary contains something we are not interested in having annotated, or when a dictionary entry is so much more often used to mean something else that we decide to skip annotating it. For example, since cTAKES ignores case during lookup, "Dr. Smith" would normally result in "Dr" being marked as diabetic retinopathy. Rather than having "Dr." marked incorrectly as diabetic retinopathy, we ignore all occurrences of "dr" (or "Dr" or "DR") by using the excludeList. This ignores even those cases where DR is used to mean diabetic retinopathy; the rationale is that if diabetic retinopathy is an important concept for the document, it will hopefully be spelled out somewhere within the document.

lookupBindings/lookupBinding/lookupInitializer/properties
windowAnnotations - yes, this specifies which annotation type to perform lookups within
exclusionTags - these are part-of-speech tags; tokens tagged with these are ignored
maxPermutationLevel - affects the number of permutations of word orderings that are searched for multi-word dictionary entries

lookupConsumer
I do see typeIdField in my LookupDesc_DrugNER.xml.

Defining a property for each of the fields in your Lucene index that you want available to the consumer is the right thing to do. This is here so that if you have a Lucene index with more fields than you need for cTAKES, you can have cTAKES retrieve just the fields you need.

Hope that helps. If more explanation would be useful, let me know.

-- James

-----Original Message-----
From: [email protected] [mailto:[email protected]] On Behalf Of Kline, Larry D
Sent: Wednesday, August 07, 2013 12:48 PM
To: [email protected]
Subject: RE: LookupDesc_DrugNER.xml

Thanks James. I took the LookupDesc_DrugNER.xml file that came with cTAKES (2.5) and slightly modified it.
It seems to work, but I'd just like to know what the fields mean. For example:

* dictionaries/dictionary/metaFields
  No idea what these do. I left them the same.
* dictionaries/dictionary/excludeList
  I assume these words are excluded from consideration when looking up a string in the dictionary.
* lookupBindings/lookupBinding/lookupInitializer/properties
  I read through some of the code of the initializer. I can see what it's doing, but exactly how these fields affect the results is not obvious to me. I guess windowAnnotations specifies which annotation type to perform lookups within. The others I don't understand.
* lookupBindings/lookupBinding/lookupConsumer
  I defined a property for each of the fields stored in my Lucene index, but I'm not sure if I needed to do that. In the default implementation the field typeIdField is used in the lookup consumer, but it is defined nowhere in the XML file.

Some background: I build my own Lucene index from tables of FDB data that we maintain locally, so I'm not looking anything up in UMLS. I've defined my own lookupConsumer that gets the data from Lucene, and I defined my own DrugOntologyConcept (a subtype of OntologyConcept) to hold that information.

Thanks,
Larry

-----Original Message-----
From: Masanz, James J. [mailto:[email protected]]
Sent: Wednesday, August 07, 2013 9:59 AM
To: '[email protected]'
Subject: RE: LookupDesc_DrugNER.xml

There's a very brief description of the file on https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Dictionary+Lookup

"A LookupDescriptorFile such as lookup/LookupDesc.xml, found in resources/, defines the dictionary(s) used, and the classes that interact with the dictionary(s). The implementation tag identifies the type of dictionary: Lucene index (luceneImpl), database (jdbcImpl), or delimited flat file (csvImpl). See class org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.java for implementation details."

There are a few comments within the file.
But as far as the specifics of the individual elements, if you describe what you'd like to do, I or someone else on this list should be able to help.

-- James

From: [email protected] [mailto:[email protected]] On Behalf Of Kline, Larry D
Sent: Wednesday, August 07, 2013 11:45 AM
To: [email protected]
Subject: LookupDesc_DrugNER.xml

Can anyone tell me where I can find a description of the format of this file?

The contents of this electronic mail message and any attachments are confidential, possibly privileged and intended for the addressee(s) only. Only the addressee(s) may read, disseminate, retain or otherwise use this message. If received in error, please immediately inform the sender and then delete this message without disclosing its contents to anyone.
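
For anyone reading this thread later, the excludeList behavior James describes (case-insensitive lookup where "Dr" would otherwise match diabetic retinopathy) can be illustrated with a small Python sketch. This is not cTAKES code and does not use the cTAKES API: the dictionary contents, the `lookup` function, and the `max_len` parameter are all hypothetical and exist only to show the idea. It also assumes a simple exact match on consecutive tokens, which is a simplification of the permutation-based matching the thread mentions.

```python
# Illustration only: NOT cTAKES code. Every name below is hypothetical.

# Toy dictionary: surface form -> concept it would be annotated as.
DICTIONARY = {
    "dr": "diabetic retinopathy",
    "diabetic retinopathy": "diabetic retinopathy",
    "aspirin": "aspirin",
}

# Terms to skip during lookup, mirroring the excludeList idea in the thread.
EXCLUDE_LIST = {"dr"}


def lookup(tokens, dictionary=DICTIONARY, exclude_list=EXCLUDE_LIST, max_len=3):
    """Return (start, end, concept) for each dictionary hit, ignoring case."""
    excluded = {term.lower() for term in exclude_list}
    entries = {term.lower(): concept for term, concept in dictionary.items()}
    hits = []
    for start in range(len(tokens)):
        last_end = min(start + max_len, len(tokens))
        for end in range(start + 1, last_end + 1):
            candidate = " ".join(tokens[start:end]).lower()
            if candidate in excluded:
                # Skipped even where "DR" really does mean the disease.
                continue
            if candidate in entries:
                hits.append((start, end, entries[candidate]))
    return hits


tokens = ["Dr", ".", "Smith", "prescribed", "aspirin"]
print(lookup(tokens))                      # [(4, 5, 'aspirin')]
print(lookup(tokens, exclude_list=set()))  # "Dr" is now wrongly annotated
```

With the exclude list in place, "Dr" is never annotated, while the spelled-out "diabetic retinopathy" still matches. That is the trade-off James mentions: an abbreviation used in its medical sense is lost, on the expectation that an important concept will be written out somewhere in the document.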
