Hi Larry,

dictionaries/dictionary/metaFields 
These determine which fields are available for setting a property within 
lookupInitializer or lookupConsumer

dictionaries/dictionary/excludeList
Anything listed in the excludeList is ignored during dictionary lookup.
This is useful when the dictionary contains an entry that we do not want
annotated, or an entry that is used so much more often to mean something else
that we decide to skip annotating it. For example, since cTAKES ignores case
during lookup, "Dr. Smith" would normally result in "Dr" being marked as
diabetic retinopathy. Rather than having "Dr." marked incorrectly, we ignore
all occurrences of "dr" (or "Dr" or "DR") by using the excludeList. This also
ignores the cases where DR really is used to mean diabetic retinopathy; the
rationale is that if diabetic retinopathy is an important concept for the
document, it will hopefully be spelled out somewhere within the document.
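
As a rough illustration of the excludeList idea (a minimal sketch, not the
actual cTAKES code; the dictionary contents and the function name `lookup`
are hypothetical), a case-insensitive lookup that skips excluded terms could
look like this:

```python
# Hypothetical stand-ins for a dictionary and its excludeList.
# These are illustrative values, not data shipped with cTAKES.
DICTIONARY = {
    "dr": "diabetic retinopathy",
    "aspirin": "aspirin",
}
EXCLUDE_LIST = {"dr"}


def lookup(token, dictionary=DICTIONARY, exclude=EXCLUDE_LIST):
    """Return the dictionary entry for token, or None if excluded or absent."""
    key = token.lower()      # lookup ignores case
    if key in exclude:       # excluded terms are never annotated
        return None
    return dictionary.get(key)
```

With the exclusion in place, "Dr", "dr" and "DR" all come back as None, while
passing an empty exclude set would return the diabetic retinopathy entry.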

lookupBindings/lookupBinding/lookupInitializer/properties
windowAnnotations - yes, this specifies which annotation type to perform
lookups within.
exclusionTags - these are part-of-speech tags; tokens tagged with any of them
are ignored.
maxPermutationLevel - affects the number of permutations of word orderings that
are searched for multi-word dictionary entries.
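
To make exclusionTags and maxPermutationLevel a bit more concrete, here is a
small sketch. It is not the cTAKES implementation: the function names are
hypothetical, and it assumes that the permutation level simply caps how many
words of the window are permuted, which may differ from what the real
initializer does.

```python
from itertools import permutations


def candidate_tokens(tagged_tokens, exclusion_tags):
    """Drop tokens whose part-of-speech tag is in exclusion_tags."""
    return [word for word, tag in tagged_tokens if tag not in exclusion_tags]


def orderings(words, max_permutation_level):
    """Return word orderings over at most max_permutation_level words.

    Assumption for this sketch: words beyond the cap are left out entirely.
    """
    head = words[:max_permutation_level]
    return [" ".join(p) for p in permutations(head)]


# Hypothetical tagged window: "in" is tagged IN (preposition) and excluded.
tagged = [("pain", "NN"), ("in", "IN"), ("chest", "NN")]
words = candidate_tokens(tagged, {"IN"})
phrases = orderings(words, 2)
```

Each string in `phrases` is a candidate to search for in the dictionary, so
an entry stored as "chest pain" can still match text that says "pain in
chest". A larger level means more orderings to search.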

lookupConsumer
I do see typeIdField in my LookupDesc_DrugNER.xml.
Defining a property for each of the fields in your Lucene index that you want
available to the consumer is the right thing to do.
The properties are there so that if your Lucene index has more fields than
you need for cTAKES, you can have cTAKES retrieve just the fields you need.
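
The idea of retrieving only the configured fields can be sketched as follows.
This is only an illustration under assumptions: the field names and the
function `retrieve_fields` are hypothetical, and a plain dict stands in for a
document from the Lucene index.

```python
def retrieve_fields(document, wanted_fields):
    """Keep only the fields named in wanted_fields; ignore the rest."""
    return {name: document[name] for name in wanted_fields if name in document}


# Hypothetical index document with more fields than the consumer needs.
doc = {
    "code": "12345",
    "preferred_designation": "aspirin",
    "internal_note": "not needed by the consumer",
}

# Hypothetical list of fields declared as consumer properties.
configured = ["code", "preferred_designation"]
result = retrieve_fields(doc, configured)
```

Only the fields declared for the consumer end up in `result`; anything else
stored in the index is left behind.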

Hope that helps. If still more explanation would be useful, let me know.

-- James

-----Original Message-----
From: [email protected] 
[mailto:[email protected]] On Behalf Of 
Kline, Larry D
Sent: Wednesday, August 07, 2013 12:48 PM
To: [email protected]
Subject: RE: LookupDesc_DrugNER.xml

Thanks James.  I took the LookupDesc_DrugNER.xml file that came with
cTAKES (2.5) and slightly modified it.  It seems to work but I just like
to know what the fields mean.  For example:

* dictionaries/dictionary/metaFields
No idea what these do.  I left them the same.

* dictionaries/dictionary/excludeList
I assume these words are excluded from consideration when looking up a
string in the dictionary

* lookupBindings/lookupBinding/lookupInitializer/properties
I read through some of the code of the initializer.  I can see what it's
doing, but exactly how these fields affect the results is not obvious to
me.  I guess windowAnnotations specifies which annotation type to
perform lookups within.  The others I don't understand.

* lookupBindings/lookupBinding/lookupConsumer
I defined a property for each of the fields stored in my lucene index,
but I'm not sure if I needed to do that. In the default implementation
the field typeIdField is used in the lookup consumer, but it is defined
nowhere in the xml file.

Some background: I build my own Lucene index from tables of FDB data
that we maintain locally.  So I'm not looking anything up in UMLS.  I've
defined my own lookupConsumer that gets the data from Lucene and I
defined my own DrugOntologyConcept (subtype of OntologyConcept) to hold
that information.

Thanks,
Larry

-----Original Message-----
From: Masanz, James J. [mailto:[email protected]] 
Sent: Wednesday, August 07, 2013 9:59 AM
To: '[email protected]'
Subject: RE: LookupDesc_DrugNER.xml

There's a very brief description of the file on

https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+-+Dictionary+Lookup

"A LookupDescriptorFile such as lookup/LookupDesc.xml, found in
resources/, defines the dictionary(s) used, and the classes that
interact with the dictionary(s). The implementation tag identifies the
type of dictionary: Lucene index (luceneImpl), database (jdbcImpl), or
delimited flat file (csvImpl). See class
org.apache.ctakes.dictionary.lookup.ae.LookupParseUtilities.java for
implementation details."

There are a few comments within the file. But as far as the specifics of
the individual elements, if you describe what you'd like to do, I or
someone else on this list should be able to help.

-- James


From: [email protected]
[mailto:[email protected]] On
Behalf Of Kline, Larry D
Sent: Wednesday, August 07, 2013 11:45 AM
To: [email protected]
Subject: LookupDesc_DrugNER.xml

Can anyone tell me where I can find a description of the format of this
file?

