Hi Bruce,
I would venture to say that this is neither expected nor desired.

Before you fix it (or in addition to a fix), try running with the new dictionary 
lookup. It behaves differently, and it will be the default dictionary lookup in 
future releases of cTakes, which makes fixes to the current module slightly less 
urgent.

Sean

From: Bruce Tietjen [mailto:bruce.tiet...@perfectsearchcorp.com]
Sent: Wednesday, October 08, 2014 11:38 AM
To: dev@ctakes.apache.org
Subject: Differences in MedicationMention annotations on subsequent processing 
runs


I have encountered a situation in which the cTakes clinical pipeline output 
differs between multiple runs on the same text with the same configuration.
The following snippet from a single document is sufficient to demonstrate the 
issue:

 a gentle curve going into. irrigated with Bacitracin.

The source of the difference is that the DictionaryLookupAnnotator uses a map 
to filter out duplicate annotations for a single document location:
    // used to prevent duplicate hits
    // key = hit begin,end key (java.lang.String)
    // val = Set of MetaDataHit objects
    private Map<String,Set<MetaDataHit>> iv_dupMap = new HashMap<>();

This map is shared by both the umls_ms_2011ab lookup and the 
umls_ms_2011an_rxnorm lookup.

If both dictionaries contain the same term, the order of dictionary lookup 
execution determines the output. If the rxnorm lookup runs first, then a 
MedicationMention annotation for Bacitracin appears in the final output. If the 
standard umls lookup runs first, then there is no MedicationMention annotation 
for Bacitracin.
I will attach the output from the subsequent runs. (Hopefully the attachment 
will make it through the system.)
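
To make the order dependence concrete, here is a minimal, self-contained sketch 
of the effect. It is not the DictionaryLookupAnnotator code: the class name, the 
span offsets, the short dictionary labels "rxnorm" and "umls", and the 
"EntityMention" placeholder type are all assumptions for illustration, and plain 
strings stand in for MetaDataHit objects.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative only: mimics one duplicate-filter map shared by two lookups.
public class DupMapOrderDemo {

    // Runs the (hypothetical) dictionaries in the given order against a single
    // term and returns the annotation types that survive duplicate filtering.
    public static List<String> run(List<String> dictionaryOrder) {
        // key = "begin,end" of the hit, val = hits already seen at that span
        Map<String, Set<String>> dupMap = new HashMap<>();
        List<String> annotations = new ArrayList<>();

        for (String dictionary : dictionaryOrder) {
            String spanKey = "43,53";   // assumed offsets for "Bacitracin"
            String hit = "bacitracin";  // same term found in both dictionaries

            Set<String> seen = dupMap.computeIfAbsent(spanKey, k -> new HashSet<>());
            // Set.add returns false when the hit is already present, so the
            // second dictionary's hit is dropped as a duplicate.
            if (seen.add(hit)) {
                annotations.add("rxnorm".equals(dictionary)
                        ? "MedicationMention"
                        : "EntityMention"); // placeholder for the non-medication type
            }
        }
        return annotations;
    }

    public static void main(String[] args) {
        System.out.println(run(Arrays.asList("rxnorm", "umls"))); // [MedicationMention]
        System.out.println(run(Arrays.asList("umls", "rxnorm"))); // [EntityMention]
    }
}
```

Whichever lookup reaches the span first claims it, so the MedicationMention 
survives only when the rxnorm lookup runs first.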

Is this expected behavior? If not, what would be the expected behavior?

Bruce Tietjen
Senior Software Engineer
Mobile: 801.634.1547
bruce.tiet...@imatsolutions.com<mailto:bruce.tiet...@imatsolutions.com>
