cTAKES\resources\drugnerresources\lookup\LookupDesc_DrugNER.xml (or a similar
file) has the maxPermutationLevel setting under the lookup bindings. As shown
here, the value used is '3':
<lookupInitializer
    className="edu.mayo.bmi.uima.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="preferred_designation|other_designation"/>
    <property key="maxPermutationLevel" value="3"/>
    <property key="windowAnnotations"
        value="edu.mayo.bmi.uima.lookup.type.DrugLookupWindowAnnotation"/>
    <property key="exclusionTags"
        value="CC,CD,DT,EX,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
    <!-- ohnlp ID# 3378705 -->
  </properties>
</lookupInitializer>
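For anyone who wants to check the value in their own copy of the descriptor, here is a minimal sketch in Python that reads the property keys out of a fragment like the one above. The helper name read_properties is hypothetical, and the snippet only parses the fragment quoted in this message, not the full descriptor file.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the descriptor quoted above (shortened to two properties).
SNIPPET = """
<lookupInitializer
    className="edu.mayo.bmi.uima.lookup.ae.FirstTokenPermLookupInitializerImpl">
  <properties>
    <property key="textMetaFields" value="preferred_designation|other_designation"/>
    <property key="maxPermutationLevel" value="3"/>
  </properties>
</lookupInitializer>
"""

def read_properties(xml_text):
    """Return a dict mapping each <property> key to its value (hypothetical helper)."""
    root = ET.fromstring(xml_text.strip())
    return {p.get("key"): p.get("value") for p in root.iter("property")}

props = read_properties(SNIPPET)
max_level = int(props["maxPermutationLevel"])
print(max_level)  # 3
```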
From: [email protected]
[mailto:[email protected]] On Behalf Of
Kannan Thiagarajan
Sent: Monday, April 29, 2013 3:00 PM
To: [email protected]
Subject: Re: DrugAggregateUMLSPlainTextProcessor related question
Hello Sean,
Thanks for the response.
Just for my own understanding, do you know how many permutations it's currently
limited to, and where I might see that in the code?
Best Regards
Kannan
On Mon, Apr 29, 2013 at 9:04 AM, Murphy, Sean P. [RO BIT]
<[email protected]<mailto:[email protected]>> wrote:
Hello Kannan,
The issue is mainly due to how cTAKES handles permutations.
The overhead required to handle, say, 7 or more permutations was not found
to give a good return, even if there was a corresponding RXCONSO entry.
Additionally, unless the extracted text represents the
normalized form according to RxNorm, the resulting named entity would be
missed.
So for the example below, if Lexapro had a corresponding entry
for 'Lexapro 10 MG', then the pipeline would have discovered the entity.
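To give a rough sense of the overhead, the short Python sketch below counts how many orderings exist for a given number of tokens. This is plain combinatorics and not the cTAKES implementation; the function name count_orderings is hypothetical, and how cTAKES actually enumerates or prunes candidates is not shown here.

```python
from itertools import permutations

def count_orderings(level):
    """Count the distinct orderings of `level` tokens (this equals level!)."""
    return sum(1 for _ in permutations(range(level)))

for level in (3, 5, 7):
    print(level, count_orderings(level))
# 3 6
# 5 120
# 7 5040
```

Going from 3 to 7 tokens multiplies the number of orderings by 840, which is consistent with the point that deeper permutation levels cost a lot for little return.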
Thanks,
~Sean
From:
[email protected]<mailto:[email protected]>
[mailto:user-return-173-Murphy.Sean<mailto:user-return-173-Murphy.Sean>[email protected]<mailto:[email protected]>]
On Behalf Of Kannan Thiagarajan
Sent: Monday, April 29, 2013 7:52 AM
To: [email protected]<mailto:[email protected]>
Subject: DrugAggregateUMLSPlainTextProcessor related question
Hello,
I'm trying to understand the named entity recognition aspect of cTAKES.
If I pass in text such as the following:
Lexapro 10 mg oral tablet 3 times a day
cTAKES finds a single MedicationEventMention with the RxNorm code = 352741.
However looking in the RXCONSO database, I see that there is one specific entry
for the 10 mg.
352741|ENG||||||1551887|1551887|352741||RXNORM|BN|352741|Lexapro||N|4096|
352272|ENG||||||1937400|1937400|352272||RXNORM|SY|352272|Lexapro 10 MG Oral Tablet||N|4096|
But cTAKES always falls back to the first entry (the one without 10 mg).
I did, however, notice that in certain cases it finds two annotations. For example:
Aspirin 325 mg two times a day
comes up with two annotations: Aspirin 325 mg (code 317300) and Aspirin (code
1191).
317300|ENG||||||1481682|1481682|317300||RXNORM|SCDC|317300|Aspirin 325 MG||N|4096|
1191|ENG||||||2596464|2596464|1191||MTHSPL|SU|R16CO5Y76E|Aspirin||N|4096|
Any thoughts as to why there might be a difference in the lookup?
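To make the comparison concrete, here is a minimal Python sketch of a dictionary lookup over the four RXCONSO rows quoted above. It is a toy illustration and not the cTAKES algorithm: the helper names build_index and prefix_matches are hypothetical, matching is a simple case-insensitive prefix match, and the max_tokens cap of 3 is an assumption chosen for illustration (how maxPermutationLevel actually bounds candidate length in cTAKES is not confirmed here). The column positions assume the pipe-delimited layout of the rows as pasted, with the concept ID first and the string in the 15th field.

```python
# Rows copied from the RXCONSO excerpts in this message.
ROWS = [
    "352741|ENG||||||1551887|1551887|352741||RXNORM|BN|352741|Lexapro||N|4096|",
    "352272|ENG||||||1937400|1937400|352272||RXNORM|SY|352272|Lexapro 10 MG Oral Tablet||N|4096|",
    "317300|ENG||||||1481682|1481682|317300||RXNORM|SCDC|317300|Aspirin 325 MG||N|4096|",
    "1191|ENG||||||2596464|2596464|1191||MTHSPL|SU|R16CO5Y76E|Aspirin||N|4096|",
]
RXCUI, STR = 0, 14  # field positions in the pasted rows

def build_index(rows):
    """Map each lower-cased dictionary string to its concept ID."""
    index = {}
    for row in rows:
        fields = row.split("|")
        index[fields[STR].lower()] = fields[RXCUI]
    return index

def prefix_matches(text, index, max_tokens=3):
    """Return (string, code) for every prefix of up to max_tokens tokens found in the index.

    max_tokens is an illustrative cap, not a documented cTAKES parameter.
    """
    tokens = text.lower().split()
    hits = []
    for n in range(1, min(max_tokens, len(tokens)) + 1):
        candidate = " ".join(tokens[:n])
        if candidate in index:
            hits.append((candidate, index[candidate]))
    return hits

index = build_index(ROWS)
print(prefix_matches("Lexapro 10 mg oral tablet 3 times a day", index))
print(prefix_matches("Aspirin 325 mg two times a day", index))
```

Under these assumptions the Lexapro text yields only the brand-name entry, because 'Lexapro 10 MG' has no row of its own and the five-token entry lies beyond the cap, while the Aspirin text yields both 'Aspirin' and 'Aspirin 325 MG'.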
Thanks
--
Best Regards
Kannan Thiagarajan