Hi Pei,
I figured out where the problem occurs, but I have not been able to work out
the cause or a solution. I had configured my own dictionary from the UMLS
knowledge sources. I created two tables in MySQL, one containing CUIs from the
SNOMEDCT source (umls_snomed_2015, for diseases, symptoms, etc.) and the other
containing CUIs from RXNORM (umls_rxNorm_2015, for medications). After a lot of
debugging with print statements, I found that in the lookup consumer
(UmlsToSnomedDbConsumerImpl), lookup hits are sometimes matched against the
valid TUIs for DICT_UMLS_MS and sometimes against the valid TUIs for
DICT_RXNORM_MS. I have attached the LookUpDesc_Db file for reference.
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<lookupSpecification>
<!-- Defines what dictionaries will be used in terms of implementation
specifics and metaField configuration. -->
<dictionaries>
<dictionary id="DICT_UMLS_MS" externalResourceKey="DbConnection"
caseSensitive="false">
<implementation>
<jdbcImpl tableName="umls_ms_2015"/>
</implementation>
<lookupField fieldName="fword"/>
<metaFields>
<metaField fieldName="cui"/>
<metaField fieldName="tui"/>
<metaField fieldName="text"/>
</metaFields>
</dictionary>
<dictionary id="DICT_RXNORM_MS" externalResourceKey="DbConnection"
caseSensitive="false">
<implementation>
<jdbcImpl tableName="umls_rxNorm_2015"/>
</implementation>
<lookupField fieldName="fword"/>
<metaFields>
<metaField fieldName="cui"/>
<metaField fieldName="tui"/>
<metaField fieldName="text"/>
</metaFields>
</dictionary>
</dictionaries>
<!-- Binds together the components necessary to perform the complete lookup
logic start to end. -->
<lookupBindings>
<lookupBinding>
<dictionaryRef idRef="DICT_UMLS_MS"/>
<lookupInitializer
className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
<properties>
<property key="textMetaFields" value="text"/>
<property key="maxPermutationLevel" value="7"/>
<!-- <property key="windowAnnotations"
value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
<property key="windowAnnotations"
value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
<property key="exclusionTags"
value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,IN,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
</properties>
</lookupInitializer>
<lookupConsumer
className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
<properties>
<property key="codingScheme" value="SNOMED"/>
<property key="cuiMetaField" value="cui"/>
<property key="tuiMetaField" value="tui"/>
<property key="textMetaField" value="text"/>
<property key="anatomicalSiteTuis"
value="T021,T022,T023,T024,T025,T026,T029,T030"/>
<property key="procedureTuis" value="T060,T061"/>
<property key="disorderTuis"
value="T019,T020,T037,T046,T047,T048,T049,T050,T190,T191"/>
<property key="findingTuis"
value="T033,T034,T040,T041,T042,T043,T044,T045,T056,T057,T184"/>
<property key="labTuis" value="T059,T116"/>
<property key="dbConnExtResrcKey" value="DbConnection"/>
<property key="mapPrepStmt" value="select code from umls_snomed_map where cui=?"/>
</properties>
</lookupConsumer>
</lookupBinding>
<lookupBinding>
<dictionaryRef idRef="DICT_RXNORM_MS"/>
<lookupInitializer
className="org.apache.ctakes.dictionary.lookup.ae.FirstTokenPermLookupInitializerImpl">
<properties>
<property key="textMetaFields" value="text"/>
<property key="maxPermutationLevel" value="7"/>
<!-- <property key="windowAnnotations"
value="org.apache.ctakes.typesystem.type.textspan.Sentence"/> -->
<property key="windowAnnotations"
value="org.apache.ctakes.typesystem.type.textspan.LookupWindowAnnotation"/>
<property key="exclusionTags"
value="VB,VBD,VBG,VBN,VBP,VBZ,CC,CD,DT,EX,IN,LS,MD,PDT,POS,PP,PP$,PRP,PRP$,RP,TO,WDT,WP,WPS,WRB"/>
</properties>
</lookupInitializer>
<lookupConsumer
className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
<properties>
<property key="codingScheme" value="RXNORM"/>
<property key="cuiMetaField" value="cui"/>
<property key="tuiMetaField" value="tui"/>
<property key="textMetaField" value="text"/>
<property key="medicationTuis"
value="T073,T103,T109,T110,T111,T115,T121,T122,T123,T130,T168,T192,T195,T197,T200,T203"/>
<property key="dbConnExtResrcKey" value="DbConnection"/>
<property key="mapPrepStmt" value="select code from umls_rxNorm_map where cui=?"/>
</properties>
</lookupConsumer>
</lookupBinding>
</lookupBindings>
</lookupSpecification>
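To double-check which TUIs each binding in the descriptor above actually
declares, the file can be parsed outside cTAKES. The following is only a
minimal sketch and not part of cTAKES: the function name tuis_per_dictionary
and the trimmed SAMPLE descriptor are hypothetical, and only the element and
attribute names are taken from the file above.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical stand-in for the descriptor (not the full file).
SAMPLE = """<lookupSpecification>
  <lookupBindings>
    <lookupBinding>
      <dictionaryRef idRef="DICT_UMLS_MS"/>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
        <properties>
          <property key="codingScheme" value="SNOMED"/>
          <property key="procedureTuis" value="T060,T061"/>
          <property key="labTuis" value="T059,T116"/>
        </properties>
      </lookupConsumer>
    </lookupBinding>
    <lookupBinding>
      <dictionaryRef idRef="DICT_RXNORM_MS"/>
      <lookupConsumer className="org.apache.ctakes.dictionary.lookup.ae.UmlsToSnomedDbConsumerImpl">
        <properties>
          <property key="codingScheme" value="RXNORM"/>
          <property key="medicationTuis" value="T073,T103
"/>
        </properties>
      </lookupConsumer>
    </lookupBinding>
  </lookupBindings>
</lookupSpecification>"""


def tuis_per_dictionary(xml_text):
    """Map each dictionaryRef idRef to the set of TUIs declared in its binding."""
    root = ET.fromstring(xml_text)
    result = {}
    for binding in root.iter("lookupBinding"):
        dict_id = binding.find("dictionaryRef").get("idRef")
        tuis = set()
        for prop in binding.iter("property"):
            # Only properties whose key ends in "Tuis" hold TUI lists.
            if prop.get("key", "").endswith("Tuis"):
                # strip() removes whitespace left over from wrapped attribute values.
                for tui in prop.get("value", "").split(","):
                    if tui.strip():
                        tuis.add(tui.strip())
        result[dict_id] = tuis
    return result


if __name__ == "__main__":
    for dict_id, tuis in tuis_per_dictionary(SAMPLE).items():
        print(dict_id, sorted(tuis))
```

If the two sets printed for DICT_UMLS_MS and DICT_RXNORM_MS are what was
intended, the descriptor itself is probably not the source of the mix-up, and
the consumer code is the next place to look.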
Regards,
Prashasti Agrawal | Data Engineer | Noida INDIA | GMT +5:30 hours
Mobile +91 9818812484 | prashasti.agrawal@wincere.com | www.wincere.com
DISCLAIMER: This electronic transmission is governed by Wincere Inc. Any views
or opinions expressed in this email are solely those of the author and do not
necessarily reflect the opinions of Wincere Inc. If you have received this
email in error, please delete all copies from your system and notify the sender
or contact us at: +1 855 855 2946 or [email protected].
________________________________
From: Chen, Pei <[email protected]>
Sent: Friday, July 24, 2015 12:11 AM
To: [email protected]
Subject: RE: Inconsistent IdentifiedAnnotation in different runs
By any chance,
are you running this in multi-threaded mode within the same JVM? And do you
have LVG included in the pipeline?
I vaguely recall there was some non-thread-safe code in the LVG component
(I don't recall whether the fix made it into the latest release yet).
If you are still seeing this behavior, would you be able to help recreate it
with sample/dummy examples that could be shared? In particular, the output XMI
files?
--Pei
From: Prashasti Agrawal [mailto:[email protected]]
Sent: Thursday, July 23, 2015 5:05 AM
To: [email protected]
Subject: Inconsistent IdentifiedAnnotation in different runs
Hi,
I am running the AggregatePlainTextUMLSProcessor analysis engine on an EMR
document. I have added some modules, such as drug NER and the template filler,
to the pipeline. I am getting different IdentifiedAnnotations in different runs
on the same document (for example, 8 DiseaseDisorderMentions in one run and 15
in another).
I am unable to understand why this is so. What am I missing here?
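One way to quantify the difference between two runs is to count the
annotation elements in each output XMI file and compare the totals. The
following is a rough sketch under stated assumptions, not a cTAKES utility:
the helper names count_annotations and diff_counts are hypothetical, and the
two inline XMI snippets (including the namespace URI) are made-up stand-ins
for real output files.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, heavily shortened XMI outputs from two runs on the same document.
RUN_A = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">'
    '<textsem:DiseaseDisorderMention xmi:id="1"/>'
    '<textsem:DiseaseDisorderMention xmi:id="2"/>'
    '<textsem:MedicationMention xmi:id="3"/>'
    '</xmi:XMI>'
)
RUN_B = (
    '<xmi:XMI xmlns:xmi="http://www.omg.org/XMI" '
    'xmlns:textsem="http:///org/apache/ctakes/typesystem/type/textsem.ecore">'
    '<textsem:DiseaseDisorderMention xmi:id="1"/>'
    '<textsem:MedicationMention xmi:id="2"/>'
    '</xmi:XMI>'
)


def count_annotations(xmi_text):
    """Count the top-level elements of an XMI document by local tag name."""
    root = ET.fromstring(xmi_text)
    counts = Counter()
    for elem in root:
        # ElementTree tags look like "{namespace}LocalName"; keep the local part.
        counts[elem.tag.rsplit("}", 1)[-1]] += 1
    return counts


def diff_counts(run_a, run_b):
    """Return {type: (count_a, count_b)} for every type whose counts differ."""
    keys = set(run_a) | set(run_b)
    return {k: (run_a[k], run_b[k]) for k in sorted(keys) if run_a[k] != run_b[k]}


if __name__ == "__main__":
    print(diff_counts(count_annotations(RUN_A), count_annotations(RUN_B)))
```

With real files, the text of each XMI file would be read from disk and passed
to count_annotations in place of the inline samples.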
Regards,
Prashasti Agrawal | Data Engineer | Noida INDIA | GMT +5:30 hours