Hi Jessica –

Many thanks for the insight. I see where this is going wrong. Yes, DOC and ADR 
are present in the text. However, DOC appears as “.doc”, which is a file 
extension and not a drug in the context of the text. Likewise, ADR appears in 
the document as an abbreviation for “Adverse Drug Reaction”.

I think the only way to exclude such words would be to pre-process the text 
before passing it to cTAKES. Is there any other way within cTAKES to achieve 
this (for example, passing a file of stop words that includes these 
abbreviations)?
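
The pre-processing idea above could be sketched in Python roughly as follows. This is only an illustration, not a cTAKES feature: the function name, the deny-list, and the ".doc"/".docx" pattern are assumptions made for the example. Each masked token is replaced with spaces of the same length, so the character offsets cTAKES reports for the remaining text still line up with the original document.

```python
import re

# Hypothetical deny-list: abbreviations that should not be read as drug names.
AMBIGUOUS_ABBREVIATIONS = ("ADR",)


def mask_ambiguous_tokens(text, abbreviations=AMBIGUOUS_ABBREVIATIONS):
    """Blank out file extensions and listed abbreviations, keeping text length."""

    def blank(match):
        # Same number of characters, so later offsets are unchanged.
        return " " * len(match.group(0))

    # File extensions such as ".doc" or ".docx" (but not ".document").
    masked = re.sub(r"\.docx?\b", blank, text, flags=re.IGNORECASE)
    # Whole-word, case-sensitive matches of each listed abbreviation.
    for abbreviation in abbreviations:
        masked = re.sub(r"\b" + re.escape(abbreviation) + r"\b", blank, masked)
    return masked
```

The masked text would be sent to cTAKES in place of the original, while the original is kept for display. Masking is case-sensitive for the abbreviations, so ordinary words that happen to share the letters in lower case are left alone.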

Thanks
Sekhar H.

From: Jessica Glover <glover.jessic...@gmail.com>
Sent: Wednesday, June 5, 2019 1:07 AM
To: user@ctakes.apache.org
Subject: Re: cTAKES output

Hi Sekhar,

Do you use the CAS Visual Debugger (CVD), or even a text editor that will show 
you the character positions of the document text?
I can see from your output that the evidence spans for each RxNorm code are 
annotated.

Code     Evidence span offsets
10311    474-481
3256     1152-1155
450530   576-584
217992   452-460
3639     2454-2457

Look in those places in your document to find out what language is triggering 
these codes.
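
If no offset-aware editor is at hand, the same lookup can be done with a few lines of Python. This is a minimal sketch: `show_span` is a made-up helper name, and it assumes the END offset is exclusive, which is consistent with the spans in this thread (for example 1152-1155 for the three letters "DOC").

```python
def show_span(text, start, end, context=30):
    """Return the covered text for [start, end) and a snippet with context."""
    covered = text[start:end]
    left = text[max(0, start - context):start]
    right = text[end:end + context]
    # Brackets mark the evidence span inside its surrounding text.
    return covered, f"{left}[{covered}]{right}"


# Toy document, not the one discussed in this thread.
document = "Please see the attached file summary.doc for details."
begin = document.index("doc")
covered, snippet = show_span(document, begin, begin + 3, context=10)
print(covered)
print(snippet)
```

Calling `show_span(document_text, 1152, 1155)` on the exact string that was posted to cTAKES would show what surrounds the "DOC" match.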

Other things to note:
A common abbreviation of Deoxycorticosterone is "DOC". I bolded where I see DOC 
and 3256 in your output. Similarly, "ADR" is another way to express 
Doxorubicin, and I've bolded that in your output as well. See below.


Hope this helps,
Jessica

On Tue, Jun 4, 2019 at 1:40 PM gandhi rajan <gandhiraja...@gmail.com> wrote:
Hi Sekhar,

To answer your first question: as far as I know, there is no configuration 
change that filters the output. You have to pick out the desired output 
yourself by parsing the output XML.
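
As a rough illustration of picking out only the RXNORM entries, the sketch below works on the response structure pasted in this thread rather than on XML. It assumes the response has already been parsed into nested Python dictionaries (annotation type, then covered text, then a list of attribute strings); `codes_for_scheme` and `CODE_PATTERN` are names invented for the example.

```python
import re

# Matches attribute strings such as
# "[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T109]".
CODE_PATTERN = re.compile(
    r"\[CODINGSCHEME: (?P<scheme>[^,]+), CODE: (?P<code>[^,]+), "
    r"CUI: (?P<cui>[^,]+), TUI: (?P<tui>[^\]]+)\]"
)


def codes_for_scheme(response, scheme="RXNORM"):
    """Map each covered text to its sorted, de-duplicated codes for one scheme."""
    found = {}
    for mentions in response.values():
        for covered_text, attributes in mentions.items():
            for attribute in attributes:
                match = CODE_PATTERN.fullmatch(attribute)
                # Entries like "START: 793" do not match and are skipped.
                if match and match.group("scheme") == scheme:
                    found.setdefault(covered_text, set()).add(match.group("code"))
    return {text: sorted(codes) for text, codes in found.items()}
```

Mentions that carry only SNOMEDCT_US codes, such as 'ORAL' or 'INJECTION' in the output above, simply do not appear in the result.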

On Tuesday, June 4, 2019, Hari, Sekhar <sekhar.h...@cgi.com> wrote:
Hi All –

I see something that is not correct in the cTAKES output for the text below. I 
sincerely hope somebody can guide me with my questions at the end. I am not 
sure whether I’m doing something wrong in the cTAKES configuration.

Content:
          “Since the last approved labeling, there has been no submission to
          LEVAQUIN® NDAs: NDA 20-634 LEVAQUIN® (levofloxacin) Tablets, NDA 20-635
          LEVAQUIN® (levofloxacin) Injection, NDA 21-721 LEVAQUIN®
          (levofloxacin) Oral Solution.”

There are several lines after this, but the only brand name mentioned in the 
whole document is ‘LEVAQUIN’ and the only generic name mentioned is 
‘levofloxacin’. These names appear in a couple of places in the document, and 
some disease names are mentioned too.

Objective:
Retrieve the generic name and brand name from the text using the RXNORM codes 
returned by cTAKES.

We do a POST of the full text to the API - 
http://XX.XX.XX.XX/ctakes-web-rest/service/analyze.
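
For reference, a call like the one described could look roughly like this in Python using only the standard library. Treat it as a sketch under assumptions: the plain-text request body, the `Content-Type` header, and the JSON response are guesses about how the service is set up, and both function names are made up for the example.

```python
import json
import urllib.request


def build_analyze_request(base_url, text):
    """Build (but do not send) a POST request carrying the document text."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/ctakes-web-rest/service/analyze",
        data=text.encode("utf-8"),  # assumption: raw text in the request body
        headers={"Content-Type": "text/plain; charset=utf-8"},  # assumption
        method="POST",
    )


def analyze(base_url, text, timeout=60):
    """Send the request and parse the response body as JSON (assumed format)."""
    request = build_analyze_request(base_url, text)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Here `base_url` stands for the scheme and host of the server, which is masked in this message.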

The following is the output from the API:
{'ANATOMICALSITEMENTION': {'ORAL': ['START: 793', 'END: 797', 'POLARITY: 1', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]']}, 
'MEDICATIONMENTION': {'TABLETS': ['START: 474', 'END: 481', 'POLARITY: 1', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 46992007, CUI: C0039225, TUI: T122]', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 385055001, CUI: C0039225, TUI: T122]', 
'[CODINGSCHEME: RXNORM, CODE: 10311, CUI: C0039225, TUI: T122]'], 'INJECTION': 
['START: 817', 'END: 826', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
129326001, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
28289002, CUI: C1533685, TUI: T061]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
59108006, CUI: C1533685, TUI: T061]'], 'DOC': ['START: 1152', 'END: 1155', 
'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T109]', 
'[CODINGSCHEME: RXNORM, CODE: 3256, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: 
SNOMEDCT_US, CODE: 56156001, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: 
SNOMEDCT_US, CODE: 1336006, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: 
SNOMEDCT_US, CODE: 1336006, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: 
SNOMEDCT_US, CODE: 75029008, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: 
RXNORM, CODE: 3256, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, 
CODE: 56156001, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
56156001, CUI: C0011710, TUI: T121]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
1336006, CUI: C0011710, TUI: T125]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
75029008, CUI: C0011710, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
75029008, CUI: C0011710, TUI: T121]'], 'SOLUTION': ['START: 576', 'END: 584', 
'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 450530, CUI: C1382100, TUI: 
T122]'], 'ORAL': ['START: 793', 'END: 797', 'POLARITY: 1', '[CODINGSCHEME: 
SNOMEDCT_US, CODE: 74262004, CUI: C0226896, TUI: T030]'], 'LEVAQUIN': ['START: 
452', 'END: 460', 'POLARITY: 1', '[CODINGSCHEME: RXNORM, CODE: 217992, CUI: 
C0721336, TUI: T121]', '[CODINGSCHEME: RXNORM, CODE: 217992, CUI: C0721336, 
TUI: T109]'], 'ADR': ['START: 2454', 'END: 2457', 'POLARITY: 1', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089, TUI: T195]', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 68444001, CUI: C0013089, TUI: T109]', 
'[CODINGSCHEME: RXNORM, CODE: 3639, CUI: C0013089, TUI: T195]', '[CODINGSCHEME: 
RXNORM, CODE: 3639, CUI: C0013089, TUI: T109]', '[CODINGSCHEME: SNOMEDCT_US, 
CODE: 372817009, CUI: C0013089, TUI: T195]', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
372817009, CUI: C0013089, TUI: T109]']}, 'DRUGCHANGESTATUSANNOTATION': {}, 
'STRENGTHANNOTATION': {}, 'FRACTIONSTRENGTHANNOTATION': {}, 
'FREQUENCYUNITANNOTATION': {}, 'DISEASEDISORDERMENTION': {'RUPTURE': ['START: 
1579', 'END: 1586', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
125671007, CUI: C3203359, TUI: T037]']}, 'SIGNSYMPTOMMENTION': {'RED': ['START: 
1081', 'END: 1084', 'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 
386713009, CUI: C0332575, TUI: T033]'], 'CONTENT': ['START: 2992', 'END: 2999', 
'POLARITY: 1', '[CODINGSCHEME: SNOMEDCT_US, CODE: 271599002, CUI: C0423896, 
TUI: T041]'], 'HISTORY': ['START: 34', 'END: 41', 'POLARITY: 1', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 392521001, CUI: C0262926, TUI: T033]']}, 
'ROUTEANNOTATION': {}, 'DATEANNOTATION': {}, 'MEASUREMENTANNOTATION': {}, 
'PROCEDUREMENTION': {'INJECTION': ['START: 817', 'END: 826', 'POLARITY: 1', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 129326001, CUI: C1533685, TUI: T061]', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 28289002, CUI: C1533685, TUI: T061]', 
'[CODINGSCHEME: SNOMEDCT_US, CODE: 59108006, CUI: C1533685, TUI: T061]']}, 
'TIMEMENTION': {}, 'STRENGTHUNITANNOTATION': {}}

Questions:

1. How do we restrict the output to show only the RXNORM coding scheme? 
   Please describe with an example config change, if possible.

2. These are the unique RXNORM codes from the above output: ‘10311’, 
   ‘3256’, ‘450530’, ‘217992’, ‘3639’. These codes map to the drug names 
   ‘DESOXYCORTICOSTERONE’, ‘LEVAQUIN’, ‘DOXORUBICIN’.

   a. The text does not mention anything about ‘DESOXYCORTICOSTERONE’ or 
      ‘DOXORUBICIN’. How is cTAKES reporting them?

   b. The text contains ‘levofloxacin’, yet no RXNORM code is returned for 
      this name. Any idea why?

3. How do we configure cTAKES so that it returns only those codes that are 
   available in the RxTerms dictionary? None of the RXNORM codes reported 
   above are available in RxTerms.
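
To make question 3 concrete, the restriction being asked for could be approximated outside cTAKES by filtering the returned codes against RxTerms. The sketch below is not a cTAKES setting. It assumes RxTerms is available as a pipe-delimited text file with a header row containing an `RXCUI` column; the function names and the two sample rows are invented for illustration.

```python
import csv
import io


def load_rxterms_rxcuis(lines, column="RXCUI"):
    """Collect the identifiers found in one column of a pipe-delimited file."""
    reader = csv.DictReader(lines, delimiter="|")
    return {row[column].strip() for row in reader if row.get(column)}


def keep_known_codes(codes, known_rxcuis):
    """Keep only the codes present in the RxTerms set, preserving order."""
    return [code for code in codes if code in known_rxcuis]


# Made-up rows standing in for a real RxTerms file.
toy_rxterms = io.StringIO("RXCUI|FULL_NAME\n111|Example Drug A\n222|Example Drug B\n")
known = load_rxterms_rxcuis(toy_rxterms)
print(keep_known_codes(["111", "3256", "222"], known))
```

With a real file, `open(path, newline="", encoding="utf-8")` could be passed in place of the `io.StringIO` object.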

Thanks
Sekhar H.



--
Regards,
Gandhi

"The best way to find urself is to lose urself in the service of others !!!"
