Wonderful explanation James, thank you!
JG
—
Sent from Mailbox for iPhone

On Thu, Jul 17, 2014 at 2:41 PM, Masanz, James J. <[email protected]> wrote:

> The order you mentioned in your previous email had been "pulsatile abdominal
> mass" for both what is in UMLS and what was in the text being annotated,
> which is why I was asking about the ordering.
> Given that I now know the text you were annotating did have a different word
> order than what is in umls, and seeing exactly what those orderings were,
> that explains why it was not being picked up.
> A quirk/feature of cTAKES (current) dictionary lookup (as opposed to the
> newer one called lookup-2) is that the first word must be first, but in a
> multi (>2) word entry, the order of the other words doesn't matter.
> So for example, with "abdominal pulsatile mass" in the dictionary, both of
> these should get annotated with the same cui
>    abdominal pulsatile mass
>    abdominal mass pulsatile
> but this will not get an annotation for that CUI
>    pulsatile abdominal mass
> unless that ordering is also in the dictionary.
> As far as heart rate and temperature, whether they are annotated as
> procedures all depends on if they show up in the UMLS with the semantic types
> used by cTAKES.
> To check those, I would do this
> - Open the UMLS terminology services Metathesaurus Browser app
>   https://uts.nlm.nih.gov/home.html
>   Applications->UTS Metathesaurus Browser
> - input the text of interest into the box in the left pane, and click Go
> - select the CUI that looks hopeful
> - the pane on the right will fill in with details about that Concept,
>   including the Semantic Types and all the Atoms.
> - look at the Semantic Types in the pane on the right
> - if not a semantic type that cTAKES annotates, select a different CUI
> - Once found a CUI with a semantic type cTAKES annotates, if the text of the
>   UMLS Concept itself is not exactly what I was looking for, look at all the
>   Atoms, and see if the text I was looking for appears with SNOMED_CT, NCI,
>   MSH, or ICD9CM.
> Note that cTAKES also uses normalized forms of the words in the text being
> processed, so if the input text were "lymph nodes" it would match a
> hypothetical dictionary entry of "lymph node".
> Also note that intervening words can be OK, up to a limit, but all words
> within the term must appear within a single LookupWindow.
> Hope that is helpful
> -- James
> -----Original Message-----
> From: John Green [mailto:[email protected]]
> Sent: Thursday, July 17, 2014 12:57 PM
> To: [email protected]
> Cc: [email protected]
> Subject: RE: Procedure
> I didn't see how it appeared in the dictionary, I just looked at the CUI in
> UMLS, which has it as abdominal pulsatile mass, which isn't the same order as
> the text I annotated in cTAKES (pulsatile abdominal mass); but if I'm wrong,
> great, it raises the question even more of why, if it was in the lookup
> window and in the dictionary, it was only annotated as abdominal mass.
> Apropos temperature and heart rate, the results of these are measurements,
> right? But it seems also that they should be procedures in the sense that you
> perform a physical manipulation on a pt. If I were checking notes for the
> presence of whether or not someone checked vitals vs obtaining the
> measurements, this seems within the current use case, but I'm so often wrong
> here being so new...
> JG
> —
> Sent from Mailbox for iPhone
> On Thu, Jul 17, 2014 at 1:44 PM, Masanz, James J. <[email protected]>
> wrote:
>> In general cTAKES doesn't pick up things with values, such as weight,
>> height, lab values, temperature, with the exception that the drug ner
>> pipeline can pick up medication related values such as dose, strength, etc.
>> cTAKES does pick up a few things as MeasurementAnnotation just by pattern,
>> but doesn't associate those with a named entity that has a cui.
>> The example of "pulsatile abdominal mass" listed the same 3 words in the
>> same order for the dictionary entry and the text that was processed, so I'm
>> not clear what you meant about word order.
>> -----Original Message-----
>> From: John Green [mailto:[email protected]]
>> Sent: Thursday, July 17, 2014 8:04 AM
>> To: [email protected]
>> Subject: Re: Procedure
>> General so that I don't keep generating work for others :-)
>> Specifically: Temperature wasn't annotated, neither was Heart rate, for
>> example.
>> Different but related: it picked up "abdominal mass" (C0000734) but not
>> "pulsatile abdominal mass" (C0266835) when given "pulsatile abdominal
>> mass". I understand that this may be expected given the word order. If it
>> wasn't, then the concern, of course, is that by clinical intuition abdominal
>> mass isn't very specific and one wouldn't jump to thinking AAA. However,
>> with pulsatile abdominal mass you would immediately think AAA. This delta
>> is fairly well reflected in ytex's semantic similarity measure
>> (particularly LCH) with the distance being 0.84 and 0.64 for abdominal mass
>> to pulsatile abdominal mass and Abdominal Aortic Aneurysm (C0162871)
>> respectively.
>> Pulsatile abdominal mass was in the lookup window.
>> JG
>> On Wed, Jul 16, 2014 at 3:07 PM, Masanz, James J. <[email protected]>
>> wrote:
>>>
>>> It depends on the type of annotation.
>>>
>>> Some are rule-based. Some are machine-learning based (models). Some are
>>> dictionary dependent. And some are based on annotations earlier in the
>>> pipeline, and so looking at the part of speech tags within the tokens, for
>>> example, can explain which chunk something appears in, which can explain
>>> why something might not have been annotated as a DiseaseDisorderMention,
>>> for example.
>>>
>>> Are you asking a general question or is there a specific type of
>>> annotation you are most interested in?
>>>
>>> -----Original Message-----
>>> From: John Green [mailto:[email protected]]
>>> Sent: Wednesday, July 16, 2014 2:01 PM
>>> To: [email protected]
>>> Subject: Procedure
>>>
>>> Is there a generally accepted procedure for identifying why an annotation
>>> wasn't made?
>>>
>>> JG
>>>
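
For anyone reading this thread later, here is a rough Python sketch of the word-order rule James describes for the current dictionary lookup. It is only an illustration, not cTAKES code. The function name entry_matches is hypothetical; tokens are simply lowercased and split on whitespace rather than normalized; the whole input string is treated as a single lookup window; and the limit on intervening words is not modeled. It also assumes the remaining words have to come after the first word, which is what the three examples in the thread suggest.

```python
from collections import Counter


def entry_matches(entry: str, window: str) -> bool:
    """Hypothetical helper: does `window` match dictionary `entry`?

    Rule sketched from the thread: the entry's first word must come first,
    and the entry's other words may follow it in any order.
    """
    entry_tokens = entry.lower().split()
    window_tokens = window.lower().split()
    if not entry_tokens:
        return False

    first = entry_tokens[0]
    rest = Counter(entry_tokens[1:])  # remaining words, order ignored

    for i, token in enumerate(window_tokens):
        if token != first:
            continue
        # Assumption: the other words must appear after the first word.
        following = Counter(window_tokens[i + 1:])
        if all(following[word] >= count for word, count in rest.items()):
            return True
    return False


if __name__ == "__main__":
    entry = "abdominal pulsatile mass"
    for text in ("abdominal pulsatile mass",
                 "abdominal mass pulsatile",
                 "pulsatile abdominal mass"):
        print(text, "->", entry_matches(entry, text))
```

Under these assumptions, the entry "abdominal mass" still matches the text "pulsatile abdominal mass", which is consistent with only C0000734 being annotated in the case discussed above.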
