You could possibly use norm
<http://lexsrv2.nlm.nih.gov/LexSysGroup/Projects/lvg/2015/docs/userDoc/tools/norm.html>
to normalize the entity text strings. I can't vouch for its accuracy at
this point, though.
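
To make the failure cases from the original question concrete, here is a
rough sketch in Python. It does not use norm or any UMLS tool. The function
names, the preposition list, and the "cut at the first comma" rule are all
assumptions picked to cover the three quoted examples, so treat it as a
starting point for experiments and not as a validated head-finding method.

```python
# Hypothetical heuristic, not a validated method: the preposition list and
# the comma rule are assumptions chosen to cover the three examples from
# the original question.
PREPOSITIONS = {"in", "of", "with", "on", "at", "to", "for", "from", "by"}


def last_word_head(entity: str) -> str:
    """Baseline from the question: take the final whitespace-separated token."""
    tokens = entity.split()
    return tokens[-1] if tokens else ""


def heuristic_head(entity: str) -> str:
    """Drop material after a comma or preposition, then take the last word."""
    # Inverted forms such as "breast, left" put the head before the comma.
    core = entity.split(",", 1)[0]
    tokens = core.split()
    # Cut at the first preposition that is not the first token, so that
    # "ductal carcinoma in situ" is reduced to "ductal carcinoma".
    for i, tok in enumerate(tokens):
        if i > 0 and tok.lower() in PREPOSITIONS:
            tokens = tokens[:i]
            break
    return tokens[-1] if tokens else ""


if __name__ == "__main__":
    examples = [
        "breast, left",
        "ductal carcinoma in situ",
        "carcinoma, consistent with breast primary",
    ]
    for text in examples:
        print(f"{text!r}: last word = {last_word_head(text)!r}, "
              f"heuristic = {heuristic_head(text)!r}")
```

On those three strings the last-word baseline picks "left", "situ", and
"primary", while the sketch picks "breast", "carcinoma", and "carcinoma".
It will certainly get other entities wrong, so it would need checking
against real data.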

Jen Seale
Presidential Research Fellow, CUNY Graduate Center
512.705.4030


On Mon, Mar 2, 2015 at 11:29 AM, Dligach, Dmitriy <
[email protected]> wrote:

> Hello,
>
> Is anybody aware of a reliable way of identifying the head word of a UMLS
> entity? In the general domain, people often use Collins rules, but I'm not
> sure whether they would be applicable to clinical entities.
>
> Until recently I was under the impression that taking the last word of an
> entity would work pretty well, but now that I have looked at the data more
> closely, I am not so sure. E.g. it fails in these cases: "breast, left",
> "ductal carcinoma in situ", "carcinoma, consistent with breast primary".
>
> Dima
>
>
> Dmitriy (Dima) Dligach, Ph.D.
> Boston Children's Hospital and Harvard Medical School
> (617) 651-0397
