Hello,

Is anybody aware of a reliable way to identify the head word of a UMLS entity? In the general domain, people often use the Collins head rules, but I’m not sure whether they would be applicable to clinical entities.
Until recently I was under the impression that taking the last word of an entity would work pretty well, but now that I have looked at the data more closely, I am not so sure. For example, it fails in these cases: “breast, left”, “ductal carcinoma in situ”, “carcinoma, consistent with breast primary”.

Dima

Dmitriy (Dima) Dligach, Ph.D.
Boston Children's Hospital and Harvard Medical School
(617) 651-0397
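
P.S. To make the failure cases concrete, here is a minimal Python sketch of the last-word baseline next to one possible surface-level alternative. The alternative (keep the text before the first comma, cut at the first preposition, take the last remaining token) is only an assumption for illustration, not a validated method. The function names and the preposition list are hypothetical, and nothing here has been evaluated on UMLS data.

```python
# Hypothetical preposition list: an assumption, neither exhaustive nor tuned.
PREPOSITIONS = {"in", "of", "with", "on", "at", "to", "for", "from", "by"}


def last_word_head(entity: str) -> str:
    """Baseline: return the last whitespace-separated token, ignoring commas."""
    tokens = entity.replace(",", " ").split()
    return tokens[-1] if tokens else ""


def heuristic_head(entity: str) -> str:
    """Assumed heuristic: keep the text before the first comma, cut at the
    first non-initial preposition, then take the last remaining token."""
    segment = entity.split(",", 1)[0]
    tokens = segment.split()
    for i, tok in enumerate(tokens):
        if i > 0 and tok.lower() in PREPOSITIONS:
            tokens = tokens[:i]  # drop the prepositional phrase
            break
    return tokens[-1] if tokens else ""


examples = [
    "breast, left",
    "ductal carcinoma in situ",
    "carcinoma, consistent with breast primary",
]
for entity in examples:
    print(f"{entity!r}: last word = {last_word_head(entity)!r}, "
          f"heuristic = {heuristic_head(entity)!r}")
```

On these three entities the baseline picks “left”, “situ”, and “primary”, while the sketch picks “breast”, “carcinoma”, and “carcinoma”. It would still be easy to break, for instance on entities whose head follows a comma, so it is meant as a starting point for discussion rather than an answer.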
