Excellent answers Tim!

> Is there a way to link all this term to x-ray without have to modify
> fast dictionary for every x-ray entries?

Unfortunately no. The dictionary can only match (give or take) the data that it contains. I have long had an idea on how to improve this by trickery, but I'm not sure how well it would pan out in the end ... Plus there is the investment of time in implementation.

Anyway, you would need to add every text that you would like to match to a dictionary. This doesn't need to be the hsqldb dictionary. cTAKES can also read plain-text files as dictionary sources. But it requires a certain amount of crystal ball prediction on your part, as you have to provide every permutation of a term that isn't already in the hsqldb dictionary.

If you are interested in anatomic locations then I would try what Tim suggested and at the end of your piper file add:

    load RelationSubPipe

That should add location relations (and degree-of) to your pipeline and would be easier than trying to rely on the dictionary to pick up every nuance.

Sean

-----Original Message-----
From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
Sent: Wednesday, November 29, 2017 9:48 AM
To: dev@ctakes.apache.org
Subject: Re: exact match to CUI_TERM table question. [EXTERNAL] [SUSPICIOUS]

On Wed, 2017-11-29 at 09:36 -0500, Kathy Ferro wrote:
> Good Morning,
>
> 1. I have a term for x-ray that has different spelling such as x.ray,
> x.rays, xray, xrays, etc...
> I see several files in
> resources\org\apache\ctakes\assertion\semantic_classes
> folder.
> I created x-ray.txt with all the terms above and hoping it will do the
> trick. No luck.
> Is there a way to link all this term to x-ray without have to modify
> fast dictionary for every x-ray entries?

No, these files are not for the dictionary lookup and will not add concepts to the CAS.

>
> 2. This might not have solution, but I'll ask anyway. Looks like the
> terms has to be exact match to terms in cut_terms table. Example
> document has "x-ray right elbow" or "elbow x-ray". In the dictionary,
> I have "x-ray of elbow" and "x-ray of the elbow".
> Is there a way to pick up both of entries in the dictionary without
> using black box (list)? The term "left" and "right" might be important
> in some instance.
>

How much is found really depends on the granularity of the source resource (UMLS/SNOMED) and whatever tricks Sean's import tool applies. UMLS often represents relations as concepts (elbow x-ray is in there). But as the modifiers get added it sometimes is easier to model as relations. For example, if you can detect "left" as a modifier, "elbow" as AnatomicalSite, and "x-ray" as procedure, then a relation extractor should find that "left" modifies "elbow" and "x-ray" modifies "elbow," giving a complete picture. cTAKES can do relations between anatomical sites and other arguments, but I don't know if the default release does body side (left, right).

> 3. This sample is kinda related to #2. Document has term "diabetes"
> in one sentence. Down several pages, it has more specific term such as
> "retinopathy" and "controlled with insulin".
> What is the best way to handle this? Do you suggest I add
> "'retinopathy".
> Does cTakes has term dependency?
>
> It picks up. (E08-E13) is wide range of codes.
> PREFTERM VALUES(11849,'Diabetes Mellitus').
> ICD10CM VALUES(11849,'E08-E13').
> PREFTERM VALUES(11860,'Diabetes Mellitus, Non-Insulin-Dependent')
> ICD10CM VALUES(11849,'E11').
>
> I should also have pick up these, but didn't because of the exact
> match.
> INSERT INTO CUI_TERMS VALUES(11884,0,3,'retinopathy ; diabetic','retinopathy')
> INSERT INTO CUI_TERMS VALUES(11884,3,6,'retina abnormal - diabet - relat','diabet')
> INSERT INTO CUI_TERMS VALUES(11884,1,2,'diabetic retinopathy','retinopathy')
> INSERT INTO CUI_TERMS VALUES(11884,0,2,'retinopathy diabetic','retinopathy')
>
>
> Snip of Sample text:
> chief complaint: Patient came in complaining of having chest pain.
> Procedure: chest xrays.
> Problems:
> Type 2 diabetes
> depression
> retinopathy
> patient controlled with insulin.
>

It should definitely get "retinopathy" since that's in SNOMED. The first thing I check when the dictionary misses something is whether the linguistic annotations around it are correct (sentence, token, part of speech).

> Sincerely appreciated you help.
> Kathy
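
As a rough illustration of the "provide every permutation" point above, the spelling variants for a term such as x-ray could be generated with a script rather than typed by hand. This is only a sketch under assumptions: the CUI below is a made-up placeholder, the function names are hypothetical, and the pipe-separated `CUI|text` layout is an assumed shape for a plain-text dictionary source, so check the format your cTAKES version actually expects before using the output.

```python
from itertools import product


def spelling_variants(prefix, stem, separators=("-", ".", " ", ""), suffixes=("", "s")):
    """Return every prefix + separator + stem + suffix spelling, lower-cased and sorted."""
    return sorted(
        {f"{prefix}{sep}{stem}{suffix}".lower() for sep, suffix in product(separators, suffixes)}
    )


def to_dictionary_lines(cui, terms):
    """Format each term as an assumed pipe-separated 'CUI|text' dictionary line."""
    return [f"{cui}|{term}" for term in terms]


# Placeholder only: substitute the real CUI for the concept you want to match.
PLACEHOLDER_CUI = "C0000000"

variants = spelling_variants("x", "ray")
lines = to_dictionary_lines(PLACEHOLDER_CUI, variants)

if __name__ == "__main__":
    print("\n".join(lines))
```

This covers only separator and plural variation; word-order variants such as "elbow x-ray" versus "x-ray of the elbow" would still need their own entries, or the relation-based approach described in the thread.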