Hey Ted,

Thanks again for looking at this. Again, any and all advice is appreciated. Right now I'm taking a less-than-ideal approach: I query for similarity and, if there is none, fall back to a check for overlapping words. This means I can catch "Diabetes" and "type 2 diabetes", but it does require manual curation to make sure I'm not getting erroneous overlaps!
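For reference, the word-overlap back-up check can be sketched roughly as below. This is only a minimal illustration, not the exact code in use: the stopword list, the Jaccard score, and the 0.5 threshold are all assumptions chosen for the example, and the function names are hypothetical.

```python
import re

# Hypothetical stopword list for illustration only; tune it for real data.
STOPWORDS = {"of", "the", "and", "with", "type", "disease", "disorder", "syndrome", "nos"}


def content_words(term):
    """Lowercase a term and return its informative words as a set."""
    words = re.findall(r"[a-z0-9]+", term.lower())
    # Drop stopwords and bare numbers such as the "2" in "type 2 diabetes".
    return {w for w in words if w not in STOPWORDS and not w.isdigit()}


def word_overlap(term_a, term_b):
    """Jaccard overlap between the content words of two terms (0.0 to 1.0)."""
    a, b = content_words(term_a), content_words(term_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def is_fallback_match(term_a, term_b, threshold=0.5):
    """Back-up check: treat two terms as a match if enough words overlap."""
    return word_overlap(term_a, term_b) >= threshold
```

With these assumptions, "Diabetes" and "type 2 diabetes" count as a match, while "diabetes insipidus" and "diabetes mellitus" share only one of three content words and fall below the threshold. A threshold like this reduces erroneous overlaps but does not replace manual curation.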
Best,

On Sun, Sep 3, 2017 at 7:52 PM, Ted Pedersen duluth...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:

> Hi Jen,
>
> I'm going to answer your question in a few notes maybe over the next few days...so your first problem with finding a term is not unusual - sometimes small variations in a term can cause a lookup to fail. I usually try to work with CUIs as a result. In your example for whatever reason the (t2d) seems to be causing a problem - when I omit that things seem to work ok...
>
> More soon!
> Ted
>
> tpederse@maraca:~$ getAssociatedCuis.pl 'type 2 diabetes mellitus'
>
> UMLS-Interface Configuration Information:
> (Default Information - no config file)
>
> Sources (SAB):
> MSH
> Relations (REL):
> PAR
> CHD
>
> Sources (SABDEF):
> UMLS_ALL
> Relations (RELDEF):
> UMLS_ALL
>
> The CUIs associated with type 2 diabetes mellitus are:
> 1. C0011860
> tpederse@maraca:~$ getAssociatedCuis.pl 'type 2 diabetes mellitus (t2d)'
>
> UMLS-Interface Configuration Information:
> (Default Information - no config file)
>
> Sources (SAB):
> MSH
> Relations (REL):
> PAR
> CHD
>
> Sources (SABDEF):
> UMLS_ALL
> Relations (RELDEF):
> UMLS_ALL
>
> No CUIs are associated with type 2 diabetes mellitus (t2d).
> tpederse@maraca:~$ getAssociatedCuis.pl 'type 2 diabetes mellitus'
>
> UMLS-Interface Configuration Information:
> (Default Information - no config file)
>
> Sources (SAB):
> MSH
> Relations (REL):
> PAR
> CHD
>
> Sources (SABDEF):
> UMLS_ALL
> Relations (RELDEF):
> UMLS_ALL
>
> The CUIs associated with type 2 diabetes mellitus are:
> 1. C0011860
> tpederse@maraca:~$ getChildren.pl 'type 2 diabetes mellitus'
>
> UMLS-Interface Configuration Information:
> (Default Information - no config file)
>
> Sources (SAB):
> MSH
> Relations (REL):
> PAR
> CHD
>
> Sources (SABDEF):
> UMLS_ALL
> Relations (RELDEF):
> UMLS_ALL
>
> The children of type 2 diabetes mellitus (C0011860) are:
> lipoatrophic diabetes (C0011859)
> tpederse@maraca:~$ getChildren.pl 'type 2 diabetes mellitus (t2d)'
>
> UMLS-Interface Configuration Information:
> (Default Information - no config file)
>
> Sources (SAB):
> MSH
> Relations (REL):
> PAR
> CHD
>
> Sources (SABDEF):
> UMLS_ALL
> Relations (RELDEF):
> UMLS_ALL
>
> Input type 2 diabetes mellitus (t2d) does not exist in this view of the UMLS.
>
> On Thu, Aug 31, 2017 at 4:03 PM, Jennifer Wilson jen.wilson...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>
>> Hi Ted,
>>
>> You've been incredibly helpful on this whole project. Thanks for that.
>>
>> Part of my problem was that I was avoiding installing the full UMLS distribution because of storage, and then had a lot of trouble getting it downloaded, installed, and set up on my computer! But having access to the UMLS::Interface and UMLS::Similarity scripts is really helpful.
>>
>> I'm now at a more specific use of the UMLS in my project. I have a list of drugs with their disease indications (*most* map to a CUI term) and then for each drug I have predicted a list of diseases for which the drugs could be used. I want to know "how semantically similar" are my predictions to the original diseases (to check how often my algorithm is "right").
>>
>> There are a few bugs: for some reason, some diseases return an error that a CUI doesn't exist, even though I can find a CUI for the disease in my UMLS database:
>>
>> *getChildren.pl 'type 2 diabetes mellitus (t2d)'...Input type 2 diabetes mellitus (t2d) does not exist in this view of the UMLS.*
>>
>> I also don't always get a match between diseases that I think are semantically similar. Is this just a product of how the hierarchy works? I would like to capture these as matches if possible!
>>
>> *Hormone receptor positive malignant neoplasm of breast (C1562029) & Breast Carcinoma (C0678222) = -1*
>>
>> Also, because I am new to Perl, I have been using the --infile option (for instance with the getAssociatedCuis.pl script), piping the output to a text file and then using Python to awkwardly extract the disease -> CUI mappings. Is there a better way to do this? Ideally, I could implement a tiered system for checking matches where I look for an exact match, look for a matched parent or child term, and then as a last resort look for matched words (because of the breast cancer example).
>>
>> Any and all advice is much appreciated. Thank you!
>>
>> On Tue, Aug 22, 2017 at 5:49 AM, duluth...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>
>>> Hi Jen,
>>>
>>> Here are some ideas about using some of the commands in the installed version of UMLS::Similarity to find nearby branches of a given term or CUI. A lot more information about the different commands available can be found at:
>>>
>>> http://search.cpan.org/dist/UMLS-Interface/
>>>
>>> I'm not certain how useful this all will be, but wanted to let you know what I was thinking at least. Please feel free to follow up as needed.
>>>
>>> tpederse@maraca:~$ getAssociatedCuis.pl "Diabetes Mellitus"
>>>
>>> UMLS-Interface Configuration Information:
>>> (Default Information - no config file)
>>>
>>> Sources (SAB):
>>> MSH
>>> Relations (REL):
>>> PAR
>>> CHD
>>>
>>> Sources (SABDEF):
>>> UMLS_ALL
>>> Relations (RELDEF):
>>> UMLS_ALL
>>>
>>> The CUIs associated with Diabetes Mellitus are:
>>> 1. C0011849
>>>
>>> tpederse@maraca:~$ getChildren.pl C0011849
>>>
>>> UMLS-Interface Configuration Information:
>>> (Default Information - no config file)
>>>
>>> Sources (SAB):
>>> MSH
>>> Relations (REL):
>>> PAR
>>> CHD
>>>
>>> Sources (SABDEF):
>>> UMLS_ALL
>>> Relations (RELDEF):
>>> UMLS_ALL
>>>
>>> The children of Diabetes mellitus NOS (C0011849) are:
>>> leprechaunism (C0265344)
>>> experimental diabetes mellitus (C0011853)
>>> compl diabetes mellitus (C0342257)
>>> diabetes mellitus, sudden-onset (C0011854)
>>> pregnancy-induced diabetes (C0085207)
>>> states, prediabetic (C0362046)
>>> noninsulin-dependent diabetes mellitus (C0011860)
>>> acidoses, diabetic (C0011880)
>>>
>>> tpederse@maraca:~$ getParents.pl C0011849
>>>
>>> UMLS-Interface Configuration Information:
>>> (Default Information - no config file)
>>>
>>> Sources (SAB):
>>> MSH
>>> Relations (REL):
>>> PAR
>>> CHD
>>>
>>> Sources (SABDEF):
>>> UMLS_ALL
>>> Relations (RELDEF):
>>> UMLS_ALL
>>>
>>> The parents of Diabetes mellitus NOS (C0011849) are:
>>> endocrine system diseases(C0014130)
>>> metabolism disorder, glucose(C1257958)
>>>
>>> tpederse@maraca:~$ getAssociatedCuis.pl
>>> No term was specified on the command line
>>> Usage: getAssociatedCuis.pl [OPTIONS] TERM
>>> Type getAssociatedCuis.pl --help for help.
>>>
>>> tpederse@maraca:~$ getRelated.pl C0011849 RB
>>>
>>> UMLS-Interface Configuration Information:
>>> (Default Information - no config file)
>>>
>>> Sources (SAB):
>>> MSH
>>> Relations (REL):
>>> PAR
>>> CHD
>>>
>>> Sources (SABDEF):
>>> UMLS_ALL
>>> Relations (RELDEF):
>>> UMLS_ALL
>>>
>>> No CUIs are associated with diabetes mellitus (C0011849) given the relation (RB).
>>>
>>> tpederse@maraca:~$ getRelated.pl C0011849 RN
>>>
>>> UMLS-Interface Configuration Information:
>>> (Default Information - no config file)
>>>
>>> Sources (SAB):
>>> MSH
>>> Relations (REL):
>>> PAR
>>> CHD
>>>
>>> Sources (SABDEF):
>>> UMLS_ALL
>>> Relations (RELDEF):
>>> UMLS_ALL
>>>
>>> The related (RN) CUIs to diabetes mellitus (C0011849):
>>> premature aging, okamoto type (C2930860)
>>> lipoatrophy with diabetes, hepatic steatosis, cardiomyopathy, and leukomelanodermic papules (C2931057)
>>> feigenbaum bergeron richardson syndrome (C2931125)
>>> thiamine responsive megaloblastic anemia syndrome (C0342287)
>>> yorifuji okuno syndrome (C2931296)
>>> extrapyramidal disorder, progressive, with primary hypogonadism, mental retardation, and alopecia (C0342286)
>>> pancreatic beta cell agenesis with neonatal diabetes mellitus (C1838655)
>>> photomyoclonus, diabetes mellitus, deafness, nephropathy, and cerebral dysfunction (C1809475)
>>> furukawa takagi nakao syndrome (C2931765)
>>> diabetes mellitus, neonatal, with congenital hypothyroidism (C1857775)
>>> diabetes mellitus, transient neonatal, 2 (C1835887)
>>> developmental delay, epilepsy, and neonatal diabetes (C1853564)
>>> mitochondrial myopathy with diabetes (C1839028)
>>> diabetes mellitus, transient neonatal, 3 (C1864623)
>>> diabetes mellitus, insulin-resistant, with acanthosis nigricans (C0342278)
>>> maturity-onset diabetes of the young, type 7 (C1864839)
>>> mitchell-riley syndrome (C2748662)
>>> hyperproinsulinemia (C0342283)
>>> muscular atrophy, ataxia, retinitis pigmentosa, and diabetes mellitus (C0342281)
>>> 6q24-related transient neonatal diabetes mellitus (C3711391)
>>> diabetes mellitus, congenital autoimmune (C1857958)
>>> pancreatic and cerebellar agenesis (C1836780)
>>> stimmler syndrome (C1859965)
>>> atherosclerosis, premature, with deafness, nephropathy, diabetes mellitus, photomyoclonus, and degenerative neurologic disease (C1859596)
>>> diabetes insipidus and mellitus with optic atrophy and deafness, mitochondrial form (C1838782)
>>> diabetes mellitus, transient neonatal, 1 (C1832386)
>>> pancreatic hypoplasia, congenital, with diabetes mellitus and congenital heart disease (C1838780)
>>> lymphedema-distichiasis syndrome with renal disease and diabetes mellitus (C2675066)
>>> diabetes mellitus, permanent neonatal, with neurologic features (C1833102)
>>> diabetes mellitus, permanent, of infancy (C1833104)
>>> martinez frias syndrome (C1832443)
>>
>> --
>> Jennifer L. Wilson
>> Bioengineering, Stanford University
>> jen.wilson...@gmail.com / 703.969.3318

--
Jennifer L. Wilson
Bioengineering, Stanford University
jen.wilson...@gmail.com / 703.969.3318
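
On the question in this thread about extracting disease -> CUI mappings and checking matches in tiers, one possible approach is sketched below in Python. It is a sketch under assumptions, not part of UMLS::Interface: it only parses saved text in the single-term format pasted above ("The CUIs associated with ... are:" and "name (CUI)" lines), it assumes CUIs look like "C" plus seven digits, and every function name is hypothetical. Stripping a trailing parenthetical such as "(t2d)" before lookup follows Ted's observation that omitting it lets the lookup succeed.

```python
import re

# Assumption: CUIs are "C" followed by seven digits, as in the output above.
CUI_RE = re.compile(r"C\d{7}")
# Matches lines like "lipoatrophic diabetes (C0011859)" and
# "endocrine system diseases(C0014130)" (no space before the parenthesis).
NAME_CUI_RE = re.compile(r"^\s*(.+?)\s*\((C\d{7})\)\s*$")


def strip_parenthetical(term):
    """Drop a trailing parenthetical such as '(t2d)' before looking a term up."""
    return re.sub(r"\s*\([^)]*\)\s*$", "", term).strip()


def parse_associated_cuis(output):
    """Return the CUIs listed after 'The CUIs associated with ... are:'."""
    _, found, tail = output.partition("The CUIs associated with")
    return CUI_RE.findall(tail) if found else []


def parse_name_cui_lines(output):
    """Return {name: CUI} for each 'name (CUI)' line, e.g. children or parents."""
    pairs = {}
    for line in output.splitlines():
        match = NAME_CUI_RE.match(line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs


def match_tier(pred_cui, ref_cui, ref_parents=(), ref_children=()):
    """Tiered check: exact CUI match first, then a parent or child of the reference."""
    if pred_cui == ref_cui:
        return "exact"
    if pred_cui in ref_parents or pred_cui in ref_children:
        return "parent_or_child"
    return "none"  # a word-overlap check could serve as the last resort here
```

Header lines such as "The children of ... (C0011849) are:" do not end in a parenthesized CUI, so the line pattern skips them. Entries that the mail client wrapped across two lines would need to be rejoined before parsing.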