Hi Jen,

There isn't a way to do this kind of query via the web interface, but it would be possible using the command line interface if you have UMLS::Similarity installed locally. I will try to show a short example of that in the next few days, but wanted to give you at least some preliminary ideas.
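As a rough illustration of the idea only: if the PAR/CHD links for a source are available locally, one can walk outward from a single concept a few steps at a time rather than scoring it against every disease. The sketch below is plain Python, not UMLS::Similarity code. The tiny hierarchy, its labels (placeholders, not real CUIs), and the `concepts_within` helper are all made up for the example.

```python
from collections import deque

# Hypothetical toy hierarchy: child -> list of parents (PAR links).
# The labels are placeholders for illustration, not real CUIs.
PARENTS = {
    "root": [],
    "disease": ["root"],
    "metabolic": ["disease"],
    "diabetes": ["metabolic"],
    "diabetes_type1": ["diabetes"],
    "diabetes_type2": ["diabetes"],
    "mental": ["disease"],
    "depression": ["mental"],
}


def build_neighbors(parents):
    """Turn child -> parents into an undirected neighbor map (PAR and CHD)."""
    neighbors = {node: set() for node in parents}
    for child, pars in parents.items():
        for par in pars:
            neighbors[child].add(par)
            neighbors[par].add(child)
    return neighbors


def concepts_within(start, max_edges, parents=PARENTS):
    """Breadth-first search: return {concept: edges from start} for every
    concept reachable in at most max_edges PAR/CHD steps."""
    neighbors = build_neighbors(parents)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if dist[node] == max_edges:
            continue  # do not expand beyond the requested radius
        for nxt in neighbors[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


if __name__ == "__main__":
    # List everything within two PAR/CHD steps of the starting concept.
    for concept, edges in sorted(concepts_within("diabetes", 2).items()):
        print(concept, edges)
```

With a radius of 2 the search stays inside the local branch around the starting concept and never touches the unrelated "mental" branch, which is the point: only the neighborhood is visited, not the full list of diseases. Only the concepts found this way would then need to be scored with a similarity measure.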
More soon,
Ted

On Thu, Aug 17, 2017 at 9:51 AM, Jennifer Wilson jen.wilson...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:

> Hi Ted,
>
> Thanks for your answer. On a related note - is there any way to query a
> local branch? Say that I have Diabetes Mellitus - can I find branches that
> are close to that disease without exhaustively searching all diseases?
>
> Thanks again,
>
> On Wed, Aug 16, 2017 at 7:50 AM, Ted Pedersen duluth...@gmail.com
> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>
>> Hi Jen,
>>
>> A great question, but unfortunately we do not have any pre-computed files
>> of distances. This would be a good thing to have available but we just
>> haven't done that. I'm not aware of anyone else who has done that, but I'll
>> ask (via this email). If anyone has done that and is able to share, that
>> would be quite helpful I think.
>>
>> Good luck,
>> Ted
>>
>> On Tue, Aug 15, 2017 at 7:04 PM, Jennifer Wilson jen.wilson...@gmail.com
>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>
>>> Hi Ted,
>>>
>>> I'm reviving this old email thread since this work is becoming relevant
>>> to my project again. I realized I never asked - do you have any flat files
>>> of disease distances that are pre-calculated?
>>>
>>> I'm looking to cluster down my list of MeSH-termed diseases (all pulled
>>> from DisGeNet) into groups of related diseases. For instance, I might want
>>> to clump 'Diabetes Mellitus' and 'Diabetes Mellitus, Non-Insulin Dependent'
>>> and have a separate group for things such as 'Depressive Symptoms' and
>>> 'Depressive Episodes'. Do you have an easy way to create these clusters?
>>>
>>> Thank you again for your help!
>>>
>>> On Mon, Jun 5, 2017 at 5:50 PM, Ted Pedersen duluth...@gmail.com
>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>
>>>> When I am just trying to get a sense of a measure or test out something
>>>> we've added, I often tend to use FMA / Foundational Model of Anatomy as my
>>>> source. This is because it includes some fairly intuitive terms and is
>>>> structured in a hierarchical fashion, so similarity measures like path and
>>>> wup work fairly nicely. I tend to prefer wup over path since wup includes a
>>>> kind of correction for the depth of the concepts involved, but at this
>>>> point that might be a finer point. But, below are some examples of
>>>> intuitive results which I think make some sense at least, and might be a
>>>> good starting point for exploring.
>>>>
>>>> tpederse@maraca:~$ query-umls-similarity-webinterface.pl --sab FMA --measure wup femur skull
>>>> Default Settings:
>>>> --default http://atlas.ahc.umn.edu/
>>>> --rel PAR/CHD
>>>> User Settings:
>>>> --measure wup
>>>>
>>>> 0.8<>femur(C0015811)<>skull(C0037303)
>>>>
>>>> tpederse@maraca:~$ query-umls-similarity-webinterface.pl --sab FMA --measure wup femur bone
>>>> Default Settings:
>>>> --default http://atlas.ahc.umn.edu/
>>>> --rel PAR/CHD
>>>> User Settings:
>>>> --measure wup
>>>>
>>>> 0.8333<>femur(C0015811)<>bone(C0262950)
>>>>
>>>> tpederse@maraca:~$ query-umls-similarity-webinterface.pl --sab FMA --measure wup skull bone
>>>> Default Settings:
>>>> --default http://atlas.ahc.umn.edu/
>>>> --rel PAR/CHD
>>>> User Settings:
>>>> --measure wup
>>>>
>>>> 0.8696<>skull(C0037303)<>bone(C0262950)
>>>>
>>>> tpederse@maraca:~$ query-umls-similarity-webinterface.pl --sab FMA --measure wup finger hand
>>>> Default Settings:
>>>> --default http://atlas.ahc.umn.edu/
>>>> --rel PAR/CHD
>>>> User Settings:
>>>> --measure wup
>>>>
>>>> 0.6923<>finger(C0016129)<>hand(C0018563)
>>>>
>>>> tpederse@maraca:~$ query-umls-similarity-webinterface.pl --sab FMA --measure wup toe foot
>>>> Default Settings:
>>>> --default http://atlas.ahc.umn.edu/
>>>> --rel PAR/CHD
>>>> User Settings:
>>>> --measure wup
>>>>
>>>> 0.6923<>toe(C0040357)<>foot(C0016504)
>>>>
>>>> On Mon, Jun 5, 2017 at 7:30 PM, Ted Pedersen <duluth...@gmail.com> wrote:
>>>>
>>>>> Hi Jen,
>>>>>
>>>>> I looked at those particular CUIs and don't think they are in MSH or
>>>>> SNOMEDCT - that's why you are getting the -1 even though one would imagine
>>>>> there is some similarity between them. To find some other examples using
>>>>> Alzheimer's I used UTS Metathesaurus to look up CUIs in MSH that included
>>>>> the term Alzheimer's (and 9 were found in MSH).
>>>>>
>>>>> I took 2 of those and ran them with path and got -1, indicating no
>>>>> path found. However, when I used lesk or vector I found non-zero values.
>>>>> Lesk and vector are both based on comparing the definitions of two CUIs
>>>>> and do not rely on finding paths.
>>>>>
>>>>> tpederse@maraca:~$ perl query-umls-similarity-webinterface.pl C0002395 C0299337 --measure vector --sab MSH
>>>>> Default Settings:
>>>>> --default http://atlas.ahc.umn.edu/
>>>>> --rel CUI/PAR/CHD/RB/RN
>>>>> User Settings:
>>>>> --measure vector
>>>>>
>>>>> 0.3131<>Disease, Alzheimer's(C0002395)<>familial Alzheimer's disease protein 1(C0299337)
>>>>>
>>>>> tpederse@maraca:~$ perl query-umls-similarity-webinterface.pl C0002395 C0299337 --measure lesk --sab MSH
>>>>> Default Settings:
>>>>> --default http://atlas.ahc.umn.edu/
>>>>> --rel CUI/PAR/CHD/RB/RN
>>>>> User Settings:
>>>>> --measure lesk
>>>>>
>>>>> 19<>Disease, Alzheimer's(C0002395)<>familial Alzheimer's disease protein 1(C0299337)
>>>>>
>>>>> So, the tricky part is sometimes the coverage in different sources -
>>>>> two CUIs might be intuitively similar but simply not found in the source
>>>>> being used (or no path between them may exist) so will show a -1 value.
>>>>>
>>>>> I'm not sure this exactly answers your question, but I will think a
>>>>> little more and add what I can...
>>>>>
>>>>> More soon,
>>>>> Ted
>>>>>
>>>>> On Mon, Jun 5, 2017 at 5:41 PM, Jennifer Wilson jen.wilson...@gmail.com
>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>
>>>>>> Hey Ted,
>>>>>>
>>>>>> So I haven't quite figured out the MetaMap, but I have a set of
>>>>>> diseases that I mapped to CUIs another way. I'm still getting negative
>>>>>> results with diseases that I think should be "similar". For example:
>>>>>>
>>>>>> ./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "C1864828" "C3810041"
>>>>>> Default Settings:
>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>> --measure path
>>>>>> User Settings:
>>>>>> --rel PAR/CHD
>>>>>>
>>>>>> ["b'-1", 'ALZHEIMER DISEASE 10(C1864828)', "ALZHEIMER DISEASE 18(C3810041)\\n'"]
>>>>>>
>>>>>> You can see my results on the last row. Could you advise: would you
>>>>>> expect that these two CUIs would not be similar? I wanted to measure path
>>>>>> as a simple starting point, but could you recommend another distance
>>>>>> that might be more informative? Thanks again for your help!
>>>>>>
>>>>>> On Mon, Jun 5, 2017 at 1:43 PM, Jennifer Wilson <jen.wilson...@gmail.com> wrote:
>>>>>>
>>>>>>> Hey Ted,
>>>>>>>
>>>>>>> Thanks for all of the help. I found the interactive interface really
>>>>>>> helpful and had been able to create inputs similar to what you shared. I
>>>>>>> have an open help ticket now on trying to get the file to download. He
>>>>>>> gave me some commands to try that I had already tried, so there must be
>>>>>>> something else to unzipping the code...
>>>>>>>
>>>>>>> Thanks again. Hopefully I'm close to a solution!
>>>>>>>
>>>>>>> On Mon, Jun 5, 2017 at 11:21 AM, Ted Pedersen duluth...@gmail.com
>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>
>>>>>>>> Hi Jen,
>>>>>>>>
>>>>>>>> Nothing to be embarrassed about at all! If you haven't already
>>>>>>>> used MetaMap interactively you might want to try that before you
>>>>>>>> attempt a local install:
>>>>>>>>
>>>>>>>> https://ii.nlm.nih.gov/Interactive/UTS_Required/metamap.shtml
>>>>>>>>
>>>>>>>> (You would need to be logged into UTS for the link to work I
>>>>>>>> think...)
>>>>>>>>
>>>>>>>> Anyway, once at that site on the right side there are some links
>>>>>>>> for using MetaMap interactively. Below is an example of what that looks
>>>>>>>> like (where the first line is my input and the rest is the output). I
>>>>>>>> turned on the option to show CUIs, since I think that is your desired
>>>>>>>> output...
>>>>>>>>
>>>>>>>> About the bz2 file, I think you'd need to uncompress that with
>>>>>>>> bunzip2, although I have not done a local install for a while so I am
>>>>>>>> not 100 percent sure if that is the issue or not. But, I've cc'd the
>>>>>>>> MetaMap help line on this note, they are usually very good about
>>>>>>>> following up on issues like this.
>>>>>>>>
>>>>>>>> I hope this helps!
>>>>>>>> Ted
>>>>>>>>
>>>>>>>> Processing 00000000.tx.1: I have a really bad headache, and my joints ache.
>>>>>>>>
>>>>>>>> Phrase: I
>>>>>>>> >>>>> Phrase
>>>>>>>> i
>>>>>>>> <<<<< Phrase
>>>>>>>> >>>>> Mappings
>>>>>>>> Meta Mapping (1000):
>>>>>>>> 1000 C0021966:I- (Iodides) [Inorganic Chemical]
>>>>>>>> Meta Mapping (1000):
>>>>>>>> 1000 C0221138:I NOS (Blood group antibody I) [Amino Acid, Peptide, or Protein,Immunologic Factor]
>>>>>>>> <<<<< Mappings
>>>>>>>>
>>>>>>>> Phrase: have
>>>>>>>> >>>>> Phrase
>>>>>>>> <<<<< Phrase
>>>>>>>>
>>>>>>>> Phrase: a really bad headache,
>>>>>>>> >>>>> Phrase
>>>>>>>> really bad headache
>>>>>>>> <<<<< Phrase
>>>>>>>> >>>>> Mappings
>>>>>>>> Meta Mapping (790):
>>>>>>>> 660 C0205169:Bad [Qualitative Concept]
>>>>>>>> 827 C0018681:HEADACHE (Headache) [Sign or Symptom]
>>>>>>>> <<<<< Mappings
>>>>>>>>
>>>>>>>> Phrase: and
>>>>>>>> >>>>> Phrase
>>>>>>>> <<<<< Phrase
>>>>>>>>
>>>>>>>> Phrase: my joints
>>>>>>>> >>>>> Phrase
>>>>>>>> joints
>>>>>>>> <<<<< Phrase
>>>>>>>> >>>>> Mappings
>>>>>>>> Meta Mapping (1000):
>>>>>>>> 1000 C0022417:Joints [Body Space or Junction]
>>>>>>>> Meta Mapping (1000):
>>>>>>>> 1000 C0392905:Joints (Articular system) [Body System]
>>>>>>>> <<<<< Mappings
>>>>>>>>
>>>>>>>> Phrase: ache.
>>>>>>>> >>>>> Phrase
>>>>>>>> ache
>>>>>>>> <<<<< Phrase
>>>>>>>> >>>>> Mappings
>>>>>>>> Meta Mapping (1000):
>>>>>>>> 1000 C0234238:ACHE (Ache) [Sign or Symptom]
>>>>>>>> <<<<< Mappings
>>>>>>>>
>>>>>>>> On Mon, Jun 5, 2017 at 12:25 PM, Jennifer Wilson jen.wilson...@gmail.com
>>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>>
>>>>>>>>> Hey Ted,
>>>>>>>>>
>>>>>>>>> I'm (embarrassingly) having some trouble navigating the NLM site.
>>>>>>>>> I think I have an account and am trying to download some of the
>>>>>>>>> MetaMap software (I think that the "Lite" version is sufficient). But
>>>>>>>>> when I download the bz2 file, it won't open because I think I need to
>>>>>>>>> authenticate it. Do you know how I'm supposed to access this software?
>>>>>>>>> Sorry if this is out of your realm, I can try someone else at NLM. This
>>>>>>>>> has just been a lot more difficult and confusing than I thought it
>>>>>>>>> should be! Thanks,
>>>>>>>>>
>>>>>>>>> On Fri, Jun 2, 2017 at 7:07 PM, Ted Pedersen duluth...@gmail.com
>>>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>>>
>>>>>>>>>> Hi Jennifer,
>>>>>>>>>>
>>>>>>>>>> Mapping terms to CUIs is its own problem, and there are a few
>>>>>>>>>> nice tools already available that might be of some use. We've used
>>>>>>>>>> MetaMap to good effect for this problem, so you might want to consider
>>>>>>>>>> looking there.
>>>>>>>>>>
>>>>>>>>>> https://metamap.nlm.nih.gov/
>>>>>>>>>>
>>>>>>>>>> I'd be curious if other users have recommendations as well.
>>>>>>>>>>
>>>>>>>>>> Good luck,
>>>>>>>>>> Ted
>>>>>>>>>>
>>>>>>>>>> On Fri, Jun 2, 2017 at 7:56 PM, Jennifer Wilson jen.wilson...@gmail.com
>>>>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Ted,
>>>>>>>>>>>
>>>>>>>>>>> Thank you again for all of this. I'm sorry I had to put down
>>>>>>>>>>> this project for a few days and am only now getting back to it.
>>>>>>>>>>>
>>>>>>>>>>> I see that ontologies change and reproducing that result might
>>>>>>>>>>> not be the best sanity check on the scripts that I wrote.
>>>>>>>>>>>
>>>>>>>>>>> I'm going to try and figure out how to map to CUI terms and I'll
>>>>>>>>>>> be in touch if I get stuck again. Thanks,
>>>>>>>>>>>
>>>>>>>>>>> On Sun, May 28, 2017 at 10:59 AM, Ted Pedersen duluth...@gmail.com
>>>>>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> This is perhaps a bit more than you were looking for, but there
>>>>>>>>>>>> are quite a few command line tools available with UMLS::Similarity
>>>>>>>>>>>> when you install locally that can be helpful for digging into
>>>>>>>>>>>> situations like this. When I look for the path from each of these
>>>>>>>>>>>> CUIs to the ROOT (of MSH) I find that one of them does not have a
>>>>>>>>>>>> path to the root, while the other does (see command output below).
>>>>>>>>>>>>
>>>>>>>>>>>> The lack of a path to the root is going to cause a lot of
>>>>>>>>>>>> measures to report a -1 value (since path, for example, relies on
>>>>>>>>>>>> finding this path as a part of its computation). In fact, not having
>>>>>>>>>>>> a path to the root makes me question if C0156543 is in MSH at all, so
>>>>>>>>>>>> it might even be that the CUI is no longer a part of MSH (and not just
>>>>>>>>>>>> lacking a path to the root). But, regardless, clearly something has
>>>>>>>>>>>> changed since 2009 that is causing this measure to return a different
>>>>>>>>>>>> value. This happens in some cases since UMLS continues to evolve and
>>>>>>>>>>>> CUIs are added, removed, etc. It's important to know what version of
>>>>>>>>>>>> the UMLS a previous study has used if you are interested in getting a
>>>>>>>>>>>> very exact comparison. In the case of our AMIA 2009 paper we used
>>>>>>>>>>>> 2008AB, so things have no doubt changed a bit since then.
>>>>>>>>>>>>
>>>>>>>>>>>> tpederse@maraca:~$ findPathToRoot.pl C0156543
>>>>>>>>>>>>
>>>>>>>>>>>> UMLS-Interface Configuration Information:
>>>>>>>>>>>> (Default Information - no config file)
>>>>>>>>>>>>
>>>>>>>>>>>> Sources (SAB):
>>>>>>>>>>>> MSH
>>>>>>>>>>>> Relations (REL):
>>>>>>>>>>>> PAR
>>>>>>>>>>>> CHD
>>>>>>>>>>>>
>>>>>>>>>>>> Sources (SABDEF):
>>>>>>>>>>>> UMLS_ALL
>>>>>>>>>>>> Relations (RELDEF):
>>>>>>>>>>>> UMLS_ALL
>>>>>>>>>>>>
>>>>>>>>>>>> There are no paths from the given C0156543 to the root.
>>>>>>>>>>>>
>>>>>>>>>>>> tpederse@maraca:~$ findPathToRoot.pl C0000786
>>>>>>>>>>>>
>>>>>>>>>>>> UMLS-Interface Configuration Information:
>>>>>>>>>>>> (Default Information - no config file)
>>>>>>>>>>>>
>>>>>>>>>>>> Sources (SAB):
>>>>>>>>>>>> MSH
>>>>>>>>>>>> Relations (REL):
>>>>>>>>>>>> PAR
>>>>>>>>>>>> CHD
>>>>>>>>>>>>
>>>>>>>>>>>> Sources (SABDEF):
>>>>>>>>>>>> UMLS_ALL
>>>>>>>>>>>> Relations (RELDEF):
>>>>>>>>>>>> UMLS_ALL
>>>>>>>>>>>>
>>>>>>>>>>>> The paths between abortions, spontaneous (C0000786) and the root:
>>>>>>>>>>>> => C0000000 (**UMLS ROOT**) C1135584 (mesh headings) C1256739 (mesh descriptors) C1256741 (topical descriptor) C0012674 (diseases (mesh category)) C1720765 (female urogenital dis pregnancy compl) C0032962 (compl pregn) C0000786 (abortions, spontaneous)
>>>>>>>>>>>>
>>>>>>>>>>>> On Sun, May 28, 2017 at 12:43 PM, Ted Pedersen <duluth...@gmail.com> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Hi Jennifer,
>>>>>>>>>>>>>
>>>>>>>>>>>>> Thanks for sharing this question. I think in general if you
>>>>>>>>>>>>> have a choice between using CUIs or terms with UMLS::Similarity,
>>>>>>>>>>>>> your best option is to use the CUIs. Terms can map to multiple CUIs,
>>>>>>>>>>>>> and UMLS::Similarity might pick a CUI associated with a sense of the
>>>>>>>>>>>>> term you aren't intending. Also, if you misspell a term or don't
>>>>>>>>>>>>> specify it exactly correctly, then it shows up as not found. One
>>>>>>>>>>>>> useful resource for replicating similarity measure studies (like the
>>>>>>>>>>>>> one you cite) is the following page which includes term mappings for
>>>>>>>>>>>>> several of the datasets we've worked with over the years.
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://www-users.cs.umn.edu/~bthomson/corpus/corpus.html
>>>>>>>>>>>>>
>>>>>>>>>>>>> I will admit to being a little puzzled about the case of
>>>>>>>>>>>>> abortion - miscarriage. The paper you cite clearly reports a value
>>>>>>>>>>>>> based on MSH, but as I try to run that query now I get a value of -1
>>>>>>>>>>>>> (even when using the CUIs). However, it appears that each of the CUIs
>>>>>>>>>>>>> is found in MSH, but that somehow we are not able to compute some of
>>>>>>>>>>>>> the measures (a path length, for example). This suggests that there
>>>>>>>>>>>>> is not a path between the two CUIs, which has something to do with
>>>>>>>>>>>>> the structure of UMLS/MSH.
>>>>>>>>>>>>>
>>>>>>>>>>>>> One quick and dirty way to see if a CUI is in MSH is to find
>>>>>>>>>>>>> the path length between a CUI and itself. If it is present in MSH,
>>>>>>>>>>>>> that value will be 1. We see that for each of the CUIs used for
>>>>>>>>>>>>> abortion and miscarriage.
>>>>>>>>>>>>>
>>>>>>>>>>>>> tpederse@maraca:~$ perl query-umls-similarity-webinterface.pl --measure path --sab MSH C0156543 C0156543
>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1<>Unspecified abortion NOS(C0156543)<>Unspecified abortion NOS(C0156543)
>>>>>>>>>>>>>
>>>>>>>>>>>>> tpederse@maraca:~$ perl query-umls-similarity-webinterface.pl --measure path --sab MSH C0000786 C0000786
>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>
>>>>>>>>>>>>> 1<>Abortions.spontaneous(C0000786)<>Abortions.spontaneous(C0000786)
>>>>>>>>>>>>>
>>>>>>>>>>>>> However, when I try to find the path length between the two
>>>>>>>>>>>>> CUIs, I get -1. This suggests that the CUIs are not joined by PAR/CHD
>>>>>>>>>>>>> relations. Note that below you can see that the terms for the CUIs
>>>>>>>>>>>>> have been looked up, which shows us that MSH knows about them...
>>>>>>>>>>>>>
>>>>>>>>>>>>> tpederse@maraca:~$ perl query-umls-similarity-webinterface.pl --measure path --sab MSH C0156543 C0000786
>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>
>>>>>>>>>>>>> -1<>Unspecified abortion NOS(C0156543)<>Abortions.spontaneous(C0000786)
>>>>>>>>>>>>>
>>>>>>>>>>>>> So, in any case, it would appear that something has changed in
>>>>>>>>>>>>> the structure of MSH since we reported our results in the 2009 AMIA
>>>>>>>>>>>>> paper you mention. I'm not sure what that is. But, I think the
>>>>>>>>>>>>> general message is that if you can use CUIs it will normally be more
>>>>>>>>>>>>> reliable to do that. Mapping terms to CUIs is of course its own
>>>>>>>>>>>>> problem, but UMLS::Similarity doesn't do anything terribly fancy with
>>>>>>>>>>>>> that, and so probably whatever you do will be more extensive and
>>>>>>>>>>>>> reliable than what UMLS::Similarity would do...
>>>>>>>>>>>>>
>>>>>>>>>>>>> I hope this helps somehow, and please do feel free to follow
>>>>>>>>>>>>> up. Thoughts from other users on this issue would also be most
>>>>>>>>>>>>> welcome!
>>>>>>>>>>>>>
>>>>>>>>>>>>> Cordially,
>>>>>>>>>>>>> Ted
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Sat, May 27, 2017 at 12:18 PM, Jennifer Wilson jen.wilson...@gmail.com
>>>>>>>>>>>>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi all,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm resending this now that I'm subscribed. Any advice would
>>>>>>>>>>>>>> be much appreciated! Thank you,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ---------- Forwarded message ----------
>>>>>>>>>>>>>> From: Jennifer Wilson <jen.wilson...@gmail.com>
>>>>>>>>>>>>>> Date: Tue, May 23, 2017 at 6:13 PM
>>>>>>>>>>>>>> Subject: Help with the best approach for using the query-UMLS interface
>>>>>>>>>>>>>> To: umls-similarity@yahoogroups.com
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hello UMLS similarity team,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I am trying to compute the similarity between ~30K
>>>>>>>>>>>>>> disease/phenotype terms. Ideally, I would have a matrix of
>>>>>>>>>>>>>> similarity for these terms.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> My first attempt was to write a python script to call the
>>>>>>>>>>>>>> query-umls-similarity-webinterface.pl script. Though, before
>>>>>>>>>>>>>> releasing the script on my dataset, I was trying to recreate the
>>>>>>>>>>>>>> scores from this paper
>>>>>>>>>>>>>> (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2815481/) in table 1.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Here's the command I am using:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Abortion" "Miscarriage"
>>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (-1.0, 'Abortion', 'Miscarriage')
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also have not processed the text in my dataset much. I have
>>>>>>>>>>>>>> basically pulled diseases and phenotypes from DisGeNet, OMIM,
>>>>>>>>>>>>>> PheWas, and the GWAS catalogue. If I'm using data from all of these
>>>>>>>>>>>>>> sources - do you recommend sending them directly to the query
>>>>>>>>>>>>>> interface? Should I try and map to CUI terms? (examples below)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Before I download the database and attempt to query the
>>>>>>>>>>>>>> database (it's not a language that I use in my current work), I
>>>>>>>>>>>>>> just wanted an outside perspective to see if there are best
>>>>>>>>>>>>>> practices for using this data. Thank you in advance for your time!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (examples)
>>>>>>>>>>>>>> Here are two more examples showing the disease descriptions
>>>>>>>>>>>>>> in my dataset. Is the UMLS interface robust to these various
>>>>>>>>>>>>>> formats or do they need to be an exact match?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Testicular Neoplasms" "Amelogenesis imperfecta local hypoplastic form"
>>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (-1.0, 'Testicular Neoplasms', 'Amelogenesis imperfecta local hypoplastic form')
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> ./query-umls-similarity-webinterface.pl --sab MSH --rel PAR/CHD "Hypotrichosis 2, 146520 (3)" "PERIODONTITIS, LOCALIZED AGGRESSIVE"
>>>>>>>>>>>>>> Default Settings:
>>>>>>>>>>>>>> --default http://atlas.ahc.umn.edu/
>>>>>>>>>>>>>> --measure path
>>>>>>>>>>>>>> User Settings:
>>>>>>>>>>>>>> --rel PAR/CHD
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (-1.0, 'Hypotrichosis 2, 146520 (3)', 'PERIODONTITIS, LOCALIZED AGGRESSIVE')
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> --
>>>>>>>>>>>>>> Jennifer L. Wilson
>>>>>>>>>>>>>> Bioengineering, Stanford University
>>>>>>>>>>>>>> jen.wilson...@gmail.com / 703.969.3318