OK, I'm sorry, I'll try to be clearer. I ran MetamorphoSys to create a new UMLS subset. At a certain point it is possible to apply filters to obtain a subset of the UMLS (that is what MetamorphoSys is meant for), so I selected the language filter to exclude non-English sources and the semantic type filter to keep only the T024 and T023 semantic types. Then I simply ran their script to load the data into MySQL tables. I wanted to include all the sources within the T024 and T023 semantic types, so I used UMLS_ALL in my configuration file. removeConfigData.pl doesn't work; as a workaround I simply drop the index database. Neither issue is solved, even when selecting only SNOMEDCT_US as the source and PAR/CHD as the relations in the config file.
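For reference, that reduced configuration would look roughly like the sketch below. It reuses the `SAB ::` / `REL ::` syntax of the UMLS_ALL file quoted further down in this thread; the exact file is not reproduced here, so the layout is an assumption.

```
SAB :: include SNOMEDCT_US
REL :: include PAR, CHD
```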
Eugenia Galeota, PhD
Center for Genomic Science of IIT@SEMM
Computational Epigenomics
Email: eugenia.gale...@iit.it
Tel: +39 02 9437 5046
Via Adamello 16, 20139 Milan, Italy

On 08 Oct 2014, at 22:30, Bridget McInnes btmcin...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:

> Hello Eugenia,
>
> I am not completely certain what you mean by a UMLS subset with only two
> semantic types, but I know that there are many subset configurations that I
> am not aware of or do not use. Are you able to build the index over smaller
> sets of sources/relations (e.g. MSH with the PAR/CHD relations) in the UMLS
> now? Does the removeConfigData.pl program work? Or are those not resolved?
>
> Thank you!
>
> Bridget
>
> On Wed, Oct 8, 2014 at 10:19 AM, Eugenia Galeota eugenia.gale...@iit.it
> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>
> Hi,
>
> yes, the problem is that when the code finishes the index has not been
> built completely, and removeConfigData.pl is not working (see the error in
> the previous message). I had this problem while I was using a UMLS subset
> with only two semantic types, obtained using MetamorphoSys.
> After noticing the first error I tried to build an index over the entire
> UMLS, and this time the program seems to never end. I was also keeping track
> of the queries, and in my general log table there were more than 5 million
> queries on the umls and umlsindex databases.
> After your message I stopped the execution to add the realtime option as you
> suggested. As before, I'm logging the queries. Currently, after more than
> one hour, I can count about 1800000 selects on cuis and the program is still
> running. My configuration file is simply
>
> SAB :: include UMLS_ALL
> REL :: include PAR, CHD
>
> while the other parameters (database, password, socket…, and now realtime)
> are specified in the code, as you can see in the previous message.
> Thanks for your help,
>
> Eugenia
>
> Eugenia Galeota, PhD
> Center for Genomic Science of IIT@SEMM
> Computational Epigenomics
> Email: eugenia.gale...@iit.it
> Tel: +39 02 9437 5046
> Via Adamello 16, 20139 Milan, Italy
>
> On 08 Oct 2014, at 14:33, Bridget McInnes btmcin...@gmail.com
> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>
>> Hi Eugenia,
>>
>> I apologize that things are not running smoothly. I would like to verify a
>> few things with you:
>>
>> 1. The problem is that the index will not build, and removing the index
>> using removeConfigData.pl is not working? What configuration file were
>> you using for this?
>>
>> 2. You are running the index over the entire UMLS? If so, we recommend that
>> you use the --realtime option if you would like to compute the similarity
>> scores between concepts using the path information from the entire UMLS.
>> The UMLS is so large that there is typically not enough space to store all
>> the path-to-root information for all the CUIs. The --realtime option allows
>> the path information to be collected on the fly. It takes a little longer,
>> though, so we created the index for smaller sources/relation configurations
>> and the --realtime option for the larger configurations. I hope that makes
>> sense. Please let me know if it does not.
>>
>> Thank you!
>>
>> Best regards,
>>
>> Bridget
>>
>> On Wed, Oct 8, 2014 at 7:09 AM, Eugenia Galeota eugenia.gale...@iit.it
>> [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
>>
>> Dear developers,
>> I've been trying in many ways to get UMLS::Similarity to work, but it seems
>> that there are some problems I'm not able to solve. Initially I had issues
>> with my.cnf and mysql.sock. I had different versions of the same files,
>> then mysql.sock disappeared from the /tmp folder.
>> Those issues now seem to be solved, since I completely removed every
>> previous MySQL server installation and reinstalled the latest version using
>> brew on my Mac. I was using UMLS::Similarity on a small subset with only two
>> semantic types created using MetamorphoSys. I tested some of the Interface
>> methods and they seem to be working; however, when trying to create an index
>> to use with UMLS::Similarity I always get the following error:
>>
>> ERROR: UMLS::Interface::PathFinder->_checkIndex
>> Index Error (Error Code 9).
>> Index did not complete. Remove using the removeConfigData.pl program and
>> re-run.
>>
>> and when I try to use removeConfigData.pl to rebuild it from scratch, the
>> program never ends (it has been running since last Friday).
>> I thought that another possibility is that the PathFinder module somehow
>> cannot build the paths to the root of the UMLS because the tables in MySQL
>> don't contain the UMLS root concept C0000000 and its relationships. From
>> looking at the code I still wasn't able to understand how the path from
>> concepts without a parent to the UMLS root is created. At a certain point I
>> found a call to some function getCuiChildren where the CUI string passed in
>> seems to be empty. I also modified the my.cnf file as specified in the
>> documentation of the Perl module.
>> Finally, I'm trying to do the same operations on the entire UMLS database.
>> My code is the following:
>>
>> use strict;
>> use warnings;
>> use lib '/Library/Perl/5.16/';
>> use UMLS::Interface;
>> use UMLS::Similarity::lin;
>>
>> # Connection and configuration options for UMLS::Interface
>> my %params = ();
>> $params{"intrinsic"} = "sanchez";
>> $params{"username"}  = "egaleota";
>> $params{"password"}  = "password";
>> $params{"socket"}    = "/tmp/mysql.sock";
>> $params{"config"}    = "/Users/egaleota/umls/UMLS-Interface-1.41/myconfig.conf";
>> #$cuifinder = UMLS::Interface::CuiFinder->new(\%params);
>> my $umls = UMLS::Interface->new(\%params);
>> die "Unable to create UMLS::Interface object.\n" if(!$umls);
>>
>> # Options for the lin measure
>> my %paramtri = ();
>> $paramtri{"intrinsic"} = "sanchez";
>> my $lin = UMLS::Similarity::lin->new($umls, \%paramtri);
>> #die "Unable to create measure object.\n" if(!$lin);
>>
>> # Look up one term for each CUI and compute the similarity
>> my $cui1 = "C0459385";
>> my $cui2 = "C0440746";
>> my $ts1 = $umls->getTermList($cui1);
>> my $term1 = pop @{$ts1};
>> my $ts2 = $umls->getTermList($cui2);
>> my $term2 = pop @{$ts2};
>> my $value = $lin->getRelatedness($cui1, $cui2);
>> print "The similarity between $cui1 ($term1) and $cui2 ($term2) is $value\n";
>>
>> I'm currently running the code above. The index seems to be in its creation
>> phase.
>> When I check the current queries in the db I obtain results like the
>> following, which lead me to believe that the problems are not related to how
>> Perl communicates with MySQL:
>>
>> +----+----------+-----------+--------------------+---------+------+------------+------------------------------------------------------------------------------------------------------+
>> | Id | User     | Host      | db                 | Command | Time | State      | Info                                                                                                 |
>> +----+----------+-----------+--------------------+---------+------+------------+------------------------------------------------------------------------------------------------------+
>> | 51 | egaleota | localhost | umls               | Query   |    0 | statistics | select distinct CUI2 from MRREL where CUI1='C3135343' and ((REL='CHD') ) and CUI2!='C3135343' and SU |
>> | 52 | egaleota | localhost | umlsinterfaceindex | Sleep   |    0 |            | NULL                                                                                                 |
>> | 54 | egaleota | localhost | NULL               | Query   |    0 | init       | show processlist                                                                                     |
>> +----+----------+-----------+--------------------+---------+------+------------+------------------------------------------------------------------------------------------------------+
>>
>> Since all these tasks are really time-consuming, I was wondering if you
>> could help me solve these issues. Thanks,
>>
>> Eugenia Galeota, PhD
>> Center for Genomic Science of IIT@SEMM
>> Computational Epigenomics
>> Email: eugenia.gale...@iit.it
>> Tel: +39 02 9437 5046
>> Via Adamello 16, 20139 Milan, Italy