On Jul 29, 2014, at 6:25 PM, Steven Bethard <beth...@cis.uab.edu> wrote:
> I'd like to start up the similarity server on a machine here so that we can
> make queries to it. I tried running /usr/local/bin/umls_similarity_server.pl
> but it immediately and silently exits. I re-ran it as
> /usr/local/bin/umls_similarity_server.pl --logfile umls.log, and umls.log
> gives me:
[snip]
> Could not open file /var/www/umls_similarity/icpropagation/icprop.msh.par.chd

On Jul 30, 2014, at 9:55 AM, Bridget McInnes btmcin...@gmail.com [umls-similarity] <umls-similarity@yahoogroups.com> wrote:
> The icpropagation files need to go into the:
> /var/www/umls_similarity/icpropagation/
[snip]
> create-icfrequency.pl ICFREQUENCY_FILE INPUTFILE
[snip]
> create-icpropagation.pl ICPROPAGATION_FILE ICFREQUENCY_FILE

Thanks, this solved the problem. Some notes for anyone else who has to do this:

* The create-icfrequency.pl script took about 20 minutes on a text file of about 160M words.
* The create-icpropagation.pl script took about 10 minutes.
* The icpropagation file has to be named /var/www/umls_similarity/icpropagation/icprop.msh.par.chd for the server to run.

A further question about using the UMLS::Similarity server: what format does it expect if I interact directly with the socket? The documentation[1] suggests that 'g car#n#1\015\012\015\012' should work, but this causes errors on the server side (and nothing is printed to the client):

TYPE: g
Use of uninitialized value $button in concatenation (.) or string at /usr/local/bin/umls_similarity_server.pl line 340, <GEN51> line 2.
Use of uninitialized value $word in concatenation (.) or string at /usr/local/bin/umls_similarity_server.pl line 340, <GEN51> line 2.
HERE () g ()
Use of uninitialized value $word in pattern match (m//) at /usr/local/bin/umls_similarity_server.pl line 342, <GEN51> line 2.
Use of uninitialized value $button in string eq at /usr/local/bin/umls_similarity_server.pl line 346, <GEN51> line 2.
Use of uninitialized value $word in concatenation (.) or string at /usr/local/bin/umls_similarity_server.pl line 553, <GEN51> line 2.
In getDefAllForms ()
Use of uninitialized value $word in pattern match (m//) at /usr/local/bin/umls_similarity_server.pl line 558, <GEN51> line 2.
ERROR: UMLS::Interface::CuiFinder->_getDefConceptList Undefined input value (Error Code 4). Error with input variable $term.

If I instead read the source code[2], it appears that I should be using something like 'g||car|\015\012\015\012', which does indeed work:

$ printf 'g||fracture|\015\012\015\012' | nc localhost 31135
g C0016658 NCI : A finding of traumatic injury to the bone in which the continuity of the bone is broken...

Should I keep reading the source code, or is there something better I should be looking at?

Steve

[1] https://metacpan.org/pod/distribution/UMLS-Similarity/utils/umls_similarity_server.pl
[2] https://metacpan.org/source/BTMCINNES/UMLS-Similarity-1.41/utils/umls_similarity_server.pl
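
For anyone who would rather script the request than pipe printf into nc, here is a minimal Python sketch of the same 'g||<term>|' message followed by CRLF CRLF (\015\012 is octal for carriage return and line feed). It only reproduces the one request shape that worked above; it is not a description of the full protocol. The function names build_request and query are hypothetical, the second field is left empty simply because the working example leaves it empty, and the host and port defaults are the ones from the nc command. Reading until the server closes the connection or a timeout expires is also an assumption about how the server behaves.

```python
import socket


def build_request(term: str) -> bytes:
    """Format a 'g' request the same way the working printf example does."""
    # '|' is the field separator in the working example, so a term that
    # contains it (or a line break) would change the shape of the message.
    if "|" in term or "\r" in term or "\n" in term:
        raise ValueError("term must not contain '|' or line breaks")
    # Layout copied from the example: g||<term>| followed by CRLF CRLF.
    # ASCII encoding is an assumption; non-ASCII terms raise an error here.
    return f"g||{term}|\r\n\r\n".encode("ascii")


def query(term: str, host: str = "localhost", port: int = 31135,
          timeout: float = 10.0) -> str:
    """Send one request and return whatever the server sends back."""
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(build_request(term))
        while True:
            try:
                data = sock.recv(4096)
            except socket.timeout:
                # Assumption: if the server keeps the socket open, stop
                # waiting after `timeout` seconds and return what arrived.
                break
            if not data:
                # Server closed the connection.
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")
```

With the server running locally, print(query("fracture")) should send the same bytes as the printf command above.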