This is great!!!! Thank you so much, Andy!!! I agree that it will make life for many users MUCH easier. --guergana
-----Original Message-----
From: Jay Vyas [mailto:[email protected]]
Sent: Tuesday, November 11, 2014 5:31 PM
To: [email protected]
Subject: Re: Announcement: UMLS MedGen-MySQL dataset now available as open access download

+1000 on this! Great, let's make a jira!!!

> On Nov 11, 2014, at 5:02 PM, andy mcmurry <[email protected]> wrote:
>
> Hello!
>
> https://bitbucket.org/invitae/medgen-mysql (Apache Licensed ASL2)
>
> We just released a new library containing a huge chunk of UMLS
> concepts which are available without registering accounts/username/passwords.
> LEGALLY. Yes, really!
>
> The subset is from NCBI and it contains *thousands of concepts from
> SNOMED and other vocabularies*.
>
> The code is essentially
> 1. a list of WGET targets to various NCBI FTP site mirrors
> 2. a Makefile for building the databases of interest
>
> Our legal team has approved distribution for Open Access work, ASL2
> LICENSE.
>
> I recommend we use this opportunity to make this the default
> distribution for CTAKES UMLS connections, because it obviates the need
> for so much painful credentialing and back-and-forth agreements with
> the US National Library of Medicine.
>
> Cheers!
> --Andy
>
>
> On Wed, Sep 10, 2014 at 12:13 PM, Masanz, James J. <[email protected]>
> wrote:
>
>>
>> I would love to see the install be as simple as apt-get install to
>> end up with some working dictionary that has more than a handful of
>> entries to get users started.
>>
>> Regards,
>> James Masanz
>>
>> -----Original Message-----
>> From: andy mcmurry [mailto:[email protected]]
>> Sent: Tuesday, September 09, 2014 4:32 PM
>> To: [email protected]
>> Subject: Recommendation for ctakes default (UMLS) dictionaries
>>
>> Greetings ctakes-dev:
>>
>> *UMLS license restrictions have been getting more lax over the years*
>> -- much of the UMLS can be downloaded directly from the NCBI
>> official FTP site.
>>
>> In fact, the NIH (and implicitly the NLM) *have already made the
>> standard terms public for some medical specialities*.
>>
>> For example: Here is the UMLS subset specific to Medical Genetics
>> (MedGen) and Genetic Testing (GTR) complete with SNOMED-CT concept
>> CUI(s) and names, etc.:
>>
>> [ ftp://ftp.ncbi.nlm.nih.gov/pub/medgen/README.html ]
>>
>> My team has developed a JVM-based wrapper for MetaMap 2013AB which I
>> intend to open source soon (Clojure). It includes REST support for
>> invoking MetaMap with any or all of the command line arguments.
>> We do not integrate with UIMA; we are basically a wrapper around the
>> binary installation of MetaMap. The emphasis is on publication text,
>> not clinical text; still, some services are common (such as LVG).
>>
>> Strangely, the NLM still requires UMLS licenses to download MetaMap
>> execution binaries. The MetaMap binary install is better, but
>> customizing dictionaries (DataFileBuilder) is not as easy to use as
>> CTAKES with YTEX.
>>
>> [ https://cwiki.apache.org/confluence/display/CTAKES/YTEX+Installation ]
>>
>> *** Hence, there is a real opportunity here to enable Apache cTAKES
>> to have a stronger default dictionary. ***
>>
>> Imagine if we could
>> *$ apt-get install apache-ctakes*
>>
>> and instantly have a working package for SOME problem domain.
>> In my case (Medical Genetics) the UMLS definitions are already
>> available and the UMLS license problem becomes a non-issue, at least
>> for many first-time users.
>>
>> Your thoughts?
>> AndyMC
>>
