Hi Brandon,
Sorry for the late reply; I was stuck with another task. I compiled
Dictionary_Gui successfully with Java 8. I then used the tool to create the
hsql data files for my current UMLS2015, and it generated all the relevant
files as described in your post "CTAKES DICTIONARY CREATOR GUI !!!".
After generating the dictionary, I wanted to use it with cTakes, so I
modified "UmlsLookupAnnotator.xml":
<fileResourceSpecifier>
<fileUrl>file:org/apache/ctakes/dictionary/lookup/fast/umls2015aa.xml</fileUrl>
</fileResourceSpecifier>
I then selected "AggregatePlaintextFastUMLSProcessor" as the analysis engine for
the clinical pipeline. This is the error I'm getting:
Error Trace:
05 Apr 2016 11:52:55 INFO JdbcConnectionFactory - Connecting to jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/umls2015aa/umls2015aa:
.................................................. 50
.................................................. 100
.................................................. 150
....................
05 Apr 2016 11:53:52 INFO JdbcConnectionFactory - Database connected
05 Apr 2016 11:53:53 ERROR JdbcConceptFactory - Could not create Concept Data Selection Call
java.sql.SQLException: Table not found in statement [SELECT * FROM long WHERE CUI = ?]
at org.hsqldb.jdbc.Util.throwError(Unknown Source)
at org.hsqldb.jdbc.jdbcPreparedStatement.<init>(Unknown Source)
at org.hsqldb.jdbc.jdbcConnection.prepareStatement(Unknown Source)
at org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory.createSelectCall(JdbcConceptFactory.java:204)
at org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory.<init>(JdbcConceptFactory.java:83)
at org.apache.ctakes.dictionary.lookup2.concept.JdbcConceptFactory.<init>(JdbcConceptFactory.java:57)
at org.apache.ctakes.dictionary.lookup2.concept.UmlsJdbcConceptFactory.<init>(UmlsJdbcConceptFactory.java:30)
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
at java.lang.reflect.Constructor.newInstance(Unknown Source)
Please find attached the "umls2015aa.xml" file generated by the dictionary_gui tool.
Why am I getting the above error? Any ideas or suggestions would be
helpful.
Thanks & Regards
Stuti Awasthi
From: Stuti Awasthi
Sent: Thursday, March 31, 2016 5:17 PM
To: [email protected]
Subject: RE: Integrate cTakes with latest UMLS DB
Thanks.
Yes, I have read the long conversation between you and Sean on how to use the
dictionary tool utility, and that really helped. Thanks to Sean.
Thanks for the Java 8 pointer. Presently I'm using Java 7 and have already added the
libraries to the build path. Let me try with Java 8 and I'll come back.
Example of an error: org.apache.ctakes.dictionary.creator.gui.umls.Concept.java
concept.getTuis().stream().forEach( this::addTui );
The method stream() is undefined for the type Collection<Tui>
This is just an example; I'm getting similar errors in other classes.
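For context on the error above: `Collection.stream()` and method references such as `this::addTui` were introduced in Java 8, so a Java 7 compiler rejects that line. The sketch below is only an illustration; the class `TuiStreamExample` is hypothetical and uses `String` as a stand-in for the project's `Tui` type. It shows the Java 8 form next to a loop that would also compile on Java 7.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;

// Hypothetical stand-in for the Concept class; String replaces the Tui type.
class TuiStreamExample {
    private final List<String> tuis = new ArrayList<>();

    void addTui(String tui) {
        tuis.add(tui);
    }

    // Java 8 and later: Collection.stream() plus a method reference.
    void addAllJava8(Collection<String> source) {
        source.stream().forEach(this::addTui);
    }

    // Java 7 compatible equivalent: a plain for-each loop.
    void addAllJava7(Collection<String> source) {
        for (String tui : source) {
            addTui(tui);
        }
    }

    List<String> getTuis() {
        return tuis;
    }

    public static void main(String[] args) {
        TuiStreamExample viaStream = new TuiStreamExample();
        viaStream.addAllJava8(Arrays.asList("T047", "T184"));

        TuiStreamExample viaLoop = new TuiStreamExample();
        viaLoop.addAllJava7(Arrays.asList("T047", "T184"));

        // Both approaches collect the same values in the same order.
        System.out.println(viaStream.getTuis().equals(viaLoop.getTuis()));
    }
}
```

Since the project uses Java 8 features in several classes, switching the compiler level is likely simpler than rewriting each occurrence.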
Thanks
Stuti Awasthi
________________________________
From: Geise, Brandon D. [[email protected]]
Sent: Thursday, March 31, 2016 5:02 PM
To: [email protected]<mailto:[email protected]>
Subject: RE: Integrate cTakes with latest UMLS DB
Hi Stuti,
All credit for the dictionary work goes to Sean Finan. What errors are you
getting? I'm not aware of a precompiled jar. I do remember that I needed to
switch to Java 8 to compile the project, if you aren't using it already. You may
also need to reference the lib folder or move those jars to the project library
path in order to compile.
Brandon
From: Stuti Awasthi [mailto:[email protected]]
Sent: Thursday, March 31, 2016 7:14 AM
To: '[email protected]'
<[email protected]<mailto:[email protected]>>
Subject: RE: Integrate cTakes with latest UMLS DB
Hi All,
Sorry, I think I confused the dictionary-gui and dictionary-tool
projects. Jessica mentioned the dictionary tool and its README, which I
followed along with the previous mail threads to generate a UMLS dictionary in
file format. I have yet to use this dictionary with cTakes, so I might come back
with issues.
I would also like to use the dictionary-gui posted by Brandon. When I checked out
the code and imported it into Eclipse, it showed compilation errors in several
classes, so even though I created a jar of the project, it does not run. Any
ideas about the compilation errors, or a pointer to a prebuilt binary
(similar to the dictionary tools jar), would be great.
Thanks for your help and patience.
Regards
Stuti Awasthi
________________________________
From: Stuti Awasthi
Sent: Thursday, March 31, 2016 11:58 AM
To: '[email protected]'
Subject: RE: Integrate cTakes with latest UMLS DB
Hi Brandon, Jessica
I looked at the dictionary-gui utility. Here is what I did: I checked out the source
code from https://svn.apache.org/repos/asf/ctakes/sandbox/dictionary-gui/ and
imported it into my Eclipse IDE. I was not able to find any README with this
project, so I checked the code and found that "CreaterGui.java" might be the
main class to launch the GUI. My queries are:
* Is there a README, as Jessica mentioned, or would simply launching the
CreaterGui.java class suffice?
* Are there any commands to execute the tool on a Linux server (without GNOME),
something like command-line options? I'm asking because I'm working on a
Linux machine where I have installed the latest UMLS2015.
A few more questions:
* As Brandon mentioned, I might need to create tables that
hold the same information (based on the rare/first word approach). Do we know how
to create those tables? Any SQL statements or links would be useful.
* I found the following links in the cTakes 3.1 documentation, but they look
broken. Any clue on the redirected paths/links?
* https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=423
* https://cabig-kc.nci.nih.gov/Vocab/forums/viewtopic.php?f=28&t=80&start=20#p1459
* Can I create the full dictionary (not only the fast one) from the dictionary tool?
Thanks & Regards
Stuti Awasthi
From: Stuti Awasthi
Sent: Tuesday, March 29, 2016 5:35 PM
To: '[email protected]'
Subject: RE: Integrate cTakes with latest UMLS DB
Thanks Brandon. I will check the dictionary-gui tool, try to use it, and
get back with queries if any.
Regards
Stuti Awasthi
From: Geise, Brandon D. [mailto:[email protected]]
Sent: Tuesday, March 29, 2016 5:32 PM
To: Stuti Awasthi; '[email protected]'
Subject: RE: Integrate cTakes with latest UMLS DB
Hi Stuti,
Unless someone else has created another solution, the current Dictionary Gui
tool doesn't have this support yet. It's something that I plan to work on in
the future for the GUI tool.
Thanks,
Brandon
From: Stuti Awasthi<mailto:[email protected]>
Sent: Tuesday, March 29, 2016 7:59 AM
To: '[email protected]'<mailto:[email protected]>
Subject: RE: Integrate cTakes with latest UMLS DB
Thanks Brandon and Jessica for the response. I will look at the dictionary tools as
mentioned and try to figure out how they can be used for my purpose. I will
certainly get back on the mailing list with more queries if any.
With my limited understanding so far, the current UMLS2015
database will be converted to hsql format, which I can use in the pipeline by
changing the configuration in the required XMLs. One more query: cTakes also provides
"DictionaryLookupAnnotatorDB", which is supposed to be used if we want to
connect to the required tables in a UMLS database. Please correct me if I'm wrong,
but do we have a similar tool/script that can be executed on the UMLS database to
generate a table in the required format ("umls2015ab") which can be directly
included in the dictionary lookup?
Thanks for the help!
Regards
Stuti Awasthi
From: Glover, Jessica [mailto:[email protected]]
Sent: Tuesday, March 29, 2016 5:17 PM
To: '[email protected]'
Subject: RE: Integrate cTakes with latest UMLS DB
Hi Stuti,
One thing to note about the dictionary tool is that it is still in sandbox
mode, so there is no official documentation or step-by-step guide for it.
However, it contains a readme, and it has been discussed on the mailing list,
so you can search the archive for that "documentation."
Searchable cTakes mailing list archive: http://ctakes.markmail.org/search/
Regards,
Jessica
From: Stuti Awasthi [mailto:[email protected]]
Sent: Tuesday, March 29, 2016 7:10 AM
To: '[email protected]'
Subject: RE: Integrate cTakes with latest UMLS DB
Hi All,
Any help would be appreciated. I'm still waiting for a response.
Thanks & Regards
Stuti Awasthi
From: Stuti Awasthi
Sent: Monday, March 28, 2016 2:46 PM
To: [email protected]<mailto:[email protected]>
Subject: Integrate cTakes with latest UMLS DB
Hi All,
I have been exploring cTakes and want to use the latest UMLS 2015ab with it. To
do so, I have installed UMLS in a MySQL database.
Database name: umls
Number of tables created: 53
E.g.: mysql> show tables;
+--------------------------+
| Tables_in_umls |
+--------------------------+
| AMBIGLUI |
| AMBIGSUI |
| DELETEDCUI |
| DELETEDLUI |
| DELETEDSUI
......
| SRSTRE1 |
| SRSTRE2 |
Now I'm not able to find any step-by-step guide to integrate cTakes with the UMLS
database.
Here is what I have done:
1) Modified DictionaryLookupAnnotatorDB.xml with the JDBC URL of my database, along
with the username and password.
I believe I also need to modify LookupDesc_DB.xml with a UMLS table name
that provides the CUI, TUI and text info.
I need some help: which table should I point the lookup to? After that, is
there any other step, or should I be good to execute the clinical pipeline with
the UMLS database?
Also, please point me to any step-by-step guide or document available for cTakes
integration with UMLS.
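As background for the table question above: in a standard UMLS Metathesaurus load, the CUI and term text are stored in MRCONSO (columns CUI, STR, LAT) and the semantic type (TUI) in MRSTY (columns CUI, TUI), so no single raw table holds all three fields. The sketch below only illustrates a join that returns CUI, TUI and text together. The class name `UmlsLookupQuery` and the helper `printSample` are hypothetical, and whether LookupDesc_DB.xml can consume such a query or needs a dedicated table in a specific layout is not established in this thread.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

// Hypothetical helper; not part of cTAKES.
class UmlsLookupQuery {
    // Joins term strings (MRCONSO) with semantic types (MRSTY) on CUI.
    // The single parameter is the language code, for example "ENG".
    static final String LOOKUP_SQL =
            "SELECT c.CUI, s.TUI, c.STR "
          + "FROM MRCONSO c JOIN MRSTY s ON s.CUI = c.CUI "
          + "WHERE c.LAT = ?";

    // Prints up to 'limit' rows; the caller supplies an open JDBC connection
    // (assumption: the MySQL driver is on the classpath).
    static void printSample(Connection conn, String language, int limit) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(LOOKUP_SQL)) {
            ps.setString(1, language);
            ps.setMaxRows(limit);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    System.out.println(rs.getString("CUI") + "|"
                            + rs.getString("TUI") + "|" + rs.getString("STR"));
                }
            }
        }
    }

    public static void main(String[] args) {
        // Without a database at hand, just show the statement that would run.
        System.out.println(LOOKUP_SQL);
    }
}
```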
Thanks & Regards
Stuti Awasthi
<?xml version="1.0" encoding="UTF-8"?>
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->
<!-- New format for the .xml lookup specification. Uses table name and value type/class for Concept Factories. -->
<lookupSpecification>
<dictionaries>
<dictionary>
<name>umls2015aaTerms</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary</implementationName>
<properties>
<!-- urls for hsqldb memory connections must be file types in hsql 1.8.
These file urls must be either absolute path or relative to current working directory.
They cannot be based upon the classpath.
Though JdbcConnectionFactory will attempt to "find" a db based upon the parent dir of the url
for the sake of ide ease-of-use, the user should be aware of these hsql limitations.
-->
<property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
<property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/umls2015aa/umls2015aa"/>
<property key="jdbcUser" value="sa"/>
<property key="jdbcPass" value=""/>
<property key="rareWordTable" value="cui_terms"/>
<property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
<property key="umlsVendor" value="NLM-6515182895"/>
<property key="umlsUser" value="CHANGE_ME"/>
<property key="umlsPass" value="CHANGE_ME"/>
</properties>
</dictionary>
</dictionaries>
<conceptFactories>
<conceptFactory>
<name>umls2015aaConcepts</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.concept.UmlsJdbcConceptFactory</implementationName>
<properties>
<property key="jdbcDriver" value="org.hsqldb.jdbcDriver"/>
<property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/umls2015aa/umls2015aa"/>
<property key="jdbcUser" value="sa"/>
<property key="jdbcPass" value=""/>
<property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser"/>
<property key="umlsVendor" value="NLM-6515182895"/>
<property key="umlsUser" value="CHANGE_ME"/>
<property key="umlsPass" value="CHANGE_ME"/>
<property key="tuiTable" value="tui"/>
<property key="prefTermTable" value="prefTerm"/>
<!-- Optional tables for optional term info.
Uncommenting these lines alone may not persist term information;
persistence depends upon the TermConsumer. -->
<property key="rxnormTable" value="long"/>
<property key="snomedct_usTable" value="long"/>
</properties>
</conceptFactory>
</conceptFactories>
<!-- Defines what terms and concepts will be used -->
<dictionaryConceptPairs>
<dictionaryConceptPair>
<name>umls2015aaPair</name>
<dictionaryName>umls2015aaTerms</dictionaryName>
<conceptFactoryName>umls2015aaConcepts</conceptFactoryName>
</dictionaryConceptPair>
</dictionaryConceptPairs>
<!-- DefaultTermConsumer will persist all spans.
PrecisionTermConsumer will persist only the longest overlapping span of any semantic group.
SemanticCleanupTermConsumer works as Precision** but also removes signs/symptoms contained within disease/disorder,
and (just in case) removes any s/s and d/d that are also (exactly) anatomical sites. -->
<rareWordConsumer>
<name>Term Consumer</name>
<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.DefaultTermConsumer</implementationName>
<!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.PrecisionTermConsumer</implementationName>-->
<!--<implementationName>org.apache.ctakes.dictionary.lookup2.consumer.SemanticCleanupTermConsumer</implementationName>-->
<properties>
<!-- Depending upon the consumer, the value of codingScheme may or may not be used. With the packaged consumers,
codingScheme is a default value used only for cuis that do not have secondary codes (snomed, rxnorm, etc.) -->
<property key="codingScheme" value="umls2015aa"/>
</properties>
</rareWordConsumer>
</lookupSpecification>
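One observation that may help with the error reported at the top of this thread: the failing statement is `SELECT * FROM long WHERE CUI = ?`, and "long" is also the value of the `rxnormTable` and `snomedct_usTable` properties in the attached file. The comment in the file says the new format stores a value type/class there, so one possible explanation (an assumption, not confirmed in this thread) is that the installed lookup module treats that value as a table name. The sketch below uses only the JDK's XML parser to list every property whose key ends in "Table" so the values can be compared with the tables that exist in the database. The class name `TablePropertyLister` is hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical diagnostic helper; not part of cTAKES.
class TablePropertyLister {
    // Returns key -> value for every <property> whose key ends with "Table",
    // in document order.
    static Map<String, String> tableProperties(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element p = (Element) props.item(i);
                String key = p.getAttribute("key");
                if (key.endsWith("Table")) {
                    result.put(key, p.getAttribute("value"));
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse lookup specification", e);
        }
    }

    public static void main(String[] args) {
        // Excerpt of the concept factory properties from the attached file.
        String xml = "<properties>"
                + "<property key=\"jdbcUser\" value=\"sa\"/>"
                + "<property key=\"tuiTable\" value=\"tui\"/>"
                + "<property key=\"prefTermTable\" value=\"prefTerm\"/>"
                + "<property key=\"rxnormTable\" value=\"long\"/>"
                + "<property key=\"snomedct_usTable\" value=\"long\"/>"
                + "</properties>";
        for (Map.Entry<String, String> e : tableProperties(xml).entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

If the listed values do not match table names in the generated database, checking which specification format the installed dictionary lookup module expects would be a reasonable next step.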