Hi Matthew,

First, I should say that I am also a newbie with cTAKES. Unfortunately,
I could not find any documentation on this. I followed a crude approach,
since this was a one-time activity. This is what I did:

1) Used the dictionary generator GUI to generate the SNOMED, RxNorm and MedDRA
dictionary data, which produced a '.script' file under my
<ctakes_home>\resources\org\apache\ctakes\dictionary\lookup\fast\<project_name>
folder.
2) The '.script' file contains HSQLDB-specific statements. I removed the
HSQLDB-specific statements I did not need and manually converted the rest to
MySQL-specific queries.
3) I added a semicolon at the end of each line in the script using a text
editor and split the file into five parts. Then I ran those five script
files in five separate mysql command-line sessions. It took approximately
4 hours to load all the data into the MySQL DB.
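
Steps 2 and 3 above could be scripted roughly as follows. This is only a
minimal sketch, not the exact procedure used: it assumes the '.script' file
holds one statement per line, that only the INSERT statements are wanted, and
that the target tables have already been created in MySQL by hand. The function
names and output file names are hypothetical, and differences in string
escaping between HSQLDB and MySQL are not handled.

```python
def to_mysql_statements(script_lines):
    """Keep only INSERT statements and terminate each with a semicolon."""
    statements = []
    for line in script_lines:
        stmt = line.strip()
        # Assumption: everything except INSERTs (CREATE, SET, GRANT, ...)
        # is HSQLDB-specific and is dropped.
        if not stmt.upper().startswith("INSERT INTO"):
            continue
        if not stmt.endswith(";"):
            stmt += ";"
        statements.append(stmt)
    return statements


def split_into_parts(statements, parts=5):
    """Split the statements into `parts` contiguous chunks of near-equal size."""
    size, extra = divmod(len(statements), parts)
    chunks, start = [], 0
    for i in range(parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(statements[start:end])
        start = end
    return chunks


def convert(script_path, out_prefix, parts=5):
    """Write out_prefix_1.sql .. out_prefix_N.sql and return their paths."""
    with open(script_path, encoding="utf-8") as f:
        statements = to_mysql_statements(f)
    paths = []
    for i, chunk in enumerate(split_into_parts(statements, parts), start=1):
        path = f"{out_prefix}_{i}.sql"
        with open(path, "w", encoding="utf-8") as out:
            out.write("\n".join(chunk) + "\n")
        paths.append(path)
    return paths
```

Each resulting file could then be loaded in its own session, for example with
`mysql -u <user> -p <database> < part_1.sql`.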

As I mentioned, I'm not sure whether this is the right way to proceed. But
with no documentation available for using a MySQL DB with cTAKES, this is the
approach that worked for me. I hope it is helpful.

Regards,
Gandhi


-----Original Message-----
From: Matthew Vita [mailto:matthewvit...@gmail.com]
Sent: Monday, October 09, 2017 10:41 AM
To: dev@ctakes.apache.org
Subject: Re: HSQLDB out of memory with custom dictionary

Gandhi,

Thank you for the reply. Do you have any documentation on how to accomplish 
this?

Thanks,

Matthew Vita
www.matthewvita.com

On Sun, Oct 8, 2017 at 3:14 AM, Gandhi Rajan Natarajan < 
gandhi.natara...@arisglobal.com> wrote:

> Hi Matthew,
>
> I feel using a MySQL DB would be a better idea than using in-memory
> HSQLDB. In fact, this also comes in handy when you are planning to deploy
> cTAKES as a web application, as in our case.
>
> Regards,
> Gandhi
>
> -----Original Message-----
> From: Matthew Vita [mailto:matthewvit...@gmail.com]
> Sent: Sunday, October 08, 2017 6:02 AM
> To: dev@ctakes.apache.org
> Subject: HSQLDB out of memory with custom dictionary
>
> Hi Sean, Tim, cTAKES Community,
>
> I have put together what I am considering a pretty standard dictionary
> with sources from the following:
>
>
>    - MEDLINEPLUS
>    - MSH
>    - NCI
>    - NDFRT
>    - CHV
>    - CSP
>    - ICPC2P
>    - MEDCIN
>    - SNOMED
>    - RXNORM
>    - ICD10
>
>
> However, when copied over to cTAKES (handled by the handy Dictionary
> Creator GUI) HSQLDB runs out of memory.
>
> This is my first experience with HSQLDB so you’ll have to excuse my
> limited knowledge here. I do understand that it can run either
> in memory or on disk, but I’m not sure how to configure this.
>
> Here is how I am connecting to it:
>
>
>   <dictionary>
>     <name>sno_rx_16abTerms</name>
>     <implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary</implementationName>
>     <properties>
>       <property key="jdbcDriver" value="org.hsqldb.jdbcDriver" />
>       <property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab" />
>       <property key="jdbcUser" value="sa" />
>       <property key="jdbcPass" value="" />
>       <property key="rareWordTable" value="cui_terms" />
>       <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser" />
>       <property key="umlsVendor" value="NLM-6515182895" />
>       <property key="umlsUser" value="CHANGE_ME" />
>       <property key="umlsPass" value="CHANGE_ME" />
>     </properties>
>   </dictionary>
>
>   <dictionary>
>
>
>
> Can I configure HSQLDB to be used on disk? If this is not a good
> approach, can I spin up MySQL in its place?
>
>
> Sorry if this has been asked before.
>
>
> Thanks,
>
> Matthew Vita
> www.matthewvita.com
> This email and any files transmitted with it are confidential and
> intended solely for the use of the individual or entity to whom they are 
> addressed.
> If you are not the named addressee you should not disseminate,
> distribute or copy this e-mail. Please notify the sender or system
> manager by email immediately if you have received this e-mail by
> mistake and delete this e-mail from your system. If you are not the
> intended recipient you are notified that disclosing, copying,
> distributing or taking any action in reliance on the contents of this
> information is strictly prohibited and against the law.
>