Gandhi, I really appreciate this information. I have started working out the schema and plan on writing a program that will automatically prepare a script to work with MySQL. Work in progress. Can you do a quick review of my MySQL schema so far?
CREATE SCHEMA CTAKES_DATA;
USE CTAKES_DATA;

CREATE TABLE CUI_TERMS (
  CUI BIGINT NOT NULL,
  RINDEX INT(128) NOT NULL,
  TCOUNT INT(128) NOT NULL,
  TEXT VARCHAR(255) NOT NULL,
  RWORD VARCHAR(48) NOT NULL
);
CREATE INDEX IDX_CUI_TERMS ON CUI_TERMS (RWORD);

CREATE TABLE TUI (
  CUI BIGINT NOT NULL,
  TUI INT(128) NOT NULL
);
CREATE INDEX IDX_TUI ON TUI (CUI);

CREATE TABLE PREFTERM (
  CUI BIGINT NOT NULL,
  PREFTERM VARCHAR(511) NOT NULL
);
CREATE INDEX IDX_PREFTERM ON PREFTERM (CUI);

CREATE TABLE RXNORM (
  CUI BIGINT NOT NULL,
  RXNORM BIGINT NOT NULL
);
CREATE INDEX IDX_RXNORM ON RXNORM (CUI);

CREATE TABLE SNOMEDCT_US (
  CUI BIGINT NOT NULL,
  SNOMEDCT_US BIGINT NOT NULL
);
CREATE INDEX IDX_SNOMEDCT_US ON SNOMEDCT_US (CUI);

Quick question: do you use the AIR table?

Thanks,

Matthew Vita
www.matthewvita.com

On Mon, Oct 9, 2017 at 1:14 AM, Gandhi Rajan Natarajan <
[email protected]> wrote:

> Hi Matthew,
>
> First, I should say that I am a newbie with cTAKES myself.
> Unfortunately, I could not find any documentation on this. I followed a
> crude approach, since this is a one-time activity. This is what I did:
>
> 1) Used the dictionary generator GUI to generate SNOMED, RxNorm and MedDRA
> dictionary data, which produced a '.script' file under my
> <ctakes_home>\resources\org\apache\ctakes\dictionary\lookup\fast\<project_name>
> folder.
> 2) The '.script' file contains HSQLDB-specific queries. I removed the
> HSQLDB-specific statements I did not need and manually converted the
> rest to MySQL-specific queries.
> 3) I added a semicolon at the end of each line in the script using a
> text editor and split the file into five parts. Then I ran those five
> script files in five different mysql command lines. It took
> approximately 4 hours to load all the data into the MySQL DB.
>
> As I mentioned earlier, I'm not sure whether this is the right way to
> proceed. But with no documentation available for MySQL DB with cTAKES,
> this is the approach that worked for me.
> Hope it will be helpful.
>
> Regards,
> Gandhi
>
>
> -----Original Message-----
> From: Matthew Vita [mailto:[email protected]]
> Sent: Monday, October 09, 2017 10:41 AM
> To: [email protected]
> Subject: Re: HSQLDB out of memory with custom dictionary
>
> Gandhi,
>
> Thank you for the reply. Do you have any documentation on how to
> accomplish this?
>
> Thanks,
>
> Matthew Vita
> www.matthewvita.com
>
> On Sun, Oct 8, 2017 at 3:14 AM, Gandhi Rajan Natarajan <
> [email protected]> wrote:
>
> > Hi Matthew,
> >
> > I feel that using a MySQL DB would be a better idea than using in-memory
> > HSQLDB. In fact, this also comes in handy when you are planning to deploy
> > cTAKES as a web application, as in our case.
> >
> > Regards,
> > Gandhi
> >
> > -----Original Message-----
> > From: Matthew Vita [mailto:[email protected]]
> > Sent: Sunday, October 08, 2017 6:02 AM
> > To: [email protected]
> > Subject: HSQLDB out of memory with custom dictionary
> >
> > Hi Sean, Tim, cTAKES Community,
> >
> > I have put together what I am considering a pretty standard dictionary
> > with sources from the following:
> >
> > - MEDLINEPLUS
> > - MSH
> > - NCI
> > - NDFRT
> > - CHV
> > - CSP
> > - ICPC2P
> > - MEDCIN
> > - SNOMED
> > - RXNORM
> > - ICD10
> >
> > However, when copied over to cTAKES (handled by the handy Dictionary
> > Creator GUI) HSQLDB runs out of memory.
> >
> > This is my first experience with HSQLDB so you’ll have to excuse my
> > limited knowledge here. I do understand that it can run either
> > in memory or on disk, but I’m not sure how to configure this.
> >
> > Here is how I am connecting to it:
> >
> > <dictionary>
> >   <name>sno_rx_16abTerms</name>
> >   <implementationName>org.apache.ctakes.dictionary.lookup2.dictionary.UmlsJdbcRareWordDictionary</implementationName>
> >   <properties>
> >     <property key="jdbcDriver" value="org.hsqldb.jdbcDriver" />
> >     <property key="jdbcUrl" value="jdbc:hsqldb:file:resources/org/apache/ctakes/dictionary/lookup/fast/sno_rx_16ab/sno_rx_16ab" />
> >     <property key="jdbcUser" value="sa" />
> >     <property key="jdbcPass" value="" />
> >     <property key="rareWordTable" value="cui_terms" />
> >     <property key="umlsUrl" value="https://uts-ws.nlm.nih.gov/restful/isValidUMLSUser" />
> >     <property key="umlsVendor" value="NLM-6515182895" />
> >     <property key="umlsUser" value="CHANGE_ME" />
> >     <property key="umlsPass" value="CHANGE_ME" />
> >   </properties>
> > </dictionary>
> >
> > <dictionary>
> >
> > Can I configure HSQLDB to be used on disk? If this is not a good
> > approach, can I spin up MySQL in its place?
> >
> > Sorry if this has been asked before.
> >
> > Thanks,
> >
> > Matthew Vita
> > www.matthewvita.com
> > This email and any files transmitted with it are confidential and
> > intended solely for the use of the individual or entity to whom they are
> > addressed. If you are not the named addressee you should not disseminate,
> > distribute or copy this e-mail. Please notify the sender or system
> > manager by email immediately if you have received this e-mail by
> > mistake and delete this e-mail from your system. If you are not the
> > intended recipient you are notified that disclosing, copying,
> > distributing or taking any action in reliance on the contents of this
> > information is strictly prohibited and against the law.
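For reference, steps 2 and 3 of the conversion described in the thread (dropping the HSQLDB-specific statements, terminating each line with a semicolon, and splitting the result into parts) could be scripted roughly as follows. This is a minimal Python sketch, not a tested cTAKES tool. It assumes that every statement in the '.script' file sits on a single line, that the data rows are plain INSERT INTO statements that MySQL accepts unchanged, and that the tables already exist in the CTAKES_DATA schema. The function names, the part_N.sql file names, and the example paths are hypothetical.

```python
from pathlib import Path
from typing import Iterable, Iterator, List


def to_mysql_statements(script_lines: Iterable[str]) -> Iterator[str]:
    """Keep only INSERT statements and terminate each with a semicolon."""
    for line in script_lines:
        statement = line.strip()
        # Assumption: everything that is not an INSERT (CREATE, SET, GRANT, ...)
        # is HSQLDB-specific and is replaced by a hand-written MySQL schema.
        if not statement.upper().startswith("INSERT INTO "):
            continue
        if not statement.endswith(";"):
            statement += ";"
        yield statement


def split_evenly(statements: List[str], parts: int) -> List[List[str]]:
    """Split statements into `parts` contiguous chunks of near-equal size."""
    if parts < 1:
        raise ValueError("parts must be at least 1")
    size, extra = divmod(len(statements), parts)
    chunks: List[List[str]] = []
    start = 0
    for index in range(parts):
        # The first `extra` chunks each take one additional statement.
        end = start + size + (1 if index < extra else 0)
        chunks.append(statements[start:end])
        start = end
    return chunks


def convert(script_path: str, out_dir: str, parts: int = 5) -> List[Path]:
    """Write part_1.sql .. part_N.sql (hypothetical names) into out_dir."""
    # Note: all statements are held in memory before being split.
    with open(script_path, encoding="utf-8") as handle:
        statements = list(to_mysql_statements(handle))
    target_dir = Path(out_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for number, chunk in enumerate(split_evenly(statements, parts), start=1):
        target = target_dir / f"part_{number}.sql"
        # Each part selects the schema itself so it can run in its own session.
        body = "USE CTAKES_DATA;\n" + "\n".join(chunk) + "\n"
        target.write_text(body, encoding="utf-8")
        written.append(target)
    return written
```

Calling convert("sno_rx_16ab.script", "mysql_parts", parts=5) (both paths are placeholders) would produce five files, each of which can be fed to its own mysql command-line session, for example with mysql -u <user> -p < part_1.sql. If the INSERT lines contain HSQLDB-specific escapes or syntax, they would still need to be adjusted by hand or by an extra step in to_mysql_statements.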
