Pausing the CPE

2015-02-04 Thread Clayton Turner
Hi everybody,

I've been running approximately 1 million notes through the CPE pipeline of
the ctakes/ytex branch.

I'm around 700k notes through, but the VM in which I am running the
pipeline has its resources fully allocated to the pipeline. I'm trying to
run some data processing alongside it, which requires more heap
space for the JVM.

So my question is: does hitting the pause button on the CPE disrupt the
pipeline? That is, can I resume the pipeline after pausing it without losing
any information?

Clayton Turner


YTEX Exporting with Large Dataset

2014-10-29 Thread Clayton Turner
Hi everyone:

I'm doing some work with the ctakes-ytex branch of cTAKES. In the
past, I've been able to use the YTEX exporter (for going to a sparse matrix)
on datasets of about 300-400 notes. Now I have run my full dataset through
the pipeline and want to set up the exporter.

I'm getting a null pointer exception when using the big dataset, but no
error occurs if I use my old, smaller dataset, even though the export files
are nearly identical.

Are there file size limits that I am potentially hitting or is my error
likely something else?

Thanks,
Clayton Turner


Re: YTEX Exporting with Large Dataset

2014-10-29 Thread Clayton Turner
Ah, so apparently YTEX does not like me using a join inside the
InstanceClassQuery. This is inconvenient, but I can work around it.
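
For anyone who hits the same thing: the workaround I'm going with is to push
the join into a view so the InstanceClassQuery itself stays a single flat
select. A rough sketch only - my_docs and my_labels are placeholder names
here, and the exact columns the exporter expects should be copied from a
working export.xml:

create or replace view v_instance_class as
select d.document_id as instance_id, l.label as class
from my_docs d
inner join my_labels l on l.document_id = d.document_id;

The InstanceClassQuery can then be the join-free
select instance_id, class from v_instance_class.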


Clayton Turner
Graduate Research Assistant at The College of Charleston
Web Developer at Innovative Resource Management
Email: caturn...@g.cofc.edu
Phone: (843)-424-3784
Blog: claytonturner.blogspot.com
--
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson

On Wed, Oct 29, 2014 at 2:16 PM, Clayton Turner caturn...@g.cofc.edu
wrote:

 Hi everyone:

 So I'm doing some work with the ctakes-ytex branch of ctakes. So, in the
 past, I've been able to use the YTEX exporter (for going to sparsematrix)
 on datasets of about 300-400 notes. Now I have run my full dataset through
 the pipeline and want to set up the exporter.

 I'm getting a null pointer exception when using the big dataset, but no
 error occurs if I use my old, smaller dataset even though the export files
 are nearly identical.

 Are there file size limits that I am potentially hitting or is my error
 likely something else?

 Thanks,
 Clayton Turner



Re: YTEX Exporting with Large Dataset

2014-10-29 Thread Clayton Turner
So, YTEX does not like having a join inside the InstanceClassQuery. This is
inconvenient, but I can work around it.



On Wed, Oct 29, 2014 at 2:16 PM, Clayton Turner caturn...@g.cofc.edu
wrote:

 Hi everyone:

 So I'm doing some work with the ctakes-ytex branch of ctakes. So, in the
 past, I've been able to use the YTEX exporter (for going to sparsematrix)
 on datasets of about 300-400 notes. Now I have run my full dataset through
 the pipeline and want to set up the exporter.

 I'm getting a null pointer exception when using the big dataset, but no
 error occurs if I use my old, smaller dataset even though the export files
 are nearly identical.

 Are there file size limits that I am potentially hitting or is my error
 likely something else?

 Thanks,
 Clayton Turner



Re: YTEX Exporting with Large Dataset

2014-10-29 Thread Clayton Turner
Oops! Let me clarify in case someone else hits this or thinks I'm just
messing up really badly.

I had null values in my dataset and forgot to add a simple IS NOT NULL
filter - that explains a 'null pointer exception' alright.
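
In other words, the fix was just a null filter on the label column in that
query - a sketch only, with placeholder table and column names standing in
for my real ones:

select note.note_id as instance_id, note.label as class
from my_notes note
where note.label is not null;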




On Wed, Oct 29, 2014 at 2:39 PM, Clayton Turner caturn...@g.cofc.edu
wrote:

 So, YTEX does not like having a join inside the InstanceClassQuery. This
 is inconvenient, but I can work around it.



 On Wed, Oct 29, 2014 at 2:16 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

 Hi everyone:

 So I'm doing some work with the ctakes-ytex branch of ctakes. So, in the
 past, I've been able to use the YTEX exporter (for going to sparsematrix)
 on datasets of about 300-400 notes. Now I have run my full dataset through
 the pipeline and want to set up the exporter.

 I'm getting a null pointer exception when using the big dataset, but no
 error occurs if I use my old, smaller dataset even though the export files
 are nearly identical.

 Are there file size limits that I am potentially hitting or is my error
 likely something else?

 Thanks,
 Clayton Turner





Re: Change from SNOMEDCT to SNOMEDCT_US affecting v_snomed_fword_lookup

2014-08-21 Thread Clayton Turner
Awesome. This is just what I needed for the longest time.

I'm having a slight issue. When running either the ytex pipeline or ytex
version of the AggregatePlaintextUMLSProcessor I get an error during
initialization.

My DictionaryLookupAnnotator.xml is raising a
org.apache.uima.resource.ResourceInitializationException caused by:
java.lang.ClassNotFoundException:
edu.mayo.bmi.uima.lookup.ae.FirstTokenPermLookupInitializerImpl

I feel like I may have drifted away from what I need, though, because
before this the CPE was complaining about a missing LookupDesc_SNOMED.xml
file. I found a ytex version of this on a Google Code site somewhere and
pasted it where the CPE was looking for it. Now this error is coming up.

Could my problem be solved with just a re-run of the ant script (which I was
trying to avoid since it takes ages), or is it a different issue?


On Tue, Aug 19, 2014 at 12:58 PM, Tim O'Connell tim.oconn...@gmail.com
wrote:

 Hi John,

 I'm not sure what was going on with the @db.schema@ error, although I was
 getting it as well before with my prior build of 3.1.2 - I assume that
 you've fixed something (thank you!) to make this go away.  I rebuilt
 everything from scratch and it's working now.

 I think one other thing I had to change was that after I had finished the
 install/build, the cTakes version of LookupDesc_Db.xml doesn't work (in
 resources\org\apache\ctakes\dictionary\lookup) - I'm pretty sure I had to
 copy in an older version of the file from 3.1.1 to get the default cTakes
 AggregatePlaintextUMLSProcessor pipeline working, although please
 double-check that as my memory is a little foggy.

 But yes, here's what I have working since re-building:
 1. ytex-pipeline.xml
 2. ytex version of AggregatePlaintextUMLSProcessor.xml
 3. cTakes version of AggregatePlaintextUMLSProcessor.xml (with swapping the
 LookupDesc_Db.xml file as above)

 I've even made modifications to the ytex version of LookupDesc_SNOMED.xml
 to get it tagging Disease Disorders, along with database modifications to
 have it store these entries as well, which is working great.   Literally,
 everything is working perfectly now.

 Still so much for me to learn!  Let me know if you need any more details.

 All the best,
 Tim



 On Tue, Aug 19, 2014 at 4:31 AM, John Green john.travis.gr...@gmail.com
 wrote:

  I have not had time to implement this - to clarify out of curiosity, does
  this clear up the @db.schema@ error Tim? And did you successfully run
  ytex with the ctakes dictionary-lookup?
 
 
  JG
  —
  Sent from Mailbox for iPhone
 
  On Sat, Aug 16, 2014 at 2:53 AM, Tim O'Connell tim.oconn...@gmail.com
  wrote:
 
   Hi folks,
   I was having an issue with the current build (from svn) of ctakes/ytex
  not
   identifying any annotations as some folks on this board.  I traced it
 to
   the fact that the UMLS database has at sometime in the relatively
 recent
   past changed the SAB tag in the MRCONSO table for SNOMED terms from
   SNOMEDCT to SNOMEDCT_US.  I just had a newer version of UMLS that uses
   SNOMEDCT_US.  Thus when the install script tried to create the
   v_snomed_fword_lookup table, it wasn't finding any of the SNOMEDCT
 terms,
   thus nothing was getting annotated.
   The ytex install script was just looking for things in MRCONSO with the
   SNOMEDCT SAB tag when it created the ytex lookup table - so, by
 changing
   this to SNOMEDCT_US in the file
  
 
 CTAKES_HOME/bin/ctakes-ytex/scripts/data/mysql/umls/insert_view_template.sql
   it now works (for mysql users) to find the annotations. You can just
  re-run
   the ytex setup script, but that takes hours - instead, I just deleted
 all
   the data from the v_snomed_fword_lookup table and basically ran the sql
   command to repopulate the table and it worked fine. Here's the code,
 n.b.
   my schema name for my umls database is 'umls' - change the code below
 if
   yours is different.
    delete from v_snomed_fword_lookup;
    insert into v_snomed_fword_lookup (cui, tui, fword, fstem, tok_str, stem_str)
    select mrc.cui, t.tui, c.fword, c.fstem, c.tok_str, c.stem_str
    from umls_aui_fword c
    inner join umls.MRCONSO mrc on c.aui = mrc.aui and mrc.SAB in ('SNOMEDCT_US', 'RXNORM')
    inner join
    (
      select cui, min(tui) tui
      from umls.MRSTY sty
      where sty.tui in
      (
        'T019', 'T020', 'T037', 'T046', 'T047', 'T048', 'T049', 'T050',
        'T190', 'T191', 'T033', 'T184',
        'T017', 'T029', 'T023', 'T030', 'T031', 'T022', 'T025', 'T026',
        'T018', 'T021', 'T024',
        'T116', 'T195', 'T123', 'T122', 'T118', 'T103', 'T120', 'T104',
        'T200', 'T111', 'T196', 'T126', 'T131', 'T125', 'T129', 'T130',
        'T197', 'T119', 'T124', 'T114', 'T109', 'T115', 'T121', 'T192',
        'T110', 'T127',
        'T060', 'T065', 'T058', 'T059', 'T063', 'T062', 'T061',
        'T074', 'T075',
        'T059'
      )
      group by cui
    ) t on t.cui = mrc.cui;
   Hope it helps - cheers,
   Tim
 





Re: Change from SNOMEDCT to SNOMEDCT_US affecting v_snomed_fword_lookup

2014-08-21 Thread Clayton Turner
Ah, I just switched to the ytex branch and all is good now. The SNOMEDCT_US
issue has been plaguing me for weeks now, so thanks a million for that.
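
For anyone else who wants to confirm which SAB their UMLS load actually uses
before rebuilding anything, a quick check against the standard MRCONSO table
(schema name assumed to be 'umls', as in Tim's script below) is:

select SAB, count(*) from umls.MRCONSO where SAB like 'SNOMEDCT%' group by SAB;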


On Thu, Aug 21, 2014 at 2:13 PM, Clayton Turner caturn...@g.cofc.edu
wrote:

 Awesome. This is just what I needed for the longest time.

 I'm having a slight issue. When running either the ytex pipeline or ytex
 version of the AggregatePlaintextUMLSProcessor I get an error during
 initialization.

 My DictionaryLookupAnnotator.xml is raising a
 org.apache.uima.resource.ResourceInitializationException causedby:
 java.lang.ClassNotFoundException:
 edu.mayo.bmi.uima.lookup.ae.FirstTokenPermLookupInitializerImpl

 I feel like I may have drifted away from what I need, though, because
 before this the CPE was complaining about a lack of LookupDesc_SNOMED.xml
 file. I found a ytex version of this on a google code site somewhere and
 pasted it where the CPE was looking for it. Now this error is coming up.

 Could my problem be solved with just a re-run of the ant script (was just
 trying to avoid since it takes ages) or is it a different issue?


 On Tue, Aug 19, 2014 at 12:58 PM, Tim O'Connell tim.oconn...@gmail.com
 wrote:

 Hi John,

 I'm not sure what was going on with the @db.schema@ error, although I was
 getting it as well before with my prior build of 3.1.2 - I assume that
 you've fixed something (thank you!) to make this go away.  I rebuilt
 everything from scratch and it's working now.

 I think one other thing I had to change was that after I had finished the
 install/build, the cTakes version of LookupDesc_Db.xml doesn't work (in
 resources\org\apache\ctakes\dictionary\lookup) - I'm pretty sure I had to
 copy in an older version of the file from 3.1.1 to get the default cTakes
 AggregatePlaintextUMLSProcessor pipeline working, although please
 double-check that as my memory is a little foggy.

 But yes, here's what I have working since re-building:
 1. ytex-pipeline.xml
 2. ytex version of AggregatePlaintextUMLSProcessor.xml
 3. cTakes version of AggregatePlaintextUMLSProcessor.xml (with swapping
 the
 LookupDesc_Db.xml file as above)

 I've even made modifications to the ytex version of LookupDesc_SNOMED.xml
 to get it tagging Disease Disorders, along with database modifications to
 have it store these entries as well, which is working great.   Literally,
 everything is working perfectly now.

 Still so much for me to learn!  Let me know if you need any more details.

 All the best,
 Tim



 On Tue, Aug 19, 2014 at 4:31 AM, John Green john.travis.gr...@gmail.com
 wrote:

  I have not had time to implement this - to clarify out of curiosity,
 does
  this clear up the @db.schema@ error Tim? And did you successfully run
  ytex with the ctakes dictionary-lookup?
 
 
  JG
  —
  Sent from Mailbox for iPhone
 
  On Sat, Aug 16, 2014 at 2:53 AM, Tim O'Connell tim.oconn...@gmail.com
  wrote:
 
   Hi folks,
   I was having an issue with the current build (from svn) of ctakes/ytex
  not
   identifying any annotations as some folks on this board.  I traced it
 to
   the fact that the UMLS database has at sometime in the relatively
 recent
   past changed the SAB tag in the MRCONSO table for SNOMED terms from
   SNOMEDCT to SNOMEDCT_US.  I just had a newer version of UMLS that uses
   SNOMEDCT_US.  Thus when the install script tried to create the
   v_snomed_fword_lookup table, it wasn't finding any of the SNOMEDCT
 terms,
   thus nothing was getting annotated.
   The ytex install script was just looking for things in MRCONSO with
 the
   SNOMEDCT SAB tag when it created the ytex lookup table - so, by
 changing
   this to SNOMEDCT_US in the file
  
 
 CTAKES_HOME/bin/ctakes-ytex/scripts/data/mysql/umls/insert_view_template.sql
   it now works (for mysql users) to find the annotations. You can just
  re-run
   the ytex setup script, but that takes hours - instead, I just deleted
 all
   the data from the v_snomed_fword_lookup table and basically ran the
 sql
   command to repopulate the table and it worked fine. Here's the code,
 n.b.
   my schema name for my umls database is 'umls' - change the code below
 if
   yours is different.
   delete from v_snomed_fword_lookup;
   insert into v_snomed_fword_lookup (cui, tui, fword, fstem, tok_str,
   stem_str)
   select mrc.cui, t.tui, c.fword, c.fstem, c.tok_str, c.stem_str
   from umls_aui_fword c
   inner join umls.MRCONSO mrc on c.aui = mrc.aui and mrc.SAB in (
   'SNOMEDCT_US', 'RXNORM')
   inner join
   (
   select cui, min(tui) tui
   from umls.MRSTY sty
   where sty.tui in
   (
  'T019', 'T020', 'T037', 'T046', 'T047', 'T048', 'T049', 'T050',
'T190', 'T191', 'T033',
  'T184',
  'T017', 'T029', 'T023', 'T030', 'T031', 'T022', 'T025', 'T026',
  'T018', 'T021', 'T024',
   'T116', 'T195', 'T123', 'T122', 'T118', 'T103', 'T120', 'T104',
  'T200', 'T111', 'T196', 'T126', 'T131', 'T125', 'T129', 'T130',
  'T197', 'T119', 'T124', 'T114', 'T109', 'T115', 'T121', 'T192

Re: Change from SNOMEDCT to SNOMEDCT_US affecting v_snomed_fword_lookup

2014-08-21 Thread Clayton Turner
It didn't fix the @db.schema@ issue - I just went in and manually changed it
whenever the CPE complained. I assume that's supposed to be read from
ytex.properties, but mine was set and it still didn't resolve the @db.schema@
issue.


On Thu, Aug 21, 2014 at 5:00 PM, John Green john.travis.gr...@gmail.com
wrote:

 Clayton - this indeed did fix the @db.schema@ for you? Im gonna try and
 reproduce (havent had time yet) then ill close the Jira ticket out.


 JG
 —
 Sent from Mailbox for iPhone

 On Thu, Aug 21, 2014 at 1:24 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

  Ah, I just switched to the ytex branch and all is good now. The SNOMED_US
  issue has been plaguing me for weeks now so thanks a million for that.
  On Thu, Aug 21, 2014 at 2:13 PM, Clayton Turner caturn...@g.cofc.edu
  wrote:
  Awesome. This is just what I needed for the longest time.
 
  I'm having a slight issue. When running either the ytex pipeline or ytex
  version of the AggregatePlaintextUMLSProcessor I get an error during
  initialization.
 
  My DictionaryLookupAnnotator.xml is raising a
  org.apache.uima.resource.ResourceInitializationException causedby:
  java.lang.ClassNotFoundException:
  edu.mayo.bmi.uima.lookup.ae.FirstTokenPermLookupInitializerImpl
 
  I feel like I may have drifted away from what I need, though, because
  before this the CPE was complaining about a lack of
 LookupDesc_SNOMED.xml
  file. I found a ytex version of this on a google code site somewhere and
  pasted it where the CPE was looking for it. Now this error is coming up.
 
  Could my problem be solved with just a re-run of the ant script (was
 just
  trying to avoid since it takes ages) or is it a different issue?
 
 
  On Tue, Aug 19, 2014 at 12:58 PM, Tim O'Connell tim.oconn...@gmail.com
 
  wrote:
 
  Hi John,
 
  I'm not sure what was going on with the @db.schema@ error, although I
 was
  getting it as well before with my prior build of 3.1.2 - I assume that
  you've fixed something (thank you!) to make this go away.  I rebuilt
  everything from scratch and it's working now.
 
  I think one other thing I had to change was that after I had finished
 the
  install/build, the cTakes version of LookupDesc_Db.xml doesn't work (in
  resources\org\apache\ctakes\dictionary\lookup) - I'm pretty sure I had
 to
  copy in an older version of the file from 3.1.1 to get the default
 cTakes
  AggregatePlaintextUMLSProcessor pipeline working, although please
  double-check that as my memory is a little foggy.
 
  But yes, here's what I have working since re-building:
  1. ytex-pipeline.xml
  2. ytex version of AggregatePlaintextUMLSProcessor.xml
  3. cTakes version of AggregatePlaintextUMLSProcessor.xml (with swapping
  the
  LookupDesc_Db.xml file as above)
 
  I've even made modifications to the ytex version of
 LookupDesc_SNOMED.xml
  to get it tagging Disease Disorders, along with database modifications
 to
  have it store these entries as well, which is working great.
  Literally,
  everything is working perfectly now.
 
  Still so much for me to learn!  Let me know if you need any more
 details.
 
  All the best,
  Tim
 
 
 
  On Tue, Aug 19, 2014 at 4:31 AM, John Green 
 john.travis.gr...@gmail.com
  wrote:
 
   I have not had time to implement this - to clarify out of curiosity,
  does
   this clear up the @db.schema@ error Tim? And did you successfully
 run
   ytex with the ctakes dictionary-lookup?
  
  
   JG
   —
   Sent from Mailbox for iPhone
  
   On Sat, Aug 16, 2014 at 2:53 AM, Tim O'Connell 
 tim.oconn...@gmail.com
   wrote:
  
Hi folks,
I was having an issue with the current build (from svn) of
 ctakes/ytex
   not
identifying any annotations as some folks on this board.  I traced
 it
  to
the fact that the UMLS database has at sometime in the relatively
  recent
past changed the SAB tag in the MRCONSO table for SNOMED terms from
SNOMEDCT to SNOMEDCT_US.  I just had a newer version of UMLS that
 uses
SNOMEDCT_US.  Thus when the install script tried to create the
v_snomed_fword_lookup table, it wasn't finding any of the SNOMEDCT
  terms,
thus nothing was getting annotated.
The ytex install script was just looking for things in MRCONSO with
  the
SNOMEDCT SAB tag when it created the ytex lookup table - so, by
  changing
this to SNOMEDCT_US in the file
   
  
 
 CTAKES_HOME/bin/ctakes-ytex/scripts/data/mysql/umls/insert_view_template.sql
it now works (for mysql users) to find the annotations. You can
 just
   re-run
the ytex setup script, but that takes hours - instead, I just
 deleted
  all
the data from the v_snomed_fword_lookup table and basically ran the
  sql
command to repopulate the table and it worked fine. Here's the
 code,
  n.b.
my schema name for my umls database is 'umls' - change the code
 below
  if
yours is different.
delete from v_snomed_fword_lookup;
insert into v_snomed_fword_lookup (cui, tui, fword, fstem, tok_str,
stem_str)
select mrc.cui, t.tui

Re: v_snomed_fword_lookup view

2014-08-13 Thread Clayton Turner
Okay, I believe I have the cTAKES dictionary-lookup-fast module working now.
Something I'm curious about, though, is how you extract the data in order to
conduct analysis.

In the past, I've been using the SparseDataExporterImpl from ytex to create a
.arff file for use in Weka, but the cTAKES pipeline I'm using doesn't seem to
be compatible with this ytex export, as I'm not getting any CUIs in my .arff
file.

I'm using the AggregatePlaintextUMLSProcessor analysis engine from cTAKES and
then the DB consumer analysis engine from ytex (for storing results into the
database under an analysis batch).
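
For reference, the check I've been running to see whether any concepts made
it into the ytex database at all for my batch looks roughly like this - the
anno_ontology_concept / anno_base / document table names are from my install
and 'mybatch' is a placeholder, so adjust for your schema:

select count(*)
from anno_ontology_concept oc
inner join anno_base ab on ab.anno_base_id = oc.anno_base_id
inner join document d on d.document_id = ab.document_id
where d.analysis_batch = 'mybatch';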

Any tips for exporting or some simple issue I'm missing?

Thanks,
Clayton


On Mon, Aug 11, 2014 at 2:09 PM, Harpreet Khanduja hsk5...@rit.edu wrote:

 Yes, absolutely and
 no problem at all.

 Regards,
 Harpreet


 On Mon, Aug 11, 2014 at 1:16 PM, Finan, Sean 
 sean.fi...@childrens.harvard.edu wrote:

  Thanks Harpreet,
  That is definitely necessary to build!
 
  Those lines should already be in the pom, but commented out.  I think
 that
  some version/branching issues may have arisen at some point wrt this
 module
  ...
 
  If somebody beats me to it then cheers, otherwise I will try to check out
  tonight and get all the bits in place.
 
  Sean
 
   -Original Message-
   From: Harpreet Khanduja [mailto:hsk5...@rit.edu]
   Sent: Monday, August 11, 2014 1:12 PM
   To: dev@ctakes.apache.org
   Subject: Re: v_snomed_fword_lookup view
  
   Hello Clayton,
  I do not know about ytex, but I did switch from dictionary-lookup to
  dictionary-lookup-fast.
  I updated my ctakes-dictionary-lookup-fast project using Maven.
  I think I used Team > Update and switched to the latest revision available,
  and then I downloaded the new 3.2 resources for umls. I then added these
  resources to my ctakes-dictionary-lookup-fast resources folder and also to
  the classpath in ctakes-clinical-pipeline.
  
   Then I changed the pom.xml file which belongs to the whole ctakes
   project and added these two dependencies to the file:

    <dependency>
      <groupId>org.apache.ctakes</groupId>
      <artifactId>ctakes-dictionary-lookup-res</artifactId>
      <version>${ctakes.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.ctakes</groupId>
      <artifactId>ctakes-dictionary-lookup-fast</artifactId>
      <version>${ctakes.version}</version>
    </dependency>
  
  
    After this, I also added the dependency

    <dependency>
      <groupId>org.apache.ctakes</groupId>
      <artifactId>ctakes-dictionary-lookup-fast</artifactId>
    </dependency>

    to the pom.xml of ctakes-clinical-pipeline.
  
   And then add the resources folder in ctakes-clinical-pipeline using
  build path
   configuration under add class option.
  
   After this it should work.
  
  
   Regards,
   Harpreet
  
  
  
  
  
  
   On Mon, Aug 11, 2014 at 12:44 PM, Clayton Turner caturn...@g.cofc.edu
 
   wrote:
  
I still get the same error with the ctakes3.2 branch. Any
 suggestions?
   
   
On Mon, Aug 11, 2014 at 12:06 PM, Clayton Turner
caturn...@g.cofc.edu
wrote:
   
 I'm going to do a clean install through the repo rather than the
 binaries and see if that fixes my issue because I think I just read
 a past post saying the lookup2 folders exist there.


 On Mon, Aug 11, 2014 at 11:52 AM, Clayton Turner
 caturn...@g.cofc.edu
 wrote:

 When navigating to
 ctakes-dictionary-lookup-fast\desc\analysis_engine
 there are 2 files, assumedly analysis engines.

 SnomedLookupAnnotator.xml and SnomedOvLookupAnnotator.xml

 If I pick either, I put in my UMLS information but receive an
 error
 when trying to run the CPE:

  Initialization of CAS Processor with name SnomedOvLookupAnnotator failed.
  CausedBy: org.apache.uima.resource.ResourceConfigurationException:
  Initialization of CAS processor with name SnomedOvLookupAnnotator failed.
  CausedBy: org.apache.uima.resource.ResourceInitializationException:
  Error initializing org.apache.uima.resource.impl.DataResource_impl from
  descriptor file:..SnomedLookupAnnotator.xml
  CausedBy: org.apache.uima.resource.ResourceInitializationException:
  Could not access the resource data at
  file:org\apache\ctakes\dictionary\lookup2\Snomed2011ab_ctakesTui\cTakesSnomed.xml

  Now, I don't even have a lookup2 folder and, subsequently, the Tui
  folder and cTakesSnomed.xml file. This seems to be the problem, but I'm
  not sure where these files are supposed to be grabbed from.


 On Mon, Aug 11, 2014 at 11:47 AM, Clayton Turner
 caturn...@g.cofc.edu
 wrote:

  Hi again:

  How exactly do you switch to using the cTakes dictionary-lookup-fast?
  Do I need to go in and alter xml files or is it as simple as adding a
  certain item to the list of analysis engines?


 On Fri

v_snomed_fword_lookup view

2014-08-08 Thread Clayton Turner
Hi Everyone:

I have a question about how the v_snomed_fword_lookup view works when
running the CPE.

My understanding is that the view is composed of the ytex.umls_aui_fword
table, the umls.mrconso table, and bits and pieces from other umls tables.

I feel like this is not completely correct, or my idea of how the join to
create the view works is off. For example, let's say I want the CPE to find
"malar" (e.g. "malar rash") as a concept in the annotations. It never
happens after running my CPE descriptor, and I cannot find it in my
v_snomed_fword_lookup view.

select count(*) from umls_aui_fword where fword='malar'; yields 34 results

select count(*) from umls.mrconso where str='malar'; yields 3 results.

So clearly these two tables know what the CUI and context(s) are for
"malar". Yet, whenever I run a gold standard set of notes through the CPE,
"malar" is constantly flagged as just a word token and the concept is never
grabbed. This happens for lots of other concepts as well; I just
wanted to use an example to illustrate my issue.
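
For what it's worth, the kind of check I've been running to see which source
vocabularies actually back the term in my install (assuming the aui join that
the v_snomed_fword_lookup view itself uses) is:

select mrc.SAB, count(*)
from umls_aui_fword c
inner join umls.MRCONSO mrc on c.aui = mrc.aui
where c.fword = 'malar'
group by mrc.SAB;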

Some troubleshooting I already went through:
1) Reinstalled ytex and umls database objects
2) Reinstalled a second time after redownloading umls through
metamorphosys, ensuring that snomed vocabularies were included (I also
checked file sizes and noticed a big difference, so I know those
vocabularies ARE included)

Anyone got any ideas as to what the issue could be?

Thank you,
Clayton Turner


Exporting YTEX Pipeline

2014-07-30 Thread Clayton Turner
Hi, I'm trying to export the data I get from running the pipeline through
the Collection Processing Engine.

I set up the pipeline with a directory to which all the XML output is
written, but I am having issues at this point.

I've tried using the built-in Exporter from the Data Mining section on this
page https://cwiki.apache.org/confluence/display/CTAKES/User%27s+Guide but
those notes are out of date. Even altering directories to match the files
still gives me errors about not being able to find the ExporterImpl class.
The class version of this file only exists outside of the target directory
for the ctakes snapshot, and attempting to use it still fails.

I then ventured to here:
https://code.google.com/p/ytex/source/browse/#svn%2Ftrunk%2Fworkspace%2Fexamples%2Ffracture

The files here match up to the data mining section from the previous link -
so I created my export.xml file and changed everything that needed to be
changed for my example (I even tried to run the bone fracture example), but I
cannot get data exported, no matter what I do.

Is there a way to use some new(er) implementation of the
SparseDataExporterImpl class or is there an alternative for extracting data
for use with weka?

I've messaged about this in the past but I don't believe I was thorough
enough with my issues.

Thanks in advance,
Clayton


Re: Exporting YTEX Pipeline

2014-07-30 Thread Clayton Turner
Awesome!!

It worked!

The only things I had to change (since I'm on Windows) were flipping the
slashes where necessary and removing the first slash when specifying
-Dlog4j.configuration=file:/...

Thank you so much for putting up with my issues

-Clayton


On Wed, Jul 30, 2014 at 2:48 PM, vijay garla vnga...@gmail.com wrote:

 Can you try this:
 copy

 https://code.google.com/p/ytex/source/browse/trunk/workspace/examples/fracture/cui/export.template.xml
 to CTAKES_HOME\desc\ctakes-ytex\fracture\cui.xml
 replace %DB_SCHEMA% with your database schema name (value of db.schema in
 your ytex.properties file)

 Then from a command prompt, execute the following commands:
 cd CTAKES_HOME
 bin\setenv.bat
 java -cp %CLASSPATH%
 -Dlog4j.configuration=file:/%CTAKES_HOME%/config/log4j.xml -Xmx256m
 org.apache.ctakes.ytex.kernel.SparseDataExporterImpl -prop
 desc\ctakes-ytex\fracture\cui.xml -type weka

 Tell me if you run into any issues.

 I will add this to the ctakes confluence doc.

 Best,

 VJ


 On Wed, Jul 30, 2014 at 5:11 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

  Hi, I'm trying to export the data I get from running the pipeline through
  the Collection Processing Engine.
 
  I set up the pipeline where I have a directory where all the XML is
 output
  to, but I am having issues at this point.
 
  I've tried using the built in Exporter from the Data Mining section on
 this
  page https://cwiki.apache.org/confluence/display/CTAKES/User%27s+Guide
 but
  those notes are out of date. Even altering directories to match the files
  still gives me errors about not being able to find the ExporterImpl
 class.
  The class version of this file only exists outside of the target
 directory
  for the ctakes snapshot and attempting to use it still fails.
 
  I then ventured to here:
 
 
 https://code.google.com/p/ytex/source/browse/#svn%2Ftrunk%2Fworkspace%2Fexamples%2Ffracture
 
  The files here match up to the data mining section from the previous
 link -
  so I created my export.xml file and changed everything that needed to be
  changed for my example (tried to even run bone fracture), but I cannot
 get
  data exported, no matter what I do.
 
  Is there a way to use some new(er) implementation of the
  SparseDataExporterImpl class or is there an alternative for extracting
 data
  for use with weka?
 
  I've messaged about this in the past but I don't believe I was thorough
  enough with my issues.
 
  Thanks in advance,
  Clayton
 




-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson


cTAKES CPE MySQL Exception

2014-07-24 Thread Clayton Turner
Hi, everyone.

First off, I'd like to say awesome and thank you for the cTAKES 3.2
release and information. I've been following those pages and it's been
really helpful for helping me move along in my own progress. Really cool
stuff.

So I'm using the Collection Processing Engine (with ytex and umls) and I'm
trying to process ~1 million notes (as opposed to the about 30 in the given
demo).

I've tried this the past 2 days and when I come back in to check the
progress I see that I've received an error about 14000 notes into the
process:

org.apache.uima.analysis_engine.AnalysisEngineProcessException: Annotator
processing failed.
CausedBy: org.springframework.transaction.CannotCreateTransactionException:
Could not open Hibernate Session for transaction; nested exception is
com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: The last packet
successfully received from the server was 53,888,249 milliseconds ago. The
last packet sent successfully to the server was 53,888,249 milliseconds
ago. is longer than the server configured value of 'wait_timeout'. You
should consider either expiring and/or testing connection validity before
use in your application, increasing the server configured values for client
timeouts, or using the Connector/J connection property 'autoReconnect=true'
to avoid this problem.

So, in my own debugging, I have ensured that autoReconnect=true was set (it
always has been).

I looked at my CPE output in the command prompt and noticed a
PacketTooBigException, so I increased the maximum packet size
(max_allowed_packet) to 1G (the maximum the server allows).

I increased the time allowed for timeouts.
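
Concretely, the server-side values I checked and raised look like this - the
numbers are examples only, and SET GLOBAL only affects connections opened
after the change:

show variables like 'wait_timeout';
show variables like 'max_allowed_packet';

set global wait_timeout = 86400;            -- e.g. 24 hours
set global max_allowed_packet = 1073741824; -- 1G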

I'm really unsure of what to do here. Should I find a way to see if there
is a problematic note that is giving me issues (though I can't understand
how 1 note would make a packet too large)? Should I try to do some
horizontal sharding and break the problem into smaller chunks (though I
would think this program could handle large datasets since it's using a
query language)? I'm just at a loss with this error, especially since it
takes so long to actually spit the error out at me.

Thanks in advance everyone,
Clayton

-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson


ytex examples

2014-07-24 Thread Clayton Turner
I've been following the usage component guide for ctakes 3.2 and ytex, but
I'm having an issue.

I get to the point where I want to export my data as a bag of words (or
cuis), but the documentation on the wiki seems to be really out of date
when it comes to the exporting for data mining step.

The YTEX home directory no longer seems to exist, and there's no
fracture example directory with a cui/word folder for the examples anymore.

Is there an updated version of this documentation in the works or can
someone just give me pointers on how to execute the command over the
command prompt (Windows)?

Thank you,
Clayton


Re: ytex examples

2014-07-24 Thread Clayton Turner
Hi,

Alright, I had planned on using Weka, but it might not be a bad idea to just
jump in with either R or Python.
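
For the record, the kind of query I'd feed to R or Python to build the sparse
matrix directly, as VJ suggests below, is a (document, cui, count) triple
list along these lines - the ytex table and column names here are from my
install, so treat them as placeholders:

select d.document_id, oc.code as cui, count(*) as freq
from document d
inner join anno_base ab on ab.document_id = d.document_id
inner join anno_ontology_concept oc on oc.anno_base_id = ab.anno_base_id
where d.analysis_batch = 'mybatch'
group by d.document_id, oc.code;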

I'll check out that link.

Thanks!


On Thu, Jul 24, 2014 at 2:11 PM, vijay garla vnga...@gmail.com wrote:

 Hi Clayton,

 Haven't gotten around to upgrading the docs
 look here for examples:

 https://code.google.com/p/ytex/source/browse/#svn%2Ftrunk%2Fworkspace%2Fexamples%2Ffracture

 If you are using R/Matlab/Python it is easy to generate a sparse matrix
 directly via database queries, I can give you a few examples

 Best,

 VJ


 On Thu, Jul 24, 2014 at 8:02 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

  I've been following the usage component guide for ctakes 3.2 and ytex,
 but
  I'm having an issue.
 
  I get to the point where I want to export my data as a bag of words (or
  cuis), but the documentation on the wiki seems to be really out of date
  when it comes to the exporting for data mining step.
 
  The YTEX home directory doesn't seem to actually be a thing and there's
 no
  fracture example directory with a cui/word folder for the examples
 anymore.
 
  Is there an updated version of this documentation in the works or can
  someone just give me pointers on how to execute the command over the
  command prompt (Windows)?
 
  Thank you,
  Clayton
 




-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson


Re: cTAKES 3.2 Analysis Batch Issue

2014-07-08 Thread Clayton Turner
I don't see a log file when running the CPE. When running the CVD I have
access to a log file within the GUI, but that does not seem to be present
here. Is there a specific place where this log file is saved?
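
In the meantime, the quick sanity check I'm using to see whether anything was
stored at all is to query the v_document view directly - assuming it exposes
an analysis_batch column, as it does in my install:

select analysis_batch, count(*) as docs
from v_document
group by analysis_batch;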


On Tue, Jul 8, 2014 at 3:14 AM, vijay garla vnga...@gmail.com wrote:

 Hi Clayton,

 The screenshot is not coming through via the newsgroup emails.  can you
 attach the log file?

 vj


 On Mon, Jul 7, 2014 at 5:38 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

  Any update on this issue? I have this problem even if I don't use the
 ytex
  version of the aggregate text processor (UMLS-independent as well).
 
 
  On Thu, Jul 3, 2014 at 2:33 PM, Clayton Turner caturn...@g.cofc.edu
  wrote:
 
  Yes, I am running the fracture_demo.xml cpe.
 
  There is no option for the analysis batch (that's the main issue). I
 also
  get no response in my MySQL database (umls installed - not sure if that
 can
  be related).
 
  Here's a screenshot of my CPE (using ytex):
  [image: Inline image 1]
 
 
 
 
  On Wed, Jul 2, 2014 at 10:48 PM, vijay garla vnga...@gmail.com wrote:
 
  Hi clayton,
 
  I assume you are running the fracture_demo.xml cpe - is that correct?
   The CPE GUI should give you the option to set the analysis batch. (see
  attached screenshot).  That being said, the analysis_batch is not
 required
  (it will default to the current date).  Can you attach the log file?
 
  -vj
 
  [image: Inline image 1]
 
 
  On Wed, Jul 2, 2014 at 12:22 PM, Clayton Turner caturn...@g.cofc.edu
  wrote:
 
  Hi, I'm a relatively new user of cTAKES.
 
  I recently cloned cTAKES from the repository and I am using UMLS
  installed
  in my mysql database. I have recently noticed an issue, though. When
  conducting the bone fracture demo, In the CPE, I use the
  DBCollectionReader
  and Analysis Engine from the ctakes-ytex-uima directory within my
  CTAKES_HOME.
 
  I can get this to run successfully, but I am not able to specify an
  analysis batch in the CPE. Because of this, my ytex database is not
  being
  updated with results of the CPE run (in the v_document tables). Any
  ideas
  why the analysis batch field is missing?
 
  Side question: Any update on when cTAKES 3.2 will be officially
  released? I
  see we're past the expected release and was curious on how long it
  will
  be until it will officially come out.
 
  Thanks a lot,
  --
  Clayton Turner
 
 
 
 
 
  --
  --
  Clayton Turner
  email: caturn...@g.cofc.edu
  phone: (843)-424-3784
  web: claytonturner.blogspot.com
 
 
 -
  “When scientifically investigating the natural world, the only thing
  worse than a blind believer is a seeing denier.”
  - Neil deGrasse Tyson
 
 
 
 
  --
  --
  Clayton Turner
  email: caturn...@g.cofc.edu
  phone: (843)-424-3784
  web: claytonturner.blogspot.com
 
 
 -
  “When scientifically investigating the natural world, the only thing
 worse
  than a blind believer is a seeing denier.”
  - Neil deGrasse Tyson
 




-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson


Re: cTAKES 3.2 Analysis Batch Issue

2014-07-07 Thread Clayton Turner
Any update on this issue? I have this problem even if I don't use the ytex
version of the aggregate text processor (UMLS-independent as well).


On Thu, Jul 3, 2014 at 2:33 PM, Clayton Turner caturn...@g.cofc.edu wrote:

 Yes, I am running the fracture_demo.xml cpe.

 There is no option for the analysis batch (that's the main issue). I also
 get no response in my MySQL database (umls installed - not sure if that can
 be related).

 Here's a screenshot of my CPE (using ytex):
 [image: Inline image 1]




 On Wed, Jul 2, 2014 at 10:48 PM, vijay garla vnga...@gmail.com wrote:

 Hi clayton,

 I assume you are running the fracture_demo.xml cpe - is that correct?
  The CPE GUI should give you the option to set the analysis batch. (see
 attached screenshot).  That being said, the analysis_batch is not required
 (it will default to the current date).  Can you attach the log file?

 -vj

 [image: Inline image 1]


 On Wed, Jul 2, 2014 at 12:22 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

 Hi, I'm a relatively new user of cTAKES.

 I recently cloned cTAKES from the repository and I am using UMLS
 installed
 in my mysql database. I have recently noticed an issue, though. When
 conducting the bone fracture demo, In the CPE, I use the
 DBCollectionReader
 and Analysis Engine from the ctakes-ytex-uima directory within my
 CTAKES_HOME.

 I can get this to run successfully, but I am not able to specify an
 analysis batch in the CPE. Because of this, my ytex database is not
 being
 updated with results of the CPE run (in the v_document tables). Any ideas
 why the analysis batch field is missing?

 Side question: Any update on when cTAKES 3.2 will be officially
 released? I
 see we're past the expected release and was curious on how long it will
 be until it will officially come out.

 Thanks a lot,
 --
 Clayton Turner





 --
 --
 Clayton Turner
 email: caturn...@g.cofc.edu
 phone: (843)-424-3784
 web: claytonturner.blogspot.com

 -
 “When scientifically investigating the natural world, the only thing worse
 than a blind believer is a seeing denier.”
 - Neil deGrasse Tyson




-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson


Re: cTAKES 3.2 Analysis Batch Issue

2014-07-03 Thread Clayton Turner
Yes, I am running the fracture_demo.xml CPE.

There is no option for the analysis batch (that's the main issue). I also
see nothing written to my MySQL database (umls installed - not sure if that
can be related).

Here's a screenshot of my CPE (using ytex):
[image: Inline image 1]




On Wed, Jul 2, 2014 at 10:48 PM, vijay garla vnga...@gmail.com wrote:

 Hi clayton,

 I assume you are running the fracture_demo.xml cpe - is that correct?  The
 CPE GUI should give you the option to set the analysis batch. (see attached
 screenshot).  That being said, the analysis_batch is not required (it will
 default to the current date).  Can you attach the log file?

 -vj

 [image: Inline image 1]


 On Wed, Jul 2, 2014 at 12:22 PM, Clayton Turner caturn...@g.cofc.edu
 wrote:

 Hi, I'm a relatively new user of cTAKES.

 I recently cloned cTAKES from the repository and I am using UMLS installed
 in my mysql database. I have recently noticed an issue, though. When
 conducting the bone fracture demo, In the CPE, I use the
 DBCollectionReader
 and Analysis Engine from the ctakes-ytex-uima directory within my
 CTAKES_HOME.

 I can get this to run successfully, but I am not able to specify an
 analysis batch in the CPE. Because of this, my ytex database is not
 being
 updated with results of the CPE run (in the v_document tables). Any ideas
 why the analysis batch field is missing?

 Side question: Any update on when cTAKES 3.2 will be officially released?
 I
 see we're past the expected release and was curious on how long it will
 be until it will officially come out.

 Thanks a lot,
 --
 Clayton Turner





-- 
--
Clayton Turner
email: caturn...@g.cofc.edu
phone: (843)-424-3784
web: claytonturner.blogspot.com
-
“When scientifically investigating the natural world, the only thing worse
than a blind believer is a seeing denier.”
- Neil deGrasse Tyson