Role of white-box logic/models in cTAKES

2015-08-05 Thread Peter Klügl
Hi,

my (uninformed) view on cTAKES was that it is mainly based on black-box
machine learning models.

There were some mentions of rule-based approaches on the mailing list,
and a quick look at the source code revealed some functionality that is
based on FSMs and regular expressions (and the grey area of rule logic
implemented in plain Java).
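
To make "white-box" concrete, here is a minimal sketch of a regular-expression rule written in plain Java. It is not taken from the cTAKES source; the class name DoseRule, the method find, and the unit list are hypothetical and chosen only for illustration. The point is that the whole decision logic is readable in one pattern, unlike a trained model.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical example class, not part of cTAKES.
class DoseRule {

    // Assumed rule: a number (optionally with a decimal part), optional
    // whitespace, then one of a small, illustrative set of units.
    private static final Pattern DOSE = Pattern.compile(
            "\\b\\d+(?:\\.\\d+)?\\s*(?:mg|mcg|g|ml)\\b",
            Pattern.CASE_INSENSITIVE);

    // Returns every substring of the text that the rule matches, in order.
    static List<String> find(String text) {
        List<String> hits = new ArrayList<>();
        Matcher m = DOSE.matcher(text);
        while (m.find()) {
            hits.add(m.group());
        }
        return hits;
    }

    public static void main(String[] args) {
        // Prints one matched dose mention per line.
        for (String hit : find("Metoprolol 12.5 mg and 5 mL saline")) {
            System.out.println(hit);
        }
    }
}
```

A rule like this is easy to inspect and adjust by hand, which is the trade-off against a learned model that the question is about.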

I'm just curious. Is this code actively used in cTAKES and is there a
general position of the cTAKES community on rules-based/white-box
approaches?

Best,

Peter


RE: cTAKES GUI for I2B2

2015-08-05 Thread Hari, Sekhar
Hello Timothy -

I have posted the screenshots here:

https://drive.google.com/file/d/0B4sR85qs377yTThzWHM4YXlxOFE/view?usp=sharing

Kindly advise as soon as possible.

Many thanks,
Sekhar H.

-Original Message-
From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu] 
Sent: Tuesday, August 04, 2015 4:14 PM
To: dev@ctakes.apache.org
Subject: RE: cTAKES GUI for I2B2

Can you post the screenshot somewhere it might be linked to? I don't know if we 
can post image attachments to the dev list.
Thanks
Tim


From: Hari, Sekhar [sekhar.h...@cgi.com]
Sent: Monday, August 03, 2015 10:35 PM
To: chen...@apache.org; dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: RE: cTAKES GUI for I2B2

Hello there -

Please, can somebody advise me on my question below?

Thanks,
Sekhar H.

From: Hari, Sekhar
Sent: 31 July 2015 15:02
To: chen...@apache.org
Subject: cTAKES GUI for I2B2

Hello Pei -

Can you please assist me? I am running a few experiments with I2B2. I need to
use cTAKES to read clinical notes and extract the key terminology from them so
that it can be inserted into I2B2. I found the cTAKES GUI for I2B2 and
installed it. When I try to read a sample clinical note, the pipelines appear
to run, but I don't see any useful output in the Results section of the GUI. I
have included a screenshot below. The Results page says language:
x-unspecified. I am not sure what is going wrong. Am I doing something wrong or
missing some configuration? I have also included the NLP processors that I am
using to do this read and extract; you can see this screenshot at the bottom of
this email. I hope you can help.

Many thanks,
Sekhar H.






Re: Role of white-box logic/models in cTAKES

2015-08-05 Thread Pei Chen
Peter,
Good to hear from you again!
Yes, I believe there are some regex and rule-based annotators that
are in use (and probably will be in the future, for as long as they
outperform other methods for certain tasks).
I don't think there is a specific position from the community on this
approach. (ASF's 'do-ocracy')
Were you thinking of writing some Annotators in Ruta?

--Pei



Re: cTAKES GUI for I2B2

2015-08-05 Thread Pei Chen
Sekhar,
That application was done as a prototype/POC many years ago and hasn't
been actively maintained (hence in sandbox).
It seems from your screenshot that you have it up and running though.
Would you mind attaching the log files as well?

--Pei





how to run i2b2 data

2015-08-05 Thread Justin Zhang
Hello everyone,

I am running cTAKES with i2b2 data:
https://www.i2b2.org/NLP/DataSets/Main.php

Each XML file contains multiple patient records. I am able to split each
patient into a separate file and process the files with runCPE.sh.

Is there a way to convert the single XML file into a format that cTAKES
accepts, process it as a single input file, and generate a single output
file (with results labelled by patient ID)? For example, each patient ID
has a smoking status.

Thanks,

-- 
Justin


RE: how to run i2b2 data

2015-08-05 Thread Finan, Sean
Hi Justin,

A shot in the dark:
You could create a collection reader that works similarly to
org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader, but instead of
grabbing all of the files in a directory it grabs all the records parsed from a
single .xml file and runs a pipeline per record. Basically, swap a directory
for an .xml file, and a text file for an XML element containing a record.
Somebody out there might already have something that does this.
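
A rough sketch of the record-splitting half of that idea, using only the JDK's DOM parser. It is not a UIMA collection reader and not cTAKES code: the class name RecordSplitter is hypothetical, and the element and attribute names (RECORD, ID, TEXT) are assumptions about the i2b2 file layout that you would need to check against your actual data. A real collection reader would hand each returned text to the pipeline as one document and carry the ID along so the output can be labelled per patient.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of cTAKES.
class RecordSplitter {

    // Assumed layout (verify against your files):
    //   <ROOT><RECORD ID="..."><TEXT>note text</TEXT></RECORD>...</ROOT>
    // Returns record ID -> note text, in document order.
    static Map<String, String> split(String xml) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        NodeList records = doc.getElementsByTagName("RECORD");
        Map<String, String> byId = new LinkedHashMap<>();
        for (int i = 0; i < records.getLength(); i++) {
            Element record = (Element) records.item(i);
            NodeList texts = record.getElementsByTagName("TEXT");
            // A record without a TEXT child is kept with an empty note.
            String text = texts.getLength() > 0
                    ? texts.item(0).getTextContent().trim()
                    : "";
            byId.put(record.getAttribute("ID"), text);
        }
        return byId;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<ROOT>"
                + "<RECORD ID=\"1\"><TEXT>Patient smokes.</TEXT></RECORD>"
                + "<RECORD ID=\"2\"><TEXT>Never smoked.</TEXT></RECORD>"
                + "</ROOT>";
        // One line per record: ID, a tab, then the note text.
        for (Map.Entry<String, String> e : split(xml).entrySet()) {
            System.out.println(e.getKey() + "\t" + e.getValue());
        }
    }
}
```

Keying the results by record ID is what makes a single combined output file possible: each pipeline result can be written out next to the ID it came from.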

Sean
