Role of white-box logic/models in cTAKES
Hi, my (uninformed) view on cTAKES was that it is mainly based on black-box machine learning models. There were some mentions of rule-based approaches on the mailing list, and a quick look at the source code revealed some functionality that is based on FSMs and regular expressions (and the grey area of rule logic implemented in plain Java). I'm just curious: is this code actively used in cTAKES, and is there a general position of the cTAKES community on rule-based/white-box approaches? Best, Peter
RE: cTAKES GUI for I2B2
Hello Timothy - I have posted the screenshots here: https://drive.google.com/file/d/0B4sR85qs377yTThzWHM4YXlxOFE/view?usp=sharing Kindly advise as soon as possible. Many thanks, Sekhar H.
-----Original Message-----
From: Miller, Timothy [mailto:timothy.mil...@childrens.harvard.edu]
Sent: Tuesday, August 04, 2015 4:14 PM
To: dev@ctakes.apache.org
Subject: RE: cTAKES GUI for I2B2
Can you post the screenshot somewhere it might be linked to? I don't know if we can post image attachments to the dev list. Thanks, Tim
From: Hari, Sekhar [sekhar.h...@cgi.com]
Sent: Monday, August 03, 2015 10:35 PM
To: chen...@apache.org; dev@ctakes.apache.org; u...@ctakes.apache.org
Subject: RE: cTAKES GUI for I2B2
Hello there - Please, can somebody advise me on my question below? Thanks, Sekhar H.
From: Hari, Sekhar
Sent: 31 July 2015 15:02
To: chen...@apache.org
Subject: cTAKES GUI for I2B2
Hello Pei - Can you please assist me? I am doing a few experiments using I2B2. I need to use cTAKES to read clinical notes and extract the key terminology from them so that it can be inserted into I2B2. I found the cTAKES GUI for I2B2 and installed it. When I try to read a sample clinical note, the pipelines seem to run, but I don't see any useful output in the Results section of the GUI. I have included a screenshot below. The Results page says language: x-unspecified. I am not sure what is going wrong. Am I doing anything wrong or missing any configuration? Also, I have included the NLP processors that I am using to do this read and extract. You can see this screenshot at the bottom of this email. Hope you can help. Many thanks, Sekhar H.
Re: Role of white-box logic/models in cTAKES
Peter, Good to hear from you again! Yes, I believe there are some regex- and rule-based annotators in use (and they will probably stay in use for as long as they outperform other methods for certain tasks). I don't think there is a specific position from the community on this approach (ASF's 'do-ocracy'). Were you thinking of writing some annotators in Ruta? --Pei (In reply to Peter Klügl's message of Wed, Aug 5, 2015, 3:47 AM.)
Re: cTAKES GUI for I2B2
Sekhar, That application was done as a prototype/POC many years ago and hasn't been actively maintained (hence in sandbox). It seems from your screenshot that you have it up and running though. Would you mind attaching the log files as well? --Pei (In reply to Sekhar Hari's message of Wed, Aug 5, 2015, 4:41 AM.)
how to run i2b2 data
Hello everyone, I am running cTAKES with i2b2 data (https://www.i2b2.org/NLP/DataSets/Main.php). Each XML file contains multiple patient records. I am able to separate each patient into a single file and process the files with runCPE.sh. Is there a way to convert this single XML file into a format cTAKES accepts, process it as a single input file, and generate a single output file (with results labelled by patient id)? For example, each patient id has a smoking status. Thanks, -- Justin
RE: how to run i2b2 data
Hi Justin, A shot in the dark: You could create a collection reader that works similarly to org.apache.ctakes.core.cr.FilesInDirectoryCollectionReader, but instead of grabbing all of the files in a directory it grabs all of the records parsed from a single .xml and runs a pipeline per record. Basically, swap a directory for an .xml, and a text file for an XML element containing a record. Somebody out there might have something that already does as much. Sean (In reply to Justin Zhang's message of Wednesday, August 05, 2015, 6:40 PM, sent to u...@ctakes.apache.org and dev@ctakes.apache.org.)
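A minimal sketch of the record-splitting step suggested above, in plain Java with only the JDK's DOM parser. It is not cTAKES code and does not show the UIMA collection reader wiring. The class name I2b2RecordSplitter is hypothetical, and the element and attribute names (RECORD, ID, TEXT) are assumptions about the layout of the i2b2 file; adjust them to match the actual data.

```java
import java.io.IOException;
import java.io.InputStream;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

/** Hypothetical helper: splits one multi-record XML file into per-patient notes. */
final class I2b2RecordSplitter {

    // Assumed names; check them against the real i2b2 file before use.
    static final String RECORD_TAG = "RECORD";
    static final String ID_ATTRIBUTE = "ID";
    static final String TEXT_TAG = "TEXT";

    /** Returns a map from record id to note text, in document order. */
    static Map<String, String> splitRecords(InputStream xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse the record file", e);
        }

        Map<String, String> records = new LinkedHashMap<>();
        NodeList recordNodes = doc.getElementsByTagName(RECORD_TAG);
        for (int i = 0; i < recordNodes.getLength(); i++) {
            Element record = (Element) recordNodes.item(i);
            String id = record.getAttribute(ID_ATTRIBUTE);

            // Use the first text element of the record, or an empty note if none exists.
            NodeList textNodes = record.getElementsByTagName(TEXT_TAG);
            String text = textNodes.getLength() > 0
                    ? textNodes.item(0).getTextContent().trim()
                    : "";
            records.put(id, text);
        }
        return records;
    }
}
```

A custom collection reader could call splitRecords once at initialization and then hand out one map entry per document, keeping the record id alongside the text so that the output can be labelled by patient id. Loading the whole file with DOM is the simplest choice; a streaming parser may be preferable for very large files.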