You can use the MITRE MIST tool for de-identification. It can be re-trained, 
among other things. You have to run it as a pre-processor, independently of 
cTAKES, and then use its output as the input to cTAKES.
http://mist-deid.sourceforge.net/
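
A minimal sketch of that two-stage arrangement, in Python. The command lists 
are placeholders: the actual MIST and cTAKES command lines are not shown here 
and should be taken from each tool's own documentation. The function names and 
the "{in}" / "{out}" placeholder convention are assumptions made up for this 
illustration.

```python
import subprocess
import sys
from pathlib import Path


def run_stage(command, input_dir, output_dir):
    """Run one pipeline stage as an external process.

    `command` is a list of arguments; the literal strings "{in}" and
    "{out}" inside any argument are replaced by the two directories.
    (This placeholder convention is hypothetical, not part of any tool.)
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    argv = [
        arg.replace("{in}", str(input_dir)).replace("{out}", str(output_dir))
        for arg in command
    ]
    subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return output_dir


def deidentify_then_annotate(deid_command, ctakes_command, raw_dir, work_dir):
    """Run the de-identifier first, then point cTAKES at its output."""
    work_dir = Path(work_dir)
    # Stage 1: the de-identifier reads the raw notes, writes scrubbed copies.
    scrubbed = run_stage(deid_command, raw_dir, work_dir / "deidentified")
    # Stage 2: cTAKES is given only the scrubbed directory as its input.
    return run_stage(ctakes_command, scrubbed, work_dir / "ctakes_output")
```

The point of keeping the stages separate is that cTAKES is never handed the 
raw notes, only the directory the de-identifier wrote.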

Complete de-identification is an unsolved problem, though; there is no 
guarantee against leaks.

I hope this helps.
--Guergana Savova, PhD, FACMI
Associate Professor
PI Natural Language Processing Lab
Boston Children's Hospital and Harvard Medical School
300 Longwood Avenue
Mailstop: BCH3092
Enders 144.1
Boston, MA 02115
Tel: (617) 919-2972
Fax: (617) 730-0817
[email protected]
Harvard Scholar: http://scholar.harvard.edu/guergana_k_savova/biocv
ctakes.apache.org
thyme.healthnlp.org
cancer.healthnlp.org
share.healthnlp.org

From: Dipankar Ray [mailto:[email protected]]
Sent: Friday, January 13, 2017 6:01 PM
To: [email protected]
Subject: de-identification

Hi folks,

Apologies if this is a newbie question - tried to look for an earlier 
occurrence of it, but was unsuccessful.

From this website (https://open.med.harvard.edu/project/scrubber/) I learned 
that the Scrubber de-identification tool is now available as part of CTAKES. 
But I didn't see anything about de-identification listed among the components 
here:

https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.2+Component+Use+Guide

Question: How do I use CTAKES for de-identification?

best,
Dipankar
