Re: Medical de-identification

Rohit Shinde Mon, 23 Mar 2015 21:12:21 -0700

Thanks Britt! I am downloading the source code now and will install it
soon. I have my mid-semester exams for the next three days, so I will come
back after that and start learning about what you have told me.


I am very familiar with Java, and I also know decision trees very well. I
know very little about UIMA, and I will learn more about ctakes soon.
What should I know about UIMA?

On Sun, Mar 22, 2015 at 9:28 PM, britt fitch <
[email protected]> wrote:

> Sounds good.
>
> Starting with some references:
> Docs: https://open.med.harvard.edu/wiki/display/SCRUBBER/3.X
> Publication: http://www.biomedcentral.com/1472-6947/13/112/abstract
> (check out the supplemental material as well for additional details on
> running and improvements)
> SVN (old, standalone, Scrubber v.3.x):
> https://open.med.harvard.edu/wiki/display/SCRUBBER/Software
> SVN (initial apache port to ctakes sandbox):
> https://svn.apache.org/repos/asf/ctakes/sandbox/ctakes-scrubber-deid/
>
> The project started off as a standalone process and became a UIMA pipeline
> (outside of ctakes).
> The plan had always been to port this to an optional ctakes module but we
> never got that fully implemented.
>
> Some of the parts that need the most attention to get going:
>
>    - working with the ctakes type system
>    - replacing weka (ML lib) with an Apache License 2.0 friendly lib
>    - simpler process for building the models.
>
>
> Regarding knowledge, it's good to be familiar with Java, UIMA, decision
> trees, and ctakes. Likely in that order.
>
> While this is still in the sandbox and you are still getting familiar with
> running it as a standalone app, feel free to ping me and andy off-list if
> that's more convenient.
> Then we can definitely bring it back to the dev list while getting it
> running in ctakes.
>
> Cheers,
>
> Britt
>
> Britt Fitch
> Wired Informatics
> 265 Franklin St Ste 1702
> Boston, MA 02110
> http://wiredinformatics.com
> [email protected]
>
> On Mar 20, 2015, at 7:57 PM, andy mcmurry <[email protected]> wrote:
>
> Britt et al: here is a student named rohit interested in getting the
> deidentification pipeline running again. Hoping there is still interest in
> getting this going in ctakes for real. Comments?
> ---------- Forwarded message ----------
> From: "Rohit Shinde" <[email protected]>
> Date: Mar 20, 2015 5:02 AM
> Subject: Re: Medical de-identification
> To: "andy mcmurry" <[email protected]>
> Cc:
>
> I would certainly be interested in "production grade code". The project
> also sounds interesting. How do I start working on it? I know Java well.
> What else would I need to know before starting on this project?
>
> On Fri, Mar 20, 2015 at 12:44 PM, andy mcmurry <[email protected]>
> wrote:
>
> Yes, the project is in Java. The code was written for a research project
> and never made into "production grade code". If you are interested, we
> would like to turn the scrubber into a solid pipeline. Java programming
> 100%, with Colt statistical library.
> On Mar 19, 2015 7:52 PM, "Rohit Shinde" <[email protected]>
> wrote:
>
> Hi Andy,
>
> Could you please tell me more about that project? I would really like a
> reply.
>
> Thank you,
> Rohit Shinde
>
> On Wed, Mar 18, 2015 at 5:51 PM, Rohit Shinde <
> [email protected]> wrote:
>
> Hi Andy,
>
> I am interested in medical de-identification. I would like to know what
> this project consists of. Is it partially implemented, or does the
> implementation need to start?
>
> What languages would I need to know? What theoretical background would I
> need? Also, how complex would this task be? What parts of OpenNLP does this
> project use?
>
> Thank you,
> Rohit Shinde
>
>
>
>
>
