I would be interested in helping to develop and maintain a regression testing framework for that. I'm new to cTAKES (and just recently started stalking the dev mailing list), but I've been a software engineer for 20 years and have done a lot of the framework automation work that will probably be required. As I write this, I am working on an automated integration test that will run on Jenkins. It fires up and loads an H2 database, a Solr instance, an in-house indexing pipeline, and an in-house search service, indexes 10k documents, and then executes and evaluates some canned queries before shutting itself down.

I'm also working on an MS in Predictive Analytics and am interested in applying machine learning and NLP to medical informatics, so I would also welcome the chance to get my hands dirty with that side of things.

From: Jay Vyas <jayunit100.apa...@gmail.com>
To: "dev@ctakes.apache.org" <dev@ctakes.apache.org>
Sent: Friday, July 24, 2015 10:44 AM
Subject: Re: Combining Knowledge- and Data-driven Methods for De-identification of Clinical Narratives

Yes, this is very interesting work.
- If we have access to a large corpus of de-identified records, we can regression test the cTAKES platform.
- I can help collaborate on a regression testing framework if someone else wants to help maintain it.

> On Jul 24, 2015, at 11:12 AM, Pei Chen <chen...@apache.org> wrote:
>
> Hi,
> Re: http://www.sciencedirect.com/science/article/pii/S1532046415001392
> This is very interesting work and I think it would be very valuable
> for the general community. Would you be interested in contributing
> or sharing the code with the Apache cTAKES community?
> Thanks,
> Pei