+1 (binding)

On Jun 5, 2012, at 5:48 PM, Chen, Pei wrote:
> Including the original Proposal raw text below this time:
>
> Hi,
> We are proposing cTAKES to be an Apache Incubator Project and would like to
> request that the IPMC vote for cTAKES to join the Incubator.
> Below, you will find the original proposal and details.
>
> Please cast your vote:
> [ ] +1 to recommend cTAKES to be an Apache Incubator Project
> [ ] 0 don't care
> [ ] -1 no, don't recommend yet, (because...)
>
> Thanks,
> Pei
>
> = cTAKES Proposal =
> The following is a proposal for a new top-level project within the ASF.
>
> == Abstract ==
> cTAKES (clinical Text Analysis and Knowledge Extraction System) is a
> natural language processing tool for information extraction from electronic
> medical record clinical free-text.
>
> == Proposal ==
> cTAKES comprises a collection of components and tooling written in Java
> specifically trained for the clinical domain, and creates rich linguistic and
> semantic annotations that can be utilized by clinical decision support
> systems & clinical research.
>
> == Background ==
> The development of cTAKES was started in 2006 by a team of physicians,
> computer scientists and software engineers at the Mayo Clinic. The
> development team was led by Dr. Guergana Savova & Dr. Christopher Chute.
> cTAKES is released open source under an Apache v2.0 license. The system was
> deployed at Mayo, is currently an integral part of their clinical data
> management infrastructure, and has processed in excess of 80 million
> clinical notes.
> Currently, the core development team is co-located at Mayo Clinic and
> Children's Hospital Boston following Dr. Savova's move to Children's Hospital
> Boston in early 2010.
> Additional collaborations with external groups at
> University of Colorado, Brandeis University, University of Pittsburgh, and
> University of California at San Diego continue to extend the capabilities of
> cTAKES into areas such as Temporal Reasoning, Clinical Question Answering,
> and coreference resolution for the clinical domain. In 2010, cTAKES was
> adopted by the I2B2 program and is a central component of the SHARP Area 4.
> The current cTAKES components include:
>
> * Sentence boundary detector
> * Rule-based tokenizer to separate punctuation from words
> * Normalizer
> * Context dependent tokenizer
> * Part-of-speech tagger
> * Phrasal chunker
> * Dictionary lookup annotator and normalization to an ontology
> * Context annotator
> * Negation detector
> * Dependency parser
> * Constituency parser
> * Semantic Role Labeler
> * Coreference resolver
> * Module for the identification of patient smoking status
> * Drug mention annotator
>
> == Rationale ==
> We believe there is a clear gap between cutting-edge technologies developed
> in research labs and those used in clinical practice. We believe that moving
> cTAKES development to the Apache development community will lead to faster
> innovation, better integration with other open source software, and broader
> adoption of cTAKES within clinical institutions, improving our healthcare
> system. We believe that having cTAKES on Apache will encourage the
> development of a basic set of open source components that will jumpstart
> these developers' efforts.
>
> == Initial Goals ==
> The initial goals of the proposed project are:
>
> * Bring the community together at the ASF and make the development process
> transparent for them
> * Write user documentation about all major components
> * Automate builds and continuous integration
> * Automate regression tests
> * Produce an Incubating release
>
> == Current Status ==
> === Meritocracy ===
> Some of the initial committers are familiar with Apache's idea of
> meritocracy; others aren't. We will get everybody on the same level as part
> of the incubation process.
>
> === Community ===
> cTAKES already has a considerable user base, both in industry and academia.
>
> === Core Developers ===
> See the initial committer list.
>
> === Alignment ===
> cTAKES has tie-ins with several existing Apache projects. We have been
> building our components using the UIMA framework. We are also reusing
> existing Apache projects such as Lucene, Solr, and Maven. We expect these
> collaborations to strengthen further after our move to Apache, and we plan
> to experiment with other projects under the Lucene umbrella such as Hadoop
> and Mahout.
> Another obvious connection exists to some of the projects under the OpenNLP
> umbrella.
>
> == Known Risks ==
> === Orphaned products ===
> The project has been around for quite a number of years already; it has a
> well-established user community and a diverse set of committers.
>
> === Inexperience with Open Source ===
> cTAKES has been an open source project for many years. Many of the developers
> are already familiar with both open source in general and the ASF in
> particular.
>
> === Homogenous Developers ===
> The current group of developers is very diverse and is distributed globally
> across multiple institutions.
>
> === Reliance on Salaried Developers ===
> Most of the developers are not paid to work specifically on cTAKES, so there
> is little reliance on salaried developers.
>
> === Relationships with Other Apache Products ===
> NLP is often used in search and other algorithms that work with unstructured
> data; thus cTAKES is likely to be useful to the Lucene and Solr communities.
> It also aligns nicely with both Mahout and UIMA as well as OpenNLP.
>
> === An Excessive Fascination with the Apache Brand ===
> We think the project aligns nicely with the goals of the ASF to disseminate
> source code to the public free of charge. Clinical NLP has long been the
> subject of cutting-edge research, but is often lacking in community and
> shared knowledge. We believe that by bringing cTAKES to the ASF, the Apache
> brand will help deliver clinical NLP capabilities to a much larger audience
> and likewise a cutting-edge project like cTAKES can further the ASF brand by
> providing users with tried and true, as well as new, natural language
> processing capabilities.
>
> == Documentation ==
> * https://wiki.nci.nih.gov/display/VKC/cTAKES+2.0
> * http://en.wikipedia.org/wiki/CTAKES
>
> == Initial Source ==
> The source code is maintained in SVN on SourceForge: cTAKES:
> http://sourceforge.net/projects/ohnlp/
>
> == Source and Intellectual Property Submission Plan ==
> The cTAKES source code is already open source under the AL 2.0.
>
> == External Dependencies ==
> ||'''Library''' ||||<style="text-align: center">'''License''' ||||<style="text-align: center">'''Description''' ||
> ||libsvm ||||<style="text-align: center">BSD ||||<style="text-align: center">Machine Learning Library ||
> ||UIMA ||||<style="text-align: center">AL 2.0 ||||<style="text-align: center">Unstructured Information Management Architecture ||
> ||Lucene Core ||||<style="text-align: center">AL 2.0 ||||<style="text-align: center">Plain Text Search Engine Library ||
> ||OpenNLP ||||<style="text-align: center">AL 2.0 ||||<style="text-align: center">General Purpose Natural Language Processing Library ||
> ||HSQLDB ||||<style="text-align: center">BSD ||||<style="text-align: center">In Memory DB ||
> ||JDOM ||||<style="text-align: center">Apache Style ||||<style="text-align: center">Java XML Manipulation Library ||
> ||Open AI FSM ||||<style="text-align: center">Apache Style ||||<style="text-align: center">Finite State Machines Toolset ||
>
>
> == Cryptography ==
> cTAKES neither provides nor uses any cryptography.
>
> == Required Resources ==
> === Mailing lists ===
> * ctakes-dev
> * ctakes-private
> * ctakes-user
> * ctakes-commits
>
> === Subversion Directory ===
> https://svn.apache.org/repos/asf/incubator/ctakes
>
> === Issue Tracking ===
> Jira: cTAKES
>
> === Other Resources ===
> == Initial Committers ==
> ||'''Name''' ||||<style="text-align: center">'''Email''' ||||<style="text-align: center">'''CLA''' ||
> ||Pei J Chen ||||<style="text-align: center">pei.c...@childrens.harvard.edu ||||<style="text-align: center">yes ||
> ||Sean Finan ||||<style="text-align: center">sean.fi...@childrens.harvard.edu ||||<style="text-align: center">no ||
> ||Guergana K.
> Savova ||||<style="text-align: center">guergana.sav...@childrens.harvard.edu ||||<style="text-align: center">no ||
> ||James J Masanz ||||<style="text-align: center">masanz.ja...@mayo.edu ||||<style="text-align: center">no ||
>
>
> == Affiliations ==
> == Sponsors ==
> === Champion ===
> Jörn Kottmann
>
> === Nominated Mentors ===
> * Jörn Kottmann
> * Grant Ingersoll
> * Chris A Mattmann
>
> === Sponsoring Entity ===
> The Apache Incubator
>
>
> On 05/30/2012 11:59 PM, Chen, Pei wrote:
>> Hi All,
>>
>> We would like to propose cTAKES to be an Apache Incubator project.
>>
>> cTAKES (clinical Text Analysis and Knowledge Extraction System) is a
>> natural language processing tool for information extraction from electronic
>> medical record clinical free-text. Additional information is available at
>> http://en.wikipedia.org/wiki/CTAKES and
>> https://wiki.nci.nih.gov/display/VKC/cTAKES+2.5 .
>>
>> The draft proposal document is available at
>> http://wiki.apache.org/incubator/cTAKESProposal
>>
>> We're excited about the opportunity to work with ASF and the community to
>> create an Incubator project for Natural Language Processing for the clinical
>> domain. We welcome all feedback on the proposal.
>>
>> Thanks.
>>
>> ---
>> Pei Chen
>> Lead Application Development Specialist
>> Childrens Hospital Boston / Harvard Medical School
>> 300 Longwood Avenue, Enders 142
>> Boston, MA 02115
>> tel: (617) 919-4423
>> fax: (617) 730-0057
>> pei.c...@childrens.harvard.edu
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: general-unsubscr...@incubator.apache.org
>> For additional commands, e-mail: general-h...@incubator.apache.org
>>
>

--------------------------------------------
Grant Ingersoll
http://www.lucidimagination.com