Hi Sandeep,

I think the best thing to do is:

1. Create a JIRA issue here: https://issues.apache.org/jira/browse/CTAKES
   1a. You can register for a new account on JIRA.
2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
   (e.g., with subject [DISCUSS] "some topic" where "some topic" is perhaps
   the main idea you have) on [email protected], referencing your issue and
   asking for feedback.
3. Work with the Apache cTAKES PMC and committers to get your patches and
   other items attached to your issue from #1 committed into the sources.

Ideally, 1-3 happen and it's a good interaction. Apache is built on
meritocracy, and you could possibly earn the merit to become a PMC member
or committer on the project.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: sandeep rg <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Thursday, July 11, 2013 11:30 AM
To: "[email protected]" <[email protected]>
Subject: Re: to involve in your development group

>Can you provide what all details I should include in a proposal? Whether I
>wanted to include all implementation (technical) details in the proposal?
>
>
>On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
>[email protected]> wrote:
>
>> Dear Sandeep,
>>
>> Thanks for your interest in cTAKES. We would welcome your contribution
>> and are happy to have your interest in the project.
>>
>> Cheers,
>> Chris
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>> -----Original Message-----
>> From: sandeep rg <[email protected]>
>> Reply-To: "[email protected]" <[email protected]>
>> Date: Wednesday, July 10, 2013 11:01 AM
>> To: "[email protected]" <[email protected]>
>> Subject: Re: to involve in your development group
>>
>> >Sir,
>> >
>> >My name is Sandeep RG. I am a BTech graduate in computer science, now doing
>> >an internship in a company in the Java language.
>> >
>> >I have installed all things successfully and am now downloading the
>> >resources. It takes too much time.
>> >
>> >I have gone through the suggested OCR technologies.
>> >JavaOCR has some good user reviews.
>> >Apache Tika has the capability to process different types of formats.
>> >More than that, there is Tesseract, which is also used for OCR purposes.
>> >Then Apache PDFBox is also used for text extraction, but only for PDF
>> >files.
>> >Now I am going through everything to find out the best technology from this.
>> >
>> >
>> >On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
>> ><[email protected]> wrote:
>> >
>> >> Hi Sandeep,
>> >> I am delighted to work with you on this project.
>> >>
>> >> I was not sure if I understood you correctly: did you mean to say that you
>> >> have already tried using cTAKES and its components?
>> >> If not, you can do an svn checkout of the code and try running the
>> >> debugger GUI from the command line (or Eclipse IDE) that will allow you to
>> >> type in plain text and get back the different structured content (types)
>> >> that cTAKES produces:
>> >> MAVEN_OPTS="-Xmx2g -Xms1g"
>> >> mvn -PrunCVD compile
>> >> From the guide:
>> >>
>> >> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>> >>
>> >> A bit of background:
>> >> Apache cTAKES uses SVN for version control:
>> >> https://svn.apache.org/repos/asf/ctakes/trunk/
>> >> Jira for issue tracking:
>> >> https://issues.apache.org/jira/browse/ctakes
>> >> Maven for building and dependency management.
>> >> A lot of the developers use Eclipse IDE for their development.
>> >> More info on ctakes.apache.org
>> >>
>> >> cTAKES is built on top of the Apache UIMA Framework. Essentially, cTAKES
>> >> is a collection of Annotators (Java classes) wired together into a
>> >> pipeline.
>> >> Its goal, in a nutshell, is to turn unstructured plain text into a
>> >> structured/normalized form, and it is specially trained for medical notes.
>> >> Right now, the input cTAKES expects is plain text, and cTAKES
>> >> does not have an OCR component.
>> >> cTAKE-189:GSoC:implement OCR/tika to standardize text inputs was an idea
>> >> to allow cTAKES to take in any type of input (PDF, Images, Word, XLS, etc.)
>> >> and pass the text for cTAKES processing.
>> >> [I was originally thinking this could be done in some kind of
>> >> preprocessing, or an optional Annotator that could be added in the
>> >> beginning of a pipeline]. There may be some existing work that could be
>> >> potentially reused: Apache Tika (
>> >> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
>> >> source OCR toolkits (JavaOCR).
>> >>
>> >> About Me:
>> >>
>> >> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>> >> http://www.linkedin.com/in/peistation
>> >> http://people.apache.org/committer-index.html#chenpei
>> >>
>> >> > -----Original Message-----
>> >> > From: sandeep rg [mailto:[email protected]]
>> >> > Sent: Tuesday, July 09, 2013 1:19 PM
>> >> > To: [email protected]
>> >> > Subject: Re: to involve in your development group
>> >> >
>> >> > Thanks a lot for giving me support. I would like to work with you.
>> >> >
>> >> > I have gone through the objectives of the software, used the software, and
>> >> > gone through various components of the project. Can you provide me a starting
>> >> > point from where I should start to know more about the coding part of the
>> >> > project?
>> >> >
>> >> > Can you tell me more about the project and about you also?
>> >> >
>> >> >
>> >> > On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>> >> > <[email protected]> wrote:
>> >> >
>> >> > > Hi Sandeep,
>> >> > > Thank you for the interest. I just had a quick look at the ICFOSS
>> >> > > pilot mentoring program and will be happy to serve as a mentor for
>> >> > > your project proposal(s) if you are interested.
>> >> > >
>> >> > > --Pei
>> >> > >
>> >> > > > -----Original Message-----
>> >> > > > From: sandeep rg [mailto:[email protected]]
>> >> > > > Sent: Monday, July 08, 2013 2:24 PM
>> >> > > > To: [email protected]
>> >> > > > Subject: Re: to involve in your development group
>> >> > > >
>> >> > > > Sir,
>> >> > > >
>> >> > > > Details of the program, the pilot mentoring programme with India ICFOSS,
>> >> > > > are given at the web address below:
>> >> > > >
>> >> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>> >> > > >
>> >> > > >
>> >> > > > I am new to this community, so I need a mentor for the project. It
>> >> > > > will be more helpful for me.
>> >> > > >
>> >> > > >
>> >> > > > On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>> >> > > > <[email protected]> wrote:
>> >> > > >
>> >> > > > > Hi Sandeep,
>> >> > > > > Welcome! I am not familiar with the details of icfoss-apache, but
>> >> > > > > please, you are more than welcome to work on the code, and
>> >> > > > > contributions will be greatly appreciated!
>> >> > > > > There may be a learning curve, but feel free to let us know if you
>> >> > > > > have any questions/issues.
>> >> > > > > Thanks,
>> >> > > > > Pei
>> >> > > > >
>> >> > > > > > -----Original Message-----
>> >> > > > > > From: sandeep rg [mailto:[email protected]]
>> >> > > > > > Sent: Saturday, July 06, 2013 11:50 AM
>> >> > > > > > To: [email protected]
>> >> > > > > > Subject: to involve in your development group
>> >> > > > > >
>> >> > > > > > My name is Sandeep. I am a BTech graduate. I participated in a
>> >> > > > > > camp coordinated in Kerala, India in association with
>> >> > > > > > icfoss-apache, called the youth
>> >> > > > > > mentoring programme, coordinated by Luciano Resende.
>> >> > > > > >
>> >> > > > > > I like the project and would like to be involved in your project as a
>> >> > > > > > programmer. I have gone through your project and gone through
>> >> > > > > > the bugs list. I would like to work on the bug
>> >> > > > > > "cTAKE-189:GSoC:implement OCR/tika to standardize text inputs
>> >> > > > > > for cTAKES". Can you allow me to work on that?
