Dear Sandeep,

Thanks for your interest in cTAKES. We would welcome your contribution and are happy to have your interest in the project.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Senior Computer Scientist
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 171-266B, Mailstop: 171-246
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Assistant Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: sandeep rg <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Wednesday, July 10, 2013 11:01 AM
To: "[email protected]" <[email protected]>
Subject: Re: to involve in your development group

>sir,
>
>My name is Sandeep RG. I am a B.Tech graduate in computer science, now doing
>an internship at a company, working in Java.
>
>I have installed everything successfully and am now downloading the
>resources, which is taking a long time.
>
>I have gone through the suggested OCR technologies.
>JavaOCR has some good user reviews.
>Apache Tika can process many different file formats.
>Beyond those, there is Tesseract, which is also used for OCR.
>Apache PDFBox is also used for text extraction, but only for PDF
>files.
>I am now going through all of them to find the best technology among them.
>
>
>On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
><[email protected]> wrote:
>
>> Hi Sandeep,
>> I am delighted to work with you on this project.
>>
>> I was not sure if I understood you correctly- did you mean to say that you
>> have already tried using cTAKES and its components?
>> If not, you can do an svn checkout of the code and try running the
>> debugger GUI from the command line (or the Eclipse IDE), which will allow
>> you to type in plain text and get back the different structured content
>> (types) that cTAKES produces:
>> MAVEN_OPTS="-Xmx2g -Xms1g"
>> mvn -PrunCVD compile
>> From the guide:
>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>>
>> A bit of background:
>> Apache cTAKES uses SVN for version control:
>> https://svn.apache.org/repos/asf/ctakes/trunk/
>> Jira for issue tracking:
>> https://issues.apache.org/jira/browse/ctakes
>> Maven for building and dependency management.
>> A lot of the developers use the Eclipse IDE for their development.
>> More info on ctakes.apache.org
>>
>> cTAKES is built on top of the Apache UIMA Framework. Essentially, cTAKES
>> is a collection of Annotators (Java classes) wired together into a
>> pipeline.
>> Its goal in a nutshell is to turn unstructured plain text into
>> structured/normalized form, and it is specially trained for medical notes.
>> Right now- the input cTAKES expects would be in plain text form, and cTAKES
>> does not have an OCR component.
>> cTAKE-189:GSoC:implement OCR/tika to standardize text inputs was an idea
>> to allow cTAKES to take in any type of input (PDF, Images, Word, XLS, etc.)
>> and pass the text for cTAKES processing.
>> [I was originally thinking this could be done in some kind of
>> preprocessing, or an optional Annotator that could be added at the
>> beginning of a pipeline]. There may be some existing work that could
>> potentially be reused: Apache Tika (
>> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
>> source OCR toolkits (JavaOCR).
>>
>> About Me:
>>
>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>> http://www.linkedin.com/in/peistation
>> http://people.apache.org/committer-index.html#chenpei
>>
>> > -----Original Message-----
>> > From: sandeep rg [mailto:[email protected]]
>> > Sent: Tuesday, July 09, 2013 1:19 PM
>> > To: [email protected]
>> > Subject: Re: to involve in your development group
>> >
>> > Thanks a lot for giving me support. I would like to work with you.
>> >
>> > I have gone through the objectives of the software, used the software, and
>> > gone through various components of the project. Can you give me a
>> > starting point for learning more about the coding part of the
>> > project?
>> >
>> > Can you tell me more about the project and about you also?
>> >
>> >
>> > On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
>> > <[email protected]> wrote:
>> >
>> > > Hi Sandeep,
>> > > Thank you for the interest. I just had a quick look at the ICFOSS
>> > > pilot mentoring program and will be happy to serve as a mentor for
>> > > your project proposal(s) if you are interested.
>> > >
>> > > --Pei
>> > >
>> > > > -----Original Message-----
>> > > > From: sandeep rg [mailto:[email protected]]
>> > > > Sent: Monday, July 08, 2013 2:24 PM
>> > > > To: [email protected]
>> > > > Subject: Re: to involve in your development group
>> > > >
>> > > > sir,
>> > > >
>> > > > Details of the pilot mentoring programme with ICFOSS, India are
>> > > > given at the web address below:
>> > > >
>> > > > http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>> > > >
>> > > >
>> > > > I am new to this community, so I need a mentor for the project. It
>> > > > would be very helpful for me.
>> > > >
>> > > >
>> > > > On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
>> > > > <[email protected]> wrote:
>> > > >
>> > > > > Hi Sandeep,
>> > > > > Welcome!
>> > > > > I am not familiar with the details of icfoss-apache, but
>> > > > > please- you are more than welcome to work on the code and
>> > > > > contributions will be greatly appreciated!
>> > > > > There may be a learning curve, but feel free to let us know if you
>> > > > > have any questions/issues.
>> > > > > Thanks,
>> > > > > Pei
>> > > > >
>> > > > > > -----Original Message-----
>> > > > > > From: sandeep rg [mailto:[email protected]]
>> > > > > > Sent: Saturday, July 06, 2013 11:50 AM
>> > > > > > To: [email protected]
>> > > > > > Subject: to involve in your development group
>> > > > > >
>> > > > > > My name is Sandeep. I am a B.Tech graduate. I participated in a
>> > > > > > camp held in Kerala, India in association with
>> > > > > > icfoss-apache, called the youth mentoring programme and
>> > > > > > coordinated by Luciano Resende.
>> > > > > >
>> > > > > > I like the project and would like to be involved in it as a
>> > > > > > programmer. I have gone through your project and through
>> > > > > > the bugs list. I would like to work on the bug
>> > > > > > "cTAKE-189:GSoC:implement OCR/tika to standardize text inputs
>> > > > > > for cTAKES". Can you allow me to work on that?
>> > > > >
>> > >
>>
