I have done the sequence diagram and made some small changes. Please go through it and tell me if anything more should be included.

On Wed, Jul 17, 2013 at 9:37 PM, sandeep rg <[email protected]> wrote:

> It is just a skeleton of the original proposal.
>
> On Wed, Jul 17, 2013 at 9:31 PM, sandeep rg <[email protected]> wrote:
>
>> The sample work is shared with you both. If any more details should be
>> included, please tell me.
>> The GUI design, schedule, and implementation flow chart design are still
>> to be added; they are under construction and will be uploaded within a
>> few hours.
>>
>> On Wed, Jul 17, 2013 at 7:56 PM, Chen, Pei <[email protected]> wrote:
>>
>>> [email protected]
>>>
>>> > -----Original Message-----
>>> > From: Mattmann, Chris A (398J) [mailto:[email protected]]
>>> > Sent: Wednesday, July 17, 2013 10:22 AM
>>> > To: [email protected]
>>> > Subject: Re: to involve in your development group
>>> >
>>> > [email protected]
>>> >
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > Chris Mattmann, Ph.D.
>>> > Senior Computer Scientist
>>> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>>> > Office: 171-266B, Mailstop: 171-246
>>> > Email: [email protected]
>>> > WWW: http://sunset.usc.edu/~mattmann/
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> > Adjunct Assistant Professor, Computer Science Department
>>> > University of Southern California, Los Angeles, CA 90089 USA
>>> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>> >
>>> > -----Original Message-----
>>> > From: sandeep rg <[email protected]>
>>> > Reply-To: "[email protected]" <[email protected]>
>>> > Date: Wednesday, July 17, 2013 6:53 AM
>>> > To: "[email protected]" <[email protected]>
>>> > Subject: Re: to involve in your development group
>>> >
>>> > >Can you provide your Gmail ID so that I can share the proposal
>>> > >document with you?
>>> > >
>>> > >On Tue, Jul 16, 2013 at 11:33 PM, sandeep rg <[email protected]> wrote:
>>> > >
>>> > >> Sir,
>>> > >> I will provide the proposal within two days. For now I am mainly
>>> > >> going through the ASF-ICFOSS gateway, because if I go their way and
>>> > >> my proposal gets selected, ICFOSS will provide some support to us,
>>> > >> such as certificates and small financial support.
>>> > >>
>>> > >> But the main thing is that I like programming. I like to explore new
>>> > >> technologies in coding and to work with code, so even if my proposal
>>> > >> gets rejected, I would still like to work on your project as a
>>> > >> volunteer, if you allow me.
>>> > >>
>>> > >> I am preparing a proposal now and will submit it within 2 days.
>>> > >> Chris Mattmann helped me learn more about the proposal format.
>>> > >>
>>> > >> On Tue, Jul 16, 2013 at 8:12 PM, Chen, Pei <[email protected]> wrote:
>>> > >>
>>> > >>> Chris/Sandeep,
>>> > >>> According to ASF-ICFOSS, I believe the deadline for submitting
>>> > >>> proposals is this coming Friday (July 19).
>>> > >>> After which point, mentors will have 2 weeks to review and
>>> > >>> score/accept.
>>> > >>> Just curious, are we planning to follow the same process here? Or
>>> > >>> since it's all volunteer work, technically Sandeep can still
>>> > >>> contribute code to the community and participate in the dev group
>>> > >>> here.
>>> > >>>
>>> > >>> Looking forward to it.
>>> > >>> --Pei
>>> > >>>
>>> > >>> > -----Original Message-----
>>> > >>> > From: sandeep rg [mailto:[email protected]]
>>> > >>> > Sent: Monday, July 15, 2013 1:05 PM
>>> > >>> > To: [email protected]
>>> > >>> > Subject: Re: to involve in your development group
>>> > >>> >
>>> > >>> > Sir,
>>> > >>> > I have gone through most of the OCR technologies and reached a
>>> > >>> > conclusion. I would like to use Apache Tika and JavaOCR for this
>>> > >>> > purpose.
>>> > >>> >
>>> > >>> > Tesseract is an OCR tool that can extract text in multiple
>>> > >>> > languages. It is implemented in VC++, so it can be accessed
>>> > >>> > through Java native functions. They provided another tool,
>>> > >>> > tess4j, but reviews say that it has many bugs.
>>> > >>> >
>>> > >>> > Apache Tika is developed in Java. It can be used to extract text
>>> > >>> > data from .xls, Word, txt, PDF, and many other formats. It is
>>> > >>> > also easy to integrate into a project. I have just gone through
>>> > >>> > how it is implemented.
>>> > >>> >
>>> > >>> > As for JavaOCR, it is good for extracting text from JPEG or
>>> > >>> > scanned images. We can train it with various fonts. The more we
>>> > >>> > train it, the better its accuracy, but its speed decreases. I
>>> > >>> > didn't find any particular documentation for it.
>>> > >>> >
>>> > >>> > On Sun, Jul 14, 2013 at 9:18 PM, sandeep rg <[email protected]> wrote:
>>> > >>> >
>>> > >>> > > Thanks a lot to both of you for your support. I will do my best
>>> > >>> > > to find a solution for the JIRA problem. I will share the
>>> > >>> > > proposal with both of you.
>>> > >>> > >
>>> > >>> > > On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei <[email protected]> wrote:
>>> > >>> > >
>>> > >>> > >> Sandeep,
>>> > >>> > >> It's great to have Chris on board as well- he was one of the
>>> > >>> > >> coordinators of GSoC.
>>> > >>> > >> Looking forward to it.
>>> > >>> > >>
>>> > >>> > >> Sent from my iPhone
>>> > >>> > >>
>>> > >>> > >> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <[email protected]> wrote:
>>> > >>> > >>
>>> > >>> > >> > Hi Sandeep,
>>> > >>> > >> >
>>> > >>> > >> > That is great news, and good job.
>>> > >>> > >> > OK, for some ideas about developing your proposal, you may
>>> > >>> > >> > want to simply start with a Google Doc, and then share it
>>> > >>> > >> > with Pei. I'd be happy to help co-mentor if Pei and you think
>>> > >>> > >> > it's useful too.
>>> > >>> > >> >
>>> > >>> > >> > Your proposal should likely cover:
>>> > >>> > >> >
>>> > >>> > >> > 1. Background - what's the state of CTAKES-189 and what's it
>>> > >>> > >> > trying to accomplish
>>> > >>> > >> > (include some figures, etc. along with your text)
>>> > >>> > >> >
>>> > >>> > >> > 2. Approach - what are you going to do to solve CTAKES-189.
>>> > >>> > >> > Be specific, and try to break it down into smaller, easily
>>> > >>> > >> > reversible steps
>>> > >>> > >> >
>>> > >>> > >> > 3. Schedule - how long and what is the schedule for achieving
>>> > >>> > >> > this?
>>> > >>> > >> >
>>> > >>> > >> > 4. Risks/etc. - any known risks like are you taking a
>>> > >>> > >> > vacation anytime soon :)
>>> > >>> > >> > or are there other time constraints?
>>> > >>> > >> >
>>> > >>> > >> > 5. References, etc.
>>> > >>> > >> >
>>> > >>> > >> > HTH and I'd be happy if you want to share the GDocs with me
>>> > >>> > >> > as you develop it.
>>> > >>> > >> >
>>> > >>> > >> > Cheers!
>>> > >>> > >> >
>>> > >>> > >> > Chris
>>> > >>> > >> >
>>> > >>> > >> > -----Original Message-----
>>> > >>> > >> > From: sandeep rg <[email protected]>
>>> > >>> > >> > Reply-To: "[email protected]" <[email protected]>
>>> > >>> > >> > Date: Saturday, July 13, 2013 8:57 AM
>>> > >>> > >> > To: "[email protected]" <[email protected]>
>>> > >>> > >> > Subject: Re: to involve in your development group
>>> > >>> > >> >
>>> > >>> > >> >> I have also gone through the technologies available for OCR
>>> > >>> > >> >> development. Of those, I think Apache Tika and Tesseract are
>>> > >>> > >> >> the best for resolving the problem.
>>> > >>> > >> >>
>>> > >>> > >> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <[email protected]> wrote:
>>> > >>> > >> >>
>>> > >>> > >> >>> Hi Mattmann Chris,
>>> > >>> > >> >>> I participated in the event coordinated by Luciano Resende
>>> > >>> > >> >>>
>>> > >>> > >> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> > >>> > >> >>>
>>> > >>> > >> >>> and from that I learned about open source and would like to
>>> > >>> > >> >>> work on your project cTAKES. I would like to fix the JIRA
>>> > >>> > >> >>>
>>> > >>> > >> >>> https://issues.apache.org/jira/browse/CTAKES-189
>>> > >>> > >> >>>
>>> > >>> > >> >>> Chen Pei accepted my request to be my mentor. Now I want to
>>> > >>> > >> >>> give a proposal to Apache about the project I am going to
>>> > >>> > >> >>> work on. Can you help me prepare a proposal to be submitted
>>> > >>> > >> >>> before the 18th of this July?
>>> > >>> > >> >>>
>>> > >>> > >> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <[email protected]> wrote:
>>> > >>> > >> >>>
>>> > >>> > >> >>>> Hi Sandeep,
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> I think the best thing to do is:
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> 1. Develop a JIRA issue here:
>>> > >>> > >> >>>> https://issues.apache.org/jira/browse/CTAKES
>>> > >>> > >> >>>> 1a. you can register for a new account on JIRA
>>> > >>> > >> >>>> 2. Once your JIRA issue is created, feel free to start a
>>> > >>> > >> >>>> [DISCUSS] thread (e.g., with subject [DISCUSS] "some topic"
>>> > >>> > >> >>>> where "some topic" is perhaps the main idea you have) on
>>> > >>> > >> >>>> [email protected], referencing your issue and asking for
>>> > >>> > >> >>>> feedback
>>> > >>> > >> >>>> 3. Work with the Apache cTAKES PMC and committers to get
>>> > >>> > >> >>>> your patches and other items attached to your issue from #1
>>> > >>> > >> >>>> committed into the sources
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> Ideally if 1-3 happen and it's a good interaction, Apache
>>> > >>> > >> >>>> is built on meritocracy and you could possibly earn the
>>> > >>> > >> >>>> merit to become a PMC member or committer on the project.
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> Cheers,
>>> > >>> > >> >>>> Chris
>>> > >>> > >> >>>>
>>> > >>> > >> >>>> -----Original Message-----
>>> > >>> > >> >>>> From: sandeep rg <[email protected]>
>>> > >>> > >> >>>> Reply-To: "[email protected]" <[email protected]>
>>> > >>> > >> >>>> Date: Thursday, July 11, 2013 11:30 AM
>>> > >>> > >> >>>> To: "[email protected]" <[email protected]>
>>> > >>> > >> >>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>
>>> > >>> > >> >>>>> Can you tell me what details I should include in a
>>> > >>> > >> >>>>> proposal? Should I include all the implementation
>>> > >>> > >> >>>>> (technical) details in the proposal?
>>> > >>> > >> >>>>>
>>> > >>> > >> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <[email protected]> wrote:
>>> > >>> > >> >>>>>
>>> > >>> > >> >>>>>> Dear Sandeep,
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> Thanks for your interest in cTAKES. We would welcome your
>>> > >>> > >> >>>>>> contribution and are happy to have your interest in the
>>> > >>> > >> >>>>>> project.
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> Cheers,
>>> > >>> > >> >>>>>> Chris
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>> -----Original Message-----
>>> > >>> > >> >>>>>> From: sandeep rg <[email protected]>
>>> > >>> > >> >>>>>> Reply-To: "[email protected]" <[email protected]>
>>> > >>> > >> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
>>> > >>> > >> >>>>>> To: "[email protected]" <[email protected]>
>>> > >>> > >> >>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>
>>> > >>> > >> >>>>>>> Sir,
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> My name is Sandeep RG. I am a BTech graduate in computer
>>> > >>> > >> >>>>>>> science, now doing an internship at a company, working in
>>> > >>> > >> >>>>>>> Java.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> I have installed everything successfully and am now
>>> > >>> > >> >>>>>>> downloading the resources. It takes too much time.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> I have gone through the suggested OCR technologies.
>>> > >>> > >> >>>>>>> JavaOCR has some good user reviews.
>>> > >>> > >> >>>>>>> Apache Tika can process many different file formats.
>>> > >>> > >> >>>>>>> Besides those, there is Tesseract, which is also used for
>>> > >>> > >> >>>>>>> OCR.
>>> > >>> > >> >>>>>>> Apache PDFBox is also used for text extraction, but only
>>> > >>> > >> >>>>>>> for PDF files.
>>> > >>> > >> >>>>>>> I am now going through everything to find the best
>>> > >>> > >> >>>>>>> technology among these.
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei <[email protected]> wrote:
>>> > >>> > >> >>>>>>>
>>> > >>> > >> >>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>> I am delighted to work with you on this project.
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> I was not sure if I understood you correctly- did you
>>> > >>> > >> >>>>>>>> mean to say that you have already tried using cTAKES and
>>> > >>> > >> >>>>>>>> its components?
>>> > >>> > >> >>>>>>>> If not, you can do an svn checkout of the code and try
>>> > >>> > >> >>>>>>>> running the debugger GUI from the command line (or
>>> > >>> > >> >>>>>>>> Eclipse IDE) that will allow you to type in plain text
>>> > >>> > >> >>>>>>>> and get back the different structured content (types)
>>> > >>> > >> >>>>>>>> that cTAKES produces:
>>> > >>> > >> >>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
>>> > >>> > >> >>>>>>>> mvn -PrunCVD compile
>>> > >>> > >> >>>>>>>> From the guide:
>>> > >>> > >> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> A bit of background:
>>> > >>> > >> >>>>>>>> Apache cTAKES uses SVN for version control:
>>> > >>> > >> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
>>> > >>> > >> >>>>>>>> Jira for issues tracking:
>>> > >>> > >> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
>>> > >>> > >> >>>>>>>> Maven for building and dependency management.
>>> > >>> > >> >>>>>>>> A lot of the developers use Eclipse IDE for their
>>> > >>> > >> >>>>>>>> development.
>>> > >>> > >> >>>>>>>> More info on ctakes.apache.org
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework.
>>> > >>> > >> >>>>>>>> Essentially, cTAKES is a collection of Annotators (Java
>>> > >>> > >> >>>>>>>> Classes) wired together into a pipeline.
>>> > >>> > >> >>>>>>>> Its goal in a nutshell is to turn unstructured plain
>>> > >>> > >> >>>>>>>> text into structured/normalized form, and it is
>>> > >>> > >> >>>>>>>> specially trained for medical notes.
>>> > >>> > >> >>>>>>>> Right now- the input cTAKES expects would be in plain
>>> > >>> > >> >>>>>>>> text form and cTAKES does not have an OCR component.
>>> > >>> > >> >>>>>>>> CTAKES-189: GSoC: implement OCR/tika to standardize text
>>> > >>> > >> >>>>>>>> inputs was an idea to allow cTAKES to take in any type
>>> > >>> > >> >>>>>>>> of input (PDF, Images, Word, XLS, etc.) and pass the
>>> > >>> > >> >>>>>>>> text for cTAKES processing.
>>> > >>> > >> >>>>>>>> [I was originally thinking this could be done in some
>>> > >>> > >> >>>>>>>> kind of preprocessing, or an optional Annotator that
>>> > >>> > >> >>>>>>>> could be added in the beginning of a pipeline]. There
>>> > >>> > >> >>>>>>>> may be some existing work that could be potentially
>>> > >>> > >> >>>>>>>> reused: Apache Tika
>>> > >>> > >> >>>>>>>> ( https://issues.apache.org/jira/browse/TIKA-93 ) as
>>> > >>> > >> >>>>>>>> well as some open source OCR toolkits (JavaOCR).
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>> About Me:
>>> > >>> > >> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
>>> > >>> > >> >>>>>>>> http://www.linkedin.com/in/peistation
>>> > >>> > >> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
>>> > >>> > >> >>>>>>>>
>>> > >>> > >> >>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>> From: sandeep rg [mailto:[email protected]]
>>> > >>> > >> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
>>> > >>> > >> >>>>>>>>> To: [email protected]
>>> > >>> > >> >>>>>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> Thanks a lot for your support. I would like to work
>>> > >>> > >> >>>>>>>>> with you.
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> I have gone through the objectives of the software,
>>> > >>> > >> >>>>>>>>> used the software, and gone through various components
>>> > >>> > >> >>>>>>>>> of the project. Can you give me a starting point from
>>> > >>> > >> >>>>>>>>> which to learn more about the coding part of the
>>> > >>> > >> >>>>>>>>> project?
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> Can you also tell me more about the project and about
>>> > >>> > >> >>>>>>>>> yourself?
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei <[email protected]> wrote:
>>> > >>> > >> >>>>>>>>>
>>> > >>> > >> >>>>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>>>> Thank you for the interest.
>>> > >>> > >> >>>>>>>>>> I just had a quick look at the ICFOSS pilot mentoring
>>> > >>> > >> >>>>>>>>>> program and will be happy to serve as a mentor for
>>> > >>> > >> >>>>>>>>>> your project proposal(s) if you are interested.
>>> > >>> > >> >>>>>>>>>>
>>> > >>> > >> >>>>>>>>>> --Pei
>>> > >>> > >> >>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
>>> > >>> > >> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
>>> > >>> > >> >>>>>>>>>>> To: [email protected]
>>> > >>> > >> >>>>>>>>>>> Subject: Re: to involve in your development group
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> Sir,
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> Details of the pilot mentoring programme with India
>>> > >>> > >> >>>>>>>>>>> ICFOSS are given at the web address below:
>>> > >>> > >> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> I am new to this community, so I need a mentor for
>>> > >>> > >> >>>>>>>>>>> the project. It would be very helpful for me.
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei <[email protected]> wrote:
>>> > >>> > >> >>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>> Hi Sandeep,
>>> > >>> > >> >>>>>>>>>>>> Welcome!
>>> > >>> > >> >>>>>>>>>>>> I am not familiar with the details of icfoss-apache,
>>> > >>> > >> >>>>>>>>>>>> but please- you are more than welcome to work on the
>>> > >>> > >> >>>>>>>>>>>> code and contributions will be greatly appreciated!
>>> > >>> > >> >>>>>>>>>>>> There may be a learning curve, but feel free to let
>>> > >>> > >> >>>>>>>>>>>> us know if you have any questions/issues.
>>> > >>> > >> >>>>>>>>>>>> Thanks,
>>> > >>> > >> >>>>>>>>>>>> Pei
>>> > >>> > >> >>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> -----Original Message-----
>>> > >>> > >> >>>>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
>>> > >>> > >> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
>>> > >>> > >> >>>>>>>>>>>>> To: [email protected]
>>> > >>> > >> >>>>>>>>>>>>> Subject: to involve in your development group
>>> > >>> > >> >>>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> My name is Sandeep. I am a BTech graduate. I
>>> > >>> > >> >>>>>>>>>>>>> participated in a camp held in Kerala, India, in
>>> > >>> > >> >>>>>>>>>>>>> association with icfoss-apache, called the youth
>>> > >>> > >> >>>>>>>>>>>>> mentoring programme and coordinated by Luciano
>>> > >>> > >> >>>>>>>>>>>>> Resende.
>>> > >>> > >> >>>>>>>>>>>>>
>>> > >>> > >> >>>>>>>>>>>>> I like the project and would like to get involved
>>> > >>> > >> >>>>>>>>>>>>> in it as a programmer. I have gone through your
>>> > >>> > >> >>>>>>>>>>>>> project and through the bug list. I would like to
>>> > >>> > >> >>>>>>>>>>>>> work on the bug "CTAKES-189: GSoC: implement
>>> > >>> > >> >>>>>>>>>>>>> OCR/tika to standardize text inputs for cTAKES".
>>> > >>> > >> >>>>>>>>>>>>> Can you allow me to work on that?
