Thanks a lot to both of you for your support. I will do my best to find a solution to the JIRA problem, and I will share the proposal with both of you.

On Sun, Jul 14, 2013 at 1:46 AM, Chen, Pei <[email protected]> wrote:
> Sandeep,
> Its great to have Chris on board as well- he was one of the coordinators
> of GSoC.
> Looking forward to it.
>
> Sent from my iPhone
>
> On Jul 13, 2013, at 12:24 PM, "Mattmann, Chris A (398J)" <
> [email protected]> wrote:
>
> > Hi Sandeep,
> >
> > That is great news, and good job. OK, for some ideas about developing
> > your proposal, you may want to simply start with a Google Docs, and then
> > share it with Pei. I'd be happy to help co-mentor if Pei and you think
> > it's useful too.
> >
> > Your proposal should likely cover:
> >
> > 1. Background - what's the state of CTAKES-189 and what's it trying to
> > accomplish
> > (include some figures, etc. along with your text)
> >
> > 2. Approach - what are you going to do to solve CTAKES-189. Be specific, and
> > try to break it down into smaller, easily reversible steps
> >
> > 3. Schedule - how long and what is the schedule for achieving this?
> >
> > 4. Risks/etc. - any known risks like are you taking a vacation anytime
> > soon :)
> > or are there other time constraints?
> >
> > 5. References, etc.
> >
> > HTH and I'd be happy if you want to share the GDocs with me as you develop
> > it.
> >
> > Cheers!
> >
> > Chris
> >
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Chris Mattmann, Ph.D.
> > Senior Computer Scientist
> > NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> > Office: 171-266B, Mailstop: 171-246
> > Email: [email protected]
> > WWW: http://sunset.usc.edu/~mattmann/
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> > Adjunct Assistant Professor, Computer Science Department
> > University of Southern California, Los Angeles, CA 90089 USA
> > ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >
> > -----Original Message-----
> > From: sandeep rg <[email protected]>
> > Reply-To: "[email protected]" <[email protected]>
> > Date: Saturday, July 13, 2013 8:57 AM
> > To: "[email protected]" <[email protected]>
> > Subject: Re: to involve in your development group
> >
> >> i have also gone through the technologies available for development of
> >> ocr,from that i think apache tika and tessearact is best for resolving the
> >> problem.
> >>
> >> On Sat, Jul 13, 2013 at 9:02 PM, sandeep rg <[email protected]>
> >> wrote:
> >>
> >>> hi Mattamann Chris,
> >>> i has participated in the event coordinated by luciano resende
> >>>
> >>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> >>>
> >>> and from that i learned about open source and like to work on your project
> >>> ctakes.i would like to fix the jira
> >>>
> >>> https://issues.apache.org/jira/browse/CTAKES-189
> >>>
> >>> chen pei accepted my requested to be my mentor.now i want to give a
> >>> proposal to apache about the project i am going to work on.can you help me
> >>> to prepare a proposal to be submitted before 18 th of this july.
> >>>
> >>> On Sat, Jul 13, 2013 at 2:26 AM, Mattmann, Chris A (398J) <
> >>> [email protected]> wrote:
> >>>
> >>>> Hi Sandeep,
> >>>>
> >>>> I think the best thing to do is:
> >>>>
> >>>> 1. Develop a JIRA issue here:
> >>>> https://issues.apache.org/jira/browse/CTAKES
> >>>> 1a. you can register for a new account on JIRA
> >>>> 2. Once your JIRA issue is created, feel free to start a [DISCUSS] thread
> >>>> (e.g., with subject [DISCUSS] "some topic" where "some topic" is perhaps
> >>>> the main idea you have) on [email protected], referencing your issue
> >>>> and asking for feedback
> >>>> 3. Work with the Apache cTAKES PMC and committers to get your patches and
> >>>> other items attached to your issue from #1 committed into the sources
> >>>>
> >>>> Ideally if 1-3 happen and it's a good interaction, Apache is built on
> >>>> meritocracy and you could possibly earn the merit to become a PMC member
> >>>> or committer on the project.
> >>>>
> >>>> Cheers,
> >>>> Chris
> >>>>
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>> Chris Mattmann, Ph.D.
> >>>> Senior Computer Scientist
> >>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>>> Office: 171-266B, Mailstop: 171-246
> >>>> Email: [email protected]
> >>>> WWW: http://sunset.usc.edu/~mattmann/
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>> Adjunct Assistant Professor, Computer Science Department
> >>>> University of Southern California, Los Angeles, CA 90089 USA
> >>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>
> >>>> -----Original Message-----
> >>>> From: sandeep rg <[email protected]>
> >>>> Reply-To: "[email protected]" <[email protected]>
> >>>> Date: Thursday, July 11, 2013 11:30 AM
> >>>> To: "[email protected]" <[email protected]>
> >>>> Subject: Re: to involve in your development group
> >>>>
> >>>>> can you provide what all details i should include in a proposal?whether i
> >>>>> wanted to include all implemetation(technical) details in the proposal?
> >>>>>
> >>>>> On Thu, Jul 11, 2013 at 9:45 PM, Mattmann, Chris A (398J) <
> >>>>> [email protected]> wrote:
> >>>>>
> >>>>>> Dear Sandeep,
> >>>>>>
> >>>>>> Thanks for your interest in cTAKES. We would welcome your contribution
> >>>>>> and are happy to have your interest in the project.
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Chris
> >>>>>>
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> Chris Mattmann, Ph.D.
> >>>>>> Senior Computer Scientist
> >>>>>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
> >>>>>> Office: 171-266B, Mailstop: 171-246
> >>>>>> Email: [email protected]
> >>>>>> WWW: http://sunset.usc.edu/~mattmann/
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>> Adjunct Assistant Professor, Computer Science Department
> >>>>>> University of Southern California, Los Angeles, CA 90089 USA
> >>>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
> >>>>>>
> >>>>>> -----Original Message-----
> >>>>>> From: sandeep rg <[email protected]>
> >>>>>> Reply-To: "[email protected]" <[email protected]>
> >>>>>> Date: Wednesday, July 10, 2013 11:01 AM
> >>>>>> To: "[email protected]" <[email protected]>
> >>>>>> Subject: Re: to involve in your development group
> >>>>>>
> >>>>>>> sir,
> >>>>>>>
> >>>>>>> My name is sandeep rg.i am a btech graduate in computer science.now doing
> >>>>>>> an internship in a company in java language.
> >>>>>>>
> >>>>>>> then i had installed all things succesfully,now downloading the
> >>>>>>> resource.ittake too much time.
> >>>>>>>
> >>>>>>> i have gone through the suggested ocr technologies.
> >>>>>>> Javaocr has some good user review.
> >>>>>>> Apache tika has a capability to process different types of format.
> >>>>>>> More than that there is tesserract which are also used for ocr purpose.
> >>>>>>> then apache pdfbox is also used for text extratcion but only for pdf
> >>>>>>> files.
> >>>>>>> now i am going through every thing to find out best technology from this.
> >>>>>>>
> >>>>>>> On Wed, Jul 10, 2013 at 12:52 AM, Chen, Pei
> >>>>>>> <[email protected]> wrote:
> >>>>>>>
> >>>>>>>> Hi Sandeep,
> >>>>>>>> I am delighted to work with you on this project.
> >>>>>>>>
> >>>>>>>> I was not sure if I understood you correctly- did you mean to say that you
> >>>>>>>> have already tried using cTAKES and it's components?
> >>>>>>>> If not, you can do an svn checkout of the code and try running the
> >>>>>>>> debugger gui from the command line (or eclipseide) that will allow you to
> >>>>>>>> type in plain text and get back the different structured content (types)
> >>>>>>>> that cTAKES produces:
> >>>>>>>> MAVEN_OPTS="-Xmx2g -Xms1g"
> >>>>>>>> mvn -PrunCVD compile
> >>>>>>>> From the guide:
> >>>>>>>> https://cwiki.apache.org/confluence/display/CTAKES/cTAKES+3.0+Developer+Install+Guide
> >>>>>>>>
> >>>>>>>> A bit of background:
> >>>>>>>> Apache cTAKES uses SVN for version on control:
> >>>>>>>> https://svn.apache.org/repos/asf/ctakes/trunk/
> >>>>>>>> Jira for issues tracking:
> >>>>>>>> https://issues.apache.org/jira/browse/ctakes
> >>>>>>>> Maven for building and dependency management.
> >>>>>>>> A lot of the developers use Eclipse IDE for their development.
> >>>>>>>> More info on ctakes.apache.org
> >>>>>>>>
> >>>>>>>> cTAKES is built on top of the Apache UIMA Framework. Essentially, cTAKES
> >>>>>>>> is a collection of Annotators (Java Classes) and wired together to into a
> >>>>>>>> pipeline.
> >>>>>>>> It's goal in a nutshell is to turn unstructured plain text into
> >>>>>>>> structured/normalized form and specially trained for medical notes.
> >>>>>>>> Right now- the input cTAKES expects would be in plain text form and
> >>>>>>>> cTAKES does not have an OCR component.
> >>>>>>>> cTAKE-189:GSoC:implement OCR/tika to standardize text inputs was an idea
> >>>>>>>> to allow cTAKES to take in any type of input (PDF, Images, Word, XLS,
> >>>>>>>> etc.) and pass the text for cTAKES processing.
> >>>>>>>> [I was originally thinking this could be done in some kind of
> >>>>>>>> preprocessing, or an optional Annotator that could be added in the
> >>>>>>>> beginning of a pipeline]. There may be some existing work that could be
> >>>>>>>> potentially reused: Apache Tika (
> >>>>>>>> https://issues.apache.org/jira/browse/TIKA-93 ) as well as some open
> >>>>>>>> source OCR toolkits (JavaOCR).
> >>>>>>>>
> >>>>>>>> About Me:
> >>>>>>>> http://childrenshospital.org/cfapps/research/data_admin/Site3240/mainpageS3240P8.html
> >>>>>>>> http://www.linkedin.com/in/peistation
> >>>>>>>> http://people.apache.org/committer-index.html#chenpei
> >>>>>>>>
> >>>>>>>>> -----Original Message-----
> >>>>>>>>> From: sandeep rg [mailto:[email protected]]
> >>>>>>>>> Sent: Tuesday, July 09, 2013 1:19 PM
> >>>>>>>>> To: [email protected]
> >>>>>>>>> Subject: Re: to involve in your development group
> >>>>>>>>>
> >>>>>>>>> Thanks a lot for giving me support.i like to work with you.
> >>>>>>>>>
> >>>>>>>>> I have gone through the objectives of the software,used the software and
> >>>>>>>>> gone through various components of the project.can you provide me starting
> >>>>>>>>> point from where i should start to know more about the coding part of the
> >>>>>>>>> project.
> >>>>>>>>>
> >>>>>>>>> can you tell me more about the project and about you also?
> >>>>>>>>>
> >>>>>>>>> On Tue, Jul 9, 2013 at 1:14 AM, Chen, Pei
> >>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>
> >>>>>>>>>> Hi Sandeep,
> >>>>>>>>>> Thank you for the interest. I just had a quick look at the ICFOSS
> >>>>>>>>>> pilot mentoring program and will be happy to serve as a mentor for
> >>>>>>>>>> your project
> >>>>>>>>>> proposal(s) if you are interested.
> >>>>>>>>>>
> >>>>>>>>>> --Pei
> >>>>>>>>>>
> >>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
> >>>>>>>>>>> Sent: Monday, July 08, 2013 2:24 PM
> >>>>>>>>>>> To: [email protected]
> >>>>>>>>>>> Subject: Re: to involve in your development group
> >>>>>>>>>>>
> >>>>>>>>>>> sir,
> >>>>>>>>>>>
> >>>>>>>>>>> details of the program Pilot mentoring programme with india ICFOSS
> >>>>>>>>>>> is given in the below web address
> >>>>>>>>>>> http://community.apache.org/mentoringprogramme-icfoss-pilot.html
> >>>>>>>>>>>
> >>>>>>>>>>> I am new to this community so i need a mentor for the project.It
> >>>>>>>>>>> will be more helpful for me..
> >>>>>>>>>>>
> >>>>>>>>>>> On Mon, Jul 8, 2013 at 7:22 PM, Chen, Pei
> >>>>>>>>>>> <[email protected]> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Hi Sandeep,
> >>>>>>>>>>>> Welcome! I am not familiar with the details of icfoss-apache, but
> >>>>>>>>>>>> please- you are more than welcome to work on the code and
> >>>>>>>>>>>> contributions will be greatly appreciated!
> >>>>>>>>>>>> There may be a learning curve, but feel free let us know if you
> >>>>>>>>>>>> have any questions/issues.
> >>>>>>>>>>>> Thanks,
> >>>>>>>>>>>> Pei
> >>>>>>>>>>>>
> >>>>>>>>>>>>> -----Original Message-----
> >>>>>>>>>>>>> From: sandeep rg [mailto:[email protected]]
> >>>>>>>>>>>>> Sent: Saturday, July 06, 2013 11:50 AM
> >>>>>>>>>>>>> To: [email protected]
> >>>>>>>>>>>>> Subject: to involve in your development group
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> my name is sandeep.i am btech graduate.i had participated in a
> >>>>>>>>>>>>> camp coordinated in kerala,India in association with
> >>>>>>>>>>>>> icfoss-apache called as youth
> >>>>>>>>>>>>> mentoring programme coordinated by Luciano resende.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> i like the project and like to involve in your project as a
> >>>>>>>>>>>>> programmer.i have gone through the your project and gone through
> >>>>>>>>>>>>> the bugs list.I like to work on the bug
> >>>>>>>>>>>>> "cTAKE-189:GSoC:implement OCR/tika to standardize text inputs
> >>>>>>>>>>>>> for cTAKES".can you allow me to work on that?
