Please don't send me mail.
On Fri, Jun 27, 2014 at 12:28 PM, John Hewson <[email protected]> wrote:
> Hi Dimuthu
>
> That’s great. We should wait until closer to the end of the GSoC period to
> integrate your work with PDFBox, as ideally we only want to have to do it
> once. We’ve not included C++ dependencies before so no, there won’t be a
> standard way, we’ll have to think something up. We’ll either make it an
> optional sub-project, or the Tesseract JNI bindings might be better off
> having their own branch so that they are more like an external dependency -
> I’ll ask the dev mailing list.
>
> To prepare your code for contribution you’ll need to add the Apache header
> to each .java file (see any PDFBox .java file for an example) and submit a
> signed ICLA http://www.apache.org/licenses/icla.pdf to Apache.
>
> Regarding additional functionality, the most useful would be a new
> command line tool which could write the OCR’d text back into the original
> PDF file as “invisible text”, which would allow copy and paste and text
> search to then work for that PDF file. A starting point for this would be
> to try to write the OCR’d text into the original PDF as “visible” text -
> we can make it invisible later!
>
> -- John
>
> On 19 Jun 2014, at 13:57, DImuthu Upeksha <[email protected]>
> wrote:
>
> > Hi John,
> > Except for providing compatibility for platforms like Windows, I think
> > most of the functionality of the OCR plugin is finished (please correct
> > me if I'm wrong). But I would like to contribute to the project further.
> > Do you have anything to add as new functionality? And if you plan to add
> > this to the PDFBox code, how should I prepare my code? Is there a
> > standard way?
> >
> > Thanks
> > Dimuthu
> > --
> > Regards
> > W.Dimuthu Upeksha
> > Undergraduate
> > Department of Computer Science And Engineering
> > University of Moratuwa, Sri Lanka
