Hello, can you please post a link where I can find the "speedy-ocr" bash script?
Zd.

On Tue, Feb 8, 2011 at 10:06 AM, SpeedyChair <[email protected]> wrote:

> Another way to prepare a PDF document for tesseract is to use the
> 'convert' command from the ImageMagick package to split an image-only PDF
> file into a series of GrayScale TIFF images, one for each page. This
> convert command can work on just about any image. For PDF conversions, it
> actually makes Ghostscript do all of the work. This same syntax also works
> with multi-page TIFF files and PostScript files.
>
> convert mydoc.pdf -type GrayScale -depth 8 -scene 1 mydoc-%03d.tif
>
> Then you would need to loop through the TIFF files to perform OCR on each
> page image. In a day or two, I will update my speedy-ocr bash script, which
> will now handle PDF image files.
>
> Don Marang
> Vinux Software Coordinator - vinux.org.uk
>
> There is just so much stuff in the world that, to me, is devoid of any real
> substance, value, and content that I just try to make sure that I am working
> on things that matter.
> Dean Kamen
>
> *From:* KHEM Sochenda <[email protected]>
> *Sent:* Monday, February 07, 2011 10:23 PM
> *To:* [email protected]
> *Subject:* Re: VietOCR v2.0/3.1 & VietOCR.NET v2.0 Releases
>
> Dear Quan,
>
> I would like to know how to let tesseract OCR work with PDF documents.
>
> Thank you very much in advance for your kind response.
>
> With Best Regards,
>
> Sochenda
>
> On Tue, Feb 8, 2011 at 7:56 AM, Quan Nguyen <[email protected]> wrote:
>
>> A Java/.NET GUI frontend for Tesseract OCR engine. The releases
>> include the following fixes and improvements:
>>
>> * Add support for spellcheck suggestion in context menu
>> * Improve program accessibility and usability
>> * Add support for downloading and installing language data packs and
>>   appropriate spell dictionaries
>> * Add UI localization for Lithuanian and Slovak
>> * Update Tesseract OCR engine to 3.01 (r551) (v3.1 only)
>>
>> http://vietocr.sf.net
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
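
For anyone following the thread, the "convert, then loop over the TIFF files" approach described in the quoted message could look roughly like the sketch below. This is not Don Marang's speedy-ocr script, only a minimal illustration of the same idea. The function names (tiff_pattern, split_pdf, ocr_pages) are hypothetical, and it assumes ImageMagick, Ghostscript, and tesseract are installed and that the input file name has an extension.

```shell
#!/bin/sh
# Sketch only: OCR an image-only PDF page by page.
# Assumes 'convert' (ImageMagick, with Ghostscript) and 'tesseract' are on PATH.
# All function names here are made up for illustration.

# Build the per-page TIFF name pattern, e.g. mydoc.pdf -> mydoc-%03d.tif
tiff_pattern() {
    printf '%s-%%03d.tif\n' "${1%.*}"
}

# Split the PDF into one grayscale TIFF per page, numbered from 001.
# This is the convert command from the quoted message.
split_pdf() {
    convert "$1" -type GrayScale -depth 8 -scene 1 "$(tiff_pattern "$1")"
}

# Run tesseract on each page image. tesseract takes an output base name
# and writes <base>.txt, so mydoc-001.tif produces mydoc-001.txt.
# The glob matches the three-digit numbering produced by %03d.
ocr_pages() {
    for page in "${1%.*}"-[0-9][0-9][0-9].tif; do
        [ -e "$page" ] || continue   # skip if the glob matched nothing
        tesseract "$page" "${page%.tif}"
    done
}

# Only run when a file name is given on the command line.
if [ "$#" -ge 1 ]; then
    split_pdf "$1" && ocr_pages "$1"
fi
```

Saved under a name of your choice (say, pdf-ocr.sh) and run as "sh pdf-ocr.sh mydoc.pdf", this should leave one .txt file per page next to the TIFF files. A language option such as "-l vie" can be appended to the tesseract call if the matching language data is installed.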

