Thank you, Don, for the comments.

On Tue, Feb 8, 2011 at 4:06 PM, SpeedyChair <[email protected]> wrote:
>  Another way to prepare a PDF document for tesseract is to use the 'convert'
> command from the ImageMagick package to split an image-only PDF file into a
> series of grayscale TIFF images, one for each page.  The convert command
> works on just about any image format.  For PDF conversions, it actually
> hands all of the work to Ghostscript.  The same syntax also works with
> multi-page TIFF files and PostScript files.
>
> convert mydoc.pdf -type GrayScale -depth 8 -scene 1 mydoc-%03d.tif
>
> Then you need to loop through the TIFF files and run OCR on each page
> image.  In a day or two, I will update my speedy-ocr bash script so that
> it also handles image-only PDF files.
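
For anyone who wants to try this before the script is updated, the loop over the page images could be sketched roughly as follows. This is only a sketch, not Don's speedy-ocr script: the function name ocr_pages is made up for illustration, and it assumes tesseract is on the PATH and is called as "tesseract <image> <outputbase>", which writes <outputbase>.txt.

```shell
#!/bin/sh
# Sketch only: run OCR on every page image produced by the convert command.
# ocr_pages is a hypothetical helper name, not part of tesseract or speedy-ocr.
# Assumption: "tesseract <image> <outputbase>" writes <outputbase>.txt.

ocr_pages() {
    prefix=$1   # e.g. "mydoc" for mydoc-001.tif, mydoc-002.tif, ...
    for page in "$prefix"-*.tif; do
        # If nothing matched, the pattern is left unexpanded; skip it.
        [ -e "$page" ] || continue
        # Strip the .tif suffix; tesseract adds .txt to the base name itself.
        tesseract "$page" "${page%.tif}"
    done
}
```

After running the convert command, "ocr_pages mydoc" should leave one text file per page (mydoc-001.txt, mydoc-002.txt, ...). Because the page numbers are zero-padded, "cat mydoc-*.txt > mydoc.txt" then joins them in page order.
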
>
> Don Marang
> Vinux Software Coordinator - vinux.org.uk
>
> There is just so much stuff in the world that, to me, is devoid of any real
> substance, value, and content that I just try to make sure that I am working
> on things that matter.
> Dean Kamen
>
> From: KHEM Sochenda
> Sent: Monday, February 07, 2011 10:23 PM
> To: [email protected]
> Subject: Re: VietOCR v2.0/3.1 & VietOCR.NET v2.0 Releases
> Dear Quan,
>
> I would like to know how to let tesseract OCR work with pdf documents.
>
> Thank you very much in advance for your kind response.
>
> With Best Regards,
>
> Sochenda
>
> On Tue, Feb 8, 2011 at 7:56 AM, Quan Nguyen <[email protected]> wrote:
>>
>> A Java/.NET GUI frontend for the Tesseract OCR engine. The releases
>> include the following fixes and improvements:
>>
>> * Add support for spellcheck suggestion in context menu
>> * Improve program accessibility and usability
>> * Add support for downloading and installing language data packs and
>> appropriate spell dictionaries
>> * Add UI localization for Lithuanian and Slovak
>> * Update Tesseract OCR engine to 3.01 (r551) (v3.1 only)
>>
>> http://vietocr.sf.net
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To post to this group, send email to [email protected].
>> To unsubscribe from this group, send email to
>> [email protected].
>> For more options, visit this group at
>> http://groups.google.com/group/tesseract-ocr?hl=en.
>>
>

