In addition to what has already been suggested: if you have Adobe Acrobat 
(Writer) installed, version 6 or later, go to the File menu, choose Save As, 
and select an image type such as JPEG. Each page will then be saved as a 
separate image.
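
Once the pages are saved as images, each one can be handed to tesseract in 
turn. The following is only a rough sketch, not a tested recipe: it assumes 
the tesseract command is on your PATH and that your build can read JPEG files 
(older builds may accept only TIFF, in which case save the pages as TIFF and 
change the pattern). The function names and the "pages" folder are made up 
for illustration.

```python
import subprocess
from pathlib import Path


def tesseract_command(image: Path) -> list[str]:
    """Build the command line: tesseract <image> <output base>."""
    # tesseract appends ".txt" to the output base on its own,
    # so the base is just the image path without its extension.
    return ["tesseract", str(image), str(image.with_suffix(""))]


def ocr_pages(folder: Path, pattern: str = "*.jpg") -> list[Path]:
    """Run tesseract on every matching image in `folder`.

    Images are processed in sorted filename order. Returns the paths
    of the text files tesseract is expected to write.
    """
    outputs = []
    for image in sorted(folder.glob(pattern)):
        # check=True raises CalledProcessError if tesseract fails.
        subprocess.run(tesseract_command(image), check=True)
        outputs.append(image.with_suffix(".txt"))
    return outputs
```

For example, ocr_pages(Path("pages")) would OCR every .jpg file in a 
hypothetical "pages" folder and leave one .txt file next to each image.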

Hussein Al-Hussein

> From: [email protected]
> To: [email protected]
> Subject: RE: Extracting text from PDF
> Date: Mon, 4 Jan 2010 14:21:20 -0500
> 
> Personally, I would just use Image::Magick or GD to convert the .pdf into a
> .tiff and then simply have tesseract ocr it.
> 
> Someone else may have a better solution though.
> 
> 
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]]
> On Behalf Of Eitan
> Sent: Monday, January 04, 2010 8:10 AM
> To: tesseract-ocr
> Subject: Extracting text from PDF
> 
> Hi
> 
> I am a newbie...
> Is there a standard way to extract text from a PDF using tesseract-ocr?
> 
> Thanks
> 
> --
> 
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
                                          
