Thanks, Nick.

I already have it set up with Ghostscript, as it gives better results than 
ImageMagick.

As the PDFs are mostly multi-page, and Ghostscript can generate a multi-page 
TIFF from each one, I can feed these directly into Tesseract.
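
For anyone following along, here is a rough sketch of that pipeline in Python. 
It is not my exact setup: the helper names, the tiffg4 device, and the 300 dpi 
resolution are assumptions for illustration, and it expects the gs and 
tesseract binaries to be on the PATH, with Tesseract built against libtiff so 
it can read multi-page TIFFs.

```python
import subprocess
from pathlib import Path


def ghostscript_command(pdf_path, tiff_path, dpi=300, device="tiffg4"):
    """Build a Ghostscript call that renders all PDF pages into one TIFF.

    With no %d in the output file name, the TIFF devices write every page
    into a single multi-page file. The device and dpi are assumed defaults.
    """
    return [
        "gs",
        "-q", "-dNOPAUSE", "-dBATCH", "-dSAFER",
        f"-sDEVICE={device}",
        f"-r{dpi}",
        f"-sOutputFile={tiff_path}",
        str(pdf_path),
    ]


def tesseract_command(tiff_path, output_base):
    """Build the Tesseract call; it writes its text to <output_base>.txt."""
    return ["tesseract", str(tiff_path), str(output_base)]


def ocr_pdf(pdf_path, work_dir="."):
    """Render a PDF to a multi-page TIFF, then OCR it (hypothetical helper)."""
    pdf_path = Path(pdf_path)
    tiff_path = Path(work_dir) / (pdf_path.stem + ".tif")
    output_base = Path(work_dir) / pdf_path.stem
    subprocess.run(ghostscript_command(pdf_path, tiff_path), check=True)
    subprocess.run(tesseract_command(tiff_path, output_base), check=True)
    # Append ".txt" by hand so stems that contain dots are preserved.
    return Path(str(output_base) + ".txt")
```

Calling ocr_pdf("scan.pdf") would leave scan.tif and scan.txt in the working 
directory. tiffg4 is a bilevel device; a greyscale or colour TIFF device may 
suit some scans better.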

So I don't think pdfimages is an option as it spits out multiple files.

Steve

On Tuesday, April 30, 2013 12:39:53 AM UTC+12, Nick White wrote:
>
> On Mon, Apr 29, 2013 at 04:10:43AM -0700, Steven McArdle wrote: 
> > What do you mean by "it doesn't support straight PDF" ? 
>
> I mean it only accepts image files. So you need to extract the 
> images from the PDF before getting Tesseract to process them. 
>
> Now I think of it, the 'pdfimages' tool is better for this than 
> imagemagick, as it will extract without converting or losing any 
> quality. But either would work fine (or Ghostscript, as you point 
> out). 
>
> Nick 
>

-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
