Use page segmentation mode 5, 6 or 7 (the -psm command line switch). Tesseract's automatic layout analysis fails for this image, so you have to specify the layout manually.
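As a rough sketch of what that looks like on the command line, the loop below tries each of the three modes on the image from the question. The `run_psm` helper and the `output_psmN` file names are made up for illustration. The `-psm` spelling matches the 3.02 release mentioned in the thread; newer releases, as far as I know, spell the switch `--psm`.

```shell
#!/bin/sh
# Hypothetical helper: print the command for one page segmentation mode,
# then run it only if tesseract and the image are both available.
run_psm() {
    image=$1
    psm=$2
    out="output_psm$psm"    # tesseract appends .txt to this base name
    echo "tesseract $image $out -psm $psm"
    if command -v tesseract >/dev/null 2>&1 && [ -f "$image" ]; then
        tesseract "$image" "$out" -psm "$psm"
    fi
}

# Page segmentation modes suggested above:
#   5 = single uniform block of vertically aligned text
#   6 = single uniform block of text
#   7 = single text line
for psm in 5 6 7; do
    run_psm geo.png "$psm"
done
```

Comparing the three resulting text files should show which mode suits this kind of image; that mode can then be fixed in the batch command.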
Warm regards,
Dmitri Silaev
www.CustomOCR.com

On Wed, Feb 20, 2013 at 6:42 PM, Andrea Fontana <[email protected]> wrote:
> I'm working on an OCR batch system. Tesseract works for most input
> images.
>
> In this case (and similar cases too) Tesseract won't work and says "empty
> page".
>
> Here is the image (I did my best to automatically remove noise and
> artifacts; it's a pre-processed image taken with a scanner):
> https://dl.dropbox.com/u/11129378/geo.png
>
> Tried with Tesseract 3.02 on Ubuntu 64-bit.
> Command:
> tesseract geo.png output
>
> --
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>
> ---
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/groups/opt_out.

