Hi all,

First, thank you for the great software and for the installation FAQ;
everything went smoothly.

My question: does anyone have tips on how to prepare an image for
tesseract? I've read the FAQ on borders, resolution, etc., but I have
one particular scenario where an image prepared on my Mac with Preview
works almost flawlessly (considering how bad the original JPEG is),
while the same preparation on a Linux machine with ImageMagick's
convert command gives disappointing results.

This is the original image:
http://dl.dropbox.com/u/12535857/tesseract/sample.jpg

If I convert it on my Mac and apply auto-levels, I get an image like this:
http://dl.dropbox.com/u/12535857/tesseract/mac.tiff

which is converted into text quite well:

‘or the short pastry:
a|1.purp0se flour,
1% 613:5 more as needed
stick) unsalted butter
Z/6 cup sugar
5 egg yolks
Salt
l4 lb. (1
For the filling:
6 oz. blanched almonds
6 large eggs, separated
1% cup sugar
1 pinch ground cinnamon
Grated zest of 1 lemon
‘A cup pearjelly,
warmed to liquld
For the glaze:
1% cups sugar
107- (1 square) unsweetened
l chocolate


And if I run the conversion with ImageMagick:

convert -compress none -auto-level -auto-gamma sample.jpg linux.tiff

I get an image like this:
http://dl.dropbox.com/u/12535857/tesseract/linux.tiff
and the output is much worse:


.17 flied
3-‘butter
nsugar
 yolks
 Salt
 filling:
 lalmonds
Eseparated
Q {cup sugar
 cinnamon
10:1 lemon
 1 pearjelly,
 ; " _ ed to liquid
‘:5 i
 the glazg:
3%‘; .
 cups sugar
imsweetened
311‘_’ - chocolate
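
In case it helps to reproduce this, the two Linux steps above can be
sketched as a small shell function. The function name and the output
base name are only illustrative, and the plain "tesseract IMAGE
OUTPUTBASE" call with default settings is an assumption about the
invocation, not something taken from the FAQ.

```shell
#!/bin/sh
# Sketch: preprocess an image with ImageMagick, then OCR it with tesseract.
# preprocess_and_ocr is a hypothetical helper name used for illustration.
preprocess_and_ocr() {
    src="$1"    # input image, e.g. sample.jpg
    base="$2"   # base name for the TIFF and the text output, e.g. linux

    # Same conversion as above: uncompressed TIFF with auto-level and auto-gamma.
    convert -compress none -auto-level -auto-gamma "$src" "$base.tiff" || return 1

    # tesseract writes the recognized text to "$base.txt".
    tesseract "$base.tiff" "$base" || return 1

    # Print the recognized text so runs can be compared side by side.
    cat "$base.txt"
}

# Example call (requires ImageMagick and tesseract to be installed):
#   preprocess_and_ocr sample.jpg linux
```

Wrapping both steps this way makes it easy to change only the convert
options and compare the resulting text between runs.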

Should I use other tools, or fine-tune the conversion? I tried
adjusting the contrast, but that doesn't seem to help. Thanks!
