Please file an issue on the GitHub repo and attach these files so that the developers can look at it.
However, for your app, add the whitespace margin to your images as part of preprocessing, since any fix may take a while.

- sent from my phone. excuse the brevity.

On 21-Apr-2016 11:49 am, "S.J. Becker" <[email protected]> wrote:
>
> I've attached two files.
>
> The first file is my original one. It returns empty page (with
> eng.traineddata).
>
> I noticed that there was no margin at the top and little at the bottom.
> So I used gimp to add about 4 pixels at the top and bottom. The result
> is the second attached file.
>
> This ocred properly.
>
> Command line:
> tesseract -c tessedit_create_tsv=1 tess_1_1b.tif tess
>
> Output:
> level  page_num  block_num  par_num  line_num  word_num  left  top  width  height  conf  text
> 1      1         0          0        0         0         0     0    336    110     -1<>
> 2      1         1          0        0         0         28    7    270    98      -1<>
> 3      1         1          1        0         0         28    7    270    98      -1<>
> 4      1         1          1        1         0         28    7    270    98      -1<>
> 5      1         1          1        1         1         28    7    270    98      91    A1.01
>
>
> A1.01 with a confidence of 91
>
> Should I file a bug? Or always pad my images with whitespace?
>
> thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/40a4828d-9a46-4e36-9b22-8b925f39a046%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
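In case it helps with the preprocessing workaround, here is a minimal sketch of the padding step in plain Python. It assumes the image has already been decoded into rows of 8-bit grayscale values (0 = black, 255 = white). The names `pad_image` and `to_pgm` are made up for this example, and the default margin of 4 pixels simply mirrors what was added in gimp above, so treat it as a starting point and not a recommended value. Unlike the gimp edit, this pads all four sides.

```python
def pad_image(rows, margin=4, fill=255):
    """Return a copy of a grayscale raster with a border of `fill` pixels.

    `rows` is a list of equal-length lists of ints in 0..255.
    `margin` is the border width in pixels on every side (an assumption;
    the original report only padded top and bottom).
    """
    if margin < 0:
        raise ValueError("margin must be non-negative")
    if not rows:
        raise ValueError("image has no rows")
    width = len(rows[0])
    if any(len(row) != width for row in rows):
        raise ValueError("all rows must have the same length")

    blank = [fill] * (width + 2 * margin)
    side = [fill] * margin

    padded = [list(blank) for _ in range(margin)]        # top border
    for row in rows:
        padded.append(side + list(row) + side)           # left + pixels + right
    padded.extend(list(blank) for _ in range(margin))    # bottom border
    return padded


def to_pgm(rows):
    """Serialize the raster as a binary PGM (P5) byte string."""
    height, width = len(rows), len(rows[0])
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + bytes(value for row in rows for value in row)


if __name__ == "__main__":
    tiny = [[0, 0, 0], [0, 255, 0]]          # stand-in for real pixel data
    padded = pad_image(tiny)
    with open("padded.pgm", "wb") as handle:  # hypothetical output name
        handle.write(to_pgm(padded))
```

The PGM output is only there to keep the sketch dependency-free. Whether your Tesseract build reads PGM depends on its image library, so if you already load images through an imaging library, its own border or canvas-resize function is probably the simpler route.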

