Please read the wiki https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality#page-segmentation-method
Zdenko

On Wed, Apr 20, 2016 at 10:37 PM, S.J. Becker <[email protected]> wrote:
>
> I've attached two files.
>
> The first file is my original one. It returns an empty page (with
> eng.traineddata).
>
> I noticed that there was no margin at the top and little at the bottom.
> So I used GIMP to add about 4 pixels at the top and bottom. The result
> is the second attached file.
>
> This OCRed properly.
>
> Command line:
> tesseract -c tessedit_create_tsv=1 tess_1_1b.tif tess
>
> Output:
> level page_num block_num par_num line_num word_num left top width height conf text
> 1 1 0 0 0 0 0 0 336 110 -1<>
> 2 1 1 0 0 0 28 7 270 98 -1<>
> 3 1 1 1 0 0 28 7 270 98 -1<>
> 4 1 1 1 1 0 28 7 270 98 -1<>
> 5 1 1 1 1 1 28 7 270 98 91 A1.01
>
> A1.01 with a confidence of 91
>
> Should I file a bug? Or always pad my images with whitespace?
>
> thanks
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/40a4828d-9a46-4e36-9b22-8b925f39a046%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/40a4828d-9a46-4e36-9b22-8b925f39a046%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.

--
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8y5jp1Dp_V5y5ETpXocfEHPb4xSXBc6kh3jzN88f4nvMQ%40mail.gmail.com. For more options, visit https://groups.google.com/d/optout.
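
For anyone comparing the padded and unpadded images in a script rather than by eye, the TSV output quoted above can be checked with a few lines of Python. This is only a rough sketch using the standard library: the helper name `words_from_tsv` is made up for illustration, the sample data is retyped from the quoted output assuming the columns are tab-separated, and the `<>` shown after `-1` is assumed to stand for an empty text field.

```python
import csv
import io

# Sample retyped from the TSV quoted in this thread (assumed tab-separated;
# the "<>" after -1 in the quote is assumed to mean an empty text field).
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t336\t110\t-1\t\n"
    "2\t1\t1\t0\t0\t0\t28\t7\t270\t98\t-1\t\n"
    "3\t1\t1\t1\t0\t0\t28\t7\t270\t98\t-1\t\n"
    "4\t1\t1\t1\t1\t0\t28\t7\t270\t98\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t28\t7\t270\t98\t91\tA1.01\n"
)


def words_from_tsv(tsv_text):
    """Return (text, confidence) pairs for word-level rows (level 5).

    Hypothetical helper, not part of Tesseract itself.
    """
    reader = csv.DictReader(
        io.StringIO(tsv_text),
        delimiter="\t",
        quoting=csv.QUOTE_NONE,  # treat quote characters in OCR text literally
    )
    words = []
    for row in reader:
        if row["level"] != "5":
            continue  # skip page, block, paragraph and line rows
        text = (row["text"] or "").strip()
        if text:
            words.append((text, float(row["conf"])))
    return words


if __name__ == "__main__":
    found = words_from_tsv(SAMPLE_TSV)
    if found:
        for text, conf in found:
            print(f"{text} (confidence {conf:g})")
    else:
        print("empty page: no words recognized")
```

An empty list from `words_from_tsv` corresponds to the "empty page" case described for the unpadded image, so the same check can be run on both files.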

