Have you seen the TestingTesseract wiki page? https://github.com/tesseract-ocr/tesseract/wiki/TestingTesseract
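
On the second question in the quoted message (software that compares OCR results with an ASCII transcription), here is a minimal sketch of one common approach: a character error rate based on Levenshtein edit distance. It is not part of Tesseract, and the function names `levenshtein` and `character_error_rate` are made up for illustration. It assumes both texts are already loaded as Python strings and does no whitespace or case normalization.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character edits that turn a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca from a
                current[j - 1] + 1,      # insert cb into a
                previous[j - 1] + cost,  # substitute ca with cb (or keep it)
            ))
        previous = current
    return previous[-1]


def character_error_rate(ocr_text: str, ground_truth: str) -> float:
    """Edit distance divided by the length of the ground-truth text."""
    if not ground_truth:
        # Hypothetical convention: empty reference is perfect only if OCR is empty too.
        return 0.0 if not ocr_text else float("inf")
    return levenshtein(ocr_text, ground_truth) / len(ground_truth)


if __name__ == "__main__":
    # Illustrative strings only, not real OCR output.
    print(character_error_rate("he1lo world", "hello world"))
```

Running this once per thresholding method against the same transcription gives one number per method to compare; lower is better. For full pages you would probably want to normalize line breaks and spacing first, since those count as errors here.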
On 26 Sep 2016 11:28 p.m., "Pedro Correia" <[email protected]> wrote:

> Hi there, I'm currently testing different custom thresholding methods with
> tesseract and I need a database of pictures of book pages in order to
> compare the results of these processes. They don't necessarily need to be
> book pages, but all I have found so far are databases of natural scenes,
> glyphs and wrapped sentences, and those do not interest me, because I need
> full texts.
> Does anyone know of a picture database that already includes its
> transcription to ASCII?
> Also, is there any software that compares the results I obtain with
> this ASCII dataset?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
Visit this group at https://groups.google.com/group/tesseract-ocr.

