First, I tested Tesseract on a file generated as a flat image.
I generated Lorem Ipsum text:
5 paragraphs, 452 words, 2978 bytes, 24 lines + 4 blank lines; the longest
line in my editor was 135 characters.
Result: 100% accurate apart from two full stops. Fantastic.
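
For anyone who wants to reproduce this kind of comparison, here is a minimal
sketch of how the OCR output could be scored against the known source text.
It uses only Python's standard difflib module. The function names are
hypothetical, and the similarity ratio is just one possible definition of
"accuracy", not necessarily how the figure above was obtained.

```python
import difflib


def char_accuracy(expected: str, recognized: str) -> float:
    """Similarity ratio in [0, 1] between the ground truth and the OCR output."""
    matcher = difflib.SequenceMatcher(None, expected, recognized, autojunk=False)
    return matcher.ratio()


def char_diffs(expected: str, recognized: str):
    """Return (operation, expected_fragment, recognized_fragment) per mismatch."""
    matcher = difflib.SequenceMatcher(None, expected, recognized, autojunk=False)
    return [
        (tag, expected[i1:i2], recognized[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]


if __name__ == "__main__":
    truth = "Lorem ipsum dolor."
    ocr = "Loren ipsum dolor"  # made-up OCR output for illustration
    print(char_accuracy(truth, ocr))
    print(char_diffs(truth, ocr))
```

Listing the individual differences makes it easy to spot systematic errors,
such as a dropped full stop or an "m" read as "n".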
Next, I rotated the image. A rotation of only 0.7 degrees caused a lot of
confusion, and a minor rotation of 0.1-0.6 degrees caused some "m" characters
to be read as "n".
In my book photos, the pages are often rotated by up to 3.5 degrees.
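
A uniform rotation like this can in principle be estimated and undone before
OCR. The sketch below shows one common idea, a projection-profile search: try
candidate angles and keep the one where the dark pixels pile up into the
sharpest horizontal rows. It is only an illustration, not something Tesseract
is confirmed to do internally. The function names, the synthetic test data,
and the 5-degree search range are assumptions. It works on a plain list of
dark-pixel coordinates, so it needs no image library.

```python
import math
from collections import Counter


def profile_score(points, angle_deg):
    """Sharpness of the row histogram after rotating the points by -angle_deg."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    # Row index of each point once the candidate rotation is undone.
    rows = Counter(round(y * cos_a - x * sin_a) for x, y in points)
    # Sum of squares is largest when points concentrate in few rows.
    return sum(n * n for n in rows.values())


def estimate_skew(points, max_angle=5.0, step=0.1):
    """Return the candidate angle (degrees) with the sharpest projection profile."""
    steps = int(round(max_angle / step))
    candidates = [i * step for i in range(-steps, steps + 1)]
    return max(candidates, key=lambda ang: profile_score(points, ang))


def synthetic_lines(angle_deg, n_lines=10, width=400, spacing=20):
    """Hypothetical test data: straight 'text lines' tilted by angle_deg."""
    a = math.radians(angle_deg)
    sin_a, cos_a = math.sin(a), math.cos(a)
    points = []
    for k in range(n_lines):
        y0 = k * spacing
        for x in range(width):
            # Rotate the point (x, y0) by angle_deg around the origin.
            points.append((x * cos_a - y0 * sin_a, x * sin_a + y0 * cos_a))
    return points


if __name__ == "__main__":
    print(estimate_skew(synthetic_lines(0.7)))
```

On this synthetic data the search should recover an angle close to the one
used to generate the points. A real photo would first have to be binarized to
get the dark-pixel coordinates, and this approach only handles a single
rotation angle for the whole page.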
Worse, the lines of text are bent into curves that look like an
F-distribution (see "What function looks like the edge of a paper book
sideways?" on math.stackexchange.com).
How can I work with real photos of books? Is this possible with some option,
or is it something that is missing in Tesseract?