Tesseract is an OCR engine, and the user is responsible for preprocessing;
see the documentation.
IMO there is already an app (using Tesseract) for what you are trying to do:
Text Fairy [1]

[1] https://play.google.com/store/apps/details?id=com.renard.ocr&hl=en
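
To illustrate the kind of preprocessing meant here, below is a minimal sketch of skew estimation with the projection-profile method. It is not part of Tesseract or its API: the function names (estimate_skew, synthetic_lines) are hypothetical, the "image" is just a list of (x, y) black-pixel coordinates, and the search range of +/-5 degrees in 0.1 degree steps is an assumption chosen to cover the rotations mentioned in the question. It handles plain rotation only, not curved lines near the spine of a book.

```python
import math


def estimate_skew(pixels, max_deg=5.0, step_deg=0.1):
    """Estimate page skew in degrees from black-pixel (x, y) coordinates.

    For each candidate angle, every pixel is sheared back onto a row index
    and the rows are counted. Straight text lines collapse onto few rows,
    so the sum of squared row counts peaks at the true skew.
    """
    best_angle, best_score = 0.0, -1.0
    steps = int(round(max_deg / step_deg))
    for i in range(-steps, steps + 1):
        angle = i * step_deg
        slope = math.tan(math.radians(angle))
        profile = {}
        for x, y in pixels:
            row = round(y - x * slope)
            profile[row] = profile.get(row, 0) + 1
        score = sum(count * count for count in profile.values())
        if score > best_score:
            best_angle, best_score = angle, score
    return best_angle


def synthetic_lines(skew_deg, width=400, baselines=(50, 100, 150)):
    """Build a toy page: three one-pixel-thick text lines tilted by skew_deg."""
    slope = math.tan(math.radians(skew_deg))
    return [(x, round(b + x * slope)) for b in baselines for x in range(width)]


if __name__ == "__main__":
    page = synthetic_lines(0.7)
    print("estimated skew (degrees):", round(estimate_skew(page), 1))
```

On a real photo the pixel list would come from a binarized image, and the page would then be rotated by the negative of the estimated angle before being passed to Tesseract. Curved lines need a separate dewarping step, which this sketch does not attempt.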

Zdenko


On Wed, 31 Jan 2024 at 2:00, Borneq <[email protected]> wrote:

> First I tested Tesseract on a file generated as a flat image.
> I generated Lorem Ipsum text:
>
> 5 paragraphs, 452 words, 2978 bytes, 24 lines + 4 blank lines; the maximum
> line length in my editor was 135 chars.
>
> Result: 100% accurate except for two full stops, fantastic.
>
> Next, I rotated the image. A rotation of only 0.7 degrees caused a lot of
> confusion, and smaller rotations of 0.1-0.6 degrees caused some "m" to be
> read as "n".
>
> In my book photos, the pages are often rotated by up to 3.5 degrees.
> Worse, the lines of text are bent into curves that look like an
> F-distribution
>
> ("What function looks like the edge of a paper book sideways?" on
> math.stackexchange.com)
>
> How can I work with real photos of books? Is this possible as an option, or
> is it something that is missing in Tesseract?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/9ac3343e-df3c-432e-8066-af21a20eda1cn%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/9ac3343e-df3c-432e-8066-af21a20eda1cn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

