AFAIR there were tests with the legacy engine in which dictionaries were
measured to improve result quality by 10-15% for common text.
However, adding a word to a dictionary has never guaranteed that Tesseract
recognizes that word accurately.
For non-word input (e.g. serial numbers ...) the advice has always been to
turn dictionaries off.
IMO the results depend on the input image quality (for good-quality images
there seems to be no effect). If you need more details or experience
reports, dig into the history of this forum (especially the period after
the first release of version 3).
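
To make the "turn dictionaries off" advice above concrete, here is a minimal
sketch that only builds the command line for the tesseract CLI. It assumes
the usual config variables load_system_dawg and load_freq_dawg are the ones
to switch off; the helper name and the file names are made up for
illustration, and I have not checked how much the LSTM engine is affected.

```python
import shlex

# Config variables that control loading of the main word list and the
# frequent-word list. Assumption: disabling these two is sufficient.
DICTIONARY_VARS = ("load_system_dawg", "load_freq_dawg")


def build_tesseract_command(image_path, output_base,
                            use_dictionaries=True, lang="eng"):
    """Return the argument list for a tesseract CLI call (hypothetical helper)."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if not use_dictionaries:
        # Each variable is passed as its own "-c name=value" pair.
        for name in DICTIONARY_VARS:
            cmd += ["-c", f"{name}=0"]
    return cmd


if __name__ == "__main__":
    # "serial.png" and "out" are placeholder names.
    print(shlex.join(build_tesseract_command("serial.png", "out",
                                             use_dictionaries=False)))
```

With tesseract installed, the returned list can be handed to
subprocess.run(cmd, check=True). Running it once with and once without
dictionaries on the same image is a simple way to see whether they matter
for your data.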

I have never heard of anybody running such a test for the LSTM engine.

Zdenko


On Sun, 19 Nov 2023 at 18:37, Des Bw <[email protected]> wrote:

> Does Tesseract actually use the dictionary (word list) included in the
> model (traineddata file)?
>
> - I am not seeing any difference or impact from including a dictionary
> (word list) in the file.
>
> Has anybody experimented with a dictionary setup?
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/381c213c-da12-482a-accf-e6847c0fc01bn%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/381c213c-da12-482a-accf-e6847c0fc01bn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>

