AFAIR there were tests with the legacy engine in which the improvement in result quality from dictionaries was measured at 10-15% for common text. However, adding a word to a dictionary has never guaranteed that Tesseract recognizes that word correctly. For non-word input (e.g. serial numbers ...) the advice was always to turn dictionaries off. IMO the results depend on the input image quality (with good image quality, dictionaries seem to have no effect). If you need more detail or experience reports, dig into the history of this forum (especially the period after the first version 3 release).
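For what it's worth, turning dictionaries off usually means setting the config variables `load_system_dawg` and `load_freq_dawg` to false. Below is a minimal sketch that only builds the command line for such a run. The helper name `build_tesseract_command` and the file names `serial.png` and `out` are placeholders of mine, not something from this thread, and I have not measured what effect this has with the LSTM engine.

```python
def build_tesseract_command(image_path, output_base, use_dictionaries=True, lang="eng"):
    """Return the argument list for a tesseract CLI call (hypothetical helper)."""
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if not use_dictionaries:
        # load_system_dawg and load_freq_dawg control whether the main
        # word list and the frequent-word list are loaded at startup.
        cmd += ["-c", "load_system_dawg=false", "-c", "load_freq_dawg=false"]
    return cmd


if __name__ == "__main__":
    # Placeholder file names; replace them with a real image and output base.
    print(" ".join(build_tesseract_command("serial.png", "out", use_dictionaries=False)))
```

Running the same image once with `use_dictionaries=True` and once with `False`, then comparing the two text outputs, is one simple way to see whether the word list makes any difference for a given input.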
I have never heard of anybody running a comparable dictionary test for the LSTM engine.

Zdenko

On Sun, 19 Nov 2023 at 18:37, Des Bw <[email protected]> wrote:
> Does Tesseract actually use the dictionary (wordlist) included into the model (traineddata file)?
>
> - I am not getting any difference/impact by including a dictionary (word list) into the file.
>
> Has anybody experimented with a dictionary set up?

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to [email protected]. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/CAJbzG8zRPX6wxb7U38HqittfFh1Wg9_1xPrwoTZYba357gWQvg%40mail.gmail.com.

