Hi all, I have an issue with providing a list of user words to Tesseract. I use Windows 10. Installed Tesseract version:

    >tesseract.exe -v
    tesseract v5.0.0-alpha.20191030
     leptonica-1.78.0
      libgif 5.1.4 : libjpeg 8d (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 0.6.1 : libopenjp2 2.3.0
     Found AVX2
     Found AVX
     Found FMA
     Found SSE
     Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.3 bz2lib/1.0.6 liblz4/1.7.5

My test image: [image: test.jpg]

I have an "eng.user-words" file in the directory with the traineddata files. It contains a single word:

    B1adeb1ab1a

The config file "bazaar" is as follows:

    load_system_dawg F
    load_freq_dawg F
    user_words_file path/to/eng.user-words
    user_words_suffix user-words
    language_model_penalty_non_freq_dict_word 1
    language_model_penalty_non_dict_word 1

Running this command

    "C:\Program Files\Tesseract-OCR\tesseract.exe" test.jpg stdout -l eng bazaar

gives "Bladeblabla" instead of "B1adeb1ab1a".

This command gives the same wrong result, "Bladeblabla" instead of "B1adeb1ab1a":

    "C:\Program Files\Tesseract-OCR\tesseract.exe" test.jpg stdout -l eng --user-words path/to/eng.user-words

Where am I wrong?
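
In case it helps anyone reproduce this, the two runs can be scripted roughly as below. This is only a sketch: the helper names (`bazaar_config_text`, `build_commands`, `run_all`) are made up for illustration, `path/to/eng.user-words` is the same placeholder as in my config, and the script assumes `tesseract` is on the PATH (otherwise pass the full path to the executable).

```python
import shutil
import subprocess
from pathlib import Path

# Placeholder path, exactly as written in the config (not a real location).
USER_WORDS = "path/to/eng.user-words"

# Settings copied from the "bazaar" config file in this post.
BAZAAR_SETTINGS = [
    ("load_system_dawg", "F"),
    ("load_freq_dawg", "F"),
    ("user_words_file", USER_WORDS),
    ("user_words_suffix", "user-words"),
    ("language_model_penalty_non_freq_dict_word", "1"),
    ("language_model_penalty_non_dict_word", "1"),
]


def bazaar_config_text(settings=BAZAAR_SETTINGS):
    """Render the settings as 'name value' lines, one per line."""
    return "".join(f"{name} {value}\n" for name, value in settings)


def build_commands(exe="tesseract", image="test.jpg"):
    """Return the two command lines from this post as argument lists."""
    base = [exe, image, "stdout", "-l", "eng"]
    return {
        "config": base + ["bazaar"],
        "user_words_flag": base + ["--user-words", USER_WORDS],
    }


def run_all(exe="tesseract", image="test.jpg"):
    """Run both commands and return their stdout (needs Tesseract and the image)."""
    if shutil.which(exe) is None or not Path(image).exists():
        raise FileNotFoundError("tesseract executable or test image not found")
    results = {}
    for label, cmd in build_commands(exe, image).items():
        done = subprocess.run(cmd, capture_output=True, text=True, check=False)
        results[label] = done.stdout.strip()
    return results
```

Calling `run_all()` returns the recognized text for both invocations, so each can be compared against the expected "B1adeb1ab1a".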