Perhaps this is just a misunderstanding or bad documentation. The
--print-parameters dump shows the input parameters, and the user_words_file
/ user_patterns_file parameters, if they're not set on the command line,
will always be empty.
The file name that actually gets loaded is computed on the fly here:
https://github.com/tesseract-ocr/tesseract/blob/master/dict/dict.cpp#L274
but the result isn't saved back into the user_words_file parameter.
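
To make that concrete, here is a rough Python sketch of the lookup as I read it. It is not the actual Tesseract code: the function name resolve_user_words_path, the dict standing in for the parameter table, and the "tessdata/deu." prefix are all made up for illustration, and I'm assuming the suffix is simply appended to a per-language data path prefix.

```python
def resolve_user_words_path(params, lang_data_prefix):
    """Return the word-list path that would be loaded, or None.

    `params` is a plain dict standing in for the parameter table
    (hypothetical). It is only read, never written, which mirrors why
    a parameter dump still shows an empty user_words_file.
    """
    explicit = params.get("user_words_file", "")
    suffix = params.get("user_words_suffix", "")
    if explicit:
        # An explicitly given file name is used as-is.
        return explicit
    if suffix:
        # Otherwise the name is derived from the prefix plus the suffix.
        return lang_data_prefix + suffix
    # Neither is set: no user word list is loaded.
    return None


params = {"user_words_file": "", "user_words_suffix": "user-words"}
path = resolve_user_words_path(params, "tessdata/deu.")
print(path)                             # tessdata/deu.user-words
print(repr(params["user_words_file"]))  # ''
```

The derived path only ever lives in a local variable, so the parameter itself stays empty even though a file is loaded.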
Tom
On Monday, September 28, 2015 at 4:02:15 AM UTC-4, Stef wrote:
>
> Tom,
>
> I wasn't aware of the new possibility to specify user words on the command
> line. Instead I used the config file method with the following command
> lines and outputs:
>
> tesseract.exe --version
> tesseract 3.05.00dev
> leptonica-1.72
> libgif 4.1.6(?) : libjpeg 8d (libjpeg-turbo 1.3.1) : libpng 1.6.17 :
> libtiff 4.0.3 : zlib 1.2.8 : libwebp 0.4.3
>
> tesseract.exe test.jpg stdout -l deu --print-parameters bazaar | grep
> load_system\|load_freq\|user_
> load_system_dawg 0 Load system word dawg.
> load_freq_dawg 0 Load frequent word dawg.
> user_words_file A filename of user-provided words.
> user_words_suffix user-words A suffix of user-provided words located
> in tessdata.
> user_patterns_file A filename of user-provided patterns.
> user_patterns_suffix user-patterns A suffix of user-provided
> patterns located in tessdata.
>
>
> tesseract.exe test.jpg stdout -l deu --print-parameters | grep
> load_system\|load_freq\|user_
> load_system_dawg 1 Load system word dawg.
> load_freq_dawg 1 Load frequent word dawg.
> user_words_file A filename of user-provided words.
> user_words_suffix A suffix of user-provided words located in
> tessdata.
> user_patterns_file A filename of user-provided patterns.
> user_patterns_suffix A suffix of user-provided patterns located in
> tessdata.
>
> My bazaar config file:
>
> load_system_dawg F
> load_freq_dawg F
> user_words_suffix user-words
> user_patterns_suffix user-patterns
>
> For the time being, I solved my problem by increasing the scan resolution
> from 300 dpi to 600 dpi, which ensures that everything is recognized
> correctly with the default (system) settings.
>
>