Can anyone suggest some debug settings I can enable to track down 
why I'm getting no output?
Thanks
Danny

On Tuesday, July 30, 2024 at 8:23:38 PM UTC+8 Danny wrote:

> I have a problem where Tesseract produces no output (a zero-byte output 
> file) when presented with Chinese characters followed by either an ellipsis 
> character or three periods.
>
> [image: bad_sub_243.png]
>
> If I crop the image in Photoshop to remove the dots, the three Chinese 
> characters are recognized perfectly. Feeding in the image above, or just 
> the three dots on their own, produces no output.
>
> I've just recompiled with the latest Git version (see below). I've also 
> re-trained the chi_tra model several times and added many words with the 
> three dots to the wordlist. The result is the same with both.
>
> Any suggestions?
>
> *Command*
> tesseract bad_sub_243.png output -l tqChiTra --loglevel TRACE \
>   -c edges_debug=1 -c ambigs_debug_level=10 -c classify_debug_level=10 \
>   -c dawg_debug_level=3 -c wordrec_debug_blamer=1 \
>   -c tessedit_dump_choices=1 -c tessedit_debug_block_rejection=1 \
>   -c textord_noise_debug=1 -c applybox_debug=10
>
> *Messages*
> Warning: Parameter not found: language_model_ngram_on
> Warning: Parameter not found: segsearch_max_char_wh_ratio
> Warning: Parameter not found: language_model_ngram_space_delimited_language
> Warning: Parameter not found: language_model_use_sigmoidal_certainty
> Warning: Parameter not found: language_model_ngram_nonmatch_score
> Warning: Parameter not found: classify_integer_matcher_multiplier
> Warning: Parameter not found: assume_fixed_pitch_char_segment
> Warning: Parameter not found: allow_blob_division
> Warning: Parameter not found: segsearch_max_char_wh_ratio
> Warning: Parameter not found: language_model_ngram_space_delimited_language
> Warning: Parameter not found: language_model_use_sigmoidal_certainty
> Warning: Parameter not found: language_model_ngram_nonmatch_score
> Warning: Parameter not found: classify_integer_matcher_multiplier
> Warning: Parameter not found: assume_fixed_pitch_char_segment
> Warning: Parameter not found: allow_blob_division
> Estimating resolution as 675
> Row ending at (221,23.6372): R=9999, dc=3, nc=0, REJECTED
> cleanup_blocks: # rows = 0 / 1
> cleanup_blocks: # blocks = 0 / 1
> Estimating resolution as 675
> Row ending at (221,23.6372): R=9999, dc=3, nc=0, REJECTED
> cleanup_blocks: # rows = 0 / 1
> cleanup_blocks: # blocks = 0 / 1
>
> *Version*
> # tesseract --version
> tesseract 5.4.1-11-g46b9
>  leptonica-1.76.0
>   libgif 5.1.4 : libjpeg 6b (libjpeg-turbo 1.5.3) : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11 : libwebp 1.0.0
>  Found AVX
>  Found SSE4.1
>  Found OpenMP 201511
>  Found libarchive 3.3.2 zlib/1.2.11 liblzma/5.2.4 bz2lib/1.0.6 liblz4/1.8.1
>  Found libcurl/7.61.1 OpenSSL/1.1.1c zlib/1.2.11 brotli/1.0.6 libidn2/2.2.0 libpsl/0.20.2 (+libidn2/2.0.5) libssh/0.9.0/openssl/zlib nghttp2/1.33.0
>
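
When comparing runs (full image, cropped image, dots only), it may help to pull the row verdicts out of the textord_noise_debug output instead of reading the log by eye. Below is a minimal Python sketch that parses the "Row ending at" lines quoted above. The function name parse_row_verdicts is hypothetical, and reading dc and nc as "dot-like" and "normal" blob counts is only an assumption based on the field names; the log itself does not say so.

```python
import re

# Matches lines shaped like the ones in the quoted log, e.g.
#   Row ending at (221,23.6372): R=9999, dc=3, nc=0, REJECTED
# The verdict is matched as any upper-case word, because only
# REJECTED appears in the log above.
ROW_RE = re.compile(
    r"Row ending at \((?P<x>-?\d+),(?P<y>-?[\d.]+)\): "
    r"R=(?P<ratio>[^,\s]+), dc=(?P<dc>\d+), nc=(?P<nc>\d+), "
    r"(?P<verdict>[A-Z]+)"
)


def parse_row_verdicts(log_text):
    """Return one dict per 'Row ending at' line found in log_text."""
    rows = []
    for line in log_text.splitlines():
        # search() skips any leading "> " email quote markers.
        m = ROW_RE.search(line)
        if m is None:
            continue
        rows.append({
            "x": int(m["x"]),
            "y": float(m["y"]),
            "ratio": m["ratio"],   # kept as text; its exact format is not documented here
            "dc": int(m["dc"]),    # assumption: count of dot-like blobs
            "nc": int(m["nc"]),    # assumption: count of normal-sized blobs
            "verdict": m["verdict"],
        })
    return rows


# The relevant lines copied from the log in this thread.
SAMPLE_LOG = """\
> Estimating resolution as 675
> Row ending at (221,23.6372): R=9999, dc=3, nc=0, REJECTED
> cleanup_blocks: # rows = 0 / 1
> cleanup_blocks: # blocks = 0 / 1
> Estimating resolution as 675
> Row ending at (221,23.6372): R=9999, dc=3, nc=0, REJECTED
> cleanup_blocks: # rows = 0 / 1
> cleanup_blocks: # blocks = 0 / 1
"""

rows = parse_row_verdicts(SAMPLE_LOG)
rejected = [r for r in rows if r["verdict"] == "REJECTED"]
print(f"{len(rejected)} of {len(rows)} rows rejected")
for r in rejected:
    print(f"  row at ({r['x']}, {r['y']}): dc={r['dc']}, nc={r['nc']}")
```

Running the same parser over the log from the cropped image (the one that is recognized correctly) should show whether the only row on the page is being thrown away before recognition starts, which is what "cleanup_blocks: # rows = 0 / 1" suggests here.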
