Have you tried PSM 13? On my dataset it gives a few percent better accuracy
than PSM 6. Also, what kind of image preprocessing are you doing? I've
reclaimed a ton of accuracy by fine-tuning my preprocessing. Mind posting
some pictures of what you're recognizing?
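
In case it helps, here is a rough sketch of how the two modes could be
compared on your own data. It only builds the tesseract command line for a
given PSM and scores OCR output against a ground-truth transcription with a
character error rate. The helper names (tesseract_cmd, edit_distance,
char_error_rate) and the file name are made up for illustration, and "eng"
is a placeholder for whatever your fine-tuned traineddata is called. Keep in
mind that PSM 13 treats the image as a single raw text line, so it needs
images already cropped to one line, while PSM 6 takes the whole block.

```python
def tesseract_cmd(image, psm, lang="eng"):
    """Build an argv list for the tesseract CLI; nothing is executed here."""
    # "stdout" as the output base makes tesseract print the recognized text.
    return ["tesseract", image, "stdout", "--psm", str(psm), "-l", lang]


def edit_distance(a, b):
    """Levenshtein distance between two strings, one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute or match
        prev = cur
    return prev[-1]


def char_error_rate(truth, ocr):
    """Edit distance divided by the length of the ground-truth text."""
    if not truth:
        raise ValueError("ground-truth text must not be empty")
    return edit_distance(truth, ocr) / len(truth)


if __name__ == "__main__":
    # Hypothetical file name; pass the argv list to subprocess.run to use it.
    print(tesseract_cmd("line_0001.png", 13))
    print(tesseract_cmd("line_0001.png", 6))
```

Running each command through subprocess.run(cmd, capture_output=True,
text=True) and feeding its stdout to char_error_rate alongside the known
transcription gives one number per mode, which makes a "few percent"
difference easy to check on a held-out set.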

On Fri, Sep 13, 2019 at 2:00 AM Dustin Spicuzza <dustin.spicu...@gmail.com>
wrote:

> Hey,
>
> Using @shreeshrii's excellent examples at
> https://github.com/Shreeshrii/tessdata_shreetest, I've fine-tuned on a
> single monospace font with a giant pile of representative data. With very
> little effort the recognition results have been significantly better than
> using the stock English data -- just a few errors per page. Thanks so much!
>
> However, I'd like to get even closer to zero errors. I've been trying to
> constrain my problem in an effort to get better results:
>
>    - Known monospaced font, font size, page size
>    - Known character set (ASCII)
>    - Data layout is fairly consistent
>
> Are there configuration settings that I can use to provide hints to
> tesseract about the nature of the data? I don't really want it to do layout
> or blocks or anything particularly fancy, I just want it to recognize all
> the text and give it to me. I've been using page segmentation mode (PSM) 6
> ("Assume a single uniform block of text"). I've been going through the wiki
> but I haven't been able to make much more progress there.
>
> Thanks for any tips!
>
> Dustin
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/4bfaf2ed-a8a0-429b-8b8f-cc9db11ba5a8%40googlegroups.com
>

