On Wednesday, January 27, 2021 at 5:28:27 AM UTC-5 Merlijn Wajer wrote:
>
> The Internet Archive has switched to using Tesseract for all our OCR,
That's great to hear! It's certainly been a long time coming. Nick White &
I tried to get this to happen 7 years ago and even volunteered to help,
Digits are included in the same language model as letters, and the model is
trained mostly for phrase recognition, not for isolated digits. Some mistakes
on digits are unavoidable.
On Saturday, January 30, 2021 at 19:12:39 UTC+3 Benek wrote:
> I still need to read the dot in the correct place which makes it a bit
> harder. So you
Heh. It's an old issue.
For 100% accuracy, you would need a digit-only language model, but there is
no such thing.
Besides, even a trivial perceptron shows good results on digit recognition.
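
To make the perceptron remark concrete, here is a minimal sketch of a multiclass perceptron in plain Python. It is not taken from the thread and has nothing to do with Tesseract's internals: the 3x5 glyphs in GLYPHS are hypothetical toy data (one clean bitmap per digit), and the names to_features, predict and train are made up for illustration. Real scanned digits would need proper segmentation and far more training samples.

```python
# Hypothetical toy data: one 3x5 bitmap per digit, rows separated by spaces.
GLYPHS = {
    0: "111 101 101 101 111",
    1: "010 110 010 010 111",
    2: "111 001 111 100 111",
    3: "111 001 111 001 111",
    4: "101 101 111 001 001",
    5: "111 100 111 001 111",
    6: "111 100 111 101 111",
    7: "111 001 001 001 001",
    8: "111 101 111 101 111",
    9: "111 101 111 001 111",
}


def to_features(glyph):
    """Flatten a glyph string into 0/1 pixels plus a constant bias input."""
    return [int(ch) for ch in glyph if ch in "01"] + [1]


def predict(weights, x):
    """Return the class whose weight row gives the highest score for x."""
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


def train(samples, n_classes=10, max_epochs=50_000):
    """Multiclass perceptron: on a mistake, reward the true class row and
    penalize the wrongly predicted one. max_epochs is a generous cap; the
    loop stops as soon as one full pass makes no mistakes."""
    n_features = len(samples[0][0])
    weights = [[0] * n_features for _ in range(n_classes)]
    for _ in range(max_epochs):
        mistakes = 0
        for x, label in samples:
            guess = predict(weights, x)
            if guess != label:
                mistakes += 1
                for i, xi in enumerate(x):
                    weights[label][i] += xi
                    weights[guess][i] -= xi
        if mistakes == 0:
            break
    return weights


SAMPLES = [(to_features(glyph), digit) for digit, glyph in GLYPHS.items()]
WEIGHTS = train(SAMPLES)
```

Calling predict(WEIGHTS, to_features(GLYPHS[7])) classifies one of the toy glyphs. The sketch only shows that clean, distinct digit shapes are separable by a single linear layer; it says nothing about accuracy on noisy scans.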
On Saturday, January 30, 2021 at 18:41:13 UTC+3 Benek wrote:
> Hello! I'm trying to read some digits and I thought it was