Digits are included in the language model together with letters, and the model is mostly trained to recognize phrases, not separate digits. Mistakes on digits are unavoidable.
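For anyone who wants to reproduce this from the command line, here is a minimal sketch of the settings quoted below, expressed as a tesseract CLI call. It assumes the `tesseract` binary is on PATH; the function name `tesseract_digit_command` and the file name `t3.png` are made up for illustration. It only assembles the arguments, and it does not remove the digit errors described above.

```python
import shlex


def tesseract_digit_command(image_path, whitelist="0123456789."):
    """Build a tesseract CLI call mirroring the settings quoted in this thread.

    The helper itself is hypothetical; the flags (--oem, --psm, -c VAR=VALUE)
    are standard tesseract command-line options.
    """
    return [
        "tesseract", image_path, "stdout",   # "stdout" prints the text instead of writing a file
        "--oem", "3",                        # default engine mode, as in the quoted config
        "--psm", "11",                       # sparse text, as in the quoted config
        "-c", f"tessedit_char_whitelist={whitelist}",
        "-c", "load_system_dawg=false",      # do not load the main dictionary
        "-c", "load_freq_dawg=false",        # do not load the frequent-words dictionary
    ]


cmd = tesseract_digit_command("t3.png")  # "t3.png" is an assumed file name
print(shlex.join(cmd))
```

To actually run it, the list can be passed to `subprocess.run(cmd, capture_output=True, text=True)` and the text read from `.stdout`.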
Saturday, 30 January 2021 at 19:12:39 UTC+3, Benek wrote:
> I still need to read the dot in the correct place, which makes it a bit
> harder. So you don't think it's a problem with the input data?
>
> On Saturday, 30 January 2021 at 17:03:13 UTC+1 v.kala...@gmail.com wrote:
>
>> Heh. It's an old issue.
>> For 100% accuracy, you must use a digit-only language model. But there is
>> no such thing.
>> Besides, a trivial perceptron shows good results on digit recognition.
>>
>> Saturday, 30 January 2021 at 18:41:13 UTC+3, Benek wrote:
>>
>>> Hello! I'm trying to read some digits and I thought it was a rather
>>> simple task, yet I still can't get satisfying results. So my first
>>> question is: is it possible to get 100% accuracy when reading some
>>> standardized input? Or will there always be some errors when reading?
>>>
>>> Here are some sample inputs that I wanted to read:
>>>
>>> The digits that are being misread are:
>>> on the photo t3: 5.1 is read as 9.1
>>> on the photo t4: 10.2 is read as 610.2
>>>
>>> I'm using:
>>>
>>> tesseract/4.1.1
>>>
>>> config:
>>>
>>> oem: 3,
>>> psm: 11,
>>> tessedit_char_whitelist: "0123456789.",
>>> load_system_dawg: false,
>>> load_freq_dawg: false,
>>>
>>> The images have 2700x2100 resolution.
>>>
>>> The 999 on the left are markers that I added to be able to recognize
>>> which line belongs to which output text, and they are always read correctly.
>>> I tried experimenting with some different image preprocessing techniques
>>> like blur, median, changing the size of the image, etc.
>>>
>>> Do you have any other tips that could lead to better reading accuracy?
>>> Thanks in advance for any help!