I added more white space around the target text by scaling the canvas to 500 pixels wide, and then scaled up the whole image by a factor of 2.
-230 6 5O
90 6 50
90 6 -100
130 6 -100
130 6 -150

On Thu, Nov 2, 2023 at 8:35 AM La Monte H. P. Yarroll <[email protected]> wrote:

> I had a little success applying 2.5 pixels of blur and then thresholding
> at 217-255. FWIW, I used gimp for the preprocessing. Here's what I got after
> just a few minutes:
> a i @)
>
> -230 & 50
> 90 6 50
> 90 6 -100
>
> 130 6
> 130 6
>
> ~100
> -130
>
> I don't know what happened to the first column or why the last 2 lines got
> split the way they did.
>
>
> On Wed, Nov 1, 2023 at 4:30 PM Slartybartfast <[email protected]>
> wrote:
>
>> Doesn't anybody have any ideas? :-(
>>
>> On Tuesday, October 24, 2023 at 5:40:20 PM UTC+1 Slartybartfast wrote:
>>
>>> Hi
>>> I am a new tesseract user, and I'm really struggling to get it to
>>> produce any kind of sensible results, especially with numerical text. I
>>> have some text that looks like this:
>>> [image: example_input.jpg]
>>> I've read the documentation, and looked through the parameter list, and
>>> I added the following to the command line:
>>> --psm 6
>>> -c preserve_interword_spaces=1
>>> -c textord_dotmatrix_gap=6
>>> -c classify_bln_numeric_mode=1
>>> -c rej_alphas_in_number_perm=1
>>>
>>> But I just get garbage out:
>>>
>>> Oo -250 6 3a
>>> 190 & So
>>> 190 6 -100
>>> 1 $1290 6 ~140
>>> 1 $130 6 ~150
>>>
>>> I've tried all sorts of additional image processing to try and improve
>>> the look of the text, but none of it works. In fact, this is the best
>>> output I've seen. It's usually worse. I'm really hoping someone who has
>>> worked with dot-matrix input can offer some magic incantation to make
>>> tesseract come to its senses. Thanks.
>>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "tesseract-ocr" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/tesseract-ocr/15797f86-58c9-4e71-b316-54f663d04cbfn%40googlegroups.com
>> .
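
For anyone who wants to script the preprocessing described in this thread instead of doing it by hand in GIMP, here is a rough pure-Python sketch of the four steps: blur, threshold, pad the canvas with white, and scale up by 2. It is not what GIMP does internally. The function names are made up for illustration, the 2.5 pixel blur is assumed to be a Gaussian sigma, the order of the steps is a guess, and the image is assumed to be a grayscale grid (list of rows of 0-255 values). In practice an imaging library would be much faster.

```python
import math

def gaussian_kernel(sigma):
    # 1-D normalized Gaussian weights, truncated at about 3 sigma.
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur_rows(img, kernel):
    # Convolve every row with the kernel, clamping indices at the edges.
    radius = len(kernel) // 2
    out = []
    for row in img:
        width = len(row)
        new_row = []
        for x in range(width):
            acc = 0.0
            for k, w in enumerate(kernel):
                xi = min(max(x + k - radius, 0), width - 1)
                acc += w * row[xi]
            new_row.append(acc)
        out.append(new_row)
    return out

def transpose(img):
    return [list(col) for col in zip(*img)]

def gaussian_blur(img, sigma):
    # Separable blur: horizontal pass, then vertical pass via transpose.
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(img, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))

def threshold(img, low=217):
    # Values in [low, 255] become white, everything else black.
    return [[255 if v >= low else 0 for v in row] for row in img]

def pad_to_width(img, target_width, fill=255):
    # Center the image on a wider canvas filled with white.
    extra = max(target_width - len(img[0]), 0)
    left = extra // 2
    right = extra - left
    return [[fill] * left + list(row) + [fill] * right for row in img]

def upscale(img, factor=2):
    # Nearest-neighbour scaling: repeat each pixel and each row.
    out = []
    for row in img:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out

def preprocess(img, sigma=2.5, low=217, target_width=500, factor=2):
    # Assumed order: blur, threshold, pad, then scale.
    blurred = gaussian_blur(img, sigma)
    binary = threshold(blurred, low)
    return upscale(pad_to_width(binary, target_width), factor)
```

The blur is what merges the separate dot-matrix dots into solid strokes before thresholding, so the sigma and threshold values will probably need tuning for other scans.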

