I had a little success applying a 2.5-pixel blur and then thresholding at
217-255. FWIW, I used gimp for the preprocessing. Here's what I got after
just a few minutes:
a i @)

-230 & 50
90 6 50
90 6 -100

130 6
130 6

~100
-130

I don't know what happened to the first column or why the last 2 lines got
split the way they did.
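
In case anyone wants to script the same idea rather than click through gimp,
here is a rough pure-Python sketch of "blur, then threshold". It is not what
gimp does internally. I am assuming the 2.5 pixels can be treated as a
Gaussian sigma (gimp's radius may map differently), that the image is
already 8-bit grayscale held as a list of rows, and that 217-255 means
"pixels in that range become white, everything else black". The function
names are made up for illustration.

```python
import math


def gaussian_kernel(sigma):
    """Return a normalized 1-D Gaussian kernel covering about +/- 3 sigma."""
    radius = max(1, int(math.ceil(3 * sigma)))
    weights = [math.exp(-(i * i) / (2 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(img, kernel):
    """Convolve each row with the kernel, clamping coordinates at the edges."""
    radius = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        new_row = []
        for x in range(n):
            acc = 0.0
            for k, w in enumerate(kernel):
                xx = min(max(x + k - radius, 0), n - 1)
                acc += w * row[xx]
            new_row.append(acc)
        out.append(new_row)
    return out


def transpose(img):
    """Swap rows and columns so the row blur can be reused for columns."""
    return [list(col) for col in zip(*img)]


def blur_and_threshold(img, sigma=2.5, low=217, high=255):
    """Separable Gaussian blur, then map [low, high] to white and the rest to black.

    sigma=2.5 and the 217-255 range are taken from the numbers in this
    thread; treating 2.5 as a sigma is an assumption.
    """
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(img, kernel)
    blurred = transpose(blur_rows(transpose(horizontal), kernel))
    return [[255 if low <= round(v) <= high else 0 for v in row]
            for row in blurred]
```

The point of the blur is to smear neighbouring dots into each other so that
the threshold produces solid strokes. Dots that sit close together stay
black, while an isolated speck is washed out to white. The result would
still need to be written to an image file before handing it to tesseract.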


On Wed, Nov 1, 2023 at 4:30 PM Slartybartfast <[email protected]>
wrote:

> Doesn't anybody have any ideas?  :-(
>
> On Tuesday, October 24, 2023 at 5:40:20 PM UTC+1 Slartybartfast wrote:
>
>> Hi
>> I am a new tesseract user, and I'm really struggling to get it to produce
>> any kind of sensible results, especially with numerical text. I have some
>> text that looks like this:
>> [image: example_input.jpg]
>> I've read the documentation, and looked through the parameter list, and I
>> added the following to the command line:
>> --psm 6
>> -c preserve_interword_spaces=1
>> -c textord_dotmatrix_gap=6
>> -c classify_bln_numeric_mode=1
>> -c rej_alphas_in_number_perm=1
>>
>> But I just get garbage out:
>>
>> Oo -250 6 3a
>> 190 & So
>> 190 6 -100
>> 1 $1290 6 ~140
>> 1 $130 6 ~150
>>
>> I've tried all sorts of additional image processing to try and improve
>> the look of the text, but none of it works. In fact, this is the best
>> output I've seen. It's usually worse. I'm really hoping someone who has
>> worked with dot-matrix input can offer some magic incantation to make
>> tesseract come to its senses. Thanks.
>>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/15797f86-58c9-4e71-b316-54f663d04cbfn%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/15797f86-58c9-4e71-b316-54f663d04cbfn%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
>
