https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy

Gimp is your friend.

From: [email protected] [mailto:[email protected]] On 
Behalf Of Ravi Annaswamy
Sent: 05 October 2019 11:08
To: [email protected]
Subject: Re: [tesseract-ocr] Re: tesseract ignores single/short characters -> 
any ideas?

I didn’t try these images but my first guess: can you not provide dpi 72 as 
option and try?

On Oct 5, 2019, at 4:04 AM, test0r man <[email protected]> wrote:
--Push--

does anyone have an idea?

thanks for help!


On Sunday, 8 September 2019 12:23:28 UTC+2, test0r man wrote:
Hi,
I use this command:

tesseract input/image.jpg output/output --dpi 72 --oem 1 -l deu+eng

to scan images like "1_input.jpg" and "2_input.jpg". The OCR result is good, but 
it seems that tesseract ignores short/single characters.
In the first image it ignores the three "0" characters.
In the second image it only detects the "10.".

The tessinput files are attached too.
If I use the "--psm 6" option, none of the other words are detected correctly.
If I scale the images to 300 dpi, the result is the same.

Does anyone have an idea? Thanks for the help!
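
One way to narrow this down is to run the same image through several --dpi / --psm combinations and compare the outputs. The following is only a sketch of that idea, not something taken from the thread: the helper names (build_tesseract_cmd, variants), the output naming scheme, and the choice of page segmentation modes to try (6 for a single uniform block, 11 for sparse text) are assumptions made for illustration. It only builds and prints the command lines using the same flags as the command above; it does not run tesseract or make any claim about which combination will find the missing characters.

```python
import shlex
from itertools import product


def build_tesseract_cmd(image, out_base, langs=("deu", "eng"), oem=1, psm=None, dpi=None):
    """Build a tesseract CLI argument list (hypothetical helper).

    psm and dpi are omitted from the command when None, so tesseract
    falls back to its own defaults for them.
    """
    cmd = ["tesseract", image, out_base]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    cmd += ["--oem", str(oem)]
    if psm is not None:
        cmd += ["--psm", str(psm)]
    cmd += ["-l", "+".join(langs)]
    return cmd


def variants(image, out_dir="output", dpis=(None, 72, 300), psms=(None, 6, 11)):
    """Yield one command per dpi/psm combination, each with its own output name."""
    for dpi, psm in product(dpis, psms):
        tag = f"dpi{dpi or 'auto'}_psm{psm or 'default'}"
        yield build_tesseract_cmd(image, f"{out_dir}/{tag}", psm=psm, dpi=dpi)


if __name__ == "__main__":
    # Print the commands so they can be reviewed or pasted into a shell.
    for cmd in variants("input/image.jpg"):
        print(shlex.join(cmd))
```

Each printed line can be run by hand, or passed to subprocess.run, and the resulting text files compared to see whether the single characters appear under any of the settings.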






--
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/6bb8a731-afa3-4dbf-a805-90b9120b791b%40googlegroups.com.