Hi,

Tesseract is at least detecting the blob for each character correctly. One 
trick is to use the coordinates of each character to extract individual 
images, invert the colours, and run the recognition in single-character mode 
(-psm 10). To get the character coordinates you either have to dig into the 
API or use the makebox option (e.g. tesseract license.png license makebox). 
Once each character is isolated, Tesseract usually recognizes it. This is not 
something I would recommend for a lot of text, but it may be worthwhile in 
this case.
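
A rough Python sketch of the steps above, in case it helps. It assumes the 
makebox run has already produced license.box next to license.png, that Pillow 
is installed, and that the tesseract binary is on the PATH. The function 
names, the margin value and the white padding are my own choices for 
illustration, not anything Tesseract prescribes.

```python
import subprocess
import tempfile
from pathlib import Path


def parse_box_line(line, image_height):
    """Turn one box-file line into (char, (left, upper, right, lower)).

    Box files put the origin at the bottom-left corner of the image, while
    Pillow's crop() expects a top-left origin, so the y values are flipped.
    """
    char, left, bottom, right, top = line.split()[:5]
    left, bottom, right, top = int(left), int(bottom), int(right), int(top)
    return char, (left, image_height - top, right, image_height - bottom)


def recognize_characters(image_path, box_path, invert=True, margin=10):
    """Crop each boxed character, optionally invert it, and OCR it alone."""
    # Pillow is imported here so the parsing helper works without it.
    from PIL import Image, ImageOps

    image = Image.open(image_path).convert("L")  # greyscale, needed by invert()
    results = []
    with tempfile.TemporaryDirectory() as tmp:
        lines = Path(box_path).read_text(encoding="utf-8").splitlines()
        for index, line in enumerate(lines):
            if not line.strip():
                continue
            _, crop_box = parse_box_line(line, image.height)
            glyph = image.crop(crop_box)
            if invert:
                glyph = ImageOps.invert(glyph)
            # Assumption: after inverting, the background is light, so pad
            # with white to give the character some breathing room.
            glyph = ImageOps.expand(glyph, border=margin, fill=255)

            glyph_path = Path(tmp) / f"char{index}.png"
            out_base = Path(tmp) / f"char{index}"
            glyph.save(glyph_path)
            # "-psm 10" is the 3.x spelling used in this thread; newer
            # Tesseract releases spell it "--psm 10".
            subprocess.run(
                ["tesseract", str(glyph_path), str(out_base), "-psm", "10"],
                check=True,
            )
            results.append(out_base.with_suffix(".txt").read_text().strip())
    return "".join(results)
```

Calling recognize_characters("license.png", "license.box") should then return 
the per-character results joined into one string. Whether inverting and 
padding actually help will depend on the image, so treat both as knobs to 
experiment with.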

art

From: [email protected] [mailto:[email protected]] On 
Behalf Of Alex Szeto
Sent: Wednesday, March 30, 2016 11:17 AM
To: tesseract-ocr <[email protected]>
Subject: [tesseract-ocr] High Error rate even if good quality image and low 
noise

I am working on a license plate recognition project and am having trouble 
improving the OCR accuracy.
Attached is one of the images I used; the result is very poor.

Tesseract version: 3.0.3
The command I used: tesseract Untitled.jpg out -psm 9
The result is SXUSBBB, while I am expecting 5X0S888.
I have run some experiments and found that some character pairs are easily 
confused by Tesseract,
for example: '0' becomes 'U'; '5' and 'S'; 'B' and '8'.

Are there any methods or parameters I can set to improve the result?
Thanks a lot, I would really appreciate any advice.

--
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to 
[email protected]<mailto:[email protected]>.
To post to this group, send email to 
[email protected]<mailto:[email protected]>.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/abcbfacf-3491-4b85-87b1-a43e5e4de56f%40googlegroups.com<https://groups.google.com/d/msgid/tesseract-ocr/abcbfacf-3491-4b85-87b1-a43e5e4de56f%40googlegroups.com?utm_medium=email&utm_source=footer>.
For more options, visit https://groups.google.com/d/optout.

