On Wednesday, March 30, 2016 at 11:34:14 AM UTC-4, Alex Szeto wrote:
>
> I am working on a license plate recognition project and am having trouble 
> improving the accuracy of the OCR.
> Attached is one of the images I used; the result is very poor.
>
> Tesseract version: 3.0.3
> Command used: tesseract Untitled.jpg out -psm 9
> Result: SXUSBBB (expected: 5X0S888)
> I ran some experiments and found that Tesseract easily confuses certain 
> character pairs.
> For example: '0' becomes 'U'; '5' and 'S'; 'B' and '8'
>
> Are there any methods or parameters I can set to improve the result?
>

Looking at the image and the result, it's pretty easy to see where the 
confusion comes from: the recognizer is tuned to deal with a wide variety 
of fonts, and you're asking it to recognize arbitrary strings of symbols 
rather than actual words, so it has no word context to help separate 
look-alike characters.

Have you considered building something on OpenCV or a similar tool, where 
you could take advantage of a) the very small number of symbols and their 
specific shapes, and b) knowledge of the specific ordering of numbers and 
letters, plus any other domain knowledge that's available?
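
As a rough illustration of point b), here is a minimal Python sketch that 
post-corrects an OCR string using a known plate layout. The layout string 
"DLDLDDD" (D = digit, L = letter) and the helper name correct_plate are 
assumptions made up for this example, not a real plate specification; the 
only confusable pairs it handles are the ones reported above.

```python
# Hypothetical plate layout: D = digit position, L = letter position.
# This pattern is an assumption for illustration; substitute the real one.
PLATE_PATTERN = "DLDLDDD"

# Confusable pairs reported in the thread: 0/U, 5/S, 8/B.
TO_DIGIT = {"U": "0", "S": "5", "B": "8"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}


def correct_plate(text, pattern=PLATE_PATTERN):
    """Swap look-alike characters so each position matches the layout."""
    if len(text) != len(pattern):
        # Length mismatch: the layout cannot be applied, so leave it alone.
        return text
    corrected = []
    for ch, kind in zip(text, pattern):
        if kind == "D" and not ch.isdigit():
            ch = TO_DIGIT.get(ch, ch)   # letter found where a digit belongs
        elif kind == "L" and not ch.isalpha():
            ch = TO_LETTER.get(ch, ch)  # digit found where a letter belongs
        corrected.append(ch)
    return "".join(corrected)


print(correct_plate("SXUSBBB"))  # prints 5X0S888 under the assumed layout
```

This only works if the layout really is fixed and known; characters that 
have no entry in the table are left as they are, so a remaining mismatch 
can still be used to flag a plate for review.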

Tom 
