Re: [tesseract-ocr] High Error rate even if good quality image and low noise

Alex Szeto Fri, 01 Apr 2016 02:10:54 -0700

Hi Art, my program already does what you suggest: it isolates each character 
and runs -psm 10. However, the result has not improved.
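For reference, the isolation step can be sketched roughly as follows. This is a minimal Python sketch, not the program mentioned above. It assumes the box file written by the makebox option uses the usual "<char> <left> <bottom> <right> <top> <page>" layout with the origin at the bottom-left of the image, and it converts each entry to a top-left-origin crop rectangle. The names CharBox, parse_box_file and pad_box, and the margin value, are hypothetical.

```python
from typing import List, NamedTuple


class CharBox(NamedTuple):
    """Crop rectangle for one character, origin at the top-left (hypothetical type)."""
    char: str
    left: int
    top: int
    right: int
    bottom: int


def parse_box_file(text: str, image_height: int) -> List[CharBox]:
    """Parse box-file lines into top-left-origin rectangles.

    Assumes each line is "<char> <left> <bottom> <right> <top> <page>",
    measured from the bottom-left corner of the image.
    """
    boxes = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        left, bottom, right, top = (int(p) for p in parts[1:5])
        # Flip the y axis so that (0, 0) is the top-left corner.
        boxes.append(
            CharBox(parts[0], left, image_height - top, right, image_height - bottom)
        )
    return boxes


def pad_box(box: CharBox, margin: int, width: int, height: int) -> CharBox:
    """Grow a box by `margin` pixels on every side, clamped to the image."""
    return CharBox(
        box.char,
        max(0, box.left - margin),
        max(0, box.top - margin),
        min(width, box.right + margin),
        min(height, box.bottom + margin),
    )
```

Each padded rectangle can then be cropped with an imaging library of choice and passed to tesseract with -psm 10.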
I have one question about this: when using -psm 10, what background color 
should be used? I suspect Tesseract sometimes cannot tell whether black or 
white is the background, and then gives a bad result.
Is there an option in Tesseract for setting the background color or text color? 
I have found some related parameters, but I don't know what values they 
should be given.
For example, the preset values make little sense to me: why is it '2' for 
editor_image_text_color, and so on? I would really appreciate your help. 
Thank you.
name                        value  description
editor_image_word_bb_color  7      Word bounding box colour
editor_image_blob_bb_color  4      Blob bounding box colour
editor_image_text_color     2      Correct text colour


ref  : http://www.sk-spell.sk.cx/tesseract-ocr-parameters-in-302-version
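On the background question, one way to sidestep the ambiguity is to normalise the polarity of each cropped character before running OCR. The Python sketch below is an illustration under assumptions: the crop has a margin so that its border pixels are background, the image is a list of rows of 0-255 grey values, and dark text on a light background is the target. The function names and the threshold of 128 are hypothetical and are not Tesseract settings.

```python
from typing import List

Gray = List[List[int]]  # rows of 0-255 grey values (assumed representation)


def border_mean(img: Gray) -> float:
    """Average brightness of the outermost ring of pixels."""
    h, w = len(img), len(img[0])
    values = [
        img[y][x]
        for y in range(h)
        for x in range(w)
        if y in (0, h - 1) or x in (0, w - 1)
    ]
    return sum(values) / len(values)


def ensure_dark_on_light(img: Gray, threshold: int = 128) -> Gray:
    """Return img unchanged if its border is light, otherwise its negative."""
    if border_mean(img) >= threshold:
        return img
    return [[255 - v for v in row] for row in img]
```

With this step every character image reaches the OCR stage with the same polarity, so the background no longer has to be guessed per image.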

  Alex

On Friday, April 1, 2016 at 2:35:26 AM UTC+8, Art Rhyno wrote:
>
> Hi,
>
>  
>
> Tesseract is detecting the blobs for each character correctly at least. 
> One trick is to leverage the coordinates of each character for extracting 
> individual images, invert the colours, and use single character mode (-psm 
> 10) to do the recognition. I think you have to dig into the API to get the 
> character coordinates or use the makebox option (e.g. tesseract license.png 
> license makebox). If you isolate each character, it usually recognizes it. 
> This is not recommended for a lot of text, but it may be worthwhile in 
> this case.
>
>  
>
> art
>
>  
>
> *From:* [email protected] [mailto:[email protected]] *On Behalf Of* Alex Szeto
> *Sent:* Wednesday, March 30, 2016 11:17 AM
> *To:* tesseract-ocr <[email protected]>
> *Subject:* [tesseract-ocr] High Error rate even if good quality image and 
> low noise
>
>  
>
> I am working on a license plate recognition project and am having trouble 
> improving the accuracy of the OCR.
>
> Attached is one of the images I used; the result is very poor.
>
>  
>
> version of tesseract : 3.0.3
>
> The command that I used : tesseract Untitled.jpg out -psm 9
>
> The result is: SXUSBBB, while I am expecting 5X0S888
>
> I have done some experiments and found that some character pairs are 
> easily confused by tesseract.
>
> For example: '0' becomes 'U'; '5' and 'S'; 'B' and '8'
>
>  
>
> Are there any methods or parameters I can set to improve the result? 
>
> Thanks a lot; I would really appreciate any advice. 
>
>  
>
> -- 
> You received this message because you are subscribed to the Google Groups 
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an 
> email to [email protected].
> To post to this group, send email to [email protected].
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit 
> https://groups.google.com/d/msgid/tesseract-ocr/abcbfacf-3491-4b85-87b1-a43e5e4de56f%40googlegroups.com.
> For more options, visit https://groups.google.com/d/optout.
>
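The confusion pairs reported in the quoted message ('0'/'U', '5'/'S', 'B'/'8') can also be handled after recognition when the plate layout is known. The Python sketch below is one possible post-processing step, not a Tesseract feature: the pattern string "DLDLDDD" (D = digit, L = letter) is a hypothetical layout inferred from the expected result 5X0S888, and fix_confusions and both lookup tables are illustrative names.

```python
# Confusion pairs taken from the thread; the direction depends on the slot type.
TO_DIGIT = {"U": "0", "S": "5", "B": "8"}
TO_LETTER = {digit: letter for letter, digit in TO_DIGIT.items()}


def fix_confusions(text: str, pattern: str) -> str:
    """Swap look-alike characters so each position matches its slot type.

    `pattern` uses "D" for a digit slot and "L" for a letter slot; any other
    pattern character leaves the OCR output at that position untouched.
    """
    if len(text) != len(pattern):
        return text  # layout does not match, so do not guess
    fixed = []
    for ch, kind in zip(text, pattern):
        if kind == "D":
            fixed.append(TO_DIGIT.get(ch, ch))
        elif kind == "L":
            fixed.append(TO_LETTER.get(ch, ch))
        else:
            fixed.append(ch)
    return "".join(fixed)


print(fix_confusions("SXUSBBB", "DLDLDDD"))  # prints 5X0S888
```

This only helps when the plate format is fixed in advance; for free-form plates the pattern would have to be dropped or replaced by some other constraint.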

