> 
> I believe that tesseract operates on black and white
> images.  All grayscale and colour images are converted
> internally to black and white if necessary.  In your
> case, you could probably do the conversion yourself,
> turning every pixel that is not black to white, since
> all of the text is black.
> 
> Many people have converted numeric text, and there
> are many posts in the archive about that.  I think
> some used a whitelist of numeric characters, and
> others created dictionaries containing valid combinations
> of numbers to search against.  Tesseract does not
> just try to recognize each character, it also tries
> to recognize each "word" against dictionaries, so
> it helps to let tesseract know that "8008" is a
> better answer than "BOOB".
> 
> Cheers,
> Rob Komar
> 
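The conversion suggested in the quoted message (every pixel that is not black becomes white) could be sketched roughly as follows. This is only an illustration under assumptions: the image is taken to be already loaded as rows of 8-bit grayscale values, and the function name `binarize_keep_black` and the `black_max` tolerance are hypothetical, not part of Tesseract.

```python
def binarize_keep_black(pixels, black_max=0):
    """Turn every pixel that is not (near) black into white.

    pixels: list of rows of 8-bit grayscale values (0 = black, 255 = white).
    black_max: assumed tolerance; values <= black_max still count as black.
    """
    return [[0 if p <= black_max else 255 for p in row] for row in pixels]


if __name__ == "__main__":
    sample = [[0, 10, 255],
              [128, 0, 3]]
    print(binarize_keep_black(sample))
    print(binarize_keep_black(sample, black_max=16))
```

A small tolerance such as `black_max=16` may help if the scan has compression noise around the black text, but the right value depends on the images.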

OK, cool, very good to know. So what we will try then is to make a target list of 
rooms that we want to find and feed this list as a 'numeric dictionary' into 
Tesseract.
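
As a rough sketch of the target-list idea, the snippet below checks raw OCR tokens against a list of expected room numbers after the OCR step. Note that this is post-processing in plain Python, not Tesseract's own dictionary mechanism; the room numbers, the look-alike table, and the name `match_room` are all made up for illustration.

```python
import difflib

# Hypothetical target list of room numbers we expect to find.
TARGET_ROOMS = ["8008", "8010", "8012", "1204"]

# Assumed, partial table of letters that OCR tends to confuse with digits.
LOOKALIKES = str.maketrans(
    {"B": "8", "O": "0", "o": "0", "I": "1", "l": "1", "S": "5", "Z": "2"}
)


def match_room(raw, targets=TARGET_ROOMS, cutoff=0.75):
    """Return the target room closest to a raw OCR token, or None."""
    # Replace digit look-alikes first, so "BOOB" becomes "8008".
    candidate = raw.strip().translate(LOOKALIKES)
    if candidate in targets:
        return candidate
    # Otherwise fall back to the nearest target above the similarity cutoff.
    close = difflib.get_close_matches(candidate, targets, n=1, cutoff=cutoff)
    return close[0] if close else None


if __name__ == "__main__":
    for token in ["BOOB", "80l2", "xyzq"]:
        print(token, "->", match_room(token))
```

The cutoff is a guess; with room numbers that differ by a single digit, a fuzzy match can pick the wrong room, so exact matches after look-alike substitution are the safer signal.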

We will keep you updated on the results, sometime next week.

Thanks again,

Rutger

