Just grab the pixels in that "box", go through them to find the one furthest from your background colour, and you're done. (Pixels on an edge will be a blend of the background and font colour.) Probably the easiest image format to work with is "Plain PPM" since it consists entirely of ASCII characters.
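
A minimal sketch of that idea in Python, assuming the image has already been converted to Plain PPM (magic number "P3") and that you know the background colour. The function names parse_plain_ppm and furthest_from_background are made up for this illustration and are not part of tesseract. The box here is (left, top, right, bottom) with the origin at the top-left and the right/bottom edges exclusive, which is also an assumption of the sketch.

```python
def parse_plain_ppm(text):
    """Parse a Plain PPM (P3) string into (width, height, maxval, pixels).

    pixels is a flat, row-major list of (r, g, b) tuples.
    """
    tokens = []
    for line in text.splitlines():
        # Drop comments, which run from '#' to the end of the line.
        tokens.extend(line.split("#", 1)[0].split())
    if not tokens or tokens[0] != "P3":
        raise ValueError("not a Plain PPM (P3) image")
    width, height, maxval = int(tokens[1]), int(tokens[2]), int(tokens[3])
    values = [int(t) for t in tokens[4:4 + 3 * width * height]]
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    return width, height, maxval, pixels


def furthest_from_background(width, pixels, box, background):
    """Return the pixel inside box whose colour is furthest from background.

    Distance is plain squared Euclidean distance in RGB space.
    """
    left, top, right, bottom = box
    best, best_dist = None, -1
    for y in range(top, bottom):
        for x in range(left, right):
            p = pixels[y * width + x]
            dist = sum((a - b) ** 2 for a, b in zip(p, background))
            if dist > best_dist:
                best, best_dist = p, dist
    return best


if __name__ == "__main__":
    # Tiny 3x2 image: white background, one blended edge pixel, one pure red pixel.
    sample = (
        "P3\n# tiny example\n3 2\n255\n"
        "255 255 255  255 128 128  255 255 255\n"
        "255 255 255  255 0 0  255 255 255\n"
    )
    w, h, _, px = parse_plain_ppm(sample)
    print(furthest_from_background(w, px, (0, 0, w, h), (255, 255, 255)))
```

Note that tesseract box files, as far as I know, count y from the bottom edge of the image, so the box coordinates would need flipping before being passed to a top-left-origin routine like this one.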

lux wrote, On 2010-05-04 12:56:

No, it must be something given by tesseract, because there could be
more red than black (the font color in this example) and then it would
all get screwed up!
Anyway, I can already get the text from tesseract along with the box
positions... but the problem is that I also need the exact color of
the word tesseract picked up.

Tesseract surely stores the positions of the text when it processes the
image, but the point is... is there a way to get them?

On 3 May, 21:01, Sven Pedersen <[email protected]> wrote:
Using filters to cancel out colors other than the target color, it
should be possible to iteratively extract text of a certain color (say
red, green, blue, black, etc.) But that would be hard. Generally
people just want to get the text and fix the colors later.
--Sven
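
One possible reading of the colour-filter idea, as a rough sketch only: keep the pixels close to a target colour as black, turn everything else white, and run OCR on the result once per target colour. The function name isolate_colour and the tolerance value are assumptions made for this illustration; nothing here is a tesseract feature.

```python
def isolate_colour(pixels, target, tolerance=60):
    """Return a black-on-white copy of pixels that keeps only the target colour.

    pixels is a flat list of (r, g, b) tuples. A pixel counts as "target"
    when its Euclidean RGB distance to target is at most tolerance.
    """
    limit = tolerance ** 2
    out = []
    for p in pixels:
        dist = sum((a - b) ** 2 for a, b in zip(p, target))
        # Matching pixels become black "ink"; everything else becomes white.
        out.append((0, 0, 0) if dist <= limit else (255, 255, 255))
    return out


if __name__ == "__main__":
    row = [(255, 0, 0), (0, 0, 0), (250, 10, 5), (255, 255, 255)]
    # Keep only reddish pixels; black text and white background are dropped.
    print(isolate_colour(row, target=(255, 0, 0), tolerance=30))
```

Any text that tesseract then finds in the filtered image would have roughly the target colour, at the cost of one OCR pass per colour.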





On Sun, May 2, 2010 at 1:41 PM, Sandro Zahra <[email protected]> wrote:
I think that OCR is not about colours.....
On 2 May 2010 17:35, lux <[email protected]> wrote:
I need the RIGHT position of the text or the RIGHT color, not an
average color :/.
On 11 Apr, 20:48, MARTIN Pierre <[email protected]> wrote:
So how can I get the position of text?
I've tried with makebox but it's not really right: it gives me the
coordinates of the whole "letter box", so it's impossible for me to get
the right pixel of the letter
(e.g. it would work for an 'I', but for an 'A' it gives me the top-left
and bottom-right corners of the box, so I don't know how to get the
letter color, because the 'A' is neither at the start nor at the end of
the box).
That's the right method. If you want to know where the "pixels" are, do
a histogram equalization of your picture, then apply a fairly
aggressive threshold (if it's not already 1bpp); this will give you a copy
of your picture with only black and white pixels. It's on this
picture (basically a 1bpp-depth picture) that you run tesseract.
Then, given the boxes, you look in your black & white picture for where the
black pixels are in the boxes, and with the same coordinates you can find them
in your original picture. After that, do a color average over those pixels in
a box in your original picture and you're good.
Pierre.
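
A rough sketch of the masking and averaging steps described above, assuming dark text on a light background. It skips the histogram equalization and uses a fixed luminance threshold instead, so it is a simplification of the method rather than a faithful copy. The names luminance and average_text_colour, and the default threshold of 128, are made up for this illustration. The box is (left, top, right, bottom) with a top-left origin and exclusive right/bottom edges.

```python
def luminance(pixel):
    """Approximate perceived brightness of an (r, g, b) pixel, 0..255."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def average_text_colour(width, pixels, box, threshold=128):
    """Average the original colour of the "ink" pixels inside box.

    A pixel counts as ink when its luminance is below threshold, i.e. it
    would come out black in the thresholded 1bpp copy. Returns None when
    the box contains no ink pixels.
    """
    left, top, right, bottom = box
    ink = []
    for y in range(top, bottom):
        for x in range(left, right):
            p = pixels[y * width + x]
            if luminance(p) < threshold:
                ink.append(p)
    if not ink:
        return None
    # Average each channel over the ink pixels only, not the whole box.
    return tuple(round(sum(p[c] for p in ink) / len(ink)) for c in range(3))


if __name__ == "__main__":
    # 2x2 image: two white background pixels, two dark red text pixels.
    px = [(255, 255, 255), (200, 0, 0), (100, 0, 0), (255, 255, 255)]
    print(average_text_colour(2, px, (0, 0, 2, 2)))
```

Because the average only covers pixels that survive the threshold, background pixels inside the box do not dilute the result, although blended edge pixels that fall below the threshold still will.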

