The problem is that the background color is not always the same, and
often the background is an image or something similar.
As for hOCR, doesn't it give only the position, like tesseract's box?
xstart,ystart,xend,yend

On 5 May, 05:05, Eugene Reimer <[email protected]> wrote:
> Just grab the pixels in that "box", go through them to find the one
> furthest from your background colour, and you're done.  (Pixels on an
> edge will be a blend of the background and font colour.)  Probably the
> easiest image format to work with is "Plain PPM" since it consists
> entirely of ASCII characters.
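
For reference, a minimal sketch of the approach described above, assuming a single known background color (which, as noted at the top of this message, does not always hold). The function names and the box convention (top-left origin, end-exclusive xstart,ystart,xend,yend) are assumptions made for illustration; they are not part of tesseract.

```python
def parse_plain_ppm(text):
    """Parse a Plain PPM (magic number P3) string into (width, height, rows)."""
    # Drop '#' comments, then split everything into whitespace-separated tokens.
    tokens = []
    for line in text.splitlines():
        tokens.extend(line.split('#', 1)[0].split())
    if not tokens or tokens[0] != 'P3':
        raise ValueError('not a Plain PPM file')
    width, height = int(tokens[1]), int(tokens[2])
    # tokens[3] is the maximum sample value; it is not needed for this sketch.
    values = [int(t) for t in tokens[4:4 + 3 * width * height]]
    pixels = [tuple(values[i:i + 3]) for i in range(0, len(values), 3)]
    rows = [pixels[y * width:(y + 1) * width] for y in range(height)]
    return width, height, rows


def furthest_from_background(rows, box, background):
    """Return the color inside box that is furthest from the background color.

    box is (xstart, ystart, xend, yend); treating it as top-left origin and
    end-exclusive is an assumption of this sketch.
    """
    x0, y0, x1, y1 = box

    def squared_distance(color):
        return sum((a - b) ** 2 for a, b in zip(color, background))

    candidates = (rows[y][x] for y in range(y0, y1) for x in range(x0, x1))
    return max(candidates, key=squared_distance)


# Hypothetical 3x2 sample: white background, one blended edge pixel, one ink pixel.
SAMPLE = """P3
# tiny sample
3 2
255
255 255 255  255 128 128  255 255 255
255 255 255  200 0 0  255 255 255
"""

width, height, rows = parse_plain_ppm(SAMPLE)
font_color = furthest_from_background(rows, (0, 0, width, height), (255, 255, 255))
```

The blended edge pixel is closer to the background than the solid one, so the solid pixel is the one picked.
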
>
> lux wrote, On 2010-05-04 12:56:
>
> >No, it must be something given by tesseract because there could be
> >more red than black (font color in this example) and so it would all
> >screw up!
> >Anyway, I can just get the text from tesseract first, with the box
> >positions... but the problem is that I also need the exact color of
> >the word tesseract picked up.
>
> >Tesseract surely stores the positions of the text when it processes the
> >image, but the point is... is there a way to get these?
>
> >On 3 May, 21:01, Sven Pedersen <[email protected]> wrote:
>
> >>Using filters to cancel out colors other than the target color, it
> >>should be possible to iteratively extract text of a certain color (say
> >>red, green, blue, black, etc.) But that would be hard. Generally
> >>people just want to get the text and fix the colors later.
> >>--Sven
>
> >>On Sun, May 2, 2010 at 1:41 PM, Sandro Zahra <[email protected]> wrote:
>
> >>>I think that OCR is not about colours.....
>
> >>>On 2 May 2010 17:35, lux <[email protected]> wrote:
>
> >>>>I need the RIGHT position of the text or the RIGHT color, not an
> >>>>average color :/.
>
> >>>>On 11 Apr, 20:48, MARTIN Pierre <[email protected]> wrote:
>
> >>>>>>So how can I get the position of text?
> >>>>>>I've tried with makebox but it's not really right, it gives me the
> >>>>>>coordinates of the whole "letter box" so it's impossible for me to get
> >>>>>>the right pixel of the letter
> >>>>>>(e.g. it would work for an 'I' but for an 'A' it gives me the box's
> >>>>>>top-left and bottom-right positions, so I don't know how to get the
> >>>>>>letter color because the 'A' is not at the start nor at the end of
> >>>>>>the box).
>
> >>>>>That's the right method. If you want to know where the "pixels" are, do
> >>>>>a histogram equalization of your picture, then contrast it with a fairly
> >>>>>aggressive threshold (if it's not already in 1bpp). This will give you a
> >>>>>copy of your picture with only black and white pixels. It is on this
> >>>>>picture (basically a 1bpp depth picture) that you run tesseract.
> >>>>>Then, given the boxes, you look in your black & white picture where the
> >>>>>black pixels are in the boxes, and with the same coordinates you can find
> >>>>>them in your original picture. After that, do a color average of all those
> >>>>>pixels in a box in your original picture and you're good.
>
> >>>>>Pierre.
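
A minimal sketch of the masking steps described above, under stated assumptions: it skips the histogram equalization, uses a fixed luminance threshold, and assumes dark text on a lighter background. The function names, the threshold of 128, and the box convention (top-left origin, end-exclusive) are chosen for illustration only and are not anything tesseract provides.

```python
def luminance(rgb):
    """Approximate perceived brightness of an (r, g, b) pixel, 0..255."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(rows, threshold=128):
    """Return a mask that is True where a pixel is dark (assumed to be ink)."""
    return [[luminance(p) < threshold for p in row] for row in rows]


def average_ink_color(rows, mask, box):
    """Average the ORIGINAL colors of the pixels the mask marks as ink.

    box is (xstart, ystart, xend, yend). Returns None if the box has no ink.
    """
    x0, y0, x1, y1 = box
    ink = [rows[y][x]
           for y in range(y0, y1)
           for x in range(x0, x1)
           if mask[y][x]]
    if not ink:
        return None
    count = len(ink)
    return tuple(round(sum(p[i] for p in ink) / count) for i in range(3))


# Hypothetical 3x2 image: white background, one light edge pixel, two ink pixels.
image = [
    [(255, 255, 255), (200, 0, 0), (255, 255, 255)],
    [(255, 128, 128), (100, 0, 0), (255, 255, 255)],
]
mask = binarize(image)
color = average_ink_color(image, mask, (0, 0, 3, 2))
```

Only the pixels flagged in the mask contribute, so the background inside the letter box does not dilute the result. Replacing the average with the most frequent ink color would be one way to avoid blending in edge pixels.
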
>
> --
> You received this message because you are subscribed to the Google Groups 
> "tesseract-ocr" group.
> To post to this group, send email to [email protected].
> To unsubscribe from this group, send email to 
> [email protected].
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en.
