OK, so I thought more about this. What I will end up with is segments of
possibly various colors. Instead of handing tesseract a binary image, I
would like to hand it a set of (closed) contours. Is that possible?

On 16 Nov., 08:09, Dmitri Silaev <[email protected]> wrote:
> Indeed, ultimately Tesseract operates on contours, which are always
> extracted from blobs. Blobs are structures in turn extracted from
> binary images. That's why these contours are always closed. It is
> possible, however, to tinker inside the guts and make Tesseract match
> your contours as partial prototypes. (Refer to the papers at
> http://code.google.com/p/tesseract-ocr/wiki/Documentation.) You're
> going to have a hard time doing this, as the class hierarchy is really
> convoluted and a bit awkward. You're also going to do a lot of R&D because
> (as far as I know) nothing has been done previously to check how it'll work
> for such a task, and you should investigate accuracy for both the contour
> extraction and the Tesseract parts.
>
> And I don't know if this info can really help ))
>
> Warm regards,
> Dmitri Silaev
> www.CustomOCR.com
>
> On Tue, Nov 15, 2011 at 11:03 PM, daniel <[email protected]> 
> wrote:
> > Hi,
>
> > I was just referring to your previous post, where you said I should
> > just convert a list of blobs into a binary image. I don't think that
> > will always work. Since I don't know in advance which segments are
> > writing, I would have to generate a binary image from an arbitrary
> > segmentation. That in general is the map coloring problem, for which
> > you need up to four colors, provided you find a good coloring algorithm.
> > Basically, I was wondering whether it is possible to give tesseract a
> > list of contours that may not even have to be closed, i.e. edges.
> > Well, I will just play around with giving it edge images. But I was
> > just hoping I could go one level deeper and give tesseract a list of
> > contours directly, as I assume that is what it operates on in the end.
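
The two-color concern in the paragraph above can be checked mechanically. The following is only a sketch, assuming the segmentation has already been turned into an adjacency map (segment id -> ids of touching segments); the function name `two_color` and that input format are made up for illustration and are not part of Tesseract or OpenCV. It tries to give every segment one of two colors so that touching segments differ, and reports failure when that is impossible.

```python
from collections import deque


def two_color(adjacency):
    """Try to assign color 0 or 1 to each segment so that touching
    segments always differ.

    `adjacency` is an assumed input format: a dict mapping every segment
    id to an iterable of the segment ids it touches (symmetric).
    Returns a dict segment -> color, or None if two colors cannot work.
    """
    color = {}
    for start in adjacency:
        if start in color:
            continue  # already reached from an earlier segment
        color[start] = 0
        queue = deque([start])
        while queue:
            seg = queue.popleft()
            for neighbor in adjacency[seg]:
                if neighbor not in color:
                    # Give the neighbor the opposite color and keep going
                    color[neighbor] = 1 - color[seg]
                    queue.append(neighbor)
                elif color[neighbor] == color[seg]:
                    # Two touching segments would merge in a binary image
                    return None
    return color


if __name__ == "__main__":
    # Three segments in a row: black / white / black works.
    print(two_color({"a": ["b"], "b": ["a", "c"], "c": ["b"]}))
    # Three mutually touching segments: no binary image keeps them apart.
    print(two_color({"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b"]}))
```

When the function returns None, a single binary image cannot keep all segments apart, which is the situation described above.
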
>
> > I don't have good example images ready right now, but as soon as I do
> > I will post them here.
>
> > Daniel
>
> > On 15 Nov., 14:44, Dmitri Silaev <[email protected]> wrote:
> >> I don't know either. Sample images are still wanted. In the worst case
> >> you may end up needing to develop your own code, not just a sequence of
> >> ready-made library calls.
> >> --
> >> Dmitri
>
> >> On Tue, Nov 15, 2011 at 1:13 PM, daniel <[email protected]> 
> >> wrote:
> >> > Hey,
>
> >> > I don't know. How about situations where more than two colors are
> >> > involved? I would have to map the discovered segments to two colors,
> >> > which may even be impossible. And with contours even more so, as the
> >> > contours may not be closed...
>
> >> > On 12 Nov., 18:26, Dmitri Silaev <[email protected]> wrote:
> >> >> If you're able to use OpenCV then, given a list of contours or blobs,
> >> >> you should be able to reconstruct a binary image. This is a general
> >> >> thought. To get more practical advice, send us your sample image(s).
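
As a rough illustration of the suggestion above, closed contours can be filled back into a binary image. With OpenCV the usual route is `cv2.drawContours` with `thickness=cv2.FILLED` on a blank mask; the sketch below does the same thing in plain Python so it runs without any library. The function names, the point-list contour format, and the polarity (0 = text, 255 = background) are assumptions made for this example. Pixels are filled by even-odd parity over all contours, so an inner contour (the hole of an "o") is left as background; overlapping unrelated contours would cancel out, which this sketch does not handle.

```python
def point_in_contour(x, y, contour):
    """Even-odd test: cast a ray to the right of (x, y) and count how many
    contour edges it crosses. `contour` is a list of (x, y) vertices and is
    treated as closed (last vertex connects back to the first)."""
    inside = False
    n = len(contour)
    for i in range(n):
        x0, y0 = contour[i]
        x1, y1 = contour[(i + 1) % n]  # wrap around to close the contour
        # Only edges that straddle the horizontal line through y can cross
        if (y0 > y) != (y1 > y):
            # x coordinate at which the edge meets that horizontal line
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def contours_to_binary(contours, width, height):
    """Return a list of `height` rows with `width` values each:
    0 where a pixel lies inside the contours, 255 elsewhere (assumed
    dark-text-on-light-background polarity)."""
    image = [[255] * width for _ in range(height)]
    for row in range(height):
        for col in range(width):
            # Sample at the pixel center to stay off the outline itself
            px, py = col + 0.5, row + 0.5
            hits = sum(point_in_contour(px, py, c) for c in contours)
            if hits % 2 == 1:  # odd parity: inside a shape, not in a hole
                image[row][col] = 0
    return image


if __name__ == "__main__":
    outer = [(0, 0), (6, 0), (6, 6), (0, 6)]
    hole = [(2, 2), (4, 2), (4, 4), (2, 4)]
    for line in contours_to_binary([outer, hole], 7, 7):
        print("".join("#" if v == 0 else "." for v in line))
```

The resulting rows could then be written out as an image file and handed to tesseract like any other binary input. Open contours (plain edges) have no inside, so this approach does not cover them.
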
>
> >> >> Warm regards,
> >> >> Dmitri Silaev
> >> >> www.CustomOCR.com
>
> >> >> On Sat, Nov 12, 2011 at 4:37 PM, daniel <[email protected]> 
> >> >> wrote:
> >> >> > Hi,
>
> >> >> > I want to use tesseract to read text off things like posters and
> >> >> > packages. The text will have different colors, and there will be images
> >> >> > and other clutter, so it seems like a non-standard situation. I thought
> >> >> > it would help if I used an OpenCV segmentation or contour-finding
> >> >> > algorithm instead of the thresholding that tesseract seems to do.
> >> >> > That, however, will not provide a binary image, but a list of
> >> >> > components/contours. How can I feed this to tesseract?
>
> >> >> > Best
>
> >> >> > Daniel
>
> >> >> > --
> >> >> > You received this message because you are subscribed to the Google
> >> >> > Groups "tesseract-ocr" group.
> >> >> > To post to this group, send email to [email protected]
> >> >> > To unsubscribe from this group, send email to
> >> >> > [email protected]
> >> >> > For more options, visit this group at
> >> >> > http://groups.google.com/group/tesseract-ocr?hl=en

