Am I the only one wondering what a printable control character might look
like? To me "control character" is a thing like carriage return or form
feed which doesn't have a printable representation.
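
For what it's worth, the recognize-then-discard approach Sven describes
below could be sketched roughly like this in Python. It is only a sketch
under assumptions: it supposes the unwanted symbols were trained to map to
placeholder code points, and the Private Use Area range U+E000 to U+E0FF
and the function name are hypothetical choices of mine. It only
post-processes text that the OCR step has already produced and calls no
Tesseract API.

```python
# Hypothetical placeholder range: assumes the unwanted symbols were trained
# to map to Unicode Private Use Area code points U+E000..U+E0FF.
PLACEHOLDER_FIRST = 0xE000
PLACEHOLDER_LAST = 0xE0FF


def discard_placeholders(text):
    """Return (cleaned_text, discarded), where discarded is a list of
    (index, char) pairs for every placeholder found in the input."""
    kept = []
    discarded = []
    for index, char in enumerate(text):
        if PLACEHOLDER_FIRST <= ord(char) <= PLACEHOLDER_LAST:
            # Recognized, but not part of the final output.
            discarded.append((index, char))
        else:
            kept.append(char)
    return "".join(kept), discarded


if __name__ == "__main__":
    sample = "Name:\ue000 Smith\ue001"
    cleaned, dropped = discard_placeholders(sample)
    print(cleaned)
    print(dropped)
```

Keeping the list of discarded positions, rather than silently dropping
the symbols, is what would make the debugging and calibration Sven
mentions possible.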

On Wed, Jun 6, 2012 at 12:48 AM, Sven Pedersen <[email protected]> wrote:

> Hi Tobias,
> In the form processing industry control characters are typically
> recognized and then discarded -- that allows better debugging and
> calibration than just ignoring them entirely.
> --Sven
>
> On Mon, Jun 4, 2012 at 11:51 AM, TobiasS <[email protected]> wrote:
> > Yes, but the issue with the blacklist is that the control characters are
> > not part of the Unicode character set (or any character set - they are
> > symbols). If possible I would like to use a cleaner solution than to
> > recognize, map to an arbitrary character and then blacklist.
> >
> > On Jun 4, 6:08 pm, Debayan Banerjee <[email protected]> wrote:
> >> On 4 June 2012 20:35, TobiasS <[email protected]> wrote:
> >>
> >> > Hi,
> >>
> >> > Is it possible to train Tesseract to not output/recognize a character?
> >>
> >> Try Tesseract's blacklist feature.
> >>
> >> --
> >> Debayan Banerjee
>
> --
> You received this message because you are subscribed to the Google
> Groups "tesseract-ocr" group.
> To post to this group, send email to [email protected]
> To unsubscribe from this group, send email to
> [email protected]
> For more options, visit this group at
> http://groups.google.com/group/tesseract-ocr?hl=en
>

