I am trying to provide a blacklist of UTF-8 characters, specified
by their byte codes, as follows:
// U+FB00 ff ef ac 80 LATIN SMALL LIGATURE FF
// U+FB01 fi ef ac 81 LATIN SMALL LIGATURE FI
myTess->SetVariable("tessedit_char_blacklist", "\xef\xac\x80\xef\xac\x81");
But this doesn't work. I also tried "\x0ef\x0ac\x080" (adding a leading 0),
with the same result. The call doesn't return an error, but the characters
in question are not blacklisted.
Is this string variable not in UTF-8 format? Is there a problem in the
C syntax I used to provide the hex codes?
Thanks!
Patrick