Hello Dmitri and thanks for your help,

> If you want someone here to dig into your issue, you should give as much as 
> possible info about it.
Well, I thought I did.

> From what you've given no one can reproduce it, and reproduction is a common 
> method to solve issues.
I really don't need anyone to reproduce it. I'm simply asking whether anyone 
has had the same issue with relatively recent source code; I'm using it from 
C++ through the wrapper API.
Basically: is the behavior of "tessedit_char_whitelist" the same as before, or 
did it change in some way? In previous versions it was not only a hint to the 
classifier; it also completely prevented the classifier from following learnt 
paths other than the ones in the whitelist. Now it seems to be different, 
because (again, whatever the image is) the output contains characters that are 
not in this whitelist.

The basic idea of a whitelist is that of a safer blacklist: a blacklist 
excludes a few possibilities, while a whitelist allows only a given set of 
characters. I would like to know whether this behavior is still the same, 
that's all.

> Show us your images, full code snippet, config files, etc. Then maybe you'll 
> get the answer.
Whatever the image is, the output contradicts a basic rule (the whitelist) 
which used to work when I first started using tesseract years ago with older 
versions.
As for a code snippet, the only useful piece of code I can think of 
copy-pasting is this line:
>>> _tessApi->setVariable("tessedit_char_whitelist", "><0123456789");
But the result i get contains other characters, not allowed by the whitelist:
>>> 3000657806S<00S60':0<3000657B0<
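
To make the mismatch concrete, here is a minimal stand-alone check in plain 
C++. It makes no tesseract calls, and the helper name is hypothetical, invented 
only for this illustration. It lists the characters of a result string that 
fall outside a given whitelist:

```cpp
#include <string>

// Hypothetical helper (not part of tesseract): returns every character of
// `text` that is absent from `whitelist`, in order of appearance.
// Whitespace is skipped, since OCR output normally contains spaces and
// newlines that a character whitelist is not expected to list.
std::string charsOutsideWhitelist(const std::string& whitelist,
                                  const std::string& text) {
    std::string offending;
    for (char c : text) {
        if (c == ' ' || c == '\n' || c == '\t' || c == '\r') continue;
        if (whitelist.find(c) == std::string::npos) offending += c;
    }
    return offending;
}
```

Called with the whitelist and the result quoted above, it returns `SS':B`, 
i.e. five characters that the whitelist should have excluded.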

Again: I'm using a fresh svn HEAD version of tesseract via the C++ wrapper API.

Would it be possible for anyone here to give me a snippet of a working 
whitelist, behaving as it was designed to in the previous versions?

Thanks a lot,
Pierre.
