This looks like a Latin-1 vs. UTF-8 issue. You need to use "\u00e4\u00c4\u00f6\u00d6\u00fc\u00dc\u00df\u20ac", which most compilers accept and convert to UTF-8.

Ray.
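
To check which bytes a UTF-8 string should contain for these characters, a minimal sketch like the one below may help. `encode_utf8` is a hypothetical helper written for illustration; it is not part of Tesseract or tessnet2. It only shows the standard UTF-8 encoding of a single code point, so the byte length that reaches the whitelist can be compared with the expected one.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical helper (not a Tesseract API): encode one Unicode
// code point as a UTF-8 byte sequence.
std::string encode_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: covers the umlauts and sharp s (U+00C4 .. U+00FC)
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF) {
            throw std::invalid_argument("surrogate code point");
        }
        // 3 bytes: covers the euro sign (U+20AC)
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {
        // 4 bytes: code points outside the Basic Multilingual Plane
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        throw std::invalid_argument("code point out of range");
    }
    return out;
}
```

With this helper, `encode_utf8(0x00E4)` (ä) yields the two bytes C3 A4 and `encode_utf8(0x20AC)` (€) yields the three bytes E2 82 AC. In a single-byte code page each of these characters would occupy one byte instead, so a length that does not match the UTF-8 length is a hint that the string was not passed as UTF-8.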
On Wed, Feb 11, 2009 at 3:24 AM, paulwesterkamp < [email protected]> wrote:
>
> Me Again.
>
> i use visual Studio 2005. i tried using the tessnet2.dll
> but when i use one of these symbols [ä Ä ö Ö ü Ü ß €] in the whitelist
> via
>
> ocr.SetVariable("tessedit_char_whitelist", "äÄ0123€");
>
> the application crashes and throws an Assertion Failed Error in line
> 76 of unicharset.cpp
>
> assert(ids.contains(unichar_repr, length));
>
> Following the steps in Debugger i find out he passes unichar_repr = €
> and length=2
> therefore the method Contains returns false. When Length is 2 he sets
> Current_nodes to childnodes but these are deleted in
>
> UNICHARMAP::UNICHARMAP_NODE::~UNICHARMAP_NODE() {
>   if (children != 0) {
>     delete[] children;
>   }
> }
>
> So return current_nodes != 0 always returns false
> and so the assert fails.

