OK, I won't create an issue for this for now, but the reason I
brought it up is that I saw isspace fail last night on text with an
out-of-range value (the char value was -128).

When I changed isspace to iswspace the problem went away. I'm no
expert at this and I may well be fixing it incorrectly, but isspace
was crashing, which meant I was unable to do any training.
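
For reference, a char value of -128 is the byte 0x80, which can appear
inside a multi-byte UTF-8 sequence. Passing a negative value other than
EOF to isspace is undefined behaviour in C and C++, and the usual way
to keep the argument in range is to cast each byte to unsigned char
first. A minimal sketch of that approach, assuming a hypothetical helper
called CountLeadingSpaces (this is not the actual code in
paragraphs.cpp):

```cpp
#include <cctype>
#include <cstddef>

// Hypothetical helper for illustration only; not taken from tesseract.
// Counts the leading ASCII whitespace bytes of a NUL-terminated UTF-8 string.
// The cast to unsigned char keeps the argument to isspace in 0..255, so
// bytes that are negative as (signed) char, such as -128, stay in range.
static std::size_t CountLeadingSpaces(const char* text) {
  std::size_t n = 0;
  while (text[n] != '\0' &&
         std::isspace(static_cast<unsigned char>(text[n]))) {
    ++n;
  }
  return n;
}
```

With the cast, the string stays a char* as returned by GetUTF8Text, so
no switch to wide-character functions is needed.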

I can supply you with the tiff / box file if you want.

On Feb 9, 3:07 am, "Jimmy O'Regan" <[email protected]> wrote:
> On 9 February 2012 00:02, Wil Hadden <[email protected]> wrote:
>
> > Hi,
>
> > I thought I should let you know of an issue I may have uncovered.
>
> > In paragraphs.cpp in InitializeRowInfo there is a call to GetUTF8Text
> > followed by a while loop that uses isspace.
>
> > The problem is that isspace expects a char, not utf8 and it can throw
> > an assert. Changing the isspace to iswspace fixes the issue on Windows
> > builds, you may need another solution for other platforms.
>
> No, you're imagining there's a problem where none exists, and your
> change might introduce a problem where none had existed because
> GetUTF8Text returns char*, not wchar_t* (UTF-8 is backwards compatible
> with ASCII, there's no problem with isspace).
>
> --
> <Sefam> Are any of the mentors around?
> <jimregan> yes, they're the ones trolling you
