See http://www.isri.unlv.edu/ISRI/OCRtk
A ^ before a character indicates that Tesseract considers that character
"suspicious" in some sense, and ~ indicates a reject. The output is in
Latin-1 instead of UTF-8, and may not work at all for non-Latin text.
Ray.
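
For anyone post-processing this output, here is a minimal sketch of how the markers described above could be separated from the text. It is not part of Tesseract: the function name parse_unlv is hypothetical, and it assumes that ^ is a prefix flagging the next character, that ~ stands in place of a rejected character, and that the string arrives as Latin-1 bytes. A literal ^ or ~ in the source image cannot be told apart from a marker this way.

```python
def parse_unlv(raw: bytes):
    """Split UNLV-style output into plain text and marker positions.

    Hypothetical helper, not a Tesseract API. Assumes:
      - the bytes are Latin-1 encoded,
      - '^' flags the character that follows it as suspicious,
      - '~' is a placeholder for a rejected character.
    Returns (text, suspect_positions, reject_positions), where the
    positions index into the returned text.
    """
    decoded = raw.decode("latin-1")
    out = []
    suspects = []
    rejects = []
    mark_next = False  # True right after a '^' marker has been seen

    for ch in decoded:
        if ch == "^" and not mark_next:
            # Marker only: remember it and do not copy it to the output.
            mark_next = True
            continue
        if ch == "~":
            rejects.append(len(out))
        elif mark_next:
            suspects.append(len(out))
        mark_next = False
        out.append(ch)

    return "".join(out), suspects, rejects
```

Applied to the first line of the example in the question, parse_unlv(b"^L^0^v^e ^c^o^m^e^s ^a^n^d ^g^o^e^s") gives the text "L0ve comes and goes" with every non-space character flagged as suspicious. The caller can then decide whether to re-check, highlight, or simply keep the flagged characters.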

On Mon, Aug 17, 2009 at 2:52 PM, jia <[email protected]> wrote:

>
> Hi, group,
>
> Here's an example of returned string (excluding double quotes) when
> calling TessBaseAPI::TesseractRectUNLV():
>
> "^L^0^v^e ^c^o^m^e^s ^a^n^d ^g^o^e^s
> ^A^nd o^f^te^n it h^as paused,
> Then ^c^o^m^e back t^0 ^see
> The damage it ^ha^s caused."
>
>
> I am not sure how I should interpret this result. I searched for "UNLV"
> in this group, and nothing showed up. I also googled around a bit, and
> there wasn't an obvious answer. Can someone explain what exactly
> UNLV-style output is?
>
> Thanks.
> >
>
