Re: Errors that might be decreasing my OCR precision

p . athanasopoulos Mon, 27 Jan 2014 00:03:05 -0800

Hi everyone,

I know this is quite an old topic by now, but this question still stands 
and I saw no reason to create a new one for it.


I use tesseract 3.0.2 (with leptonica 1.67, which was the recommended
version at the time of installation) on CentOS 6.5. I convert large PDF
files to separate per-page PNGs, then use tesseract to scan for specific
keywords.

A few pages have given me the following errors (the errors always come 
together):

Error in boxClipToRectangle: box outside rectangle
Error in pixScanForForeground: invalid box

These pages seem to be OCRed correctly, with more or less the same 
precision as the rest of the pages (~96% characters recognised), but I have 
only found three pages with these errors so my sample is not very 
significant. 
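
To collect a larger sample, the affected pages could be flagged
automatically. What follows is only a minimal sketch, assuming Python 3,
a tesseract binary on the PATH, and that these messages are written to
stderr (an assumption I have not verified for every build). The names
leptonica_errors and ocr_page are illustrative and not part of tesseract.

```python
import subprocess
from pathlib import Path

# Substrings taken from the two error messages quoted above.
KNOWN_ERRORS = (
    "Error in boxClipToRectangle",
    "Error in pixScanForForeground",
)


def leptonica_errors(stderr_text):
    """Return the stderr lines that contain one of the known error messages."""
    return [
        line.strip()
        for line in stderr_text.splitlines()
        if any(marker in line for marker in KNOWN_ERRORS)
    ]


def ocr_page(png_path, out_dir):
    """Run `tesseract <image> <outputbase>`; return (text, error_lines).

    Assumes tesseract writes its result to <outputbase>.txt and prints
    the library errors to stderr.
    """
    out_base = Path(out_dir) / Path(png_path).stem
    result = subprocess.run(
        ["tesseract", str(png_path), str(out_base)],
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    # Append ".txt" by hand so a dot inside the page name is preserved.
    txt_path = Path(str(out_base) + ".txt")
    text = txt_path.read_text(encoding="utf-8", errors="replace")
    return text, leptonica_errors(result.stderr)
```

Calling ocr_page on each page PNG and keeping the pages whose second
return value is non-empty would give a list of affected pages, whose
keyword hits can then be compared against the clean pages.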

What do these errors mean?
Do they hint at a user error?
Could they indicate a loss of precision?

Thanks in advance for the help. :)

