Interesting result. The catch is that the value of DangAmbigs depends on
the size of the document being OCRed.
Very small documents get no benefit from the adaptive classifier, so
DangAmbigs has very little effect.
Very large (e.g. multipage) documents benefit greatly from the adaptive
classifier, but mis-adaption is also at its most costly there, so adaption
has to be carefully controlled, and DangAmbigs is very important.
On medium-sized documents, adaption still has a strong effect, but the cost
(and danger) of mis-adaption is lower, so it pays to make riskier adaptions.
That is why an empty DangAmbigs can lead to higher accuracy.
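
To repeat the comparison on documents of different sizes, something like
the following could set up a second data directory with an empty DangAmbigs.
This is only a sketch: it assumes the language pack stores the list as
<lang>.DangAmbigs inside the tessdata directory, and the function name
make_empty_ambigs_copy is made up for illustration. Pointing Tesseract at
the copy is left out because it depends on your installation.

```python
import shutil
from pathlib import Path


def make_empty_ambigs_copy(tessdata: Path, dest: Path, lang: str = "eng") -> Path:
    """Copy a tessdata directory and empty <lang>.DangAmbigs in the copy.

    Assumption: the ambiguity list is a file named <lang>.DangAmbigs.
    The original directory is left untouched.
    """
    # copytree raises FileExistsError if dest already exists, which
    # protects an earlier experiment from being overwritten.
    shutil.copytree(tessdata, dest)
    ambigs = dest / f"{lang}.DangAmbigs"
    # Truncate the file (or create it empty if the pack did not ship one).
    ambigs.write_text("")
    return ambigs
```

Running the same test set once against the original directory and once
against the copy, separately for short, medium and multipage inputs, would
show whether the size effect described above holds on your data.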

Ray.

On Wed, Apr 8, 2009 at 12:27 PM, Michael Reimer <[email protected]> wrote:

>
> Also, I've run the UNLV tests with the default DangAmbigs from the
> English language pack, with my own generated one, and with an empty
> one.  The empty one gives the best performance on my system.  Is that
> normal?
