If new scans (same font and size, in fact the same book) are processed
into .box files using the previous training results (via -l <langcode>),
and the resulting new training data is then merged with the existing bulk
and used on the next new scan, should recognition quality be expected to
improve significantly? What shape should the quality curve take
(exponential, logarithmic, etc.)? Is there any reliably known quantity of
training data that can be expected to produce, say, 95% accuracy?
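
For reference, this is roughly how I drive the bootstrapping step. It is
only a minimal Python sketch assuming the standard tesseract command-line
training workflow; the "mylang" language code, the scans/ directory and
the file names are made up for illustration:

    #!/usr/bin/env python
    # Sketch: generate .box files for new scans using the previously
    # trained language data (-l mylang), so the boxes only need hand
    # correction before being merged back into the training set.
    # "mylang" and scans/ are hypothetical; the invocation follows the
    # documented "batch.nochop makebox" training step.
    import glob
    import subprocess

    LANG = "mylang"  # previously trained language code (assumption)

    for tif in sorted(glob.glob("scans/*.tif")):
        base = tif[:-4]  # output base: same name without .tif
        # tesseract <image> <outputbase> -l <lang> batch.nochop makebox
        subprocess.check_call(
            ["tesseract", tif, base, "-l", LANG, "batch.nochop",
             "makebox"])
        print("wrote %s.box" % base)

After hand-correcting the boxes, the new training files get merged with
the existing bulk in the usual training run, as described above.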

I can't figure this out for myself, and I don't see it happening with my
data yet anyway.

-Yury
