On 5 Jan 2012, at 08:08, Ralf Stephan wrote:
> On Jan 4, 2012, at 4:29 PM, Lars Aronsson wrote:
>> One problem is if older scans were OCRed with older
>> software and worse results. Should one go back and
>> run a new OCR on these? Perpetually every 5 years?
> 
> What's easier? Replace the OCR or write rules that only catch
> the quirks of a specific OCR software+language+font combination?
> Clearly the former, IMHO.

OCR run five years later will not necessarily be better, and if any human 
proofreading has been done on the text, it would be wrong to overwrite it with 
the new OCR output.

With a human proofreading UI, it seems essential to be able to “approve” pages 
even if no errors are found, to dissuade future OCR from making changes.
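
As a rough illustration of that rule, here is a minimal sketch in Python. The 
Page record, its proofread/approved flags, and apply_new_ocr are hypothetical 
names made up for this example, not anything that exists in Open Library; the 
only point is that a re-OCR pass skips any page a human has edited or approved.

```python
from dataclasses import dataclass


@dataclass
class Page:
    """Hypothetical page record; field names are assumptions for illustration."""
    text: str
    proofread: bool = False  # a human has corrected the text
    approved: bool = False   # a human reviewed the page and signed off, even with no edits


def apply_new_ocr(page: Page, new_text: str) -> bool:
    """Replace the page text with new OCR output unless a human has touched it.

    Returns True if the text was replaced, False if the page was protected.
    """
    if page.proofread or page.approved:
        return False
    page.text = new_text
    return True


# Usage: an untouched page is updated, an approved page is left alone.
untouched = Page(text="old ocr")
approved = Page(text="clean page", approved=True)
apply_new_ocr(untouched, "new ocr")
apply_new_ocr(approved, "new ocr")
```

Keeping "approved" separate from "proofread" is what lets a reviewer protect a 
page that happened to need no corrections.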

- L

