Re: [ol-discuss] Recording the quality of a book's OCR

Janusz S. Bień Thu, 29 Dec 2011 22:12:44 -0800

On Thu, 29 Dec 2011, Edward Betts <[email protected]> wrote:

> We don't currently have a system for recording the quality of the OCR or 
> correcting mistakes.
>
> As you point out, the OCR doesn't properly handle blackletter type.


There is a solution to it, but it is expensive:

      http://www.frakturschrift.com/

> A system for correcting OCR is often requested; conceptually it is quite
> simple.

But not in practice...

> Just a web page that shows the page image and a way to edit the 
> text. We are keen to maintain page coordinate information for each word so
> that we can highlight words in the book reader and search inside. This 
> makes the problem more difficult.
>
> We would like to build a correction system, but we don't have the resources.
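
The coordinate requirement quoted above can be illustrated with a small
Python sketch. It assumes each OCR word is stored together with a bounding
box in page pixels; the names Word, correct_word, merge_words and page_text
are hypothetical and are not taken from Open Library's code or from any OCR
toolkit. A plain text fix keeps the word's box, and merging two wrongly
split tokens takes the union of their boxes, so highlighting and search
inside can still locate the corrected word.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

# Assumed box layout: (left, top, right, bottom) in page pixels.
Box = Tuple[int, int, int, int]


@dataclass(frozen=True)
class Word:
    """One OCR token and its position on the page image (hypothetical model)."""
    text: str
    box: Box


def correct_word(words: List[Word], index: int, new_text: str) -> List[Word]:
    """Return a copy of the page where one word's text is fixed; its box is kept."""
    fixed = list(words)
    fixed[index] = replace(words[index], text=new_text)
    return fixed


def merge_words(words: List[Word], index: int, new_text: str) -> List[Word]:
    """Merge word `index` with the following word, using the union of both boxes."""
    a, b = words[index], words[index + 1]
    box = (
        min(a.box[0], b.box[0]),
        min(a.box[1], b.box[1]),
        max(a.box[2], b.box[2]),
        max(a.box[3], b.box[3]),
    )
    return words[:index] + [Word(new_text, box)] + words[index + 2:]


def page_text(words: List[Word]) -> str:
    """Plain text of the page, rebuilt from the word list."""
    return " ".join(w.text for w in words)


# Example: the OCR split "Fraktur" in two and misread a long s as "f".
page = [
    Word("Frak", (10, 20, 50, 40)),
    Word("tur", (52, 20, 80, 41)),
    Word("fchrift", (90, 20, 160, 40)),
]
page = merge_words(page, 0, "Fraktur")
page = correct_word(page, 1, "schrift")
```

Splitting one token into two is the harder case, since the original data
holds no separate box for each part; an editor would have to estimate the
boxes or ask the user to mark them, which is part of why the problem is
more difficult than plain text editing.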

Building such a system seems to be a goal of several projects, but I
haven't yet found anything satisfactory for my purposes. The IMPACT
project developed a system that looks nice, but again it is probably
going to be quite expensive:

     http://www.digitisation.eu/index.php?id=109

Best regards

Janusz

-- 
                           ,   
Prof. dr hab. Janusz S. Bien -  Uniwersytet Warszawski (Katedra Lingwistyki 
Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
