On Thu, 29 Dec 2011 Edward Betts <[email protected]> wrote:
> We don't currently have a system for recording the quality of the OCR or
> correcting mistakes.
>
> As you point out the OCR doesn't properly handle blackletter type.
There is a solution to it, but it is expensive:
http://www.frakturschrift.com/
> A system for correcting OCR is often requested, conceptually it is quite
> simple.
But not in practice...
> Just a web page that shows the page image and a way to edit the
> text. We are keen to maintain page coordinate information for each word so
> that we can highlight words in the book reader and search inside. This
> makes the problem more difficult.
>
> We would like to build a correction system, but we don't have the resources.
Building such a system seems to be a goal of several projects, but I
haven't yet found anything satisfactory for my purposes. The IMPACT
project developed a system that looks nice, but again it is probably
going to be quite expensive:
http://www.digitisation.eu/index.php?id=109
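To illustrate why keeping page coordinates for each word makes correction
harder than plain text editing, here is a minimal sketch in Python. It is
not Open Library's code: the names Word, Box and replace_word are
hypothetical, and splitting a box in proportion to character count is only
an assumed approximation of where the corrected words sit on the page.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed layout: (left, top, right, bottom) in page-image pixels.
Box = Tuple[int, int, int, int]


@dataclass
class Word:
    text: str
    box: Box


def replace_word(words: List[Word], index: int, corrected: str) -> List[Word]:
    """Return a new word list with words[index] replaced by `corrected`.

    A one-to-one fix keeps the original box. If the correction contains
    several tokens (the OCR merged two words), the box is split
    horizontally in proportion to token length. An empty correction
    removes the word together with its box.
    """
    old = words[index]
    tokens = corrected.split()
    if not tokens:
        return words[:index] + words[index + 1:]

    left, top, right, bottom = old.box
    total_chars = sum(len(t) for t in tokens)
    new_words = []
    cursor = left
    consumed = 0
    for token in tokens:
        consumed += len(token)
        # Approximation: width share equals character share.
        end = left + round((right - left) * consumed / total_chars)
        new_words.append(Word(token, (cursor, top, end, bottom)))
        cursor = end
    return words[:index] + new_words + words[index + 1:]


page = [Word("thc", (10, 20, 40, 30)), Word("catsat", (50, 20, 110, 30))]
fixed = replace_word(page, 0, "the")        # simple case: box unchanged
fixed = replace_word(fixed, 1, "cat sat")   # hard case: one box becomes two
```

The simple case is trivial, but as soon as a correction splits, merges or
deletes words, the coordinates have to be estimated rather than copied,
which is where highlighting and search inside the book can go wrong.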
Best regards
Janusz
--
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki
Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
_______________________________________________
Ol-discuss mailing list