On Fri, 30 Dec 2011 Ralf Stephan <[email protected]> wrote:
> On Dec 30, 2011, at 7:12 AM, Janusz S. Bień wrote:
>> On Thu, 29 Dec 2011 Edward Betts <[email protected]> wrote:
>>> As you point out, the OCR doesn't properly handle blackletter type.
>>
>> There is a solution to it, but it is expensive:
>>
>> http://www.frakturschrift.com/
>
> tesseract is free and supports blackletter ("broken") fonts for German,
> Swedish and Danish. The results are nearly as good as with ABBYY.
That's good news. I was aware of the blackletter support in
tesseract (it is even available as a Debian package), but when I gave
it a try (quite a long time ago) I was not satisfied with the results.
>
>>> A system for correcting OCR is often requested; conceptually it is quite
>>> simple.
>
> What about the interface of Distributed Proofreaders pgdp.net?
> It's written in PHP and provides a full editor.
I'm also aware of it, but I was unable to find any information about
its license. It looks like the only way to try it is to register
as a proofreader, and I don't want to do that. The screenshot
http://www.pgdp.net/d/walkthrough/04_Proof.htm
looks nice, but it is of course not sufficient for an evaluation.
Best regards,
Janusz
--
Prof. dr hab. Janusz S. Bien - Uniwersytet Warszawski (Katedra Lingwistyki
Formalnej)
Prof. Janusz S. Bien - University of Warsaw (Formal Linguistics Department)
[email protected], [email protected], http://fleksem.klf.uw.edu.pl/~jsbien/
_______________________________________________
Ol-discuss mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
To unsubscribe from this mailing list, send email to
[email protected]