I inadvertently purchased ABBYY FineReader 11 Corporate thinking that it would
be capable of outputting to ALTO XML. I was wrong. ABBYY FineReader Engine
does. :/
Ultimately, I want to OCR some newspaper images and export them to ALTO XML
and, until the proof of concept is done, I want to try to
You might take a look at Tesseract [1]. On a typical Linux box:
$ tesseract input.tif outputName hocr
renders hOCR, an HTML format that carries bounding-box coordinates for the
recognized text. You might be able to convert that output to ALTO.
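
A rough sketch of what that processing might look like, using only the Python
standard library. It assumes the hOCR marks each word as a span with class
"ocrx_word" (or "ocr_word") and a title of the form "bbox x0 y0 x1 y1", which
may differ between Tesseract versions. The names HocrWordParser, hocr_words,
and words_to_alto_strings are made up for illustration, and the result is only
a TextLine fragment of String elements, not a complete, schema-valid ALTO file:

```python
import re
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Assumption: word boxes appear in the title attribute as "bbox x0 y0 x1 y1".
BBOX_RE = re.compile(r"bbox (\d+) (\d+) (\d+) (\d+)")
WORD_CLASSES = {"ocrx_word", "ocr_word"}


class HocrWordParser(HTMLParser):
    """Collects (text, x0, y0, x1, y1) tuples for each hOCR word span."""

    def __init__(self):
        super().__init__()
        self.words = []
        self._bbox = None   # bbox of the word span currently open, if any
        self._depth = 0     # nesting depth inside that word span
        self._text = []

    def handle_starttag(self, tag, attrs):
        if self._bbox is not None:
            self._depth += 1  # markup nested inside a word, e.g. <strong>
            return
        attrs = dict(attrs)
        classes = set((attrs.get("class") or "").split())
        if tag == "span" and classes & WORD_CLASSES:
            match = BBOX_RE.search(attrs.get("title") or "")
            if match:
                self._bbox = tuple(int(v) for v in match.groups())
                self._depth = 1
                self._text = []

    def handle_endtag(self, tag):
        if self._bbox is None:
            return
        self._depth -= 1
        if self._depth == 0:  # the word span itself just closed
            text = "".join(self._text).strip()
            if text:
                self.words.append((text,) + self._bbox)
            self._bbox = None

    def handle_data(self, data):
        if self._bbox is not None:
            self._text.append(data)


def hocr_words(hocr_text):
    """Return the list of word tuples found in an hOCR string."""
    parser = HocrWordParser()
    parser.feed(hocr_text)
    parser.close()
    return parser.words


def words_to_alto_strings(words):
    """Build a TextLine element holding one ALTO-style String per word."""
    line = ET.Element("TextLine")
    for text, x0, y0, x1, y1 in words:
        ET.SubElement(line, "String", {
            "CONTENT": text,
            "HPOS": str(x0),          # left edge
            "VPOS": str(y0),          # top edge
            "WIDTH": str(x1 - x0),
            "HEIGHT": str(y1 - y0),
        })
    return line


# Hand-written sample, not real Tesseract output.
SAMPLE = (
    "<div class='ocr_page'>"
    "<span class='ocr_line' title='bbox 10 20 200 50'>"
    "<span class='ocrx_word' title='bbox 10 20 90 50'>Hello</span> "
    "<span class='ocrx_word' title='bbox 100 20 200 50'>world</span>"
    "</span></div>"
)

if __name__ == "__main__":
    fragment = words_to_alto_strings(hocr_words(SAMPLE))
    print(ET.tostring(fragment, encoding="unicode"))
```

For a real file you would read outputName.html (or whatever name Tesseract
wrote) and pass its contents to hocr_words. The coordinates here stay in
pixels, so a full ALTO document would still need the surrounding
Layout/Page/TextBlock structure and a matching measurement unit.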
Cheers,
Bridger
[1] http://code.google.com/p/tesseract-ocr/
On Thu, Sep 6, 2012 at 8:29