On Thu, Nov 05, 2015 at 10:16:04PM -0500, Tom Morris wrote:
> I've got a fix in hand and will generate a pull request as soon as I have
> some test data to test with.
It looks like the 'epub' project requires 'abbyy' OCR output as a
starting point. Is the toolchain for going from raw scans to abbyy also
available, so that we can generate our own test datasets from our own
books? I skimmed the other internetarchive projects on GitHub, but it
wasn't apparent which of them, if any, handles the scan->abbyy steps of
the pipeline.
Jon
_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
Archives: http://www.mail-archive.com/[email protected]/