OCRopus 0.5 was released a few weeks ago on Google Code. There are a lot
of changes relative to older versions:
- OCRopus has been completely refactored and now consists of a set of
Python modules, with some native code modules.
- Unicode and ligature support should be fully working now.
- Language modeling still uses finite state transducers, but all finite
state transducer code has been refactored into ocrofst.
- There is a completely new recognizer that performs much better than the
old recognizer and scales to millions of training samples.
- Databases for training/testing have been changed from SQLite format to
HDF5 (using PyTables).
- You can fetch everything you need for an install with a single
command ("hg clone https://code.google.com/p/ocropus").

There are some videos on YouTube showing installation and training:
http://www.youtube.com/playlist?list=PL8B1A3C55DD915896&feature=mh_lolz
There is also some additional documentation here:
https://docs.google.com/a/iupr.com/document/d/1RxXeuuYJRhrOkK8zcVpYtTo_zMG9u6Y0t4s-VoSVoPs/edit

Image preprocessing and layout analysis are still essentially unchanged
from older OCRopus versions. They remain fairly sensitive to noise and
will be replaced in future releases.
Tom