Hi folks,

I've belatedly got around to writing up a little experiment I did a while ago on training OCRopus to read a really weird font. It's in no way meant to be taken as an example of real-world usage, since everything was highly contrived. Like many people on here, I've been attempting to train it to yield better results on Latin texts using the default character model, so far without much success. Still, it might be helpful.
Caveats:
- it doesn't cover the newer training techniques (clustering, labelling?) that are apparently available in the newest release
- the version of OCRopus used was tip with parent 349:ef1e07e86895 from Feb 23rd

Here's the link: http://ocropodium.cerch.kcl.ac.uk/?p=82

Also, people attempting training on "difficult" sources might be interested in a couple of tools I made for playing with line transcripts and character segmentation. They're pretty rough and you'll have to build them from source, but again, they might come in handy: http://code.google.com/p/ocropodium/

Cheers,
Mike
