> What is the recommended procedure for manually correcting cseg.gt.png
> files? Is there a utility that I am overlooking?

There isn't one yet; we've been working on it.

> When generating text for training images, should this include spaces?

Yes; however, space handling in OCRopus is currently inconsistent, so the spaces are ignored.

> My overall procedure : I have spent some time training ocropus on a
> custom font, images from JPGs. I am using the following methods :
>
> 1) Generate a variety of single line training images programmatically
> 2) Manually type the text contained in each training image

If you generate the images, why not save the text at the same time?

> 3) Place these in a directory training/0000 or training/0001 etc
> 4) run ocropus lines2fsts training
> 5) replace the generated txt files with my txt files and run ocropus
> align training to generate cseg.png
> 6) run ocropus trainseg on training to generate a new model file
> 7) goto 1 using the new training model

If you can write a script that takes a text file and font and generates a book directory full of binary line images, corresponding csegs, and corresponding Unicode strings, that would be useful.

Tom
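
A minimal sketch of the bookkeeping half of such a script, in Python: it writes one image and one text file per input line, so the transcription is saved at generation time instead of being typed by hand. Everything named here is an assumption for illustration and not part of OCRopus: `write_book`, the `render_line(text, image_path)` callback that would draw the text in the chosen font, the `lines_per_page` parameter, and the `0000/0000.png` plus `0000/0000.txt` naming. Check the file names against what your OCRopus version expects in a book directory. The sketch does not produce csegs; that would need per-character positions from the renderer.

```python
import os


def write_book(lines, book_dir, render_line, lines_per_page=100):
    """Write one line image and one ground-truth text file per input line.

    render_line(text, image_path) is a hypothetical callback that draws
    `text` in the chosen font and saves the result at image_path.
    Returns the list of base paths (without extension) that were written.
    """
    written = []
    # Skip blank lines; keep spaces inside each line untouched.
    kept = [ln.strip() for ln in lines if ln.strip()]
    for index, text in enumerate(kept):
        page, line_no = divmod(index, lines_per_page)
        # Assumed layout: book_dir/0000/0000.png, book_dir/0000/0000.txt, ...
        page_dir = os.path.join(book_dir, "%04d" % page)
        os.makedirs(page_dir, exist_ok=True)
        base = os.path.join(page_dir, "%04d" % line_no)
        render_line(text, base + ".png")
        # Save the exact string that was rendered next to its image.
        with open(base + ".txt", "w", encoding="utf-8") as handle:
            handle.write(text + "\n")
        written.append(base)
    return written
```

To use it, read the source text file into a list of lines and pass a `render_line` built on whatever imaging library you already use to generate the training images.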
