Hi Donaldo,

Well, I'm relieved it's compiling for you now!
I'll reply to your questions below.

> Right, so now what do I do? You said last week to use commands such as:
>
> ./lazytrain textfile.txt DejaVu-Serif-Book 1.png 1.box
>
> which seems to work, but I guess that I need to run it for many fonts and then
> combine them all into a traineddata file?

Yep, exactly correct.

> Are there guidelines on choice of fonts for a Latin-based alphabet,
> bold, italic etc? How many is enough?

Well, they want to reflect whatever you are actually going to be scanning. For example I initially used every font on my system that covered the characters I cared about. But choosing a subset which actually represented what was printed in the books I was scanning produced much better results.

> Is it
> still necessary to have a text file with multiple instances of each letter,
> which I have, but which will produce large box files?

Yes. Large box files are fine, don't fear them ;) One of the advantages of generating this stuff automatically is that you can be more comprehensive than you have time to do manually.

Hope this helps, let us know of any other questions you have.

Nick
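
P.S. In case it's useful, here is a rough sketch of the per-font loop, not a tested recipe. It assumes lazytrain takes its arguments in the same order as the command quoted above (text file, font name, image, box file). The font names other than DejaVu-Serif-Book, the gen_training_pairs helper and the DRY_RUN switch are placeholders made up for illustration. By default it only prints the commands it would run.

```shell
#!/bin/sh
# Sketch: one lazytrain run per font, writing numbered image/box pairs.
# Argument order (text, font, image, box) is copied from the quoted command.
# Font names passed in are placeholders; use the fonts that match your books.
# DRY_RUN=1 (the default here) only prints the commands; DRY_RUN=0 runs them.

TEXT=textfile.txt
DRY_RUN=${DRY_RUN:-1}

gen_training_pairs() {
    n=1
    for font in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            # Show what would be run for this font.
            echo "./lazytrain $TEXT $font $n.png $n.box"
        else
            # Stop at the first failing run so a bad font name is noticed.
            ./lazytrain "$TEXT" "$font" "$n.png" "$n.box" || return 1
        fi
        n=$((n + 1))
    done
}

gen_training_pairs DejaVu-Serif-Book DejaVu-Serif-Bold DejaVu-Serif-Italic
```

Set DRY_RUN=0 to actually run the commands. The numbered .png/.box pairs it leaves behind are what then get combined into the traineddata file.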

