Using Tesseract, I can limit the output characters with a shell command. I just need to create a file in tesseract-ocr/tessdata/configs/, which I call, for example, myletters. In that file I define the whitelist by writing:
tessedit_char_whitelist QWERTYUIOPASDFGHJKLZXCVBNM

After that I can process an image with:

$ tesseract prova.tif out nobatch myletters

The result then contains only upper-case letters (the letters from my whitelist).

Can I do something like that in ocropus, or do I need to do it with a language model?

Thanks.

-----------------------------
Pierpaolo Monaco
-----------------------------
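
For reference, the Tesseract steps described above can be put together as a small shell script. This is only a sketch: TESSDATA_DIR is a hypothetical variable introduced here for illustration, and its default path is an assumption that has to match the tessdata directory of the actual Tesseract installation, otherwise Tesseract will not find the config by name.

```shell
#!/bin/sh
# Sketch only. TESSDATA_DIR is a hypothetical variable (not a Tesseract
# setting); point it at the tessdata directory of your installation.
TESSDATA_DIR="${TESSDATA_DIR:-./tesseract-ocr/tessdata}"
CONFIG="$TESSDATA_DIR/configs/myletters"

# Write the config file holding the whitelist of allowed characters.
mkdir -p "$TESSDATA_DIR/configs"
printf '%s\n' 'tessedit_char_whitelist QWERTYUIOPASDFGHJKLZXCVBNM' > "$CONFIG"

# Run the OCR only if tesseract and the input image are both available.
# The command line is the one from the message above.
if command -v tesseract >/dev/null 2>&1 && [ -f prova.tif ]; then
    tesseract prova.tif out nobatch myletters
fi
```

The script only writes the config and calls Tesseract; it does not answer whether ocropus offers an equivalent option.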
