Hi, After some tests I realized that the best option for me is to put effort into extending the Catalan dictionary that is in the svn repository (v3). It would be very useful if you could do one of these:
-> deliver the separate files that are combined to create the unified cat.traineddata file (the UTF-8 files used to generate the DAWG would also be amazing!).
-> show how to extract these files from cat.traineddata and how to convert the DAWG back to UTF-8 (dawg2utf8), if that is possible.
THANKS!

