I'm dealing with font subsets, and I generate one image per font, so there 
is no reading order, though I have seen Latin and CJK in the same font 
subset. If OSD just gives reading, orientation, and text order, it is not 
going to give me anything useful. Plus I have the font, so I could get some 
of that info from the font itself; I just have no idea what language it is 
(though maybe I should go back and take another look...).
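
As a rough sketch of what getting that info from the font could look like: given the code points a subset covers (read from the font's cmap by some other tool, which is assumed and not shown here), the first word of each Unicode character name gives a crude script guess. The helper name `script_counts` is made up for illustration, and this only hints at the script, not the language.

```python
import unicodedata
from collections import Counter

def script_counts(codepoints):
    """Count code points by the first word of their Unicode name.

    The first word (e.g. LATIN, CJK, HIRAGANA) is only a crude proxy
    for the script; unnamed code points are skipped.
    """
    counts = Counter()
    for cp in codepoints:
        name = unicodedata.name(chr(cp), "")
        if name:
            counts[name.split()[0]] += 1
    return counts

# Hypothetical subset mixing Latin and CJK, like the ones mentioned above.
sample = [0x41, 0x62, 0x4E2D, 0x6587]
print(script_counts(sample))
```

A subset where both LATIN and CJK show up with non-trivial counts is the mixed case described above.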

I've got training up and running on Ubuntu. I modified the text file you 
gave me, just adding some missing ligatures (ff, ffi, ffl), but my 
asc.traineddata is way worse than yours.

*Do you have a list of fonts you used to create asc.traineddata that I 
could start with*? For example, I think my fonts are missing the old ASCII 
drawing blocks that you include, which work great on the fonts that use 
them (usually for bullets).
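
To check whether a candidate font would fill those gaps, something like the following could work. It is a minimal sketch that assumes the "drawing blocks" correspond to the Unicode Box Drawing and Block Elements ranges, and that the font's code points are already available as a list of integers. The names `RANGES` and `coverage` are made up for illustration.

```python
# Unicode ranges of interest (inclusive). Treating the "drawing blocks"
# as Box Drawing plus Block Elements is an assumption.
RANGES = {
    "latin_ligatures": (0xFB00, 0xFB04),  # ff, fi, fl, ffi, ffl
    "box_drawing": (0x2500, 0x257F),
    "block_elements": (0x2580, 0x259F),
}

def coverage(codepoints):
    """Return how many code points of each range the font covers."""
    cps = set(codepoints)
    return {
        name: sum(1 for cp in range(lo, hi + 1) if cp in cps)
        for name, (lo, hi) in RANGES.items()
    }

# Hypothetical font covering 'ff', one box-drawing glyph, and a full block.
print(coverage([0x41, 0xFB00, 0x2500, 0x2588]))
```

A font that reports zero for a range would not contribute any training samples for those glyphs.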

