Hi, I'm working on a project that uses OCR to detect the labeled faces of a box (e.g. Top, Bottom, Left, etc.). I've gotten the OCR working, but the results leave something to be desired. I control the font size printed on the box, and there are only six words I'm looking for.

So my first question: is there a way to tell Tesseract to ignore text below a certain size, to help filter out the noise? That way I could print the labels in a decently large font, and anything smaller would simply be ignored.

Second, I tried setting up an eng.user-words file to create a small dictionary of my own, but it didn't appear to have any effect. Is there a guide for setting that up, so that Tesseract looks only for the six words I'm using and ignores everything else?
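To make the goal concrete, here is a minimal sketch of the filtering I would otherwise do by hand on the OCR output. It does not call Tesseract at all. It assumes the recognized words and their bounding-box heights have already been extracted somehow; the OcrWord type, the filter_words function, and the 30-pixel threshold are all hypothetical names and values made up for illustration. Only the three labels named above are listed; the other three would need to be added.

```python
from typing import List, NamedTuple


class OcrWord(NamedTuple):
    """Hypothetical container for one recognized word (not a Tesseract type)."""
    text: str
    height: int  # bounding-box height in pixels


# Assumption: only three of the six labels are named in this post,
# so the remaining three must be added to this set.
LABELS = {"top", "bottom", "left"}


def filter_words(words: List[OcrWord], min_height: int) -> List[str]:
    """Keep words that are tall enough AND match one of the expected labels."""
    kept = []
    for word in words:
        cleaned = word.text.strip().lower()
        if word.height >= min_height and cleaned in LABELS:
            kept.append(cleaned)
    return kept


if __name__ == "__main__":
    # Made-up sample data, purely illustrative.
    sample = [
        OcrWord("Top", 48),      # large label: kept
        OcrWord("fragile", 12),  # small, not a label: dropped
        OcrWord("Left", 10),     # a label, but too small: dropped
        OcrWord("Bottom", 50),   # large label: kept
    ]
    print(filter_words(sample, min_height=30))
```

Ideally Tesseract would do both of these steps itself, which is what I'm asking about; the sketch only shows the behavior I'm hoping for.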

