I've been using PSM_SINGLE_CHAR as the PageSegMode, and I could not get any choices from ChoiceIterator.
I worked around this by changing one line in textord.cpp (in release 724 this is line 336) from

    if (PSM_LINE_FIND_ENABLED(pageseg_mode))

to

    if (PSM_LINE_FIND_ENABLED(pageseg_mode) || pageseg_mode == PSM_SINGLE_CHAR)

I don't know whether this has any side effects, but so far it is working for me. If this is not a bug, it might be worth introducing a variable that enables this behavior.

Without this change I was having big problems: when trying to recognize a single character, I was getting only the recognition of the inner contour of the Q (see my example below). Now I get both readings, and I can decide between them based on the coordinates that GetBoxText() returns. These are my two results, for clarification:

Reading 1: https://docs.google.com/file/d/0BxkuvS_LuBAzYm9IUDVKVDJPaUk/edit?usp=sharing
Reading 2: https://docs.google.com/file/d/0BxkuvS_LuBAzVkdBaHRmNWtYMW8/edit?usp=sharing

By the way, I found a few variables that were very useful for debugging this:

    tessedit_dump_choices
    tessedit_debug_quality_metrics
    tessedit_debug_doc_rejection

Finally, I have two questions for the list:

1) I would have expected more results in the results iterator for characters like "Q", which look very similar to an "O". Is there a way to increase the number of choices?

2) I trained for a specific font (FE Schrift) by printing it on paper and then scanning it. But as this project is for capturing with a camera, I now need to improve the training with character images from real captures, which turn out to look different. Should I use a separate TIFF page for those, as if they were a different font? Or can I include them in the same single page?

This follows up on my previous post: https://groups.google.com/forum/?fromgroups#!topic/tesseract-ocr/et7bS5QRf2o

Thanks,
Andres Hurtis - www.visiondepatentes.com.ar - sorry for this, I'm in need of SEO :)
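
P.S. As an illustration of deciding between two readings from their GetBoxText() coordinates, here is a minimal sketch. It assumes each reading is box-file text with one line per character in the form "symbol left bottom right top page", and it assumes the rule "the largest bounding box is the outer contour, so it wins". That rule, the CharBox struct and the helper names ParseBoxLine and PickLargestBox are all hypothetical, chosen for the example only; none of them are part of the Tesseract API.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One character entry from box-file text: "symbol left bottom right top page".
struct CharBox {
    std::string symbol;
    int left = 0, bottom = 0, right = 0, top = 0, page = 0;

    // Bounding-box area in pixels.
    long Area() const {
        return static_cast<long>(right - left) * static_cast<long>(top - bottom);
    }
};

// Parses a single box line. Returns false if the line is malformed.
bool ParseBoxLine(const std::string& line, CharBox* out) {
    std::istringstream in(line);
    CharBox box;
    if (in >> box.symbol >> box.left >> box.bottom >> box.right >> box.top >> box.page) {
        *out = box;
        return true;
    }
    return false;
}

// Scans every line of every reading and keeps the box with the largest area.
// ASSUMPTION: the outer contour of the character yields the largest box.
// Returns false if no valid box line was found.
bool PickLargestBox(const std::vector<std::string>& readings, CharBox* best) {
    bool found = false;
    for (const std::string& text : readings) {
        std::istringstream lines(text);
        std::string line;
        while (std::getline(lines, line)) {
            CharBox box;
            if (!ParseBoxLine(line, &box)) continue;  // skip malformed lines
            if (!found || box.Area() > best->Area()) {
                *best = box;
                found = true;
            }
        }
    }
    return found;
}
```

With two made-up readings such as "O 14 18 30 40 0" (inner contour) and "Q 10 10 42 52 0" (whole character), PickLargestBox selects the "Q" entry because its box is larger. A real application would likely combine this with the classifier confidence rather than rely on area alone.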

