Hi Muhammad, sorry for not replying sooner. I wonder whether Tesseract is trying to apply binarisation to the image which you've already binarised, and is making things worse as a result. You can see what Tesseract's binarised version looks like with the configuration variable 'tessedit_write_images' - see the PoorQuality[0] page on the wiki for more details on using it.
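In case it's useful, here is a rough sketch of one way to switch that variable on from a script. It assumes your Tesseract build accepts `-c VAR=VALUE` on the command line (older builds may need the variable in a config file instead). The helper name `debug_image_command` and the file names are just placeholders for illustration.

```python
def debug_image_command(image_path, output_base="out"):
    """Build a Tesseract command line that also writes out the image it processes.

    Assumption: this Tesseract build supports '-c VAR=VALUE'.
    """
    return [
        "tesseract", image_path, output_base,
        "-c", "tessedit_write_images=true",
    ]


if __name__ == "__main__":
    # Print the command rather than running it, so it can be checked first.
    print(" ".join(debug_image_command("page.png")))
```

Passing the resulting list to `subprocess.run` would run the OCR as usual. I believe the debug image normally lands in the working directory as `tessinput.tif`, but check your own version's behaviour.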
If the quality is indeed reduced by Tesseract poorly re-binarising, there may be a way to stop it doing that (I seem to recall someone mentioning it...). Either way, 'tessedit_write_images' lets you check what Tesseract is actually using.

If the final binarised version looks fine, check that the lines are being detected properly (which will be less reliable if the image is skewed). The easiest way to do that would be to just check the hOCR output. If lines and characters look like they're being correctly determined, but the characters are just being recognised incorrectly, try disabling the dictionaries.

I doubt that retraining for the monospace font would be worthwhile, as it looks like you're working with pretty ordinary fonts, which Tesseract ought to do a decent job on.

Let us know how you get on, and do ask any more questions you have. Apologies again for being so slow to respond.

Nick

0. https://code.google.com/p/tesseract-ocr/wiki/PoorQuality
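
P.S. For checking the hOCR output, a minimal sketch along these lines might save some squinting at raw markup. It assumes the output uses the usual `ocr_line` and `ocrx_word` classes with a `bbox` entry in the `title` attribute. The names `HocrLines` and `parse_bbox` are my own, not part of Tesseract.

```python
from html.parser import HTMLParser


def parse_bbox(title):
    """Return (x0, y0, x1, y1) from an hOCR title attribute, or None."""
    for field in title.split(";"):
        parts = field.split()
        if parts and parts[0] == "bbox":
            return tuple(int(p) for p in parts[1:5])
    return None


class HocrLines(HTMLParser):
    """Collect (bbox, words) for every ocr_line element in an hOCR document."""

    def __init__(self):
        super().__init__()
        self.lines = []        # list of (bbox, [word, ...])
        self._word_depth = 0   # > 0 while inside an ocrx_word element

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        classes = (attrs.get("class") or "").split()
        if self._word_depth:
            self._word_depth += 1  # nested markup such as <strong> in a word
        elif "ocr_line" in classes:
            self.lines.append((parse_bbox(attrs.get("title") or ""), []))
        elif "ocrx_word" in classes and self.lines:
            self._word_depth = 1
            self.lines[-1][1].append("")

    def handle_endtag(self, tag):
        if self._word_depth:
            self._word_depth -= 1

    def handle_data(self, data):
        if self._word_depth:
            words = self.lines[-1][1]
            words[-1] += data


def summarise(hocr_text):
    """Parse hOCR text and return the detected lines with their words."""
    parser = HocrLines()
    parser.feed(hocr_text)
    return parser.lines
```

Feed it the contents of the file produced by running Tesseract with the `hocr` config (the extension varies between versions), then print each line's bounding box and words. Lines that are merged, split, or missing should stand out quickly.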

