Hi, I'm trying to convert some old C code I only have printouts of back to source. I expected to have to do a little editing, but Tesseract is having serious problems.
I scanned the pages at 800 DPI and the images look clean. I also tried some of the ImageMagick cleanup scripts; the result looks a bit cleaner on screen, but it did not improve the OCR accuracy. Searches on this topic yield loads of references on how to link the Tesseract libraries into your own C program, but nothing about actually processing C code. I have tried adding user words for things like fprintf and for common variable names in the code, but it does not help (although I'm not entirely convinced I did it right).

Does anyone have any advice? Should it work OK? Maybe it's the proportionally spaced Times Roman font the code is printed in that's causing problems.

Thanks,
Stuart

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en

