OCR of C code

Stuart Wed, 11 Sep 2013 23:45:36 -0700

Hi,

I'm trying to convert some old C code I only have printouts of back to 
source. I expected to have to do a little editing, but Tesseract is having 
serious problems.


I scanned the pages at 800 DPI and the images look clean. I also tried some
of the ImageMagick scripts to clean them up; the result looks a bit cleaner
on screen, but it did not improve the OCR accuracy.

Searches on this topic yield loads of references on how to link the
Tesseract libraries into your own C program, but nothing about actually
running OCR on C code.

I have tried adding user words for things like fprintf, etc., and common
variable names in the code, but it does not help (although I'm not entirely
convinced I did it right).
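
For illustration, a word list of this kind can be generated with a short
script. This is only a sketch: the output name eng.user-words, the helper
names build_user_words and write_user_words, and the choice of words are
all assumptions, and Tesseract still has to be configured separately to
load the file.

```python
import re

# C keywords plus a few common library names; extend with names from the listing.
C_WORDS = [
    "int", "char", "void", "struct", "typedef", "return", "static",
    "unsigned", "sizeof", "while", "switch", "fprintf", "printf",
    "sprintf", "malloc", "NULL", "stdio", "stdlib",
]

# Matches C-style identifiers: a letter or underscore, then letters, digits, underscores.
IDENT = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_user_words(words, extra_text=""):
    """Return a sorted, de-duplicated list of the given words plus any
    identifier-like tokens found in extra_text."""
    found = set(words)
    found.update(IDENT.findall(extra_text))
    return sorted(found)


def write_user_words(path, words):
    """Write one word per line as plain text."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write("\n".join(words) + "\n")


if __name__ == "__main__":
    # Hypothetical snippet of already-corrected source to harvest names from.
    sample = "for (i = 0; i < argc; i++) fprintf(stderr, buf);"
    write_user_words("eng.user-words", build_user_words(C_WORDS, sample))
```

Harvesting identifiers from pages that have already been corrected by hand
keeps the list close to the names that actually occur in the printout.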

Does anyone have any advice?

Should it work OK? Maybe it's the proportionally spaced Times Roman font
it's in that's causing the problems.

Thanks,

Stuart

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en
