Try downscaling to about 300-400 DPI. Check the documentation for the ideal
character height. I think a resolution as high as 800 DPI is out of range.
Sven
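
To make the downscaling suggestion concrete, here is a minimal sketch of the arithmetic, not a tested recipe for this particular scan. The page size (US Letter) and the 10-point font are assumptions chosen for illustration, and the function names are hypothetical. It computes the resize factor for going from 800 DPI to a target DPI, and the nominal font height in pixels at each resolution.

```python
def downscale_plan(width_px, height_px, src_dpi=800, dst_dpi=300):
    """Return (scale, new_width, new_height) for resampling a scan to dst_dpi."""
    if src_dpi <= 0 or dst_dpi <= 0:
        raise ValueError("DPI values must be positive")
    scale = dst_dpi / src_dpi
    return scale, round(width_px * scale), round(height_px * scale)


def em_height_px(point_size, dpi):
    """Nominal font (em) height in pixels; one point is 1/72 inch."""
    return point_size / 72 * dpi


if __name__ == "__main__":
    # Assumed page: US Letter (8.5 x 11 in) scanned at 800 DPI.
    scale, w, h = downscale_plan(6800, 8800, src_dpi=800, dst_dpi=300)
    print(f"resize to {scale:.1%}: {w} x {h} px")

    # Assumed 10-point font; the real size on the printout is unknown.
    for dpi in (800, 300):
        print(f"{dpi} DPI: about {em_height_px(10, dpi):.0f} px em height")
```

The resulting percentage can be given to whatever image tool does the resampling. Compare the pixel height against the character height the Tesseract documentation recommends before settling on a target DPI.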

On Wednesday, September 11, 2013, Stuart wrote:

> Hi,
>
> I'm trying to convert some old C code I only have printouts of back to
> source. I expected to have to do a little editing, but Tesseract is having
> serious problems.
>
> I scanned the images in at 800 DPI. They look clean, and I tried some of the
> ImageMagick scripts to clean them up. The result looks a bit cleaner on
> screen but did not help the OCR accuracy.
>
> Searches on this topic yield loads of references on how to link the Tesseract
> libraries into your own C code, but nothing about actually processing C code.
>
> I have tried adding user words for things like fprintf etc... and common
> variable names in the code, but it does not help (although I'm not entirely
> convinced I did it right).
>
> Does anyone have any advice ?
>
> Should it work OK? Maybe it's the proportionally spaced Times Roman font
> it's in that's causing problems.
>
> Thanks,
>
> Stuart


-- 
``All that is gold does not glitter,
  not all those who wander are lost;
the old that is strong does not wither,
  deep roots are not reached by the frost.
From the ashes a fire shall be woken,
  a light from the shadows shall spring;
renewed shall be blade that was broken,
  the crownless again shall be king.”

-- 
-- 
You received this message because you are subscribed to the Google
Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to
[email protected]
For more options, visit this group at
http://groups.google.com/group/tesseract-ocr?hl=en

--- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.
