Try downscaling to about 300-400 DPI. Check the documentation for the ideal character height. I think a resolution as high as 800 DPI is outside the range Tesseract handles well.

Sven
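To put rough numbers on the downscaling suggestion, here is a small sketch of the arithmetic. The 10 pt font size and the US Letter page (8.5 x 11 in) are assumptions for illustration only; neither is stated in the thread, and the function names are made up for this example. It does not call Tesseract or any image library. It only works out the resize factor and an approximate character height in pixels before and after downscaling.

```python
from __future__ import annotations


def scale_factor(scan_dpi: float, target_dpi: float) -> float:
    """Resize factor that takes an image scanned at scan_dpi down to target_dpi."""
    if scan_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    return target_dpi / scan_dpi


def scaled_size(width_px: int, height_px: int, factor: float) -> tuple[int, int]:
    """Pixel dimensions after applying the resize factor (at least 1 px each)."""
    return max(1, round(width_px * factor)), max(1, round(height_px * factor))


def glyph_height_px(point_size: float, dpi: float) -> float:
    """Approximate height of the font's em box in pixels (1 pt = 1/72 inch)."""
    return point_size / 72.0 * dpi


if __name__ == "__main__":
    scan_dpi, target_dpi = 800, 300      # 800 from the thread; 300 is the low end of the suggestion
    page_px = (int(8.5 * scan_dpi), 11 * scan_dpi)  # assumed US Letter page
    factor = scale_factor(scan_dpi, target_dpi)

    print("resize factor:", factor)
    print("page size before:", page_px, "after:", scaled_size(*page_px, factor))
    # Assumed 10 pt body text; the real printout may differ.
    print("em height before: %.1f px" % glyph_height_px(10, scan_dpi))
    print("em height after:  %.1f px" % glyph_height_px(10, target_dpi))
```

The resulting factor can be passed as a percentage to whatever resize tool is already in use; compare the "after" character height with the figure the Tesseract documentation recommends.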
On Wednesday, September 11, 2013, Stuart wrote:
> Hi,
>
> I'm trying to convert some old C code I only have printouts of back to
> source. I expected to have to do a little editing, but Tesseract is having
> serious problems.
>
> I scanned the images in at 800 DPI. They look clean, and I tried some of the
> ImageMagick scripts to clean them up; the result looks a bit cleaner on the
> screen but did not help the OCR accuracy.
>
> Searches on this topic yield loads of references on how to link Tesseract
> libraries into your own C, but nothing about actually processing C code.
>
> I have tried adding user words for things like fprintf etc. and common
> variable names in the code, but it does not help (although I'm not entirely
> convinced I did it right).
>
> Does anyone have any advice?
>
> Should it work OK? Maybe it's the proportionally spaced Times Roman font
> it's in that's causing problems.
>
> Thanks,
>
> Stuart

--
"All that is gold does not glitter,
not all those who wander are lost;
the old that is strong does not wither,
deep roots are not reached by the frost.
From the ashes a fire shall be woken,
a light from the shadows shall spring;
renewed shall be blade that was broken,
the crownless again shall be king."

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/tesseract-ocr?hl=en
---
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/groups/opt_out.

