Hi,

I played with textcleaner:
http://www.fmwconcepts.com/imagemagick/textcleaner/

These are the options I used:

textcleaner -g -e stretch -f 25 -o 10 -u -s 1 -T -p 10 -t 80 page_0003.jpg page_0003_clean.jpg

The "-t 80" option:

-t .... threshold ....... text smoothing threshold; 0<=threshold<=100;
......................... nominal value is about 50; default is no smoothing

thins the lines enough to make a difference in the run-together characters for tesseract. I tried several settings from 50 to 100, and 80 was the best for me.

It's still only about 75%, way below what tesseract handles on the normal text I have, but it's going to work out.

Thanks for the help,
Stuart

On Thursday, September 12, 2013 10:18:44 PM UTC-4, rkomar wrote:
>
> On Thu, 12 Sep 2013, Stuart wrote:
>
> > Automatically subdividing each image into character cells
> > and OCR'ing each character separately seems like the only
> > way out of this. I am experimenting with makebox to define
> > the boxes first.
>
> Argh! When I read "proportional font" I thought
> "monospace font", assuming that that was what the code
> had been printed in. That was why I suggested creating
> the character cells, because it would be easy then.
> I'm not sure it's worth trying to figure out where
> the bounds of each character are, in your case.
> Sorry, for reading the problem incorrectly.
>
> Rob
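
The sweep over "-t" values from 50 to 100 mentioned in Stuart's message could be scripted roughly as below. This is only a sketch, not the commands that were actually run: the helper names, the step of 10, and the output filename pattern are hypothetical, and it assumes the textcleaner script is installed under that name. It only prints the command lines; the other flags are copied unchanged from the command in the post.

```python
import shlex

# Flags copied from the command in the post; only -t is varied.
BASE_FLAGS = ["-g", "-e", "stretch", "-f", "25", "-o", "10",
              "-u", "-s", "1", "-T", "-p", "10"]


def build_cmd(src, dst, threshold):
    """Return the textcleaner argument list for one smoothing threshold."""
    # The textcleaner usage text quoted in the post gives 0<=threshold<=100.
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be within 0..100")
    return ["textcleaner", *BASE_FLAGS, "-t", str(threshold), src, dst]


def sweep(src, thresholds):
    """Build one command per threshold (hypothetical output naming)."""
    stem = src.rsplit(".", 1)[0]
    return [build_cmd(src, f"{stem}_t{t}.jpg", t) for t in thresholds]


if __name__ == "__main__":
    # 50, 60, ..., 100: the range tried in the post, with an assumed step.
    for cmd in sweep("page_0003.jpg", range(50, 101, 10)):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell, and the resulting images fed to tesseract to compare results per threshold.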

