On Thu, Jan 15, 2009 at 7:46 PM, disciple <[email protected]> wrote:
>
> That sounds very much like this:
> http://groups.google.com/group/tesseract-ocr/browse_thread/thread/bc687b07cac549ed?hl=en

Oh gosh, night and day difference. I am currently using the following
Ghostscript command to convert the PDF to TIFF:

  gs -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -dDOINTERPOLATE -dSAFER -dBATCH -dNOPAUSE \
     -sDEVICE=tiff24nc \
     -sOutputFile=out/page01_%d.tif \
     -r1024 \
     Statement.pdf

But that produces a color TIFF (tiff24nc is the 24-bit RGB device).
Converting it to grayscale in GIMP made the difference, and it's now
recognizing the 1s that gocr missed.

If I figure out a better Ghostscript command, I will post it for
archival purposes.

-- 
Fedora 9 : sulphur is good for the skin
( www.pembo13.com )
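
For the archives, an untested sketch of how the GIMP step might be skipped:
Ghostscript also ships a tiffgray device (8-bit grayscale TIFF), so swapping
it in for tiff24nc should write grayscale pages directly. The function name
pdf_to_gray_tiff and its two arguments are made up for illustration; the
other flags are simply carried over from the command above, and whether the
result OCRs as well as the GIMP-converted file is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post):
#   pdf_to_gray_tiff INPUT.pdf OUTDIR
# Renders each page of INPUT.pdf to an 8-bit grayscale TIFF in OUTDIR.
pdf_to_gray_tiff() {
    input=$1
    outdir=$2
    mkdir -p "$outdir"
    # Same flags as the color command; only the device changes
    # from tiff24nc (24-bit RGB) to tiffgray (8-bit gray).
    gs -dGraphicsAlphaBits=4 -dTextAlphaBits=4 -dDOINTERPOLATE \
       -dSAFER -dBATCH -dNOPAUSE \
       -sDEVICE=tiffgray \
       -sOutputFile="$outdir/page01_%d.tif" \
       -r1024 \
       "$input"
}
```

Usage would be something like: pdf_to_gray_tiff Statement.pdf out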

