There's also ocropus, which can act as a pre-processor for tesseract-ocr. I think it's still in alpha, but it may be worth a look. It is, of course, another option besides unpaper. Earlier articles on FOSS OCR with tesseract-ocr suggested cleaning up the scans with gimp or ImageMagick first to get better results.
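For what it's worth, here is a rough sketch of how the cleanup-then-OCR chain could be scripted. It is only an illustration under a few assumptions: that ImageMagick's `convert`, `unpaper`, and `tesseract` are all on the PATH, that your tesseract build can read PNM files (older builds may want TIFF instead), and that the default settings of each tool are good enough. The function names and the file naming scheme are made up for this example.

```python
import subprocess
from pathlib import Path


def build_ocr_commands(scan, workdir="."):
    """Return the command lines for: grayscale -> unpaper -> tesseract.

    Hypothetical helper; the intermediate file names are an assumption.
    """
    scan = Path(scan)
    work = Path(workdir)
    gray = work / (scan.stem + "-gray.pgm")    # grayscale copy for unpaper
    clean = work / (scan.stem + "-clean.pgm")  # cleaned-up page
    out_base = work / scan.stem                # tesseract appends ".txt" itself
    return [
        # ImageMagick: convert the scan to grayscale PGM
        ["convert", str(scan), "-colorspace", "Gray", str(gray)],
        # unpaper: clean up the page using its default settings
        ["unpaper", str(gray), str(clean)],
        # tesseract: OCR the cleaned page into <out_base>.txt
        ["tesseract", str(clean), str(out_base)],
    ]


def run_ocr(scan, workdir="."):
    """Run the three steps in order, stopping at the first failure."""
    for cmd in build_ocr_commands(scan, workdir):
        subprocess.run(cmd, check=True)
    return Path(workdir) / (Path(scan).stem + ".txt")
```

Calling `run_ocr("page.png")` should leave the recognized text in `page.txt` if all three tools succeed. On newer ImageMagick installs the first command may need to be `magick` rather than `convert`.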
On Thu, Jun 26, 2008 at 9:17 AM, Orlando Andico <[EMAIL PROTECTED]> wrote:
> Actually Eric the main thing I got from the article was the
> pre-processing with "unpaper."
>
> When I tested Tesseract before, I didn't know about "unpaper" and got
> poor results.

--
eric pareja ([EMAIL PROTECTED])
"The world is a book, and those who do not travel read only one page."
わかよたれぞ つねならむ
_________________________________________________
Philippine Linux Users' Group (PLUG) Mailing List
http://lists.linux.org.ph/mailman/listinfo/plug
Searchable Archives: http://archives.free.net.ph

