Tesseract is an example of what I meant by "it won't be good enough". It's the source code for a command-line tool, not a ready-to-use application, and it does only text analysis, not layout analysis. Layout analysis is also crucial for making the text selectable. And it certainly does not output PDF. So you're still (very) far from having selectable PDFs, which is what Noam is asking for. Unfortunately.
Christiaan

On 14 Jan 2009, at 3:17 AM, Mahn-Soo Choi wrote:

> There is a free OCR engine, which they say would possibly be running
> on Mac OS X:
>
> http://code.google.com/p/tesseract-ocr/
>
> The quality is quite "good" for my taste; I know this because I'm
> using it from time to time
> (it is the core OCR engine of a commercial software PDFpen costing
> about 50 USD).
> (* Note also that PDFpen has a serious problem when OCR a big PDF
> file, more than 100 pages. *)
>
> Once I tried briefly the Tesseract engine itself. It compiled on my
> Mac OS X (10.5.5 back then)
> with no problem, but unfortunately, the resulting program didn't work.
> It may require a bit of code hacking to make it run on Mac.
>
> mahn-soo
>
>
> On Jan 14, 2009, at 7:12 AM, Noam A. Osband wrote:
>
>> So, a common problem I have with Skim is that I can't highlight or
>> underline text in a file. This happens with scanned files,
>> apparently because the letters come up as an image and not text. An
>> OCR program can fix this. they are expensive. Anyone know a good one
>> for free for a Mac?
>>
>> thanks!

_______________________________________________
Skim-app-users mailing list
https://lists.sourceforge.net/lists/listinfo/skim-app-users
