On Sep 21, 2009, at 8:08 AM, engel wrote:

> IMHO a good typist is far faster then a good OCR software ...

There's no way Google could have compiled the collection of digital books they have amassed using teams of typists. The cost alone would have made that infeasible, not to mention the time. The OCR software (and, more importantly, hardware) that Google is using to scan books is obviously many orders of magnitude faster and more accurate than human typists. Unfortunately, it's unlikely any of us home users will ever be able to afford a Google OCR scan station, with optimized lighting, stereo cameras, automated page turning, etc. The reality remains that for home or small business users, hiring a good typist is still the optimal solution for most OCR scanning jobs.

As an interesting aside, even Google's OCR scanning is inaccurate. Google recently purchased the company reCAPTCHA. reCAPTCHA takes its word images from scanned print materials. Every time people solve a captcha on a website, they are also, as a byproduct, helping to turn scanned words into plain text that can be indexed and made searchable by search engines. That's why these wavy word captchas always have TWO words: one is known, one is unknown. After a certain small number of captcha results agree, the unknown word becomes statistically "known", and the domino effect starts to roll through a list of unknown words of any size.

I have a little experience with small scale home OCR, and the time needed to correct the relatively small number of errors was large. I agree a good typist would win a contest for speed alone.
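
For anyone curious how the two-word trick could work in practice, here is a rough Python sketch. It is only my illustration of the idea described above, not reCAPTCHA's actual code: the class name, the image ids, and the agreement threshold of 3 are all made up for the example, since the real "certain small number" is not stated here.

```python
from collections import Counter, defaultdict

# Hypothetical value: how many matching answers promote an unknown word.
AGREEMENT_THRESHOLD = 3


def normalize(text):
    """Compare answers case-insensitively and ignore stray whitespace."""
    return text.strip().lower()


class WordPool:
    """Toy model of a two-word captcha: one control word, one unknown word."""

    def __init__(self, known):
        # image id -> verified text for words that are already "known"
        self.known = {img: normalize(word) for img, word in known.items()}
        # image id -> tally of guesses for words that are still unknown
        self.votes = defaultdict(Counter)

    def submit(self, control_id, control_answer, unknown_id, unknown_answer):
        """Return True if the user passed the captcha.

        The user is judged only on the control word. If they pass, their
        answer for the unknown word is counted as one vote.
        """
        if normalize(control_answer) != self.known[control_id]:
            return False  # failed the control word, so discard the guess

        if unknown_id in self.known:
            return True  # already resolved, nothing more to learn

        guess = normalize(unknown_answer)
        self.votes[unknown_id][guess] += 1

        # Enough independent users agree: treat the word as known from now
        # on, so it can serve as a control word for other unknown words.
        if self.votes[unknown_id][guess] >= AGREEMENT_THRESHOLD:
            self.known[unknown_id] = guess
            del self.votes[unknown_id]
        return True
```

Once a word is promoted into `known`, it can be paired with the next unknown word, which is the domino effect mentioned above.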
