On Mar 31, 2013, at 12:22 PM, Karen Coyle <[email protected]> wrote:
> Nearly every digital library has best practices for scanning, as do many
> library organizations. Just plug "digital library scanning" into any
> search engine and you'll have more than you want to know.
>
> Unless you have the appropriate equipment, including OCR software, your
> digital scan will not be terribly usable. You haven't said what you
> would be scanning, but book scanning requires special software to
> correct the curvature of the pages and to keep the images in focus. It's
> not really a DIY operation, unless you are doing it only for your own
> use. OCR is essential, although it can be a separate step.

If you upload raw images without OCR data to archive.org, IA will OCR your material for you. To take advantage of the archive.org text processing, upload your non-ocr'ed images in the format described here:

http://raj.blog.archive.org/2011/02/24/new-upload-format-_images-zip-for-scribe-style-uploads/

Also worth noting, openlibrary.org is a project of the Internet Archive (archive.org). Open Library only holds metadata about books, not the actual scanned pages. When you see a book (pdf/epub/read online) linked to from OL, the raw data is usually hosted on archive.org.

-raj

>
> And that's about all I know.
>
> kc
>
> On 3/31/13 11:17 AM, [email protected] wrote:
>> On Sunday, March 31, 2013 11:06:13 AM you wrote:
>> >>> Having said that, the digitization is the hard part (at least to do it
>> >>> right), not the storage. You can store it on one and move it to the other
>> >>> if the first fails, store it on both, store it additional places besides
>> >>> these two, etc.
>>
>> K. Am I best to scan them as PDF?
>>
>> Of course they would be images, and I know it would be best to OCR for
>> some kind of underlying text layer, but I doubt I have the tools for
>> that in Linux. Any suggestions?
>>
>> (I tried to post this to -discussions, but it was rejected)
>>
>> _______________________________________________
>> Ol-tech mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>> To unsubscribe from this mailing list, send email to
>> [email protected]
>
> --
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
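
For anyone who wants to script the packaging step on Linux, here is a rough Python sketch of bundling a folder of scanned page images into a single zip for upload. The blog post linked above is the authority on the layout. The only detail taken from this thread is the "_images.zip" suffix that appears in the post's URL. Everything else is an assumption made for illustration: the folder inside the zip, the zero-padded page names, the list of accepted image extensions, and the function name build_images_zip. Check all of it against the post before uploading anything.

```python
import zipfile
from pathlib import Path

# Assumption: which image types are accepted is defined by the linked blog
# post, not by this list. Adjust it to match what the post says.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".jp2", ".tif", ".tiff", ".png"}


def build_images_zip(scan_dir, identifier, out_dir="."):
    """Bundle page images from scan_dir into <identifier>_images.zip.

    Hypothetical helper: pages are taken in sorted filename order and
    renamed to <identifier>_0000.<ext>, <identifier>_0001.<ext>, ...
    inside an <identifier>_images/ folder. That naming is an assumption,
    not something confirmed in this thread.
    """
    scan_dir = Path(scan_dir)
    pages = sorted(
        p for p in scan_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    if not pages:
        raise ValueError(f"no page images found in {scan_dir}")

    zip_path = Path(out_dir) / f"{identifier}_images.zip"
    # Scanned images are already compressed, so store them without
    # recompressing.
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_STORED) as zf:
        for index, page in enumerate(pages):
            suffix = page.suffix.lower()
            arcname = f"{identifier}_images/{identifier}_{index:04d}{suffix}"
            zf.write(page, arcname)
    return zip_path
```

Calling build_images_zip("scans/", "mybook") would write mybook_images.zip in the current directory. Since pages are ordered by filename, name the scans so that they sort in reading order (page001.jpg, page002.jpg, and so on).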
