On Mar 31, 2013, at 12:22 PM, Karen Coyle <[email protected]> wrote:

> Nearly every digital library has best practices for scanning, as do many 
> library organizations. Just plug "digital library scanning" into any 
> search engine and you'll have more than you want to know.
> 
> Unless you have the appropriate equipment, including OCR software, your 
> digital scan will not be terribly usable. You haven't said what you 
> would be scanning, but book scanning requires special software to 
> correct the curvature of the pages and to keep the images in focus. It's 
> not really a DIY operation, unless you are doing it only for your own 
> use. OCR is essential, although it can be a separate step.

If you upload raw images without OCR data to archive.org, IA will OCR your 
material for you.

To take advantage of the archive.org text processing, upload your un-OCR'd 
images in the format described here:
http://raj.blog.archive.org/2011/02/24/new-upload-format-_images-zip-for-scribe-style-uploads/
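For what it's worth, here is a rough sketch of how you might bundle a folder 
of page scans into a single zip on Linux before uploading. It uses only the 
Python standard library. The function name, the "mybook" identifier, the 
folder-inside-the-zip layout, and the zero-padded page numbering are all my 
own assumptions for illustration, so check the post above for the naming and 
image formats the upload actually expects.

```python
import zipfile
from pathlib import Path

# Image extensions to include; an assumption, adjust to what your scanner produces.
IMAGE_EXTS = {".jpg", ".jpeg", ".jp2", ".tif", ".tiff", ".png"}


def build_images_zip(image_dir, identifier, out_dir="."):
    """Bundle page images into <identifier>_images.zip, in sorted filename order.

    The archive layout used here (a top-level <identifier>_images/ folder with
    sequentially numbered pages) is a hypothetical convention, not a confirmed
    requirement of archive.org.
    """
    image_dir = Path(image_dir)
    pages = sorted(
        p for p in image_dir.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTS
    )
    if not pages:
        raise ValueError(f"no page images found in {image_dir}")

    zip_path = Path(out_dir) / f"{identifier}_images.zip"
    folder = f"{identifier}_images"

    # ZIP_STORED: scans are already compressed, so skip recompression.
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_STORED) as zf:
        for number, page in enumerate(pages, start=1):
            arcname = f"{folder}/{identifier}_{number:04d}{page.suffix.lower()}"
            zf.write(page, arcname)
    return zip_path


if __name__ == "__main__":
    # Hypothetical paths: a "scans" folder of page images, item named "mybook".
    print(build_images_zip("scans", "mybook"))
```

Pages go into the zip in sorted filename order, so name your scans so that 
they sort in reading order (page001.jpg, page002.jpg, and so on).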

Also worth noting, openlibrary.org is a project of the Internet Archive 
(archive.org). Open Library only holds metadata about books, not the actual 
scanned pages. When you see a book (pdf/epub/read online) linked to from OL, 
the raw data is usually hosted on archive.org.

-raj


> 
> And that's about all I know.
> 
> kc
> 
> On 3/31/13 11:17 AM, [email protected] wrote:
>> On Sunday, March 31, 2013 11:06:13 AM you wrote:
>> 
>>> Having said that, the digitization is the hard part (at least to do it
>>> right), not the storage. You can store it on one and move it to the other
>>> if the first fails, store it on both, store it additional places besides
>>> these two, etc.
>> 
>> K. Am I best to scan them as PDF?
>> 
>> Of course they would be images, and I know it would be best to OCR for
>> some kind of underlying text layer, but I doubt I have the tools for
>> that in Linux. Any suggestions?
>> 
>> (I tried to post this to -discussions, but it was rejected)
>> 
>> 
>> 
>> _______________________________________________
>> Ol-tech mailing list
>> [email protected]
>> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
>> To unsubscribe from this mailing list, send email to 
>> [email protected]
>> 
> 
> -- 
> Karen Coyle
> [email protected] http://kcoyle.net
> ph: 1-510-540-7596
> m: 1-510-435-8234
> skype: kcoylenet
