Re: [ol-discuss] Multivolume works

Lars Aronsson Thu, 03 May 2012 16:17:13 -0700

On 2012-05-04 00:51, Karen Coyle wrote:
> The difficulty seems to arise in the process of scanning. For the
> purposes of scanning, each physical volume becomes a scanned file.


At any serious scale (e.g. Google or the Internet Archive), I think
book scanning needs to be organized as multiple workstations,
each taking its portion of a day's batch of books. That means
the 10 or 20 volumes of an encyclopedia will be scanned by
different people, each generating a job that goes through OCR
and postprocessing, so each volume needs its own metadata
record.

However, with Google I often find that volumes 2 and 5 are all
that has been scanned. And at the Internet Archive I sometimes
find that everything except volumes 2 and 7 has been scanned.
So there is more chaos than necessary.

When we're trying to use scanned books for reference and
for proofreading the text, we must hunt down individual parts
from different sources. The prime example must be the German
branch of Wikisource, which is trying to find all 143 parts of
the Weimar edition (1887-1919) of Goethe's collected works:
http://de.wikisource.org/wiki/Goethe#Sophien-_oder_Weimarer_Ausgabe_.28WA.29

Now, the structure shown on that wiki page is something that
should go into OpenLibrary.org, because it is open bibliographic
data (all of Wikisource is free and open).



-- 
   Lars Aronsson ([email protected])
   Project Runeberg - free Nordic literature - http://runeberg.org/


