On 10/27/12 12:06 PM, Ben Companjen wrote:
> Hi all,
>
> Since I received my e-book reader a couple of weeks ago, I have been
> looking at out-of-copyright books to load. The few books that I
> downloaded as EPUB from the OL / Internet Archive contain many OCR
> errors. Rather than correcting these by hand just for myself (as OL/IA
> doesn't provide an obvious way to let me upload a more correct
> version), I remembered that there is a web place where people gather
> to improve texts for e-book readers and re-discovered Project
> Gutenberg [1].
>
> Community members involved with Project Gutenberg produce e-book
> versions of out-of-copyright books, which can then be downloaded from
> the website. But whereas OL EPUBs can be linked to a specific edition,
> the PG EPUBs are mostly "reconstructed" from the text and harder to
> link to a paper edition.
>
> Hence my following questions:
> Do people agree that Project Gutenberg editions should be seen as
> separate editions?

Yes, definitely. I also think that a corrected OL edition should be 
stored separately from its original uncorrected OCR, because at some 
point it may be desirable to go back and see what was there before the 
correction. Ideally, there would be versioning and forking, much as 
there is for software.
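
To make the versioning idea concrete, here is a rough sketch of how a corrected text could keep a link back to the OCR it was derived from. Everything in it is an assumption for illustration only: `TextVersion`, `history`, the version ids, and the sample strings are hypothetical and do not describe how OL or IA actually store texts.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class TextVersion:
    """One stored state of a text (hypothetical model, not an OL schema)."""
    version_id: str
    text: str
    parent_id: Optional[str] = None  # None marks the original OCR output
    note: str = ""


def history(versions: Dict[str, TextVersion], version_id: str) -> List[TextVersion]:
    """Follow parent links from a version back to the original OCR."""
    chain: List[TextVersion] = []
    current: Optional[str] = version_id
    while current is not None:
        version = versions[current]
        chain.append(version)
        current = version.parent_id
    return chain


# Made-up sample data: one raw OCR text and two independent corrections of it.
ocr = TextVersion("v1", "Tbe Lady of tbe Lake", note="raw OCR")
fixed = TextVersion("v2", "The Lady of the Lake", parent_id="v1", note="hand-corrected")
fork = TextVersion("v3", "THE LADY OF THE LAKE", parent_id="v1", note="fork of the OCR")
store = {v.version_id: v for v in (ocr, fixed, fork)}

# The uncorrected OCR stays reachable from either correction.
original_of_fixed = history(store, "v2")[-1]
```

Because each correction points at its parent instead of overwriting it, two people can fork the same OCR without losing what was there before.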

> Do people agree the release date given by the project is the publish date?

The release date of the digital edition is a publish date, but I think 
that it isn't sufficient. If the text is derived from a physical book, 
then the date of the book is also needed. I also would like to see 
"original" dates where known -- that is the original publication date of 
the text. Otherwise, Moby Dick and Origin of Species end up being 
presented as 21st-century texts, which really messes up the cultural and 
scientific context.
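
As a sketch of what keeping all three dates might look like, the record below holds the digital release date, the date of the physical source edition, and the original publication date side by side. The class and field names are hypothetical, not OL's actual edition schema, and the digital release year in the example is a placeholder; only 1851 (the first publication of Moby Dick) is a real date.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class EditionDates:
    """Three distinct dates for one digital edition (hypothetical model)."""
    digital_release_year: int                        # when the e-text was released
    source_edition_year: Optional[int] = None        # physical book it was derived from
    original_publication_year: Optional[int] = None  # first publication of the text

    def display_year(self) -> int:
        """Prefer the earliest-context date that is actually known."""
        for year in (self.original_publication_year, self.source_edition_year):
            if year is not None:
                return year
        return self.digital_release_year


# Placeholder release year; the source edition is left unknown on purpose.
moby_dick = EditionDates(digital_release_year=2012, original_publication_year=1851)
born_digital = EditionDates(digital_release_year=2012)
```

With the original date recorded, `moby_dick.display_year()` yields 1851 instead of the release year, while a record with no earlier dates falls back to its digital release year.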

> Do people agree that there is some sense in PG editions' formats being
> something like "E-book" or "Electronic resource"?

They are electronic resources, but if they are plain text I have a hard 
time seeing them as "ebooks" -- to me, ebook implies something more 
structured than plain text. (Title pages, navigable chapters, etc.) I 
know not everyone sees it that way.


> Why are there only (19 | fewer than 19 | 281) of the 40000+ editions
> [2] in OL? These 19 seem to be linked to IA items, coming from
> "European libraries", although not all seem to be really published by
> PG (e.g. [3]). In the latest data dump, there are 281 editions with at
> least one PG identifier, but they are not listed under publisher PG.
> Are there people around who know about connecting or importing the PG
> catalogue?

I believe that the PG books are not in the OL/IA workflow for a reason, 
although I don't recall the reason. It may have to do with the 
availability of bibliographic data?

Note, though, that from what I understand there is no new development 
happening on OL at the moment and I don't know if it will be taken up 
again. There seems to be no staff dedicated to the project. So it's 
unlikely that any new data types will be added.

kc

> Are there other known publishers named Project Gutenberg?
>
> (Feel free to answer a subset of these questions :) )
>
> Ben
>
> [1] http://www.gutenberg.org
> [2] http://openlibrary.org/publishers/Project_Gutenberg
> [3] http://openlibrary.org/books/OL20478553M/The_Lady_of_the_Lake
> _______________________________________________
> Ol-discuss mailing list
> Ol-discuss@archive.org
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
> To unsubscribe from this mailing list, send email to 
> ol-discuss-unsubscr...@archive.org
>

-- 
Karen Coyle
kco...@kcoyle.net http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet