Quoting Lars Aronsson <[email protected]>:

> Were you at the OpenKnowledge conference Saturday April 24?
> I was not there, but apparently, this was the topic of some
> presentations there.

No, I wasn't.

>
> I got introduced to the OKFN bibliographic project by
> Tatiana de la O ten days earlier, at a Wikimedia meetup
> in Berlin. I got the impression that publicdomainworks.net
> was just one of several facets, and that the whole database
> behind that was very similar to OpenLibrary. I could be
> wrong about this. I didn't take notes, and can't remember
> what the other domain names were that I was shown.

It doesn't look very similar to OL to me, although they both have  
authors and titles. OL doesn't currently identify public domain works,  
but it does link to many digitized public domain works that are open  
access. In that sense, a link between the two projects would bring  
users closer to finding the works they are looking for.

Note that the Book Rights Registry that Google is creating in support  
of Book Search has a lot of overlap with OKFN and OL, in that it will  
identify the copyright status of books (which is not the same as the  
copyright status of works, and which I think is going to be a source  
of great confusion until we come up with shared terminology and shared  
definitions). OKFN seems to be the only one, however, looking at  
copyright status of resources other than books.

> We don't need multiple projects with exchange of
> data and a never-ending circulation of errors.
> We need one centralized project, with a focus on
> quality improvement.

Actually, I disagree about a centralized project. I think those days  
are past. We should now be able to interlink projects, which will  
allow more freedom and innovation, and will let different folks try  
out different approaches. By sharing data we save time and can help  
each other with quality issues. It would definitely be good to have a  
place where all of us working with bibliographic data can hash out  
issues, but I don't think that has developed yet.

>
> On www.openlibrary.org the first thing I see is the
> number of 24 million "books". You got to stop counting
> all these duplicate records. You must start to focus on
> quality instead of quantity. There aren't 24 million books.
> Maybe half of these are duplicate records. Have you
> got any idea how much junk you are carrying around?
>
> On the new "upstream.openlibrary.org" complete beginners
> are encouraged to add books, as if adding more books
> was needed. No, it's not. Removing duplicate records
> is what's needed. Adding birth years and other
> information to author records is also needed. Things
> that add quality, not quantity. What percentage of
> author records have anything more than the name?
> How do we increase that?

The author names come primarily from library catalogs, and there's a  
practice in libraries that makes sense to librarians but to no one  
else, AFAIK. The birth and death dates in library catalogs are used  
only when they are necessary to distinguish between two authors with  
the same name. So for every "Smith, John, 1906-" there is a "Smith,  
John" who was the first one entered into the catalog (and therefore no  
distinguishing date was needed). (However, I can find exceptions to  
this, as well, so it is very confusing.) I presume that library users  
haven't understood this (and why should they? it's not very logical  
from a user point of view), and probably figure that some names are  
without dates because the librarians didn't know them. This is just  
one of the things that divides libraries from their users.
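
To make the date convention concrete, here is a rough sketch in Python of how one might spot names that appear both with and without dates. This is purely illustrative and is not how OL processes author records: the heading pattern, the function names, and the sample headings are all assumptions, and real catalog headings vary far more than this pattern allows.

```python
import re
from collections import defaultdict

# Hypothetical pattern for headings such as "Smith, John, 1906-" or
# "Smith, John, 1850-1920". Real catalog headings are much messier.
HEADING = re.compile(
    r"^(?P<name>.+?)(?:,\s*(?P<birth>\d{4})-(?P<death>\d{4})?)?\s*$"
)


def parse_heading(heading):
    """Split a heading into (name, birth, death); dates are None if absent."""
    text = heading.strip()
    m = HEADING.match(text)
    if m is None:  # e.g. an empty string
        return text, None, None
    birth = int(m.group("birth")) if m.group("birth") else None
    death = int(m.group("death")) if m.group("death") else None
    return m.group("name"), birth, death


def ambiguous_names(headings):
    """Map each name to its date variants, keeping only names that
    occur with more than one distinct (birth, death) pair."""
    variants = defaultdict(set)
    for heading in headings:
        name, birth, death = parse_heading(heading)
        variants[name].add((birth, death))
    return {name: dates for name, dates in variants.items() if len(dates) > 1}


if __name__ == "__main__":
    sample = ["Smith, John", "Smith, John, 1906-", "Coyle, Karen"]
    for name, dates in ambiguous_names(sample).items():
        print(name, dates)
```

A name flagged this way is only a candidate for review. Under the cataloging practice described above, "Smith, John" and "Smith, John, 1906-" are most likely two different people, so the output should not be treated as a merge list.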

Once the new version of OL is available, the next step is to make it  
possible to merge author names, works, and editions. What merging has  
been done already is based on algorithms, and it appears that some  
data loads didn't get merged properly. Solving the quality issues is  
very much on the task list.

In terms of folks adding more books, it has been interesting to see  
what books have been added by individuals. I am hoping that there will  
be a way to identify those at some point -- because many are being  
added by authors outside of the US whose books often do not get much  
attention. Authors, of course, are highly motivated to make the  
existence of their books visible and willing to put in the effort.  
This particular aspect of the library is something I find both  
fascinating and encouraging.

kc

>
>
> --
>   Lars Aronsson ([email protected])
>   Aronsson Datateknik - http://aronsson.se
>
> _______________________________________________
> Ol-discuss mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-discuss
> To unsubscribe from this mailing list, send email to   
> [email protected]
>



-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234  
skype: kcoylenet

