On 2/2/12 4:24 AM, Ben Companjen wrote:

> I looked at the underlying type schema at http://openlibrary.org/type
> and it seems that there is no "contributor" type. There is
> /type/edition/contributions (an array of strings), which may be where
> contributors are stored when I manually enter translators and
> designers, though how exactly the role and contributor name would be
> stored in this string is unclear to me.

OL doesn't currently store roles, AFAIK. The data element for roles for 
authors was added later, but most of the incoming data doesn't include 
roles, and none of it includes roles for authors (only contributors).

> When looking at some MARC records for
> http://openlibrary.org/books/OL21322149M/Mies_van_der_Rohe (author
> "The Museum of Modern Art"), indeed MoMA is in the 110 field (I
> learned yesterday that field is for corporate entities) and the author
> is in the "by statement", shown by OL too, as "by Philip C. Johnson."
> I don't want to accuse anybody, but this leads me to think perhaps
> ImportBot doesn't know how to import this. Or maybe this record was
> imported (November 1, 2008) before the decision was made?

It looks to me like this record was imported correctly based on the 
algorithm. The "by" statement unfortunately is just free text so no data 
elements are taken from it. Johnson is included as an added author (a 
700 in the MARC record). A big problem with the library data format is 
that you don't know the relationship of the person listed in the 700 to 
the item being cataloged: it could be the author of a part (like a 
chapter or intro), it could be a co-author, it could be a conductor of a 
piece of music, etc. There are some serious issues with the library data 
as it exists today, and these limit what you can know from the metadata 
you receive.
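
To make that concrete, here is a rough sketch of the kind of extraction I mean. It is not the actual ImportBot code. The record is a hypothetical, heavily simplified stand-in (a plain dict from tag to strings, with heading values made up for illustration), and extract_names is an invented helper. The point is only that the 1XX and 7XX tags yield names, the "by" statement in the 245 is skipped, and nothing tells you what the 700 person did.

```python
# Hypothetical, simplified record: MARC tag -> list of field strings.
# Real MARC fields carry indicators and subfields; the values here are
# illustrative, not copied from the actual record.
record = {
    "110": ["Museum of Modern Art (New York, N.Y.)"],
    "245": ["Mies van der Rohe / by Philip C. Johnson."],
    "700": ["Johnson, Philip C."],
}

MAIN_ENTRY_TAGS = ("100", "110", "111")   # personal, corporate, meeting names
ADDED_ENTRY_TAGS = ("700", "710", "711")  # added entries of the same three kinds


def extract_names(rec):
    """Return (main_entries, added_entries) as lists of name strings.

    The 245 statement of responsibility is free text, so it is ignored.
    No role is returned for added entries because none is recorded here.
    """
    main = [name for tag in MAIN_ENTRY_TAGS for name in rec.get(tag, [])]
    added = [name for tag in ADDED_ENTRY_TAGS for name in rec.get(tag, [])]
    return main, added


main_entries, added_entries = extract_names(record)
```

With this input the corporate body ends up as the main entry and Johnson as an added entry with no stated relationship to the book, which matches what you see on the OL page.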


> Searching for authors with "museum" in the name yields 5608 results,
> many having over 100 works attached :)

Yes, these are from an input source that we now regret having imported. 
The input source used the MARC record format, but used it incorrectly. 
When it went through the normal processing, those errors followed through.

> The work type doesn't have a contributors/collaborators field at all.

No, it isn't supposed to. A work has creators. Contributors are 
associated with expressions and manifestations. This all comes from 
something called "FRBR".

You might want to start here:

http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records

and then move on to the original document,

http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records

FRBR itself has serious issues, IMO, but it's the model folks are trying 
to work with today for bibliographic data. OL does not follow FRBR 
faithfully; it combines expression and manifestation into the OL 
Edition. To see other sites using FRBR, I suggest having a look at 
LibraryThing (http://librarything.com).
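
As a rough illustration of how that plays out in the OL data, here is a minimal sketch. The class and field names are my own shorthand for this message, not the real OL type definitions (only "contributions" as a list of strings comes from the schema you looked at), and the sample values are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Work:
    """FRBR work level: only creators live here, no contributors."""
    title: str
    creators: List[str] = field(default_factory=list)


@dataclass
class Edition:
    """OL-style Edition: FRBR expression and manifestation in one record."""
    work: Work
    publisher: str  # transcribed string as printed, not an entity
    contributions: List[str] = field(default_factory=list)  # plain strings


# Hypothetical sample data, for illustration only.
work = Work(title="An Example Work", creators=["Example Author"])
edition = Edition(
    work=work,
    publisher="Example Press",
    contributions=["Example Translator"],
)
```

Translators, designers and the like would hang off the Edition, which is why the work type has no field for them.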


> From a linked data perspective it would be nice if corporate entities
> (including publishers, although not all publishers are corporate
> entities) are not just strings, but real entities.

Yes, it would be nice, but the data unfortunately doesn't always support 
it. The corporate entities that come in on 110/710 fields are entities 
in the library world and can be found in VIAF with identifiers. The 
publishers are NOT entities, but are a transcription of how the 
publisher or imprint name was presented on the title page of the book. 
The imprint name != publisher name, so connecting these is difficult. 
Edward Betts did some experimentation around this at one point, but the 
results were very fuzzy.


> I would only consider putting a corporate body in the author field if
> a human author is not mentioned in the book at all, which is very
> rarely the case.

Not so rare, actually. Most documents out of corporations and government 
bodies don't attribute the document to a human. But aside from that, 
much of the data in OL is based on rules used by Anglo-American 
libraries to make these decisions. Many of us could see logic in other 
ways of doing things.

The difficulty is getting enough consistency to do the merging between 
editions and works. That's one of the reasons why we put corporate 
bodies in collaborator -- how could you possibly explain when a 
corporate body could be an author in a way that folks could easily 
understand? The rules are over a thousand pages long, and if you want to 
delve into that, here's a zip file with the final draft:

http://www.archive.org/details/ResourceDescriptionAccessrdaDraftNov.2008

> VIAF could be useful for disambiguation, but there is no obvious way
> to enter a VIAF ID (or any other URI for a person) in an OL record.

This seems to me like a good feature request. Note that VIAF and 
Wikipedia have some mutual linking.


> Is
> this what http://openlibrary.org/type/author/uris is for? That could
> make Open Library less isolated in the LOD cloud [2]. I have added
> links to VIAF pages to a few authors, but they are probably in
> /type/author/links.
> It seems VIAF has at least 4 entries for the MoMA (New York), by the way.

VIAF takes data from about 20 different national library systems and 
clusters the headings for the same entity, where it can. A match on any 
entity in a cluster should be linked to the cluster ID. The MoMA example 
shows the difficulty of creating the clusters algorithmically. I am 
hoping that VIAF will eventually allow human merging of entries.
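
A minimal sketch of what "link to the cluster ID" could look like on the consuming side, assuming you already have the clusters in hand. The cluster IDs and headings below are made up for illustration and are not real VIAF data, and find_cluster is a hypothetical helper, not a VIAF API.

```python
# Hypothetical clusters: cluster ID -> headings contributed by different
# national libraries. IDs and headings are invented for this example.
clusters = {
    "cluster-A": {
        "Museum of Modern Art (New York, N.Y.)",
        "Museum of Modern Art, New York",
    },
    "cluster-B": {"MoMA (Museum : New York, N.Y.)"},
}


def find_cluster(heading, clusters):
    """Return the ID of the first cluster containing the heading, else None.

    Matching is exact apart from case and surrounding whitespace.
    """
    wanted = heading.strip().casefold()
    for cluster_id, headings in clusters.items():
        if any(h.strip().casefold() == wanted for h in headings):
            return cluster_id
    return None
```

Note that the two invented clusters describe the same museum but stay separate, which is the situation you found with the multiple MoMA entries.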

> A (software) agent consuming RDF should not have a hard time figuring
> out that a foaf:Person in its knowledge base is the same as a
> foaf:Agent in Open Library, so I don't think this is the best reason
> to not change to foaf:Agent - not losing specificity is a better
> reason :)

That's if you think that reasoning will be a common feature of RDF 
software. Some folks have doubts.
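
Here is a toy sketch of the difference, under the assumption that the only schema knowledge in play is that foaf:Person is a subclass of foaf:Agent. The function names are invented for this message, and no real RDF library is used.

```python
# Assumed schema fragment: foaf:Person is a subclass of foaf:Agent.
SUBCLASS_OF = {"foaf:Person": "foaf:Agent"}


def inferred_types(declared):
    """Return the declared type plus every superclass reachable from it."""
    types = [declared]
    while types[-1] in SUBCLASS_OF:
        types.append(SUBCLASS_OF[types[-1]])
    return types


def matches(declared, wanted, reasoning):
    """Does a resource declared as `declared` satisfy a query for `wanted`?"""
    candidates = inferred_types(declared) if reasoning else [declared]
    return wanted in candidates
```

With reasoning switched on, something declared as foaf:Person answers a query for foaf:Agent; with it off, it does not. In the other direction, something declared only as foaf:Agent never answers a query for foaf:Person, which is the loss of specificity you mention.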

kc



-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet