On 2/2/12 4:24 AM, Ben Companjen wrote:
> I looked at the underlying type schema at http://openlibrary.org/type
> and it seems that there is no "contributor" type. There is
> /type/edition/contributions (an array of strings), which may be where
> contributors are stored when I manually enter translators and
> designers, though how exactly the role and contributor name would be
> stored in this string is unclear to me.

OL doesn't currently store roles, AFAIK. The data element for roles for
authors was added later, but most of the incoming data doesn't include
roles, and none of it includes roles for authors (only contributors).

> When looking at some MARC records for
> http://openlibrary.org/books/OL21322149M/Mies_van_der_Rohe (author
> "The Museum of Modern Art"), indeed MoMA is in the 110 field (I
> learned yesterday that field is for corporate entities) and the author
> is in the "by statement", shown by OL too, as "by Philip C. Johnson."
> I don't want to accuse anybody, but this leads me to think perhaps
> ImportBot doesn't know how to import this. Or maybe this record was
> imported (November 1, 2008) before the decision was made?

It looks to me like this record was imported correctly based on the
algorithm. The "by" statement unfortunately is just free text so no data
elements are taken from it. Johnson is included as an added author (a
700 in the MARC record). A big problem with the library data format is
that you don't know the relationship of the person listed in the 700 to
the item being cataloged: it could be the author of a part (like a
chapter or intro), it could be a co-author, it could be a conductor of a
piece of music, etc. There are some serious issues with the library data
as it exists today, and these limit what you can know from the metadata
you receive.

> Searching for authors with "museum" in the name yields 5608 results,
> many having over 100 works attached :)

Yes, these are from an input source that we now regret having imported.
The input source used the MARC record format, but used it incorrectly.
When it went through the normal processing, those errors followed
through.

> The work type doesn't have a contributors/collaborators field at all.

No, it isn't supposed to. A work has creators. Contributors are
associated with expressions and manifestations. This all comes from
something called "FRBR". You might want to start here:

http://en.wikipedia.org/wiki/Functional_Requirements_for_Bibliographic_Records

and then move on to the original document,

http://www.ifla.org/en/publications/functional-requirements-for-bibliographic-records

FRBR itself has serious issues, IMO, but it's the model folks are trying
to work with today for bibliographic data. OL does not follow FRBR
faithfully; it combines expression and manifestation into the OL
Edition. To see other sites using FRBR, I suggest having a look at
LibraryThing (http://librarything.com).

> From a linked data perspective it would be nice if corporate entities
> (including publishers, although not all publishers are corporate
> entities) are not just strings, but real entities.

Yes, it would be nice, but the data unfortunately doesn't always support
it. The corporate entities that come in on 110/710 fields are entities
in the library world and can be found in VIAF with identifiers. The
publishers are NOT entities, but are a transcription of how the
publisher or imprint name was presented on the title page of the book.
The imprint name != publisher name, so connecting these is difficult.
Edward Betts did some experimentation around this at one point, but the
results were very fuzzy.

> I would only consider putting a corporate body in the author field if
> a human author is not mentioned in the book at all, which is very
> rarely the case.

Not so rare, actually. Most documents out of corporations and government
bodies don't attribute the document to a human.
But aside from that, much of the data in OL is based on rules used by
Anglo-American libraries to make these decisions. Many of us could see
logic in other ways of doing things. The difficulty is getting enough
consistency to do the merging between editions and works. That's one of
the reasons why we put corporate bodies in collaborator -- how could you
possibly explain when a corporate body could be an author in a way that
folks could easily understand? The rules are over a thousand pages long,
and if you want to delve into that, here's a zip file with the final
draft:

http://www.archive.org/details/ResourceDescriptionAccessrdaDraftNov.2008

> VIAF could be useful for disambiguation, but there is no obvious way
> to enter a VIAF ID (or any other URI for a person) in an OL record.

This seems to me like a good feature request. Note that VIAF and
Wikipedia have some mutual linking.

> Is this what http://openlibrary.org/type/author/uris is for? That could
> make Open Library less isolated in the LOD cloud [2]. I have added
> links to VIAF pages to a few authors, but they are probably in
> /type/author/links.
> It seems VIAF has at least 4 entries for the MoMA (New York), by the way.

VIAF takes data from about 20 different national library systems and
clusters the headings for the same entity, where it can. A match on any
entity in a cluster should be linked to the cluster ID. The MOMA example
shows the difficulty of creating the clusters algorithmically. I am
hoping that VIAF will eventually allow human merging of entries.

> A (software) agent consuming RDF should not have a hard time figuring
> out that a foaf:Person in its knowledge base is the same as a
> foaf:Agent in Open Library, so I don't think this is the best reason
> to not change to foaf:Agent - not losing specificity is a better
> reason :)

That's if you think that reasoning will be a common feature of RDF
software. Some folks have doubts.
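
As a rough illustration of the MARC point earlier in this message (the
110 main entry, the 700 added entry, and the free-text "by" statement),
here is a minimal Python sketch. It is not ImportBot's actual code: the
record layout, the "245c" key, the function name map_names, and the
output keys are all made up for the example, the name strings are
simplified rather than real heading forms, and real MARC fields carry
indicators and subfields that are ignored here.

```python
# Hypothetical, simplified record: tag -> list of field values.
# "245c" stands in for subfield $c of field 245 (statement of responsibility).
record = {
    "110": ["The Museum of Modern Art"],
    "245c": ["by Philip C. Johnson."],
    "700": ["Philip C. Johnson"],
}

MAIN_ENTRY_TAGS = ("100", "110", "111")   # personal, corporate, meeting name
ADDED_ENTRY_TAGS = ("700", "710", "711")  # added entries of the same three kinds


def map_names(record):
    """Illustrative mapping only; an assumption, not the real import algorithm."""
    # Whatever sits in a main entry field becomes the "author",
    # even when it is a corporate body.
    authors = [name for tag in MAIN_ENTRY_TAGS for name in record.get(tag, [])]

    # Added entries do not say how the person relates to the item
    # (co-author, author of a chapter, conductor, ...), so the role is unknown.
    added_entries = [
        {"name": name, "role": None}
        for tag in ADDED_ENTRY_TAGS
        for name in record.get(tag, [])
    ]

    # The "by" statement is copied through as free text and never parsed.
    by_statement = " ".join(record.get("245c", [])) or None

    return {
        "authors": authors,
        "added_entries": added_entries,
        "by_statement": by_statement,
    }


result = map_names(record)
print(result)
```

In this sketch the corporate body ends up as the author, Johnson appears
only as an added entry with no role, and his name in the "by" statement
stays an uninterpreted string.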

kc

-- 
Karen Coyle
[email protected] http://kcoyle.net
ph: 1-510-540-7596
m: 1-510-435-8234
skype: kcoylenet

_______________________________________________
Ol-tech mailing list
[email protected]
http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
To unsubscribe from this mailing list, send email to [email protected]
