On Sun, Feb 5, 2012 at 8:02 PM, Ben Companjen <[email protected]> wrote:
> On 5 February 2012 17:06, Karen Coyle <[email protected]> wrote:
>>
>> On 2/2/12 4:24 AM, Ben Companjen wrote:
>>
>>>
>>> I looked at the underlying type schema at http://openlibrary.org/type
>>> and it seems that there is no "contributor" type. There is
>>> /type/edition/contributions (an array of strings), which may be where
>>> contributors are stored when I manually enter translators and
>>> designers, though how exactly the role and contributor name would be
>>> stored in this string is unclear to me.
>>
>>
>> OL doesn't currently store roles, AFAIK. The data element for roles for
>> authors was added later, but most of the incoming data doesn't include
>> roles, and none of it includes roles for authors (only contributors).
>
> I was talking about roles that one can add to contributors of an
> edition. They must be stored somewhere, because OL remembers what I
> enter there.
> http://openlibrary.org/works/OL16419933W/Een_kleurige_wiskundige_wereld
> not only shows names, but also roles of contributors. The RDF
> currently only shows the names.
> Just to be precise: 'data element for roles for authors' =
> /type/author_role ? It's too bad I cannot easily view the rationale
> for each element, because I wondered what this element/type was
> supposed to do and how this element/type is used. Or can I see it
> somewhere?
>
>>
>>> When looking at some MARC records for
>>> http://openlibrary.org/books/OL21322149M/Mies_van_der_Rohe (author
>>> "The Museum of Modern Art"), indeed MoMA is in the 110 field (I
>>> learned yesterday that field is for corporate entities) and the author
>>> is in the "by statement", shown by OL too, as "by Philip C. Johnson."
>>> I don't want to accuse anybody, but this leads me to think perhaps
>>> ImportBot doesn't know how to import this. Or maybe this record was
>>> imported (November 1, 2008) before the decision was made?
>>
>>
>> It looks to me like this record was imported correctly based on the
>> algorithm. The "by" statement unfortunately is just free text so no data
>> elements are taken from it. Johnson is included as an added author (a 700 in
>> the MARC record). A big problem with the library data format is that you
>> don't know the relationship of the person listed in the 700 to the item
>> being cataloged: it could be the author of a part (like a chapter or intro),
>> it could be a co-author, it could be a conductor of a piece of music, etc.
>> There are some serious issues with the library data as it exists today, and
>> these limit what you can know from the metadata you receive.
>>
> Oh, I missed the 700 field. But even if the record was imported
> correctly, and it looks to me that all the other editions were created
> from similar records, how does MoMA end up as the author? Was it
> perhaps WorkBot then?
>
> I quite liked this post about free text in MARC fields - and then
> noticed you had commented:
> http://robotlibrarian.billdueber.com/isbn-parenthetical-notes-bad-marc-data-1/
> It mentions the many ways books are described as hardcover or
> paperback and that made me wonder why OL wants that description as
> free text. The options could be limited, I think. Or does OL
> automatically normalize the input to "hardcover", "paperback", "...",
> like the example descriptions?
>
> On a side note: I just had a wild idea: don't show the "by statement"
> field on the edit form, not even in the librarian mode, if it is
> empty, so that no one is tempted to put anything in it.
>
>>>>
>>> Searching for authors with "museum" in the name yields 5608 results,
>>> many having over 100 works attached :)
>>
>>
>> Yes, these are from an input source that we now regret having imported. The
>> input source used the MARC record format, but used it incorrectly. When it
>> went through the normal processing, those errors followed through.
>>
>>
>>> The work type doesn't have a contributors/collaborators field at all.
>>
>>
>> No, it isn't supposed to. A work has creators. Contributors are associated
>> with expressions and manifestations. This all comes from something called
>> "FRBR"
>>
> Thanks for the pointers. I had already read a few things on FRBR, and
> I'll take it from you that Works only have authors/creators. I guess
> it makes sense that they do.
>
> It's just that I wasn't sure where the corporate identities should go
> in the Open Library if I were to add for example a publication
> "authored" by a government (no references to a human author) manually.
> The same question arises if I were to edit the example of the MoMA
> book about Mies van der Rohe (maybe not the best example, as Philip
> Johnson becomes author and MoMA doesn't need to be a contributor).
> Can I have an OL Work without author, and put the government or MoMA in
> a contributor field in the OL Edition(s)? OL Edition contributors are
> just saved and treated as strings (which I find a little
> disappointing), so entering a corporate identity there isn't a
> problem. The RDF template outputs these contributors as foaf:Persons,
> though.
>
> Your proposed field for "responsible organization" could help here.
> But should it be part of OL's Work or Edition type?
>
>>> From a linked data perspective it would be nice if corporate entities
>>> (including publishers, although not all publishers are corporate
>>> entities) are not just strings, but real entities.
>>
>>
>> Yes, it would be nice, but the data unfortunately doesn't always support it.
>> The corporate entities that come in on 110/710 fields are entities in the
>> library world and can be found in VIAF with identifiers. The publishers are
>> NOT entities, but are a transcription of how the publisher or imprint name
>> was presented on the title page of the book. The imprint name != publisher
>> name, so connecting these is difficult.
>> Edward Betts did some experimentation around this at one point, but the
>> results were very fuzzy.
>
> If we (users) want it, and if the data model supports it, I think we
> can make the data support it, programmatically or manually. I'm from
> the Discogs world, in which everything is done manually by people
> mostly smart enough to match imprints (a.k.a. labels) to
> entities/publishers behind those labels. Even recording, mixing and
> mastering studios and the companies behind the labels (mentioned as
> copyright holders) are matched. Guidelines and some rules are needed
> there, but they seem to work.
> Sure, Open Library is not Discogs and the world of books is different
> from the world of recorded music (one important reason, I guess, is
> that the former is much older), but goals of OL and Discogs are
> similar (one page for every book/music record) and the means
> (collaborative editing) are too. But I'm not here to just promote
> Discogs - I like MusicBrainz very much too ;-)
>
>>>>
>>> I would only consider putting a corporate body in the author field if
>>> a human author is not mentioned in the book at all, which is very
>>> rarely the case.
>>
>>
>> Not so rare, actually. Most documents out of corporations and government
>> bodies don't attribute the document to a human. But aside from that, much of
>> the data in OL is based on rules used by Anglo-American libraries to make
>> these decisions. Many of us could see logic in other ways of doing things.
>>
>> The difficulty is getting enough consistency to do the merging between
>> editions and works. That's one of the reasons why we put corporate bodies in
>> collaborator -- how could you possibly explain when a corporate body could
>> be an author in a way that folks could easily understand?
>> The rules are over a thousand pages long, and if you want to delve into
>> that, here's a zip file with the final draft:
>>
>> http://www.archive.org/details/ResourceDescriptionAccessrdaDraftNov.2008
>
> Just to make sure I understand you correctly: by "collaborator" you
> mean "contributor" in the OL Edition? Or a role at the "creator"
> level?
> I may have a look at the RDA rules (I had signed up for access to the
> final rules in the trial period, but was quickly scared away by the
> extent of the documents). About authorship: I believe in The
> Netherlands by default you don't own the author's rights (~copyright)
> for a publication if you wrote it as part of your job - the
> organization you work for owns the copyright in that case. So I don't
> find it hard to understand the human author of a publication is
> hidden. But that may be just me.
>
>>>
>>> VIAF could be useful for disambiguation, but there is no obvious way
>>> to enter a VIAF ID (or any other URI for a person) in an OL record.
>>
>>
>> This seems to me like a good feature request. Note that VIAF and Wikipedia
>> have some mutual linking.
>>
> The LOD cloud diagram indeed shows an arrow from VIAF to DBPedia. And
> the Wikipedia article about J.K. Rowling not only has a link to VIAF,
> there is one to OL as well :) This should only make things easier to
> link.
>
> I'll open an issue to request allowing for entering VIAF IDs soon.
>
>>> Is
>>> this what http://openlibrary.org/type/author/uris is for? That could
>>> make Open Library less isolated in the LOD cloud [2]. I have added
>>> links to VIAF pages to a few authors, but they are probably in
>>> /type/author/links.
>>> It seems VIAF has at least 4 entries for the MoMA (New York), by the way.
>>
>>
>> VIAF takes data from about 20 different national library systems and
>> clusters the headings for the same entity, where it can. A match on any
>> entity in a cluster should be linked to the cluster ID.
>> The MOMA example shows the difficulty of creating the clusters
>> algorithmically. I am hoping that VIAF will eventually allow human merging
>> of entries.
>>
>>
>>> A (software) agent consuming RDF should not have a hard time figuring
>>> out that a foaf:Person in its knowledge base is the same as a
>>> foaf:Agent in Open Library, so I don't think this is the best reason
>>> to not change to foaf:Agent - not losing specificity is a better
>>> reason :)
>>
>>
>> That's if you think that reasoning will be a common feature of RDF software.
>> Some folks have doubts.
>>
> Sindice.com does some simple inferencing: it can add the superclasses
> of resources in a search result. Not all RDF software will do
> reasoning of course, but I believe in the world of catalogs there will
> be useful software that does do it.
>

I agree with both of you here (some software will do reasoning, but it
will be an increasingly small share compared with software that
doesn't). That said, I think that most agents will know to look for
foaf:Person, foaf:Organization AND foaf:Agent.
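
To make the two options concrete, here is a minimal sketch of both: superclass inference of the kind described for Sindice, and a plain lookup for all three classes with no reasoning at all. It assumes resource types arrive as plain URI strings; the function names are hypothetical and not part of Open Library or any RDF library. The only vocabulary facts used are that foaf:Person and foaf:Organization are subclasses of foaf:Agent.

```python
FOAF = "http://xmlns.com/foaf/0.1/"

# rdfs:subClassOf facts from the FOAF vocabulary: Person and Organization
# are both subclasses of Agent. (Hand-written table, assumed for this sketch.)
SUBCLASS_OF = {
    FOAF + "Person": {FOAF + "Agent"},
    FOAF + "Organization": {FOAF + "Agent"},
}


def with_superclasses(rdf_types):
    """Return the given types plus all of their transitive superclasses."""
    seen = set()
    todo = list(rdf_types)
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(SUBCLASS_OF.get(current, ()))
    return seen


def is_agent(rdf_types, infer=True):
    """Check whether a resource counts as a foaf:Agent.

    With infer=False only an explicit foaf:Agent type matches, which is
    what software without any reasoning would see.
    """
    types = with_superclasses(rdf_types) if infer else set(rdf_types)
    return FOAF + "Agent" in types


def is_agent_by_lookup(rdf_types):
    """No reasoning: accept any of the three classes explicitly."""
    wanted = {FOAF + "Person", FOAF + "Organization", FOAF + "Agent"}
    return bool(wanted & set(rdf_types))
```

With this, a contributor typed only as foaf:Person is found by `is_agent(...)` and by `is_agent_by_lookup(...)`, but not by `is_agent(..., infer=False)`, which is the case the thread is worried about.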
-Ross.

> Ben
>
>> kc
>>
>>
>>
>> --
>> Karen Coyle
>> [email protected] http://kcoyle.net
>> ph: 1-510-540-7596
>> m: 1-510-435-8234
>> skype: kcoylenet
> _______________________________________________
> Ol-tech mailing list
> [email protected]
> http://mail.archive.org/cgi-bin/mailman/listinfo/ol-tech
> To unsubscribe from this mailing list, send email to
> [email protected]
