[RDA-L] Triumvirate of giants committed to scheming
Some of us have anticipated that one day Google would enter the metadata arena with an approach entirely their own. Now, this seems to have happened. But not just Google alone is making the move, they have forged an unprecedented triumvirate with their two biggest competitors, Microsoft and Yahoo:

http://schema.org

Didn't we also expect that their design would bear scant resemblance with anything the library world has ever come up with? And it is true. There's also no similarity with the Dublin Core, for that matter. OTOH, their vision is far removed from anything like catalog cards, just what we've been dreaming of, is it not? Even better, it is a record-free concept. The word "metadata", not to speak of "catalog", has obviously been carefully circumvented, for whatever reason. There is also no pondering of functional requirements or user tasks, and a closer look reveals, in particular, that the FRBR user tasks cannot have been of any concern in their reasoning. There is, however, something akin to an authority concept for persons. The whole scheme addresses primarily the tasks of the SEO, the Search Engine Optimizer, and these do not necessarily coincide with the interests of the search engine user in every search situation, AGWS.

"Structured markup", instead of "metadata", is a much-used term in the documentation. It is based on w3.org's Microdata (http://dev.w3.org/html5/md-LC/), and the gist of it all appears to be this:

"By adding additional tags to the HTML of your web pages -- tags that say, 'Hey search engine, this information describes this specific movie, or place, or person, or video' -- you can help search engines and other applications better understand your content and display it in a useful, relevant way. Microdata is a set of tags, introduced with HTML5, that allows you to do this."

For now, it is only HTML documents that microdata can be applied to. Unlike DC, microdata tags can be spread out all over the file, placed just where they apply. That means the metadata for a Web page is tightly integrated with the content, it does not form a record for the page as a whole but it can describe any and many parts of it, but it is useless if ripped out of context. It could thus not become an easy successor to MARC in which records stand in as surrogates for resources.

All of that sounds pretty remote from what we need and what we are doing, and why not indeed. But if this thing picks up speed (not totally unlikely, considering who's involved), we had better take a look. If it doesn't, one may still learn a bit from the way it fails.

Reproduced here, for the record (no pun intended), is the list of attributes for their Book schema. Note what they regard as important and what not. Book is on the third level of an object hierarchy: Thing / CreativeWork / Book

http://schema.org/Book (contains example, as of now, draft version 0.9)

PROPERTY            TYPE                     DESCRIPTION

Properties from Thing
description         Text                     A short description of the item.
image               URL                      URL of an image of the item.
name                Text                     The name of the item.
url                 Text                     URL of the item.

Properties from CreativeWork
about               Thing                    The subject matter of the content.
aggregateRating     AggregateRating          The overall rating, based on a collection of reviews or ratings, of the item.
audio               AudioObject              An embedded audio object or URL assoc. w. the content
author              Person or Organization   The author of this content. Please note that author is special in that HTML 5 provides a special mechanism for indicating authorship via the rel tag. That is equivalent to this and may be used interchangeably.
awards              Text                     Awards won by this person or for this creative work.
contentLocation     Place                    The location of the content.
contentRating       Text                     Official rating of a piece of content, for example, 'MPAA PG-13'.
datePublished       Date                     Date of first broadcast/publication.
editor              Person                   Editor for this content.
encodings           MediaObject              The media objects that encode this creative work
genre               Text                     Genre of the creative work
headline            Text                     Headline of the article
inLanguage          Text                     The language of the content. Please use one of the language codes from the IETF BCP 47 standard.
interactionCount    Text                     A count of a specific user interaction with this item - for example, 20 UserLikes, 5 UserComments, or 300 UserDownloads. The user interaction type should be one of the sub types of UserInteraction.
isFamilyFriendly    Boolean                  Indicates whether this content is family friendly (!)
keywords            Text                     The
Re: [RDA-L] Triumvirate of giants committed to scheming
On 21/06/2011 11:08, Bernhard Eversberg wrote:

<snip>
Some of us have anticipated that one day Google would enter the metadata arena with an approach entirely their own. Now, this seems to have happened. But not just Google alone is making the move, they have forged an unprecedented triumvirate with their two biggest competitors, Microsoft and Yahoo: http://schema.org
</snip>

Stu Weibel had an interesting blog post on this at http://weibel-lines.typepad.com/weibelines/2011/06/uncommon-cause.html. He says:

"Will they achieve semantic web goals? Perhaps incrementally, but I suspect not a lot. The goal is to sell more stuff, and optimization will be based on that. To expect semantic value to ooze from the seams of commercial advertising (no matter how structured) seems unrealistic."

I think he's right, but I personally can't blame Google and Co. for opting for something immeasurably simpler than RDF (almost anybody can implement the schema.org schema which is not at all the case with RDF) and certainly not anything like FRBR structures. Plus, it's available now and not in 10 or 15 years at the earliest, when everything we are doing today will be changed.

If I were a publisher, I would really be interested in this initiative--probably more interested than working with libraries, although, as Bernhard points out, if I were a journal editor, I may not be too happy! I think it would be wise for libraries to join this initiative if possible.

One clever attempt appears to be trying to coopt schema.org by putting it into RDF: http://schema.rdfs.org/

--
James Weinheimer weinheimer.ji...@gmail.com
First Thus: http://catalogingmatters.blogspot.com/
Cooperative Cataloging Rules: http://sites.google.com/site/opencatalogingrules/
Re: [RDA-L] Triumvirate of giants committed to scheming
It should be borne in mind that the focus of schema.org is search engine optimization, whereas the Semantic Web and linked data have somewhat more ambitious--if so far elusive--goals.

Ed Jones
National University (San Diego)

-----Original Message-----
From: Resource Description and Access / Resource Description and Access [mailto:RDA-L@LISTSERV.LAC-BAC.GC.CA] On Behalf Of James Weinheimer
Sent: Tuesday, June 21, 2011 3:42 AM
To: RDA-L@LISTSERV.LAC-BAC.GC.CA
Subject: Re: [RDA-L] Triumvirate of giants committed to scheming
Re: [RDA-L] Triumvirate of giants committed to scheming
> Didn't we also expect that their design would bear scant resemblance with anything the library world has ever come up with?

The design or markup language might be unique (and I'm not saying it is), but things like author, title, genre, subject, date published, are all standard.

> OTOH, their vision is far removed from anything like catalog cards, just what we've been dreaming of, is it not? Even better, it is a record-free concept.

Not exactly-- it's just that the record is embedded in the item itself (the webpage). That's a property that libraries don't have access to because our books exist on shelves, DVDs in drawers, articles in databases, etc. Websites exist in the ether, with no catalogue or database, and search engines crawl around looking for what's up there-- any schema that integrates authors, publishers, dates, and subjects right in the website is doing what library catalogues have always done: provide access. We have an insurmountable barrier between record and item; websites don't. That doesn't make it record-free.

> That means the metadata for a Web page is tightly integrated with the content, it does not form a record for the page as a whole but it can describe any and many parts of it, but it is useless if ripped out of context.

This looks a lot like 100 |a to me:

<span itemprop="name">James Cameron</span> (born <span itemprop="birthDate">August 16, 1954</span>)

It would take minutes for a programmer to write a conversion from this markup to something an ILS could read.

> It could thus not become an easy successor to MARC in which records stand in as surrogates for resources.

It works with MARC-- they're both markup languages. A Schema record wouldn't be a full AACR2 record, but that is easily noted in the coding level field for anyone who cares. You're right that not all resources need a surrogate in the form of a record. But many still do-- not because of some inherent inferiority in library resources, but because of the separation between where our resources come from, how they are accessed, and how library patrons know we have them.

> All of that sounds pretty remote from what we need and what we are doing.

Nah... Google et al may be in it to sell products, but it still has to function-- it still has to identify the resource and ease access to it. I can easily see a future where the library's catalog crawls the web for authoritative resources, searches the physical holdings (presuming there is such a thing in the future), and searches subscriber databases all together. Perhaps there could be a Schema tag for some sort of authoritative score, where librarians can rate websites (rather than a rating for how many "liked this" on Facebook, I mean). But perhaps I'm getting ahead of myself.

Emily Croft
University of Redlands Library
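
For anyone curious what the "minutes for a programmer" conversion mentioned in this message might look like, here is a minimal sketch in Python using only the standard library. It is not taken from any message in this thread. The names ItempropExtractor and to_marc_100 are made up for illustration, the "Forename Surname" inversion is deliberately naive, and the output is only a MARC-like display string (with # standing for a blank indicator), not a record an ILS would load as-is.

```python
import re
from html.parser import HTMLParser


class ItempropExtractor(HTMLParser):
    """Hypothetical helper: collect the text of elements that carry an itemprop attribute."""

    def __init__(self):
        super().__init__()
        self.props = {}    # itemprop name -> accumulated text
        self._stack = []   # one entry per open element: its itemprop name, or None

    def handle_starttag(self, tag, attrs):
        # Assumption: every start tag has a matching end tag (no void elements like <img>).
        self._stack.append(dict(attrs).get("itemprop"))

    def handle_endtag(self, tag):
        if self._stack:
            self._stack.pop()

    def handle_data(self, data):
        # Credit the text to the innermost open element that has an itemprop.
        for name in reversed(self._stack):
            if name:
                self.props[name] = self.props.get(name, "") + data
                break


def to_marc_100(html):
    """Build a MARC-like 100 display string from name/birthDate microdata (illustrative only)."""
    parser = ItempropExtractor()
    parser.feed(html)
    parser.close()

    name = parser.props.get("name", "").strip()
    parts = name.rsplit(" ", 1)
    # Naive inversion: treats the last word as the surname, which is often wrong.
    heading = f"{parts[1]}, {parts[0]}" if len(parts) == 2 else name

    field = f"100 1# |a {heading}"
    year = re.search(r"\b\d{4}\b", parser.props.get("birthDate", ""))
    if year:
        field += f", |d {year.group()}-"
    return field


SNIPPET = (
    '<span itemprop="name">James Cameron</span> '
    '(born <span itemprop="birthDate">August 16, 1954</span>)'
)
print(to_marc_100(SNIPPET))
```

The sketch covers only the two properties in the snippet above. It ignores itemscope/itemtype nesting entirely, so it could not tell an author's name from a publisher's name on a real page, and a real conversion would have to handle that.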