Ben Adida wrote:
> http://www.w3.org/2006/07/SWD/RDFa/primer/20070910/

Feedback on the 20070910 Primer:

- Really great job on getting the concepts more accessible to web developers! I really couldn't justify changing too much, this version seemed much more polished than the last.
- Can we use color or highlight the important parts of the XHTML code in some way? It is hard to differentiate the text that matters to RDFa from the text that doesn't in some of the larger examples.
- The scenarios were much easier to follow this time around.
- The document looks very scary and large in a web browser. I realize that it might be against W3C policy to break up a large page, like the Primer document, into multiple pages... but it really needs it. The HTML 4.01 Specification does it: http://www.w3.org/TR/html4/cover.html#minitoc
- The rest of the feedback is attached (tiny grammar corrections, missing words, etc). Changes are highlighted in yellow.

Even without the proposed changes, the document is much more accessible and streamlined from the last revision. Later today, I'll pass it by some of our developers that don't know anything about RDFa and see what they have to say.

-- manu

--
Manu Sporny
President/CEO, Digital Bazaar, Inc.
http://wiki.digitalbazaar.com/en/BTPWS-Album-Recommendation

Title: RDFa Primer 1.0
RDFa Primer 1.0
Embedding Structured Data in Web Pages
Editors' Draft
Copyright © W3C® (MIT, ERCIM, Keio), All Rights Reserved. W3C liability, trademark, document use, and software licensing rules apply.

Abstract

Current web pages, written in XHTML, contain inherent structured data: calendar events, contact information, photo captions, song titles, copyright licensing information, etc. When publishers can express this data precisely, and when tools can read it robustly, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a desktop calendar. A license on a document can be detected to inform the user of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself.

RDFa lets an XHTML author express this structured data using extra XHTML attributes. Where the data is already present on the page, e.g. the photo's caption, the author need not repeat it. A web publisher can easily reuse concepts, e.g. an event's date, defined by other publishers, or create new ones altogether. RDFa gets most of its expressive power from RDF, though the reader need not understand RDF to read on.

This document introduces XHTML authors to RDFa with simple examples. For more detailed syntax specification, please consult the RDFa Syntax Document.

Status of this Document

This is an internal draft produced by the Semantic Web Deployment Working Group [SWD-WG], in cooperation with the HTML Working Group [HTML-WG]. Initial work on RDFa began with the Semantic Web Best Practices and Deployment Working Group [SWBPD-WG]. This document is for internal review only and is subject to change without notice. This document has no formal standing within the W3C.

Changes

Since Working Draft #3 of this document:
Table of Contents
1 Purpose and Preliminaries

1 Purpose and Preliminaries

Current web pages, written in XHTML, contain inherent structured data: calendar events, contact information, photo captions, song titles, copyright licensing information, etc. When publishers can express this data precisely, and when tools can read it robustly, a new world of user functionality becomes available, letting users transfer structured data between applications and web sites. An event on a web page can be directly imported into a desktop calendar. A license on a document can be detected to inform the user of his rights automatically. A photo's creator, camera setting information, resolution, and topic can be published as easily as the original photo itself.

NOTE: It's good that you explain exactly what 'structured data' is at the beginning of this section. It's much clearer what you're talking about when you mention structured data elsewhere in the document.

RDFa lets an XHTML author express this structured data using extra XHTML attributes. Where the data is already present on the page, e.g. the photo's caption, the author need not repeat it. A web publisher can easily reuse concepts, e.g. an event's date, defined by other publishers, or create new ones altogether. RDFa gets most of its expressive power from RDF, though the reader need not understand RDF to read on.

We note that RDFa makes use of XML namespaces. In this document,
we assume, for simplicity's sake, that the following namespaces are
defined:
2 Simple Data: Publishing Events and Contacts

Jo keeps a private blog for her friends and family.

2.1 The Basic XHTML

Jo is organizing one last summer Barbecue, which she hopes all of her friends and family will attend. She blogs an announcement of her Barbecue at her private blog,

    <html>
      <head><title>Jo's Friends and Family Blog</title></head>
      <body>
        ...
        <p>
          I'm holding one last summer Barbecue, on September 16th at 4pm.
        </p>
        ...
        <p class="contactinfo">
          Jo Smith. Web hacker at
          <a href="" Example.org </a>.
          You can contact me
          <a href="" PROTECTED]"> via email </a>.
        </p>
        ...
      </body>
    </html>

This short piece of markup contains important structured data. The markup describes an event: a Barbecue that Jo is hosting. This Barbecue starts at 4pm on September 16th. A summary of the event is "one last summer Barbecue." We also have contact information for Jo: she works for the organization Example.org, with job title of "Web Hacker." She can be contacted at the email address "[EMAIL PROTECTED]".

At the moment, it is very difficult for software, such as web browsers and search engines, to make use of this data's implicit structure. We need a standard mechanism to explicitly express it, so that it can be extracted consistently. This is precisely where RDFa comes in.

2.2 Publishing An Event

Jo would like to label this blog entry so that her friends and family can add her Barbecue directly to their calendar. RDFa allows her to express this structure using a small handful of extra attributes. Since this is a calendar event, Jo will specifically use the iCal vocabulary [ICAL-RDF] to denote the data's structure.

NOTE: Are we assuming that readers will know what a “vocabulary” is when reading this document? I don't think many people outside of the semantic web know what a vocabulary is...

The first step is to reference the iCal vocabulary within the XHTML page, so that Jo's friends' web browsers can look up the calendar concepts and make use of them:

    <html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
    ...

Then, Jo declares a new event:

    <p instanceof="cal:Vevent">
    ...
    </p>
Note how the

    I'm holding
    <span property="cal:summary">
      one last summer Barbecue
    </span>

The

    <span property="cal:dtstart" content="20070916T1600-0500">
      September 16th at 4pm
    </span>

The actual content of the

    <html xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
      <head><title>Jo's Friends and Family Blog</title></head>
      <body>
        ...
        <p instanceof="cal:Vevent">
          I'm holding
          <span property="cal:summary">
            one last summer Barbecue,
          </span>
          on
          <span property="cal:dtstart" content="20070916T1600-0500">
            September 16th at 4pm.
          </span>
        </p>
        ...
      </body>
    </html>
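To make the roles of property and content concrete, here is a minimal Python sketch, not a conforming RDFa parser, that walks Jo's event markup above and prints each property with its machine-readable value. The names EVENT_XHTML and PropertyCollector are hypothetical and introduced only for illustration. The sketch assumes that a content attribute, when present, is used in place of the visible text, as the cal:dtstart example suggests.

```python
from html.parser import HTMLParser

# Illustrative copy of Jo's event paragraph from the example above.
EVENT_XHTML = """
<p instanceof="cal:Vevent">
  I'm holding
  <span property="cal:summary">
    one last summer Barbecue,
  </span>
  on
  <span property="cal:dtstart" content="20070916T1600-0500">
    September 16th at 4pm.
  </span>
</p>
"""


class PropertyCollector(HTMLParser):
    """Hypothetical helper that collects (property, value) pairs.

    This is a simplification: it ignores subjects, namespaces,
    rel/href links, and properties nested inside other properties.
    """

    def __init__(self):
        super().__init__()
        self.pairs = []     # collected (property, value) tuples
        self._open = None   # [property, content-or-None, text chunks]

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "property" in attrs:
            self._open = [attrs["property"], attrs.get("content"), []]

    def handle_data(self, data):
        if self._open is not None:
            self._open[2].append(data)

    def handle_endtag(self, tag):
        if self._open is not None:
            prop, content, chunks = self._open
            # Assumption: the content attribute overrides the visible text.
            text = " ".join("".join(chunks).split())
            self.pairs.append((prop, content if content is not None else text))
            self._open = None


collector = PropertyCollector()
collector.feed(EVENT_XHTML)
for prop, value in collector.pairs:
    print(prop, "=", value)
```

The human-readable date stays in the page for visitors, while a tool reading the markup would see the value of the content attribute instead.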
Note that Jo could have used any other XHTML element, not just

(For the RDF-inclined reader, the RDF triples that correspond to the above markup are available in Section 4.1 Events and Contact Information.)

2.3 Publishing Contact Information

Now that Jo has published her event in a human-and-machine-readable way, she realizes there is much data on her blog that she can mark up in the same way. Her contact information, in particular, is an easy target:

    ...
    <p class="contactinfo">
      Jo Smith. Web hacker at
      <a href="" Example.org </a>.
      You can contact me
      <a href="" PROTECTED]"> via email </a>.
    </p>
    ...
Jo discovers the vCard RDF vocabulary [VCARD-RDF], which she adds to her existing page. Since Jo thinks of vCards as a way to publish her contact information, she uses the prefix contact:

    <html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
          xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
    ...

Jo then sets up her vCard using RDFa, by deciding that the

    ...
    <p class="contactinfo" about="http://example.org/staff/jo">
      <!-- everything here pertains to http://example.org/staff/jo -->
    </p>
    ...

"Simple enough!" Jo realizes. She adds her first vCard fields: name, title, organization and email.

NOTE: We really should think about highlighting the important attributes and text in examples such as this. It is difficult to see what is being changed and what really matters to RDFa for a beginner reading this document. Are we not doing it because there is some W3C rule against using certain colors in primer documents?

    ...
    <p class="contactinfo" about="http://example.org/staff/jo">
      <span property="contact:fn"> Jo Smith </span>.
      <span property="contact:title"> Web hacker </span>
      at
      <a rel="contact:org" href="" Example.org </a>.
      You can contact me
      <a rel="contact:email" href="" PROTECTED]"> via email </a>.
    </p>
    ...
Notice how Jo was able to use the

For simplicity's sake, we have slightly abused the vCard vocabulary above: vCard technically requires that the type of the email address be specified, e.g. work or home email. In Section 4.3 Layered Data and Subresources, we show how

2.4 The Complete XHTML with RDFa

Jo's complete XHTML with RDFa is thus:

    <html xmlns:cal="http://www.w3.org/2002/12/cal/ical#"
          xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
      <head><title>Jo's Friends and Family Blog</title></head>
      <body>
        ...
        <p instanceof="cal:Vevent">
          I'm holding
          <span property="cal:summary">
            one last summer Barbecue,
          </span>
          on
          <span property="cal:dtstart" content="20070916T1600-0500">
            September 16th at 4pm.
          </span>
        </p>
        ...
        <p class="contactinfo" about="http://example.org/staff/jo">
          <span property="contact:fn"> Jo Smith </span>.
          <span property="contact:title"> Web hacker </span>
          at
          <a rel="contact:org" href="" Example.org </a>.
          You can contact me
          <a rel="contact:email" href="" PROTECTED]"> via email </a>.
        </p>
        ...
      </body>
    </html>
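The prefixes declared on the html element are shorthand for the full vocabulary URIs. As a rough illustration, the following Python sketch expands a prefixed name such as cal:summary by concatenating the namespace URI and the term. The NAMESPACES table is copied from the xmlns attributes in Jo's page; the function name expand is a hypothetical helper, and the sketch assumes plain concatenation with no further processing.

```python
# Prefix declarations copied from the xmlns attributes in Jo's page.
NAMESPACES = {
    "cal": "http://www.w3.org/2002/12/cal/ical#",
    "contact": "http://www.w3.org/2001/vcard-rdf/3.0#",
}


def expand(name, namespaces=NAMESPACES):
    """Hypothetical helper: turn 'prefix:term' into a full URI."""
    prefix, sep, term = name.partition(":")
    if not sep or prefix not in namespaces:
        # The prefix was never declared, so the name cannot be resolved.
        raise ValueError("no namespace declared for %r" % name)
    return namespaces[prefix] + term


for name in ("cal:Vevent", "cal:summary", "cal:dtstart",
             "contact:fn", "contact:email"):
    print(name, "->", expand(name))
```

A name whose prefix is not declared, for example dc:title in this page, raises an error in this sketch, which mirrors why Jo has to add each vocabulary before using it.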
If Jo changes her email address link, her organization, or the title of her Barbecue, RDFa-enabled browsers will automatically pick up these changes in the marked-up, structured data. The only place where this doesn't happen is when the

(Once again, the RDF-inclined reader will want to consult the resulting RDF triples in Section 4.1 Events and Contact Information.)

2.5 Working Within a Fragment of the XHTML

What if Jo does not have complete control over the XHTML of her blog? For example, she may be using a templating system which makes it particularly difficult to add the vocabularies in the

Fortunately, RDFa uses standard XML namespaces, which means that the vocabularies can be imported "locally" to an XHTML element. Jo's blog page could express the exact same structured data with the following markup:

NOTE: Again, it would be much easier to see the changes if they were highlighted in some way.

    <html>
      <head><title>Jo's Friends and Family Blog</title></head>
      <body>
        ...
        <p instanceof="cal:Vevent"
           xmlns:cal="http://www.w3.org/2002/12/cal/ical#">
          I'm holding
          <span property="cal:summary">
            one last summer Barbecue,
          </span>
          on
          <span property="cal:dtstart" content="20070916T1600-0500">
            September 16th at 4pm.
          </span>
        </p>
        ...
        <p class="contactinfo" about="http://example.org/staff/jo"
           xmlns:contact="http://www.w3.org/2001/vcard-rdf/3.0#">
          <span property="contact:fn"> Jo Smith </span>.
          <span property="contact:title"> Web hacker </span>
          at
          <a rel="contact:org" href="" Example.org </a>.
          You can contact me
          <a rel="contact:email" href="" PROTECTED]"> via email </a>.
        </p>
        ...
      </body>
    </html>
In this case, each

NOTE: This is great! I've spent around 15 minutes reading this primer and I could walk away at this point with a pretty good high-level understanding of what RDFa does.

3 Advanced Concepts: Custom Vocabularies, Document Fragments, Complex Data, ...

RDFa can do much more than the simple examples described above. In this section, we explore some of its advanced capabilities. We consider:
3.1 Creating a Custom Vocabulary and Using Compact URIs

All field names and data types in RDFa are URIs, e.g.

This helps keep the markup short and clean:

    <div xmlns:dc="http://purl.org/dc/elements/1.1/">
      <span property="dc:title">Yowl</span>,
      created by
      <span property="dc:creator">Mark Birbeck</span>.
    </div>

Because concepts are simply URIs, it is trivial to create one's own vocabulary: simply mint new URIs in a domain you control, and use them in RDFa markup. This is, in fact, the power of RDF, RDFa's underlying technology.

Consider a (fictional) photo management web site called Shutr, whose web site is

Shutr chooses to mark up its photos with RDFa, so that browsers may be able to extract information automatically. Some concepts, such as

Shutr can then publish terms such as
NOTE: For some reason, I finished the last sentence and felt like the section ended abruptly. So what? I can define my own vocabulary... why is that important? Who can use it? Who cares? You might want to say that other websites might want to use your vocabulary to describe things on their website. That this allows communities to develop their own vocabulary at their pace and not depend on a large organization to standardize a vocabulary. Why is this a cool feature?

3.2 Qualifying Other Documents and Document Chunks

Shutr may choose to present many photos in a given XHTML page. In particular, at the URI

    <ul>
      <li>
        <img src="" />,
        <span about="/user/markb/photo/23456" property="dc:title">
          Sunset in Nice
        </span>
      </li>
      <li>
        <img src="" />,
        <span about="/user/markb/photo/34567" property="dc:title">
          W3C Meeting in Mandelieu
        </span>
      </li>
    </ul>

NOTE: It took me a while to see the difference between /user/markb/photo/34567_thumbnail and /user/markb/photo/34567. The resource names were so similar I didn't think there was a distinction, thus the example was lost on me at first. It is also not very clear that the resources /user/markb/photo/23456 and /user/markb/photo/34567 even exist on the page... do they? Is this a realistic example? Wouldn't something like this be a bit more real-world:

    <p>
      <ul>
        <li><a href=""> <img src="" /> </a>,
        <li><a href=""> <img src="" />, </a>
      </ul>
      Photos:
      <span about="/user/markb/photo/23456" property="dc:title">
        Sunset in Nice
      </span>, and
      <span about="/user/markb/photo/34567" property="dc:title">
        W3C Meeting in Mandelieu
      </span>
    </p>
This same approach applies when the field value is a URI. For example, each photo in the album has a creator and may have its own copyright license. We can use the convenient inheritance of the

    <ul>
      <li about="/user/markb/photo/23456">
        <img src="" />,
        <span property="dc:title"> Sunset in Nice </span>
        taken by photographer
        <a property="dc:creator" href="" Mark Birbeck </a>,
        licensed under a
        <a rel="cc:license" href="" Creative Commons Non-Commercial License </a>.
      </li>
      <li about="/user/markb/photo/34567">
        <img src="" />
        <span property="dc:title"> W3C Meeting in Mandelieu </span>
        taken by photographer
        <a property="dc:creator" href="" Steven Pemberton </a>,
        licensed under a
        <a rel="cc:license" href="" Creative Commons Commercial License </a>.
      </li>
    </ul>

NOTE: Why would a web developer mark up a resource if it isn't linked to in some way from the current page? In other words, if you can't navigate to the resource, why would you include information on the resource in the current page? Again, is this a practical example? How does the web browser know that the image is located at /user/markb/photo/34567? I realize that this is one of the cool features of RDFa, but it doesn't come across as something that web developers would do. The question really has more to do with adoption than the Primer... the only thing it really helps are search engines, right?

While it makes sense for Shutr to have a whole web page dedicated to each photo album, it might not make as much sense to have a single page for each camera owned by a user. A single page that describes all cameras belonging to a single user is more appropriate. For this purpose, RDFa provides ways to mark up document fragments using natural XHTML constructs.

Consider the page

    <ul>
      <li id="nikon_d200">
        Nikon D200, 3 pictures/second.
      </li>
      <li id="canon_sd550">
        Canon Powershot SD550, 5 pictures/second.
      </li>
    </ul>

and the photo page will then include information about which camera was used to take each photo:

    <ul>
      <li>
        <img src="" />
        ...
        using the <a href="" D200</a>,
        ...
      </li>
      ...
    </ul>

The RDFa syntax for formally specifying the relationship is exactly the same as before, as expected:

    <ul>
      <li about="/user/markb/photo/23456">
        <img src="" />
        ...
        using the <a rel="shutr:takenWithCamera" href="" D200</a>,
        ...
      </li>
      ...
    </ul>
Then, the XHTML snippet at

    <ul>
      <li id="nikon_d200" about="#nikon_d200">
        <span property="dc:title"> Nikon D200 </span>
        <span property="shutr:shutterSpeed"> 3 pictures/second </span>
      </li>
      <li id="canon_sd550" about="#canon_sd550">
        <span property="dc:title"> Canon Powershot SD550 </span>
        <span property="shutr:shutterSpeed"> 5 pictures/second </span>
      </li>
    </ul>

Notice, again, how text can serve both for the human and machine readable versions: there is no need to keep a separate file up-to-date.

3.3 Data Types

When dealing with fields of structured data, one may well want (or need) to specify a data type so that computer programs that read the data can make sense of it. Consider the expression of a date. We have already seen how the human-rendered and machine-readable data may not be the same, and how we can use

    <ul>
      <li about="/user/markb/photo/23456">
        ... taken on
        <span property="dc:date" content="2007-05-12" datatype="xsd:date">
          May 12th, 2007
        </span>
        ...
      </li>
    </ul>

Note how we use XML data types.

3.4 Layers of Structure: Subresources

Sometimes, one may need to mark up a resource with a number of fields, all without giving it a URL, or even a fragment identifier. Consider the case where Shutr decides to let users annotate photos in order to indicate which individuals are depicted in the photo. The barebones XHTML is:

    This photo depicts Mark Birbeck ([EMAIL PROTECTED]) and Steven Pemberton ([EMAIL PROTECTED]).
The simplest way to mark this up without attempting to resolve unique identities for photo subjects is to define subresources, effectively new resources that are not given a name. (In RDF, we call these blank nodes.) The following markup will do just that, thanks to the FOAF (Friend-Of-A-Friend) vocabulary which includes the handy field

    <div about="/user/markb/photo/23456">
      This photo depicts
      <span rel="foaf:depicts">
        <span property="foaf:firstname">Mark</span>
        <span property="foaf:lastname">Birbeck</span>
        (<span property="foaf:mbox">[EMAIL PROTECTED]</span>)
      </span>
      and
      <span rel="foaf:depicts">
        <span property="foaf:firstname">Steven</span>
        <span property="foaf:lastname">Pemberton</span>
        (<span property="foaf:mbox">[EMAIL PROTECTED]</span>).
      </span>
    </div>
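To illustrate how unnamed subresources might turn into blank nodes, here is a small Python sketch, not a conforming RDFa parser, that reads a reduced copy of the markup above and lists the statements it implies. The class name SubresourceSketch and the labels _:b0 and _:b1 are hypothetical. The sketch assumes that an element carrying rel without href introduces a new unnamed resource that becomes the subject for everything nested inside it. The foaf:mbox fields are left out because the addresses are elided in the source.

```python
from html.parser import HTMLParser
from itertools import count

# Reduced copy of the markup above (mbox fields omitted).
MARKUP = """
<div about="/user/markb/photo/23456">
  This photo depicts
  <span rel="foaf:depicts">
    <span property="foaf:firstname">Mark</span>
    <span property="foaf:lastname">Birbeck</span>
  </span>
  and
  <span rel="foaf:depicts">
    <span property="foaf:firstname">Steven</span>
    <span property="foaf:lastname">Pemberton</span>
  </span>
</div>
"""


class SubresourceSketch(HTMLParser):
    """Hypothetical helper that emits (subject, predicate, object) tuples."""

    def __init__(self):
        super().__init__()
        self.triples = []
        self._ids = count()
        # One frame per open element: (subject in scope, pending property).
        self._frames = [(None, None)]
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        subject = self._frames[-1][0]      # inherit the parent's subject
        if "about" in attrs:
            subject = attrs["about"]
        if "rel" in attrs and "href" not in attrs:
            # Assumption: rel without href creates an unnamed resource.
            bnode = "_:b%d" % next(self._ids)
            self.triples.append((subject, attrs["rel"], bnode))
            subject = bnode
        prop = attrs.get("property")
        if prop is not None:
            self._text = []                # start collecting literal text
        self._frames.append((subject, prop))

    def handle_data(self, data):
        self._text.append(data)

    def handle_endtag(self, tag):
        subject, prop = self._frames.pop()
        if prop is not None:
            self.triples.append((subject, prop, "".join(self._text).strip()))


sketch = SubresourceSketch()
sketch.feed(MARKUP)
for triple in sketch.triples:
    print(triple)
```

Each person ends up as a separate unnamed resource linked from the photo by foaf:depicts, so the two first names and last names are not mixed together even though neither person has a URL.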
The use of the

3.5 Using