Re: [whatwg] RDFa

Dan Brickley Sat, 23 Aug 2008 08:16:57 -0700

+cc: Paul Miller of Talis, who worked on the AHDS report mentioned below.


Henri Sivonen wrote:

On Aug 23, 2008, at 02:43, Ben Adida wrote:
Why would you reinvent URIs in a way that they can't be de-referenced?
To avoid having misleading affordances.
http://en.wikipedia.org/wiki/Affordance
We want one parser, with variability and innovation in the vocabulary definition only.
Having one parser seems appealing compared to using the native mechanisms of each of HTML (<meta>, <link>), PDF (document information dictionary), PNG (tEXt chunk), etc. at first, but the vision that tools handle this all when you remix culture already requires the tools to support reading and writing the file formats they remix. When you already have format-native key-value read/write capability, the ability to build and mine RDF *graphs* becomes an additional burden.

It may not be obvious to those who haven't followed the history, or who were at school at the time, but many of us did indeed invest a lot of time and effort using name/value metadata structures in HTML. For example, the Dublin Core project began with this technology base back in 1994/5, and the experience of metadata implementors using it was one of the drivers for the creation of RDF. At the time there was no WHATWG to talk to, but the metadata community *did* talk to W3C.


See http://dublincore.org/about/history/

Early on, the Dublin Core community found a lot of pressure for feature-creep: new elements/terms to address the needs of various groups who liked Dublin Core, but wanted some specifics added. This situation gave rise to the 'Warwick Framework', defined in 1996 - http://www.dlib.org/dlib/july96/lagoze/07lagoze.html

[[

While there was consensus among the attendees that the concept of a simple metadata set is useful, there were a number of fundamental questions concerning the real utility of the Dublin Core as it was defined at the end of the preceding workshop. Does the very loosely defined Dublin Core really qualify as a "standard" that can be read and processed programmatically? Should the number of the core elements be expanded, to increase semantic richness, or reduced, to improve ease-of-use by authors and/or web publishers? Will authors reliably attach core metadata elements to their content? Should a core metadata set be restricted to only descriptive cataloging information or should it include other types of metadata such as administrative information, linkage data, and the like? What is the relationship of the Dublin Core to other developing work in metadata schemes, particularly in those areas such as rights management information (terms and conditions)?

The workshop attendees concluded that the answer to these questions and the route to progress on the metadata issue lay in the formulation a higher-level context for the Dublin Core. This context should define how the Core can be combined with other sets of metadata in a manner that addresses the individual integrity, distinct audiences, and separate realms of responsibility of these distinct metadata sets.

]]

For an implementor report typical of the experience from this era, i.e. with name/value pairs, see the UK Arts and Humanities Data Service document http://ahds.ac.uk/public/metadata/discovery.html which was presented at the Oct'97 Helsinki workshop of the Dublin Core. At the time I was involved with the ROADS internet cataloguing project and can vouch that we hit a similar ceiling with attribute/value metadata.

From the appendix, http://ahds.ac.uk/public/metadata/disc_09.html ... here are some of the attribute/value structures they were forced to squash their metadata records into.



DC.creator.corporateName.1
        Canterbury Archaeological Trust

DC.creator.phone.1
        +44 227 462062


DC.creator.personalName.2
        Paul Miller

DC.creator.affiliation.2
        Archaeology Data Service

...this expresses name, affiliation and contact information for a number of contributors to a work. Another example describes several contributors along with their roles (actor, director, etc). Again the attribute/value representations contained numeric indexes ('DC.creator.role.9') to disambiguate which individual was being described.
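
To make the cost of those numeric indexes concrete, here is a rough Python sketch of the regrouping step that every consumer of such records ends up writing. The key pattern and the regroup() helper are my own assumptions for illustration, inferred from the four keys quoted above; they are not taken from the AHDS report.

```python
import re
from collections import defaultdict

# Flat attribute/value pairs, copied from the appendix excerpt above.
flat = [
    ("DC.creator.corporateName.1", "Canterbury Archaeological Trust"),
    ("DC.creator.phone.1", "+44 227 462062"),
    ("DC.creator.personalName.2", "Paul Miller"),
    ("DC.creator.affiliation.2", "Archaeology Data Service"),
]

# Assumed key shape: DC.<element>.<field>.<index> (a guess from the examples).
KEY = re.compile(r"^DC\.(?P<element>\w+)\.(?P<field>\w+)\.(?P<index>\d+)$")

def regroup(pairs):
    """Rebuild one record per (element, index) from the flat keys."""
    records = defaultdict(dict)
    for name, value in pairs:
        m = KEY.match(name)
        if m is None:
            continue  # key does not follow the assumed convention
        records[(m["element"], int(m["index"]))][m["field"]] = value
    return dict(records)

creators = regroup(flat)
```

The structure that says "this phone number belongs to that organisation" lives only in a naming convention, so each tool has to know the convention in advance to recover it.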

What barrier is there to building reusable vocabularies?
The follow-your-nose principle is missing, which is fairly essential for
discovering the meaning of vocabularies (partially automatically, not by
doing a Google search.)
The partial automation with RDFa doesn't go very far. If a program automatically dereferences http://creativecommons.org/ns# and parses the result as RDFa, the program now has a human-readable string for each property--not exactly something that the program can act on further without human help.



Looking at this example,

          <div id="license" about="#license" typeof="rdf:Property">
              <h4>cc:license</h4>

A <a rel="rdfs:domain" href="#Work">Work</a> <span property="rdfs:label">has license</span> a <a rel="rdfs:range" href="#License">License</a>. <br />

(a <a rel="rdfs:subPropertyOf" href="http://purl.org/dc/terms/license">subproperty of dc:license</a>, <a rel="owl:sameAs" href="http://www.w3.org/1999/xhtml/vocab#license">the same as xhtml:license</a>)

          </div>

Actually we can do a fair bit more than simply have human readable strings. For example from the CC case, we've got a sub-property relationship between cc:license and dc:license. RDF often (more often, even) has relationships amongst classes too, and between classes and properties. So for example, the SIOC vocabulary defines a class sioc:User as a subclass of foaf:OnlineAccount; this is mechanically evident from http://rdfs.org/sioc/ns# .... similarly, http://trac.usefulinc.com/doap defines the DOAP vocabulary, schema here - http://usefulinc.com/ns/doap# (webserver misconfigured re mimetype right now). DOAP defines a class doap:Project that subclasses FOAF's 'Project' class, and which comes with a number of properties describing opensource software projects. Again this is mechanically evident. As the ccREL paper explains, and I can confirm w.r.t. FOAF, it is very useful to allow related projects to define related classes and properties but manage their evolution separately. It's a strategy for making incremental progress without a single project/organization carrying the burden of total coordination. Edd and friends in the DOAP project, for example, can keep developing new properties for describing projects. Elsewhere in the Web, we can be annotating the URI for 'foaf:Project' e.g. with translations. http://svn.foaf-project.org/foaftown/foaf18n/foaf-kr.rdf tells us that a Korean rdfs:label for http://xmlns.com/foaf/0.1/Project is "프로젝트 (어떤 형태의 협업).". The DOAP list is busy figuring out how they might want (within DOAP or elsewhere, depending on complexity) to model customer relationships w.r.t. DOAP's notion of project, see http://lists.usefulinc.com/pipermail/doap-interest/2008-August/000338.html ... but whatever they come up with will be linked back to other information about FOAF's broader notion of Project.

So while it is useful to have human readable strings (including translations) we also get simple relationships between independently defined vocabulary terms. RDFS basics here are sub-property, sub-class, range and domain. Without clear Web identifiers for vocabulary terms I believe this kind of distributed, collaborative approach becomes significantly harder. And I believe the experience of many in the Dublin Core metadata scene since the mid-90s backs this up...
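
As a rough illustration of what "mechanically evident" buys a program, here is a minimal Python sketch that applies only the sub-property and sub-class rules to a handful of triples. The three schema statements are the ones mentioned above; the instance data (ex:photo, ex:tool, ex:some-license) is made up for the example, and infer() is a naive hypothetical helper, not any particular RDF library's API.

```python
SUBPROP = "rdfs:subPropertyOf"
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"

# Schema statements published by three independent vocabularies.
schema = {
    ("cc:license", SUBPROP, "dc:license"),
    ("sioc:User", SUBCLASS, "foaf:OnlineAccount"),
    ("doap:Project", SUBCLASS, "foaf:Project"),
}

# Hypothetical instance data, invented purely for illustration.
data = {
    ("ex:photo", "cc:license", "ex:some-license"),
    ("ex:tool", TYPE, "doap:Project"),
}

def infer(triples):
    """Apply the sub-property and sub-class rules until nothing new appears."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # (s p o) and (p subPropertyOf q) gives (s q o)
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
                # (s type C) and (C subClassOf D) gives (s type D)
                if p2 == SUBCLASS and p == TYPE and s2 == o:
                    new.add((s, TYPE, o2))
        if new <= known:
            return known
        known |= new

closure = infer(schema | data)
```

A consumer that only understands dc:license or foaf:Project can now use data written against cc:license or doap:Project, without the CC, DOAP and FOAF projects having coordinated beyond publishing those links.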


cheers,

Dan

--
http://danbri.org/

For an example of browsing this kind of data structure btw see http://mqlx.com/~david/parallax/
