XMLLiterals and c14n (was: HTML+RDFa (2nd draft))

Philip Taylor Mon, 07 Sep 2009 07:40:17 -0700

Manu Sporny wrote:

The most recent HTML+RDFa draft can be found here:


http://html5.digitalbazaar.com/specs/rdfa.html


Section 4.2 now says:

The markup above should produce the following triple:

<>
<http://example.org/vocab#markup>
"<rect xmlns=\"http://www.w3.org/2000/svg\" xmlns:ex=\"http://example.org/vocab#\"→
 xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" width=\"300\"→
 height=\"100\" style=\"fill:rgb(0,0,255);stroke-width:1; stroke:rgb(0,0,0)\"/>→
 <rect xmlns=\"http://www.w3.org/2000/svg\" xmlns:ex=\"http://example.org/vocab#\"→
 xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" width=\"50\"→
 height=\"50\" style=\"fill:rgb(255,0,0);stroke-width:2;→
 stroke:rgb(0,0,0)\"/>"^^http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral

As far as I can tell, that violates the RDF specs. http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral says "The lexical space is the set of all strings ... for which encoding as UTF-8 [RFC2279] yields exclusive Canonical XML (with comments, with empty InclusiveNamespaces PrefixList) [XML-XC14N]". The XML in the spec is not in Exclusive Canonical Form (in particular I believe the xmlns:ex and xmlns:rdf must not be present), so it's not a legal XMLLiteral string.
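
As a rough illustration of the point about xmlns:ex and xmlns:rdf, the sketch below lists the prefixes that a fragment declares but never uses in an element or attribute name. It is not a canonicalizer and it ignores namespace scoping (it treats the whole fragment as one scope); the helper names prefix_of and unused_namespace_prefixes are made up for this example, and the style attribute is left off the <rect> for brevity.

```python
from xml.dom import minidom


def prefix_of(qualified_name):
    """Return the prefix of a qualified name, or "" if it has none."""
    return qualified_name.split(":", 1)[0] if ":" in qualified_name else ""


def unused_namespace_prefixes(xml_text):
    """Prefixes declared in xml_text that no element or attribute name uses.

    Simplification: scoping is ignored, so a prefix counts as used if any
    element or attribute name anywhere in the fragment carries it.
    """
    declared, used = set(), set()
    doc = minidom.parseString(xml_text)
    for element in doc.getElementsByTagName("*"):
        # An unprefixed element name uses the default namespace ("").
        used.add(prefix_of(element.tagName))
        attrs = element.attributes
        for i in range(attrs.length):
            name = attrs.item(i).name
            if name == "xmlns":
                declared.add("")
            elif name.startswith("xmlns:"):
                declared.add(name[len("xmlns:"):])
            elif ":" in name:
                # Only prefixed attribute names use a namespace.
                used.add(prefix_of(name))
    return declared - used


rect = (
    '<rect xmlns="http://www.w3.org/2000/svg"'
    ' xmlns:ex="http://example.org/vocab#"'
    ' xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"'
    ' width="300" height="100"/>'
)
print(sorted(unused_namespace_prefixes(rect)))
```

For the first <rect> in the draft's example this reports ex and rdf, which are the two declarations that I believe must be absent from the canonical form.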

This seems a slightly more widespread problem within RDFa, e.g. http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0011.sparql explicitly permits output which violates the definition of XMLLiteral - I think it should be updated to only permit output which is valid RDF (i.e. with XC14N-style XMLLiterals). http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.sparql has the same issue, and presumably other XMLLiteral tests do too.


http://www.w3.org/TR/rdfa-syntax/ says:

The value of the [XML literal] is a string created by serializing to
text, all nodes that are descendants of the [current element], i.e., not
including the element itself, and giving it a datatype of
rdf:XMLLiteral.

which should be updated to state that the descendants must be serialized with the Exclusive XML Canonicalization algorithm. Similarly, the HTML+RDFa draft should refer to that algorithm instead of (or in addition to?) HTML5's #serializing-xhtml-fragments algorithm.

(Separately from the c14n issue, http://www.w3.org/TR/rdfa-syntax/#s_xml_literals says the expected output for one example is '<> dc:title "E = mc<sup>2</sup>: The Most Urgent Problem of Our Time"^^rdf:XMLLiteral', which is incorrect because it's lost the HTML namespace of the <sup> element.)
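
To make the lost-namespace point concrete, here is a small sketch using Python's standard ElementTree parser. The <h2> wrapper is an assumption made for illustration; the only thing that matters is that <sup> inherits the XHTML default namespace in the source document, whereas the literal string quoted in the expected output, parsed on its own, yields an element in no namespace.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# <sup> as it appears inside an XHTML document: it inherits the
# default namespace declared on an ancestor element.
heading = ET.fromstring(
    '<h2 xmlns="http://www.w3.org/1999/xhtml">'
    "E = mc<sup>2</sup>: The Most Urgent Problem of Our Time"
    "</h2>"
)
sup_in_document = heading.find("{%s}sup" % XHTML)

# <sup> as it appears in the literal string, with no declaration at all.
sup_in_literal = ET.fromstring("<sup>2</sup>")

print(sup_in_document.tag)
print(sup_in_literal.tag)
```

The two elements have different expanded names, so the literal no longer describes the same element as the source markup.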

Another concern with c14n: My understanding is that Exclusive C14n only includes namespace declarations when the namespaces are "visibly utilized" by element or attribute names, and namespaces used only in CURIEs in attribute values are not visibly utilized, so their declarations will be removed. So Exclusive C14n of an RDFa document or fragment will almost certainly destroy its RDFa content. The only ways I can see to fix that are to change rdf-concepts to not require XC14N, or to change RDFa to not use XML Namespaces, though it might not be a problem that's worth fixing.
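
A minimal sketch of that concern, under some loudly stated assumptions: the function name curie_prefixes_at_risk, the list of CURIE-bearing attributes, and the sample fragment are all chosen for illustration, and any colon-separated token in those attributes is treated as a CURIE. The function reports prefixes that appear only inside attribute values and never in an element or attribute name, which are the ones whose declarations would not count as visibly utilized.

```python
from xml.dom import minidom

# Assumed set of RDFa attributes whose values may hold CURIEs.
CURIE_ATTRS = ("property", "rel", "rev", "typeof", "datatype")


def curie_prefixes_at_risk(xml_text, curie_attrs=CURIE_ATTRS):
    """Prefixes used in CURIE attribute values but in no element/attribute name."""
    in_names, in_curies = set(), set()
    doc = minidom.parseString(xml_text)
    for element in doc.getElementsByTagName("*"):
        if ":" in element.tagName:
            in_names.add(element.tagName.split(":", 1)[0])
        attrs = element.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            if attr.name == "xmlns" or attr.name.startswith("xmlns:"):
                continue  # declarations are neither names nor CURIEs
            if ":" in attr.name:
                in_names.add(attr.name.split(":", 1)[0])
            if attr.name in curie_attrs:
                # Values are whitespace-separated lists of tokens.
                for token in attr.value.split():
                    if ":" in token:
                        in_curies.add(token.split(":", 1)[0])
    return in_curies - in_names


fragment = (
    '<span xmlns="http://www.w3.org/1999/xhtml"'
    ' xmlns:dc="http://purl.org/dc/elements/1.1/"'
    ' property="dc:title">E = mc<sup>2</sup></span>'
)
print(sorted(curie_prefixes_at_risk(fragment)))
```

Here dc is reported: it is needed to expand the property value, but nothing in the fragment uses it as part of an element or attribute name.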


--
Philip Taylor
pj...@cam.ac.uk
