Manu Sporny wrote:
The most recent HTML+RDFa draft can be found here:
http://html5.digitalbazaar.com/specs/rdfa.html
Section 4.2 now says:
The markup above should produce the following triple:
<>
<http://example.org/vocab#markup>
"<rect xmlns=\"http://www.w3.org/2000/svg\" xmlns:ex=\"http://example.org/vocab#\"
→ xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" width=\"300\"
→ height=\"100\" style=\"fill:rgb(0,0,255);stroke-width:1; stroke:rgb(0,0,0)\"/>
→ <rect xmlns=\"http://www.w3.org/2000/svg\" xmlns:ex=\"http://example.org/vocab#\"
→ xmlns:rdf=\"http://www.w3.org/1999/02/22-rdf-syntax-ns#\" width=\"50\"
→ height=\"50\" style=\"fill:rgb(255,0,0);stroke-width:2;
→ stroke:rgb(0,0,0)\"/>"^^http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral
As far as I can tell, that violates the RDF specs.
http://www.w3.org/TR/rdf-concepts/#section-XMLLiteral says "The lexical
space is the set of all strings ... for which encoding as UTF-8 [RFC
2279] yields exclusive Canonical XML (with comments, with empty
InclusiveNamespaces PrefixList) [XML-XC14N]". The XML in the spec is not
in Exclusive Canonical Form (in particular I believe the xmlns:ex and
xmlns:rdf must not be present), so it's not a legal XMLLiteral string.
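
To illustrate the point about xmlns:ex and xmlns:rdf, here is a rough Python sketch using only the standard library. It is not an implementation of Exclusive C14N: it only approximates "visibly utilized" for a whole fragment by checking which declared namespaces ever appear in an element or attribute name. The helper name unused_prefixes is hypothetical, and the <rect> is abbreviated from the spec example (the style attribute is omitted).

```python
import io
import xml.etree.ElementTree as ET


def unused_prefixes(xml_text):
    """Return the declared prefixes whose namespace URI never appears in an
    element or attribute name. Hypothetical helper; '' is the default
    namespace. This is a fragment-wide approximation, not real XC14N."""
    declared = {}   # prefix -> namespace URI
    used = set()    # namespace URIs seen in element/attribute names
    source = io.BytesIO(xml_text.encode("utf-8"))
    for event, payload in ET.iterparse(source, events=("start-ns", "start")):
        if event == "start-ns":
            prefix, uri = payload
            declared[prefix] = uri
        else:
            # ElementTree expands qualified names to "{uri}local".
            for name in (payload.tag, *payload.attrib):
                if name.startswith("{"):
                    used.add(name[1:].split("}", 1)[0])
    return sorted(p for p, uri in declared.items() if uri not in used)


RECT = (
    '<rect xmlns="http://www.w3.org/2000/svg" '
    'xmlns:ex="http://example.org/vocab#" '
    'xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" '
    'width="300" height="100"/>'
)

print(unused_prefixes(RECT))  # prints ['ex', 'rdf']
```

Only the default SVG namespace is used by a name here, so the ex and rdf declarations are the ones I would expect Exclusive C14N to omit.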
This seems to be a somewhat more widespread problem within RDFa, e.g.
http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0011.sparql
explicitly permits output which violates the definition of XMLLiteral -
I think it should be updated to only permit output which is valid RDF
(i.e. with XC14N-style XMLLiterals).
http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0100.sparql
has the same issue, and presumably other XMLLiteral tests do too.
http://www.w3.org/TR/rdfa-syntax/ says:
The value of the [XML literal] is a string created by serializing to
text, all nodes that are descendants of the [current element], i.e., not
including the element itself, and giving it a datatype of
rdf:XMLLiteral.
which should be updated to state that the descendants must be serialized
with the Exclusive XML Canonicalization algorithm. Similarly, the
HTML+RDFa draft should refer to that algorithm instead of (or in
addition to?) HTML5's #serializing-xhtml-fragments algorithm.
(Separately from the c14n issue,
http://www.w3.org/TR/rdfa-syntax/#s_xml_literals says the expected
output for one example is '<> dc:title "E = mc<sup>2</sup>: The Most
Urgent Problem of Our Time"^^rdf:XMLLiteral', which is incorrect because
the <sup> element has lost its XHTML namespace declaration.)
Another concern with c14n: My understanding is that Exclusive C14n only
includes namespace declarations when the namespaces are "visibly
utilized" by element or attribute names, and namespaces used only in
CURIEs in attribute values are not visibly utilized, so their
declarations will be removed. So Exclusive C14n of an RDFa document or
fragment will almost certainly destroy its RDFa content. The only ways I
can see to fix that are to change rdf-concepts to not require XC14N, or
to change RDFa to not use XML Namespaces, though it might not be a
problem that's worth fixing.
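
A small Python sketch of the CURIE problem, with caveats: the standard library's canonicalize() implements C14N 2.0, not Exclusive C14N 1.0, so this is a stand-in chosen because it also emits only the namespace declarations needed by element and attribute names. The fragment below is made up for illustration.

```python
from xml.etree.ElementTree import canonicalize  # Python 3.8+

# Made-up RDFa fragment: the dc prefix is used only inside the CURIE in the
# property attribute value, never in an element or attribute name.
FRAGMENT = (
    '<span xmlns="http://www.w3.org/1999/xhtml" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/" '
    'property="dc:title">Example title</span>'
)

# C14N 2.0 is used here as a stand-in for Exclusive C14N 1.0 (assumption:
# both drop declarations that no element or attribute name needs).
canonical = canonicalize(FRAGMENT)

print(canonical)
print("xmlns:dc" in canonical)
```

If the dc declaration is dropped while property="dc:title" survives, an RDFa processor reading the canonical form can no longer resolve the CURIE, which is the breakage described above.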
--
Philip Taylor
pj...@cam.ac.uk