Re: XMLLiterals and c14n

Philip Taylor Mon, 07 Sep 2009 10:30:01 -0700

Ivan Herman wrote:

Sigh. This is indeed a slightly muddy area where the RDF concept
document should be written differently. But, well, this is not something
either of these two working groups can do...


I think the issue is that the RDF concept spec describes the abstract
concepts for abstract RDF graphs, and not a serialization thereof.  [...]

As I understand it, rdf-concepts explicitly describes the lexical space
of XMLLiterals, i.e. the set of Unicode strings which values of type
XMLLiteral must be a member of.

I'm happy to agree that serialisations like RDF/XML and RDFa specify
their own transformations/mappings from the input document onto that
abstract RDF lexical space, and there's no need for the input document
to care about C14N at all - the input can be anything, and the mapping
can be arbitrarily complicated, as long as the resultant triples contain
values from the appropriate lexical space.

But serialisations of RDF like N3/Turtle/N-Triples represent XMLLiterals
as typed strings. I'm making the (hopefully reasonable) assumption that
those strings correspond directly (after appropriate charset decoding)
to the lexical space defined by rdf-concepts - there is no non-trivial
mapping there. (In particular, no automatic canonicalisation occurs.)
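To make that assumption concrete, here is a minimal Python sketch of
what I mean by a "trivial" mapping: the quoted string of an N-Triples
typed literal is unescaped and then used as the lexical form unchanged.
The names LITERAL, unescape and lexical_form are hypothetical, the
example.org triple is made up, and only the escapes from the original
N-Triples grammar are handled. It is an illustration of the assumption,
not a full parser.

```python
import re

# Matches "…"^^<datatype> in one N-Triples line (illustrative, not a full parser).
LITERAL = re.compile(r'"((?:[^"\\]|\\.)*)"\^\^<([^>]*)>')

# Escapes from the original N-Triples grammar: \t \n \r \" \\ \uXXXX \UXXXXXXXX
ESCAPE = re.compile(r'\\u([0-9A-Fa-f]{4})|\\U([0-9A-Fa-f]{8})|\\([tnr"\\])')
SIMPLE = {"t": "\t", "n": "\n", "r": "\r", '"': '"', "\\": "\\"}


def unescape(s):
    """Undo N-Triples string escapes; nothing else is changed."""
    def repl(m):
        hex_digits = m.group(1) or m.group(2)
        if hex_digits:
            return chr(int(hex_digits, 16))
        return SIMPLE[m.group(3)]
    return ESCAPE.sub(repl, s)


def lexical_form(line):
    """Return (lexical form, datatype IRI) for the typed literal in `line`."""
    m = LITERAL.search(line)
    if m is None:
        raise ValueError("no typed literal found")
    # No canonicalisation step: the unescaped string *is* the lexical form.
    return unescape(m.group(1)), m.group(2)


# Made-up triple; note the attributes are not in canonical (sorted) order.
line = (r'<http://example.org/s> <http://example.org/p> '
        r'"<b id=\"x\" class=\"y\">hi</b>"'
        r'^^<http://www.w3.org/1999/02/22-rdf-syntax-ns#XMLLiteral> .')
lexical, datatype = lexical_form(line)
```

Under this reading the attribute order written in the file survives into
the triple, so a non-canonical string in the file yields a non-canonical
string in the graph.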

(If that assumption is wrong, and there is a non-trivial mapping between
N3/Turtle/N-Triples serialised strings and the XMLLiteral lexical space,
then I can't find any definition of that mapping at all, which is a
bigger problem (unless I'm just missing it).)

The RDFa spec examples and test cases represent triples using
Turtle/N-Triples as the serialisation format, so their strings map
directly onto the restricted lexical space, so I believe those
particular cases need to use canonicalised form for their serialisations
of XMLLiteral strings.

The RDFa spec also refers to abstract triples (as the result of
processing a document), at which point there is no serialisation
involved at all, and so a value of type XMLLiteral must be a member of
the lexical space of XMLLiteral, i.e. must be a canonical-form string.
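As a rough illustration of "must be a canonical-form string", the
following Python sketch tests membership by checking whether a string
is unchanged by canonicalisation. Two caveats: the function name
is_in_lexical_space is hypothetical, and Python's
xml.etree.ElementTree.canonicalize implements C14N 2.0, which I am using
only as a stand-in for the exclusive canonicalisation that rdf-concepts
refers to. It also only accepts a single root element, whereas a real
XMLLiteral may be mixed content.

```python
from xml.etree.ElementTree import ParseError, canonicalize


def is_in_lexical_space(s):
    """True if `s` is already in canonical form (stand-in: C14N 2.0)."""
    try:
        return canonicalize(s) == s
    except ParseError:
        # Not well-formed XML at all, so certainly not in the lexical space.
        return False


# Canonical form sorts attributes and expands empty-element tags.
canonical = '<b class="y" id="x">hi</b>'
reordered = '<b id="x" class="y">hi</b>'
empty_tag = "<br/>"
expanded = "<br></br>"
```

With this check, `reordered` and `empty_tag` are perfectly valid XML but
would not be acceptable as the string in an abstract triple, while
`canonical` and `expanded` would be.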

So I think I agree with everything you are saying (that RDF/XML and RDFa
don't require c14n of their input) and I think that's all good, but I
don't think that's addressing the problems I see (which are with the
abstract triple output of RDFa, and with specific examples of
Turtle/N-Triples serialised triples).

(On a practical level, all RDF environments and serializations I know
about behave similarly: they would take any (valid) XML as XML Literal,
and the C14N comes into the picture when two XML literals are checked,
eg, for equality.)

(If equality is always checked in terms of C14N-equivalence, why does
http://www.w3.org/2006/07/SWD/RDFa/testsuite/xhtml1-testcases/0011.sparql
say that the output must equal either one of two strings that are
C14N-equivalent? If it's equal to one, it would also be equal to the
other. So I presume at least some implementations just do simple string
equality, instead of dealing with C14N when checking equality, and the
C14N should be dealt with at an earlier point (when generating the
triples) to avoid making equality comparisons hopelessly inefficient.)
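The difference between the two strategies can be sketched in Python as
follows. The two literals are made up for illustration (they are not
copied from the test case), the function names are hypothetical, and
xml.etree.ElementTree.canonicalize (C14N 2.0) again stands in for the
canonicalisation that rdf-concepts actually specifies.

```python
from xml.etree.ElementTree import canonicalize

# Hypothetical literals: same XML, different attribute order and quoting.
A = '<sup xmlns="http://www.w3.org/1999/xhtml" class="x" id="y">2</sup>'
B = "<sup id='y' class='x' xmlns='http://www.w3.org/1999/xhtml'>2</sup>"


def equal_late(x, y):
    """Canonicalise on every comparison: two XML parses per equality test."""
    return canonicalize(x) == canonicalize(y)


def make_literal(xml_fragment):
    """Canonicalise once, at the point where the triple is generated."""
    return canonicalize(xml_fragment)


# Strategy 1: store whatever came in, pay for C14N on each comparison.
late_result = equal_late(A, B)

# Strategy 2: store canonical strings, then compare with plain ==.
stored_a = make_literal(A)
stored_b = make_literal(B)
early_result = stored_a == stored_b
```

Both strategies agree that the literals are equal, but only the second
lets a triple store or SPARQL engine use ordinary string comparison,
which is what I would expect most implementations to do.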

Ivan


--
Philip Taylor
pj...@cam.ac.uk
