Re: Request to publish HTML+RDFa (draft 3) as FPWD

Maciej Stachowiak Tue, 22 Sep 2009 16:49:02 -0700


On Sep 22, 2009, at 3:42 PM, Mark Birbeck wrote:

Hi Jonas,

It certainly matters. If, for example, method 1 or 2 were used then
no prefix mappings would be found at all in the DOM output from an HTML
parser. So it really *does* matter how you do prefix mapping. And as
far as DOM 2 goes, I think 1 or 2 are the intuitive solutions so if
we're not using those then I *really* think it's important to specify
so.

In any case, I think I've spent enough time on this issue. I can't
really articulate the problem any more than I have. I hope this issue
is solved by the time last call rolls around.


I see that you are frustrated, but you seem to think that the issue is
that no-one understands your position.

We *do* understand your position, and are trying to explain to you,
that -- with all due respect -- it is based on a misunderstanding.

You are looking at implementation specifics, and as many people have
explained, implementation is not the issue. This is because the spec
is defining an algorithm, which entitles people to implement things
how they see fit, on whatever platform they want to write for, using
whatever language they want to use.

What Jonas is saying is that the spec algorithms as stated don't let you choose between implementation strategies that at first glance seem equally valid but in fact will give different results. He gave some specific examples - how to get prefix mappings in a DOM, how to extract triples from an HTML document that would result in reparenting, and whether prefix mappings should be assigned to elements at parse time or extraction time if the DOM can be mutated after parsing.

It seems like people reject his arguments for what superficially appear to be mutually contradictory reasons: (a) that RDFa doesn't really use Namespaces in XML, it just uses a syntax that looks the same but could have been anything; (b) that RDFa normatively references Namespaces in XML for implementation requirements; (c) that RDFa is defined purely at the raw source text level (even though the spec's processing rules speak of an abstract tree model); (d) that RDFa can be applied directly to situations where original source text is not available or may not even exist.

I'm pretty puzzled by the argument that RDFa is defined in terms of raw source text. The start of Section 5 of XHTML+RDFa says:

"Processing need not follow the DOM traversal technique outlined here, although the effect of following some other manner of processing must be the same as if the processing outlined here were followed. The processing model is explained using the idea of DOM traversal which makes it easier to describe (particularly in relation to the [evaluation context])."

And indeed Section 5 describes processing in terms of DOM concepts such as "document object", "child element", "document order" and so forth. Later Section 5.5 describes its algorithm as "the DOM traversal technique defined here".

It seems to me like it would be much more fruitful to go with this DOM-like formalism instead of pretending that things are actually defined at the textual level. They are not - nowhere does RDFa describe how to get from source characters to its tree model for processing, that is all left up to other specs (and with the understanding that implementations may do things without a tree, as long as they give equivalent results).

Buying into the DOM-based model that XHTML+RDFa already uses for its processing rules would immediately answer many of Jonas's questions:

- HTML5+RDFa should be processed by taking the DOM that results from the HTML5 parsing algorithm. As with XHTML+RDFa, you don't have to literally create a DOM, but your output must be equivalent to the processing defined in DOM terms.

- DOM mutations that happen before RDFa extraction *do* potentially affect the extracted triples.

- HTML source documents that are parsed in a way that reparents nodes are processed according to the resulting DOM, with the nodes in their reparented positions.

- There is no need to first serialize a DOM in order to process it according to RDFa.
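To make the mutation point concrete, here is a rough sketch rather than anything taken from a spec. It assumes prefix mappings are read from attributes whose names start with "xmlns:", uses Python's xml.dom.minidom purely as a stand-in for the DOM an HTML5 parser would produce, and the helper name in_scope_prefixes is made up for illustration. It shows that a snapshot taken at parse time and a lookup done at extraction time can disagree once a script has moved a node:

```python
from xml.dom import minidom


def in_scope_prefixes(element):
    """Hypothetical helper: collect xmlns:* attributes on element and its ancestors."""
    found = {}
    node = element
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        for i in range(node.attributes.length):
            attr = node.attributes.item(i)
            if attr.name.startswith("xmlns:"):
                # setdefault keeps the nearest (innermost) declaration.
                found.setdefault(attr.name[len("xmlns:"):], attr.value)
        node = node.parentNode
    return found


doc = minidom.parseString(
    '<body>'
    '<div xmlns:ex="http://example.org/a#"><span property="ex:name">x</span></div>'
    '<div xmlns:ex="http://example.org/b#"/>'
    '</body>'
)
span = doc.getElementsByTagName("span")[0]
first_div, second_div = doc.getElementsByTagName("div")

# Snapshot of the mappings as they stood right after parsing.
at_parse_time = in_scope_prefixes(span)

# Simulate a script moving the node; appendChild detaches it from first_div.
second_div.appendChild(span)

# The same lookup done at extraction time now sees a different ancestor chain.
at_extraction_time = in_scope_prefixes(span)
```

Under the DOM-based reading, the extraction-time answer is the one that counts, and no serialization step is involved at any point.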

The only detail that would have to be filled in, if we accept the DOM-based model that the spec already uses, is how to find the prefix mappings. Either an XHTML+RDFa erratum or HTML5+RDFa could specify that any attribute with a qualified name (Attr.name; tagName applies only to elements) that starts with "xmlns:" creates a prefix mapping.
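As a minimal sketch of that rule, and not a proposal for actual spec text: the code below walks from an element up through its ancestors, treats every attribute whose qualified name starts with "xmlns:" as a prefix mapping, and uses the result to expand a CURIE. It assumes the nearest declaration wins, uses xml.dom.minidom as a stand-in for whatever DOM the parser produces, and the function names prefix_mappings and expand_curie are illustrative only.

```python
from xml.dom import minidom


def prefix_mappings(element):
    """Return {prefix: uri} in scope at element; the nearest declaration wins."""
    mappings = {}
    node = element
    # Stop at the Document node, which is not an element.
    while node is not None and node.nodeType == node.ELEMENT_NODE:
        attrs = node.attributes
        for i in range(attrs.length):
            attr = attrs.item(i)
            # Purely name-based test on the attribute's qualified name.
            if attr.name.startswith("xmlns:"):
                mappings.setdefault(attr.name[len("xmlns:"):], attr.value)
        node = node.parentNode
    return mappings


def expand_curie(element, curie):
    """Expand 'prefix:reference' using the mappings in scope, or return None."""
    prefix, _, reference = curie.partition(":")
    uri = prefix_mappings(element).get(prefix)
    return None if uri is None else uri + reference


doc = minidom.parseString(
    '<div xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:ex="http://example.org/a#">'
    '<span xmlns:ex="http://example.org/b#" property="dc:title">T</span>'
    '</div>'
)
span = doc.getElementsByTagName("span")[0]

title_iri = expand_curie(span, "dc:title")
inner_ex = expand_curie(span, "ex:thing")
unknown = expand_curie(span, "foo:thing")
```

Because the test looks only at attribute names, it works the same way on a DOM that was built or modified by script and never existed as source text.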

Buying into the DOM approach would also address Henri's objection about bad spec layering.


Regards,
Maciej
