Re: Request to publish HTML+RDFa (draft 3) as FPWD

Shane McCarron Mon, 21 Sep 2009 15:03:40 -0700

Maciej,

My comments inline:


Maciej Stachowiak wrote:

Here is the only implementation conformance requirement regarding prefix mapping that I could find in section 5.4:
"Since CURIE mappings are created by authors via the XML namespace syntax [XMLNS] an RDFa processor MUST take into account the hierarchical nature of prefix declarations."
I do not think this is adequate to define the processing model, particularly not in a non-XML context. Indeed, this requirement appears to be redundant with what Section 5.5 says (or is trying to say, anyway), so I'm not sure why it is there at all.

It doesn't define a processing model. It defines the syntax. Regardless, I appreciate that you feel it is extraneous AND insufficient. It is there to help tie together some concepts. See below.

Section 5.5 does define the processing model in some detail (modulo bugs). But even assuming the bugs are fixed, it does not define anything in the context of HTML as opposed to XML.

Of course it doesn't. Why would it? We had no scope to define ANYTHING about HTML processing when that document was written. Let's introduce some terms, so that we avoid more confusion. Let's call the existing, approved Recommendation "RDFa Syntax". Let's call the new candidate FPWD "RDFa in HTML".

RDFa Syntax is an XHTML specification. I fear that we are conflating the concepts of document processing with RDFa processing. The conceptual model of RDFa is one in which triples are extracted from a document that matches the *syntax* defined in the RDFa Syntax Recommendation. That extraction is achieved using the processing rules also defined in that Recommendation. What happens between retrieval of the resource (source document) conforming to the syntax rules and the parsing of that document by a Conforming RDFa Processor is outside the scope of the RDFa Syntax document. It is the purview of the underlying (host language) specification.

Manu's draft augments the *syntax* rules that are defined in RDFa Syntax so that they are supported in HTML5, but remain *identical* in XHTML and HTML. What you seem to be requesting is that we continue to extend the text to tighten down the definition. I have no objection to that in the abstract. However, as we do that we MUST ensure that the changes do not redefine behavior already defined elsewhere. Further, we MUST ensure that changes are made in RDFa Syntax when that is appropriate, and in RDFa in HTML when it is not something that affects the base, common syntax or the base, common triple extraction rules.

It would not be at all difficult to define this unambiguously in RDFa+HTML. Here is a sample attempt by me:
'When applying the processing rules of XHTML+RDFa section 5.5 to an HTML document, modify step 2 as follows. In addition to XML namespace declarations, attributes in no namespace that start with the string "xmlns:" create a namespace mapping if the attribute name matches the PrefixedName production from [Namespaces in XML]. For each such attribute, add a mapping to the [local list of URI mappings]; the value to be mapped is the attribute name with the first six characters (the initial "xmlns:") removed, and the value to map to is the attribute value.'
(Note: this allows proper namespace declarations added with namespace-aware DOM APIs to still work in HTML documents. If this is not desired, then simply replace "In addition to" with "Instead of".)
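
For illustration only, the proposed rule could be sketched in Python roughly as follows. This is not text from either specification. The function name prefix_mappings_from_attributes and the ASCII-only NCNAME pattern are assumptions made for the sketch; the real NCName production in [Namespaces in XML] admits many more characters. The input is assumed to be a plain dict of the no-namespace attribute names and values found on one element.

```python
import re

# ASSUMPTION: a simplified, ASCII-only stand-in for the NCName production.
# The real production in Namespaces in XML allows a much larger character set.
NCNAME = re.compile(r"[A-Za-z_][A-Za-z0-9._-]*\Z")


def prefix_mappings_from_attributes(attributes):
    """Collect prefix mappings from one element's attributes.

    `attributes` maps attribute names (in no namespace) to their values.
    Hypothetical helper; the name is not taken from any specification.
    """
    mappings = {}
    for name, value in attributes.items():
        # Only attributes whose name starts with the string "xmlns:" qualify.
        if not name.startswith("xmlns:"):
            continue
        # Drop the first six characters (the initial "xmlns:").
        prefix = name[6:]
        # Skip names such as "xmlns:" or "xmlns:a:b" that do not leave
        # a usable prefix behind.
        if NCNAME.match(prefix):
            mappings[prefix] = value
    return mappings
```

With this sketch, an element carrying xmlns:foaf="http://xmlns.com/foaf/0.1/" contributes a mapping for the prefix "foaf", while a bare xmlns="..." default declaration contributes nothing.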

Okay, I understand what you are looking for. I think that your suggested text is correct when talking about the DOM and Infoset processing. But the processing rules in section 5.5 are not written from a DOM or Infoset perspective - at least not exclusively nor intentionally. We really, really, really were talking about the syntax and then the extraction of data from structures that conform to that syntax. Obviously it is possible to construct DOM trees that contain the relevant attributes even if there were no source document at all - however, a conforming RDFa processor wouldn't know the difference so... it would behave as if there were a source document that conformed to the syntax.

Regardless, if a change of this nature satisfies your objection (and objections like yours) I would not object to its inclusion. I would probably suggest that text with similar precision be added to RDFa Syntax - if only as explanatory text supporting the interpretation of the rules in the context of the DOM.

As to your concern about section 5.5, thanks for bringing that to our attention. I proposed errata text to clarify that wording [1] and I expect it to be approved at the next Task Force meeting.
[1] http://lists.w3.org/Archives/Public/public-rdf-in-xhtml-tf/2009Sep/0092.html
I find that errata text a bit confusing. Here's a few issues I spotted:
1) Instead of "Mappings are provided via the PrefixedAttName production as defined in [XMLNS]," it should probably say "Mappings are provided by XML namespace declarations, excluding default namespace declarations, as defined in [XMLNS]". I say this because RDFa processing rules operate on an abstract tree model, and not at the raw textual level. If you want to use a grammar rule, you have to define what it applies to (the qualified name of the attribute I guess?)

Hmm... As I have indicated above, I believe that the RDFa Syntax specification *is* a grammar specification. It defines extensions to XHTML via a module, defines a markup language based upon that module, and provides a DTD for that language (we have an XML Schema implementation of it too, for some future update). The grammar of the prefix declarations for RDFa is defined via the PrefixedAttName production. Your suggested text achieves something very different - to me anyway. The implication is that the prefixes are provided by XML namespace declarations... which is not *wrong* for some environments... but it's surely not exclusively what we meant NOR exclusively how it is used in the wild.

I hope that we were very careful in the Recommendation to indicate that it is the *syntax* of the XML Namespace declarations that is used to define RDFa prefix mappings. The fact that those mappings ALSO declare an XML Namespace in some contexts is great... but from a syntax perspective we don't care. We don't use XML Namespaces. I have written a few different RDFa and generic CURIE processors now, and none of them used XML Namespaces. Namespaces are just not necessary in order to do the extraction of triples via the processing rules in section 5.5. I am sure there are tool chains where it is necessary (because some element of the chain has hidden the original syntax from the RDFa processor), but that is surely an exercise for that implementor in that style toolchain, isn't it? The specification is not aware of every possible way in which the *syntax* of a conforming source document is fed to a conforming RDFa processor.
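
To make the point concrete, here is a minimal Python sketch of CURIE expansion that needs nothing beyond a plain prefix-to-URI table, with no XML Namespace machinery involved. The function name expand_curie is hypothetical, and the sketch deliberately ignores cases the Recommendation treats specially (for example CURIEs with an empty prefix, or safe CURIEs in square brackets).

```python
def expand_curie(curie, mappings):
    """Expand a CURIE such as "foaf:name" using a plain dict of mappings.

    Returns the expanded URI, or None when the value has no colon or the
    prefix is not in `mappings`. ASSUMPTION: empty-prefix and bracketed
    (safe) CURIEs are out of scope for this sketch.
    """
    # Split on the first colon only; the reference part may contain colons.
    prefix, separator, reference = curie.partition(":")
    if not separator or prefix not in mappings:
        return None
    # Expansion is plain string concatenation of the mapped URI and reference.
    return mappings[prefix] + reference
```

Given a table that maps "foaf" to "http://xmlns.com/foaf/0.1/", the value "foaf:name" expands by simple concatenation; how that table was obtained (raw syntax, DOM, or something else) is invisible to this step.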

2) The erratum text says: "The real meaning if this is only clear in the context of Section 5.4.1. Scoping of Prefix Mappings, which normatively includes the syntax processing rules of the Namespaces in XML Recommendation," but section 5.4.1 does not appear to do that. The only mention of XMLNS is in a factual dependent clause: "Since CURIE mappings are created by authors via the XML namespace syntax [XMLNS]", that precedes a conformance requirement to "take into account the hierarchical nature of prefix declarations". As far as I can tell, there is no conformance requirement to follow the syntax processing rules of Namespaces in XML in general. If such a requirement was intended, it should be stated clearly. Though personally, I think it's better to precisely define the exact processing rules in section 5.5, since Namespaces in XML is defined purely at a textual level, but RDFa processing is defined on an abstract tree model, so it's not necessarily obvious how to apply the rules.

I see what you are saying, and I do not mind making a further specific normative reference to [XMLNS] and its attendant syntax in section 5.5 step 2. I will try to update my proposed errata text to reflect your concerns. I further agree that section 5.5 is written in a way that makes it possible to interpret the rules in the context of an abstract tree. However, that section does not REQUIRE an abstract tree in order for it to be implemented. You could, for example, implement the whole mess using a tokenizing parser that had callouts each time a token was encountered. I did an implementation that way in Perl just for fun one weekend.
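
As a rough illustration of the callout style (in Python rather than Perl, and not the implementation mentioned above), the sketch below uses the standard library's html.parser.HTMLParser, which invokes a method for each start and end tag it tokenizes. No tree is built; a stack of dictionaries carries the prefix mappings in scope. The class name PrefixScopeTracker, the `seen` list, and the VOID_ELEMENTS set are assumptions of the sketch, and it presumes well-nested markup.

```python
from html.parser import HTMLParser

# ASSUMPTION: elements that never receive an end tag in HTML, so they must
# not open a new scope on the stack.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
                 "link", "meta", "source", "track", "wbr"}


class PrefixScopeTracker(HTMLParser):
    """Track in-scope prefix mappings from tokenizer callouts, without a tree."""

    def __init__(self):
        super().__init__()
        self.scopes = [{}]   # stack of mappings; the top is the current scope
        self.seen = []       # (tag, mappings in scope at that start tag)

    def handle_starttag(self, tag, attrs):
        # Start from a copy of the parent's mappings (hierarchical inheritance).
        local = dict(self.scopes[-1])
        for name, value in attrs:
            if name.startswith("xmlns:") and len(name) > 6:
                # Declarations on this element override inherited ones.
                local[name[6:]] = value or ""
        self.seen.append((tag, local))
        if tag not in VOID_ELEMENTS:
            self.scopes.append(local)

    def handle_endtag(self, tag):
        # Leaving an element restores the parent's mappings.
        # ASSUMPTION: input is well nested; stray end tags are not handled.
        if tag not in VOID_ELEMENTS and len(self.scopes) > 1:
            self.scopes.pop()
```

Feeding a document to an instance with feed() fills `seen` with one entry per start tag. A real processor would run the remaining section 5.5 steps inside the same callouts instead of just recording the mappings.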

Needless to say, I am not satisfied that my comment on this has been addressed. It appears to me that the xmlns processing model for HTML remains totally undefined.
There is no "xmlns" processing model in RDFa. There is a syntax specification and rules for extracting prefix mappings from that syntax. Both of those are normative, including by reference for their relevant, defining Recommendations.
Namespaces in XML does not apply to HTML, it only defines grammar and processing rules for well-formed XML documents. So citing Namespaces in XML doesn't answer anything. It's like explaining UTF-8 by pointing to a spec for UTF-16 surrogate pairs.

Well - again, obviously I disagree. The syntax rules defined in Namespaces in XML could be used in ANY language if you wanted to. It's just an eBNF grammar, after all. The Namespaces in XML Recommendation does not define the ways in which those syntactic namespace declarations are mapped into a DOM, nor into the Infoset. It defines a syntax and also defines the hierarchic nature of XML Namespace declarations. Section 5.5 step 2 is (or at least attempts to be) explicit about the handling of this syntax. Section 5.5 overall defines a recursive processing model that incorporates the hierarchic nature of the prefix declared via the syntax. Finally, section 5.4.1 expressly refers to syntax of XML Namespaces AND their hierarchical nature, in an attempt to ensure these concepts were clear to the reader / implementor - in particular to an implementor who might not be working in some abstract tree environment nor in some environment in which the "namespaceness" of the XML Namespace declarations is enforced.

It's really not very hard to define the processing rules in a clear and precise way. I gave an example for how to do it. This doesn't have a material effect on the intent of the spec, it just makes it unambiguous.

I am sure that we are working toward the same goals here. Several of us in the RDFa Task Force spent a lot of time over years trying to ensure that the language in the RDFa Syntax document is not proscriptive. That remains my primary concern. We have to be certain that we are not pre-supposing a processing model nor a processing environment. That doesn't mean we can't say what we mean in more precise language. It also doesn't mean we cannot provide guidance to implementors in various environments. However, I am adamant that guidance be provided outside of the W3C Recommendation (e.g., in an implementor's guide wiki). That way we can keep it up to date, extend it as we learn, and NOT put implementation-specific language into the general case document that a W3C recommendation should (always) be.


Thanks as always for your insight!

--
Shane P. McCarron                          Phone: +1 763 786-8160 x120
Managing Director                            Fax: +1 763 786-8180
ApTest Minnesota                            Inet: sh...@aptest.com
