As an attempt to clarify my current views:
I'm working from the basis that for any arbitrary stream of bytes served
as text/html, it should be possible to determine the set of triples that
are extracted when you apply the standard RDFa triple-extraction
algorithm to the document, and determine whether the document is valid.
That behaviour should be well defined (including for invalid inputs) and
well tested, and should (eventually) match implementations. Ideally it
should also be easy to implement, match authors' expectations, be
similar to the XHTML syntax, and so on.
As an example of some arbitrary inputs, I've made
http://philip.html5.org/demos/rdfa/tests.html [currently very
experimental; only tested in Firefox 3.0 and Opera 9.6, has too little
documentation and too many bugs, etc] to illustrate various cases that
might be interesting. That also shows that current implementations are
quite varied in how they handle the inputs, and so presumably the
implicit mapping from text/html to RDFa-in-XHTML is not obvious enough
by itself to ensure interoperable implementations.
Given that basis, I can't see a sensible way to solve the problem
without relying on HTML 5 to define the mapping from an arbitrary stream
of bytes to a DOM (because practical text/html parsing isn't defined
anywhere else), and then defining the RDFa processing on top of that
(perhaps via an explicit mapping onto another RDFa spec if that's possible).
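As a rough illustration of that layering, here is a minimal Python sketch: step one turns bytes into parse events, step two runs a deliberately tiny extraction over them. Everything in it is an assumption for illustration. `TinyRdfaExtractor` and `extract` are hypothetical names; the standard library's `html.parser` is only a stand-in (it does not implement the HTML 5 parsing algorithm, and the plain UTF-8 decoding skips HTML 5's encoding detection); and the extraction rule (an element carrying both `property` and `content`, with an optional `about` on the same element) is far smaller than the real RDFa algorithm. CURIEs are left unexpanded.

```python
from html.parser import HTMLParser


class TinyRdfaExtractor(HTMLParser):
    """Hypothetical, heavily simplified extractor (NOT the RDFa algorithm)."""

    def __init__(self, base):
        super().__init__(convert_charrefs=True)
        self.base = base
        self.triples = []

    def handle_starttag(self, tag, attrs):
        # Keep the first occurrence of a duplicated attribute, which is
        # what HTML 5 does; a plain dict(attrs) would keep the last one.
        a = {}
        for name, value in attrs:
            a.setdefault(name, value)
        # A valueless attribute is reported as None; treat it as absent.
        if a.get("property") is None or a.get("content") is None:
            return
        subject = a.get("about") or self.base
        self.triples.append((subject, a["property"], a["content"]))


def extract(data, base):
    """Step 1: bytes -> parse events. Step 2: events -> triples."""
    # Assumption: decode as UTF-8 and replace undecodable bytes.
    text = data.decode("utf-8", errors="replace")
    parser = TinyRdfaExtractor(base)
    parser.feed(text)
    parser.close()
    return parser.triples


doc = (b'<p about="http://example.org/a" property="dc:title" content="Hi">x'
       b'<meta property="dc:creator" content="Bob">')
print(extract(doc, "http://example.org/doc"))
```

The point is only the shape: once the bytes-to-tree step is specified for every input, the extraction step can be specified purely in terms of elements and attributes.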
My document was a rough attempt to show how I imagined it could be
defined in a way that would give clear answers to all the test cases
above, by building on top of HTML 5, as an alternative approach to what
I saw in http://www3.aptest.com/standards/rdfa-html/ and
http://www.w3.org/TR/rdfa-syntax/
I don't intend this to be a competing specification - fragmentation
would certainly be bad, and (in the long term) everything should be
consistent, integrated and clear, and should all be defined in
official RDFa specifications. (I don't think I have the time,
motivation or skill to write a proper specification for this anyway, so
I'm more than happy to let other people do the work!)
It may have been a bad idea to make the document look like a spec, but
I'm not sure of a better way to express what I imagine a solution could
look like.
Responding to some specific points:
Shane McCarron wrote:
> I'm sorry that my draft "profile" document doesn't answer your
> questions. Of course my intent is to evolve that profile so that, in
> conjunction with the other RFCs, Candidate Recommendations, and
> Recommendations it normatively references, it represents a thorough
> description of the model for embedding RDFa in HTML documents.
That sounds like the best approach to the problem. My criticism of your
published document is coming from an understanding that it's an early
draft and doesn't claim to be perfect and there's plenty of opportunity
for any problems to be solved in the future. (My intent is for the
criticism to be constructive, not rude - apologies if it's too much of
the latter!)
http://lists.w3.org/Archives/Public/public-html/2009May/0127.html
highlighted some specific issues, but I didn't see how they could be
resolved by localised changes to your existing document, which is why I
wanted to look at a more radical way of trying to resolve those issues.
My way certainly isn't the best way, but I hope it can be used as a
piece of feedback that will lead to a better solution in the end.
> If there are things in the CURIE spec that need clarification, then that
> is the place to fix those.
Sure - perhaps my document should have said "I think the CURIE spec
should be clarified by changing it to say something more like: ...".
That would still have left out my reasons for wanting the change.
Basically, for some of the examples in
http://philip.html5.org/demos/rdfa/tests.html I can't tell what the
RDFa/CURIE specs say the output should be (mainly in terms of handling
errors), though I don't have an exhaustive list of cases. (Would such a
list be useful?)
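To make the kind of case list concrete, here is a minimal Python sketch of CURIE expansion in which every input has a stated outcome. The function name `resolve_curie`, the example prefix mapping, and the choice to return `None` for each error case are assumptions made for illustration, not behaviour taken from the RDFa or CURIE specs; the open question is precisely what those outcomes ought to be.

```python
from typing import Mapping, Optional


def resolve_curie(value: str, prefixes: Mapping[str, str]) -> Optional[str]:
    """Expand `prefix:reference` using `prefixes`, or return None.

    Returning None for every error case is this sketch's own choice.
    """
    prefix, colon, reference = value.partition(":")
    if not colon:
        return None  # no colon, so there is no prefix to look up
    if prefix not in prefixes:
        return None  # undeclared prefix (includes the empty prefix here)
    return prefixes[prefix] + reference


PREFIXES = {"dc": "http://purl.org/dc/elements/1.1/"}  # example mapping
CASES = ["dc:title", "dc:", "title", "foo:bar", ":next", "DC:title"]
for case in CASES:
    print(repr(case), "->", resolve_curie(case, PREFIXES))
```

Each entry in `CASES` is a candidate line in such a list: an input plus the output that the specs would need to pin down.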
> Personally, I would rather have a quality test suite that exercises the
> specification and ensure that suite gets extended to clarify any edge
> cases that implementors are curious about.
I would agree that's the best way to ensure the quality of
implementations - it'd be great if the tests I linked above could
perhaps become useful as part of that.
--
Philip Taylor
pj...@cam.ac.uk