Shane McCarron wrote:
...
Because, Henri, we don't grok the problem. I am slowly beginning to
understand that this might be due to our talking past one another. The
W3C has a Recommendation that defines the Syntax of RDFa *input* and the
extraction of RDF triples from that *input*. It defines this as an
extension to XHTML. XHTML Modularization provides the structure for a
host language. The Recommendation is carefully vague about how that
input is parsed because that is properly the job of the host language.
...
It appears that one *real* problem was mentioned: the case where the
HTML source document is invalid and the HTML parser rearranges
elements before a DOM-based RDFa extractor even sees them (the table
example).
This *is* a problem, in particular because prefix mappings that appear
to be in scope when looking at the source will no longer be in scope
once the document has been processed by the HTML parser.
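
To make the scoping problem concrete, here is a minimal Python sketch.
It does not run a real HTML parser: both trees are built by hand, and
the Node class, the in_scope_prefixes helper, and the markup itself are
made up for illustration. It assumes the parser moves a span placed
directly inside a table to just before the table, which is what the
HTML5 parsing algorithm's "foster parenting" does.

```python
class Node:
    """Tiny stand-in for a DOM element (hypothetical, for illustration)."""

    def __init__(self, tag, attrs=None, parent=None):
        self.tag = tag
        self.attrs = attrs or {}
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def in_scope_prefixes(node):
    """Collect xmlns:* mappings from the node and its ancestors.

    The nearest declaration wins, so a prefix already seen is not
    overwritten by one found further up the tree.
    """
    mappings = {}
    while node is not None:
        for name, value in node.attrs.items():
            if name.startswith("xmlns:"):
                mappings.setdefault(name[len("xmlns:"):], value)
        node = node.parent
    return mappings


DC = "http://purl.org/dc/elements/1.1/"

# The tree as the author wrote it (invalid: span directly inside table).
src_body = Node("body")
src_table = Node("table", {"xmlns:dc": DC}, src_body)
src_span = Node("span", {"property": "dc:title"}, src_table)

# The tree as assumed after parsing: the span has been moved in front
# of the table, so the table is now its sibling, not its ancestor.
dom_body = Node("body")
dom_span = Node("span", {"property": "dc:title"}, dom_body)
dom_table = Node("table", {"xmlns:dc": DC}, dom_body)

source_scope = in_scope_prefixes(src_span)
dom_scope = in_scope_prefixes(dom_span)

print("in scope in the source:", source_scope)
print("in scope in the DOM:   ", dom_scope)
```

In the source tree the "dc" prefix resolves for the span; in the
rearranged tree it does not, so a DOM-based extractor has nothing to
expand "dc:title" with.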
If this can't be resolved somehow (and I have no idea how), the only
resort seems to be to state that the result for documents like these is
undefined. (*)
BR, Julian
(*) It would be nice if the information about whether the source was
re-arranged by the HTML processor were available to scripts.