Re: Does xml:base apply to type="html" content?

Antone Roundy Thu, 30 Mar 2006 21:20:27 -0800


On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote:

...the content element can be basically anything as long as it's either
- non-escaped plain text with a @type value set to text,
- escaped text, with a @type set to a valid 'text' MIME type,
- entity-escaped with @type set to html,
- xhtml wrapped in a properly xhtml-namespaced div with @type set to xhtml,
- base64 encoded with @type set to the proper media type, or
- it's xml with @type set to a proper XML MIME type.
In each of these cases, the only one that should have even a remote chance of the current value of the @xml:base in the current context applying to it is inline xml.

...

The escaped HTML content contained within the content element that David was originally concerned with is more than likely a copy of all or part of the elements and content contained inside the body tag of the external document referenced by an associated link element, and therefore there is no guarantee that the xml:base of the atom feed is going to be anywhere even close to accurate.

On what basis are you concluding that Atom publishers are more likely to be smart enough to set xml:base correctly when publishing inline XML than when publishing escaped HTML? What if the source material is tag soup HTML? You could clean it up and turn it into XHTML or publish it as is as escaped HTML. Either option is valid, and may be preferable in some situations. I don't see how any assumptions can be made about the publisher's ability to set xml:base correctly based on the content type.

If you're assuming that xml:base is going to be set only at the top of the Atom document, then it may very well fail to be correct for a lot of the content. But xml:base may also be set on the entry or content element, and could easily be set correctly based on the publisher's knowledge of the appropriate base URI for the content.
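To make the scoping concrete, here is a minimal Python sketch of how a consumer might work out which base URI is in scope for each content element when xml:base can appear on the feed, the entry, or the content element itself. The function name effective_bases and the example.org / example.com / example.net URIs are made up for illustration; it assumes Atom 1.0 namespaced elements and uses only the standard library.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

# ElementTree expands the predeclared xml: prefix to this namespace URI.
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM_CONTENT = "{http://www.w3.org/2005/Atom}content"


def effective_bases(feed_xml, retrieval_uri):
    """Return the in-scope base URI of every atom:content element, in document order.

    retrieval_uri is the URI the feed was fetched from; it is the starting
    base when no xml:base is declared anywhere above an element.
    """
    root = ET.fromstring(feed_xml)
    bases = []

    def walk(element, inherited_base):
        # An xml:base value may itself be relative, so it is resolved against
        # the base inherited from the parent. A missing attribute leaves the
        # inherited base unchanged (urljoin returns the base for an empty URL).
        base = urljoin(inherited_base, element.get(XML_BASE, ""))
        if element.tag == ATOM_CONTENT:
            bases.append(base)
        for child in element:
            walk(child, base)

    walk(root, retrieval_uri)
    return bases


# Hypothetical feed: one xml:base on the feed, one overriding it on a content
# element, and one relative xml:base on an entry.
SAMPLE_FEED = """\
<feed xmlns="http://www.w3.org/2005/Atom" xml:base="http://example.org/feed/">
  <entry><content type="html">&lt;img src="a.png"&gt;</content></entry>
  <entry><content type="html" xml:base="http://example.com/blog/2006/03/">&lt;img src="a.png"&gt;</content></entry>
  <entry xml:base="posts/"><content type="html">&lt;img src="a.png"&gt;</content></entry>
</feed>
"""

if __name__ == "__main__":
    for base in effective_bases(SAMPLE_FEED, "http://example.net/atom.xml"):
        print(base)
```

Note that nothing in the walk looks at the type attribute: the in-scope base is computed the same way whether the content is escaped HTML or inline XML.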

Anyway, theoretical arguments aside, there are two questions to answer for the real world:

1) If you're publishing Atom, in which content @types can you use relative URIs with reasonable confidence that consumers will apply the base URI correctly?

2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it?

I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link. Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else.

If xml:base isn't explicitly defined, there may be some justification for using the self link rather than the URI from which the feed was retrieved. Relying on that is sloppy on the publisher's part, but it might be more likely to succeed in practice.

The alternate link is only a possible choice if there is at least one alternate link, and either there is only one, or there is more than one and all of them point to documents in the same directory. I'd say it's a fairly weak choice.

Conclusion: you've got to resolve relative URIs with respect to SOMETHING, and clearly the best choice is xml:base if it's explicitly defined. If not, the self link and the URI from which the feed is retrieved each have some merit.
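As a rough sketch of that order of preference (explicit xml:base first, then either the self link or the retrieval URI), something like the following Python would do. The names choose_base, resolve, and prefer_self_link are hypothetical, the URIs are invented, and the choice between self link and retrieval URI is left as a flag because neither is clearly the better fallback.

```python
from urllib.parse import urljoin


def choose_base(explicit_xml_base, self_link, retrieval_uri, prefer_self_link=False):
    """Pick the base URI for resolving relative URIs in an entry's content.

    explicit_xml_base: the xml:base value in scope, or None if none is declared.
    self_link:         href of the feed's rel="self" link, or None.
    retrieval_uri:     the URI the feed was actually fetched from.
    prefer_self_link:  assumption-level policy switch for the no-xml:base case.
    """
    if explicit_xml_base:
        # An explicit xml:base wins; it may be relative to the retrieval URI.
        return urljoin(retrieval_uri, explicit_xml_base)
    if prefer_self_link and self_link:
        return urljoin(retrieval_uri, self_link)
    return retrieval_uri


def resolve(relative_uri, explicit_xml_base=None, self_link=None,
            retrieval_uri="", prefer_self_link=False):
    """Resolve relative_uri against whichever base choose_base selects."""
    base = choose_base(explicit_xml_base, self_link, retrieval_uri, prefer_self_link)
    return urljoin(base, relative_uri)


if __name__ == "__main__":
    fetched_from = "http://mirror.example.net/cache/atom.xml"
    self_href = "http://example.org/feeds/atom.xml"

    # Explicit xml:base is used regardless of the other candidates.
    print(resolve("images/a.png", "http://example.com/blog/", self_href, fetched_from))
    # No xml:base: fall back to the retrieval URI by default...
    print(resolve("images/a.png", None, self_href, fetched_from))
    # ...or to the self link if the consumer opts in.
    print(resolve("images/a.png", None, self_href, fetched_from, prefer_self_link=True))
```

The alternate link is left out on purpose, since it is only usable when every alternate link points into the same directory.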

If that's the correct answer for #2, then in a reasonably perfect world, the answer to #1 should be that relative URIs should be safe anywhere as long as you're explicitly (and correctly!) defining xml:base. In the real world, I'd guess that more consuming applications will get it right in inline XML than in escaped HTML.
