On Mar 30, 2006, at 8:34 PM, M. David Peterson wrote:
...the content element can be basically anything as long as it's either
- non-escaped plain text with @type set to text,
- escaped text, with @type set to a valid 'text' MIME type,
- entity-escaped HTML with @type set to html,
- xhtml wrapped in a properly xhtml namespaced div with @type set
to xhtml,
- base64 encoded with @type set to the proper media type, or
- XML with @type set to a proper XML MIME type.
Of these cases, the only one to which the @xml:base value in the
current context should have even a remote chance of applying is
inline xml.
...
The escaped HTML content contained within the content element that
David was originally concerned with is more than likely a copy of
all or part of the elements and content contained inside the body
tag of the external document referenced by an associated link
element, and therefore there is no guarantee that the xml:base of
the atom feed is going to be anywhere even close to accurate.
On what basis are you concluding that Atom publishers are more likely
to be smart enough to set xml:base correctly when publishing inline
XML than when publishing escaped HTML? What if the source material
is tag soup HTML? You could clean it up and turn it into XHTML or
publish it as is as escaped HTML. Either option is valid, and may be
preferable in some situations. I don't see how any assumptions can
be made about the publisher's ability to set xml:base correctly based
on the content type.
If you're assuming that xml:base is going to be set only at the top
of the Atom document, then it may very well fail to be correct for a
lot of the content. But xml:base may also be set on the entry or
content element, and could easily be set correctly based on the
publisher's knowledge of the appropriate base URI for the content.
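To illustrate how xml:base scoping plays out, here is a minimal
Python sketch that computes the in-scope base URI for every element
by walking down from the feed root. The feed, the example.org URIs,
and the helper name effective_bases are made up for illustration; it
assumes the standard library's ElementTree and urljoin are adequate
stand-ins for a real feed parser.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"
ATOM = "{http://www.w3.org/2005/Atom}"
XHTML = "{http://www.w3.org/1999/xhtml}"


def effective_bases(root, retrieval_uri):
    """Map each element to its in-scope base URI (hypothetical helper)."""
    bases = {}

    def walk(elem, inherited):
        # An xml:base value may itself be relative, so it is resolved
        # against the base inherited from the parent element.
        base = urljoin(inherited, elem.get(XML_BASE, ""))
        bases[elem] = base
        for child in elem:
            walk(child, base)

    walk(root, retrieval_uri)
    return bases


# Hypothetical feed: xml:base is set on feed, entry, and content.
FEED = """<feed xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://example.org/feed/">
  <entry xml:base="http://example.org/blog/2006/03/">
    <content type="xhtml" xml:base="images/">
      <div xmlns="http://www.w3.org/1999/xhtml"><img src="cat.png"/></div>
    </content>
  </entry>
</feed>"""

root = ET.fromstring(FEED)
bases = effective_bases(root, "http://example.org/atom.xml")
img = root.find(f"{ATOM}entry/{ATOM}content").find(f".//{XHTML}img")
resolved = urljoin(bases[img], img.get("src"))
print(resolved)
```

The point of the sketch is that the base in effect for the content is
the one closest to it, not the one at the top of the document.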
Anyway, theoretical arguments aside, there are two questions to answer
for the real world:
1) If you're publishing Atom, in which content @types can you use
relative URIs with reasonable confidence that consumers will apply
the base URI correctly?
2) If you're consuming Atom and you encounter a relative URI, how
should you choose the appropriate base URI with which to resolve it?
I think there are only three remotely possible answers to #2:
xml:base (including the URI from which the feed was retrieved if
xml:base isn't explicitly defined), the URI of the self link, and the
URI of the alternate link. Given that Atom explicitly supports
xml:base, if it's explicitly defined, it's difficult to justify
ignoring it in favor of anything else.
If xml:base isn't explicitly defined, there may be some justification
for using the self link rather than the URI from which the feed was
retrieved. It's sloppy on the publisher's part, but might be more
likely to succeed in practice.
The alternate link is only a possible choice if there is at least one
alternate link, and if either there is only one or, when there is
more than one, all of them point to documents in the same directory.
I'd say it's a fairly weak choice.
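The "same directory" condition can be checked mechanically. This is
only a sketch: alternate_base is a hypothetical helper, it assumes
the alternate hrefs have already been made absolute, and the
example.org URIs are invented.

```python
from urllib.parse import urljoin


def alternate_base(alternate_hrefs):
    """Return the directory shared by all alternate links, or None.

    Assumes every href is already an absolute URI.
    """
    # Resolving "." against a URI yields the URI of its directory.
    directories = {urljoin(href, ".") for href in alternate_hrefs}
    return directories.pop() if len(directories) == 1 else None


same_dir = alternate_base([
    "http://example.org/blog/post.html",
    "http://example.org/blog/post.fr.html",
])
mixed = alternate_base([
    "http://example.org/blog/post.html",
    "http://example.org/fr/post.html",
])
print(same_dir, mixed)
```

With no alternate links, or with links in different directories, the
function returns None and the alternate link is not a candidate.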
Conclusion: you've got to resolve relative URIs with respect to
SOMETHING, and clearly the best choice is xml:base if it's explicitly
defined. If not, the self link and the URI from which the feed is
retrieved each have some merit.
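As a rough sketch of that conclusion in Python: explicit xml:base
wins, and the choice between the self link and the retrieval URI is
left as a switch, since neither is clearly better. The function
name, the prefer_self parameter, and the example URIs are
assumptions, and explicit_base is assumed to be the in-scope xml:base
value (None when the publisher did not set one).

```python
from urllib.parse import urljoin


def choose_base_uri(explicit_base, self_href, retrieval_uri,
                    prefer_self=True):
    """Pick a base URI for resolving relative references (a sketch)."""
    if explicit_base is not None:
        # An explicit xml:base always wins; it may itself be relative.
        return urljoin(retrieval_uri, explicit_base)
    if prefer_self and self_href:
        # Fallback 1: the feed's rel="self" link, if there is one.
        return urljoin(retrieval_uri, self_href)
    # Fallback 2: wherever the feed was actually fetched from.
    return retrieval_uri


RETRIEVED = "http://mirror.example.net/cache/atom.xml"
SELF = "http://example.org/blog/atom.xml"

with_base = choose_base_uri("http://example.org/blog/2006/", SELF, RETRIEVED)
with_self = choose_base_uri(None, SELF, RETRIEVED)
with_neither = choose_base_uri(None, None, RETRIEVED)

print(urljoin(with_base, "images/cat.png"))
print(urljoin(with_self, "images/cat.png"))
print(urljoin(with_neither, "images/cat.png"))
```

The mirror case shows why the self link might succeed more often in
practice: the feed may be fetched from somewhere other than where
the publisher thinks it lives.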
If that's the correct answer for #2, then in a reasonably perfect
world, the answer to #1 should be that relative URIs should be safe
anywhere as long as you're explicitly (and correctly!) defining
xml:base. In the real world, I'd guess that more consuming
applications will get it right in inline XML than in escaped HTML.
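Part of the reason escaped HTML is harder: the consumer has to
unescape the content, run it through an HTML parser, and carry the
base URI across by hand. A minimal sketch using Python's standard
html.parser follows; the class name, the attribute list, and the
sample markup are assumptions, and a real consumer would need to
cover more attributes than href and src.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkResolver(HTMLParser):
    """Collect href/src values resolved against a given base URI."""

    URI_ATTRS = {"href", "src"}  # illustrative, not exhaustive

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.resolved = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if name in self.URI_ATTRS and value:
                self.resolved.append(urljoin(self.base_uri, value))


# Content of <content type="html"> after the XML parser has already
# undone the entity escaping; note the unclosed tag-soup <p>.
html_content = '<p>See <a href="post.html">the post</a><br><img src="/img/cat.png">'

resolver = LinkResolver("http://example.org/blog/2006/")
resolver.feed(html_content)
resolver.close()
print(resolver.resolved)
```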