Antone,

Very good write-up. The fact that xml:base on div is not valid XHTML is somewhat irrelevant, given that there is an identical problem with xml:lang. For instance, if I have <content xml:lang="en"><div xml:lang="fr">...</div></content> and I silently drop the div, then I've got a problem: the xml:lang="fr" goes with it. Granted, the producer of the Atom feed really shouldn't have done this, but we still need to be able to handle it properly if it does happen.

The solution I think I'm going to go with is to support both approaches. Our default behavior will be to return the div. A separate API will provide the content without the div. When in doubt, do both.
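A rough sketch of the "do both" idea, written in Python rather than Abdera's Java. The names XhtmlContent, value, value_without_div, and StrippedContent are hypothetical and are not Abdera's API. The point is only that the div-less accessor has to hand back the language that was in scope on the div, so nothing is lost when the div is dropped.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
XHTML_DIV = "{http://www.w3.org/1999/xhtml}div"


class StrippedContent(NamedTuple):
    """Content of the div, plus the xml:lang that applied to it (hypothetical type)."""
    lang: Optional[str]
    text: Optional[str]   # text before the first child element, if any
    children: list        # child elements of the div


class XhtmlContent:
    """Hypothetical wrapper around an atom:content element with type="xhtml"."""

    def __init__(self, content_elem):
        self._content = content_elem
        self._div = content_elem.find(XHTML_DIV)

    def value(self):
        """Default accessor: return the wrapping div, attributes intact."""
        return self._div

    def value_without_div(self):
        """Separate accessor: drop the div but keep the language it carried."""
        lang = self._div.get(XML_LANG)
        if lang is None:
            # Fall back to atom:content only; a real library would keep
            # walking up through atom:entry and atom:feed.
            lang = self._content.get(XML_LANG)
        return StrippedContent(lang, self._div.text, list(self._div))


doc = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" type="xhtml" xml:lang="en">'
    '<div xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr">Bonjour <b>monde</b></div>'
    '</content>'
)
content = XhtmlContent(doc)
stripped = content.value_without_div()
print(content.value().get(XML_LANG))  # fr (still on the div)
print(stripped.lang)                  # fr (carried along without the div)
```

A caller that wants the raw div uses value(); a caller that wants div-less content uses value_without_div() and reads the language from the result instead of from atom:content.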
- James

Antone Roundy wrote:
>
> On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote:
>> * James M Snell <[EMAIL PROTECTED]> [2006-06-28 20:00]:
>>> A. Pagaltzis wrote:
>>>> * James M Snell <[EMAIL PROTECTED]> [2006-06-28 14:35]:
>>>>> Hiding the div completely from users of Abdera would mean
>>>>> potentially losing important data (e.g. the div may contain
>>>>> an xml:lang or xml:base) or forcing me to perform additional
>>>>> processing (pushing the in-scope xml:lang/xml:base down to
>>>>> child elements of the div.
>>>>
>>>> How is that any different from having to find ways to pass
>>>> any in-scope xml:lang/xml:base down to API clients when the
>>>> content is type="html" or type="text"? I hope you didn’t punt
>>>> on those?
>>>
>>> Our Content interface has methods for getting to that
>>> information.
>>
>> Then stripping the `div` is not an issue, is it?
>
> Consider this:
>
> <entry xml:lang="en" xml:base="http://example.com/foo/">
>     ...
>     <content type="xhtml">
>         <xhtml:div xml:lang="fr"
> xml:base="http://example.com/feu/"><xhtml:a
> href="axe.html">axe</xhtml:a></xhtml:div>
>     </content>
> </entry>
>
> Whether there's a problem depends on whether one requests the xml:base,
> xml:lang, or whatever for the atom:content element itself or for the
> CONTENT OF the atom:content element, in which case the library could
> return the values it got from the xhtml:div. Except in unusual cases
> like this, the result would be identical.
>
> Certainly a distinction could be made between how an XML library would
> handle this vs. how an Atom library would handle it. An Atom processing
> library might be expected to be able to do things like:
>
> * give me the raw contents of the atom:content element
> * give me the contents of the atom:content element converted to
> well-formed XHTML (whether it started as text, escaped tag soup, or
> inline xhtml)
>
> In the former case, keeping the div feels like the right thing to
> do--the consuming app would have to know to remove it. In the latter
> case, removing the div from xhtml content feels like the right thing to
> do. But unless the library gives me the xml:base, for example, which
> applies to the content of the atom:content element (as converted to
> well-formed xhtml or whatever), as opposed to the xml:base which applied
> to the atom:content element itself, there's potential for trouble.
>
> ...now that I think about it, if the library always returns the xml:base
> which applies to the content of the element, that could cause trouble too:
>
> <entry xml:lang="en" xml:base="http://example.com/">
>     ...
>     <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a
> href="axe.html">axe</xhtml:a></xhtml:div>
>     </content>
> </entry>
>
> Here, if I get xml:base for the content of content, it will be
> "http://example.com/feu/". Then, if I get the raw content of the
> element, strip the div, and apply xml:base myself, I'll erroneously use
> "http://example.com/feu/feu/" as the base URI unless I know to ignore
> the xml:base attribute on the div.
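
The double-application problem in the second quoted example can be reproduced with nothing but relative-URI resolution. This is only an illustration using Python's urllib.parse.urljoin; the helper name resolve_chain and the variable names are made up for the sketch, and the values are the ones from the example above.

```python
from functools import reduce
from urllib.parse import urljoin


def resolve_chain(*refs):
    """Resolve each reference against the result of the previous ones, left to right."""
    return reduce(urljoin, refs)


entry_base = "http://example.com/"   # xml:base on atom:entry
div_base = "feu/"                    # relative xml:base on xhtml:div
href = "axe.html"                    # link inside the div

# Base URI that applies to the content of atom:content (what the library reports).
content_base = resolve_chain(entry_base, div_base)
print(content_base)      # http://example.com/feu/

# Correct: the reported base already includes the div's xml:base.
right = resolve_chain(content_base, href)
print(right)             # http://example.com/feu/axe.html

# Erroneous: the client takes the raw div and applies its xml:base a second time.
wrong_base = resolve_chain(content_base, div_base)
wrong = resolve_chain(wrong_base, href)
print(wrong_base)        # http://example.com/feu/feu/
print(wrong)             # http://example.com/feu/feu/axe.html
```

With the absolute xml:base from the first example ("http://example.com/feu/"), applying it twice is harmless because an absolute reference replaces the base outright; only a relative xml:base on the div exposes the problem.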