Antone,

Very good write-up.  The fact that xml:base on div is not valid XHTML is
somewhat irrelevant, given that xml:lang has the identical problem.  For
instance, if I have <content xml:lang="en"><div
xml:lang="fr">...</div></content> and I silently drop the div, French
content ends up labeled as English.  Granted, the producer of the Atom
feed really shouldn't have done this, but we still need to handle it
properly if it does happen.  The solution I think I'm going to go with
is to support
both approaches.  Our default behavior will be to return the div.  A
separate API will provide the content without the div.  When in doubt,
do both.
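
As a rough illustration of the two access paths (this is not Abdera's
actual API; it is a Python sketch using only the standard library, and
the function names content_div and content_children are made up for the
example), the div-less path has to hand back the xml:lang that was in
scope on the div, or that information is lost:

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = (
    '<content xmlns="http://www.w3.org/2005/Atom" type="xhtml" xml:lang="en">'
    '<div xmlns="http://www.w3.org/1999/xhtml" xml:lang="fr"><p>Bonjour</p></div>'
    '</content>'
)

def content_div(content):
    """Default path (hypothetical name): return the wrapper div, attributes intact."""
    return content.find(f"{{{XHTML_NS}}}div")

def content_children(content):
    """Div-less path (hypothetical name): return the div's children together
    with the language in scope for them, so dropping the div loses nothing."""
    div = content_div(content)
    # Prefer the div's own xml:lang; fall back to the one on atom:content.
    lang = div.get(XML_LANG, content.get(XML_LANG))
    return list(div), lang

content = ET.fromstring(SAMPLE)
div = content_div(content)
children, lang = content_children(content)
```

Here content.get(XML_LANG) is "en", while both div.get(XML_LANG) and the
lang returned by content_children are "fr".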

- James

Antone Roundy wrote:
> 
> On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote:
>> * James M Snell <[EMAIL PROTECTED]> [2006-06-28 20:00]:
>>> A. Pagaltzis wrote:
>>>> * James M Snell <[EMAIL PROTECTED]> [2006-06-28 14:35]:
>>>>> Hiding the div completely from users of Abdera would mean
>>>>> potentially losing important data (e.g. the div may contain
>>>>> an xml:lang or xml:base) or forcing me to perform additional
>>>>> processing (pushing the in-scope xml:lang/xml:base down to
>>>>> child elements of the div).
>>>>
>>>> How is that any different from having to find ways to pass
>>>> any in-scope xml:lang/xml:base down to API clients when the
>>>> content is type="html" or type="text"? I hope you didn’t punt
>>>> on those?
>>>
>>> Our Content interface has methods for getting to that
>>> information.
>>
>> Then stripping the `div` is not an issue, is it?
> 
> Consider this:
> 
> <entry xml:lang="en" xml:base="http://example.com/foo/">
>     ...
>     <content type="xhtml">
>         <xhtml:div xml:lang="fr"
> xml:base="http://example.com/feu/"><xhtml:a
> href="axe.html">axe</xhtml:a></xhtml:div>
>     </content>
> </entry>
> 
> Whether there's a problem depends on whether one requests the xml:base,
> xml:lang, or whatever for the atom:content element itself or for the
> CONTENT OF the atom:content element, in which case the library could
> return the values it got from the xhtml:div.  Except in unusual cases
> like this, the result would be identical.
> 
> Certainly a distinction could be made between how an XML library would
> handle this vs. how an Atom library would handle it.  An Atom processing
> library might be expected to be able to do things like:
> 
> * give me the raw contents of the atom:content element
> * give me the contents of the atom:content element converted to
> well-formed XHTML (whether it started as text, escaped tag soup, or
> inline xhtml)
> 
> In the former case, keeping the div feels like the right thing to
> do--the consuming app would have to know to remove it.  In the latter
> case, removing the div from xhtml content feels like the right thing to
> do.  But unless the library gives me the xml:base, for example, which
> applies to the content of the atom:content element (as converted to
> well-formed xhtml or whatever), as opposed to the xml:base which applied
> to the atom:content element itself, there's potential for trouble.
> 
> ...now that I think about it, if the library always returns the xml:base
> which applies to the content of the element, that could cause trouble too:
> 
> <entry xml:lang="en" xml:base="http://example.com/">
>     ...
>     <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a
> href="axe.html">axe</xhtml:a></xhtml:div>
>     </content>
> </entry>
> 
> Here, if I get xml:base for the content of content, it will be
> "http://example.com/feu/".  Then, if I get the raw content of the
> element, strip the div, and apply xml:base myself, I'll erroneously use
> "http://example.com/feu/feu/" as the base URI unless I know to ignore
> the xml:base attribute on the div.
> 
> 

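Both of Antone's examples can be checked with urljoin from the Python
standard library, which performs the relative-reference resolution that
xml:base relies on.  This is only a sketch of the arithmetic: the URIs
come from the examples above, and the variable names are illustrative.

```python
from urllib.parse import urljoin

# First example: the div carries an absolute xml:base.
entry_base = "http://example.com/foo/"
div_base = urljoin(entry_base, "http://example.com/feu/")
link_with_div = urljoin(div_base, "axe.html")        # http://example.com/feu/axe.html
link_div_dropped = urljoin(entry_base, "axe.html")   # http://example.com/foo/axe.html

# Second example: the div carries a relative xml:base.
entry_base2 = "http://example.com/"
resolved_base = urljoin(entry_base2, "feu/")         # http://example.com/feu/

# Pitfall: the library already reported the resolved base, and the caller
# applies the div's relative xml:base a second time.
double_applied = urljoin(resolved_base, "feu/")      # http://example.com/feu/feu/
```

In the first case, dropping the div without carrying its xml:base along
changes where the link points.  In the second, applying the div's
xml:base on top of an already-resolved base yields the wrong URI.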