Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
James M Snell wrote:
> Antone, very good write-up. The fact that xml:base on div is not valid XHTML is somewhat irrelevant given that there is an identical problem with xml:lang. For instance, if I have
>
>     <content xml:lang="en"><div xml:lang="fr">...</div></content>
>
> and I drop the div silently, then I've got a problem. Granted, the producer of the Atom feed really shouldn't have done this, but we still need to be able to handle it properly if it does happen.

I don't agree that bug compliance is the way to go. If downstream code has to patch against broken providers, that's a race to the bottom: it's a culture where specs cease to matter because they can be mercilessly E and E'd. File a bug report instead.

Otoh, if we have spec'd in a feature here which doesn't sit on top of XML infrastructure properly, that's another matter. "Hey, xml lib, handle this element special like, cos atom markup don't care about clean layering" sounds like a problem. We seem to keep doing that with xml:* features (lang, include, base). Atom is to XML as HTML is to SGML?

cheers
Bill
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
> On 6/27/06, James M Snell wrote:
> > Please define conformance in regards to this test. That is, what is the exact behavior that a library must perform when a code library has an API like getContent on the content element? E.g., is a parser not conformant if it passes the DIV on to the consuming application with the expectation that the application is responsible for doing the right thing with it?
>
> Don't be dense. Would the parser be conformant if it passed on the feed, entry, and div elements with that expectation? I'll file a bug on UFP and I bet you it'll get fixed without a question, because there won't be a bad-faith interpretation to fight. That's two demerits this week for you. Tsk tsk.

Could this teasing please stop? It adds noise to the debate and is getting *really* annoying for all of us. If you two have issues, sort them out in private, as this is not the right place.

Thanks,
- Sylvain
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Robert Sayre wrote:
> I'll file a bug on UFP and I bet you it'll get fixed without a question

http://sourceforge.net/tracker/index.php?func=detail&aid=1474256&group_id=112328&atid=661937

- Sam Ruby
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 02:27, James M Snell wrote:
> That is, what is the exact behavior that a library must perform when a code library has an API like getContent on the content element?

One sane behaviour is to return an org.w3c.dom.DocumentFragment with the deep copies of the children of the namespace div, with the xml:base and xml:lang context pushed down on each child.

That's a bit awkward, so I guess using a placeholder root element with the xml:base and xml:lang context would make sense, provided that the API doc says that the root is not part of the logical content. This could be emphasized by using a root in a private namespace instead of an XHTML div. (Just to be obnoxious enough to make sure users of the API take note. :-)

Or, alternatively, the API could construct a full XHTML nu.xom.Document or org.w3c.dom.Document and thereby unify the return value for type=application/xhtml+xml, type=text/html, type=xhtml and type=html. (Assuming, that is, that the library runs TagSoup and automatically converts HTML to XHTML.) Actually, I think this would be the best way.

In any case, returning a String as the value of the content means that the library is not fully doing its job when the logical value is an XML document fragment.

-- Henri Sivonen
http://hsivonen.iki.fi/
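The placeholder-root idea above can be sketched in a few lines. This is only an illustration in Python with the standard library, not Abdera's or any other library's actual API: the function name `wrap_xhtml_content`, the `fragment` element, and the `urn:example:atom-library` namespace are all made up for the example, and the inherited language and base are assumed to be passed in by the caller.

```python
import copy
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XML_NS = "http://www.w3.org/XML/1998/namespace"
XHTML_NS = "http://www.w3.org/1999/xhtml"
ATOM_NS = "http://www.w3.org/2005/Atom"
# Hypothetical private namespace for the placeholder root (not from the thread).
PRIVATE_NS = "urn:example:atom-library"


def wrap_xhtml_content(content, inherited_lang="", inherited_base=""):
    """Return a placeholder root holding deep copies of the div's children.

    The root carries the xml:lang and xml:base context that applies to the
    content, so dropping the div does not drop that context.
    """
    div = content.find(f"{{{XHTML_NS}}}div")
    if div is None:
        raise ValueError("type='xhtml' content must contain an XHTML div")

    # Walk outermost to innermost so inner declarations override outer ones.
    lang, base = inherited_lang, inherited_base
    for element in (content, div):
        lang = element.get(f"{{{XML_NS}}}lang", lang)
        base_attr = element.get(f"{{{XML_NS}}}base")
        if base_attr is not None:
            base = urljoin(base, base_attr)

    root = ET.Element(f"{{{PRIVATE_NS}}}fragment")
    root.set(f"{{{XML_NS}}}lang", lang)
    root.set(f"{{{XML_NS}}}base", base)
    root.text = div.text  # text that precedes the first child element
    for child in div:
        root.append(copy.deepcopy(child))
    return root


ATOM_CONTENT = f"""
<content xmlns="{ATOM_NS}" xmlns:xhtml="{XHTML_NS}" type="xhtml" xml:lang="en">
  <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
</content>
""".strip()

fragment = wrap_xhtml_content(
    ET.fromstring(ATOM_CONTENT), inherited_base="http://example.com/foo/"
)
```

Using a root in a private namespace keeps the returned tree a single well-formed element while making it obvious that the root is not part of the logical content.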
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Hey Sylvain,

On this one, I'm being very serious. I need to know what conformance means. Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base) or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div). It also has ease-of-use ramifications on the API. So I really do need a solid answer on this one.

- James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
> Hey Sylvain,
>
> On this one, I'm being very serious. I need to know what conformance means. Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base) or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div). It also has ease-of-use ramifications on the API. So I really do need a solid answer on this one.
>
> - James

Hi James,

I can totally be wrong, but as the div tag is added by the Atom processor when creating a content of type XHTML, I believe it should be stripped out when extracting the content, to avoid altering the original meaning of the message.

That's my understanding anyway.

- Sylvain
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
* James M Snell [2006-06-28 14:35]:
> Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base) or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div).

How is that any different from having to find ways to pass any in-scope xml:lang/xml:base down to API clients when the content is type=html or type=text? I hope you didn’t punt on those?

Regards,
-- Aristotle Pagaltzis // http://plasmasturm.org/
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Our Content interface has methods for getting to that information.

- James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Wednesday, June 28, 2006, 1:22:00 PM, James Snell wrote:
> Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base)

I don't think that the div should contain an xml:base, because it isn't valid to use xml:base in XHTML 1.x. As the xhtml:div is added by the producer, it should be removed by the consumer, so there shouldn't be an xml:lang in there either. I wouldn't expect consumers to handle either consistently, so if you are a producer, don't do it. I think in my implementation I handle lang and base on the div, and store them out-of-band, but it is more by accident than anything.

I would hope that any other xmlns:* declarations on xhtml:div are honoured. Namespaces are so core to XML that making any recommendations about their placement is asking for trouble.

> or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div).

I avoid that; it isn't nice, as the xml:base will make the XHTML invalid and browser-dependent. In my RDF implementation, I store the lang context, base context, content model, and other stuff out-of-band from the content itself. I do rely on RDF's exclusive canonicalization rules, though, to preserve the in-scope namespace decls. (I assume that namespace decls aren't strictly allowed in valid XHTML either? Oh well...)

> It also has ease-of-use ramifications on the API. So I really do need a solid answer on this one.

You need to preserve a load of context in addition to the content string itself, so expect to have to return these extra properties for each use of Text Constructs in your API. It is a bit of a high barrier to entry, really.

(If Atom had been designed in JSON, instead of XML, I wonder if it would have been more sympathetic to the OO/RDBMS crowd, and whether we would have bothered with such fine-grained language tagging?)

-- Dave
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
* James M Snell [2006-06-28 20:00]:
> A. Pagaltzis wrote:
> > * James M Snell [2006-06-28 14:35]:
> > > Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base) or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div).
> >
> > How is that any different from having to find ways to pass any in-scope xml:lang/xml:base down to API clients when the content is type=html or type=text? I hope you didn’t punt on those?
>
> Our Content interface has methods for getting to that information.

Then stripping the `div` is not an issue, is it?

Regards,
-- Aristotle Pagaltzis // http://plasmasturm.org/
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote:
> Then stripping the `div` is not an issue, is it?

Consider this:

    <entry xml:lang="en" xml:base="http://example.com/foo/">
      ...
      <content type="xhtml">
        <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
      </content>
    </entry>

Whether there's a problem depends on whether one requests the xml:base, xml:lang, or whatever for the atom:content element itself or for the CONTENT OF the atom:content element, in which case the library could return the values it got from the xhtml:div. Except in unusual cases like this, the result would be identical.

Certainly a distinction could be made between how an XML library would handle this vs. how an Atom library would handle it. An Atom processing library might be expected to be able to do things like:

* give me the raw contents of the atom:content element
* give me the contents of the atom:content element converted to well-formed XHTML (whether it started as text, escaped tag soup, or inline xhtml)

In the former case, keeping the div feels like the right thing to do -- the consuming app would have to know to remove it. In the latter case, removing the div from xhtml content feels like the right thing to do.

But unless the library gives me the xml:base, for example, which applies to the content of the atom:content element (as converted to well-formed xhtml or whatever), as opposed to the xml:base which applied to the atom:content element itself, there's potential for trouble.

...now that I think about it, if the library always returns the xml:base which applies to the content of the element, that could cause trouble too:

    <entry xml:lang="en" xml:base="http://example.com/">
      ...
      <content type="xhtml">
        <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
      </content>
    </entry>

Here, if I get xml:base for the content of content, it will be "http://example.com/feu/". Then, if I get the raw content of the element, strip the div, and apply xml:base myself, I'll erroneously use "http://example.com/feu/feu/" as the base URI unless I know to ignore the xml:base attribute on the div.
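The double-application problem described above can be reproduced with plain URI resolution. The snippet below is a small sketch using Python's standard `urllib.parse.urljoin`; the variable names are illustrative and no particular Atom library is implied.

```python
from urllib.parse import urljoin

entry_base = "http://example.com/"  # xml:base on the entry
div_base_attr = "feu/"              # relative xml:base on the xhtml:div

# What a library reports as the base URI of the *content of* atom:content:
content_base = urljoin(entry_base, div_base_attr)

# A client that takes the raw content, sees xml:base="feu/" on the div, and
# resolves it against the already-resolved base applies it a second time:
doubled_base = urljoin(content_base, div_base_attr)

# The relative link inside the div then resolves to two different places.
right_link = urljoin(content_base, "axe.html")
wrong_link = urljoin(doubled_base, "axe.html")

print(content_base)  # http://example.com/feu/
print(doubled_base)  # http://example.com/feu/feu/
print(right_link)    # http://example.com/feu/axe.html
print(wrong_link)    # http://example.com/feu/feu/axe.html
```

With an absolute xml:base on the div, as in the first example, the second resolution is harmless, which is why the mistake only shows up with relative values.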
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Antone,

Very good write-up. The fact that xml:base on div is not valid XHTML is somewhat irrelevant given that there is an identical problem with xml:lang. For instance, if I have

    <content xml:lang="en"><div xml:lang="fr">...</div></content>

and I drop the div silently, then I've got a problem. Granted, the producer of the Atom feed really shouldn't have done this, but we still need to be able to handle it properly if it does happen.

The solution I think I'm going to go with is to support both approaches. Our default behavior will be to return the div. A separate API will provide the content without the div.

When in doubt, do both.

- James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
David, you're right, ideally the xhtml container div would be nothing but the div, but if it's not, we still need to be prepared to handle it. Silent data loss sucks, if it's silly data :-)

- James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On 6/28/06, James M Snell wrote:
> Our default behavior will be to return the div. A separate API will provide the content without the div.

So, standards-off-by-default then? Unbelievable.

-- Robert Sayre
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Antone Roundy wrote:
> Consider this:
>
>     <entry xml:lang="en" xml:base="http://example.com/foo/">
>       ...
>       <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
>       </content>
>     </entry>

Another observation for those of you curious about interoperability... the last time I tested xml:base conformance (which admittedly was a while back) I couldn't find a single aggregator that supported xml:base on the div element.

Regards
James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Irrelevant. The content in the entries below should be handled the same way:

    <entry xml:lang="en" xml:base="http://example.com/foo/">
      ...
      <content type="xhtml">
        <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
      </content>
    </entry>

    <entry xml:lang="en" xml:base="http://example.com/foo/">
      ...
      <content type="xhtml" xml:lang="fr" xml:base="http://example.com/feu/">
        <xhtml:div><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
      </content>
    </entry>

-- Robert Sayre
I would have written a shorter letter, but I did not have the time.
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Actually, switch this. I realized after I sent this that I had it backwards. The default behavior will be to not return the div. A separate API will provide the content with the div.

- James

James M Snell wrote:
> [snip]...The solution I think I'm going to go with is to support both approaches. Our default behavior will be to return the div. A separate API will provide the content without the div.
>
> When in doubt, do both.
>
> - James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Wednesday, June 28, 2006, 9:55:29 PM, James Snell wrote:
> David, you're right, ideally the xhtml container div would be nothing but the div, but if it's not, we still need to be prepared to handle it. Silent data loss sucks, if it's silly data :-)

I'm just looking at it from the perspective of the producer and the consumer.

In my consumer implementation, I take the resolved base URI of the div (including any xml:base there), and the language context of the div, discard the div, and store them both out-of-band of the content, with namespace prefixes inline. That's probably good enough. Some post-processing is used to convert the data in the store into a form that allows it to be safely embedded in an HTML page - I've been trying XSLT (with TagSoup for HTML content).

I don't think that the div should have lang or base attached, but if it is there, it is better to use it than ignore it, because it is likely there for a reason. I wouldn't produce feeds like that though.

If people start using CSS links in feeds (or even just CSS styling in aggregators), discarding the div could be important.

If you're going to supply an API for extracting usable [X]HTML, there are a number of features that consumers might want in some combination:

* Forcing the XHTML to use a blank namespace prefix to make it DTD compatible, and removing unused prefixes.
* Resolving relative references (which will inevitably be a lossy process)
* Removing XSS risks (intentionally lossy)

I still keep the original content in a reasonably accurate form though.

-- Dave
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 3:10 PM, Robert Sayre wrote:
> The content in the entries below should be handled the same way:
>
>     <entry xml:lang="en" xml:base="http://example.com/foo/">
>       ...
>       <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
>       </content>
>     </entry>
>
>     <entry xml:lang="en" xml:base="http://example.com/foo/">
>       ...
>       <content type="xhtml" xml:lang="fr" xml:base="http://example.com/feu/">
>         <xhtml:div><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
>       </content>
>     </entry>

Of course the end result of both should be identical. Is that what you mean by "should be handled the same way"? The question is, if the xhtml:div is stripped by the library before handing it off to the app, how is the app going to get the attributes that were on the div? Is the library going to push those values down into the content or act as if they were on the atom:content element (or something similar to that)?

BTW, it just occurred to me that pushing them down into the content won't work. Here's an example where that would fail:

    <entry xml:lang="en">
      ...
      <content type="xhtml">
        <xhtml:div xml:lang="fr">Oui!</xhtml:div>
      </content>
    </entry>

Notice that there are no elements inside the xhtml:div for xml:lang to be attached to (and even if there were any, any text appearing outside of them would not have the correct xml:lang attached to it).

So it looks like the options (both of which a single library could support, of course) are:

* Strip the div, but provide a way to get the attributes that were on it

or

* Leave the div
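As a rough sketch of the first option (strip the div, but keep its attributes reachable), the following Python returns the div's content together with the xml:* attributes that were on it. `StrippedContent` and `strip_div` are names invented for this example; they are not taken from Abdera or any other library mentioned in the thread.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple

XML_NS = "http://www.w3.org/XML/1998/namespace"
XHTML_NS = "http://www.w3.org/1999/xhtml"


class StrippedContent(NamedTuple):
    """Hypothetical return type: div content plus the div's xml:* attributes."""
    text: str              # text directly inside the div, before any child element
    children: list         # child elements of the div, in document order
    div_attributes: dict   # e.g. {"lang": "fr"}, kept out-of-band


def strip_div(content):
    """Drop the xhtml:div wrapper without losing the attributes it carried."""
    div = content.find(f"{{{XHTML_NS}}}div")
    if div is None:
        raise ValueError("type='xhtml' content must contain an XHTML div")
    prefix = f"{{{XML_NS}}}"
    xml_attrs = {
        name[len(prefix):]: value
        for name, value in div.attrib.items()
        if name.startswith(prefix)
    }
    return StrippedContent(div.text or "", list(div), xml_attrs)


content = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" '
    'xmlns:xhtml="http://www.w3.org/1999/xhtml" type="xhtml">'
    '<xhtml:div xml:lang="fr">Oui!</xhtml:div>'
    "</content>"
)
stripped = strip_div(content)
```

In the "Oui!" case `stripped.children` is empty, so there is no element to push xml:lang down onto; the out-of-band `div_attributes` is the only place the language survives.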
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 23:53, James M Snell wrote:
> For instance, if I have
>
>     <content xml:lang="en"><div xml:lang="fr">...</div></content>
>
> and I drop the div silently, then I've got a problem.

Dropping the div shouldn't mean dropping the language and base URL context. You need to communicate those anyway in the case they are inherited from higher up in the document tree.

(When the script that generates my feed copies a node from one document tree to another, it checks the inherited language of the node being copied. If it differs from the inherited language of the insertion target, the newly inserted copy gets an explicit xml:lang.)

-- Henri Sivonen
http://hsivonen.iki.fi/
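The copying rule in the parenthetical above could look roughly like this. It is a guess at the shape of such a script, written in Python with `xml.etree.ElementTree`; `effective_lang` and `copy_into` are hypothetical helper names, and the parent lookup is rebuilt on each call purely to keep the sketch short.

```python
import copy
import xml.etree.ElementTree as ET

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_lang(element, root):
    """Return the xml:lang in scope for element, or "" if none is declared."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    node = element
    while node is not None:
        lang = node.get(XML_LANG)
        if lang is not None:
            return lang
        node = parents.get(node)
    return ""


def copy_into(node, source_root, target, target_root):
    """Append a deep copy of node to target, preserving its language context."""
    clone = copy.deepcopy(node)
    source_lang = effective_lang(node, source_root)
    if source_lang != effective_lang(target, target_root):
        # The inherited language would change at the new location,
        # so make it explicit on the copy.
        clone.set(XML_LANG, source_lang)
    target.append(clone)
    return clone


french_doc = ET.fromstring('<doc xml:lang="fr"><p>Oui!</p></doc>')
english_doc = ET.fromstring('<doc xml:lang="en"><p>Yes!</p></doc>')
feed = ET.fromstring('<feed xml:lang="en"><entry/></feed>')
entry = feed[0]

french_copy = copy_into(french_doc[0], french_doc, entry, feed)
english_copy = copy_into(english_doc[0], english_doc, entry, feed)
```

The French paragraph gains an explicit xml:lang because the feed is English, while the English paragraph is copied unchanged.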
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Thanks everyone for this really interesting discussion. I have added a note to this effect to the latest atom-owl ontology [1].

In Atom-Owl we could easily do both.

    [] :content "<xhtml:div xml:lang='fr'>Oui!</xhtml:div>"^^:xhtml .

or we could have

    [] :content "Oui!"@fr^^:xhtml .

or

    [] :content [ :xhtml "oui"; :lang "en" ] .

It would be simplest, I suppose, to have the :xhtml type be defined as always having a <div> ... element. Except that of course it would look odd for xhtml content that contains an html base tag such as

    <div> <html><head>...<body>...</body></head> </div>

From this discussion it looks like the most reasonable would be to strip the div element. In which case one may wonder what the whole purpose of putting the div in the content really was in the first place.

Henry

[1] https://sommer.dev.java.net/atom/2006-06-06/awol.html#term_xhtml
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On the other hand, if one strips the div element, then :xhtml can no longer be an inverse functional property, as contents with different bases could have very different meanings. Just think of xhtml content with a picture, which in one subtree points to Bush, and in another one points to Gore, the relative URI references being the same in both cases.

This seems to make it more reasonable to create a new literal type which contains the div. (It makes finding duplicates in an RDF database easier.)

On that topic, are there not xhtml ways to create xml:base and xml:lang elements? Should those not perhaps be used instead on the div element?

Henry
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
I've just made a change in our implementation that in the following case...

    <content type="xhtml" xml:lang="en" xml:base="foo/">
      <div xml:lang="fr" xml:base="bar">Oui!</div>
    </content>

* Content.getLanguage() will return "fr"
* Content.getBaseUri() will return "foo/bar"

The other possible attributes are still available via a Div interface, but in the default case, things just sort of work themselves out.

- James
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
* Henri Sivonen [EMAIL PROTECTED] [2006-06-29 00:20]:
> On Jun 28, 2006, at 23:53, James M Snell wrote:
> > For instance, if I have <content xml:lang="en"><div xml:lang="fr">...</div></content> and I drop the div silently, then I've got a problem.
>
> Dropping the div shouldn't mean dropping the language and base URL context. You need to communicate those anyway in the case they are inherited from higher up in the document tree.

Exactly.

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
* Antone Roundy [EMAIL PROTECTED] [2006-06-28 21:30]:
> Consider this:
>
>     <entry xml:lang="en" xml:base="http://example.com/foo/">
>       ...
>       <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
>       </content>
>     </entry>
>
> Whether there's a problem depends on whether one requests the xml:base, xml:lang, or whatever for the atom:content element itself or for the CONTENT OF the atom:content element, in which case the library could return the values it got from the xhtml:div. Except in unusual cases like this, the result would be identical.

I can see your argument, but I find this too fine a distinction. The `div` is part of the container when `type=xhtml` as far as I’m concerned. I’d just merge the information with that on the `content` element and pretend there’s no difference. As far as the feed’s *meaning* is concerned, there isn’t, after all.

> * give me the raw contents of the atom:content element
> * give me the contents of the atom:content element converted to well-formed XHTML (whether it started as text, escaped tag soup, or inline xhtml)
>
> In the former case, keeping the div feels like the right thing to do -- the consuming app would have to know to remove it. In the latter case, removing the div from xhtml content feels like the right thing to do.

Yes, that sounds sane. “Give me the raw contents” would be something only an Atom-aware API client would want to do, so it is reasonable to expect that the client knows what to do with the `div` when it finds that the content type was `xhtml`. Anyone who just wants to use the data and doesn’t want to have to care about how Atom works should just ask for XHTML and not care what it was originally packaged as.

> ...now that I think about it, if the library always returns the xml:base which applies to the content of the element, that could cause trouble too:
>
>     <entry xml:lang="en" xml:base="http://example.com/">
>       ...
>       <content type="xhtml">
>         <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
>       </content>
>     </entry>
>
> Here, if I get xml:base for the content of content, it will be "http://example.com/feu/". Then, if I get the raw content of the element, strip the div, and apply xml:base myself, I'll erroneously use "http://example.com/feu/feu/" as the base URI unless I know to ignore the xml:base attribute on the div.

I agree, but I don’t see how that’s at all to the point. Such an API client is just buggy. If they ask for the raw `content` content, then they should also ask for the `content` base URI, not for the content’s base URI. Guiding API clients to avoid such a mistake should be reasonably easy by naming the methods appropriately, i.e. something like `get_container_content` and `get_container_base` vs `get_content` and `get_base`. (That the first pair of names is so long is fully intentional… :-) )

Regards,
--
Aristotle Pagaltzis // http://plasmasturm.org/
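To make the two method pairs concrete, here is a minimal Python sketch of what such an API might look like. Only the four method names come from the message above; the class `XhtmlContent`, its constructor arguments, and the return shapes are assumptions made for illustration, and the example reuses the feu/ entry from the quoted text.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urljoin

XHTML = "http://www.w3.org/1999/xhtml"
XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"


class XhtmlContent:
    """Hypothetical wrapper around a parsed atom:content with type="xhtml"."""

    def __init__(self, content, entry_base):
        self._content = content
        self._div = content.find(f"{{{XHTML}}}div")
        self._entry_base = entry_base

    def get_container_base(self):
        # Base URI in scope on atom:content itself; the div is not applied.
        return urljoin(self._entry_base, self._content.get(XML_BASE, ""))

    def get_container_content(self):
        # Raw children of atom:content, i.e. the wrapper div, untouched.
        return list(self._content)

    def get_base(self):
        # Base URI that applies to the payload, with the div's xml:base merged in.
        return urljoin(self.get_container_base(), self._div.get(XML_BASE, ""))

    def get_content(self):
        # Payload without the div: its leading text and its child elements.
        return (self._div.text or "", list(self._div))


content = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" type="xhtml">'
    f'<div xmlns="{XHTML}" xml:lang="fr" xml:base="feu/">'
    '<a href="axe.html">axe</a></div></content>'
)
c = XhtmlContent(content, "http://example.com/")

# Matching pair: payload base with payload content.
right = urljoin(c.get_base(), "axe.html")

# Mixed pair (the buggy client): payload base plus the raw div's own xml:base.
raw_div = c.get_container_content()[0]
wrong = urljoin(urljoin(c.get_base(), raw_div.get(XML_BASE)), "axe.html")

print(right)
print(wrong)
```

The mixed pair resolves the div's xml:base twice and lands under feu/feu/, which is the mistake the longer method names are meant to discourage.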
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On 6/28/06, James M Snell [EMAIL PROTECTED] wrote:
> Actually, switch this. I realized after I sent this that I had it backwards. The default behavior will be to not return the div. A separate API will provide the content with the div.

Next time, don't start out with egregious obfuscation, and then kick and scream through tons of list traffic with beyond-bogus arguments. Here's how it started:

http://mail-archives.apache.org/mod_mbox/incubator-abdera-dev/200606.mbox/[EMAIL PROTECTED]

It's a waste of other people's time. Once or twice is understandable, reasonable people sometimes disagree on basic things. I think we're up to 20 or 30 of these incidents with you, though. At this point, it's not something to be replied to with a smart remark. It belies contempt for your colleagues. We shouldn't have to sit here and listen to specious tripe because it sounds semi-plausible to a non-implementor. It's abusive, and it's much worse than the nasty messages so many of us have sent.

--
Robert Sayre
http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
When XHTML content is used, "The XHTML div element itself MUST NOT be considered part of the content."

http://atompub.org/rfc4287.html#rfc.section.4.1.3.3

This is hard to test with aggregators, but conforming libraries definitely need to get this right.

http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests

--
Robert Sayre
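One way a library might honour that requirement is to hand back only what sits inside the wrapper div. The Python sketch below is an illustration under stated assumptions, not a conformance implementation: the function name `xhtml_payload` is hypothetical, the serialization relies on `xml.etree.ElementTree`, and the language and base URI carried by the div would still have to be reported separately.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

XHTML = "http://www.w3.org/1999/xhtml"


def xhtml_payload(content):
    """Hypothetical helper: serialize the children of the wrapper div,
    leaving the div itself out of the returned markup."""
    children = list(content)
    # RFC 4287 requires exactly one XHTML div as the child of the element.
    if len(children) != 1 or children[0].tag != f"{{{XHTML}}}div":
        raise ValueError("type='xhtml' content must hold exactly one XHTML div")
    div = children[0]
    # Leading text is re-escaped; each child is serialized with its tail text.
    parts = [escape(div.text or "")]
    parts.extend(ET.tostring(child, encoding="unicode") for child in div)
    return "".join(parts)


content = ET.fromstring(
    '<content xmlns="http://www.w3.org/2005/Atom" type="xhtml">'
    f'<div xmlns="{XHTML}" xml:lang="fr">Oui!</div>'
    "</content>"
)
print(xhtml_payload(content))  # Oui!
```

Rejecting anything other than a single div keeps malformed feeds from being silently passed through as if they were valid XHTML content.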
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Please define conformance in regards to this test. That is, what is the exact behavior that a library must perform when a code library has an API like, getContent on the content element. e.g., is a parser not conformant if it passes the DIV on to the consuming application with the expectation that the application is responsible for doing the right thing with it?

Robert Sayre wrote:
> When XHTML content is used, "The XHTML div element itself MUST NOT be considered part of the content."
>
> http://atompub.org/rfc4287.html#rfc.section.4.1.3.3
>
> This is hard to test with aggregators, but conforming libraries definitely need to get this right.
>
> http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On 6/27/06, James M Snell [EMAIL PROTECTED] wrote:
> Please define conformance in regards to this test. That is, what is the exact behavior that a library must perform when a code library has an API like, getContent on the content element. e.g., is a parser not conformant if it passes the DIV on to the consuming application with the expectation that the application is responsible for doing the right thing with it?

Don't be dense. Would the parser be conformant if it passed on the feed, entry, and div elements with that expectation? I'll file a bug on UFP and I bet you it'll get fixed without a question, because there won't be a bad-faith interpretation to fight. That's two demerits this week for you. Tsk tsk.

--
Robert Sayre

"I would have written a shorter letter, but I did not have the time."
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
I'm shooting for at least five demerits. Otherwise, the week will be completely sunk.

And yes, the parser would be conformant. Abdera is conformant even tho it is possible to use Abdera to produce and read invalid Atom. Returning the div in the getContent method is incorrect and I'm fixing that now; making the div available for the application using Abdera should be ok. I want to make sure this conformance test isn't saying that the parser must hide the div completely.

- James

Robert Sayre wrote:
> On 6/27/06, James M Snell [EMAIL PROTECTED] wrote:
> > Please define conformance in regards to this test. That is, what is the exact behavior that a library must perform when a code library has an API like, getContent on the content element. e.g., is a parser not conformant if it passes the DIV on to the consuming application with the expectation that the application is responsible for doing the right thing with it?
>
> Don't be dense. Would the parser be conformant if it passed on the feed, entry, and div elements with that expectation? I'll file a bug on UFP and I bet you it'll get fixed without a question, because there won't be a bad-faith interpretation to fight. That's two demerits this week for you. Tsk tsk.
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
Robert Sayre wrote:
> When XHTML content is used, "The XHTML div element itself MUST NOT be considered part of the content."
>
> http://atompub.org/rfc4287.html#rfc.section.4.1.3.3

FWIW, most aggregators that I've tested do not strip the div element.

Regards
James