Re: type=HTML
On Tue, 08 Feb 2005 15:36:11 +0100, Julian Reschke [EMAIL PROTECTED] wrote: Shouldn't we at least give content producers the hint that producing XHTML content is preferred over HTML? (sorry if I'm opening a can of worms here) Sounds reasonable, but as type=XHTML. Escaping XHTML seems to be defeating the object somewhat (we should be encouraging XML processing rather than tag soup microparsing). -- http://dannyayers.com
Re: PaceXhtmlNamespaceDiv
Henri Sivonen wrote: On Feb 9, 2005, at 15:28, Sam Ruby wrote: Here's the key question. Consider the following XML fragment: <summary type='XHTML'><div xmlns='http://www.w3.org/1999/xhtml'>Hey, this is my space, if I want to run a picture of a chair I can. And its a <em>nice</em> chair.</div></summary> Given this fragment, what is the value of the summary? Is the div element to be considered part of the format (and therefore not part of the summary). Or is the div element to be considered part of the summary itself. The div is part of the summary according to current spec text. That's what I want to change. I've updated the Pace to make this clearer. I replaced the abstract completely, and added one sentence to the proposal. New abstract: Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. New spec text: The xhtml:div element itself MUST NOT be considered part of the content. - Sam Ruby
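The two readings of the fragment above can be compared with a short sketch. This is only an illustration in Python's standard library, not anything proposed on the list: the helper name `inner_markup` is hypothetical, and the summary element is left without a namespace exactly as in the fragment quoted in the message.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# The fragment from the message, with its markup written out.
FRAGMENT = (
    "<summary type='XHTML'>"
    "<div xmlns='http://www.w3.org/1999/xhtml'>Hey, this is my space, "
    "if I want to run a picture of a chair I can. "
    "And its a <em>nice</em> chair.</div>"
    "</summary>"
)


def inner_markup(element):
    """Serialize an element's text and children, but not the element itself.

    Hypothetical helper, used only for this illustration.
    """
    parts = [element.text or ""]
    for child in element:
        # ET.tostring includes the child's tail text as well.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


# Serialize XHTML elements with a default namespace instead of ns0: prefixes.
ET.register_namespace("", XHTML_NS)

summary = ET.fromstring(FRAGMENT)
div = summary[0]

# Reading 1: the div is part of the summary (the spec text at the time).
div_is_content = inner_markup(summary)

# Reading 2: the div is part of the format (the proposed change).
div_is_format = inner_markup(div)

print(div_is_content)
print(div_is_format)
```

Under the first reading a consumer hands the div on to whatever displays the summary; under the second it hands on only the text and the em element.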
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: That's what I want to change. I've updated the Pace to make this clearer. I replaced the abstract completely, and added one sentence to the proposal. New abstract: Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. New spec text: The xhtml:div element itself MUST NOT be considered part of the content. I find it a bit problematic to use common practice in Atom feeds as justification for spec changes. Let's make the spec as clear and simple as possible. If this is in conflict with common usage in experimental Atom feeds, so be it. Best regards, Julian -- green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
Re: PaceXhtmlNamespaceDiv
Julian Reschke wrote: Sam Ruby wrote: That's what I want to change. I've updated the Pace to make this clearer. I replaced the abstract completely, and added one sentence to the proposal. New abstract: Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. New spec text: The xhtml:div element itself MUST NOT be considered part of the content. I find it a bit problematic to use common practice in Atom feeds as justification for spec changes. Let's make the spec as clear and simple as possible. If this is in conflict with common usage in experimental Atom feeds, so be it. That is consistent with your prior statement that you don't believe that implementation issues should affect the format: http://www.imc.org/atom-syntax/mail-archive/msg12699.html Yes, I want a spec that is simple. I also want a spec that average people can implement simply and correctly. We have seen on this very mailing list people who have an above average understanding of XML trip over this particular area numerous times. I am not content to create a format for which the answers to such common user errors is so be it. - Sam Ruby
PaceXhtmlNamespaceDiv
I've updated the examples as follows: Removed the style attribute from the div in one--if the div is not part of the content, it doesn't make sense to me to allow it to control styling of the content. Yeah, I wrote the original example, but I hadn't thought through everything clearly enough yet. Added an example that presumes that the XHTML namespace has already been bound to the prefix xhtml.
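As a rough illustration of why the prefixed example is equivalent to the default-namespace one, the sketch below parses both spellings and compares the expanded element names. The two documents are made up for this purpose (they are not the Pace's actual examples), the prefix is bound on the summary element only for brevity, and `expanded_names` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Illustrative document: XHTML namespace declared as the default on the div.
DEFAULT_FORM = (
    "<summary type='XHTML'>"
    "<div xmlns='http://www.w3.org/1999/xhtml'>"
    "<p>What a nice day!</p></div></summary>"
)

# Illustrative document: XHTML namespace already bound to the prefix xhtml.
PREFIXED_FORM = (
    "<summary type='XHTML' xmlns:xhtml='http://www.w3.org/1999/xhtml'>"
    "<xhtml:div><xhtml:p>What a nice day!</xhtml:p></xhtml:div></summary>"
)


def expanded_names(xml_text):
    """Return every element name in document order as {namespace}local."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]


default_names = expanded_names(DEFAULT_FORM)
prefixed_names = expanded_names(PREFIXED_FORM)

print(default_names)
print(prefixed_names)
```

A namespace-aware consumer sees the same names either way, so the choice between the two spellings only matters to whoever writes the markup.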
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: New abstract: Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. New spec text: The xhtml:div element itself MUST NOT be considered part of the content. I find it a bit problematic to use common practice in Atom feeds as justification for spec changes. Let's make the spec as clear and simple as possible. If this is in conflict with common usage in experimental Atom feeds, so be it. That is consistent with your prior statement that you don't believe that implementation issues should affect the format: http://www.imc.org/atom-syntax/mail-archive/msg12699.html Yes, I want a spec that is simple. I also want a spec that average people can implement simply and correctly. We have seen on this very mailing list people who have an above average understanding of XML trip over this particular area numerous times. I am not content to create a format for which the answers to such common user errors is so be it. However, what is the problem with people using a DIV element inside SUMMARY and the CONTENT element if they wish to do so? By the way, I have read the thing you wrote about things like planet copy the contents and put it in their own DIV element but if that is how they are going to treat Atom, Atom will not be solving anything and will just be another RSS I guess. Authors who do copy and paste and others should always validate their feed. I guess the feed validator could flag elements that are in the Atom namespace and should not be there according to the latest updates of the Atom namespace. Eventually, I guess it is about getting the major weblog systems and companies to get their implementation right. The Atom WG and other people should also provide tutorials on how to create Atom feeds and how to make sure everything works as it should. 
-- Anne van Kesteren http://annevankesteren.nl/
Re: PaceXhtmlNamespaceDiv
On Feb 10, 2005, at 18:02, Sam Ruby wrote: We have seen on this very mailing list people who have an above average understanding of XML trip over this particular area numerous times. Those trip-ups have not been as much about div vs. no div but about XMLNS which we can't and should not attempt to change. I should also note that typed examples on the list and output from debugged serializers are different things.* * Aka. the tools will save us argument. Despite the tools will save us argument being unpopular, I think it is unwise for an average developer to approach XMLNS without proper tools. -- Henri Sivonen [EMAIL PROTECTED] http://iki.fi/hsivonen/
Re: PaceXhtmlNamespaceDiv
Julian Reschke wrote: Sam Ruby wrote: That is consistent with your prior statement that you don't believe that implementation issues should affect the format: http://www.imc.org/atom-syntax/mail-archive/msg12699.html What I said is that very *specific* implementation issue shouldn't affect the format. Please cite correctly. I also posted the following clarification in http://www.imc.org/atom-syntax/mail-archive/msg12697.html: OK, I'll try to rephrase: changing the protocol format because one implementor says that this makes it easier to implement IMHO is a bad idea. Of course things look differently if this issue affects more platforms/parsers/toolkits. So yes, more information is needed. Yes, I want a spec that is simple. I also want a spec that average people can implement simply and correctly. We have seen on this very mailing list people who have an above average understanding of XML trip over this particular area numerous times. I am not content to create a format for which the answers to such common user errors is so be it. Nor am I. The question is what's the best way to enhance the spec. One alternative suggestion was made by Martin Dürst in http://www.imc.org/atom-syntax/mail-archive/msg13531.html: Note: It is important to make sure that correct namespace declarations for XHTML are present. One way to do this is by using an xhtml:div element as the content of the atom:content element and specifying the XHTML namespace on that div element. Here are some examples: ... [use proposed examples] There are other ways to declare the namespace URI for XHTML content; this specification does not limit the placement of such declarations in any way. My issue with that wording is that it doesn't make it clear whether the xhtml:div that is added is to be considered a part of the content or not. Put another way, how does the consumer know that if a given xhtml:div element is part of the content, or was added per the above? 
Julian, you previously said Let's make the spec as clear and simple as possible. How about this: xhtml:div is required. xhtml:div is not part of the content. Clear. Simple. And difficult to get wrong. - Sam Ruby
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: Nor am I. The question is what's the best way to enhance the spec. One alternative suggestion was made by Martin Dürst in http://www.imc.org/atom-syntax/mail-archive/msg13531.html: Note: It is important to make sure that correct namespace declarations for XHTML are present. One way to do this is by using an xhtml:div element as the content of the atom:content element and specifying the XHTML namespace on that div element. Here are some examples: ... [use proposed examples] There are other ways to declare the namespace URI for XHTML content; this specification does not limit the placement of such declarations in any way. My issue with that wording is that it doesn't make it clear whether the xhtml:div that is added is to be considered a part of the content or not. I'd assume it's part of the content because that's what the spec currently says. Put another way, how does the consumer know that if a given xhtml:div element is part of the content, or was added per the above? It is, unless the spec says otherwise. Julian, you previously said Let's make the spec as clear and simple as possible. How about this: xhtml:div is required. xhtml:div is not part of the content. Clear. Simple. And difficult to get wrong. Well, but not sufficient as spec text right? To summarize my p.o.v.: - the spec shouldn't require any specific container element for XHTML content, - the spec should warn people that the child elements MUST be in the XHTML namespace if the recipient is supposed to interpret them as XHTML markup, - whether or not a feed producer puts in a div container doesn't seem to be relevant to me as it doesn't affect the semantics of what the text construct carries. Best regards, Julian -- green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
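The second point in the summary above (child elements must be in the XHTML namespace) is something a consumer or validator could check mechanically. The following is a minimal sketch under that assumption; `non_xhtml_descendants` is a hypothetical helper, not part of any spec text or of the feed validator, and the sample documents are illustrative.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"


def non_xhtml_descendants(text_construct):
    """List the names of elements below a text construct that are not XHTML.

    Hypothetical helper: an empty list means every descendant element is in
    the XHTML namespace, whether or not a div container is present.
    """
    prefix = "{" + XHTML_NS + "}"
    return [
        el.tag
        for child in text_construct
        for el in child.iter()
        if not el.tag.startswith(prefix)
    ]


# Namespace declared on the content itself; no div container involved.
good = ET.fromstring(
    "<summary type='XHTML'>"
    "<p xmlns='http://www.w3.org/1999/xhtml'>What a nice day!</p>"
    "</summary>"
)

# Namespace declaration forgotten; the p element is in no namespace at all.
bad = ET.fromstring("<summary type='XHTML'><p>What a nice day!</p></summary>")

print(non_xhtml_descendants(good))
print(non_xhtml_descendants(bad))
```

The check is independent of whether a div wrapper exists, which matches the position that the wrapper itself does not change what the text construct carries.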
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: xhtml:div is required. xhtml:div is not part of the content. Clear. Simple. And difficult to get wrong. I'd much prefer: xhtml:div is required. xhtml:div is part of the content. But I can live with it either way - James M Snell
Re: PaceXhtmlNamespaceDiv
On 10 Feb 2005, at 3:35 pm, Sam Ruby wrote: The xhtml:div element itself MUST NOT be considered part of the content. What does this mean? Define content and considered please. Graham
Re: PaceXhtmlNamespaceDiv
Robert Sayre wrote: Julian Reschke wrote: So do you think we'll have to live with that, or should the spec be clarified/changed to reduce the chance of people getting it wrong? I think Sam's approach is best. The objections are all impractical pedantry. I think the proposal won't really help for cases where people don't know what they do and/or use the wrong tools, but adds completely unnecessary complexity for everybody else. Best regards, Julian -- green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
Re: PaceXhtmlNamespaceDiv
Julian Reschke wrote: To summarize my p.o.v.: - the spec shouldn't require any specific container element for XHTML content, We continue to talk past one another. The above line is key. Some examples might help. Perhaps once we are actually understanding each other's points, then we can work backward from there to spec text. So, suppose my XHTML content is: <p>What a nice day!</p> My XHTML container element is p. That is completely my choice. It is not required by the spec. Now if I place that inside an atom feed, I'm going to get something like this (heavily elided, all namespace details omitted): <feed> <entry> <summary> <p>What a nice day!</p> </summary> </entry> </feed> Depending on how the question is phrased, one could take the position that feed, entry, and summary are container elements. Or not. Again, depending on how the question is phrased. I don't believe that these elements are the ones that you have an issue with. Correct? Now, consider a different document, again heavily elided, etc: <feed> <entry> <summary> <div> <p>What a nice day!</p> </div> </summary> </entry> </feed> The key difference between these two documents is that instead of three elements around which there should be no issue, there now are four. But for some reason, this causes a big controversy. My theory is that the controversy is that people initially assumed that this div element was to be considered part of the content and not part of the format. And thereby was mandating that all content have a given container element. An entirely unreasonable mandate. I agree that this would be an unreasonable mandate. But I don't want to force a top level container element for the xhtml, I want to define a bottom level container element in the format for the xhtml. There is a big difference. The difference between four feed container elements and mandating that all xhtml content have a uniform top level container element. Which again, I will agree is an entirely unreasonable assumption. 
- - - On the optimistic presumption that you are with me so far, I'll press on. What desirable characteristics are there for feed container elements in this circumstance? To answer that question, it is important to understand how CMS software tends to be implemented. In particular, how they are layered. This is difficult as there isn't any one reference implementation that we can consult. We also need to consider software which isn't written yet. As I said, this is difficult. But we can observe common problems that people have had, and try to engineer a solution that avoids them. I hold the belief that if somebody writes a simple and clear spec that a significant number of people get wrong, that we are looking at a spec bug. Enough hand waving, on to the problem at hand. What we are looking at here is an xhtml fragment. Not a complete xhtml document, but some fragment of a web page. Now, fragments tend not to exist independent of a context. And in virtually all xhtml documents I have seen (including the ones I produce), any fragment presumes that the xhtml namespace was defined as the default namespace earlier in the document (in particular, on the document element). So, a desirable characteristic for a container element would be one in which the default namespace can be set. At this point, the discussion can fragment into any number of different directions. - - - One is for those who view XML as merely one potential serialization format, and something that their tool takes care of for them. For them, double escaping the content is the right answer, the simplest thing that can possibly work, end of discussion. While neither you nor I are in that camp (nor is Norm, and others), I am quite willing to leave that as a valid option, as long as it is explicitly declared. Another is to declare the use of default namespaces as evil, and rewrite both the document and the content to use explicit namespaces on every element. This may very well be where you and I part ways. 
If so, peace. Just please give the people who want to use default namespaces the same consideration that I am willing to give those who wish to double escape. And finally, there is a desire to create a format that can be done entirely with default namespaces, and without the need to rewrite or modify the content. The simple fact is that well formed xhtml does not always exist in the form of DOM nodes. Sometimes it is serialized as a string and stored in a file or a MySQL database. That does not make it any less well formed. It doesn't mean that it wasn't produced by a proper tool. Not having seen Tim's implementation, I'm just speculating at this point, but it probably falls into this category. Based on the tools he is using, he is confident that his content is well formed, even if it is stored as a string. As such, he can confidently use
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: [..snip excellent rationale..] So, a desirable characteristic for a container element would be one in which the default namespace can be set. That is not a desirable characteristic. At this point, the discussion can fragment into any number of different directions. [...] Another is to declare the use of default namespaces as evil, and rewrite both the document and the content to use explicit namespaces on every element. This may very well be where you and I part ways. If so, peace. Just please give the people who want to use default namespaces the same consideration that I am willing to give those who wish to double escape. I believe the easiest, most robust, least error-prone approach to this sort of problem is to attempt to eliminate default namespace usage whenever possible. Every time a default namespace is elided system robustness and comprehension are improved - I've never seen it work the other way. And finally, there is a desire to create a format that can be done entirely with default namespaces, and without the need to rewrite or modify the content. That is a questionable desire. It leads us directly to promoting the use of a div wrapper to protect XHTML from Atom. Any container format that can so easily damage content that we have to enforce a shim to protect it arguably has a design flaw. Atom is just the most recent of a string of flawed container formats. So, what would a desirable feed container element be for this scenario? I would suggest that it would be something in the xhtml namespace. If it were in the atom namespace, you would have to do something along the lines of: <atom:summary xmlns:atom=... xmlns=...> Sam is 100% right that this is a problem. I arrive at a very different conclusion. 
If you are still with me, what I am proposing is that the simplest and cleanest solution for people who like default namespaces would be to define the format so that there is an xhtml:div element between the atom:summary and the xhtml fragment that is being syndicated. It's interesting you call them out so specifically, but no - default namespaces are the problem. Free your mind, and all that. This can be solved in a general way, not just for XHTML, by banning the use of default namespaces for Atom elements. That means the Atom format would actively subset XMLNS. I see that as a preferable option to anything presented in this thread. [Although it's time past for paces, I have one on this computer somewhere for default namespaces, but after I got shouted down last year about xmlns= I didn't think there was much point. Maybe I'll publish it on April 1st] In the meantime I support Sam's position, but think we're missing an opportunity to produce a more robust XML container format. cheers Bill
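The layering argument in this exchange (XHTML stored as a string, written on the assumption of a default XHTML namespace, then dropped into a feed that has its own default namespace) can be sketched as follows. This is an illustration only: `ATOM_NS` is a placeholder URI rather than the real Atom namespace, the stored fragment is made up, and the three helper functions are hypothetical.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Placeholder only: stands in for whatever namespace the feed format uses.
ATOM_NS = "http://example.org/atom-placeholder"

# A well-formed fragment as it might sit in a file or database column,
# written on the assumption that XHTML is the default namespace.
STORED_FRAGMENT = "<p>What a nice day!</p><p>And a <em>nice</em> chair.</p>"


def summary_with_div(fragment):
    """Wrap the stored string in a div that resets the default namespace."""
    return (
        '<summary xmlns="%s" type="XHTML">'
        '<div xmlns="%s">%s</div>'
        "</summary>" % (ATOM_NS, XHTML_NS, fragment)
    )


def summary_without_div(fragment):
    """Paste the stored string directly under the feed's default namespace."""
    return '<summary xmlns="%s" type="XHTML">%s</summary>' % (ATOM_NS, fragment)


def content_tags(summary_xml):
    """Expanded names of every element below the summary, in document order."""
    root = ET.fromstring(summary_xml)
    return [el.tag for child in root for el in child.iter()]


with_div = content_tags(summary_with_div(STORED_FRAGMENT))
without_div = content_tags(summary_without_div(STORED_FRAGMENT))

print(with_div)     # every element lands in the XHTML namespace
print(without_div)  # every element lands in the feed's namespace instead
```

With the wrapper, the stored string needs no rewriting; without it, the producer has to add a namespace declaration or prefixes to the content itself, which is the rewriting step under dispute. Using prefixes for the feed's own elements, as suggested in the message above, removes the inherited default namespace but still leaves the pasted fragment outside the XHTML namespace unless something declares it.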
Re: PaceXhtmlNamespaceDiv
Sam Ruby wrote: Julian Reschke wrote: Sam, thanks for the long reply. I'll try my best to dig it and to offer constructive remarks... To summarize my p.o.v.: - the spec shouldn't require any specific container element for XHTML content, We continue to talk past one another. The above line is key. Some examples might help. Perhaps once we are actually understanding each other's points, then we can work backward from there to spec text. So, suppose my XHTML content is: <p>What a nice day!</p> My XHTML container element is p. That is completely my choice. It is not required by the spec. Yep. Now if I place that inside an atom feed, I'm going to get something like this (heavily elided, all namespace details omitted): <feed> <entry> <summary> <p>What a nice day!</p> </summary> </entry> </feed> Yep. Depending on how the question is phrased, one could take the position that feed, entry, and summary are container elements. Or not. Again, depending on how the question is phrased. Fine with me. I don't believe that these elements are the ones that you have an issue with. Correct? Yes. Now, consider a different document, again heavily elided, etc: <feed> <entry> <summary> <div> <p>What a nice day!</p> </div> </summary> </entry> </feed> The key difference between these two documents is that instead of three elements around which there should be no issue, there now are four. But for some reason, this causes a big controversy. My theory is that the controversy is that people initially assumed that this div element was to be considered part of the content and not part of the format. And thereby was mandating that all content have a given container element. An entirely unreasonable mandate. Well, the current spec says it's part of the content. I personally feel it really doesn't matter. Adding DIVs around XHTML content doesn't change the semantics of the content, in particular if it doesn't carry any additional attributes. 
So, I wouldn't have any problems with recipients that collapse multiple nested xhtml:div elements into one or none (in absence of other attributes on it). I agree that this would be an unreasonable mandate. But I don't want to force a top level container element for the xhtml, I want to define a bottom level container element in the format for the xhtml. There is a big difference. It's still hard to see the difference. It's certainly not obvious on the syntactical level, and at the end of the day, that's what we are discussing here, right? The difference between four feed container elements and mandating that all xhtml content have a uniform top level container element. Which again, I will agree is an entirely unreasonable assumption. - - - On the optimistic presumption that you are with me so far, I'll press on. Not entirely, but trying :-) What desirable characteristics are there for feed container elements in this circumstance? To answer that question, it is important to understand how CMS software tends to be implemented. In particular, how they are layered. This is difficult as there isn't any one reference implementation that we can consult. We also need to consider software which isn't written yet. As I said, this is difficult. But we can observe common problems that people have had, and try to engineer a solution that avoids them. I hold the belief that if somebody writes a simple and clear spec that a significant number of people get wrong, that we are looking at a spec bug. Sure. But, are we looking at the whole set of implementors, or only those who actually read the spec? We all know that those sets aren't identical... Enough hand waving, on to the problem at hand. What we are looking at here is an xhtml fragment. Not a complete xhtml document, but some fragment of a web page. Yes. Now, fragments tend not to exist independent of a context. 
And in virtually all xhtml documents I have seen (including the ones I produce), any fragment presumes that the xhtml namespace was defined as the default namespace earlier in the document (in particular, on the document element). Well, that depends how you define fragment. For instance, I can use XSLT to produce that fragment and I certainly don't have to make any assumptions about default namespaces. The XSLT processor takes care of that for me. The same thing applies when serializing a node set from a namespace-aware DOM (at least that's what I'd expect and MSXML has been doing for years now). So, a desirable characteristic for a container element would be one in which the default namespace can be set. I disagree that this is important, but the atom text constructs do have that characteristic already. At this point, the discussion can fragment into any number of different directions. - - - One is for those who view XML as merely one potential serialization format, and something that their tool takes care of for them. For them,