Re: type=HTML

2005-02-10 Thread Danny Ayers

On Tue, 08 Feb 2005 15:36:11 +0100, Julian Reschke
[EMAIL PROTECTED] wrote:

 Shouldn't we at least give content producers the hint that producing
 XHTML content is preferred over HTML? (sorry if I'm opening a can of
 worms here)

Sounds reasonable, but as type=XHTML. 

Escaping XHTML seems to be defeating the object somewhat (we should be
encouraging XML processing rather than tag soup microparsing).


-- 

http://dannyayers.com



Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Sam Ruby
Henri Sivonen wrote:
On Feb 9, 2005, at 15:28, Sam Ruby wrote:
Here's the key question.  Consider the following XML fragment:
  <summary type='XHTML'><div xmlns='http://www.w3.org/1999/xhtml'>Hey, 
this is my space, if I want to run a picture of a chair I can. And 
its a <em>nice</em> chair.</div></summary>

Given this fragment, what is the value of the summary?  Is the div 
element to be considered part of the format (and therefore not part of 
the summary)?  Or is the div element to be considered part of the 
summary itself?
The div is part of the summary according to current spec text.
That's what I want to change.  I've updated the Pace to make this 
clearer.  I replaced the abstract completely, and added one sentence to 
the proposal.

New abstract:
  Given that common practice is to include this element, making it
  mandatory makes things clearer to both people who are producing
  consuming tools based on the spec, and people who are producing new
  feeds based on copy and paste.
New spec text:
  The xhtml:div element itself MUST NOT be considered part of the
  content.
- Sam Ruby
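
The two readings of the fragment above can be sketched in code. This is only an illustration, assuming Python's standard xml.etree.ElementTree; the helper inner_xml is a hypothetical name, and the Atom namespace is omitted just as it is in the fragment.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# The fragment from the message above (Atom namespace omitted, as there).
fragment = (
    "<summary type='XHTML'><div xmlns='http://www.w3.org/1999/xhtml'>"
    "Hey, this is my space, if I want to run a picture of a chair I can. "
    "And its a <em>nice</em> chair.</div></summary>"
)

def inner_xml(element):
    """Serialize an element's text and children, but not the element itself."""
    parts = [element.text or ""]
    for child in element:
        # ET.tostring also emits the child's tail text.
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)

# Serialize XHTML names without a prefix (illustrative choice only).
ET.register_namespace("", XHTML)

summary = ET.fromstring(fragment)
div = summary.find(f"{{{XHTML}}}div")

# Reading 1 (spec text at the time): the div is part of the summary.
div_is_content = inner_xml(summary)

# Reading 2 (the Pace): the div is format, so the value starts inside it.
div_is_format = inner_xml(div)
```

Under the second reading a consumer unwraps exactly one level before using the value; under the first it passes the div through untouched.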


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Julian Reschke
Sam Ruby wrote:
That's what I want to change.  I've updated the Pace to make this 
clearer.  I replaced the abstract completely, and added one sentence to 
the proposal.

New abstract:
  Given that common practice is to include this element, making it
  mandatory makes things clearer to both people who are producing
  consuming tools based on the spec, and people who are producing new
  feeds based on copy and paste.
New spec text:
  The xhtml:div element itself MUST NOT be considered part of the
  content.
I find it a bit problematic to use common practice in Atom feeds as 
justification for spec changes. Let's make the spec as clear and simple 
as possible. If this is in conflict with common usage in experimental 
Atom feeds, so be it.

Best regards, Julian
--
green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Sam Ruby
Julian Reschke wrote:
Sam Ruby wrote:
That's what I want to change.  I've updated the Pace to make this 
clearer.  I replaced the abstract completely, and added one sentence 
to the proposal.

New abstract:
  Given that common practice is to include this element, making it
  mandatory makes things clearer to both people who are producing
  consuming tools based on the spec, and people who are producing new
  feeds based on copy and paste.
New spec text:
  The xhtml:div element itself MUST NOT be considered part of the
  content.
I find it a bit problematic to use common practice in Atom feeds as 
justification for spec changes. Let's make the spec as clear and simple 
as possible. If this is in conflict with common usage in experimental 
Atom feeds, so be it.
That is consistent with your prior statement that you don't believe that 
implementation issues should affect the format:

http://www.imc.org/atom-syntax/mail-archive/msg12699.html
Yes, I want a spec that is simple.  I also want a spec that average 
people can implement simply and correctly.

We have seen on this very mailing list people who have an above average 
understanding of XML trip over this particular area numerous times.

I am not content to create a format for which the answer to such common 
user errors is "so be it".

- Sam Ruby


PaceXhtmlNamespaceDiv

2005-02-10 Thread Antone Roundy
I've updated the examples as follows:
Removed the style attribute from the div in one--if the div is not part 
of the content, it doesn't make sense to me to allow it to control styling 
of the content.  Yeah, I wrote the original example, but I hadn't 
thought through everything clearly enough yet.

Added an example that presumes that the XHTML namespace has already 
been bound to the prefix xhtml.
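
As a rough illustration of why the prefixed example is equivalent to the default-namespace one, the sketch below parses both spellings and compares the expanded element names. It assumes Python's standard xml.etree.ElementTree; the sample documents and the helper expanded_names are made up for this note, and the Atom namespace is left out for brevity.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# XHTML namespace declared as the default namespace on the div itself.
with_default = (
    f'<summary><div xmlns="{XHTML}"><p>What a nice day!</p></div></summary>'
)

# XHTML namespace already bound to the prefix "xhtml" on an ancestor.
with_prefix = (
    f'<summary xmlns:xhtml="{XHTML}">'
    "<xhtml:div><xhtml:p>What a nice day!</xhtml:p></xhtml:div>"
    "</summary>"
)

def expanded_names(xml_text):
    """Return every element name in document order, in {uri}local form."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]

# A namespace-aware parser sees the same names either way.
same = expanded_names(with_default) == expanded_names(with_prefix)
```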



Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Anne van Kesteren
Sam Ruby wrote:
New abstract:
  Given that common practice is to include this element, making it
  mandatory makes things clearer to both people who are producing
  consuming tools based on the spec, and people who are producing new
  feeds based on copy and paste.
New spec text:
  The xhtml:div element itself MUST NOT be considered part of the
  content.
I find it a bit problematic to use common practice in Atom feeds as 
justification for spec changes. Let's make the spec as clear and 
simple as possible. If this is in conflict with common usage in 
experimental Atom feeds, so be it.
That is consistent with your prior statement that you don't believe that 
implementation issues should affect the format:

http://www.imc.org/atom-syntax/mail-archive/msg12699.html
Yes, I want a spec that is simple. I also want a spec that average 
people can implement simply and correctly.

We have seen on this very mailing list people who have an above average 
understanding of XML trip over this particular area numerous times.

I am not content to create a format for which the answer to such common 
user errors is "so be it".
However, what is the problem with people using a DIV element inside 
SUMMARY and the CONTENT element if they wish to do so?

By the way, I have read what you wrote about things like Planet 
copying the contents and putting them in their own DIV element, but if 
that is how they are going to treat Atom, Atom will not be solving 
anything and will just be another RSS, I guess.

Authors who do copy and paste and others should always validate their 
feed. I guess the feed validator could flag elements that are in the 
Atom namespace and should not be there according to the latest updates 
of the Atom namespace.

Eventually, I guess it is about getting the major weblog systems and 
companies to get their implementation right. The Atom WG and other 
people should also provide tutorials on how to create Atom feeds and how 
to make sure everything works as it should.

--
 Anne van Kesteren
 http://annevankesteren.nl/


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Henri Sivonen
On Feb 10, 2005, at 18:02, Sam Ruby wrote:
We have seen on this very mailing list people who have an above 
average understanding of XML trip over this particular area numerous 
times.
Those trip-ups have not been as much about div vs. no div but about 
XMLNS which we can't and should not attempt to change. I should also 
note that typed examples on the list and output from debugged 
serializers are different things.*

* A.k.a. the "tools will save us" argument. Despite the "tools will save 
us" argument being unpopular, I think it is unwise for an average 
developer to approach XMLNS without proper tools.

--
Henri Sivonen
[EMAIL PROTECTED]
http://iki.fi/hsivonen/


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Sam Ruby
Julian Reschke wrote:
Sam Ruby wrote:
That is consistent with your prior statement that you don't believe 
that implementation issues should affect the format:

http://www.imc.org/atom-syntax/mail-archive/msg12699.html
What I said is that a very *specific* implementation issue shouldn't 
affect the format. Please cite correctly. I also posted the following 
clarification in 
http://www.imc.org/atom-syntax/mail-archive/msg12697.html:

OK, I'll try to rephrase: changing the protocol format because one 
implementor says that this makes it easier to implement IMHO is a bad 
idea. Of course things look different if this issue affects more 
platforms/parsers/toolkits.

So yes, more information is needed.
Yes, I want a spec that is simple.  I also want a spec that average 
people can implement simply and correctly.

We have seen on this very mailing list people who have an above 
average understanding of XML trip over this particular area numerous 
times.

I am not content to create a format for which the answer to such 
common user errors is "so be it".
Nor am I. The question is what's the best way to enhance the spec. One 
alternative suggestion was made by Martin Dürst in 
http://www.imc.org/atom-syntax/mail-archive/msg13531.html:

Note: It is important to make sure that correct namespace declarations
for XHTML are present. One way to do this is by using an xhtml:div
element as the content of the atom:content element and specifying
the XHTML namespace on that div element. Here are some examples:
... [use proposed examples]
There are other ways to declare the namespace URI for XHTML content;
this specification does not limit the placement of such declarations
in any way.
My issue with that wording is that it doesn't make it clear whether the 
xhtml:div that is added is to be considered a part of the content or not.

Put another way, how does the consumer know whether a given xhtml:div 
element is part of the content, or was added per the above?

Julian, you previously said "Let's make the spec as clear and simple as 
possible."  How about this:

  xhtml:div is required.  xhtml:div is not part of the content.
Clear.  Simple.  And difficult to get wrong.
- Sam Ruby


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Julian Reschke
Sam Ruby wrote:

Nor am I. The question is what's the best way to enhance the spec. One
alternative suggestion was made by Martin Dürst in 
http://www.imc.org/atom-syntax/mail-archive/msg13531.html:

Note: It is important to make sure that correct namespace declarations
for XHTML are present. One way to do this is by using an xhtml:div
element as the content of the atom:content element and specifying
the XHTML namespace on that div element. Here are some examples:
... [use proposed examples]
There are other ways to declare the namespace URI for XHTML content;
this specification does not limit the placement of such declarations
in any way.

My issue with that wording is that it doesn't make it clear whether the 
xhtml:div that is added is to be considered a part of the content or not.
I'd assume it's part of the content because that's what the spec 
currently says.

Put another way, how does the consumer know whether a given xhtml:div 
element is part of the content, or was added per the above?
It is, unless the spec says otherwise.
Julian, you previously said "Let's make the spec as clear and simple as 
possible."  How about this:

  xhtml:div is required.  xhtml:div is not part of the content.
Clear.  Simple.  And difficult to get wrong.
Well, but not sufficient as spec text, right?
To summarize my p.o.v.:
- the spec shouldn't require any specific container element for XHTML 
content,

- the spec should warn people that the child elements MUST be in 
the XHTML namespace if the recipient is supposed to interpret them as 
XHTML markup,

- whether or not a feed producer puts in a div container doesn't seem 
to be relevant to me as it doesn't affect the semantics of what the text 
construct carries.
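
The namespace requirement in the second point can be checked mechanically. The following is a minimal sketch, assuming Python's standard xml.etree.ElementTree; the function name is hypothetical, and the Atom namespace URI and the type="xhtml" spelling are the ones from the later RFC 4287, not necessarily those of the draft under discussion.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
ATOM = "http://www.w3.org/2005/Atom"  # assumption: final RFC 4287 namespace

def elements_outside_xhtml(text_construct):
    """List descendant element names that are not in the XHTML namespace."""
    xhtml_prefix = "{" + XHTML + "}"
    return [
        element.tag
        for element in text_construct.iter()
        if element is not text_construct
        and not element.tag.startswith(xhtml_prefix)
    ]

good = ET.fromstring(
    f'<summary xmlns="{ATOM}" type="xhtml">'
    f'<div xmlns="{XHTML}"><p>What a nice day!</p></div></summary>'
)

# The XHTML declaration was forgotten, so <p> inherits the Atom namespace.
bad = ET.fromstring(
    f'<summary xmlns="{ATOM}" type="xhtml"><p>What a nice day!</p></summary>'
)

good_problems = elements_outside_xhtml(good)
bad_problems = elements_outside_xhtml(bad)
```

The second document is the kind of mistake the thread keeps returning to: it is well formed, but the paragraph is an Atom element rather than an XHTML one.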

Best regards, Julian
--
green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread James M Snell
Sam Ruby wrote:
   xhtml:div is required.  xhtml:div is not part of the content.

 Clear.  Simple.  And difficult to get wrong.
I'd much prefer:
  xhtml:div is required. xhtml:div is part of the content.
But I can live with it either way
- James M Snell


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Graham
On 10 Feb 2005, at 3:35 pm, Sam Ruby wrote:
  The xhtml:div element itself MUST NOT be considered part of the
  content.
What does this mean? Define "content" and "considered" please.
Graham


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Julian Reschke
Robert Sayre wrote:
Julian Reschke wrote:
So do you think we'll have to live with that, or should the spec be 
clarified/changed to reduce the chance of people getting it wrong?

I think Sam's approach is best. The objections are all impractical 
pedantry.
I think the proposal won't really help in cases where people don't know 
what they are doing and/or use the wrong tools, but it adds completely 
unnecessary complexity for everybody else.

Best regards, Julian
--
green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Sam Ruby
Julian Reschke wrote:
To summarize my p.o.v.:
- the spec shouldn't require any specific container element for XHTML 
content,
We continue to talk past one another.  The above line is key.
Some examples might help.  Perhaps once we are actually understanding 
each other's points, then we can work backward from there to spec text.

So, suppose my XHTML content is:
  <p>What a nice day!</p>
My XHTML container element is p.  That is completely my choice.  It is 
not required by the spec.

Now if I place that inside an atom feed, I'm going to get something like 
this (heavily elided, all namespace details omitted):

  <feed>
    <entry>
      <summary>
        <p>What a nice day!</p>
      </summary>
    </entry>
  </feed>
Depending on how the question is phrased, one could take the 
position that feed, entry, and summary are container elements.  Or 
not.  Again, depending on how the question is phrased.

I don't believe that these elements are the ones that you have an issue 
with.  Correct?

Now, consider a different document, again heavily elided, etc:
  <feed>
    <entry>
      <summary>
        <div>
          <p>What a nice day!</p>
        </div>
      </summary>
    </entry>
  </feed>
The key difference between these two documents is that instead of three 
elements around which there should be no issue, there now are four.  But 
for some reason, this causes a big controversy.

My theory is that the controversy is that people initially assumed that 
this div element was to be considered part of the content and not part 
of the format.  And thereby was mandating that all content have a given 
container element.  An entirely unreasonable mandate.

I agree that this would be an unreasonable mandate.  But I don't want to 
force a top level container element for the xhtml, I want to define a 
bottom level container element in the format for the xhtml.  There is a 
big difference.

The difference between four feed container elements and mandating that 
all xhtml content have a uniform top level container element.  Which 
again, I will agree is an entirely unreasonable assumption.

 - - -
On the optimistic presumption that you are with me so far, I'll press 
on.  What desirable characteristics are there for feed container 
elements in this circumstance?

To answer that question, it is important to understand how CMS software 
tends to be implemented.  In particular, how they are layered.  This is 
difficult as there isn't any one reference implementation that we can 
consult.  We also need to consider software which isn't written yet.  As 
I said, this is difficult.

But we can observe common problems that people have had, and try to 
engineer a solution that avoids them.  I hold the belief that if 
somebody writes a simple and clear spec that a significant number of 
people get wrong, that we are looking at a spec bug.

Enough hand waving, onto the problem at hand.  What we are looking at 
here is an xhtml fragment.  Not a complete xhtml document, but some 
fragment of a web page.

Now, fragments tend not to exist independent of a context.  And in 
virtually all xhtml documents I have seen (including the ones I 
produce), any fragment presumes that the xhtml namespace was defined as 
the default namespace earlier in the document (in particular, on the 
document element).

So, a desirable characteristic for a container element would be one in 
which the default namespace can be set.

At this point, the discussion can fragment into any number of different 
directions.

  - - -
One is for those who view XML as merely one potential serialization 
format, and something that their tool takes care of for them.  For them, 
double escaping the content is the right answer, the simplest thing that 
can possibly work, end of discussion.  While neither you nor I are in 
that camp (nor is Norm, and others), I am quite willing to leave that as 
a valid option, as long as it is explicitly declared.

Another is to declare the use of default namespaces as evil, and rewrite 
 both the document and the content to use explicit namespaces on every 
element.  This may very well be where you and I part ways.  If so, 
peace.  Just please give the people who want to use default namespaces 
the same consideration that I am willing to give those who wish to 
double escape.

And finally, there is a desire to create a format that can be done 
entirely with default namespaces, and without the need to rewrite or 
modify the content.

The simple fact is that well formed xhtml does not always exist in the 
form of DOM nodes.  Sometimes it is serialized as a string and stored in 
a file or a MySQL database.  That does not make it any less well formed. 
 It doesn't mean that it wasn't produced by a proper tool.
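
To make the string-storage case concrete, here is a small sketch of embedding a stored fragment without rewriting it, assuming Python's standard xml.etree.ElementTree. The function name and the stored value are hypothetical; the Atom namespace URI and type="xhtml" follow the later RFC 4287; and the fragment is assumed to be well formed and to rely only on the default namespace.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
ATOM = "http://www.w3.org/2005/Atom"  # assumption: final RFC 4287 namespace

def summary_from_stored_fragment(fragment):
    """Wrap a stored XHTML string in an xhtml:div; the fragment is not modified."""
    return (
        f'<summary xmlns="{ATOM}" type="xhtml">'
        f'<div xmlns="{XHTML}">{fragment}</div>'
        "</summary>"
    )

# For example, as read back from a file or a database column.
stored = "<p>What a nice day!</p>"

summary_xml = summary_from_stored_fragment(stored)

# Parsing doubles as a well-formedness check on the combined document.
parsed = ET.fromstring(summary_xml)
paragraph_tag = parsed[0][0].tag
```

The div is the only place the default namespace changes, so the stored string can be concatenated as is and its elements still end up in the XHTML namespace.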

Not having seen Tim's implementation, I'm just speculating at this 
point, but it probably falls into this category.  Based on the tools he 
is using, he is confident that his content is well formed, even if it is 
stored as a string.  As such, he can confidently use 

Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Bill de hÓra
Sam Ruby wrote:
 [..snip excellent rationale..]
So, a desirable characteristic for a container element would be one in 
which the default namespace can be set.
That is not a desirable characteristic.

At this point, the discussion can fragment into any number of different 
directions.

[...]

Another is to declare the use of default namespaces as evil, and rewrite 
 both the document and the content to use explicit namespaces on every 
element.  This may very well be where you and I part ways.  If so, 
peace.  Just please give the people who want to use default namespaces 
the same consideration that I am willing to give those who wish to 
double escape.
I believe the easiest, most robust, least error-prone approach to this 
sort of problem is to attempt to eliminate default namespace usage 
whenever possible. Every time a default namespace is elided, system 
robustness and comprehension are improved - I've never seen it work the 
other way.


And finally, there is a desire to create a format that can be done 
entirely with default namespaces, and without the need to rewrite or 
modify the content.
That is a questionable desire. It leads us directly to promoting the use 
of a div wrapper to protect XHTML from Atom. Any container format that 
can so easily damage content that we have to enforce a shim to protect it 
arguably has a design flaw. Atom is just the most recent of a string of 
flawed container formats.


So, what would a desirable feed container element be for this scenario? 
 I would suggest that it would be something in the xhtml namespace.  If 
it were in the atom namespace, you would have to do something along the 
lines of:

  <atom:summary xmlns:atom=... xmlns=...>
Sam is 100% right that this is a problem. I arrive at a very different conclusion.

If you are still with me, what I am proposing is that the simplest and 
cleanest solution for people who like default namespaces would be to 
define the format so that there is an xhtml:div element between the 
atom:summary and the xhtml fragment that is being syndicated.
It's interesting you call them out so specifically, but no - default 
namespaces are the problem. Free your mind, and all that.

This can be solved in a general way, not just for XHTML, by banning the 
use of default namespaces for Atom elements. That means the Atom format 
would actively subset XMLNS. I see that as a preferable option to 
anything presented in this thread.
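
What such a subset might look like to a checking tool can be sketched as follows. This is an illustration only, assuming Python's standard xml.etree.ElementTree, whose iterparse reports namespace declarations as "start-ns" events; the function name and sample documents are invented for this note, and the rule it checks is the proposal above, not anything in the spec.

```python
import io
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
ATOM = "http://www.w3.org/2005/Atom"  # assumption: final RFC 4287 namespace

def default_namespace_declarations(xml_text):
    """Return the URI of every default (unprefixed) namespace declaration."""
    source = io.BytesIO(xml_text.encode("utf-8"))
    events = ET.iterparse(source, events=("start-ns",))
    # Each start-ns event carries a (prefix, uri) pair; "" means default.
    return [uri for _event, (prefix, uri) in events if prefix == ""]

fully_prefixed = (
    f'<atom:summary xmlns:atom="{ATOM}" xmlns:xhtml="{XHTML}">'
    "<xhtml:p>What a nice day!</xhtml:p></atom:summary>"
)

uses_defaults = (
    f'<summary xmlns="{ATOM}"><div xmlns="{XHTML}">'
    "<p>What a nice day!</p></div></summary>"
)

prefixed_result = default_namespace_declarations(fully_prefixed)
defaults_result = default_namespace_declarations(uses_defaults)
```

A validator enforcing the proposed subset would reject any document for which this list is non-empty.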

[Although it's time past for paces, I have one on this computer 
somewhere for default namespaces, but after I got shouted down last year 
about xmlns= I didn't think there was much point. Maybe I'll publish 
it on April 1st]

In the meantime I support Sam's position, but think we're missing an 
opportunity to produce a more robust XML container format.

cheers
Bill


Re: PaceXhtmlNamespaceDiv

2005-02-10 Thread Julian Reschke
Sam Ruby wrote:
Julian Reschke wrote:
Sam, thanks for the long reply. I'll try my best to dig it and to offer 
constructive remarks...

To summarize my p.o.v.:
- the spec shouldn't require any specific container element for XHTML 
content,

We continue to talk past one another.  The above line is key.
Some examples might help.  Perhaps once we are actually understanding 
each other's points, then we can work backward from there to spec text.

So, suppose my XHTML content is:
  <p>What a nice day!</p>
My XHTML container element is p.  That is completely my choice.  It is 
not required by the spec.
Yep.
Now if I place that inside an atom feed, I'm going to get something like 
this (heavily elided, all namespace details omitted):

  <feed>
    <entry>
      <summary>
        <p>What a nice day!</p>
      </summary>
    </entry>
  </feed>
Yep.
Depending on how the question is phrased, one could take the 
position that feed, entry, and summary are container elements.  Or 
not.  Again, depending on how the question is phrased.
Fine with me.
I don't believe that these elements are the ones that you have an issue 
with.  Correct?
Yes.
Now, consider a different document, again heavily elided, etc:
  <feed>
    <entry>
      <summary>
        <div>
          <p>What a nice day!</p>
        </div>
      </summary>
    </entry>
  </feed>
The key difference between these two documents is that instead of three 
elements around which there should be no issue, there now are four.  But 
for some reason, this causes a big controversy.

My theory is that the controversy is that people initially assumed that 
this div element was to be considered part of the content and not part 
of the format.  And thereby was mandating that all content have a given 
container element.  An entirely unreasonable mandate.
Well, the current spec says it's part of the content. I personally feel 
it really doesn't matter. Adding DIVs around XHTML content doesn't 
change the semantics of the content, in particular if it doesn't carry 
any additional attributes.

So, I wouldn't have any problems with recipients that collapse multiple 
nested xhtml:div elements into one or none (in absence of other 
attributes on it).
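
One possible reading of that collapsing behaviour, sketched with Python's standard xml.etree.ElementTree. The function name and the sample markup are hypothetical, and this shows only what a lenient recipient could do, not anything the spec requires.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
DIV = f"{{{XHTML}}}div"

def unwrap_plain_divs(element):
    """Descend through xhtml:div wrappers that add nothing of their own:
    no attributes, no text, and exactly one child that is also an xhtml:div."""
    while (
        element.tag == DIV
        and not element.attrib
        and len(element) == 1
        and element[0].tag == DIV
        and not (element.text or "").strip()
        and not (element[0].tail or "").strip()
    ):
        element = element[0]
    return element

nested = ET.fromstring(
    f'<div xmlns="{XHTML}"><div><div class="note">'
    "<p>What a nice day!</p></div></div></div>"
)

# The two outer divs carry no attributes, so only the innermost one survives.
collapsed = unwrap_plain_divs(nested)
```

A div that carries attributes is kept, which matches the caveat about "other attributes" above.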

I agree that this would be an unreasonable mandate.  But I don't want to 
force a top level container element for the xhtml, I want to define a 
bottom level container element in the format for the xhtml.  There is a 
big difference.
It's still hard to see the difference. It's certainly not obvious on the 
syntactical level, and at the end of the day, that's what we are 
discussing here, right?

The difference between four feed container elements and mandating that 
all xhtml content have a uniform top level container element.  Which 
again, I will agree is an entirely unreasonable assumption.

 - - -
On the optimistic presumption that you are with me so far, I'll press 
on.  What desirable characteristics are there for feed container 
elements in this circumstance?
Not entirely, but trying :-)
To answer that question, it is important to understand how CMS software 
tends to be implemented.  In particular, how they are layered.  This is 
difficult as there isn't any one reference implementation that we can 
consult.  We also need to consider software which isn't written yet.  As 
I said, this is difficult.

But we can observe common problems that people have had, and try to 
engineer a solution that avoids them.  I hold the belief that if 
somebody writes a simple and clear spec that a significant number of 
people get wrong, that we are looking at a spec bug.
Sure. But, are we looking at the whole set of implementors, or only 
those who actually read the spec? We all know that those sets aren't 
identical...

Enough hand waving, onto the problem at hand.  What we are looking at 
here is an xhtml fragment.  Not a complete xhtml document, but some 
fragment of a web page.
Yes.
Now, fragments tend not to exist independent of a context.  And in 
virtually all xhtml documents I have seen (including the ones I 
produce), any fragment presumes that the xhtml namespace was defined as 
the default namespace earlier in the document (in particular, on the 
document element).
Well, that depends on how you define "fragment". For instance, I can use 
XSLT to produce that fragment and I certainly don't have to make any 
assumptions about default namespaces. The XSLT processor takes care of 
that for me. The same thing applies when serializing a node set from a 
namespace-aware DOM (at least that's what I'd expect and MSXML has been 
doing for years now).
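
The same point can be illustrated with a different namespace-aware toolkit. The sketch below uses Python's standard xml.etree.ElementTree rather than XSLT or MSXML, so it is an analogy and not a demonstration of those tools: the element is created by its expanded name, and the serializer decides for itself which declaration and prefix to emit.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"

# Build the fragment by expanded name; no prefix or xmlns is chosen by hand.
paragraph = ET.Element(f"{{{XHTML}}}p")
paragraph.text = "What a nice day!"

# The serializer adds whatever namespace declaration the output needs.
serialized = ET.tostring(paragraph, encoding="unicode")

# Parsing the output again yields the same expanded name.
round_tripped = ET.fromstring(serialized)
```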

So, a desirable characteristic for a container element would be one in 
which the default namespace can be set.
I disagree that this is important, but the atom text constructs do have 
that characteristic already.

At this point, the discussion can fragment into any number of different 
directions.

  - - -
One is for those who view XML as merely one potential serialization 
format, and something that their tool takes care of for them.  For them,