text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness
I've been seeing a number of feeds recently using Atom 0.3 with a content type of text/html and no mode attribute (i.e. the equivalent of mode=xml). However, the markup in that content is wrapped in a CDATA section, for example something like this: <content type="text/html"><![CDATA[<div

Re: atom:name ... text or html?

2006-03-23 Thread Anne van Kesteren
Quoting Eric Scheid [EMAIL PROTECTED]: If I have an author with the name Bertrand Café, is it acceptable to put that into atom:author like this; <author><name><![CDATA[Bertrand Caf&eacute;]]></name></author> or should I be using the unicode numeric entity instead? Even if it was HTML you couldn't

Re: atom:name ... text or html?

2006-03-23 Thread James M Snell
+1 to what Anne says. If I received that Atom author name, I would display it exactly as presented: Bertrand Caf&eacute; - James Anne van Kesteren wrote: Quoting Eric Scheid [EMAIL PROTECTED]: If I have an author with the name Bertrand Café, is it acceptable to put that into atom:author

Re: atom:name ... text or html?

2006-03-23 Thread James Holderness
Hahaha! It's RSS all over again. In the words of Mark Pilgrim: Here's something that might be HTML. Or maybe not. I can't tell you, and you can't guess. :-) Seriously though, the atom:name element is described as a human-readable name, so unless your name really is Bertrand Caf&eacute; that

Re: atom:name ... text or html?

2006-03-23 Thread A. Pagaltzis
* Eric Scheid [EMAIL PROTECTED] [2006-03-23 17:30]: If I have an author with the name Bertrand Café, is it acceptable to put that into atom:author like this; <author><name><![CDATA[Bertrand Caf&eacute;]]></name></author> No. That means the author’s name is Bertrand Caf&eacute; (he must have had very

Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread A. Pagaltzis
* James Holderness [EMAIL PROTECTED] [2006-03-23 17:30]: So is this a bug in the content generator (all the feeds I've seen appear to be using TypePad) Yes. or are you supposed to ignore the mode attribute when the content type is set to text/html and always treat it as escaped? No. In 0.3,

Re: atom:name ... text or html?

2006-03-23 Thread Sylvain Hellegouarch
Seriously though, the atom:name element is described as a human-readable name, Do you mean that human-readable is equivalent to solely English? Because as a French, having accents in names is so natural that I see it as human readable too ;) - Sylvain

Re: atom:name ... text or html?

2006-03-23 Thread James Holderness
Sylvain Hellegouarch wrote: Do you mean that human-readable is equivalent to solely English? Because as a French, having accents in names is so natural that I see it as human readable too ;) No. I mean that the literal sequence of characters & e a c u t e ; is not human-readable (or at least

Re: atom:name ... text or html?

2006-03-23 Thread Stephane Bortzmeyer
On Fri, Mar 24, 2006 at 03:16:18AM +1100, Eric Scheid [EMAIL PROTECTED] wrote a message of 10 lines which said: or should I be using the unicode numeric entity instead? Or the character itself, in UTF-8 or any other encoding (but UTF-8 is the most widely implemented, so you limit the
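
The point about using a numeric character reference or the character itself can be checked with a small sketch. This is only an illustration using Python's standard-library XML parser; the bare name element without the Atom namespace is a simplification, not markup taken from the thread.

```python
import xml.etree.ElementTree as ET

# The same name written two ways: the literal character encoded as UTF-8,
# and a numeric character reference (U+00E9 is decimal 233).
literal = "<name>Bertrand Café</name>".encode("utf-8")
numeric = b"<name>Bertrand Caf&#233;</name>"

# An XML parser resolves both to the same Unicode string.
name_from_literal = ET.fromstring(literal).text
name_from_numeric = ET.fromstring(numeric).text

print(name_from_literal == name_from_numeric)
```

Either spelling reaches the consumer as the same text, so the choice only affects how the bytes look on the wire.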

Re: atom:name ... text or html?

2006-03-23 Thread David Powell
Thursday, March 23, 2006, 4:57:11 PM, you wrote: On 24/3/06 3:21 AM, Anne van Kesteren [EMAIL PROTECTED] wrote: <author><name><![CDATA[Bertrand Caf&eacute;]]></name></author> Even if it was HTML you couldn't really use the entity, could you? I think you have to use a character reference or the

Re: atom:name ... text or html?

2006-03-23 Thread Stephane Bortzmeyer
On Thu, Mar 23, 2006 at 05:01:03PM +0100, Sylvain Hellegouarch [EMAIL PROTECTED] wrote a message of 11 lines which said: Because as a French, having accents in names is so natural that I see it as human readable too ;) As I wrote and used and tested on my blog, there is no problem in Atom

Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness
A. Pagaltzis wrote: So is this a bug in the content generator (all the feeds I've seen appear to be using TypePad) Yes. or are you supposed to ignore the mode attribute when the content type is set to text/html and always treat it as escaped? No. Thanks for the confirmation. I was

Re: atom:name ... text or html?

2006-03-23 Thread A. Pagaltzis
* Eric Scheid [EMAIL PROTECTED] [2006-03-23 18:05]: It's true that XML has only a half dozen or so entities defined, meaning most interesting entities from html can't exist in XML ... unless maybe they are wrapped like in CDATA block like above? No, a CDATA block simply means that characters
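
To see what the CDATA wrapping from the question does in practice, here is a minimal sketch with Python's standard-library parser. The element names mirror the example in the thread, but the Atom namespace is left out for brevity, which is a simplification on my part.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, "&eacute;" is not an entity reference; it is eight literal characters.
wrapped = "<author><name><![CDATA[Bertrand Caf&eacute;]]></name></author>"
name = ET.fromstring(wrapped).find("name").text
print(name)

# Outside CDATA, the same reference is an error, because XML does not define
# "eacute" unless a DTD declares it.
try:
    ET.fromstring("<author><name>Bertrand Caf&eacute;</name></author>")
    undefined_entity_rejected = False
except ET.ParseError:
    undefined_entity_rejected = True
```

So the CDATA section does not smuggle an HTML entity into XML; it only turns the reference into plain characters that a consumer will show as-is.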

Re: atom:name ... text or html?

2006-03-23 Thread A. Pagaltzis
* Sylvain Hellegouarch [EMAIL PROTECTED] [2006-03-23 18:15]: Do you mean that human-readable is equivalent to solely English? Because as a French, having accents in names is so natural that I see it as human readable too ;) Even as a French, you probably write é, not &eacute;. :-) Regards, --

Re: atom:name ... text or html?

2006-03-23 Thread Antone Roundy
On Mar 23, 2006, at 9:48 AM, James Holderness wrote: Hahaha! It's RSS all over again. In the words of Mark Pilgrim: Here's something that might be HTML. Or maybe not. I can't tell you, and you can't guess. :-) Seriously though, the atom:name element is described as a human-readable name,

Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread A. Pagaltzis
* James Holderness [EMAIL PROTECTED] [2006-03-23 18:40]: I tested this in 15 different aggregators and all but one ignored the mode and unescaped the content anyway. Good thing this rule was changed in Atom 1.0, then… What I really don’t get is what that `xmlns` attribute is doing there in the

Re: atom:name ... text or html?

2006-03-23 Thread James Holderness
David Powell wrote: [Hmm, internal DTD subsets completely fail in IE7's feed reader, throwing up a friendly error message] If I remember correctly they considered that a feature. Something to do with DTDs being a security risk. I'm not sure if this also meant they were incapable of

Does xml:base apply to type=html content?

2006-03-23 Thread David Powell
xml:base applies to type=xhtml content, but I'm not sure whether it is supposed to apply to escaped type=html content? I reckon that it does. Anybody come across this? Any opinions? -- Dave

Re: text/html with mode=xml in Atom 0.3

2006-03-23 Thread James Holderness
A. Pagaltzis wrote: What I really don’t get is what that `xmlns` attribute is doing there in the CDATA block of your data sample. Sometimes I wonder if CDATA should not have been left out of the XML spec; it seems to create far too much confusion to be worthwhile. Well if you look at some of

Re: atom:name ... text or html?

2006-03-23 Thread Tim Bray
On Mar 23, 2006, at 8:01 AM, Sylvain Hellegouarch wrote: Seriously though, the atom:name element is described as a human-readable name, Do you mean that human-readable is equivalent to solely English? Because as a French, having accents in names is so natural that I see it as human

Re: atom:name ... text or html?

2006-03-23 Thread Tim Bray
On Mar 23, 2006, at 8:57 AM, Eric Scheid wrote: On 24/3/06 3:21 AM, Anne van Kesteren [EMAIL PROTECTED] wrote: <author><name><![CDATA[Bertrand Caf&eacute;]]></name></author> Even if it was HTML you couldn't really use the entity, could you? I think you have to use a character reference or the

Re: atom:name ... text or html?

2006-03-23 Thread Tim Bray
On Mar 23, 2006, at 8:16 AM, Eric Scheid wrote: If I have an author with the name Bertrand Café, is it acceptable to put that into atom:author like this; <author><name><![CDATA[Bertrand Caf&eacute;]]></name></author> or should I be using the unicode numeric entity instead? The key point is that

Re: Atom Thread Feed syntax

2006-03-23 Thread James M Snell
Just wanted to follow through on this for everyone. Given that there are vendors getting ready to ship code based on the current rev of the spec, I'm *not* going to rename the id attribute to ref. Yes, I know that id is confusing to some folks, but we're just talking the name of a single

Re: atom:name ... text or html?

2006-03-23 Thread Eric Scheid
On 24/3/06 4:42 AM, A. Pagaltzis [EMAIL PROTECTED] wrote: I'm getting the data by scraping an html page, so I'm expecting it to be acceptable html code, including html entities. Then decode the entities to a Unicode string and emit the feed as Unicode. Simplest thing that will work

Re: atom:name ... text or html?

2006-03-23 Thread Tim Bray
On Mar 23, 2006, at 2:20 PM, Eric Scheid wrote: Oh well, now to track down a list of html entities and their corresponding unicodes ... http://www.google.com/search?q=xhtml%20entities

Re: atom:name ... text or html?

2006-03-23 Thread A. Pagaltzis
* Eric Scheid [EMAIL PROTECTED] [2006-03-23 23:30]: Oh well, now to track down a list of html entities and their corresponding unicodes ... That would be in the spec. http://www.w3.org/TR/REC-html40/sgml/entities.html But you shouldn’t have to. Any self-respecting language has a library for
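
The advice in this subthread (decode the scraped HTML entities to a Unicode string, then emit the feed as Unicode) might look roughly like the following in Python. This is a sketch under assumptions: author_element is a hypothetical helper name, html.unescape stands in for the kind of library mentioned, and the Atom namespace is omitted to keep the example short.

```python
import html
import xml.etree.ElementTree as ET


def author_element(scraped_name: str) -> bytes:
    """Build an author/name fragment from a name scraped out of an HTML page."""
    # Turn HTML entities such as "&eacute;" into the characters they stand for.
    name = html.unescape(scraped_name)
    author = ET.Element("author")
    # The serializer escapes markup-significant characters on output,
    # so the decoded string can be assigned directly.
    ET.SubElement(author, "name").text = name
    return ET.tostring(author, encoding="utf-8")


fragment = author_element("Bertrand Caf&eacute;")
print(fragment.decode("utf-8"))
```

With this approach no hand-maintained table of entities is needed, and the name reaches the feed as ordinary characters rather than as HTML-specific references.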

Re: Atom Thread Feed syntax

2006-03-23 Thread David Powell
Thursday, March 23, 2006, 9:39:09 PM, James M Snell wrote: Just wanted to follow through on this for everyone. Given that there are vendors getting ready to ship code based on the current rev of the spec, I'm *not* going to rename the id attribute to ref. Yes, I know that id is confusing

Re: Atom Thread Feed syntax

2006-03-23 Thread A. Pagaltzis
* David Powell [EMAIL PROTECTED] [2006-03-24 02:20]: The abandonment of extension constructs in favour of undefined markup by this draft, and other draft-*-atompub-* drafts would be an interoperability concern if these drafts were deployed. If you want to extend Atom, use Extension Elements. I

Re: Atom Thread Feed syntax

2006-03-23 Thread James M Snell
I believe the concern is over the thr:count and thr:when attributes for the replies link relation, both of which are optional, and both of which provide what I consider to be extra information. In other words, it's ok if an implementation drops them. The important bit is the in-reply-to element

Re: Atom Thread Feed syntax

2006-03-23 Thread James M Snell
David Powell wrote: [snip] The abandonment of extension constructs in favour of undefined markup by this draft, and other draft-*-atompub-* drafts would be an interoperability concern if these drafts were deployed. If you want to extend Atom, use Extension Elements. I'm most certainly not