RE: IE7 Atom Handling (was RE: Link rel attribute stylesheet)
This is fantastic, David. Many thanks. We're going over your feed in detail and I'll respond with bug information as soon as possible.

Thanks,
Sean

-----Original Message-----
From: David Powell [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 01, 2006 11:33 AM
To: Sean Lyndersay
Cc: atom-syntax@imc.org
Subject: Re: IE7 Atom Handling (was RE: Link rel attribute stylesheet)
Re: IE7 Atom Handling (was RE: Link rel attribute stylesheet)
Hi Sean,

I've been testing IE7 beta 2's support for Atom, with the following test feed:

http://djpowell.net/atom-test/hardfeed/feed/hard-feed.atom

Also here for easier viewing in IE7:

http://djpowell.net/atom-test/hardfeed/feed/hard-feed.xml

Here are the problems that I have found:

01. Person Extensions

In Atom, extension elements can appear in feeds, entries, and person constructs. So atom:author and atom:contributor should preserve any extension elements. Currently, the transform only preserves atom:uri, atom:name, and atom:email. It should preserve all extensions too.

02. Timezones

atom:updated is converted to RSS's RFC822 pubDate element, but the timezone information is lost. E.g., a date such as 2006-01-01T05:00:00+02:00 is converted to Sun, 01 Jan 2006 05:00:00 GMT, which is incorrect.

03. atom:published

While atom:updated is converted to pubDate, atom:published is kept as atom:published, except that the date format is converted to RFC822 format. I think that the date format should be kept as-is in ISO8601-style format.

04. Alternate links for non-HTML types

The entry called Binary Entry contains a link of the form:

<atom:link href="../files/bin.png" length="684" type="image/png" />

This link isn't treated as the link for the entry, presumably because it has a type other than HTML. If no HTML link can be found for the alternate link, perhaps it would be worth just choosing any other alternate link instead.

05. HTML titles

HTML in feed and entry titles is interpreted properly, but flattened to text. This is presumably deliberate, but it does mean that there is some data loss. Perhaps the original Atom-namespaced element should be preserved as well in these cases?

06. atom:generator

atom:generator is converted to RSS's generator. The uri attribute is included as an unnamespaced uri, but the version attribute is dropped. Perhaps both should be preserved, and it might be better to put the attributes into a namespace?

07. XHTML namespace prefix

More of a rendering problem, but I've included it here because it is significant: XHTML content currently only works if the XHTML is in the default namespace. If a namespace prefix is used, it fails to be interpreted correctly. See the entry entitled: Entry with full iana [EMAIL PROTECTED] values; the link should appear as an HTML link, but doesn't.

08. IANA URIs for link relations

A bit of a quirky one, but in Atom the rel values are actually URIs relative to http://www.iana.org/assignments/relation/, so rel="alternate" and rel="http://www.iana.org/assignments/relation/alternate" should be treated the same. The same goes for enclosures. See the entry: Entry with full iana [EMAIL PROTECTED] values, which should show an enclosure and a valid entry link.

09. Category label

atom:category is converted to RSS's category element. This causes the label attribute to be lost. It should perhaps be preserved as a namespaced attribute. Also, if it is available, it might be better to use the label rather than the term as the RSS2 category name, because the term might not be very human-readable; that is the purpose of label. See Content Source Entry, which causes the WordNet URI to be displayed in the category filter box.

10. xml:base everywhere

Some handling of xml:base is done if it appears on atom:feed or atom:entry, but it can appear anywhere. E.g., xml:base on the atom:link element should affect that link. There are a number of examples of xml:base being handled wrongly in the example, e.g. the broken feed logo.

11. xml:base / xml:lang namespace

I notice that lang and base attributes appear on the transformed feed, but don't have the xml: namespace prefix. Is this a bug caused by the weirdness of the implicit xml: namespace?

12. Subscription name

An IE7 bug, but I'll mention it here: if the feed title contains a line-break, the Subscribe to feed dialog doesn't work, because the line-break gets embedded as a hollow square in the text box and causes an error. Try subscribing to the test feed; it works if you remove the hollow box from the subscription name.

13. xml:base and xml:lang inheritance from atom:feed to entries

xml:base and xml:lang at feed level should apply to all elements nested within the feed document. However, the atom:feed element and its metadata can obviously change over time. This creates a problem: what if the atom:feed element contains an xml:base attribute, and it changes? The feed document as polled can be assumed to be consistent, but it would be wrong to retroactively apply this new base to old entries. In order to avoid these problems, each entry needs to store the xml:lang and xml:base context at the time it was last seen in the document.

I think that if a document has xml:lang set on atom:feed, then this attribute should be written to all item elements, unless it is overridden on that atom:entry element. Same for xml:base, except you might need to resolve the entry base against the feed base. Actually if you
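
Points 10 and 13 both come down to computing an effective base URI and language for each entry (and each link) at the time the feed is polled, and storing that with the entry. A minimal Python sketch of the idea follows. It is not IE7's transform; the sample feed, the feed URL, and the function name entry_snapshots are made up for illustration, and it only looks at xml:base on atom:feed, atom:entry, and atom:link rather than on every possible ancestor.

```python
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
XML = "{http://www.w3.org/XML/1998/namespace}"

# Hypothetical sample feed, not taken from the test feed discussed above.
SAMPLE = """<feed xmlns="http://www.w3.org/2005/Atom"
      xml:base="http://example.org/feed/" xml:lang="en">
  <entry xml:base="entries/">
    <id>urn:example:1</id>
    <link href="one.html"/>
  </entry>
  <entry xml:lang="fr">
    <id>urn:example:2</id>
    <link xml:base="/files/" href="bin.png" type="image/png"/>
  </entry>
</feed>"""


def entry_snapshots(feed_xml, feed_url):
    """Return one dict per entry holding the base, language, and
    absolute link URIs that applied when this document was polled."""
    feed = ET.fromstring(feed_xml)
    # The feed's own xml:base is resolved against the URL it was fetched from.
    feed_base = urljoin(feed_url, feed.get(XML + "base", ""))
    feed_lang = feed.get(XML + "lang")
    snapshots = []
    for entry in feed.findall(ATOM + "entry"):
        # Entry base is resolved against the feed base; lang is inherited
        # unless the entry overrides it.
        entry_base = urljoin(feed_base, entry.get(XML + "base", ""))
        entry_lang = entry.get(XML + "lang", feed_lang)
        links = []
        for link in entry.findall(ATOM + "link"):
            # xml:base on atom:link itself affects only that link.
            link_base = urljoin(entry_base, link.get(XML + "base", ""))
            links.append(urljoin(link_base, link.get("href", "")))
        snapshots.append({
            "id": entry.findtext(ATOM + "id"),
            "base": entry_base,
            "lang": entry_lang,
            "links": links,
        })
    return snapshots


if __name__ == "__main__":
    for snap in entry_snapshots(SAMPLE, "http://example.org/feed/hard-feed.atom"):
        print(snap)
```

Because each snapshot carries an absolute base and an explicit language, a later change to xml:base or xml:lang on atom:feed would not alter entries that were stored from an earlier poll.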
Re: IE7 Atom Handling (was RE: Link rel attribute stylesheet)
David Powell wrote:

> 03. atom:published
>
> While atom:updated is converted to pubDate, atom:published is kept as atom:published; except, the date format is converted to RFC822 format. I think that the date format should be kept as-is in ISO8601-style format.

Why is atom:updated converted to pubDate? If any atom date is converted to pubDate, why isn't it atom:published?

- Sam Ruby
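
Whichever Atom date ends up in pubDate, the conversion to RFC822 format does not have to lose the offset. As a rough illustration (a Python sketch, not the IE7 transform; the function names are invented here, and it assumes the input is a plain RFC 3339 timestamp without unusual fractional-second precision), either keep the original offset or shift the time to GMT before labelling it GMT:

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def parse_atom_date(atom_date):
    """Parse an RFC 3339 timestamp as used by atom:updated / atom:published."""
    # datetime.fromisoformat() only accepts a trailing "Z" from Python 3.11,
    # so rewrite it as an explicit offset first.
    if atom_date.endswith(("Z", "z")):
        atom_date = atom_date[:-1] + "+00:00"
    return datetime.fromisoformat(atom_date)


def to_rfc822_keep_offset(atom_date):
    """RFC822-style date that keeps the original numeric offset."""
    return format_datetime(parse_atom_date(atom_date))


def to_rfc822_gmt(atom_date):
    """RFC822-style date shifted to GMT, so the GMT label is accurate."""
    as_utc = parse_atom_date(atom_date).astimezone(timezone.utc)
    return format_datetime(as_utc, usegmt=True)


if __name__ == "__main__":
    example = "2006-01-01T05:00:00+02:00"  # the date from point 02
    print(to_rfc822_keep_offset(example))  # Sun, 01 Jan 2006 05:00:00 +0200
    print(to_rfc822_gmt(example))          # Sun, 01 Jan 2006 03:00:00 GMT
```

Both outputs identify the same instant; the value reported in point 02 (05:00:00 GMT) is two hours later than either.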
IE7 Atom Handling (was RE: Link rel attribute stylesheet)
For anyone interested, I created a validated Atom feed (http://softwareme.com/ie7test.xml) that exhibits the problems with IE7's refactoring of Atom as RSS2. After hitting the IE7 Subscribe button, the feed is converted to RSS2 (http://softwareme.com/ie7testsubscribed.xml), which doesn't validate (http://feedvalidator.org/check.cgi?url=), although it still works in IE. I would imagine this is important when you start using the MS API and figure out things are wrong or missing, e.g. atom:published in RFC 822 format, etc.

The other problem seems to be that IE7 doesn't honor the xml-stylesheet directive in the same way: it's stripped off. FF1.5 and IE6 render using the supplied xml-stylesheet directive, as can be seen using http://www.atomenabled.org/atom.xml

Note: This test feed does not include the xsl/css solution I'm using.

From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of M. David Peterson
Sent: Sunday, February 26, 2006 10:50 PM
To: James Yenne
Cc: atom-syntax@imc.org
Subject: Re: Link rel attribute "stylesheet"

If you quickly check the list archives you will notice that this very conversation is taking place directly with members of the IE7/RSS team. The short of it is that MS is taking the RSS '2.0' format and extending it in areas necessary to allow for what will eventually be a 1:1 mapping, without data corruption or fidelity loss.

I've written a follow-up piece to this conversation directed towards the IE7/RSS folks, which you can find here:

http://www.xsltblog.com/archives/2006/02/this_is_a_call.html

While I think the post brings out some important points, recent comments point to something I believe is more interesting about the mindset of a lot of folks: they simply do not understand what the inherent issues with RSS '2.0' are, and how and why the Atom specification fixes these problems. If there was ever a crucial moment to bring these points into an easily accessible and recognizable domain that provides a consolidated, well-linked, and well-documented summary of the available content, now would be that time.

In fact, after recording our next podcast with Kurt (Cagle) tonight, I will be finishing off a wiki that anyone can then easily access and update as they see fit, with this exact purpose in mind. Using a domain that I think fits quite well into this general topic, it will be found at http://www.understandingatom.com/ when complete. When all is said and done, I then plan to create a post on my O'Reilly blog so that the process of evangelizing this can be made known to the broad XML.com/OReillynet audience. I'll ping this list when it's ready.

Re: your XSLT PI problem with IE7. Is the code live somewhere accessible so that I can take a look at it and help you debug the problem?

On 2/26/06, James Yenne [EMAIL PROTECTED] wrote:

> I'm "pre-caching" feeds as xml files and have not added this mime type to the server, so no, they're served as text/xml. IE7 appears to recognize them as feeds because you can see it does its re-factor of the Atom feed as RSS2. Why does IE7 convert Atom feeds to RSS2 and use the Atom ns in the RSS2 feed? I seem to have lost control of how my feed is rendered in IE7. I can render per my own XSL in IE6 and FF1.5. Are you saying that using application/atom+xml causes IE7 to keep the xsl?
>
> The links in the generated RSS2 are broken: when I mouse over links, it doesn't include anything after the domain (.com) in the url. So if an entry link href contains search parameters, they're chopped off. Inspecting the raw xml, however, the search parameters are still present, so IE seems to have a bug.
>
> The IE7 conversion from Atom to RSS2 also doesn't pass the feed validators... here's an example validated Atom feed, and the IE7 RSS2 conversion creates a channel that does not validate: dates are left in the atom ns, but are not rendered as rfc3339, but rather converted to rfc822; an email address is not included in the RSS2 managing editor as required, but exists in the Atom feed author/email, and more.
>
> line 2, column 252: managingEditor must include an email address [help]
>     ... GMT</pubDate><managingEditor>Shop</managingEditor><atom:author><a ...
>
> line 2, column 269: Undefined channel element: atom:author [help]
>     ... nagingEditor>Shop</managingEditor><atom:author><atom:name>Sh ...
>
> line 2, column 370: Undefined channel element: guid [help]
>     ... [EMAIL PROTECTED]</atom:email></atom:author><guid isPermaLink="false">urn:gu ...
>
> line 2, column 643: Unexpected uri attribute on generator element [help]
>     ... .css" rel="stylesheet" type="text/css"/><generator uri="urn:activera-com ...
>
> line 8, column 181: atom:published must be an RFC-3339 date-time (3 occurrences) [help]
>     ... 2005/Atom">Sun, 26 Feb 2006 14:35:04 GMT</atom:published><author>S ...
>
> line 8,
Re: IE7 Atom Handling (was RE: Link rel attribute stylesheet)
Sean Lyndersay wrote:

> Thanks James. I’ve filed bugs in our bug tracking database for each of the issues that came up in the feed validator (except for flagging /atom:*/ items, since these are a correct use of RSS 2.0 extension namespaces).

Re the flagging of atom: elements: this was indeed a bug in the Feed Validator.

The Feed Validator was incorrectly flagging the use of atom:author elements at the channel level and atom:link elements at the item level. A test case has been expanded to include these elements, and these problems have been corrected. The fix should be deployed online in a matter of hours.

- Sam Ruby