Re: I-D ACTION:draft-ietf-atompub-format-05.txt
On 28 Jan 2005, at 15:14, Danny Ayers wrote: On Thu, 27 Jan 2005 16:10:06 -0500, Robert Sayre [EMAIL PROTECTED] wrote: http://atompub.org/2005/01/27/draft-ietf-atompub-format-05.html http://atompub.org/2005/01/27/draft-ietf-atompub-format-05.txt Thanks Robert. The Relax NG snippets make a *huge* difference to the clarity. (Thanks Norm!). Yes. A real pleasure to read now :-)
Re: PaceXhtmlNamespaceDiv posted
On Thu, 27 Jan 2005 13:30:40 -0700, Antone Roundy [EMAIL PROTECTED] wrote: As far as the question of CSS and/or elements/tags everywhere, I'd think that would be a matter for the security considerations section (protecting against the Raging Platypus, for example). Whatever restrictions we may pronounce, consumers will still have to include code to protect against abuses. And these issues apply equally to HTML as to XHTML. I'm not in favor of mandating restrictions, because there are probably legitimate uses for anything we might try to protect people against. +1, and the same goes for 'id', just leave it as an item for the security considerations. -joe -- Joe Gregorio http://bitworking.org
Re: PaceXhtmlNamespaceDiv posted
Antone Roundy wrote: On Thursday, January 27, 2005, at 10:38 PM, Henri Sivonen wrote: On Jan 27, 2005, at 22:30, Antone Roundy wrote: I'm not in favor of mandating restrictions, because there are probably legitimate uses for anything we might try to protect people against. The namespace div places restrictions on where namespace declarations appear and, therefore, limits the legitimate use of serializers that take care of namespace declarations. -1 for the pace, still. Okay, this one's obviously dead. Let's just make sure we have examples that make how all these things work clear. I also don't like the restriction on where namespace declarations must be placed, but overall, I believe that the pace is a good idea. Consumers don't want full web pages (complete with html head and titles) as summaries, they want something that they can *insert* into a web page. Requiring a div element addresses a number of needs - it makes it easier to get the namespace right, and it succinctly provides a rather good hint as to what child elements are valid. On content, the situation is a bit different - the content need not be displayable, after all. I would be OK with either keeping the definition of type='XHTML' consistent (there are other types available, after all) or requiring a summary element to be present if the first child element of atom:content with type='XHTML' is not an xhtml:div. - Sam Ruby
Re: PaceXhtmlNamespaceDiv posted
Sam Ruby wrote: I also don't like the restriction on where namespace declarations must be placed, but overall, I believe that the pace is a good idea. Consumers don't want full web pages (complete with html head and titles) as summaries, they want something that they can *insert* into a web page. Requiring a div element addresses a number of needs - it makes it easier to get the namespace right, and it succinctly provides a rather good hint as to what child elements are valid. That's what the spec already says, doesn't it? - http://atompub.org/2005/01/27/draft-ietf-atompub-format-05.html#rfc.section.3.1.1.p.6 ... Best regards, Julian -- green/bytes GmbH -- http://www.greenbytes.de -- tel:+492512807760
Re: PaceXhtmlNamespaceDiv posted
On 28 Jan 2005, at 6:21 pm, Sam Ruby wrote: I also don't like the restriction on where namespace declarations must be placed, but overall, I believe that the pace is a good idea. Yes. and it succinctly provides a rather good hint as to what child elements are valid. Yes. I would be OK with either keeping the definition of type='XHTML' consistent (there are other types available, after all) Yes. or requiring a summary element to be present if the first child element of atom:content with type='XHTML' is not an xhtml:div. Ew. Graham
Re: PaceXhtmlNamespaceDiv posted
Julian Reschke wrote: Sam Ruby wrote: I also don't like the restriction on where namespace declarations must be placed, but overall, I believe that the pace is a good idea. Consumers don't want full web pages (complete with html head and titles) as summaries, they want something that they can *insert* into a web page. Requiring a div element addresses a number of needs - it makes it easier to get the namespace right, and it succinctly provides a rather good hint as to what child elements are valid. That's what the spec already says, doesn't it? - http://atompub.org/2005/01/27/draft-ietf-atompub-format-05.html#rfc.section.3.1.1.p.6 There are cases where explicit is better than implicit. Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. - Sam Ruby
Re: PaceXhtmlNamespaceDiv posted
Given that common practice is to include this element, making it mandatory makes things clearer to both people who are producing consuming tools based on the spec, and people who are producing new feeds based on copy and paste. +1 -- Roger Benningfield
PaceFormatSecurity
I looked at format-05 and found that the security section is still pretty anemic. Here is my stab at fleshing out that section: http://www.intertwingly.net/wiki/pie/PaceFormatSecurity: === == Abstract == Fill out the security section of the format spec. == Status == Open Author: JoeGregorio Much of the material presented here has been covered by Mark Pilgrim in his post on consuming RSS Safely: http://diveintomark.org/archives/2003/06/12/how_to_consume_rss_safely == Rationale == Security is more than just encryption and signatures. == Proposal == Add the following text to 10 Security Considerations {{{ 10.1 HTML and XHTML Text Constructs Text Constructs allow the delivery of HTML and XHTML into a client application which may then display that (X)HTML. Because that (X)HTML may be displayed either in a web browser or via an embedded web browser in a desktop application, many security concerns will arise since that (X)HTML may be displayed in a different context from the one in which it was originally served. A consuming application needs to be very careful about the context in which that (X)HTML is displayed to avoid cross site scripting attacks and other forms of information leakage. An aggregator will certainly display the (X)HTML of a Text Construct in a different context than if an HTML page had been loaded from the same server that had served up the Atom feed. That is, the (X)HTML may be displayed through a different web site if it is a web based aggregator, or as a local file if the aggregator is a desktop kind. There are also aggregators that serve files up via a web server that runs off the desktop. Because of these differing contexts there is an opening for cross site scripting attacks or other forms of information leakage. 10.1.1 HTML Elements The following elements are considered 'unsafe' in that they open clients to one or more types of attack. Every client should consider carefully their handling of each of them when processing incoming (X)HTML in Text Constructs.
10.1.1.1 IMG Element The image element may pose a threat by inadvertently leaking information. That is, a hostile feed may include a Text Construct with a web bug, a 1x1 pixel image that gets loaded invisibly to the user. The request itself and the referral information the client application provides may leak information about who is reading the content and when the content was read. 10.1.1.2 SCRIPT Element All SCRIPT elements should be stripped from Text Constructs or all native scripting support of the (X)HTML display engine should be disabled. Allowing any script to run would allow cross site scripting attacks. 10.1.1.3 EMBED and OBJECT Elements All EMBED and OBJECT elements should be stripped from Text Constructs. The danger here is loading up an embedded object in an unsafe context. For example an ActiveX control could be run in a local context considered safe while it would not normally be loaded from its origin site which was considered unsafe. ActiveX is not the only technology to suffer from this problem: SVG allows JavaScript to be embedded in it, and if displayed in an EMBED or OBJECT element could open the client up to a cross site scripting attack. 10.1.1.4 FRAME, FRAMESET, and IFRAME Elements The FRAME, FRAMESET, and IFRAME Elements allow loading (X)HTML in from a different context. 10.1.1.5 META Elements Some (X)HTML processors are very loose in what they will accept for HTML, including processing elements that would normally appear in the HEAD of a document even when they are present in the BODY. Such a loose (X)HTML processor may process a META element which could redirect the HTML processor to load another page. 10.1.1.6 LINK Elements The same loose processing that may inadvertently pick up META elements can also pick up LINK elements which can cause CSS Stylesheets to be loaded. Please see Section 10.1.2 on the potential problems with CSS. 10.1.2 CSS The processing of CSS (Cascading Style Sheets) also has security concerns.
CSS allows the loading of images, which has all the same concerns as the IMG element [Section 10.1.1.1]. In addition CSS allows HTML elements to be hidden or positioned absolutely. If a group of syndication feeds is processed and displayed in a single HTML page then some errant or malicious CSS could overlay the entire page with a single large image repeated endlessly, thus rendering the entire page unusable. Some browsers also support proprietary extensions which allow the execution of scripts within CSS. For these reasons clients should strongly consider stripping all STYLE elements from the (X)HTML and also removing all STYLE attributes in the (X)HTML elements themselves. 10.1.3 URIs Since any consumer of an Atom feed will be processing URIs, the security concerns for handling URIs must also be taken into account. See Section 7 of RFC 3986. 10.1.4 IRIs Since any consumer of an Atom feed will be processing IRIs, the security
Re: PaceFormatSecurity
I don't like stuff like: All SCRIPT elements should be stripped from Text Constructs or all native scripting support of the (X)HTML display engine should be disabled. I don't think you need to do any more than outline the threat model from each tech. Prescribing how to deal with it is not on, especially when the measures are this drastic. Graham
Re: PaceFormatSecurity
Graham wrote: I don't like stuff like: All SCRIPT elements should be stripped from Text Constructs or all native scripting support of the (X)HTML display engine should be disabled. I don't think you need to do any more than outline the threat model from each tech. Prescribing how to deal with it is not on, especially when the measures are this drastic. Agree w/ Graham. We don't know what kind of relationship the publisher and consumer have. I would strike all the details on HTML, leave the first paragraph, and refer to the security sections of RFC 2854 and HTML 4.01. Robert Sayre
Re: PaceFormatSecurity
Joe Gregorio wrote: Those two references are woefully inadequate, just compare the threats they outline versus the ones I outline in the Pace. If there were a good reference of all the problems that HTML can cause when used in email, *that* would be more in line with what we need, but I was unable to find such a reference myself. Maybe someone else has better google-fu than me. I guess the question is whether we can and should outline HTML security issues. I don't think we can or should. Robert Sayre
Re: Proof-of-concept RDF mapping for Atom
Friday, January 28, 2005, 9:27:11 PM, you wrote: Sorry, that version created duplicate rdf:nodeIDs. I've fixed it now, the new version is 9826 bytes. -- Dave
Re: PaceXhtmlNamespaceDiv posted
Henri Sivonen wrote: On Jan 28, 2005, at 20:21, Sam Ruby wrote: I also don't like the restriction on where namespace declarations must be placed, but overall, I believe that the pace is a good idea. I, for one, use gnu.xml.pipeline.NSFilter for ensuring the namespace correctness in my RSS feed. If the current spec language stands, I will be able to trivially add Atom output with type='XHTML' by doing a DOM to DOM copy (without div cruft) and letting the serialization phase sort out the namespace declarations for me. If this pace was accepted, I'd have to break the namespace abstraction and fiddle with the namespace declaration details to meet additional requirements that are unnecessary as per Namespaces in XML. Are you saying that you can do a DOM to DOM copy to place a series of elements inside the following: atom:feed/atom:entry/atom:content But you would find it extraordinarily difficult to place the exact same series of elements inside the following: atom:feed/atom:entry/atom:content/xhtml:div If so, I would find such an assertion to be hard to accept. (Similar considerations apply to GenX if the user chooses to leave namespace declaration management to the serializer.) Consumers don't want full web pages (complete with html head and titles) as summaries, they want something that they can *insert* into a web page. Insertion is possible without a div as well if the insertion is implemented using proper XML tools and the target of the insertion is a real XHTML skeleton and not a tag soup skeleton and the resulting document is sent down the XML code path of Gecko, WebCore, Presto, etc. For Trident, the aggregator would have to serialize to HTML, which is pretty easy. On the other hand, a div does not make Atom safe for tag soup concatenators, because the element names may be prefixed. If element names are prefixed, string concatenation is not an option anyway. 
If element names are not prefixed (as is the case with the overwhelming majority of existing HTML), adding a div is exactly what makes things safe for simple string concatenators. - Sam Ruby
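A minimal sketch of the wrapping discussed in this exchange, assuming Python's standard xml.etree.ElementTree as the namespace-aware serializer. The ATOM_NS value is a placeholder and not the namespace from the draft, and wrap_in_div is a hypothetical helper name; only the XHTML namespace URI is the real one. It is meant to illustrate that the div costs one extra element in the tree, with the namespace declarations still left to the serializer:

```python
import xml.etree.ElementTree as ET

# Placeholder namespace (assumption): substitute the Atom namespace of the draft in use.
ATOM_NS = "http://example.org/atom-placeholder"
# The XHTML namespace URI.
XHTML_NS = "http://www.w3.org/1999/xhtml"


def wrap_in_div(children):
    """Build atom:content type='XHTML' whose only child is an xhtml:div.

    `children` is an iterable of already-built XHTML elements; they are
    appended to the div without any changes to their names or namespaces.
    """
    content = ET.Element(f"{{{ATOM_NS}}}content", {"type": "XHTML"})
    div = ET.SubElement(content, f"{{{XHTML_NS}}}div")
    for child in children:
        div.append(child)
    return content


# Example: one namespaced paragraph, copied in as-is.
p = ET.Element(f"{{{XHTML_NS}}}p")
p.text = "Hello"
content = wrap_in_div([p])

# The serializer chooses the prefixes and emits the xmlns declarations itself.
xml_text = ET.tostring(content, encoding="unicode")
```

Where the serializer places the declarations (here, on the root element) is outside the caller's control, which is the part of the Pace Henri objects to; the div itself is a one-line addition.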
Re: PaceIRI status: RFC 3987 and STD 66, RFC 3986, published
Martin Duerst wrote: The IRI spec is now published as RFC 3987 (Proposed Standard, http://www.ietf.org/rfc/rfc3987.txt). The update of the URI spec, known as RFC2396bis, is now published as STD 66, RFC 3986. Even less reason for not adopting them. Editors, please update your references. I'll update PaceIRI in a day or two. IRIs are a step forward and important to include in the spec, but they also worry me. In RFC3987, I read the following: The approach of defining a new protocol element was chosen instead of extending or changing the definition of URIs. This was done in order to allow a clear distinction and to avoid incompatibilities with existing software. Do you expect Atom implementors will be using incompatible existing software? I think this question should face roughly the same scrutiny that PUT/DELETE did. I'm also worried that the term IRI will cause confusion. After all, the catch phrase is not Cool IRIs Don't Change. What can we do to minimize confusion? Robert Sayre
Re: PaceFormatSecurity
At 12:56 PM -0800 1/28/05, Tim Bray wrote: At this point we should appeal to our designated IETF culture/process experts; Scott/Ted/Paul, any guidance? -Tim It's up to the WG. If we do a long list, we will probably be told to make it much longer. If we do security-by-reference, we will probably be told that those references aren't very good, or up to date, or something. Given the two choices, I actually prefer security-by-reference because it points out the similarity of what we are doing to other protocols. Section 10.1.1 might instead read: - Many elements are considered 'unsafe' in that they open clients to one or more types of attack. Every client should consider carefully their handling of every type of element when processing incoming (X)HTML in Text Constructs. See the security sections of RFC 2854 and HTML 4.01 for some guidance on many types of attacks. Atom readers should pay particular attention to the security of the IMG, SCRIPT, EMBED, OBJECT, FRAME, FRAMESET, IFRAME, META, LINK elements, but other elements may also have negative security properties. - Then skip the subsections. That gives the reader some guidance, but doesn't lock us into covering everything. --Paul Hoffman, Director --Internet Mail Consortium
RE: PaceFormatSecurity
Given the two choices, I actually prefer security-by-reference because it points out the similarity of what we are doing to other protocols. I agree. It's also a good practice to have only one authoritative source that talks about a topic, especially when that source has already been through the RFC approval process. -Scott-
Re: PaceFormatSecurity
On Fri, 28 Jan 2005 17:01:06 -0500, Robert Sayre [EMAIL PROTECTED] wrote: I guess the question is whether we can and should outline HTML security issues. I don't think we can or should. Considering the large amount of (X)HTML that is being syndicated via RSS and Atom today and will be in the future, I think we should. (X)HTML will be the main markup used inside all Atom Text Constructs, and while MathML, SVG and other markup languages we don't know about may contain security issues, they aren't nearly as important to mention as those that lie within (X)HTML. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
Re: PaceFormatSecurity
On Fri, 28 Jan 2005 13:21:08 -0800, Tim Bray [EMAIL PROTECTED] wrote: Whereas you could technically get by with warning-by-reference, I think that it's OK and in fact probably essential to point out that img and script and object and so on are potentially lethal; I agree. It is impossible to write a specification today about security issues we don't know of, but those we do know, like the elements you mention, should be mentioned in the specification. I thought Joe got about the right level, except for the what to do stuff. Yep. If he leaves that out of the pace, I'm all +1 to it. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
Re: PaceFormatSecurity
On Fri, 28 Jan 2005 17:13:26 -0800, Paul Hoffman [EMAIL PROTECTED] wrote: Many elements are considered 'unsafe' in that they open clients to one or more types of attack. Every client should consider carefully their handling of every type of element when processing incoming (X)HTML in Text Constructs. See the security sections of RFC 2854 and HTML 4.01 for some guidance on many types of attacks. Atom readers should pay particular attention to the security of the IMG, SCRIPT, EMBED, OBJECT, FRAME, FRAMESET, IFRAME, META, LINK elements, but other elements may also have negative security properties. This reads well, imo. But I would replace «(X)HTML» with «markup» in the first paragraph, because there may be security issues with other markup languages as well. I would then rewrite the second paragraph like this: Atom readers should pay particular attention to the security of HTML and XHTML's IMG, SCRIPT, EMBED, OBJECT, FRAME, FRAMESET, IFRAME, META and LINK elements, but other elements may also have negative security properties. I have a bit of a problem with calling EMBED an HTML element, though, since no HTML standard includes it. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
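To make the element list in the proposed text concrete, here is a hedged sketch of an audit pass over one Text Construct, assuming Python's standard html.parser module. RiskyMarkupAuditor and audit are hypothetical names, and the STYLE check is taken from section 10.1.2 of Joe's Pace rather than from Paul's shorter wording. It only reports what it finds; what a client then does (strip, sandbox, ask the user) is deliberately left open, in line with the objections to prescribing a remedy:

```python
from html.parser import HTMLParser

# Elements named in the proposed security text (lowercase, as HTMLParser reports them).
RISKY_ELEMENTS = {
    "img", "script", "embed", "object",
    "frame", "frameset", "iframe", "meta", "link",
}


class RiskyMarkupAuditor(HTMLParser):
    """Collects (kind, tag) findings for markup a client may want to review."""

    def __init__(self):
        super().__init__()
        self.findings = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag in RISKY_ELEMENTS:
            self.findings.append(("element", tag))
        # STYLE elements and style attributes, per section 10.1.2 of the Pace.
        if tag == "style" or any(name == "style" for name, _ in attrs):
            self.findings.append(("css", tag))


def audit(markup):
    """Return the list of findings for one HTML or XHTML fragment."""
    auditor = RiskyMarkupAuditor()
    auditor.feed(markup)
    auditor.close()
    return auditor.findings


# Example: a web bug inside otherwise harmless markup is reported as an IMG finding.
report = audit('<p>hi <IMG src="http://example.org/bug.gif" width="1" height="1"></p>')
```

This is a lenient tag-soup pass, not an XML parse, so it is a starting point for HTML content and says nothing about prefixed element names in XHTML.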
Re: PaceEnclosuresAndPix status
On Thu, 27 Jan 2005 22:12:41 +1100, Eric Scheid [EMAIL PROTECTED] wrote: http://www.intertwingly.net/wiki/pie/PaceIconAndImage Nice. But if we have both atom:icon and atom:image for the feed, why do we need to do all kinds of weird stuff to have images attached to Atom entries? Can't atom:image (perhaps not atom:icon) occur as a child of atom:entry too? This competes with parts of PaceEnclosuresAndPix, and so I have also written PaceLinkEnclosure which simply strips out the Pix part. Right. So we include images of the feed with 'image', and images for entries with 'link'. That doesn't make sense. It also doesn't make much sense to have several elements to do the exact same thing. My head is screaming for atom:object here. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
Re: PaceFeedLink
On Wed, 26 Jan 2005 07:53:33 -0800, Tim Bray [EMAIL PROTECTED] wrote: Software which discovers that the FeedLink URI is different from that used to retrieve the atom:feed document containing it MAY choose to use the FeedLink URI for subsequent fetches. Nicely put. +1. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
Re: PaceEnclosuresAndPix status
On 29/1/05 4:22 PM, Asbjørn Ulsberg [EMAIL PROTECTED] wrote: http://www.intertwingly.net/wiki/pie/PaceIconAndImage Nice. But if we have both atom:icon and atom:image for the feed, why do we need to do all kinds of weird stuff to have images attached to Atom entries? Can't atom:image (perhaps not atom:icon) occur as a child of atom:entry too? Maybe image is the wrong name for the concept. We're not talking about some random image associated with some entity, we're talking about a branding badge or logo of some kind which is representative of the feed. While entries may well have images of various kinds attached to them (cat pictures, anyone?), the individual entries don't get branded with their own logo/badge, do they? Hmmm... maybe I'll rename it to atom:logo ... would that help? e.
Re: Questions about -04
On Wed, 26 Jan 2005 21:39:34 -0500, Sam Ruby [EMAIL PROTECTED] wrote: There are now, by some counts, ten versions of formats that call themselves RSS. Every last one of them has a required channel/link. Every last one of them. Yes. Relaxing a restriction requires consumers to handle more cases. How much does it cost for consumers to handle these cases compared to how much it restricts the producers? With this restriction, all Atom feeds need to be a copy of another resource type. A feed can never be a first class resource. I think feed level alternative links are useful, but only to the point that they need to be a SHOULD, not a MUST. Because of this, I would like to request that a compelling use case be found for feeds for which there cannot be an atom:link defined. I would much rather require 'rel=self' than 'rel=alternate'. The former doesn't require anyone to double-produce anything, while the latter almost always requires people to have some kind of HTML representation of their feed lying around somewhere. If all the consumer and producer are interested in is the Atom feed, why would one need a secondary resource that the feed can point to? I cannot understand why the alternative HTML page is needed. If people mystically and by mere coincidence subscribe to a feed without an alternate resource in their aggregator, how can this hurt anyone? They will be subscribed, have all the entries show up in their aggregators, but not be able to view the feed in another format. Note atom:link is defined as a URI. While most examples that we have seen use the HTTP scheme, this is not a requirement. No, but we've also seen that most other schemes are totally useless in practice. What should an aggregator do with a 'news' scheme, for instance? What about 'prospero'? What about proprietary schemes that aren't registered with IANA and are only retrievable through some sort of direct TCP socket?
We have lots of these inside the Norwegian Broadcasting Corporation; we are an old publisher and broadcaster. I do not see why we (or anyone else) should be required to publish all the content we wish to syndicate to users, other companies and such, in alternative formats to Atom, just because the Atom specification requires it. If we don't do it to please our users, I don't see why we should do it at all. Just to comply with a specification is imho not a good enough reason. -- Asbjørn Ulsberg -=|=- http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»
how to write spec language for language variants?
nitpickers welcome. I have this spec text in my draft Pace... atom:head elements MAY contain one or more atom:foo elements, so long as no two atom:foo elements have the same combination of atom:hreflang, xml:lang, or atom:type. I also considered writing it like this... atom:head elements MAY contain one or more atom:foo elements, so long as they differ in the values they have for the attributes atom:hreflang, xml:lang, or atom:type. I'm not comfortable with either wording. Seems clumsy. meta-question: should the spec even bother asserting this restriction? e.
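One way to test a candidate wording is to write the rule down as a check. The sketch below assumes the first reading, where no two atom:foo elements may share the same combination of all three attributes. It models each element as a plain dict of attribute names to values, treats a missing attribute as the value None, and uses the hypothetical name duplicate_variants; none of this comes from the Pace itself:

```python
from collections import Counter


def duplicate_variants(elements):
    """Return the (hreflang, xml:lang, type) combinations used more than once.

    `elements` is a list of dicts, one per atom:foo element. A missing
    attribute counts as None, so two elements that both omit all three
    attributes collide with each other.
    """
    keys = [
        (e.get("hreflang"), e.get("xml:lang"), e.get("type"))
        for e in elements
    ]
    counts = Counter(keys)
    return [key for key, n in counts.items() if n > 1]


# Example: the English and French variants differ in one attribute, so no collision.
english = {"hreflang": "en", "xml:lang": "en", "type": "text/html"}
french = {"hreflang": "fr", "xml:lang": "en", "type": "text/html"}
collisions = duplicate_variants([english, french])
```

Under this reading, differing in any one of the three attributes is enough, which suggests the wording wants "and" rather than "or" in the attribute list.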
Re: PaceEnclosuresAndPix status
On Sat, 29 Jan 2005 16:47:09 +1100, Eric Scheid [EMAIL PROTECTED] wrote: Maybe image is the wrong name for the concept. We're not talking about some random image associated with some entity, we're talking about a branding badge or logo of some kind which is representative of the feed. I know what we're talking about and still don't understand why embedding so-called «favorite icons» is so wildly different from embedding other types of graphical objects in feeds (at all levels) that it has to be done in a wildly different way. We're trying to create a mechanism for embedding graphics, and in some cases the graphic has special meaning. Then let's of course assert that special meaning, but not by creating several completely different mechanisms for embedding graphics in Atom feeds! While entries may well have images of various kinds attached to them (cat pictures, anyone?), the individual entries don't get branded with their own logo/badge, do they? If they do, that's none of our (or at least my) business. If people want icons for their entries, let them have them. If aggregator writers want to add support for bookmarking individual entries (it makes sense, doesn't it?), it would be a lot easier to find these entries later on if they had some kind of icon attached to them. That icon could of course be the same as the feed's, but it could also be something completely different. The point is that if we have a general way of embedding graphical objects in Atom feeds, we don't need to distinguish between feed- and entry-level graphics. We only need to define some types of graphical objects that have special meaning so they can be treated specially by aggregators. One such graphical object is «icon». It is an embedded graphical object, but it's supposed to be shown in a special place in the aggregator. Hmmm... maybe I'll rename it to atom:logo ... would that help? If the renamed element can be used to embed all sorts of graphical objects in both feeds and entries; yes.
:) -- Asbjørn Ulsberg -=|=-http://virtuelvis.com/quark/ «He's a loathsome offensive brute, yet I can't look away»