Re: Current and permalink link rel values
On Feb 23, 2007, at 7:16 AM, Elliotte Harold wrote: I'd like to add multiple links to my feed for both the current version of the story and the permalink. E.g.

<link href="http://www.cafeconleche.org/#February_22_2007_30633"/>
<link rel="permalink" href="http://www.cafeconleche.org/oldnews/news2007February22.html#February_22_2007_30633"/>

Both of those would probably be best described as alternate links. The second one in particular is what "alternate" was intended to be used for. However, RFC 4287 contains the following:

o atom:entry elements MUST NOT contain more than one atom:link element with a rel attribute value of "alternate" that has the same combination of type and hreflang attribute values.

So you couldn't keep both as alternate links. In my opinion, you should use the second one (the longer lasting one) only, and omit the first (which is going to become invalid as soon as the entry falls off the page anyway -- anyone who used it to get to your page and bookmarked it, and anyone who follows it from a cached copy of your feed, isn't going to be able to find the entry without a lot of needless digging through your archives). You should have a link to http://www.cafeconleche.org/ at the feed level. While that won't link directly to that entry, it'll get people to it as long as it's on that page.
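To make the RFC 4287 constraint concrete, here's a minimal sketch (mine, not from the original mail) of how a validator might detect two alternate links that collide on (type, hreflang); the example entry reuses Elliotte's two URLs:

```python
from xml.etree import ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def has_duplicate_alternates(entry):
    """Return True if the entry has two rel="alternate" links with the
    same (type, hreflang) combination, which RFC 4287 forbids."""
    seen = set()
    for link in entry.findall(ATOM + "link"):
        # An atom:link with no rel attribute defaults to "alternate".
        if link.get("rel", "alternate") != "alternate":
            continue
        key = (link.get("type"), link.get("hreflang"))
        if key in seen:
            return True
        seen.add(key)
    return False

entry = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom">'
    '<link href="http://www.cafeconleche.org/#February_22_2007_30633"/>'
    '<link href="http://www.cafeconleche.org/oldnews/'
    'news2007February22.html#February_22_2007_30633"/>'
    '</entry>'
)
print(has_duplicate_alternates(entry))  # both default to alternate -> True
```

Note that a rel="permalink" link would not trip the check, since only "alternate" links are constrained.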
Re: Query re: support of Media RSS extensions inside Atom feeds
On Feb 9, 2007, at 9:23 PM, John Panzer wrote: Does anyone know of any issues with placing Yahoo! Media RSS extensions (which seem to fit the requirements for Atom extensions to me) inside an Atom feed? Secondarily, do feed readers in general recognize MRSS inside either Atom or RSS? Looking for field experience/implementor intentions here.

CaRP partially supports Media RSS in RSS (it doesn't directly support Atom at all, and Grouper, the companion script that converts Atom to RSS for it, doesn't yet have Media RSS support, though I may add it in the next update). It only looks at elements pointing to images (@type="image/*") and their types, heights and widths. I added this in response to user requests--primarily, I believe, for use with Flickr feeds. Antone
Re: AD Evaluation of draft-ietf-atompub-protocol-11
I'm not subscribed to the APP mailing list, so hopefully this isn't all redundant:

On 12/15/06, Lisa Dusseault [EMAIL PROTECTED] wrote: A model where servers aren't required to keep such information won't, in practice, allow that kind of extension. If clients can't rely on their markup getting stored, then clients can't extend Atom unilaterally using XML markup.

There are two different issues here, which I think have been mentioned, but which might bear being clearly stated:

1) Do servers have to keep all extension data?
2) Can a server accept an entry while discarding some or all extension data, or do they have to reject the entry and return an error code?

I think the answer to the first question is clearly no--servers shouldn't be required to store all arbitrary data that is sent to them. So the questions are:

1) Which hurts more--data loss or rejected entries?
2) Is there any way to reduce that pain?

The pain of data loss is obvious--the data is lost. The pain of rejected entries is having to fix and repost them, or decide not to try again. In either case, it might be useful to be able to query the server somehow to find out what it will and won't preserve. If data is discarded, you can figure that out after the fact by loading the resulting entry and checking whether the data is all there, but one might prefer to know ahead of time if something is going to be lost in order to be able to decide whether to post it or not. If the entry is just going to be rejected, unless there's a way for the server to communicate exactly which data it had issues with, fixing the data to make it acceptable could be extremely difficult ("Hmm, I'll leave this data out and try again...nope, still rejected. I'll put that back in and leave this out...nope. I'll take both out...nope. I'll put both back in and take yet another piece of data out..."). So, how might a client query a server to see what it will preserve?
A few possibilities:

1) Have some way to request some sort of description of what will and won't be preserved and what might be altered. I don't know how one would go about responding to such an inquiry except to basically send back a list of what will be preserved, including some way to say "I'll preserve unknown attributes here", "I'll preserve unknown child elements (and their children) here", "I'll store up to 32767 bytes here", etc. If there is any known extension markup that a server wants to explicitly state that it won't preserve, there may need to be a way to do that too.

2) Have a way to do a test post, where one posts the data one is considering posting (or something structurally identical), but says "don't store this--just tell me what you WOULD store". The response could include what would be returned if one were to load the data after it being stored, or it could be some sort of list of anything that would be discarded or altered.

3) (I get the impression this could be done without requiring changes--is this the sort of process that has already been selected?) Post the data as a draft, then reload it to see if it's all still there. If so, or if what has been preserved is acceptable, change its status to published or whatever it's called. If not, delete it and give up, or take whatever other action is appropriate.

My impression is that data loss would be less painful and more easily dealt with than rejection of entries that won't be completely preserved. ...but I haven't followed the discussion, so what do I know.
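The "figure it out after the fact" check from option 3 could be sketched like this -- a hypothetical helper (the names and the example x:mood extension are mine, not from any spec) that diffs the foreign-namespace children of the entry you posted against the copy the server stored:

```python
from xml.etree import ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"

def dropped_extensions(posted, stored):
    """Return tags of foreign-namespace (extension) children present in
    the posted entry but missing from the server's stored copy."""
    def foreign_tags(entry):
        return {child.tag for child in entry
                if not child.tag.startswith("{" + ATOM_NS + "}")}
    return foreign_tags(posted) - foreign_tags(stored)

posted = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom" xmlns:x="http://example.org/x">'
    '<title>Test</title><x:mood>sunny</x:mood></entry>')
# What the server handed back after the draft POST -- x:mood was dropped.
stored = ET.fromstring(
    '<entry xmlns="http://www.w3.org/2005/Atom"><title>Test</title></entry>')

print(dropped_extensions(posted, stored))  # {'{http://example.org/x}mood'}
```

If the set is empty (or acceptable), publish the draft; otherwise delete it.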
Re: PaceEntryMediatype
On Dec 6, 2006, at 12:14 PM, Jan Algermissen wrote: Following a link is not the same thing as subscribing to something. The act of subscribing is a local activity performed by the user agent. What you do when you follow the link to a feed is a GET. Your agent then decides if subscribing to that resource is a good idea. To make that decision, the agent has to look at the representation, and then it is insignificant overhead to see if the thing returned is a feed or entry. ... Maybe I want to monitor a single media resource; an Atom media entry would be an ideal thing to do so (I'd rather look at the meta data than at the media resource upon each poll). I'd say: stick with the one media type that is currently there - there is no problem, just misconception about what it means to subscribe.

A few reasons why a user agent might want to be able to tell the difference between a link to a feed and a link to an entry beforehand are in order to:

1) be able to ignore the link to the entry (i.e. not present it to the user) if the user agent doesn't handle entry documents (rather than presenting it as a subscribe link, only to have to say "sorry, it's not a feed" after the user tries to subscribe).

2) be able to say "subscribe" for links to feeds, and "monitor" for links to entries (the user may not be interested in monitoring a single entry for changes--if they can't tell that that's what the link is for, they may end up needlessly doing so but think that they've added another feed to their subscription list).
Re: PaceEntryMediatype
On Dec 6, 2006, at 4:26 PM, Jan Algermissen wrote: Most feed readers know how to handle feeds, but have no idea how to handle entries. So they should be fixed, should they not?

If the purpose of a feed reader is to subscribe to feeds and bring new and updated entries to the user's attention, then if they don't also handle the monitoring of single entry documents (interesting to some people in some cases, but I doubt interesting to most people), that's not necessarily something that needs fixing.

They seem to only have implemented half a media type.

...or they've implemented all of what should be covered by one media type.
Re: PaceEntryMediatype
On 12/1/06, Mark Baker [EMAIL PROTECTED] wrote: On 11/30/06, Thomas Broyer [EMAIL PROTECTED] wrote: All a media type tells you (non-authoritatively too) is the spec you need to interpret the document at the other end of the link. That has very little to do with the reasons that you might want to follow the link, subscribe to it, etc. Which is why you need a mechanism independent from the media type. Like link types.

Now that this has sunk in, it makes a lot of sense--the @rel value says "you can subscribe to that", "that is an alternative representation of this", "that is where you'd go to edit this", and so on. The media type helps the user agent figure out whether it has the capability to do those things. For example, a feed reader that only handles RSS could ignore subscription links to resources of type application/atom+xml (i.e. not present the subscription option to the user).

The "subscribe to hAtom feed" case where @type is text/html might be a little difficult to make a decision on, because there's no indication of what microformat is being used by the feed (or even if there's a microformat in use at all--maybe it really is just an HTML page, and subscribing to it just means "watch for changes to the entire document"). But in the case of bare syndication formats, things should be clear enough.

So if it really is possible to do option 5 (new media type for entry documents, and @rel values to solve the rest of the issues), and do it cleanly, then that'd be my first choice. If that's doomed (due to a need to be backwards compatible with existing practice) to be a mess of ambiguities and counter-intuitivities (e.g. "alternate" means subscribe when combined with a syndication type, except when it might really mean alternate because it points to a feed archive document, but anything with "feed" in it always means subscribe...) then oh my.
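As a sketch of that division of labor (the policy and the supported-type set here are hypothetical, not from any spec): @rel carries the intent, @type tells the agent whether it has the capability:

```python
def link_action(rel, type_, supported_types):
    """Decide how a reader might treat a link: @rel says what the link
    is for, @type says whether this agent can handle it at all
    (hypothetical policy, not from any spec)."""
    if type_ not in supported_types:
        return "ignore"     # can't handle the format, so don't offer the link
    if rel == "alternate":
        return "subscribe"  # common current practice for syndication types
    if rel == "edit":
        return "edit"
    return "ignore"

reader_types = {"application/atom+xml"}
print(link_action("alternate", "application/atom+xml", reader_types))  # subscribe
print(link_action("alternate", "application/rss+xml", reader_types))   # ignore
```

The hAtom case is exactly where this breaks down: text/html in @type says nothing about whether the target is subscribable.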
One problem that I hadn't really thought clearly about till right now is that understanding the nature of the thing linked TO may require some understanding of the nature of the thing linked FROM. For example, an "alternate" link from a typical blog homepage to its feed really does point to the same thing in an alternative format. Both are live documents in which new data gets added to the top, and old data drops off the bottom. But if you don't know that the webpage is a live document, you wouldn't know whether the link pointed to a static or live document. "alternate" is perfectly accurate, but it's not helpful enough. "subscribe" would be much more explicit.

Which raises the question of how to point to a static alternative representation of the data currently found in the document. "alternate" WOULD be a good word to use for that except that it's already being used to point to live feeds. An option that would almost surely cause confusion would be to use "alternative" for static alternative representations. The meaning of "static" wouldn't exactly be intuitively clear. Maybe something more long-winded like (oh no! hyphenation!) "static-alternate" would do. Or would "static alternate" (and "alternate static" and "static foo alternate", etc., or perhaps "archive alternate", etc.) be better? For backwards compatibility (at least with UAs that don't expect only one value in @rel), "subscribe alternate" (and "alternate subscribe", etc.) could be used rather than simply "subscribe".

BTW, am I remembering correctly that "feed" is being promoted for use the way I'm considering "subscribe" above? If it's not already in use, I'd think "subscribe" would be much better than "feed", because "feed" could as easily mean "archive feed" as "subscription feed"--it's just not explicit enough. But perhaps this discussion all belongs in a different venue anyway...

But before I end, what about the question of a different media type for entry documents? For the APP accept element issue, it sounds like maybe one is needed.
But for autodiscovery, maybe they don't. Perhaps neither @type nor @rel is the place to distinguish, for example, between the edit links for entries, their parent feeds, their per-entry comment feeds, monolithic comment feeds, etc. (A media type for entry documents would only help with one of those.) Perhaps that is the domain of @title (title="Edit this entry", etc.) Do UAs really need to know the difference, or do only the users need to know? Would making that information machine readable be worth the pain involved (rel="edit monolithic parent comments"???) Okay, that's all I can take for now.
Re: PaceEntryMediatype
On Nov 30, 2006, at 2:13 AM, Jan Algermissen wrote: On Nov 29, 2006, at 7:22 PM, James M Snell wrote: One such problem occurs in atom:link and atom:content elements. Specifically:

<atom:link type="application/atom+xml" href="a.xml"/>
<atom:content type="application/atom+xml" src="b.xml"/>

Given no other information I have no way of knowing whether these are references to Feed or Entry documents.

And what is the problem with that?

Here's one problem: in this and the autodiscovery case, the UA can't tell without fetching the remote resource whether it's appropriate to display a subscribe link. In fact, even if the remote resource is a feed, it may not be appropriate to subscribe to, because it may be an archive document rather than the live end of a feed. Of the options presented, I'd favor adding a "type" parameter to application/atom+xml. In addition to "feed" and "entry", we may want "archive".
Re: PaceEntryMediatype
Summary of thoughts and questions:

*** Problems with the status quo ***

A) Consumers don't have enough information (without retrieving the remote resource) to determine whether to treat a link to an Atom document as a link to a live feed, a feed archive, or an entry. (Is it appropriate to poll the link repeatedly? How should information about the link be presented to the user?)

B) APP servers can't communicate whether they will accept feed documents or only entry documents.

*** Possible solutions ***

1) Add a "type" parameter to the existing media type:
+ With the exception of a few details (does it contain a feed element, or does it start at the entry element; is it a live feed document or an archive; etc.), the documents are all exactly the same format, so a single media type makes the most sense (definitely for live feeds vs. archives, less certainly for feeds vs. entries).
- Some existing applications will ignore the parameter and may handle links to non-live-feeds inappropriately.
- Some existing applications may not recognize application/atom+xml;type=feed as something appropriate to handle the same way they handle application/atom+xml now.
? I haven't been following development of the APP, so forgive my ignorance, but can parameters be included in the accept element?

2) Create (a) new media type(s) (whether like application/atomentry+xml or application/atom.entry+xml):
+ Applications that currently treat all cases of application/atom+xml the same would ignore non-feed links until they were updated to do something appropriate with the new media type.
- Differentiating between live feeds and archives by media type seems really wrong since their formats are identical. This isn't as big a negative for entry documents, but it still seems suboptimal to me.
- If a media type were created for archive documents, would an APP accept including application/atom+xml imply acceptance of archive documents too? Neither yes nor no feels like a satisfying answer.
3) Use @rel values to differentiate:
- That territory is already a bit of a mess, what with "feed" vs. "alternate" vs. "alternate feed" vs. "feed alternate" -- why make it worse?
+ That territory is already a bit of a mess, what with "feed" vs. "alternate" vs. "alternate feed" vs. "feed alternate" -- why not work on all these messy problems in the same place?
- That wouldn't help with the APP accept issue.

4) Create a new media type for entry documents, and add a parameter to application/atom+xml to differentiate between live and archive feeds (and for any other documents that have the identical format, but should be handled differently in significant cases):
- Doesn't prevent existing apps that ignore the parameter from polling archive documents.
+ Does solve the rest of the problems without the negatives of #2 above.

5) Create a new media type for entry documents, and use @rel values to solve the issues that the new media type doesn't solve:
+/- Messy territory.

If we were starting from scratch, I'd probably vote for #1. Since we're not, I'd vote for #4 first, and perhaps #5 second, but I'd have to think about #5 more first. Antone
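If a type parameter were added (option 1 or 4), consumers could read it off the Content-Type value with ordinary media type parsing. A sketch in Python, assuming the hypothetical parameter defaults to "feed" when absent:

```python
from email.message import Message

def atom_doc_kind(content_type):
    """Parse a Content-Type value and return the hypothetical 'type'
    parameter of application/atom+xml ('feed', 'entry', 'archive'),
    or None for non-Atom types."""
    msg = Message()
    msg["Content-Type"] = content_type
    if msg.get_content_type() != "application/atom+xml":
        return None
    # Assume absent parameter means a live feed, for backwards compatibility.
    return msg.get_param("type", "feed")

print(atom_doc_kind("application/atom+xml;type=entry"))  # entry
print(atom_doc_kind("application/atom+xml"))             # feed
print(atom_doc_kind("text/html"))                        # None
```

Apps that compare the Content-Type string byte-for-byte instead of parsing it would miss the parameter entirely, which is exactly the backwards-compatibility worry in #1.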
Re: atom license extension (Re: [cc-tab] *important* heads up)
On Sep 6, 2006, at 7:51 AM, James M Snell wrote: The problem with specifying a per-feed default license is that there is currently no way of explicitly indicating that an entry does not have a license or that any particular entry should not inherit the default feed-level license.

With respect to atom:rights (from RFC 4287 section 4.2.10): If an atom:entry element does not contain an atom:rights element, then the atom:rights element of the containing atom:feed element, if present, is considered to apply to the entry. Thus, at the entry level, <atom:rights/> would (certainly ought to!) detach a feed-level atom:rights element from the entry without replacing it with anything.

With <link rel="license" .../>, I'm not sure how you'd do the same thing. Is it possible to specify a null URI? <link rel="license" href=""/> points to the in-scope xml:base URI, right? Perhaps the specification could define a null license URI.

With respect to the issue of aggregate feeds, I had thought that the existence of an atom:source element at the entry level blocked any inheritance of the feed metadata, but looking at RFC 4287, I don't see that explicitly stated. Certainly if atom:source contains atom:rights, then that element overrides the feed-level atom:rights of the aggregate feed, but if neither atom:source nor atom:entry contains an atom:rights element, what then? Perhaps in that case, the aggregator should add <atom:rights/> as a child of atom:source (I'd think that preferable to adding it as a child of atom:entry).

On Sep 6, 2006, at 4:38 AM, Thomas Roessler wrote: So, here's the proposal: - Use <link rel="license"/> for entry licenses -- either on the feed level, setting a default analogous to what atom:rights does, or on the element level. - Introduce <link rel="collection-license"/> (or whatever else you find suitable) for licenses about the collection, to be used only on the feed level.
If there's a @rel="license" at the feed level, but no rel="collection-license", does the @rel="license" also become a collection-license? (People who don't read the spec would probably think so.) If there is no license for the collection, but one wishes to specify a default license for the entries, a null license would once again be useful. Antone
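A sketch of the inheritance rule being discussed (the detach-via-empty-element behavior is the reading argued above, not settled spec text; the code and sample feed are mine):

```python
from xml.etree import ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def effective_rights(feed, entry):
    """Per RFC 4287 4.2.10, an entry-level atom:rights takes precedence
    over the feed-level one; an empty entry-level <atom:rights/> would
    then detach the feed default without substituting anything."""
    rights = entry.find(ATOM + "rights")
    if rights is not None:           # must test identity: empty elements are falsy
        return rights.text           # None for an empty <atom:rights/>
    feed_rights = feed.find(ATOM + "rights")
    return feed_rights.text if feed_rights is not None else None

feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<rights>Copyright 2006</rights>'
    '<entry><id>a</id></entry>'
    '<entry><id>b</id><rights/></entry>'
    '</feed>')
entries = feed.findall(ATOM + "entry")
print(effective_rights(feed, entries[0]))  # Copyright 2006 (inherited)
print(effective_rights(feed, entries[1]))  # None (detached by <rights/>)
```

The point of the mail is that link rel="license" has no equivalent of that second case.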
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 12:06 PM, A. Pagaltzis wrote: * James M Snell [EMAIL PROTECTED] [2006-06-28 20:00]: A. Pagaltzis wrote: * James M Snell [EMAIL PROTECTED] [2006-06-28 14:35]: Hiding the div completely from users of Abdera would mean potentially losing important data (e.g. the div may contain an xml:lang or xml:base) or forcing me to perform additional processing (pushing the in-scope xml:lang/xml:base down to child elements of the div). How is that any different from having to find ways to pass any in-scope xml:lang/xml:base down to API clients when the content is type=html or type=text? I hope you didn’t punt on those? Our Content interface has methods for getting to that information. Then stripping the `div` is not an issue, is it?

Consider this:

<entry xml:lang="en" xml:base="http://example.com/foo/">
  ...
  <content type="xhtml">
    <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>

Whether there's a problem depends on whether one requests the xml:base, xml:lang, or whatever for the atom:content element itself or for the CONTENT OF the atom:content element, in which case the library could return the values it got from the xhtml:div. Except in unusual cases like this, the result would be identical.

Certainly a distinction could be made between how an XML library would handle this vs. how an Atom library would handle it. An Atom processing library might be expected to be able to do things like:

* give me the raw contents of the atom:content element
* give me the contents of the atom:content element converted to well-formed XHTML (whether it started as text, escaped tag soup, or inline xhtml)

In the former case, keeping the div feels like the right thing to do--the consuming app would have to know to remove it. In the latter case, removing the div from xhtml content feels like the right thing to do.
But unless the library gives me the xml:base, for example, which applies to the content of the atom:content element (as converted to well-formed xhtml or whatever), as opposed to the xml:base which applied to the atom:content element itself, there's potential for trouble. ...now that I think about it, if the library always returns the xml:base which applies to the content of the element, that could cause trouble too:

<entry xml:lang="en" xml:base="http://example.com/">
  ...
  <content type="xhtml">
    <xhtml:div xml:lang="fr" xml:base="feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>

Here, if I get xml:base for the content of content, it will be "http://example.com/feu/". Then, if I get the raw content of the element, strip the div, and apply xml:base myself, I'll erroneously use "http://example.com/feu/feu/" as the base URI unless I know to ignore the xml:base attribute on the div.
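The double-application pitfall is easy to reproduce with plain RFC 3986 reference resolution:

```python
from urllib.parse import urljoin

entry_base = "http://example.com/"      # xml:base on the entry
div_base = urljoin(entry_base, "feu/")  # xml:base="feu/" on the div, resolved
href = urljoin(div_base, "axe.html")
print(div_base)  # http://example.com/feu/
print(href)      # http://example.com/feu/axe.html

# The pitfall: asking the library for the base in effect INSIDE the
# content, then also applying the div's own xml:base yourself, applies
# "feu/" twice.
wrong = urljoin(urljoin(div_base, "feu/"), "axe.html")
print(wrong)     # http://example.com/feu/feu/axe.html
```

So whichever base the library reports, the caller has to know whether the div's attributes have already been folded in.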
Re: http://www.intertwingly.net/wiki/pie/XhtmlContentDivConformanceTests
On Jun 28, 2006, at 3:10 PM, Robert Sayre wrote: The content in the entries below should be handled the same way:

<entry xml:lang="en" xml:base="http://example.com/foo/">
  ...
  <content type="xhtml">
    <xhtml:div xml:lang="fr" xml:base="http://example.com/feu/"><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>

<entry xml:lang="en" xml:base="http://example.com/foo/">
  ...
  <content type="xhtml" xml:lang="fr" xml:base="http://example.com/feu/">
    <xhtml:div><xhtml:a href="axe.html">axe</xhtml:a></xhtml:div>
  </content>
</entry>

Of course the end result of both should be identical. Is that what you mean by "should be handled the same way"? The question is, if the xhtml:div is stripped by the library before handing it off to the app, how is the app going to get the attributes that were on the div? Is the library going to push those values down into the content or act as if they were on the atom:content element (or something similar to that)?

BTW, it just occurred to me that pushing them down into the content won't work. Here's an example where that would fail:

<entry xml:lang="en">
  ...
  <content type="xhtml">
    <xhtml:div xml:lang="fr">Oui!</xhtml:div>
  </content>
</entry>

Notice that there are no elements inside the xhtml:div for xml:lang to be attached to (and even if there were any, any text appearing outside of them would not have the correct xml:lang attached to it). So it looks like the options (both of which a single library could support, of course) are:

* Strip the div, but provide a way to get the attributes that were on it, or
* Leave the div
Re: Feed Thread in Last Call
On May 18, 2006, at 8:10 AM, Brendan Taylor wrote: Do you have any suggestions about how this metadata could be included without changing the content of the feed? AFAICT the only solution is to not use the attributes (which aren't required, of course).

If it's in the feed document and it gets updated other than when the entry itself is updated (...and it wouldn't be of much use if it were only updated when the entry was updated), it's going to result in data getting re-fetched when nothing but the comment count and timestamp change. I don't see any way around that. So if you really want a way to publish comment counts and timestamps without causing lots of unchanged data to get refetched, you're going to have to separate that data out of the feed. Here's pseudo-XML for a possible approach:

<feed>
  ...
  <link rel="comment-tracking" href="..."/>
  ...
  <entry><id>foo</id> ... </entry>
  <entry><id>bar</id> ... </entry>
  ...
</feed>

and in another document:

<ct:comment-tracking xmlns:ct="..." xmlns:atom="...">
  ...
  <atom:link rel="related" href="URL of the feed" .../>
  <ct:entry ref="foo">
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="5" ct:when="..."/>
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="3" ct:when="..."/>
  </ct:entry>
  <ct:entry ref="bar">
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="0" ct:when="..."/>
    <atom:link rel="comments" href="..." type="..." hreflang="..." ct:count="1" ct:when="..."/>
  </ct:entry>
  ...
</ct:comment-tracking>

Of course the comment tracking document would only be authoritative for feeds that pointed to it with a comment-tracking link. This would require an extra subscription to track the comments, as well as understanding an additional format (as opposed to just an additional extension--either approach requires SOME additional work), but it would prevent unnecessary downloads by clients that aren't aware of it, and would reduce the bandwidth used by those that are.
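A consumer aware of the extension could join the sibling document back to the feed by entry id. A rough sketch (the ct: namespace URI, element names, and counts are hypothetical, following the pseudo-XML above):

```python
from xml.etree import ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"
CT = "http://example.org/comment-tracking"  # hypothetical namespace URI

def comment_counts(feed, tracking, ct_ns):
    """Join the sibling tracking document back to the feed by entry id,
    summing the ct:count attributes on each entry's comments links."""
    ids = {e.findtext(ATOM + "id") for e in feed.findall(ATOM + "entry")}
    counts = {}
    for ct_entry in tracking.findall("{%s}entry" % ct_ns):
        ref = ct_entry.get("ref")
        if ref not in ids:
            continue  # tracking doc isn't authoritative for this entry
        counts[ref] = sum(int(link.get("{%s}count" % ct_ns, "0"))
                          for link in ct_entry.findall(ATOM + "link"))
    return counts

feed = ET.fromstring(
    '<feed xmlns="http://www.w3.org/2005/Atom">'
    '<entry><id>foo</id></entry><entry><id>bar</id></entry></feed>')
tracking = ET.fromstring(
    '<ct:comment-tracking xmlns:ct="%s" '
    'xmlns:atom="http://www.w3.org/2005/Atom">'
    '<ct:entry ref="foo">'
    '<atom:link rel="comments" href="a" ct:count="5"/>'
    '<atom:link rel="comments" href="b" ct:count="3"/>'
    '</ct:entry>'
    '<ct:entry ref="bar">'
    '<atom:link rel="comments" href="c" ct:count="0"/>'
    '</ct:entry>'
    '</ct:comment-tracking>' % CT)
print(comment_counts(feed, tracking, CT))  # {'foo': 8, 'bar': 0}
```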
This approach could be generalized to enable offloading of other metadata that's more volatile than the entries themselves. Antone
Re: Feed Thread in Last Call
On May 18, 2006, at 12:31 PM, A. Pagaltzis wrote: Actually, you don’t really need another format. There’s no reason why you couldn’t use atom:feed in place of your hypothetical ct:comment-tracking. :-) Your ct:entry elements could almost be atom:entry ones instead, too, except that assigning them titles and IDs feels like overkill.

The point of the whole exercise is to create a lightweight document for volatile metadata. If it's an atom:feed, you have to include a lot of stuff that's not needed here--atom:title, atom:updated, atom:author, and atom:summary or atom:content. Also, you'd need to have an atom:id for each entry in addition to the @ref pointing to the entry that it talks about.

The real cost is not the cost of an extra format, but that implementations then need to understand the FTE in order to know to poll an extra document to retrieve the out-of-band metadata.

Sure, but if they don't understand FTE, they wouldn't know what to do with the extra metadata anyway even if it were in the main feed. They MIGHT be able to do some generic processing of the comments link, but the reliability of any generic processing algorithm for unknown link types is questionable since we left atom:link open to all sorts of uses. And you COULD keep the comments links in the main feed but just leave off @count and @when for the benefit of apps that don't process the sibling document.

On May 18, 2006, at 11:48 AM, Antone Roundy wrote: This approach could be generalized to enable offloading of other metadata that's more volatile than the entries themselves.

I don't know yet what other metadata might be handled this way, but here's slightly revised pseudo-XML that makes it more general and adds a few useful things:

<feed>
  ...
  <id>foobar</id>
  ...
  <link rel="volatile" href="..."/>
  ...
  <entry><id>foo</id> ... </entry>
  <entry><id>bar</id> ... </entry>
  ...
</feed>

<v:volatile ref="foobar" xmlns:v="..."
    xmlns="http://www.w3.org/2005/Atom" xmlns:thr="...">
  <!-- @ref could be omitted if using with RSS -->
  <link rel="related" href="URL of the feed" .../>
  <!-- don't really need something different from "related", right? -->
  <updated>...</updated>
  <v:entry ref="foo">
    <!-- @ref could be a guid if using with an RSS 2.0 feed, though we all know that RSS 2.0 guids are misused in ways that might make the connection unreliable -->
    <updated>...</updated>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="5" thr:when="..."/>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="3" thr:when="..."/>
  </v:entry>
  <v:entry ref="bar">
    <updated>...</updated>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="0" thr:when="..."/>
    <link rel="comments" href="..." type="..." hreflang="..." thr:count="1" thr:when="..."/>
  </v:entry>
  ...
</v:volatile>
Re: Does xml:base apply to type=html content?
On Mar 31, 2006, at 7:01 AM, A. Pagaltzis wrote: * M. David Peterson [EMAIL PROTECTED] [2006-03-31 07:55]: I'm speaking in terms of mashups... If a feed comes from one source, then I would agree... but mashups from both a syndication as well as an application standpoint are becoming the primary focus of EVERY major vendor. It's in this scenario that I see the problem of assuming the xml:base in current context has any value whatsoever. No. That is only a problem if you just mash markup together without taking care to preserve base URIs by adding xml:base at the junction points as necessary. Copying an atom:entry from one feed to another correctly requires that you query the base URI which is in effect in the scope of the atom:entry in the source feed, and add an xml:base attribute to that effect to the copied atom:entry in the destination feed. If you do this, any xml:base attributes within the copy of the atom:entry will continue to resolve correctly. It’s much easier to get right than copying markup without violating namespace-wellformedness, even.

Exactly. When creating a mashup feed, there are any number of things that the ... masher(?) has to be careful of--for example:

* Getting namespace prefixes right
* Creating an atom:source element and putting the right data into it
* Ensuring that all entries use the same character encoding
* Ensuring that the xml:lang in context is correct
* Ensuring that the xml:base in context is correct
* If any of the source data isn't Atom, ensuring that all the required elements exist (...even if the source data IS Atom--you never know when you're going to aggregate from an invalid Atom feed--then you have to decide whether to fix the entry or drop it to make your output correct)

If we start assuming that mashers can't do those correctly, then we may as well not be using Atom, or even XML. If we did a proper job of specifying Atom, then we should be able to hold publishers' feet to the fire and make them get their feeds right.
In Atom, xml:base is the mechanism used to determine base URIs.
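The "add xml:base at the junction point" step Pagaltzis describes could look roughly like this on the aggregator side (a sketch; the helper name and sample entry are mine):

```python
import copy
from urllib.parse import urljoin
from xml.etree import ElementTree as ET

XML_BASE = "{http://www.w3.org/XML/1998/namespace}base"

def copy_with_base(entry, in_scope_base):
    """Copy an entry into a mashup feed, stamping the effective base URI
    at the junction point so relative references inside keep resolving."""
    clone = copy.deepcopy(entry)
    own = entry.get(XML_BASE)
    # The entry's own xml:base (if any) is itself resolved against the
    # base in scope in the source feed before being written onto the copy.
    clone.set(XML_BASE, urljoin(in_scope_base, own) if own else in_scope_base)
    return clone

src = ET.fromstring('<entry xmlns="http://www.w3.org/2005/Atom" '
                    'xml:base="sub/"><id>tag:example.com,2006:1</id></entry>')
out = copy_with_base(src, "http://example.com/foo/")
print(out.get(XML_BASE))  # http://example.com/foo/sub/
```

Any xml:base attributes nested deeper inside the entry are untouched; they resolve against the stamped base exactly as they resolved in the source feed.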
Re: Does xml:base apply to type=html content?
On Mar 30, 2006, at 10:30 PM, James M Snell wrote: Antone Roundy wrote: [snip] 2) If you're consuming Atom and you encounter a relative URI, how should you choose the appropriate base URI with which to resolve it? I think there are only three remotely possible answers to #2: xml:base (including the URI from which the feed was retrieved if xml:base isn't explicitly defined), the URI of the self link, and the URI of the alternate link. Given that Atom explicitly supports xml:base, if it's explicitly defined, it's difficult to justify ignoring it in favor of anything else. There is no basis in any of the specs for using the URI of the self or alternate link as a base URI for resolving relative references in the content. The process for resolving relative references is very clearly defined.

Right--my point is:

1) If the original publisher made the mistake of using relative references without explicitly setting xml:base (figuring that consumers could resolve the references relative to the location of the feed), and then the feed got moved or mirrored, one would certainly fail at finding the things the publisher intended to point to if the URI from which the feed was retrieved was used as the base URI, but might succeed by using the self link as the base URI. (I do not advocate doing this as default behavior, as stated below.)

2) If the original publisher made the mistake of not even thinking about relative references in the content and therefore didn't set xml:base, the relative references may very well be relative to the location pointed to by the alternate link. For example, the person generating the content may have been thinking "my blog entry will appear at http://example.org/blog/2006/03/foo.html, so I can use the relative URL ../../../img/button.gif to point to the image at http://example.org/img/button.gif".
If the alternate link points to http://example.org/blog/2006/03/foo.html, then the consumer that wants to find the image will only succeed by using the alternate link as the base URI. (I do not advocate doing this as default behavior, as stated below.)

Moral of this story: failing to explicitly set xml:base is bad because it tempts consumers to ignore the spec in order to get what they want. I do not advocate ignoring the spec as default behavior. But honestly, I might give the user of a consuming application the option of overriding the default behavior on specific feeds if they know that the publisher makes the mistake of publishing links relative to the self or alternate link without setting xml:base. I'd LIKE to be able to hold the publisher's feet to the fire and make them fix the feed, but sometimes my users hold MY feet to the fire and make me give them usable workarounds. Antone
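The two outcomes are easy to see with standard URI resolution -- resolving against a (hypothetical) mirror's retrieval URI misses the intended image, while resolving against the alternate link finds what the author meant:

```python
from urllib.parse import urljoin

relative = "../../../img/button.gif"
# Hypothetical mirror URI the feed was actually fetched from:
retrieval_uri = "http://mirror.example.net/feeds/atom.xml"
# The alternate link from the example above:
alternate = "http://example.org/blog/2006/03/foo.html"

print(urljoin(retrieval_uri, relative))  # http://mirror.example.net/img/button.gif (misses)
print(urljoin(alternate, relative))      # http://example.org/img/button.gif (intended)
```

Which is the temptation the mail describes: the spec-mandated base gives a valid resolution that points at nothing useful.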
xml:base in your Atom feed
Sam, Funny that this should come up today given the recent discussion on the mailing list--NetNewsWire isn't getting the links in your Atom feed right. I looked at the source, and it's clearly a NetNewsWire bug since it's not even trying to resolve relative to the URI from which it retrieves the feed. In fact it appears to be resolving relative to the alternate link (<link href="/blog/"/>), and not doing such a good job of it--for example, instead of http://www.intertwingly.net/blog/2006/03/31/Rogers-Switches, it's pointing to http:/blog/blog/2006/03/31/Rogers-Switches--but I wonder whether it would get it right if you set xml:base explicitly. Antone
Re: xml:base in your Atom feed
On Mar 31, 2006, at 4:12 PM, Sam Ruby wrote: Antone Roundy wrote: Sam, Funny that this should come up today given the recent discussion on the mailing list--NetNewsWire isn't getting the links in your Atom feed right. There is an off chance that I have been following the list. ;-) I certainly didn't mean to imply that you weren't--I just wanted to point out what I'm seeing in case you didn't know that this particular feed reader is having this particular problem today. And I thought it might be of interest to the WG to know what NNW is doing given that it's doing something that has been argued against within the last 24 hours. I don't remember which version of your feed I was subscribed to before--perhaps I wasn't subscribed to the Atom feed and NNW updated my subscription when you redirected to it. So I don't know whether you purposely removed xml:base to see what chaos would ensue, or whether it hasn't been there all along and I just haven't seen the problem since I was subscribed to a different version.
Re: atom:name ... text or html?
On Mar 23, 2006, at 9:48 AM, James Holderness wrote: Hahaha! It's RSS all over again. In the words of Mark Pilgrim: Here's something that might be HTML. Or maybe not. I can't tell you, and you can't guess. :-) Seriously though, the atom:name element is described as a human-readable name, so unless your name really is Betrand Caf&eacture; that can't be right. If RFC 4287 had intended to allow markup in the element it would have used atomTextConstruct. I agree with James here--if we had intended for the name to be able to include markup, we should have used the construct we created to allow that. This from RFC 4287 (section 3.2): element atom:name { text } would have been this: element atom:name { atomTextConstruct } if we had intended for it to be able to contain anything but literal text after XML un-escaping, right? On Mar 23, 2006, at 9:57 AM, Eric Scheid wrote: It's true that XML has only a half dozen or so entities defined, meaning most interesting entities from html can't exist in XML ... unless maybe they are wrapped like in CDATA block like above? If they're wrapped in a CDATA block, then they don't trigger an XML parsing error, but wrapping something in CDATA isn't a license to enter data in a format other than what the RFC allows. I'm getting the data by scraping an html page, so I'm expecting it to be acceptable html code, including html entities. You, the producer, are getting the data from an HTML page, so you should certainly be prepared to handle HTML entities in it. But you, the Atom publisher, are responsible for making sure that you've made any changes to the data that are necessary for it to be proper Atom before you publish it. The consumer of the Atom feed doesn't know where you got the data, and thus can't be expected to decide how to process it based on where you got it.
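The producer-side cleanup argued for above can be sketched as follows (the function name is mine, not from any spec): text scraped from HTML may contain HTML entities, but atom:name may only carry plain text, escaped for XML.

```python
from html import unescape
from xml.sax.saxutils import escape

def to_atom_name(scraped_html_text):
    """Turn scraped HTML text into a legal atom:name payload."""
    plain = unescape(scraped_html_text)  # resolve &eacute;, &amp;, etc.
    return escape(plain)                 # re-escape only XML's <, >, &

print(to_atom_name("Caf&eacute; &amp; Croissants"))
# The literal text after XML un-escaping is "Café & Croissants" --
# plain human-readable text, as RFC 4287's { text } production requires.
```

The consumer never sees the HTML entities; the conversion happened before publication, which is exactly where the responsibility lies.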
Re: Feed paging and atom:feed/atom:id
On 10 Mar 2006, at 18:44, James M Snell wrote: If the feeds have the same atom:id, I would submit that they form a single logical feed. Meaning that all of the feed documents in an incremental feed (using Mark's Feed History terminology) SHOULD use the same atom:id value. This is the way I have implemented paging in our APP implementation. If the linked feeds have different atom:id values, they should represent different logical feeds. Agreed. From 4.2.6: Put another way, an atom:id element pertains to all instantiations of a particular Atom entry or feed; revisions retain the same content in their atom:id elements. All the Atom Feed Documents representing one incremental feed (or parts of one incremental feed) are instantiations of a particular Atom ... feed, are they not? So they should have the same value in atom:id. If they don't, then they can't be considered instantiations of the same Atom feed.
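The rule above (all documents of one incremental feed share one atom:id) is easy for a consumer to check. A minimal sketch, with a helper name of my own invention:

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def same_logical_feed(*feed_docs):
    """True if every Atom feed document carries the same atom:id."""
    ids = {ET.fromstring(doc).findtext(ATOM + "id") for doc in feed_docs}
    return len(ids) == 1

page1 = ('<feed xmlns="http://www.w3.org/2005/Atom">'
         '<id>tag:example.org,2006:feed</id></feed>')
page2 = ('<feed xmlns="http://www.w3.org/2005/Atom">'
         '<id>tag:example.org,2006:feed</id></feed>')
print(same_logical_feed(page1, page2))  # True: pages of one feed
```

Documents that fail this check should be treated as distinct logical feeds, not as pages of one archive.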
Re: IE7 Feed Rendering Issue
On Mar 9, 2006, at 12:07 PM, James M Snell wrote: As an alternative, Feed Readers can provide publishers with a way of specifying optionally applied styling for feeds and entries, e.g.,

<feed>
  ...
  <link rel="stylesheet" type="..." href="..." />
  ...
  <entry>
    ...
    <link rel="stylesheet" type="..." href="..." />
    ...
  </entry>
</feed>

Given my opinion on the use of the link element, I suppose I should propose an alternative:

<ext:style type="text/css"> ... </ext:style>

or

<ext:style src="http://..." />

Either method permitted, like how we do atom:content. 'type="text/css"' optional, or is it needed? Warning to those daring to try the second that some feed readers won't bother downloading the external file. Warning to publishers that if they specify styles for "body", for example, some readers may say "there's no body element in the content, so I'll ignore this rule" (so put the content in a container with an ID or class and set the style for that instead), and others may say "how dare you try to take over the styling of the body when the body element isn't allowed in the content, I'll ignore this rule", and others may just ignore all or some of it for whatever reason they wish. Can be at feed or entry level and be intended for application to its siblings and their children (those with textual content only--and of course, some clients may not apply it to all siblings and children even if they are textual). If we really want to get fancy (big if), we could add @apply-to="content", but then you get into the qnames-in-attributes problem... Or we could specify that it only applies to atom:content and perhaps atom:summary (and any extension element that explicitly specifies that it applies). Well, that's enough off the top of my head. Antone
Re: Link rel attribute stylesheet
On Feb 26, 2006, at 9:10 PM, James Yenne wrote: My feeds contain a generic xml-stylesheet, which formats the feed for display along with a feed-specific css. Since xsl processors do not have a standard way to pass parameters to xsl stylesheets, I provide this feed-specific css to the xsl processor in the feed as a link with rel="stylesheet". Generating xhtml with this xsl/css solution works for rendering both in IE6 and FF1.5. (Why does IE7 rip out xml-stylesheet directives?) A <link rel="stylesheet"> seems to be the most efficient solution; however, a fully qualified URI relation does the job too. I would like to request a "stylesheet" link relation be added to the IANA List of Relations and supported in the validators. Thoughts? One problem with this is that there's no machine-readable way, without an extension attribute, to indicate what format the stylesheet is going to transform the data to. If you're going to add an extension attribute, I'd suggest just making the whole thing an extension element instead. Of course, my opinion is partly based on my preference (which was rejected by the group) for limiting the link element to links intended for traversal, so maybe that doesn't matter. But certainly the possibility should be considered that this is stretching the use of the link element beyond what it was designed for. Antone
Re: Link rel attribute stylesheet
On Feb 27, 2006, at 8:29 AM, M. David Peterson wrote: When you say "what it was designed for" can you be specific as to what that definition is? Well, we failed to gain consensus on that. Some of us wanted it to be used only for links intended to be traversed by the user (like the a element in HTML with an href attribute--the link is there so that the user can click it and get to the linked resource). Others didn't want this limitation, but wanted the link to be resolvable (e.g., no tag: URIs). Others wanted to be able to stick any URI in it. So there is no tightly defined "what it was designed for". I'm just saying that if an extra attribute is required to disambiguate what's being pointed to in a case like the following (without requiring the link target to be loaded and inspected), then maybe you're trying to make this one element do too much:

<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-rss-1.0.xsl" />
<link rel="stylesheet" href="http://example.org/atom-2-fooml.xsl" />

etc. If one were to encounter such a list of links at the top of an Atom document, which should one use? Should one download all of them and then pick one? Or are you going to add an attribute, something like this:

<link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl" ext:targettype="application/xml+rss" />

Sorry, new to the conversation, but I have particular interest in this topic as it is my belief that the URI/IRI can be used to imply a lot of information that is otherwise hidden from view, or uses more complex mechanisms to achieve the same result. If there is real concern as to this approach, it would be great to gain a greater understanding as to what they are such that I can apply this to the work I am doing in this area. For a particular example of what I mean, please see this post http://www.xsltblog.com/archives/2006/02/what_rest_gets_1.html Hmm. If I'm reading that right, I wouldn't want to organize my websites that way.
And unless the specification for the "stylesheet" link relation were to mandate that URIs be constructed in a way that enables readers to tell from the local path what type the stylesheet is going to transform the feed to, you wouldn't have any way to know whether you could apply such an interpretation in any given case. I don't really see the benefit of putting the information into the URI versus creating an attribute whose sole purpose is to specify the type. The number of bits it would save is trivial, and it would require the extra step of parsing the URI's local path to pull out information that could be taken more easily from a dedicated attribute. Antone
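The contrast above can be made concrete. This sketch uses the hypothetical ext:targettype attribute floated in this thread (it is not a registered extension; the namespace URI is invented for illustration): a dedicated attribute is one lookup, while the URI's local path must be parsed and matched against a naming convention to guess the same information.

```python
import xml.etree.ElementTree as ET

doc = """<feed xmlns="http://www.w3.org/2005/Atom"
               xmlns:ext="http://example.org/ext">
  <link rel="stylesheet" href="http://example.org/atom-2-rss-2.0.xsl"
        ext:targettype="application/rss+xml"/>
</feed>"""

link = ET.fromstring(doc).find("{http://www.w3.org/2005/Atom}link")

# Dedicated attribute: the target type is read directly.
print(link.get("{http://example.org/ext}targettype"))

# URI parsing: fragile string surgery on a filename convention.
print(link.get("href").rsplit("/", 1)[-1])
```

The second print only yields "atom-2-rss-2.0.xsl", which a client would still have to pattern-match against some convention to recover a media type.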
Re: partial xml in atom:content ?
On Jan 17, 2006, at 11:04 AM, James Holderness wrote: but I think I've shown some pretty compelling reasons why a producer (if they really absolutely have to use application/xhtml+xml), would be wiser to use an xhtml document fragment than a complete xhtml document. I'm all for consuming applications that want to be really smart checking whether the content of <content type="application/xhtml+xml"> is a fragment or a complete document and handling either, but if your content is an xhtml document fragment, is there any reason at all to publish type="application/xhtml+xml" rather than type="xhtml"? The only justification that comes to mind is if you want to make a political protest statement against the required wrapper div. But unless you prominently warn your users that your app is doing this, you're doing them a grave disservice by making their feed content less likely to be seen.
Re: partial xml in atom:content ?
On Jan 15, 2006, at 8:09 PM, James Holderness wrote: Thus, can atom be used to ship around parcels of xml snippets? I suppose it could, but only so long as both ends knew what was going on, and knew naïve atom processors might barf on the incomplete xml, right? The one time I'd think it might be safe is with XHTML (as I mentioned in a previous message) since Atom processors are already required to handle XHTML fragments in the content element. Anything else would be highly risky unless it was a proprietary feed communicating between two known applications. Processing type="xhtml" and type="application/xhtml+xml" are very different beasts. Say your application converts Atom feeds to HTML to display in webpages. With type="xhtml", the data could just be dumped into the webpage (after appropriate stripping of nasty tags and CSS and such). With type="application/xhtml+xml", you'd have to figure out what to do with everything outside of the body element. If there's CSS involved, for example, simply throwing it away could lead to some very messed up display. But assuming your application is being called from within the webpage, it's not going to have the opportunity to add a style section to the document's head. So to avoid losing the styling, for example, it would have to replace all id="foo" and class="bar" attributes with style="all of the styling for the id and class and parent classes, etc., with all cascading applied". In other words, it's not going to happen. Given the tremendously increased complexity involved, some apps are likely to refuse to process anything that's not one of Atom's three special types.
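A rough sketch (my own code, not any aggregator's) of the asymmetry described above: with type="xhtml" the wrapper div's children are the content, directly usable, while a full application/xhtml+xml document forces the consumer to locate the body and silently lose whatever the head carried (such as a stylesheet).

```python
import xml.etree.ElementTree as ET

XHTML = "{http://www.w3.org/1999/xhtml}"

fragment = '<div xmlns="http://www.w3.org/1999/xhtml"><p>Hello</p></div>'
document = ('<html xmlns="http://www.w3.org/1999/xhtml">'
            '<head><style>p { color: red }</style></head>'
            '<body><p>Hello</p></body></html>')

# type="xhtml": the div's children can be dumped into a page as-is.
print(ET.fromstring(fragment)[0].tag)

# type="application/xhtml+xml": dig out the body; the head's <style>
# (the red paragraph color) is discarded along the way.
body = ET.fromstring(document).find(XHTML + "body")
print(body[0].tag)
```

Both prints yield the same p element, but only the second path had to throw styling away to get there, which is the "messed up display" risk the email describes.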
Re: partial xml in atom:content ?
On Jan 16, 2006, at 4:21 PM, James Holderness wrote: For example, below are the results of some tests I've run on 15 aggregators. The tests included the use of a div tag as the root element, a p tag as the root element, and an html tag as the root element (i.e. a complete xhtml document). The following applications worked with all three tests:

BlogBridge 2.7
Bloglines
BottomFeeder 4.1
Google Reader
Snarfer 0.1.2

The following applications worked with the div tag and the p tag, but failed to handle a full document (the html tag):

FeedDemon 1.5
GreatNews 1.0.0.354
Newz Crawler 1.8.0
RSS Bandit 1.3.0.38
SharpReader 0.9.6.0

Out of curiosity, what constitutes success in the html case? I'm mostly curious about the browser-based readers. If they displayed the content within a webpage, but failed to strip out the <html>, </html>, <body> and </body> tags and head section (assuming the test feed contained one), would that be a success or failure? What did the apps that failed do in the html case?
Re: Sponsored Links and other link extensions
On Oct 25, 2005, at 12:59 AM, A. Pagaltzis wrote: I am asking if is there a generic way for an application to implement alternate-link processing that gives sensible behaviour for any type of main link. If an implementor has to support alternative links explicitly for each type of main link, where’s the difference to having specific relationships for alternative links depending on the main link type? Here are a few examples of generic processing algorithms an application might use: Mirrors: 1) Randomly selecting a mirror to download from, thus helping to spread the bandwidth usage among them. 2) Try the main link, and if the DNS lookup fails, or a connection can't be made or something, automatically try the next one. 3) Ping each of the servers in the background, and if the user clicks the link, use the fastest one. Alternates: 1) Have a prioritized list of formats, and choose the link that points to the highest priority format. 2) Of all the formats the app supports, choose the one with the smallest @length, if present. Either one: 1) Show some sort of UI for selecting which link to follow (perhaps have the main link selected by default, but allow the user to select an alternate from the popup). None of those ideas is necessarily tied to any particular link relation. They might be more important for enclosures than any of the other relations that have been defined so far, and an application may or may not do some for enclosures that it doesn't do for some other specific link relations. But again, it comes back to the yet unanswered question, are there any disadvantages to keeping it generic? I haven't heard anyone suggest any downside yet--only that some people can't imagine why anyone would want to use alternative links for anything but enclosures.
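Generic mirror algorithm 2 above ("try the main link, and if the connection fails, automatically try the next one") is easy to sketch without committing to any link relation. This is purely illustrative code of mine, with a pluggable fetch function so no real network access is assumed:

```python
def fetch_with_fallback(main_href, mirror_hrefs, fetch):
    """Try the main link, then each mirror, returning the first success."""
    for href in [main_href] + list(mirror_hrefs):
        try:
            return href, fetch(href)
        except OSError:  # DNS failure, refused connection, etc.
            continue
    raise OSError("all links failed")

# Simulated fetcher: the main server is down, the mirror works.
def fake_fetch(href):
    if href.startswith("http://example.com/"):
        raise OSError("connect failed")
    return b"file contents"

href, data = fetch_with_fallback(
    "http://example.com/file.mp3",
    ["http://www2.example.com/file.mp3"],
    fake_fetch)
print(href)  # the mirror that answered
```

Nothing in the algorithm cares whether the links are enclosures or anything else, which is the point being argued: the processing stays generic across link relations.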
Re: Sponsored Links and other link extensions
On Oct 25, 2005, at 11:04 AM, James M Snell wrote: All-in-one example. The x:group attribute links the two alternates into a single grouping; the x:mirror specifies the mirrors for each link. nf:follow="no" is my Atom Link No Follow extension that tells clients not to automatically download the enclosure. Dumb clients will see what amounts to the current status quo, two different enclosures of different types. Smart clients will see the mirrors, the grouping and the no-follow instruction.

<link rel="enclosure" href="http://example.com/softwarepackage.zip" type="application/zip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.zip" title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.zip" title="European Server" />
</link>
<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz" type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" title="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" title="European Server" />
</link>

Thoughts? The only thing I would change is the name of x:mirror/@title to make it clear that it isn't intended(?) to replace the parent link's @title. My current favorite name is "label".
Re: Sponsored Links and other link extensions
On Oct 25, 2005, at 1:16 PM, James M Snell wrote: Also, assuming the title on the main link is supposed to describe the download file itself, there appears to be no way to inform the user of the mirror location of the main URI. Without a location name of some sort, the user can't make an informed decision about which mirror would be best to use. Perhaps something along the line of Antone's "label" suggestion might help here. I could just do this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz" type="application/x-gzip" x:group="software-package" nf:follow="no">
  <x:mirror href="http://example.com/softwarepackage.tar.gz" label="Main Server" />
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" label="European Server" />
</link>

or this:

<link rel="enclosure" href="http://example.com/softwarepackage.tar.gz" type="application/x-gzip" x:group="software-package" x:label="Main Server" nf:follow="no">
  <x:mirror href="http://example2.com/softwarepackage.tar.gz" x:label="California Server" />
  <x:mirror href="http://example3.com/softwarepackage.tar.gz" x:label="European Server" />
</link>
Re: New Link Relations -- Ready to go?
On Oct 24, 2005, at 8:13 AM, James Holderness wrote: With what we have so far we can do incremental feed archives; we can do at least some form of searching; we can do non-incremental feeds (of the Top 10 variety) with history. I think that's a good start. But we also want paged non-incremental feeds (OpenSearch result feeds), while non-incremental feeds with history have not yet proven to be needed. I still don't see why OpenSearch result feeds can't be implemented as incremental feeds. Perhaps they can, but that wouldn't always be desirable. Consider this scenario: Somebody writes a program that searches Google, scrapes the HTML results, and publishes them as an Atom feed. My purpose in subscribing to the feed is not to be alerted when a new webpage is added to page 20 of Google's results, it's to be alerted whenever a new webpage makes it onto page 1. So I don't want new pages added to the live end of the feed--I just want whatever is currently in the top 10 results, and my feed reader will tell me when one of them is one it hasn't seen before. Either they're being used as a one-off search and you can't subscribe to them (in which case there is no difference between incremental and non-incremental), or they're being updated with new results over time (like a filtered aggregate feed) in which case I would think they have to be incremental. Given the above scenario, why wouldn't you be able to subscribe to them? I'm proposing previous/next linking from chunk to chunk inside the same snapshot and adding a new link relation (or set of link relations) for linking from snapshot to snapshot. Do you now see what I'm talking about? I understand what you're talking about, but I just don't see the need. I would have expected a non-incremental feed to be a single Atom document. In the case of something like a top 10 feed, I'd imagine it would be. But a search results feed like what's described above may not be. 
My reason for wanting paging is so that a user doesn't need to fetch data that he already has - this can never be a problem with a non-incremental feed because it doesn't grow. I'm not sure I understand--it's not as if a non-incremental feed were simply a static document. They're resources whose contents are replaced wholesale (with the things that were in the old set possibly still being in the new set) rather than having their old contents augmented when new things are added.
Re: Profile links
On Oct 23, 2005, at 6:45 PM, James Holderness wrote: James M Snell wrote: 1. Can a profile element appear in an atom:feed/atom:source? If so, what does it mean? I think it should, with the caveat that the profile attribute should only impact the feed and should not reflect on the individual entries within that feed. I can't see any particular use for atom:source myself, but I would definitely want profile support at the feed level. As an aggregator I want to be able to display a custom view for a particular feed based on what it contains (e.g. a slideshow view if it's a flickr feed - all images). It would be difficult to do something like that with only entry-level profiles. I don't think it's possible to allow something at the feed level, but disallow it in atom:source (the Atom format spec could have done that, but I don't think an extension can add such restrictions). What does it mean in atom:source? That the feed that the entry came from conformed to the profile. What will consuming applications do with profile elements in atom:source? That's entirely up to the application developer. Maybe nothing--maybe they'll ignore profiles that don't apply to the entire feed. Or maybe they'll come up with something useful.
Re: Sponsored Links and other link extensions
On Oct 24, 2005, at 5:18 AM, James Holderness wrote: Eric Scheid wrote: The challenge with using alternate to point to files of different types is that why would someone do (a) when they can already do (b) without the help of a new extension?

(a)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3">
  <x:alternate type="application/ogg" href="http://example2.com/file.ogg" />
</link>

(b)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" />
<link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg" />

With (a), we know the .mp3 and the .ogg are simply different formats of the same thing. With (b) we don't know either way. I like (a) in concept because, as you say, it enables you to tell when two links are the same, so if you're auto-downloading you don't need them both. However, I do think James is right in thinking that many people will just use (b) because it's already there. I don't see the harm in allowing (a) though. If a feed producer uses (a) and an end-user has auto-downloading enabled for that feed, they both benefit from less wasted bandwidth. The only downside would be that aggregators that aren't aware of this extension would fail to see the alternate enclosures. Is that so bad though? It's a trade-off the feed producer has to make - I'm not sure we should be making that decision for them. Here's the middle path:

(c)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" x:link-set="a" />
<link rel="enclosure" type="application/ogg" href="http://example2.com/file.ogg" x:link-set="a" />

This won't save you from bandwidth waste by aggregators that don't support the extension, but it also won't prevent users of those aggregators from getting the data in a format they can use. That said, this is not my preferred method. I'd rather protect bandwidth and the user's hard drive space--all the more important because enclosures are often quite large. Here's a final option--is it legal?
Is it better or worse than (a) in any ways?

(d)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3">
  <link rel="alternate" type="application/ogg" href="http://example2.com/file.ogg" />
</link>

Better: it doesn't require processing of a new namespace or element--just a new way of using the data that one gets out of an existing element. I prefer (d), (a), (c) and then (b).
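For a consumer that does understand the hypothetical x:link-set extension from option (c), the grouping step is straightforward. A sketch of mine (the x: namespace URI is invented; none of this is a registered extension) showing how an auto-downloader could collapse equivalent enclosures so it fetches only one format per set:

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

ATOM = "{http://www.w3.org/2005/Atom}"
X = "{http://example.org/x}"

feed = """<feed xmlns="http://www.w3.org/2005/Atom"
                xmlns:x="http://example.org/x">
  <link rel="enclosure" type="audio/mpeg"
        href="http://example.com/file.mp3" x:link-set="a"/>
  <link rel="enclosure" type="application/ogg"
        href="http://example2.com/file.ogg" x:link-set="a"/>
</feed>"""

sets = defaultdict(list)
for link in ET.fromstring(feed).findall(ATOM + "link"):
    sets[link.get(X + "link-set")].append(link.get("type"))

print(dict(sets))  # set "a" holds both formats of the same thing
```

An extension-unaware aggregator sees two ordinary enclosures and downloads both, which is exactly the bandwidth trade-off discussed above.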
Re: New Link Relations -- Ready to go?
On Oct 24, 2005, at 11:16 AM, James Holderness wrote: A more sensible approach would be a single feed document containing the top N results (where N is manageable in size). You could subscribe to that as a non-incremental feed and you would know at any point in time which were the top 10 results. There is no real need for paging other than as a form of snapshot history (i.e. what were the top 10 results last week). That is certainly a good approach--allowing the number of results to be determined dynamically by something in the URL, for example. However, it could be useful to limit the chunk size and allow paging for people who want more. For example, you might allow a maximum of 50 results per chunk, and then support ETags. That way, if somebody wants to monitor the top 250, they can send 5 requests, and if most of the time there are no changes, they'll get a lot of 304s, but if occasionally something changes in the last chunk of 50, for example, they're only downloading 50 results each time something changes. There are of course other approaches, like support for just sending the diffs. But that would probably be more difficult for most people to implement, and may be less likely to be supported by a wide variety of clients. Another reason for wanting to limit the number of results per query (and support paging for those who want more) is to avoid bandwidth waste if someone accidentally adds an extra digit to the desired number of results; or tries to waste your system resources by requesting huge result sets (but dropping the connection before using up their own bandwidth actually receiving the whole result set); or has a client that doesn't support paging or diffs or ETags or anything, and wants a huge result set (and you don't want to accommodate them since it would use so much bandwidth), etc.
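The chunking arithmetic above (top 250 monitored via five 50-result pages) can be illustrated with a toy helper of my own; each resulting request would carry If-None-Match with that chunk's last ETag, so unchanged chunks come back as cheap 304s:

```python
def chunk_requests(total_wanted, page_size=50):
    """Return (start, count) pairs covering the desired result range."""
    return [(start, min(page_size, total_wanted - start))
            for start in range(0, total_wanted, page_size)]

print(chunk_requests(250))
# [(0, 50), (50, 50), (100, 50), (150, 50), (200, 50)]
```

Five conditional requests instead of one 250-result download means that when only the last chunk changes, only 50 results cross the wire.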
Once again, I have to ask the same question I asked Thomas: do you have a problem with Mark's next/prev proposal as it stands, or are you just arguing with me because you think I'm wrong? If the latter, feel free to just ignore me. We can agree to disagree. Unless we're discussing a particular proposal I don't see the point. I have a problem with not having link relations specific to paging through a feed's current state. I'm fine with having general chain navigation link relations, but hope that we'll get something specific to paging and that people will use it instead of the general link relations. I've spoken my piece on that and have given up swimming against the tide, but am still willing to discuss specific related issues.
Re: Sponsored Links and other link extensions
On Oct 24, 2005, at 1:48 PM, A. Pagaltzis wrote: I have a completely different proposition.

(e)
<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" encl:mirrors="http://www2.example.com/file.mp3 http://www3.example.com/file.mp3" xml:id="x-file" />
<link rel="alternative-enclosure" type="application/ogg" href="http://example2.com/file.ogg" encl:alternative-to="x-file" />

Since bit-for-bit identical files all have the exact same attributes, there is absolutely no reason to have an entire tag dedicated to each. In addition, making mirror URLs second-class citizens in this way provides an intuitive hint at the bit-for-bit identity semantics. Interesting. Filling an attribute with a list of URIs doesn't really appeal to me though. How about this:

<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" xml:id="x-file">
  <altlink:mirror href="http://www2.example.com/file.mp3" />
  <altlink:mirror href="http://www3.example.com/file.mp3" />
</link>

Specifying alternative formats with a distinct link relationship prevents bandwidth and diskspace drain from oblivious clients. Sounds good, but you may have noticed above that I used a prefix not specific to enclosures--there's no reason to tie this all to one particular type of link (nor to make it look as if it were tied to one specific link type). So the other link might, for example, be:

<link rel="alternative-link" type="application/ogg" href="http://example2.com/file.ogg" altlink:primary="x-file" />

Although "alternative-link" doesn't tell you what kind of link this is, since you're going to have to tie it back to the primary link to decide what to do with it anyway, it really shouldn't matter. Note that I changed "alternative-to" to "primary" just because it's shorter and one word.
Re: Sponsored Links and other link extensions
On Oct 24, 2005, at 2:59 PM, A. Pagaltzis wrote: * Antone Roundy [EMAIL PROTECTED] [2005-10-24 22:35]: Interesting. Filling an attribute with a list of URIs doesn't really appeal to me though. How about this:

<link rel="enclosure" type="audio/mpeg" href="http://example.com/file.mp3" xml:id="x-file">
  <altlink:mirror href="http://www2.example.com/file.mp3" />
  <altlink:mirror href="http://www3.example.com/file.mp3" />
</link>

It’s a lot more verbose and you have to fiddle with nesting. What do you get in return? “It looks more XMLish”? 1) Easier parsing, as James said, since your XML parsing library is going to give you the data with the URIs already split apart. 2) You can break lines between elements, but you can't inside an attribute, so it's better for display for humans. I think XMLishness leans this direction for good reason. Sounds good, but you may have noticed above that I used a prefix not specific to enclosures--there's no reason to tie this all to one particular type of link (nor to make it look as if it were tied to one specific link type). So the other link might, for example, be: I don’t know if striving for generality in this fashion without a practical need is worthwhile. It smells of architecture astronautics for a reason I can’t particularly pinpoint. So maybe my instinct is wrong. The way I see it, striving for specificity without a practical need isn't worthwhile either. Unless generalizing risks leading to some sort of problem, why do it? I see no potential problems. What if someday somebody does come up with a non-enclosure use for this (which hardly seems far-fetched to me--enclosures aren't the only things that get mirrored or exist in multiple formats)? They'll have to define a new mechanism for it which is either going to be identical except for element names, or they're going to invent another way to do the same thing. Either way, the pain of supporting both is completely unnecessary unless there's potential for generality causing problems.
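The "easier parsing" claim above is easy to demonstrate: with child elements, an XML parser hands back the mirror URIs already separated, with no hand-splitting of a whitespace-delimited attribute. A sketch of mine (the altlink namespace URI is invented for illustration; this is not a registered extension):

```python
import xml.etree.ElementTree as ET

ALT = "{http://example.org/altlink}"

link = ET.fromstring("""<link xmlns="http://www.w3.org/2005/Atom"
        xmlns:altlink="http://example.org/altlink"
        rel="enclosure" href="http://example.com/file.mp3">
  <altlink:mirror href="http://www2.example.com/file.mp3"/>
  <altlink:mirror href="http://www3.example.com/file.mp3"/>
</link>""")

# The parser has already split the mirrors apart for us.
mirrors = [m.get("href") for m in link.findall(ALT + "mirror")]
print(mirrors)
```

With the attribute-based design (e) the consumer would instead call .split() on the encl:mirrors value, which works too; the difference is ergonomics rather than capability, which matches how the thread eventually settles.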
Re: Sponsored Links and other link extensions
On Oct 24, 2005, at 9:59 PM, A. Pagaltzis wrote: * Antone Roundy [EMAIL PROTECTED] [2005-10-25 00:35]: 2) You can break lines between elements, but you can't inside an attribute, so it's better for display for humans. That’s not what the XML spec says. Doh! Who knows where I got that idea. I still prefer to have each piece of data in its own place. What if someday somebody does come up with a non-enclosure use for this (which hardly seems far-fetched to me--enclosures aren't the only things that get mirrored or exist in multiple formats)? They'll have to define a new mechanism for it which is either going to be identical except for element names, or they're going to invent another way to do the same thing. Either way, the pain of supporting both is completely unnecessary unless there's potential for generality causing problems. If it isn’t obvious from the start what it means that there’s an alternative-link for a via link or a previous or next link, then clients will have to support each of these use cases separately. So on the implementor’s end, there’s no discernible difference between the pain of supporting either approach. I'm not sure I understand what you're saying. Are you saying that one might do this if they want an alternate of a next link?

<link rel="next" xml:id="foo" ... />
<link rel="alternate-enclosure" x:alternate-of="foo" ... />

If that's what you mean, then sure, the code for that would be the same as for:

<link rel="next" xml:id="foo" ... />
<link rel="alternate-link" x:alternate-of="foo" ... />

...but it would sure look odd. I see no advantage to naming these things in terms of enclosures.
Re: What is this entry about?
On Oct 21, 2005, at 5:47 PM, James M Snell wrote: Err, are you forgetting atom:category? Doesn’t that satisfy all your wants *and* more? It has a URI, a term and a human-readable label. Regards, I dunno, that's why I was asking ;-) atom:category works well for categorizing entries, but does it really tell us what the entry is about? For instance, suppose that I want to indicate that an entry is about http://www.ibm.com and file that in a category called technology? The categorization of the entry is different than the subject of the entry, though both are definitely related. Why don't we define link/@rel=about for pointing to a specific internet resource that an entry is about (a little more specific than the general case of rel=related)? I know we discussed this before and in the chaos of trying to hammer the spec out, didn't do it, but I still think it's a good idea.
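[Editor's note: the distinction being drawn can be shown side by side. This is a sketch only -- rel="about" is the proposal in this message, not a registered Atom link relation, and the entry content is invented.]

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

entry = ET.fromstring("""<entry xmlns="http://www.w3.org/2005/Atom">
  <title>IBM announces something</title>
  <category term="technology" />
  <link rel="about" href="http://www.ibm.com" />
</entry>""")

# The category says how the entry is filed...
category = entry.find(ATOM + "category").get("term")
# ...while the (proposed, unregistered) rel="about" link names the
# specific resource the entry is about.
about = [l.get("href") for l in entry.findall(ATOM + "link")
         if l.get("rel") == "about"]
print(category, about)
```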
Re: New Link Relations -- Ready to go?
On Oct 21, 2005, at 7:19 PM, James Holderness wrote: What's the difference between a search feed and a non-incremental feed? Aren't search feeds one facet of non-incremental feeds? Not necessarily, no. A search feed could quite easily be implemented as an incremental feed. This is the most sensible approach since it would allow the feed to be viewed in all existing aggregators without requiring special knowledge of non-incremental feeds. If your goal is to work as well as possible with today's client software, then bending your data to fit their model is the most sensible approach, but that's not always the goal. The initial feed document consists of all known results at the time the search is initiated. As new results are discovered over time, the feed can be updated by adding new entries to the top of the feed in much the same way that new entries would be added to the top of a blogging feed. In fact, if you do a search with something like Feedster, this is exactly the sort of feed you will get back. If creation time is relevant to the data being searched, then this makes sense. But what if I want to subscribe to the top 10 Google results for some keywords I'm trying to optimize my site for (ignoring the fact that Google doesn't return search results in any feed format right now)? Or what about alternative sort orders which are available on sites like Feedster, Google News, etc.? (You can sort by relevance rather than date--the date still has some weight, but the results aren't strictly in date order.) How about Amazon.com affiliates who want to use an RSS parser to display affiliate links to best-seller search results? There are a lot of search use cases that don't fit the incremental model. All that said, search results are often a bit different than top 10 lists and the like.
With search results, you often don't want to view the contents of the feed in order all at once--the first time you do, but after that, you may just want to see new things as they make it up into the top positions. Today's clients can handle that just fine, unless you want to monitor more than just the first page of results.
Re: General/Specific [was: Feed History / Protocol overlap]
On Oct 19, 2005, at 11:12 AM, Mark Nottingham wrote: next next-chunk next-page next-archive next-entries are all workable for me. ... Perhaps people could +1/-1 the following options: * Reconstructing a feed should use: a) a specific relation, e.g., prev-archive b) a generic relation, e.g., previous I'd prefer prev-page. prev-archive doesn't sound right for paging through search results. Also, prev-archive or next-archive (whichever ends up going forward in time) doesn't quite work if the final step forward points to the subscription feed URI (which isn't an archive). That's a small matter since it's only that last step, but in search-results-type cases, archive would definitely be odd. Just a little follow up on what I wrote last night about generic vs. specific link relations: related is a generic term that is likely to be a bit of a catch-all for links that don't have a specific relation defined for them. alternate is a specific relation created for one of the major historical use cases for rss/link. The proposed but not accepted about would have been the specific relation for the other major use case that rss/link was commonly used for. related could conceivably handle the hypothetical use case of traversing a chain of different feeds--you'd just have to remember which related link to a feed document you had already traversed to know which one to follow next to continue down the chain. It wouldn't be quite as nice for such an application as having a next and prev for that use, but I'd rather see it done that way till it's clear that such a thing is even needed than see intrafeed paging links used for interfeed navigation.
Re: Feed History / Protocol overlap
On Oct 18, 2005, at 6:10 PM, Robert Sayre wrote: On 10/18/05, Antone Roundy [EMAIL PROTECTED] wrote: -3 to being that generic. That's a very large negative number. Can you explain how your version will let me write software I otherwise couldn't? Anything larger than -2 is bogomips--the point I was trying to make is that I think the idea of using the same link relation for paging within a feed and for navigating between feeds is absolutely absurd--completely lacking in foresight--almost looks like an attempt to create future problems. People were complaining that trying to avoid problems with the hypothetical top 100 DVDs scenario (not trying to solve it--just trying to avoid problems if it comes about) was wandering too far off into hypotheticals, but now people want to make sure they can use the next relation for the arguably even more hypothetical idea of building a chain of otherwise independent feeds? This boggles my mind. Here's what my version will let you do that you won't be able to do if the definitions of these links allow them to be used for interfeed navigation--it will enable you to do paging within a feed that is also part of a chain of feeds (because anyone wanting to create a chain of feeds will have to come up with a non-conflicting link relation to do it). It will also enable you to know that (unless somebody's breaking the spec) you are navigating through a single feed when you follow next and prev links around--that you are not jumping from feed to feed. Your software will be able to follow those links with a much greater degree of confidence that it won't result in your users complaining, "What the hell are you doing showing me entries from a feed I didn't subscribe to?" It will enable your application to take more actions automatically without having to ask for confirmation from the user every time you follow another next or prev link to avoid such complaints.
Re: Feed History / Protocol overlap
Here's what this discussion makes me think of--RSS has a link element. That link was very generic, and has been variously used to link to what Atom calls link/@rel=alternate and link/@rel=related, and perhaps even other things. Once we'd gained a little experience and discovered that the imprecision of the meaning of the element was limiting uses we wanted to make of feeds, we created more specific types of links. Hopefully, we were specific enough this time that we won't run into significant use cases that we've rendered impossible, but who knows. Now we're defining a method of navigating through a chain of linked documents. We know of two specific use cases that we're sure we want to be able to do: paging through things like search results, and catching up on incremental feeds (or reconstructing the entire state of the feed, which is an extension of catching up). It would appear that the same link relation can be used to do both of those things without the fear of conflict, because they operate within feeds that have a basic difference in nature, so they're unlikely to both be needed within one feed. Also, from a certain point of view, they are really the same thing--a way to navigate through the current state of the feed. The fact that incremental feeds don't have old states that have been discarded and replaced the way non-incremental feeds do (their former state gets augmented rather than being replaced) doesn't make a difference with respect to the issue of navigating through their current state. So why don't we create a mechanism to do those two things (that are really one thing), and NOT make it generic enough to encompass other things that we might want to do someday, which might lead to the same sort of limitation that RSS has by only having one generic link element? Sure, we COULD do all of our interdocument navigation using next and prev until someday when we decide that we need something more specific for some of the navigation use cases. 
But then we'll be doing some of the same things multiple ways--some people sticking with next and prev, and some using whatever new methods or link relations are invented, and nobody quite sure what next and prev mean in any particular feed. Why not wait till we've really figured out what other ways we might want to navigate between documents, and then devise a new method for doing it? If we're going to create some generic link relations for people to experiment with, let's create something that's explicitly for doing experimental things, so that the link relations we want to do more specific things with aren't rendered less useful by the experimentation. Register x-next and x-prev or something for that, or register next-page and prev-page for the things we know we want to do. Or don't register any such thing--just don't promote use of the link relations we define for (reasonably) well understood use cases to do experimental things. Well, I've spoken my mind plenty on this issue, so unless somebody brings up an issue on which my opinion couldn't be understood from what I've written already, I think I'll leave it at that. If we go with a highly-generic definition and it causes trouble down the road, I'll have some big ASCII art letters ready to say I told you so. If not, then oops, I guess I was wrong.
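[Editor's note: whichever relation name wins, the reconstruction client both sides have in mind is the same small loop. A sketch, using "prev-archive" as a stand-in for whatever name is finally registered, and a stubbed fetch function in place of HTTP.]

```python
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"

def prev_link(feed_doc, rel="prev-archive"):
    """Return the href of the first link with the given rel, or None."""
    for link in feed_doc.findall(ATOM + "link"):
        if link.get("rel") == rel:
            return link.get("href")
    return None

def reconstruct(uri, fetch, max_hops=100):
    """Collect entries by walking a chain of prev-style links.
    Guards against cycles and runaway chains."""
    entries, seen = [], set()
    while uri and uri not in seen and len(seen) < max_hops:
        seen.add(uri)
        doc = fetch(uri)
        entries.extend(doc.findall(ATOM + "entry"))
        uri = prev_link(doc)
    return entries

# Demo over two in-memory documents (URIs invented).
docs = {
    "http://example.com/feed": """<feed xmlns="http://www.w3.org/2005/Atom">
        <link rel="prev-archive" href="http://example.com/archive1"/>
        <entry><title>newest</title></entry>
    </feed>""",
    "http://example.com/archive1": """<feed xmlns="http://www.w3.org/2005/Atom">
        <entry><title>older</title></entry>
    </feed>""",
}
entries = reconstruct("http://example.com/feed",
                      lambda u: ET.fromstring(docs[u]))
print(len(entries))
```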
Re: Feed History -04
On Oct 17, 2005, at 2:20 AM, Eric Scheid wrote: On 17/10/05 5:09 PM, James Holderness [EMAIL PROTECTED] wrote: 1. Which relationship, next or prev, is used to specify a link backwards in time to an older archive. Mark Nottingham's Feed History proposal used prev. Mark Pilgrim's XML.com article used next. I'd prefer that our use of 'prev' and 'next' be consistent with other uses elsewhere, where 'next' traverses from the current position to the one that *follows*, whether in time or logical order. Consider the use of 'first/next/prev/last' with chapters or sections rendered in HTML. ...so do you follow forward through time or backward? Is the starting current position now or the beginning of time? Especially if we're talking about history, following backward makes as much sense as following forward. I prefer next to go back in time (if temporally ordered--from the most current chunk to the next most current chunk) or to less significant pages (in things like search results). But I'll probably have to stop and think what next means in temporally ordered feeds from time to time since it'd be the reverse of temporal order. 2. Are next and prev both needed in the spec if we only require one of them to reconstruct the full history? Knowing that the most recently published archive won't likely remain the most recently published archive, there will be use cases where it's better to reconstruct the full history by starting at the one end which is fixed. Not much sense starting at the other end which is constantly shifting. Is this only going to be used to reconstruct full history? What about just reconstructing the last 3 months (in which case you'd want a link from closer to the live end to closer to the fixed end), or reading from the beginning before deciding whether to continue reading what comes later (in which case you'd want a link from closer to the fixed end to closer to the live end)? 3. Are the first/last relationships needed? See (2) above for 'first'. 
Meanwhile 'last' could be followed by a user to jump ahead to the end of the set of archives to see if the butler did it. Who said 'first/next/prev/last' would only be used by machines? As mentioned above, there may be cases where you'd prefer to start at either the fixed or live end, so as long as complete feed reconstruction isn't the only goal, I'd say yes. But what's first? It'd be the top results in a search feed, but would it be the start of time or the start from the present (before possibly traveling backward through time) in a temporally ordered feed? Making it the start of time would prevent it from matching up well with significance-ordered feeds (ie. does first point to the thing you'd most likely want to see if you subscribed to the feed?). If we're not careful, we'll be traversing out of first through prev and last through next! 4. Is the order of the entries in a feed relevant to this proposal? Not to this proposal. If you mean not just the order within each chunk of the feed, but the order of the chunks, then not central, but certainly related. Two cases come to mind: 1) A chain of temporally ordered chunks in the history of a feed where new entries are tacked onto the end. 2) Search results, where the order of everything all along the entire chain shifts around all the time. If you're not going to reconstruct the whole thing, then your decision function for when to stop may have to be different depending on how things are ordered. BTW, case 2 destroys the idea of a fixed end and a live end. Having a means to indicate what the ordering is might make it easier to make the distinction between next and prev more intuitive. I'm not sure how else we're going to reconcile terminology for significance and temporally ordered feeds. 5. Is the issue of whether a feed is incremental or not (the fh:incremental element) relevant to this proposal? Non-incremental feeds wouldn't be paged, by definition, would they? 
This week's top ten on the first page, last week's ten on the second page... Since this proposal is defining a paging mechanism, I think that what each page represents is relevant. Is it an earlier part of the feed or an earlier state of the feed? 6. What to name the link relation that points to the active feed document? subscribe, subscription, self, something else? 'subscribe' I just noticed something about the definition of self in the format spec. In one place it says: o atom:feed elements SHOULD contain one atom:link element with a rel attribute value of self. This is the preferred URI for retrieving Atom Feed Documents representing this Atom feed. Does that mean that it's the preferred "subscription" URI, or the preferred place to retrieve "this chunk" of the feed history? The format spec didn't define paging, so it didn't
Re: Feed History -04
On Oct 17, 2005, at 10:04 AM, Antone Roundy wrote: 4. Is the order of the entries in a feed relevant to this proposal? ... 1) A chain of temporally ordered chunks in the history of a feed where new entries are tacked onto the end. 2) Search results, where the order of everything all along the entire chain shifts around all the time. If you're not going to reconstruct the whole thing, then your decision function for when to stop may have to be different depending on how things are ordered. BTW, case 2 destroys the idea of a fixed end and a live end. Having a means to indicate what the ordering is might make it easier to make the distinction between next and prev more intuitive. I'm not sure how else we're going to reconcile terminology for significance and temporally ordered feeds. Okay, I've got another idea--switch to totally generic terminology, a la:

end-a: the URI of the most significant, most current, prerequisite[1], etc., end of a sequence of documents, or a randomly selected end if there is no order.
end-b: the URI of the least significant, least current, or ...uh, postrequisite? end of a sequence of documents, or otherwise the opposite end from end-a.
a-ward: the URI of the document next closest to end-a in the sequence.
b-ward: the URI of the document next closest to end-b in the sequence.

If you have neither end-a nor end-b, then you should use b-ward to traverse out of the subscription document (ie. the subscription document in that case is assumed to be end-a). [1] if the sequence should be read first to last, for example, if it's a novel broken down into entries, end-a points to the place from which one should start. Which end is end-a and which is end-b is somewhat subjective. For example, in a temporally ordered feed, is it most important to read what's most current, or to understand the origins of the present first before reading what's most current? 
One more thing occurs to me--if this extension is going to be used to handle things like paging in search results, then it's not really feed history, it's paging.
Re: Feed History -04 -- is it history or paging or both?
If we're going to separate the concepts of history and paging, then the term history doesn't really apply to incremental feeds. In an incremental feed, all of the entries are part of the current state of the feed. We don't go back through history to find the present--we go to different pages of the present. In a non-incremental feed also, we may have multiple pages of current entries (eg. the top 100 DVDs in chunks of 10), or we may have just one. We also may preserve historical data (eg. the top 10 songs last week, the week before, etc.) So what we end up with might look like this: Any feed, whether incremental or not, MAY contain something like this (names chosen somewhat arbitrarily, with an eye toward avoiding excess conceptual baggage):

page-a - the URI of one end of a chain of documents representing one state of a feed resource (eg. the current state of an incremental feed)--it doesn't really matter which end it is
page-b - the other end of the chain of documents
page++ - the next farther page from page-a
page-- - the next closer page to page-a

Neither page-a nor page-b is necessarily fixed--the entire contents of the chain may shuffle around, be added to, be deleted from, etc., in the case of something like search results. A non-incremental feed MAY also contain something like this (history is temporal, so we can use temporally loaded terminology):

history-1 - a document containing a representation of one of the ends of or the entire temporally first historical state of the feed resource
history-n - a document containing a representation of one of the ends of or the entire temporally last (perhaps current and still changing) historical state of the feed resource
history++ - one of the ends or ... of the next more recent historical state... (moves toward history-n)
history-- - one of the ends ... of the next less recent historical state... 
(moves toward history-1) If you want to catch up on an incremental feed to which you're subscribed, or want to get the last month of an incremental feed to which you are newly subscribed, you look for page++ or page-- and follow whichever one the subscription document (which can only have one, since it's one of the ends) contains till you've got everything you want. If you start in the middle, you don't know which direction you're going...but since the ordering of the chain isn't defined, it's like the Cheshire cat says--it doesn't matter which direction you go if you don't know where you want to end up...or something like that. Perhaps convention could dictate that page-a be where the publisher subjectively thinks that a newcomer to the feed would be most likely to want to start reading. It wouldn't always be correct, but so what?
Re: Are Generic Link Relations Always a Good Idea? [was: Feed History -04]
On Oct 17, 2005, at 5:17 PM, Mark Nottingham wrote: They seem similar. But, what if you want to have more than one paging semantic applied to a single feed, and those uses of paging don't align? I.e., there's contention for prev/next? If no one shares my concern, I'll drop it... as long as I get to say I told you so if/when this problem pops up :) I share your concern. On 17/10/2005, at 3:21 PM, Thomas Broyer wrote: I don't think there are different concepts of paging. Paging is navigation through subsets (chunks) of a complete set of entries. Yeah, but what if you need what amounts to a multi-dimensional array? The method of addressing each dimension has to be distinguishable from the others. If the complete set represents all the entries ever published through an ever-changing feed document (what a feed currently is: you subscribe with a URI, and the document you get when dereferencing the URI changes as a sliding window upon a set of entries), then paging allows for feed state reconstruction. In other terms, feed state reconstruction is a facet of paging, an application to non-incremental feeds. Let's say you're doing a feed for the Billboard top 100 songs. Each week, the entire contents of the feed are swapped out and replaced by a new top 100 (ie. it is a non-incremental feed). And let's say you don't want to put all 100 in the same document, but you want to break it up into 4 documents with 25 entries each. You now have two potential axes that people might want to traverse--from songs 1-25 to 26-50 to 51-75 to 76-100, or from this week's 1-25 to last week's 1-25 to two weeks ago's 1-25, etc. You can't link in both directions with the same next. There are clearly two distinct concepts here--navigating through the chunks that make up the current state of the feed resource, and in a non-incremental feed, navigating through the historical states of the feed resource.
Re: New Link Relations? [was: Feed History -04]
On Oct 17, 2005, at 3:44 PM, Mark Nottingham wrote: On 17/10/2005, at 12:31 PM, James M Snell wrote: Debating how the entries are organized is fruitless. The Atom spec already states that the order of elements in the feed has no significance; trying to get an extension to retrofit order-significance into the feed is going to fail... just as I discovered with my Feed Index extension proposal. Here's what the spec says: This specification assigns no significance to the order of atom:entry elements within the feed. ...but there may be some. ...but there's no action you can take based on it unless something else tells you what the significance is. ...which, yes, is very difficult to specify. For the purposes of this discussion, it doesn't matter what the order of atom:entry elements within a feed document is. But the order of chunks of atom:entry elements within a linked series of feed documents may have significance, and in fact, unless you just want to reconstruct the complete feed state, working with a series of feed documents with no specific order would be fairly unwieldy. Imagine paging through a feed of search results with no idea of whether you'd just jumped from the most to the least significant results, or to the second most significant results. Imagine trying to catch up on a fast-moving incremental feed without having any idea whether a link would take you to the first entries ever added to a feed or the ones you just missed. I do believe that a last link relation would be helpful for completeness ...and last certainly seems to imply SOME sort of ordering of chunks, even if we know nothing about the order of the entries in each chunk. To each of the following, perhaps we could add something to indicate that these link relations are all used to page through the current state of a feed, and not to navigate among various states of a feed. 
The fact that most people wouldn't have a clue what that means without some discussion of incremental and non-incremental feeds may be an argument for having a spec document to provide more explanation (rather than embedding an identical explanation in each Description). Example: At any point in time, a feed may be represented by a series of Feed documents, each containing some of the entries that exist in the feed at that point in time. In other words, a feed may contain more entries than exist in the Feed document that one retrieves when dereferencing the subscription URI, and there may be other documents containing representations of those additional entries. The link relations defined in this specification are used to navigate between Feed documents containing pages or chunks of those entries which exist simultaneously within a feed. Note that this specification does not address navigation between the current and previous states of a type of feed which does not simultaneously contain its current and past entries. For example, a Top 100 Songs feed might at any point in time only contain entries for the top 100 songs for a single week, which entries may or may not be divided among a number of Feed documents. The entries for the top 100 songs from the previous week are not only no longer part of the Feed document or Feed documents representing the current state of the feed--they are no longer part of the feed at all. Another specification may describe a method of navigating between the current and previous states of such a feed. The link relations defined in this specification are only used to navigate between the various Feed documents representing any single state of such a feed. - Attribute Value: prev - Description: A stable URI that, when dereferenced, returns a feed document containing entries that sequentially precede those in the current document. 
Note that the exact nature of the ordering between the entries and documents containing them is not defined by this relation; i.e., this relation is only relative. - Expected display characteristics: Undefined. - Security considerations: Because automated agents may follow this link relation to construct a 'virtual' feed, care should be taken when it crosses administrative domains (e.g., the URI has a different authority than the current document). - Attribute Value: next - Description: A stable URI that, when dereferenced, returns a feed document containing entries that sequentially follow those in the current document. Note that the exact nature of the ordering between the entries and documents containing them is not defined by this relation; i.e., this relation is only relative. - Expected display characteristics: Undefined. - Security considerations: Because automated agents may follow this link relation to construct a 'virtual' feed, care should be taken when it crosses administrative domains (e.g., the URI has a different authority
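[Editor's note: the security consideration in both proposed registry entries reduces, at minimum, to comparing URI authorities before following a link automatically. A minimal sketch of such a check, using only the standard library; the function name is mine, not from any spec.]

```python
from urllib.parse import urlsplit

def crosses_admin_domain(current_uri, linked_uri):
    """True if the linked document has a different authority than the
    current one. Per the draft text, a client constructing a 'virtual'
    feed might require confirmation before following such a link."""
    return urlsplit(current_uri).netloc != urlsplit(linked_uri).netloc

same = crosses_admin_domain("http://example.com/feed?page=1",
                            "http://example.com/feed?page=2")
cross = crosses_admin_domain("http://example.com/feed",
                             "http://other.example.net/feed")
print(same, cross)  # False True
```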
Re: New Link Relations? [was: Feed History -04]
On Oct 17, 2005, at 10:17 PM, James M Snell wrote: When I think of next/prev I'm not thinking about any form of temporal semantic. I'm thinking about nothing more than a linked list of feed documents. If you want to add a temporal semantic into the picture, use a mechanism such as the Feed History incremental=true element. I don't think I expressed the point I wanted to make quite clearly enough, so let me try again. Chains of Feed documents are going to be ordered in some way, whether it's specified or not, whether they explicitly indicate it or not. For example, the chain of Feed documents representing an incremental feed is going to naturally be in temporal order. You're not going to be tacking on new entries willy nilly to whichever of the documents in the chain fits your fancy at the moment. You're going to create a new document when the one you were most recently adding entries to gets full, and then you're going to add entries there till that one is full, and so on. There may be exceptions, but by and large, whether the temporal order is explicit or not, that's what's going to happen. Chains of pages of search results feeds are going to naturally be ordered with the best matches on top. The point I was trying to make was that you're not going to create all the documents without links between them and then randomly assign links between them in no specific order. You're going to link between them in an order that makes sense within the context of how the feed was created. I don't know how client applications are going to adapt to deal with the difference between incremental feeds and, for example, search results feeds, but I can't imagine that client software isn't going to rely on there being some sort of sense to the order of the Feed documents. 
What I was trying to say further down with the example spec text I wrote was, let's state explicitly that this link relation does not have a temporal semantic, and if somebody wants a link relation with a temporal semantic, they should create another link/@rel value for it. In other words... In other words, this does not imply a feed history thing... ...let's have this be a link for navigating among the pages of the current state of the feed (whether it be incremental or not--noting that some non-incremental feeds will only have one page, and won't need it). The entries in the current state of the feed are not in any specific order (though we know that naturally they will be in some sort of order):

<feed>
  ...
  <link rel="next" href="..." />
</feed>

How does the following have anything to do with history? In an incremental feed, all of the entries, whether part of the Feed document at the subscription end or not, are part of the present state of the feed--they don't just exist back in history. History is for non-incremental feeds. I'm saying let's not work on navigation through history right now, but let's recognize that unless we say not to, people might try to use the mechanism designed for paging through the current state of a feed to navigate through the history of a feed too, so let's say not to. I understand (or at least suppose) that you don't think we need to say not to, because you don't see the harm in making the link relation more generic. I disagree. I think we're going to end up with a mess if we don't make it specifically for navigating the current state. this does...

<feed>
  ...
  <fh:incremental>true</fh:incremental>
  <link rel="next" href="..." />
</feed>
Re: Spec wording bug?
On Oct 14, 2005, at 5:43 AM, Danny Ayers wrote: I believe "the language of the resource" for hreflang makes no sense - it will be the *representations* that are associated with languages, and "the" implies a single language - there may be more than one. If content negotiation might be used to select from among different languages (ie. if multiple representations are available from the same URI), then perhaps the hreflang attribute should be omitted. Were we to have allowed multiple languages to be specified in the same hreflang attribute to cover such cases, the wording would be incorrect, but since we didn't, I think it's correct as it is.
Re: Feed History -04
On Oct 14, 2005, at 11:13 AM, Mark Nottingham wrote: On 14/10/2005, at 9:22 AM, Lindsley Brett-ABL001 wrote: I have a suggestion that may work. The issue of defining what is prev and next with respect to a time-ordered sequence seems to be a problem. How about defining the link relationships in terms of time - such as newer and older or something like that. That way, the collection returned should be either newer (more recent updated time) or older (earlier updated time) with respect to the current collection doc. A feed isn't necessarily a time-ordered sequence. Even a feed reconstructed using fh:prev (or a similar mechanism) could have its constituent parts generated on the fly, e.g., in response to a search query. The OpenSearch case mentioned by Thomas is what convinced me that terms related to temporal ordering aren't appropriate (what a pity, since newer and older are the perfect terms for time-ordered sequences of feed documents!). Previous and next suffer from the fact that they could easily be interpreted differently in different use cases. For example, for OpenSearch results pages, clearly prev points to the search results that come up on top and next to the lower results. But in a conventional syndication feed, next could easily be taken to mean either the next batch of entries as you track back towards the beginning of time from where you started (which is usually going to be the growing end of the feed), or a batch of entries containing the entries that were published next after the ones in this batch. I'd have to look at the document to remind myself of which next means, because either makes just as much sense to me. Which brings me back to top, bottom, up and down. In the OpenSearch case, it's clear at which end the top results are going to be found. In the syndication feed case, the convention is to put the most recent entries at the top. If you think of a feed as a stack, new entries are stacked on top. 
The fact that these terms are less generic and flexible than previous and next is both an advantage and a disadvantage. I think the question is whether it's an advantage in a significant majority of cases or not. What orderings would those terms not work well for? Antone
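Whatever names the paging relations end up with, the loop a consumer runs to reconstruct a logical feed from its documents is the same. Here is a minimal sketch of that loop; the `fetch` callable and the dict-based document representation are illustrative stand-ins (not from any draft), with `"prev"` standing in for whichever relation points toward older documents.

```python
# Sketch: reconstructing a logical feed by walking prev-style links.
# `fetch` stands in for an HTTP GET returning a parsed feed document.

def reconstruct(feed_url, fetch):
    """Collect entries from a feed document and all of its prior documents."""
    entries, seen = [], set()
    url = feed_url
    while url and url not in seen:   # guard against link cycles
        seen.add(url)
        doc = fetch(url)
        entries.extend(doc["entries"])
        url = doc.get("prev")        # relation pointing at the older document
    return entries

# Tiny in-memory example standing in for three feed documents:
docs = {
    "/feed": {"entries": ["e5", "e4"], "prev": "/feed?page=2"},
    "/feed?page=2": {"entries": ["e3", "e2"], "prev": "/feed?page=3"},
    "/feed?page=3": {"entries": ["e1"]},
}
print(reconstruct("/feed", docs.__getitem__))
```

Note that nothing in the loop depends on temporal ordering, which is the point made above: the relation only needs to say "the document before this one," whatever the entries' timestamps are.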
Re: Feed History -04
On Oct 14, 2005, at 11:28 AM, Thomas Broyer wrote: Mark Nottingham wrote: How about: <atom:link rel="subscription" href="..." /> ? I always thought this was the role of @rel=self to give the URI you should subscribe to, though re-reading the -11 it deals with a resource equivalent to the containing element. That's what some of us wanted it to be and thought it was intended to be. The language that made it into the spec certainly falls short of expressing what was in PaceFeedLink, which is the proposal that added @rel=self [1]. 1. Isn't a resource equivalent to the containing element the same as an alternate version of the resource described by the containing element? That's how I would read that language knowing nothing of the history of that part of the spec. I think some people intended equivalent to mean it may not be a different copy of the same bits, but whatever it is, it contains the same bits (or at least the same code points, if it happens to be transcoded). 2. If the answer to 1 is no, then what does "a resource equivalent …" mean? Is it really different than the URI you should subscribe to (at least if @type=application/atom+xml)? I think what some people want that to mean is here's a place you could get the feed, but I'm not making any assertions regarding whether it's preferable to get it from there or somewhere else. [1] http://www.imc.org/atom-syntax/mail-archive/msg15062.html
Re: more than one content element?
On Oct 13, 2005, at 12:06 PM, A. Pagaltzis wrote: * John Panzer [EMAIL PROTECTED] [2005-10-13 19:40]: Well, you can pass them around by reference with [EMAIL PROTECTED] I think. By the letter of the spec, but not by the spirit. I just ran through the discussion of this very question on the mailing list[1], and though it looks like allowing composite types in remote content had pretty good support, that doesn't appear to have been translated into a Pace, and obviously, no language specifically allowing it got into the spec document. Thus, it looks like the prohibition from section 4.1.3.1 stands, and that you're right that the only way you could do it without breaking the rules outright would be by ignoring the SHOULD (have content/@type when using content/@src), which would certainly be contrary to the spirit of the spec as it stands. [1] http://www.imc.org/atom-syntax/mail-archive/msg15949.html
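For reference, the section 4.1.3 constraints being discussed can be sketched as a small validity check. The dict-based representation of atom:content's attributes is an assumption for illustration; the rules themselves (content with @src must be empty, SHOULD carry @type, and @type must be a MIME media type rather than a composite type like "xhtml") are from RFC 4287.

```python
# Sketch of an RFC 4287 section 4.1.3 check for atom:content with @src.
# `attrs` is an illustrative dict of the element's attributes; `text` is
# the element's character content.

def check_remote_content(attrs, text):
    problems = []
    if attrs.get("src"):
        if text.strip():
            problems.append("content with @src must be empty")
        t = attrs.get("type")
        if t is None:
            problems.append("SHOULD have @type when @src is present")
        elif t in ("text", "html", "xhtml"):
            problems.append("composite @type not allowed with @src")
    return problems

print(check_remote_content({"src": "http://example.com/x", "type": "xhtml"}, ""))
```

This is exactly the combination the thread concludes is off-limits: remote (@src) content whose type is one of the composite tokens rather than a media type.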
Re: Feed History -04
On Oct 13, 2005, at 7:58 PM, Eric Scheid wrote: On 14/10/05 9:18 AM, James M Snell [EMAIL PROTECTED] wrote: Excellent. If this works out, there is an opportunity to merge the paging behavior of Feed History, OpenSearch and APP collections into a single set of paging link relations (next/previous/first/last). 'first' or 'start'? Do we need to define what 'first' means though? I recall a dissenting opinion on the wiki that the 'first' entry could be at either end of the list, which could surprise some. Yeah, that's a good question. Maybe calling them top and bottom would work better. Considering that the convention is to put the newest entry at the top of a feed document, top might be more intuitively understandable as being the new end. You might also rename next and previous (or is it previous and next?) to down and up. There's SOME chance of that getting confused with hierarchical levels, but I could live with that.
Re: Straw Poll: age:expires vs. ...... plus a gazillion words
Gh! Sorry about the mile long subject. Gotta be careful with that copy and paste!
Re: Straw Poll: age:expires vs. dcterms:valid (was Re: Unofficial last call on draft-snell-atompub-feed-expires-04.txt)
Oops, sent this from the wrong address on Saturday. No wonder it didn't get through. On Oct 8, 2005, at 8:37 AM, James M Snell wrote: I wanted to indicate that a given entry must expire at Midnight on Dec, 12, 2005 (GMT). using age:expires: [snip] using dcterms:valid (http://web.resource.org/rss/1.0/modules/dcterms/#valid):

<entry>
  <dcterms:valid>end:2005-12-12T00:00:00Z</dcterms:valid>
</entry>

Advantage:
* Existing namespace, known element
Disadvantage:
* Value can be many different things. I've even seen cases in which the content of dcterms:valid is an XML structure.

My chief problem with dcterms:valid (and with dublin core in general) is that the elements are very loosely defined. The content can literally be anything folks want it to be and still be considered valid. Unless we constrain the value space for this element when used in Atom, it *could* lead to a bunch of extra work for consumers to parse and process those dates. I prefer very crisply defined elements. Then again, reusing an existing namespace is Goodness. I think it would be going too far to say when using dcterms:valid in Atom, you must follow this profile, because we don't own dcterms, and doing so might limit people from doing valid things with it that don't follow that profile. But I do think it would be reasonable to say when using dcterms:valid in Atom, it is recommended that you follow this profile--otherwise your data may be technically valid, but not widely understood, thus giving developers an excuse for not supporting data not formatted according to that profile. If a use case that requires a different format becomes common, then developers can start supporting more formats at that point. That said, my vote is for doing what I just said--advocate the use of dcterms:valid for this purpose, with the date formatted to match Atom's date construct profile.
BTW, you might choose language that leaves room for having both start and end dates for validity--for example, to enable Atom delivery of a coupon that's valid for a particular span of dates.
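A sketch of what consuming such a constrained profile might look like, assuming a simple semicolon-separated "start=...; end=..." value syntax with RFC 3339 dates (the key names and separator here are illustrative choices for the proposed profile, not something dcterms itself mandates):

```python
# Sketch: parsing a constrained dcterms:valid value of the assumed form
#   "start=2005-12-01T00:00:00Z; end=2005-12-12T00:00:00Z"
# Either key may be absent; missing keys come back as None.

def parse_valid(value):
    parts = {}
    for piece in value.split(";"):
        if "=" in piece:
            key, _, val = piece.strip().partition("=")
            parts[key] = val
    return parts.get("start"), parts.get("end")

print(parse_valid("start=2005-12-01T00:00:00Z; end=2005-12-12T00:00:00Z"))
```

Allowing both keys covers the coupon-valid-for-a-span use case above, while an end-only value covers plain expiration.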
Re: ACE - Atom Common Extensions Namespace
On Oct 2, 2005, at 11:15 PM, Mark Nottingham wrote: I think this is a well-intentioned effort, but at the wrong end of the process. The market (i.e., users and implementors) should have a go at sorting out at what's common/prevalent enough to merit this sort of thing; having a co-ordinated namespace will lead to the problem of what to lump into it, how to version individual extensions within it, etc. I have to agree with Mark. Consider this scenario: an extension gets added to ACE. Someone makes an extension that does the same thing differently. The market prefers the non-ACE method and adopts it more widely than the ACE solution. Now not only do you have multiple namespaces to declare, but one of them has a bunch of elements that don't get used, yet implementors feel compelled to implement them because they're part of this special namespace. Here's another scenario: an extension gets added to ACE, and another extension gets created that does the same thing better. Because the first has the ACE stamp of approval, the inferior method gets wide support, and the superior method dies. Both scenarios suggest that the market should be given time to choose best practices rather than some group deciding which practices are going to get special status in advance. If a feed is going to carry elements from a bunch of different extensions, it's going to be a relatively heavy feed anyway. The overhead of including multiple namespace declarations isn't going to be that great.
Re: FYI: Updated Index draft
On Wednesday, September 21, 2005, at 11:43 PM, James M Snell wrote:

<feed xmlns:i="urn:ranking">
  <i:domain>{domain}</i:domain>

I was thinking yesterday of suggesting that feed/id be used the way you're using i:domain. Which is better is probably a matter of whether ranking domains that span multiple feeds will be useful or not. In the movie ratings use case presented below, perhaps rather than a fivestars scheme and netflix and amazon domains, it might make more sense to do this:

<feed>
  <id>urn:my_reviews</id>
  <i:order scheme="urn:netflix.com/reviews" label="Netflix rating">descending</i:order>
  <i:order scheme="urn:amazon.com/reviews" label="Amazon rating">descending</i:order>
  <entry>
    <id>Movie A</id>
    <i:rank scheme="urn:netflix.com/reviews">3</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">4</i:rank>
  </entry>
  <entry>
    <id>Movie B</id>
    <i:rank scheme="urn:netflix.com/reviews">2</i:rank>
    <i:rank scheme="urn:amazon.com/reviews">1</i:rank>
  </entry>
</feed>

Notes:
* The i:order element tells the user agent whether higher or lower numbers are considered better, higher priority, first, or whatever. In these cases, higher numbers are better, so would typically be shown first, so they're considered descending schemes.
* i:order/@label indicates a human readable label for the scheme, and could be optional.
* Since the urn:(netflix|amazon).com/reviews schemes are feed independent, it is not necessary to indicate a feed (or domain) in this case.
* For a feed-specific scheme, like natural order, the feed ID would be included like this (so that if these entries were aggregated, it would be clear that the i:order elements were relevant to the source feed, not the aggregate feed):

<feed>
  <id>urn:my_feed</id>
  <i:order scheme="urn:index">ascending</i:order>
  <entry>
    <id>urn:my_feed/a</id>
    <i:rank scheme="urn:index" feed="urn:my_feed">1</i:rank>
  </entry>
  <entry>
    <id>urn:my_feed/b</id>
    <i:rank scheme="urn:index" feed="urn:my_feed">2</i:rank>
  </entry>
</feed>

If sticking with i:domain, I'd recommend that you recommend that in cases where a ranking domain does not span multiple feeds, the feed/id value be used for the value of i:domain, and that in all cases, the same care be taken to (attempt to) ensure that i:domain's value is unique to what is intended to be a particular domain.
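To make the i:order/i:rank semantics concrete, here is a sketch of how a consumer might sort aggregated entries by a given scheme, honoring the scheme's declared direction. The element names mirror the draft being discussed; the dict structures are illustrative stand-ins for parsed XML.

```python
# Sketch: sorting entries by an i:rank-style value for one scheme,
# honoring an i:order-style direction ("ascending" or "descending").
# Entries lacking a rank in that scheme are left out of the sorted list.

def sort_by_scheme(entries, scheme, order="ascending"):
    ranked = [e for e in entries if scheme in e["ranks"]]
    return sorted(ranked, key=lambda e: e["ranks"][scheme],
                  reverse=(order == "descending"))

entries = [
    {"id": "Movie A", "ranks": {"urn:netflix.com/reviews": 3}},
    {"id": "Movie B", "ranks": {"urn:netflix.com/reviews": 2}},
]
# "descending" because higher ratings should be shown first:
top = sort_by_scheme(entries, "urn:netflix.com/reviews", "descending")
print([e["id"] for e in top])
```

This is why the direction needs to travel with the scheme declaration rather than being guessed per reader: the same numeric values sort oppositely for a star rating versus a natural-order index.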
Re: FYI: Updated Index draft
On Thursday, September 22, 2005, at 10:20 AM, James M Snell wrote: Antone Roundy wrote: I was thinking yesterday of suggesting that feed/id be used the way you're using i:domain. Which is better is probably a matter of whether ranking domains that span multiple feeds will be useful or not. In the movie ratings use case presented below, perhaps rather than a fivestars scheme and netflix and amazon domains, it might make more sense to do this: Using atom:id as the ranking domain would limit the ranking to a single feed which is useful, but does not cover the full range of cases. ... Yes, there are two special cases here: 1. Lack of an i:domain 2. i:domain value that is a same document reference I think a ranking without a domain is pretty much useless--or at least likely to lead to problems downstream--so that case doesn't need to be covered. More on that below.

<xhtml:html>
  ...
  <xhtml:body>
    <atom:feed>
      <atom:id>Feed1</atom:id>
      <i:domain>#</i:domain> <!-- document ranking domain -->
      <atom:entry>
        <atom:id>A</atom:id>
        <i:rank scheme="priority">50</i:rank>
        <i:rank scheme="priority" domain="#">20</i:rank>
      </atom:entry>
      <atom:entry>
        <atom:id>B</atom:id>
        <i:rank scheme="priority">25</i:rank>
        <i:rank scheme="priority" domain="#">40</i:rank>
      </atom:entry>
    </atom:feed>
    <atom:feed>
      <atom:id>Feed2</atom:id>
      <i:domain>#</i:domain> <!-- document ranking domain -->
      <atom:entry>
        <atom:id>C</atom:id>
        <i:rank scheme="priority">50</i:rank>
        <i:rank scheme="priority" domain="#">30</i:rank>
      </atom:entry>
      <atom:entry>
        <atom:id>D</atom:id>
        <i:rank scheme="priority">25</i:rank>
        <i:rank scheme="priority" domain="#">10</i:rank>
      </atom:entry>
    </atom:feed>
  </xhtml:body>
</xhtml:html>

In this example, the domainless rankings were added when the XHTML document was created, right? So the XHTML document is essentially an aggregate feed, just not in Atom format. Would it not make as much or more sense to mint an ID for the document (call it the ID of a virtual Atom Feed Document if you don't actually create an aggregate feed) and use it to scope those i:rank elements?
If, somehow, someone were to pull the atom:feeds out of the XHTML document (if atom:feed getting embedded into xhtml:body is going to happen, then is not atom:feed getting extracted from xhtml:body also likely?) and aggregate them with other feeds with domainless i:rank elements, the scopes of those elements would get mixed. * Since the urn:(netflix|amazon).com/reviews schemes are feed independent, it is not necessary to indicate a feed (or domain) in this case. * For a feed-specific scheme, like natural order, the feed ID would be included like this (so that if these entries were aggregated, it would be clear that the i:order elements were relevant to the source feed, not the aggregate feed): The goal of @scheme is to identify the type of ranking to apply while the goal of @domain is to identify the scope of the ranking. I do not believe that it is a good idea to conflate the two. Okay, I've come to agree with that while writing and editing this message. Note however that fivestar also indicates multiple things: 1) Higher numbers are better 2) The range is 0 to 5 (BTW, if this is limited to integers, how will you handle things like 3.5 stars, which are common in that type of rating system? Maybe decimal values need to be allowed.) 3) Hint: you might want to display the value as stars #1 is the only one needed for sorting of entries. #2 would be useful if the feed reader wanted to display some sort of graphical element to indicate the ranking. #3 might be slightly useful, but except for the most popular schemes, would probably be ignored. Perhaps all of these should be separated, a la:

<i:ranking-scheme label="Amazon rating" order="descending"
    min-value="0" max-value="5" symbol="stars"
    domain="urn:amazon.com/customer-rating" />
...
<entry>
  <i:rank domain="urn:amazon.com/customer-rating">3</i:rank>
  ...

...where @domain is the feed/id of the feed if there's just one feed in scope, or a value that won't be duplicated by any feed/id otherwise (if one can mint a unique feed id, surely one can also mint a unique id that won't be used for a feed). I'd suggest that i:ranking-scheme/@domain either default to the containing feed/id (or the one from atom:source, if it exists) or be required, i:rank/@domain be required, @order default to ascending, @min-value default to 0, and the rest of the attributes be optional with no defaults.
Re: Don't Aggregrate Me
On Monday, August 29, 2005, at 10:12 AM, Mark Pilgrim wrote: On 8/26/05, Graham [EMAIL PROTECTED] wrote: (And before you say but my aggregator is nothing but a podcast client, and the feeds are nothing but links to enclosures, so it's obvious that the publisher wanted me to download them -- WRONG! The publisher might want that, or they might not ... So you're saying browsers should check robots.txt before downloading images? ... Normal Web browsers are not robots, because they are operated by a human, and don't automatically retrieve referenced documents (other than inline images). As has been suggested, to inline images, we need to add frame documents, stylesheets, Java applets, external JavaScript code, objects such as Flash files, etc., etc., etc. The question is, with respect to feed readers, do external feed content (<content src="..." />), enclosures, etc. fall into the same exceptions category or not? If not, then what's the best mechanism for telling feed readers whether they can download them automatically--robots.txt, another file like robots.txt, or something in the XML? I'd prefer something in the XML. A possibility:

<feed>
  <ext:auto-download target="enclosures" default="false" />
  <ext:auto-download target="content" default="true" />
  ...
  <entry>
    <link rel="enclosure" href="..." ext:auto-download="yes" />
    <content src="..." ext:auto-download="0" />
    ...
Re: Don't Aggregrate Me
On Monday, August 29, 2005, at 10:39 AM, Antone Roundy wrote:

<ext:auto-download target="enclosures" default="false" />

More robust would be:

<ext:auto-download target="link[@rel='enclosure']" default="false" />

...enabling extension elements to be named in @target without requiring a list of @target values to be maintained anywhere.
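The decision logic a feed reader would apply under this proposal is simple: a feed-level default per target, overridable on the individual link or content element. A minimal sketch, assuming the element and attribute names from the example above (the set of truthy spellings is an assumption, since the example mixes "false"/"true" with "yes"/"0"):

```python
# Sketch of the proposed ext:auto-download logic: feed-level defaults
# keyed by target, with an optional per-element override attribute.

TRUE_VALUES = {"true", "yes", "1"}

def may_auto_download(target, defaults, override=None):
    """Return True if the client may fetch this item without user action."""
    if override is not None:                 # per-link/content attribute wins
        return override.lower() in TRUE_VALUES
    return defaults.get(target, "false").lower() in TRUE_VALUES

defaults = {"enclosures": "false", "content": "true"}
print(may_auto_download("enclosures", defaults))         # feed-level default
print(may_auto_download("enclosures", defaults, "yes"))  # per-link override
```

Defaulting an unknown target to "false" is a conservative choice, matching the don't-download-unless-invited spirit of the thread.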
Re: Don't Aggregrate Me
On Friday, August 26, 2005, at 04:39 AM, Eric Scheid wrote: On 26/8/05 3:55 PM, Bob Wyman [EMAIL PROTECTED] wrote: Remember, PubSub never does anything that a desktop client doesn't do. Periodic re-fetching is a robotic behaviour, common to both desktop aggregators and server based aggregators. Robots.txt was established to minimise harm caused by automatic behaviour, whether by excluding non-idempotent URL, avoiding tarpits of endless dynamic links, and such forth. While true that each of these scenarios involve crawling new links, the base principle at stake is to prevent harm caused by automatic or robotic behaviour. That can include extremely frequent periodic re-fetching, a scenario which didn't really exist when robots.txt was first put together. I'm with Bob on this. If a person publishes a feed without limiting access to it, they either don't know what they're doing, or they're EXPECTING it to be polled on a regular basis. As long as PubSub doesn't poll too fast, the publisher is getting exactly what they should be expecting. Any feed client, whether a desktop aggregator or aggregation service, that polls too fast (extremely frequent re-fetching above) is breaking the rules of feed consuming etiquette--we don't need robots.txt to tell feed consumers to slow down.
Re: Feed History: stateful - incremental?
On Wednesday, August 24, 2005, at 04:07 PM, Mark Nottingham wrote: Just bouncing an idea around; it seems that there's a fair amount of confusion / fuzziness caused by the term 'stateful'. Would people prefer the term 'incremental'? I.e., instead of a stateful feed, it would be an incremental feed; fh:stateful would become fh:incremental. Worth it? I think it's worth seeing if a term can be found that has a more intuitively understandable meaning. It might be helpful to explore the kinds of names that describe non-stateful feeds too--if a better term can be found for that, it could be used instead (and just reverse true and false). Brainstorming a little: Stateful: sliding window, most recent segment, segment, stream, entry stream, appendable, appending, augmentable, augmenting Non-stateful: uh...stateful? (what you just downloaded represents the current state of the entire feed), current state, current, snapshot, fixed entry, set entry, replaceable, replacing, entry replacing, non-appending, non-augmenting
Re: Don't Aggregrate Me
On Thursday, August 25, 2005, at 12:25 AM, James M Snell wrote: Up to this point, the vast majority of use cases for Atom feeds is the traditional syndicated content case. A bunch of content updates that are designed to be distributed and aggregated within Feed readers or online aggregators, etc. But with Atom providing a much more flexible content model that allows for data that may not be suitable for display within a feed reader or online aggregator, I'm wondering what the best way would be for a publisher to indicate that a feed should not be aggregated? For example, suppose I build an application that depends on an Atom feed containing binary content (e.g. a software update feed). I don't really want aggregators pulling and indexing that feed and attempting to display it within a traditional feed reader. What can I do? In that particular use case, I'd expect entries something like this:

<entry>
  ...
  <title>Patch for MySoftware</title>
  <summary>This patch updates MySoftware version 1.0.1 to version 1.0.2</summary>
  <content type="[...whatever goes here]">k3jafidf8adf...</content>
</entry>

Looking at this, my thoughts are: 1) Feed readers that can't handle the content type are just going to display the summary or title anyway, so it's not going to hurt anything. 2) People whose feed readers can't handle the patches probably aren't going to subscribe to this feed anyway. Instead they'll subscribe to your other feed (?) which gives them a link to use to download the patch:

<entry>
  ...
  <title>Patch for MySoftware</title>
  <summary>This patch updates MySoftware version 1.0.1 to version 1.0.2</summary>
  <link rel="[???]" type="[???]" href="..." />
</entry>

I don't think we need anything special to tell aggregators to beware content that they don't know how to handle in this feed. That should be marked clearly enough by @type. More in a separate message...
Re: Don't Aggregrate Me
On Thursday, August 25, 2005, at 08:16 AM, James M Snell wrote: Good points but it's more than just the handling of human-readable content. That's one use case but there are others. Consider, for example, if I was producing a feed that contained javascript and CSS styles that would otherwise be unwise for an online aggregator to try to display (e.g. the now famous Platypus prank... http://diveintomark.org/archives/2003/06/12/how_to_consume_rss_safely). Typically aggregators and feed readers are (rightfully) recommended to strip scripts and styles from the content in order to reliably display the information. But, it is foreseeable that applications could be built that rely on these types of mechanism within the feed content. For example, I may want to create a feed that provides the human interaction for a workflow process -- each entry contains a form that uses javascript for validation and perhaps some CSS styles for formatting. For that, you'd either need to use a less sophisticated feed reader that didn't strip anything out (and only use it to subscribe to fully trusted feeds, like internal feeds), or a more sophisticated feed reader that allowed you to turn off the stripping of potentially dangerous stuff, or to configure exactly what was, or better yet, wasn't, stripped (perhaps on a feed-by-feed basis). The stripping-or-not behavior should be controlled from the client side, so I don't see any point in providing a mechanism for the publisher to provide hints about whether or not to strip things out. That would probably only benefit malicious publishers at the expense of brain-dead clients:

<entry>
  ...
  <ext:keep-potentially-dangerous-stuff>true</ext:keep-potentially-dangerous-stuff>
  <content>
    ... <script> ... TriggerExploitThatErasesDrive('C:'); </script>
  </content>
</entry>
Re: Don't Aggregrate Me
I can see reasonable uses for this, like marking a feed of local disk errors as not of general interest. This is not published data - http://www.spacekdet.com/pipe/ Security by obscurity^H^H^H^H^H^H^H^H^H saying please - http://www-cs-faculty.stanford.edu/~knuth/ (see the second link from the bottom) This certainly wouldn't be useful as a security measure. But yeah, a way to tell the big republishing aggregators that you'd prefer they didn't republish the feed could be useful, in case they somehow got ahold of the URL of a non-sensitive (and thus non-encrypted and non-authentication-protected), but not-intended-for-public-consumption feed. Ideally though, such feeds should probably be password protected, since that wouldn't require aggregator support for an extension element.
Re: Don't Aggregrate Me
On Thursday, August 25, 2005, at 03:12 PM, Walter Underwood wrote: I would call desktop clients clients not robots. The distinction is how they add feeds to the polling list. Clients add them because of human decisions. Robots discover them mechanically and add them. So, clients should act like browsers, and ignore robots.txt. How could this all be related to aggregators that accept feed URL submissions? I'd imagine the desired behavior is the same as for crawlers--should they check for robots.txt at the root of any domain where a feed is submitted? How about cases where the feed is hosted on a site other than the website that it's tied to (for example, a service like FeedBurner) so some other site's robots.txt controls access to the feed (...or at least tries to)? We've already rejected the idea of trying to build DRM into feeds--is there some way to sidestep the legal complexities and problems that would arise from trying to do that, and at the same time put machine readable statements into the feed about what the publisher wants to allow others to do with it, and things they want to prohibit? If we're not qualified to design an extension to do that, is there someone else who is qualified, and who cares enough to do it?
Re: If you want Fat Pings just use Atom!
On Monday, August 22, 2005, at 09:54 PM, A. Pagaltzis wrote: * Martin Duerst [EMAIL PROTECTED] [2005-08-23 05:10]: Well, modulo character encoding issues, that is. An FF will look differently in UTF-16 than in ASCII-based encodings. Depends on whether you specify a single encoding for all entries at the HTTP level or not. For this application, I would do just that, in which case, as a bonus, non-UTF-8 streams would get to avoid resending the XML preamble over and over and over. Of course, if you do that, you won't be able to keep signatures for entries originally published in an encoding other than the one you've chosen. If one were to want to signal an encoding change mid-stream, how might that work with what's been proposed thus far?
Re: Comments Draft
On Sunday, July 31, 2005, at 10:24 AM, A. Pagaltzis wrote: * Antone Roundy [EMAIL PROTECTED] [2005-07-31 01:15]: I could add more, but instead, here's my suggestion for replacing that sentence: If the resource being replied to is an atom:entry, the value of the href attribute MUST be the atom:id of the atom:entry. If the value of the type attribute is application/atom+xml then the href attribute MUST be the (URL/URI/IRI) of an Atom Entry Document containing the atom:entry being replied to. This undermines the purpose of the link. I'd say that not being able to tell whether @href in link[@rel='in-reply-to'] is dereferencable or not is what undermines the link. The primary purpose of link[@rel='in-reply-to'] is to identify the resource (which may be an atom:entry) being replied to. If that resource is an atom:entry, then the appropriate identifier for it is its atom:id. If "If the resource being replied to is an atom:entry, the value of the href attribute MUST be the atom:id of the atom:entry" doesn't sound like a good rule, then I'd argue that using atom:link to identify the resource being replied to is a bad idea. As I've said before, I think that stuffing data that happens to be a URI but may not be dereferencable into link/@href is a bad idea. If we ARE going to do it, then I think we need a way to at least hint at whether it's a dereferencable link or some other data stuffed into a link element. Here's what the spec says @type is for: On the link element, the type attribute's value is an advisory media type; it is a hint about the type of the representation that is expected to be returned when the value of the href attribute is dereferenced. If @href isn't dereferencable, then the existence of @type is deceptive. I suppose it could mean when I saw it, it was in some kind of Atom document, but so what? What if the feed gets converted to RSS 2.0, the atom:id is put into guid, and I find the entry in the RSS feed?
Atom Entry Documents can move around; their IDs are eternal. True, so you could just omit @type from this link if you're concerned that your entry document might move. Or we could go with something like this:

<ext:in-reply-to id="...">
  <atom:link rel="found-in-entry-document" href="..." />
  <atom:link rel="found-in-feed-document" href="..." />
</ext:in-reply-to>

Or we could just stick with what has been proposed, perhaps including what I proposed in my last message, and if the entry document moves, then oh well, the web has another broken link just as it would in what I proposed just above here or in any case where a dereferencable link was published, but the atom:id would still be valid. If after moving the entry document, one were to publish the in-reply-to link again, it would be appropriate to remove the @type attribute. ...okay, that last sentence suggests that what I propose just above here is superior to having possibly-dereferencable atom:links, because you could update the found-in-entry-document link if it got out of sync with the location of the document. Otherwise, we'll have to be limited to linking to the feed in which the entry is found.
Re: Proposed changes for format-11
On Monday, August 1, 2005, at 09:55 AM, A. Pagaltzis wrote: * Robert Sayre [EMAIL PROTECTED] [2005-08-01 17:25]: On 8/1/05, Sam Ruby [EMAIL PROTECTED] wrote: Perhaps the following could be added to section 6.2: The Atom namespace is reserved for future forwards-compatable revisions of Atom. s/compatable/compatible/ Sounds OK to me, but I recall squawking about this. There wasn’t any squawking about the rule as such, I think. A minor amount of squawking was about what a consumer should do when it encounters Atom-namespaced elements in locations it didn’t expect them. Per spec: it should simply treat them as unknown foreign markup. Intent: this allows old consumers to continue working with future revisions of the spec, so long as changes are not so drastic that a new namespace is warranted to prevent existing consumers from doing anything with new documents. It sounds to me like we might benefit from adding language specifying that elements in the Atom namespace can appear as children of elements from other namespaces, but may not appear as children of elements in the Atom namespace except as specified by the spec (or from wording the language to be added so that it says that). ...I am correct about our intent to allow Atom elements to be used as children of extension elements, right? For example, that one should be able to do this:

<foo:bar qwerty="asdf">
  <atom:title>My title</atom:title>
  <atom:link rel="foo:my-rel" href="..." />
</foo:bar>

...rather than having to do this:

<foo:bar qwerty="asdf">
  <foo:title>My title</foo:title>
  <foo:link rel="foo:my-rel" href="..." />
</foo:bar>

...right?
Re: Comments Draft
On Saturday, July 30, 2005, at 02:38 PM, A. Pagaltzis wrote: * James M Snell [EMAIL PROTECTED] [2005-07-30 18:10]: Yeah, source is likely the most logical choice, but I didn't want to confuse folks with a link @rel=source that has a different meaning from atom:source. An argument by way of which I came around to Antone’s suggested “start-of-thread,” though I was going to suggest “thread-start.” I took a look at the draft to verify whether I correctly understood what this link points to, and I think it isn't what I originally thought based on the old name root. Does this point to the feed in which the immediate parent entry was found, or to the feed in which the first entry in a thread of replies was found? If the former, which the draft seems to suggest, and which seems more useful, then start-of-thread and thread-start probably aren't such good names after all. With clarity in mind, in-reply-to-feed might be good, though it's a bit long. And a problem comes to mind: if you have multiple in-reply-to links, how do you relate those to their respective in-reply-to-feed links (in case they're different)? Is it possible? Dare we do something like this? (Do we wish to, if we dare?)

<link rel="in-reply-to" ...>
  <link rel="in-reply-to-feed" ... />
</link>

Pro:
* Groups the two links together
* Gives us more options for what to call the inside one without creating confusion: source-feed, for example. It would be nice to choose a name that's not likely to be the perfect name for some other use, or to define this @rel value broadly enough to be applicable to other purposes.
Con:
* Puts an atom:link in a location not expected by apps that don't understand this extension.
Re: Comments Draft
On Saturday, July 30, 2005, at 04:37 PM, James M Snell wrote: One challenge is that for anything besides references to Atom entries, there is no guarantee that in-reply-to links will be non-traversable. For instance, if someone were to go and define a behavior for using in-reply-to with RSS, the href of the link may point to the same URL that the RSS item's link element points to (given that there is no way to uniquely identify an RSS item).

<link rel="in-reply-to" type="text/html" href="http://www.example.com/entries/1" />

This is legal in the spec but is left undefined. The natural choice of values when replying to an RSS 2.x item would be the guid, since it's the closest counterpart to atom:id. But if the guid is not a permalink (i.e. not dereferencable), then it won't have a MIME type, just as non-dereferencable atom:id's don't have a MIME type. Both of these facts suggest that the following sentence should probably be removed from section 3: If the type attribute is omitted, it's value is assumed to be application/atom+xml. Instead, I'd suggest stating that if the type attribute is omitted, the in-reply-to link cannot be assumed to be dereferencable, and that non-dereferencable links MUST NOT have a type attribute. Editorial notes about this sentence: A type attribute value of application/atom+xml indicates that the resource being responded to is an atom:entry and that the href attribute MUST specify the value of the parent entries atom:id element. 1) parent probably isn't the best word here since in-reply-to isn't being defined in terms of parents and children. 2) "entries" should be "entry's" I could add more, but instead, here's my suggestion for replacing that sentence: If the resource being replied to is an atom:entry, the value of the href attribute MUST be the atom:id of the atom:entry. If the value of the type attribute is application/atom+xml then the href attribute MUST be the (URL/URI/IRI) of an Atom Entry Document containing the atom:entry being replied to.
Anything else could lead to inconsistencies. For example, when replying to an atom:entry that can be found in an Atom Entry Document, but whose atom:id does NOT point to that document, there would be multiple choices available for the reply link's href attribute.
Re: Comments Draft
On Friday, July 29, 2005, at 02:41 PM, A. Pagaltzis wrote: * Antone Roundy [EMAIL PROTECTED] [2005-07-29 02:40]: On Thursday, July 28, 2005, at 05:58 PM, James M Snell wrote: root is now called replies-source... which is a horrible name but I'm not sure what else to call it How about start-of-thread. Or maybe “parent-entries?” How about mother-of-all-entries? Ha ha. The problem with parent-entries is that this link may not be pointing to the immediate parent, right?
Re: Comments Draft
On Thursday, July 28, 2005, at 05:58 PM, James M Snell wrote: * root is now called replies-source... which is a horrible name but I'm not sure what else to call it How about start-of-thread.
Re: I-D ACTION:draft-nottingham-atompub-feed-history-01.txt
On Wednesday, July 20, 2005, at 11:44 AM, Thomas Broyer wrote: I was actually wondering why non-stateful feeds couldn't have archives: in the This month's Top 10 records feed, why couldn't I link to Last month's Top 10 records? If this kind of link is not dealt with in feed-history, then I suggest splitting the draft into two (three) parts:

1. fh:stateful: whether a feed is stateful or not
2. fh:prev: state reconstruction of a stateful feed
3. (published later) fh:: link to archives of a non-stateful feed

(no, I actually don't want such a split, I'd rather deal with the 3. in feed-history, no matter how) If we want to solve this issue using a distinct element (fh:prev if fh:stateful=true, fh: if fh:stateful=false), is fh:stateful still needed? The presence of fh:prev would be equivalent to fh:stateful=true, the presence of fh: would be equivalent to fh:stateful=false, the absence of both fh:prev and fh: would be equivalent to the absence of fh:stateful, and the presence of both fh:prev and fh: would be an error. This is of course true only if fh:prev must be accompanied by fh:stateful=true. The question is: is it useful to have fh:stateful if you have no link to any kind of archive?

I would think that rather than fh:stateful=true | false, it might be more useful to have (with a different element name, and perhaps different values) fh:what-kind-of-feed-is-this=sliding-window | snapshot | ???. If it's a sliding-window feed, fh:prev points to the previous sliding window. If it's a snapshot feed, then fh:prev points to the previous snapshot. fh:what-kind-of-feed-is-this might have a default value of sliding-window.
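A minimal sketch of the inference discussed above: classify a feed from the presence of the feed-history elements alone. The element names fh:prev and fh:archive and the namespace URI are hypothetical stand-ins, not the draft's actual vocabulary.

```python
# Sketch of the consistency rules discussed above. The fh namespace URI and
# the element names fh:prev / fh:archive are hypothetical placeholders.
import xml.etree.ElementTree as ET

FH = "http://example.net/feed-history"  # placeholder namespace URI

def classify_feed(xml_text):
    """Classify a feed as 'sliding-window', 'snapshot', 'unknown', or
    'error' from the presence of fh:prev and fh:archive, per the inference
    suggested above: fh:prev implies stateful, fh:archive implies not."""
    root = ET.fromstring(xml_text)
    has_prev = root.find(f"{{{FH}}}prev") is not None
    has_archive = root.find(f"{{{FH}}}archive") is not None
    if has_prev and has_archive:
        return "error"           # both present: contradictory
    if has_prev:
        return "sliding-window"  # stateful: reconstruct state via prev chain
    if has_archive:
        return "snapshot"        # non-stateful: archives of old snapshots
    return "unknown"             # neither: no statefulness claim at all

feed = ('<feed xmlns:fh="http://example.net/feed-history">'
        '<fh:prev>http://example.com/a1.xml</fh:prev></feed>')
print(classify_feed(feed))  # sliding-window
```

If this inference is adopted, fh:stateful becomes redundant, which is exactly the question the email raises.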
Re: Notes on the latest draft - xml:base
On Wednesday, July 20, 2005, at 10:22 PM, A. Pagaltzis wrote: * James Cerra [EMAIL PROTECTED] [2005-07-21 05:00]: Sjoerd Visscher, That's because it is not an attempt at abbreviating strings, but to preserve the meaning of relative URIs, when content is used outside of its original context. Same thing. You are framing the question in a manner that hides the problem, but it's still there. No, it frames the question in a manner that addresses the purpose of having the mechanism.

Right--it frames it in the context created by RFC 3986. However, since this issue is commonly misunderstood, it's likely that xml:base will often be used for string abbreviation in the wild--thus, indeed, the problem is still there.

If anyone doubts that base URIs as defined by RFC 3986 are not intended simply for abbreviation, read section 4.4 (Same-Document References). The method outlined there for recognizing same-document references would be entirely unreliable if base URIs were used to abbreviate arbitrary portions of URIs. It only works if the base URI is an address from which the data containing the relative URI can be retrieved. If base URIs were intended for abbreviation convenience, then that section of RFC 3986 would be completely broken. My impression is that it isn't broken, but says what was intended.

...but now I've forgotten whether anyone has made a concrete suggestion about what can be done at this point, and to solve exactly what problem. Do I smell another note in the infamous implementers guide?
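The RFC 3986 section 4.4 logic referenced above can be sketched in a few lines: a reference is a same-document reference exactly when resolving it against the base URI yields the base itself (fragment aside). This only works if the base is the document's actual retrieval URI, which is the point of the argument. The URLs are made up for illustration.

```python
# Same-document reference check in the spirit of RFC 3986 section 4.4:
# resolve the reference against the base and compare, ignoring fragments.
from urllib.parse import urljoin, urldefrag

def is_same_document(base, ref):
    resolved, _frag = urldefrag(urljoin(base, ref))
    return resolved == urldefrag(base)[0]

base = "http://www.example.com/news/2005/07.html"
print(is_same_document(base, "#item3"))          # True: fragment-only reference
print(is_same_document(base, "../2005/07.html")) # True: resolves back to the base
print(is_same_document(base, "06.html"))         # False: a different document
```

If xml:base were instead set to some arbitrary prefix purely to shorten strings, this test would misfire, which is why abbreviation-style use breaks section 4.4.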
Re: Feed History -02
On Monday, July 18, 2005, at 01:59 AM, Stefan Eissing wrote: Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. As an alternative one could drop fh:stateful and define that an empty fh:prev (referring to itself) is the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev.

The problem is that an empty @href in fh:prev is subject to xml:base processing, and who knows what the current xml:base is going to be when you get to it. Is there a way to explicitly make xml:base undefined? If I'm not mistaken, xml:base="" doesn't do it--it just adds nothing to the existing xml:base. If there is a way, you could say <link rel="fh:prev" href="" xml:base="[whatever value sets it to undefined]" />, but otherwise, using an empty @href is probably overloading the wrong attribute. A different @rel value like fh:noprev (with an empty link, since it doesn't matter what it actually points to) might be a step up, but using any kind of link to indicate the lack of a link is a little odd.
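The objection above rests on how an empty relative reference resolves: per RFC 3986, it resolves to the in-scope base URI, so an empty @href still points somewhere. A one-line demonstration (the base URI is hypothetical):

```python
# Why an empty @href doesn't mean "no link": an empty relative reference
# resolves to whatever base URI (xml:base) is in scope at that point.
from urllib.parse import urljoin

base = "http://example.com/archives/feed15.xml"  # hypothetical in-scope xml:base
print(urljoin(base, ""))  # http://example.com/archives/feed15.xml
```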
Re: Feed History -02
On Tuesday, July 19, 2005, at 12:29 PM, Antone Roundy wrote: On Monday, July 18, 2005, at 01:59 AM, Stefan Eissing wrote: Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. As an alternative one could drop fh:stateful and define that an empty fh:prev (refering to itself) is the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev. The problem is that an empty @href in fh:prev is subject to xml:base processing, and who knows what the current xml:base is going to be when you get to it. Is there a way to explicitly make xml:base undefined? If I'm not mistaken xml:base= doesn't do it--it just adds nothing to the existing xml:base. If there is a way, you could say link rel=fhprev href= xml:base=[whatever value sets it to undefined] /, but otherwise, using an empty @href is probably overloading the wrong attribute. A different @rel value like fh:noprev (with an empty link, since it doesn't matter what it actually points to) might be a step up, but using any kind of link to indicate the lack of a link is a little odd. Yikes, I should have caught up on the xml:base thread first! Looks like the jury's out, or at least hung, on this issue.
Re: I-D ACTION:draft-ietf-atompub-format-10.txt
A misspelling...in case the opportunity to fix it arises: Text Contruct -- missing an s in 6.3. (I found it because I misspelled it the same way when searching for it!)
Re: The Atomic age
On Friday, July 15, 2005, at 09:56 AM, Walter Underwood wrote: --On July 14, 2005 11:37:05 PM -0700 Tim Bray [EMAIL PROTECTED] wrote: So, implementors... to work. Do we have a list of who is implementing it? That could be used in the Deployment section of http://www.tbray.org/atom/RSS-and-Atom.

I've updated Grouper (http://www.geckotribe.com/rss/grouper/) to support conversion of Atom 1.0 to RSS 2.0. A future version will support going the other way...when I get time to complete the major overhaul that will be required to do that.

Antone
Re: More while we're waiting discussion
On Tuesday, July 12, 2005, at 12:42 PM, A. Pagaltzis wrote: * James M Snell [EMAIL PROTECTED] [2005-07-12 02:00]: The second extension is a comments link type that allows an entry to be associated with a separate feed containing comments. […]

<feed>
  <entry>
    <link rel="comments" href="http://example.com/commentsfeed.xml" />
  </entry>
</feed>

What I don’t like about this idea is that if a thread-aware aggregator wants to keep up with *all* discussion on a weblog, it will have to poll any number of comments-for-entry-X feeds per single main newsfeed in the general case – in the case of a typical weblog encountered in practice, that would be several hundred. Clearly, this is untenable.

If you're already creating an extension link type, why not throw in an additional attribute too to help with that:

<feed xmlns:comments="http://example.org/commentfeed">
  <entry>
    <link rel="comments" comments:updated="2005-07-12T12:53:15Z" href="http://example.com/commentsfeed.xml" />
  </entry>
</feed>

Then you'd only need to poll the main feed unless it indicated an update in the comment feed. Of course, if comments were threaded, you'd have to cascade comments:updated values up through all the feeds in a thread, and aggregators would have to follow updates back the other way, potentially down multiple branches, to find all the updated leaves.

...which raises the question of whether an application like this might beg a minimal feed for comments that simply pointed to an Entry Document for each comment. Entries in such a feed would really only require an atom:id, an atom:updated, an atom:link pointing to the entry document, and an atom:link pointing to the parent comment or entry. atom:title could conceivably be considered undesirable bloat for such a feed. Is Atom the right format for this need? An alternative might be to define a format that used Atom elements but had minimized cardinality requirements. Well, enough stream-of-thought blabbering for now.
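A sketch of the optimization proposed above: only poll a comments feed when its comments:updated value in the main feed is newer than what we last recorded. The comments namespace and attribute are the hypothetical extension from the example, not part of any spec; RFC 3339 timestamps in the same offset compare correctly as strings.

```python
# Decide which comment feeds to poll based on the hypothetical
# comments:updated attribute carried in the main feed's link elements.
import xml.etree.ElementTree as ET

COMMENTS_NS = "http://example.org/commentfeed"  # hypothetical namespace

def feeds_to_poll(main_feed_xml, last_seen):
    """Return comment-feed URLs whose comments:updated is newer than the
    timestamp recorded for them (RFC 3339 strings compare lexically)."""
    root = ET.fromstring(main_feed_xml)
    to_poll = []
    for link in root.iter("link"):
        if link.get("rel") != "comments":
            continue
        href = link.get("href")
        updated = link.get(f"{{{COMMENTS_NS}}}updated")
        # No updated attribute means we can't skip it; poll to be safe.
        if updated is None or updated > last_seen.get(href, ""):
            to_poll.append(href)
    return to_poll

feed = ('<feed xmlns:comments="http://example.org/commentfeed"><entry>'
        '<link rel="comments" comments:updated="2005-07-12T12:53:15Z"'
        ' href="http://example.com/commentsfeed.xml"/></entry></feed>')
print(feeds_to_poll(feed, {"http://example.com/commentsfeed.xml":
                           "2005-07-01T00:00:00Z"}))
```

With threading, the cascade problem the email describes appears: every ancestor feed's comments:updated has to be bumped whenever a leaf changes.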
Re: More while we're waiting discussion
On Tuesday, July 12, 2005, at 06:21 PM, A. Pagaltzis wrote: * Thomas Broyer [EMAIL PROTECTED] [2005-07-13 00:00]: As an atom:id is an identifier that might (should?) not be dereferenceable, atom:link is not a good choice. There is nothing in the spec that forbids atom:link

That should be atom:id, right?

from being dereferencable, nor anything that advises against it being so. See 4.2.6 and 4.2.6.1 in -09. ... The spec just says that the URI MUST NOT be assumed to be dereferencable, ... Whether atom:link is a bad choice for carrying a non-dereferencable URI around is a better argument.

The spec says, verbatim: The atom:link element defines a reference from an entry or feed to a Web resource. That would seem to imply dereferencability, but is open to interpretation.

... Personally, I would prefer to interpret the spec liberally, if that is within the intended spirit,

It's definitely not within the spirit that I, for one, intended. But the spirit that I intended (atom:link being limited to links intended to be traversed in response to explicit user interaction) was not accepted by the WG, so perhaps that has little bearing.

If atom:link is intended to be dereferencable, then clearly, any solution that takes a value from atom:id and puts it into atom:link/@href has a strike against it, since any feed that uses non-dereferencable atom:ids would either have to violate the spirit of atom:link to participate in the feature, or would have to invent a competing solution. Also, if a feed that uses dereferencable atom:ids is relocated, clients would be much more likely to attempt to dereference the atom:links that carried those previously dereferencable values than an extension element that was explicitly defined as not necessarily dereferencable.
Re: Roll-up of proposed changes to atompub-format section 5
On Tuesday, July 5, 2005, at 10:11 AM, Tim Bray wrote: On Jul 5, 2005, at 8:58 AM, Bob Wyman wrote: We can debate what it means to have an interoperability issue; however, my personal feeling is that if systems are forced to break and discard signatures in order to perform usual and customary processing on entries, that falls very close to the realm of interoperability if not within it. Deferring this issue until the implementer's guide is written is likely to defer it beyond the point at which common practice is established. The result is likely to be that intermediaries and aggregators end up discarding most signatures that appear in source feeds.

Huh?! Pardon my ignorance, could you please provide an explanation for the simple-minded as to how the absence of a source element in a signed entry will lead to signatures being discarded? Also, it would be helpful to sketch in some of the surrounding scenario... -Tim

If a signed entry doesn't have a source element and an aggregator inserts one, the signature will be broken--thus the aggregator will either discard the signature or republish the entry with a broken signature. Perhaps language like this would work without being too much of a change at this late date:

When signing individual entries that do not contain an atom:source element, be aware that aggregators inserting an atom:source element will be unable to retain the signature. For this reason, publishers might consider including an atom:source element in all individually signed entries.
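The mechanism being explained to Tim can be illustrated without real XML-DSig machinery: any digest computed over the entry's bytes changes once an aggregator inserts an atom:source element, so the original signature no longer validates. This is only an illustration; actual Atom signatures use XML Signature with canonicalization, not a raw byte hash.

```python
# Illustration (not real XML-DSig): inserting atom:source changes the
# signed bytes, so a digest computed before the insertion no longer matches.
import hashlib

entry = "<entry><id>tag:example.com,2005:1</id><title>Hi</title></entry>"
with_source = entry.replace(
    "<title>", "<source><id>tag:example.com,2005:feed</id></source><title>")

digest_before = hashlib.sha256(entry.encode()).hexdigest()
digest_after = hashlib.sha256(with_source.encode()).hexdigest()
print(digest_before == digest_after)  # False: the signed bytes changed
```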
Re: Roll-up of proposed changes to atompub-format section 5
On Tuesday, July 5, 2005, at 01:09 PM, A. Pagaltzis wrote: * Bob Wyman [EMAIL PROTECTED] [2005-07-05 19:30]: Antone Roundy wrote: When signing individual entries that do not contain an atom:source element, be aware that aggregators inserting an atom:source element will be unable to retain the signature. For this reason, publishers might consider including an atom:source element in all individually signed entries. +1 +1 as well. It is one of those obvious-in-hindsight things that the spec would do well to point out to implementors in advance. If putting this into the spec would require a delay, then I suppose we’ll have to end up living with a spec that could have been more explicit. This clarification is not worth slowing things down for. Agreed. If we can get it in without delaying things, I'm all for it. But if not, then I can live without it. It doesn't actually change anything--just reduces the probability of the issue being overlooked.
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
If it's for identification rather than retrieval, maybe it could be an Identity Construct...except Identity Constructs got nuked in format-06...not necessarily dereferencable. Another option would be to identify whether you need to continue by checking whether you've seen the prev link before. Wouldn't that be as reliable as checking the this link?

On Wednesday, June 29, 2005, at 12:10 AM, Mark Nottingham wrote: You need to be able to figure out which documents you've seen before and which ones you haven't, so you don't recurse down the entire stack. Although you can come up with some heuristics to determine when you've seen a document before, most (if not all) of them can be fooled by particular sequences of entries. Remembering which ones you've seen (using their 'this' URI) allows you to easily figure this out.

On 28/06/2005, at 8:48 PM, Antone Roundy wrote: Thinking a little more about this, I'm not sure what the this link would be used for. The prev link seems to be doing all the work, and especially assuming a batches-of-15 sort of model, the this link seems likely to end up pointing to a document that's going to disappear soon 14 times out of 15.

-- Mark Nottingham http://www.mnot.net/
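The alternative suggested above (stop recursing once a prev URI you've already processed turns up, rather than tracking per-instance this URIs) can be sketched as a small walk over the archive chain. fetch() is a stand-in for an HTTP GET returning (entries, prev_uri); the document names are invented.

```python
# Walk prev links from the current feed document, stopping as soon as a
# URI we've already processed appears, so we never recurse down the whole
# stack. fetch(uri) is a stand-in returning (entries, prev_uri_or_None).
def collect_history(start_uri, fetch, seen):
    """Follow prev links from start_uri, skipping documents already in seen."""
    entries, uri = [], start_uri
    while uri is not None and uri not in seen:
        seen.add(uri)
        batch, uri = fetch(uri)
        entries.extend(batch)
    return entries

# Toy archive chain: current doc -> a1 -> a0 -> (end).
docs = {
    "feed": (["e5", "e4"], "a1"),
    "a1": (["e3", "e2"], "a0"),
    "a0": (["e1"], None),
}
seen = {"a0"}  # the oldest archive was processed on a previous run
print(collect_history("feed", docs.get, seen))  # ['e5', 'e4', 'e3', 'e2']
```

Mark's objection would apply if prev URIs were reused or unstable; with stable static archive URIs, the prev URI serves the same "have I seen this?" role as a this URI.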
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
On Wednesday, June 29, 2005, at 07:27 AM, Dave Pawson wrote: I guess the answer is: http://example.com/latest is your feed, e.g. containing the latest 10 entries http://example.com/archive-1 through n are your archive feeds. Which would mean that the instance at /latest keeps changing? I need to keep swapping old ones out, new ones in, i.e. rebuilding each time? I guess that's another reason it feels like a kludge. Replace http://example.com/latest with http://example.com/atom.xml. Of course the latest document keeps changing and has to be rebuilt and replaced each time. It's the feed document just like what we see today. At least that's how I read what was written above--http://example.com/latest; was intended as the URI to which you'd subscribe.
Annotating signed entries (was Re: More on Atom XML signatures and encryption)
On Wednesday, June 29, 2005, at 01:47 PM, James M Snell wrote: 8. Aggregators and Intermediaries MUST NOT alter/augment the content of digitally signed entry elements.

Just mulling over things... Obviously, we don't have any way to annotate signed entries without breaking the signature. I hesitate to introduce new complexity, so I don't know whether I LIKE the idea I'm about to write about, but here it is. If you want to annotate a signed entry, or even annotate an unsigned one but keep your annotations separate, you might do something like this:

<feed>
  ...
  [feed metadata]
  <ex:annotation entry-id="foo">
    <ex:entry-signature>the entry's signature goes here</ex:entry-signature>
    [this annotation could be signed here]
    <ex:some-annotation-element>...</ex:some-annotation-element>
    ...
  </ex:annotation>
  ...
  <entry>
    <id>foo</id>
    [entry's signature here if signed]
    ...
  </entry>
</feed>

Notes:

1) ex:entry-signature is optional, but recommended if the entry is signed and the annotation is signed.
2) Multiple annotations could point to the same entry.
3) It could be requested that aggregators forward annotations along with their entries...but of course, that's optional, and they could certainly be dropped at the request of the end user if they only want to see the originals.
4) It might be recommended or required that ex:annotation elements appear before the entries they annotate (whether above all entries or interspersed with them) to make life easier for processors that finalize their processing of entries as soon as they hit </entry> rather than doing it after they've parsed the whole document.
5) Aggregators COULD attach annotations from various sources when outputting entries, even if those annotations never appeared together within a feed before.
6) I don't see any way to choose between conflicting annotations.
Dealing with namespace prefixes when syndicating signed entries
Mulling more... Let's say an aggregator is putting these two entries into the same aggregate feed:

<feed ... xmlns:a="foo" xmlns:b="bar" ...>
  <entry>
    [signature]
    <a:foo ... />
    <b:bar ... />
    ...
  </entry>
</feed>

<feed ... xmlns:b="foo" xmlns:a="bar" ...>
  <entry>
    [signature]
    <b:foo ... />
    <a:bar ... />
    ...
  </entry>
</feed>

Perhaps a reasonable way to deal with the namespace prefix conflict would be for the signature to be applied after a transform that yielded this (putting full namespace names in where the prefixes were):

<[atom's namespace]:entry>
  [signature]
  <foo:foo ... />
  <bar:bar ... />
  ...
</[atom's namespace]:entry>

Unprefixed attributes would naturally remain unprefixed, but elements in the default namespace would need to have their namespace names prepended.
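The transform proposed above (replace prefixes with full namespace names before signing) is essentially what namespace-aware parsers already do internally. Python's ElementTree, for instance, names elements in "Clark notation" ({namespace-URI}localname), so the two feeds' conflicting prefix choices disappear after parsing:

```python
# After parsing, element names carry the full namespace URI, so two
# documents that chose opposite prefixes for the same namespaces yield
# identical element names.
import xml.etree.ElementTree as ET

doc1 = '<e xmlns:a="foo" xmlns:b="bar"><a:x/><b:y/></e>'
doc2 = '<e xmlns:b="foo" xmlns:a="bar"><b:x/><a:y/></e>'

names1 = [child.tag for child in ET.fromstring(doc1)]
names2 = [child.tag for child in ET.fromstring(doc2)]
print(names1)            # ['{foo}x', '{bar}y']
print(names1 == names2)  # True: prefixes were only local labels
```

XML Signature's canonicalization algorithms address the same problem differently (by fixing prefix declarations in place rather than rewriting names), which is why prefix conflicts are a real concern for signed fragments moved between documents.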
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
On Wednesday, June 29, 2005, at 06:50 PM, A. Pagaltzis wrote: My first thought upon reading the draft was what I assume is what Stefan Eissing said: I would rather have a single, entry-less “archive hub” feed which contains “prev” links to *all* previous instances

For an active feed, that document could easily grow till it was larger than many feed instances. I prefer the chain-of-instances method.

, leading to a setup like

http://example.com/latest
└─ http://example.com/archive/feed/
   ├─ http://example.com/archive/feed/2005/05/
   ├─ http://example.com/archive/feed/2005/04/
   ├─ http://example.com/archive/feed/2005/03/
   ├─ http://example.com/archive/feed/2005/02/
   └─ http://example.com/archive/feed/2005/01/

I don't quite get what the hub feed would look like. Could you show us some XML?

I don’t see anything in the draft that would preclude this use, and as far as I can tell, aggregators which support the draft should have no trouble handling this scenario correctly.

The draft doesn't explicitly say that a feed can only contain one prev link, but I find it hard to read "a" to mean "one or more" in 'and optionally a Link element with the relation prev'.

Again, I don’t see anything in the draft that would preclude this use, and as far as I can tell, aggregators which support the draft should have no trouble handling this scenario correctly.

...unless they expected only to find one prev link per document.

Note how the archive directory feed being static makes this painlessly possible, while it would be a pain to achieve something similar using the paginated approach with local “prev” links (you’d need to go back and change the previously newest old version every time a new one was added).

I don't see why this would be any more difficult. The paginated approach could easily use static documents that never need to be updated, as I described earlier. I'll re-explain at the end of this email.

It would in fact require a “prev” link to what is actually the “next” page.
Funnily enough, I don’t see anything in the draft that would preclude this counterintuitive use of the “prev” link to point to the “next” version

Could you explain what you mean by that?

I’d much rather have a single archive feed containing all entries, and use RFC3229+feed to return partial versions of it;

That might be good for those who can support it, but many people won't be able to. On the other hand, if that single feed grows to where it's hundreds of MB, it could cause real problems if someone requests the whole thing or a large portion of it.

Getting back to how to use static documents for a chain of instances, that could easily be done as follows. The following assumes that the current feed document and the archive documents will each contain 15 entries.

The first 15 instances of the feed document do not contain a prev link (assuming one entry is added each time). When the 16th entry is added, a static document is created containing the first 15 entries, and a prev link pointing to it is added to the current feed document. This link remains unchanged until the 31st entry is added. When the 31st entry is added, another static document is created containing the 16th through 30th entries. It has a prev link pointing to the first static document. The current feed document's prev link is updated to point to the second static document, and it continues to point to the second static document until the 46th entry is added. When the 46th entry is added, a third static document is created containing the 31st through 45th entries, etc.

If you want to reduce the number of requests required to get the entire history (which I don't imagine would happen often enough to be worth bothering with), you could put more entries into each static document. If you didn't correspondingly increase the number of entries in the current feed document, you'd have to update the most recent static document a number of times rather than outputting it only once as described above; but even then, only the most recent static document would ever need updating at any given time.
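The batches-of-15 arithmetic described above reduces to a one-line formula: a new static archive is cut each time the entry count passes a multiple of 15. A small sketch (the batch size is the scheme's parameter, not anything mandated by the draft):

```python
# Arithmetic of the batches-of-15 scheme: how many static archive
# documents exist after a given number of entries have been published.
def archive_count(total_entries, batch=15):
    """The first archive is cut when entry batch+1 is added, the second
    when entry 2*batch+1 is added, and so on; the current feed document
    holds the most recent window of entries."""
    return max(0, (total_entries - 1) // batch)

print(archive_count(15))  # 0: the first 15 instances have no prev link
print(archive_count(16))  # 1: first archive cut when entry 16 is added
print(archive_count(31))  # 2: second archive holds entries 16-30
```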
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Let's say we are planning to keep the latest 15 entries in our stateful feed. We publish the first entry, and have a feed with 1 entry in it. It has a this link, but no prev link. Then we add an entry. The old this link can't be used to point to the new instance of the feed, right? Because that would violate this requirement:

The value of the this link relation's href attribute MUST be a URI indicating a permanent location that is unique to that Feed Document instance; i.e., the content obtained by dereferencing that URI SHOULD NOT change over time.

So the new feed instance has a new this link and perhaps a prev link pointing to the first instance. But maybe the prev link could be omitted at this point, because the this link will point to a feed with all the information in the original feed and then some.

Now let's say someone tries to fetch the original this feed. The draft says: Note that publishers are not required to make all previous Feed Documents available. This seems like a likely circumstance where the publisher might not want to bother to continue making the original instance available. If that's what they decide, then what? Do they return a 410 (gone)? Presumably, some will return a 404 (not found), even though 410 would be better. What should a client do if it receives a 404 or 410? Is there a way for them to find the new instance? Should there be? (Presumably they're subscribed to the feed from a URI different than the one in the this link, so in this case, it's probably not such a big deal, but read on, and you'll see where it could become an issue.)

Now let's look further down the road--we have 15 entries in the feed, and the latest instance has its this link and maybe a prev or maybe not. We add another entry. One reasonable thing to do would be to continue to provide the instance with the first 15 entries and point to it as the prev. Another reasonable thing to do would be to point to the original single-entry instance as the prev--i.e., the most recent instance which doesn't share any entries with this one. As time goes by, the publisher could end up providing every old instance, or just one old instance for each 15 entries. The latter would provide for much more efficient catching up on the feed state. But if the in-between instances are dropped, clients could easily end up running into dead ends (410 or 404) often when trying to catch up, even though there is older data available at a different URI.

Perhaps the best solution would be to have no prev for the first 15 instances, then point to the instance with the first fifteen entries from each of the next 15 instances, then point to the instance with entries 16-30 from the next fifteen instances, etc., so that one is never pointing to an instance that won't continue to be provided (unless, for example, you only continue to provide the most recent 10 groups of 15 entries). If this is to be allowed, then one word ought to be changed in the draft (and I'd think that fleshing out some of these details would be very useful, though of course it wouldn't be normative):

The value of the prev link relation's href attribute MUST be a URI indicating the location of the previous representation of the feed; i.e., the last Feed Document's this URI.

"THE previous representation" should become "A previous representation", or something along the lines of "THE previous representation in the chain of representations". I'm noticing now that "i.e., the last Feed Document's this URI" sounds like it's disallowing the batches-of-15 method outlined above. If we don't wish to disallow that, it should be changed to something like "i.e., a previous Feed Document's this URI."

Also, I just noticed that in some places the word representation is used, and in some places instance, apparently to mean the same thing. In my opinion, instance is better.

Antone
Re: I-D ACTION:draft-nottingham-atompub-feed-history-00.txt
Thinking a little more about this, I'm not sure what the this link would be used for. The prev link seems to be doing all the work, and especially assuming a batches-of-15 sort of model, the this link seems likely to end up pointing to a document that's going to disappear soon 14 times out of 15.

On Tuesday, June 28, 2005, at 07:05 PM, Mark Nottingham wrote: Now let's say someone tries to fetch the original this feed. The draft says: Note that publishers are not required to make all previous Feed Documents available. This seems like a likely circumstance where the publisher might not want to bother to continue making the original instance available. If that's what they decide, then what? Do they return a 410 (gone)? Presumably, some will return a 404 (not found), even though 410 would be better. What should a client do if it receives a 404 or 410? Is there a way for them to find the new instance? Should there be? (Presumably they're subscribed to the feed from a URI different than the one in the this link, so in this case, it's probably not such a big deal, but read on, and you'll see where it could become an issue.)

I'm not sure what you're looking for; the semantics of 404 and 410 are clearly defined by HTTP. If the server says it can't find it, or it's gone, the client is unable to reconstruct the full state of the feed, and SHOULD warn the user.

What I'm saying is that if instance 16 of the feed points back to instance 15, instance 17 to instance 16, instance 18 to instance 17, etc., but at some point you drop all but instance 15, instance 30, etc., then the links to all the instances in between are going to end up returning 404s or 410s. So I'd suggest that there be no per-instance this link, and that the prev link be updated only when a new batch-of-n document is created. Doing it that way, n-1 times out of n there would be overlap between the current feed and the most recent batch-of-n document (but that wouldn't be a big deal), no overlap between any previous batch-of-n documents, and no intermediate documents would disappear.
Re: More on Atom XML signatures and encryption
On Monday, June 20, 2005, at 11:33 PM, James M Snell wrote: OK, so given the arguments I previously posted in my response to Dan + the assertion that digitally signing individual entries will be necessary, the only real possible solution would be to come up with a canonicalization scheme for digitally signed Atom entries.

...or, as Bob said, always including a source element in signed entries, even if they're in the origin feed. The following is all academic at this point, but here's a pseudofeed of what I'd like to have seen...part of it only in retrospect:

<feed>
  <head><!--it's back!-->
    [feed metadata]
    <Signature xmlns="..." /><!--the feed head is signed; the entire feed could be too, but this is for aggregation-->
  </head>
  <entry>
    [entry metadata and content]
    <feedsig><!--a copy of the feed's head's signature, so that the entry can be verifiably linked to the signed feed metadata--></feedsig>
    <Signature xmlns="..." /><!--the entry is signed, including the local copy of the feed head signature-->
  </entry>
  <entry>
    [entry metadata and content]
    <feedsig>...</feedsig>
    <Signature xmlns="..." />
  </entry>
  [etc.]
</feed>

Of course, aggregating this while preserving the signatures' validity would require a different aggregation model than the one we've chosen--like what I proposed for aggregation documents. (Indentation added for readability--in practice, that would break the signature, right?):

<aggregation>
  [aggregation metadata]
  <feed>
    <head>
      [feed metadata]
      <Signature xmlns="..." />
    </head>
    <entry>
      [entry metadata and content]
      <feedsig>...</feedsig>
      <Signature xmlns="..." />
    </entry>
  </feed>
  <feed>
    [etc.]
  </feed>
  [etc.]
</aggregation>
Re: Question on Use Case: Choosing atom:content's type when ArchivingMailing Lists
On Saturday, June 18, 2005, at 01:36 PM, Graham wrote: On 17 Jun 2005, at 6:14 pm, Tim Bray wrote: Uh, has Mark spotted a dumb bug here that we should fix? Do we care if *remote* content is of a composite MIME type? My feeling was that we ruled out composite types in *local* content, for fairly obvious reasons. The fix is obvious, in 4.1.3.1

I would have no objection to this, since the spec already creates the expectation that remote content will be less widely supported than local content.

The better way to do this is to use atom:link rel=alternate to reference the messages.

This is certainly a better solution than multipart local content, and I would hope that people would do remote content this way too unless they have a really good reason for multipart remote content. But I could live with allowing multipart remote content if it's really needed in some case.
Re: Polling Sucks! (was RE: Atom feed synchronization)
On Friday, June 17, 2005, at 12:32 PM, Bob Wyman wrote: This is *not* simpler than taking a push feed using Atom over XMPP. For a push feed, all you do is:

1. Open a socket
2. Send a login XML stanza
3. Process the stanzas as they arrive.

... For your solution, you need to:

1. Poll the feed to get a pointer to the first link (each poll will cost you a TCP/IP connection).
2. If you got a new first link, then go to step 5.
3. Wait some period of time (the polling interval).
4. Go to step 1.
5. Open a new TCP/IP socket to get the next link.
6. Form and send an HTTP request for the next entry.
7. Catch the response from the server.
8. Parse the response to determine if its time stamp is something you've already seen.
9. If you haven't seen the current entry before, then go to step 5.
10. Go to step 1 to start over.

Not to get into a big argument (each method has its advantages depending on circumstances), but allow me to revise the above a little. The following assumes applications that attempt to keep you up to date on changes to the feed that occurred while you were offline:

XMPP:

1. Open a socket.
2. Request and get the feed.
3. Parse the XML.
4. Process the entries (determine whether each is new/updated or not--if so, do the appropriate thing).
5. If the feed had entries that were old and not updated, go to step 7.
6. If the feed has a "first" or "next" or whatever link, go to step 1 using that link.
7. Open a socket.
8. Send a login XML stanza.
9. Wait for a stanza (sending keep-alive packets periodically), and when it arrives...
10. Parse the XML.
11. Process it (determine whether the entry is new/updated or not, and do the appropriate thing).
12. Go to step 9.

Polling:

1. Open a socket.
2. Request and get the feed.
3. Parse the XML.
4. Process the entries (determine whether each is new/updated or not, and do the appropriate thing).
5. If the feed had entries that were old and not updated, go to step 7.
6. If the feed has a "first" or "next" or whatever link, go to step 1 using that link.
7. Wait some period of time.
8. Go to step 1.

The XMPP app will need to contain a superset of the polling app's code. My assessment of which method wins on various issues:

Latency: XMPP
Implementation complexity: Polling
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP if you're trying to archive the feed, otherwise no difference
Getting feed changes that occurred while offline: no difference

If we're not concerned about ensuring that we get all changes, the story is different:

XMPP:

1. Open a socket.
2. Send a login XML stanza.
3. Wait for a stanza (sending keep-alive packets periodically), and when it arrives...
4. Parse the XML.
5. Process it (determine whether the entry is new/updated or not, and do the appropriate thing).
6. Go to step 3.

Polling:

1. Open a socket.
2. Request and get the feed.
3. Parse the XML.
4. Process the entries (determine whether each is new/updated or not, and do the appropriate thing).
5. Wait some period of time.
6. Go to step 1.

My assessment:

Latency: XMPP
Implementation complexity: similar
Bandwidth consumption: XMPP
Resource consumption between polls or pushes: Polling
Getting all feed changes while online: XMPP
Getting feed changes that occurred while offline: Polling

XMPP could achieve parity in getting feed changes that occurred while offline, at the expense of implementation complexity parity, by polling the feed once upon startup.
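The basic polling loop discussed in this exchange can be sketched roughly as follows. This is a minimal illustration, not production code: the feed URL is a placeholder, there is no conditional GET, no "next"/"prev" archive traversal, and "process the entry" is reduced to a print. Duplicate/update detection here compares atom:updated values per atom:id, which matches the step lists above but is only one possible policy.

```python
import time
import urllib.request
import xml.etree.ElementTree as ET

ATOM_NS = "{http://www.w3.org/2005/Atom}"

def changed_entries(feed_xml, seen):
    """Parse one feed document and return the atom:id values of entries
    that are new or updated relative to `seen` (a dict mapping atom:id to
    the last atom:updated value we processed), updating `seen` as we go."""
    root = ET.fromstring(feed_xml)
    changed = []
    for entry in root.findall(ATOM_NS + "entry"):
        entry_id = entry.findtext(ATOM_NS + "id")
        updated = entry.findtext(ATOM_NS + "updated")
        if seen.get(entry_id) != updated:  # new, or atom:updated changed
            seen[entry_id] = updated
            changed.append(entry_id)
    return changed

def poll_feed(feed_url, interval=300):
    """Steps 1-6 of the simple polling recipe: open a socket and fetch the
    feed, parse, process new/updated entries, wait, repeat."""
    seen = {}
    while True:
        with urllib.request.urlopen(feed_url) as resp:
            feed_xml = resp.read()
        for entry_id in changed_entries(feed_xml, seen):
            print("new/updated entry:", entry_id)
        time.sleep(interval)  # the polling interval
```

A real implementation would also send If-Modified-Since/If-None-Match headers to avoid re-downloading an unchanged feed, which narrows the bandwidth gap with XMPP somewhat.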
Re: I-D ACTION:draft-ietf-atompub-format-09.txt
4.1.1:

  o atom:feed elements MUST NOT contain more than one atom:image element.

Should be atom:logo.

4.1.2 says:

  o atom:entry elements MUST NOT contain more than one atom:link element with a rel attribute value of "alternate" that has the same combination of type and hreflang attribute values.

4.1.1 says:

  o atom:feed elements MUST NOT contain more than one atom:link element with a rel attribute value of "alternate" that has the same type attribute value.

Should 4.1.1 also mention hreflang? Also, 4.1.2 puts this in a separate bullet, but 4.1.1 does not:

  o atom:entry elements MAY contain additional atom:link elements beyond those described above.

Nitpick: 4.1.2 says:

  o atom:entry elements MUST have exactly one atom:title element.
  o atom:entry elements MUST contain exactly one atom:updated element.

Do we want to be consistent in saying "contain" instead of "have"?

4.1.3.2, "The src attribute", says:

  atom:content MAY have a "src" attribute, whose value MUST be an IRI reference [RFC3987]. If the "src" attribute is present, atom:content MUST be empty. Atom Processors MAY use the IRI to retrieve the content, and MAY NOT process or present remote content in the same manner as local content.

It took me a second to realize that "MAY NOT" means "don't have to" rather than "aren't allowed to". The technical meaning of the terms is perfectly clear, but it's quite different from the usual meaning of those words, and may be misunderstood. It might be better to say: "Atom Processors MAY use the IRI to retrieve the content, and MAY process or present remote content in a different manner from local content."

Appendix A (Contributors) doesn't appear to have been updated to add more names.
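For reference, the construct under discussion looks like the following hypothetical entry: atom:content carries a src attribute pointing at out-of-line content, so the element itself must be empty (the URL, id, and media type here are invented for illustration).

```xml
<entry xmlns="http://www.w3.org/2005/Atom">
  <title>Example entry with out-of-line content</title>
  <id>tag:example.org,2005:entry-1</id>
  <updated>2005-05-26T12:00:00Z</updated>
  <!-- src is present, so atom:content MUST be empty; a processor
       MAY fetch the IRI, and is not required to present the remote
       content the same way as inline content -->
  <content src="http://example.org/entries/1.xhtml"
           type="application/xhtml+xml"/>
</entry>
```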
Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)
On Wednesday, May 25, 2005, at 06:14 PM, James M Snell wrote:

Ignoring the overhead that it adds for now, isn't this the kind of situation digital signatures are designed to handle?

Sure, but how many publishers are going to be using digital signatures in the near term (and, more importantly, how many aren't?), and who knows how many consuming applications will support them. Until digital signatures start providing more help with this kind of thing, let's provide a warning to developers so that they can at least consider what they might do to safeguard the quality of their users' experience.

And I just thought of another thing (I don't know how digital signatures work in this case, so I may be missing something, but I'm pretty sure the following is at least partially valid): if I get an entry with a valid digital signature and one with no signature (both with the same atom:id, of course), then what? Do I always accept the one with the signature? If so, then DOSing/spoofing unsigned entries will be even easier, because all you'd have to do is sign your fake entry. So even in that case, some extra checking might have to be done before concluding that the entries are duplicates, and that the unsigned one is the one that's disposable.

Without any kind of cryptographic guarantee of this sort, the best you could do is make an educated guess.

Wouldn't that be better than nothing until digital signatures become more ubiquitous?

Would it make sense to include some language along these lines?

Sure.
Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)
On Thursday, May 26, 2005, at 08:04 AM, A. Pagaltzis wrote:

* Graham [EMAIL PROTECTED] [2005-05-25 23:00]:

How is this a denial of service attack? Isn't it just ordinary spoofing/impersonation?

Indeed; I'd like to see this reworded to refer to spoofing, as that's what it is.

I presume the specific wording can be left to the discretion of the editors.
Re: PaceDuplicateIdsEntryOrigin posted (was Re: Consensus snapshot, 2005/05/25)
On Wednesday, May 25, 2005, at 01:06 PM, Antone Roundy wrote:

== Abstract ==

State that atom:entries from the same feed with the same ID are the same entry, whether simultaneously in the feed document or not.

I'm retracting this proposal in preference for PaceAtomIdDos, which I like better and is getting more support.
Re: Consensus snapshot, 2005/05/25
On Wednesday, May 25, 2005, at 12:03 PM, Tim Bray wrote: The level of traffic in recent days has been ferocious, and reading through it, we observe the WG has consensus on changing the format draft in a surprisingly small number of areas. Here they are:

All looks good (or at least entirely acceptable) to me. One question, though:

3. Change to previous consensus call. The phrase that begins "If multiple atom:entry elements with the same atom:id value appear in an Atom Feed document, they describe the same entry..." loses the language about how software MUST treat them as such.

A few people appeared to support[1][2] this[0]:

* State that multiple entries originating in the same feed with the same atom:id are instances of the same entry [yes, they're SUPPOSED to be, even REQUIRED to be universally unique, but let's live in the real world]

...but there was no Pace written (oops), and little or no comment directed specifically toward this detail, either for or against. This wording got no response when suggested[3] two days ago:

If multiple atom:entry elements originating in the same Atom feed have the same atom:id value, whether they exist simultaneously in one document or in different instances of the feed document, they describe the same entry.

I'm going to write a Pace right now, in case that will make any difference. Comments?

Antone

[0] http://www.imc.org/atom-syntax/mail-archive/msg15517.html
[1] http://www.imc.org/atom-syntax/mail-archive/msg15518.html
[2] http://www.imc.org/atom-syntax/mail-archive/msg15526.html
[3] http://www.imc.org/atom-syntax/mail-archive/msg15644.html
PaceDuplicateIdsEntryOrigin posted (was Re: Consensus snapshot, 2005/05/25)
On Wednesday, May 25, 2005, at 12:35 PM, Antone Roundy wrote: I'm going to write a Pace right now, in case that will make any difference.

Here it is--now comments on that particular detail can be directed at a proper Pace: http://www.intertwingly.net/wiki/pie/PaceDuplicateIdsEntryOrigin

== Abstract ==

State that atom:entries from the same feed with the same ID are the same entry, whether simultaneously in the feed document or not.

== Status ==

New

== Rationale ==

* The accepted language for allowing duplicate IDs in a feed document speaks only of multiple atom:entry elements with the same atom:id describing the same entry if they exist in the same document--of course, we intend for them to describe the same entry whether they're simultaneously in the feed document or not.

* The accepted language does not speak of the origin feed of the entries. Ideally, an atom:id should be universally unique to one entry resource, and we rightly require publishers to mint them with that goal. However, in reality, malicious or undereducated publishers might duplicate the IDs of others. Therefore, it is proposed to modify the specification to state that the atom:entry elements describe the same entry (resource) if they originate in the same feed.

* Aggregators wishing to protect against DOS attacks are not unlikely to perform some sort of safety checks to detect malicious atom:id duplication, regardless of whether the specification authorizes them to or not.

== Proposal ==

In format-08:

1. Remove this bullet point from 4.1.1: "atom:feed elements MUST NOT contain atom:entry elements with identical atom:id values."

2. Add the following paragraph, either to atom:entry or atom:feed, at the editors' discretion (instead of the first sentence proposed by PaceAllowDuplicateIDs, if accepted): "If multiple atom:entry elements originating in the same Atom feed have the same atom:id value, whether they exist simultaneously in one document or in different instances of the feed document, they describe the same entry."

== Impacts ==

* Aggregators wishing to both perform duplicate detection and protect against DOS attacks will be justified by the specification in applying their judgement regarding whether entries with the same atom:id come from the same source or not.

== Notes ==

* Because we are unlikely to agree on a method for determining whether the atom:entry elements originate in the same feed or not, no particular method will be specified.

* The proposed language does not preclude the possibility of aggregators applying their own judgement regarding whether two atom:entry elements with the same atom:id which originate in different feeds might describe the same entry resource--which they might if someone posts the same entry to, for example, a category feed and a feed of all their categories, and doesn't present one as having been aggregated from the other by including an atom:source element.
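The aggregator behavior this Pace would authorize can be sketched as follows. This is a hypothetical fragment, not part of the proposal: it keys duplicate detection on (origin feed, atom:id) rather than atom:id alone, and it deliberately leaves open how the origin feed is determined (here it is simply a caller-supplied URI), matching the Pace's note that no particular method is specified.

```python
def dedup_key(feed_uri, entry_id):
    """Treat two entries as instances of the same entry only when the
    same atom:id originates in the same feed. `feed_uri` stands in for
    however the consuming application identifies an entry's origin."""
    return (feed_uri, entry_id)

def is_duplicate(seen, feed_uri, entry_id):
    """Return True if this (origin, atom:id) pair was already processed;
    otherwise record it and return False."""
    key = dedup_key(feed_uri, entry_id)
    if key in seen:
        return True
    seen.add(key)
    return False
```

The point of the keying scheme: an attacker republishing someone else's atom:id from a different feed no longer collides with the original entry, while legitimate re-deliveries of the same feed still deduplicate.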
PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)
On Wednesday, May 25, 2005, at 01:20 PM, Graham wrote: On 25 May 2005, at 7:35 pm, Antone Roundy wrote:

If multiple atom:entry elements originating in the same Atom feed have the same atom:id value, whether they exist simultaneously in one document or in different instances of the feed document, they describe the same entry.

What about when they don't? I don't see any value here. A line saying that when two matching entry ids are found in different feeds is fine, but (apparently) saying it's completely meaningless goes way, way too far.

In my grand tradition (...I'm sure I've done this before), I've posted an alternative to my own proposal. The following would legitimize considering more than just atom:id in doing duplicate detection in order to protect against DOS, but without risking anyone thinking we've weakened the requirement for universal uniqueness of atom:id. I'd vote for this over PaceDuplicateIdsEntryOrigin, and PaceDuplicateIdsEntryOrigin over no change.

http://www.intertwingly.net/wiki/pie/PaceAtomIdDos

== Abstract ==

Point out the potential for denial of service by duplicating others' atom:id values.

== Status ==

New

== Rationale ==

* We want atom:id to be universally unique to a particular entry resource.

* However, depending on such uniqueness could lead to denial of service attacks where the attacker publishes an entry with an atom:id value used by someone else.

* Restricting the uniqueness scope of atom:id entirely to a single feed would make it much less valuable, since entries are often copied from feed to feed, and sometimes simultaneously published in multiple feeds.

* Only requiring entries with the same atom:id to be considered the same if coming from the same feed, but allowing the consuming application to exercise judgement with respect to entries originating in different feeds, is a much better match with reality.

* Still, pointing out the potential for DOS attacks in the Security Considerations section may be preferable to loosening the scope of atom:id uniqueness elsewhere in the spec in either of the ways described by the preceding bullet points.

== Proposal ==

Add the following to format-08:

8.5 Denial of Service Attacks

Atom Processors should be aware of the potential for denial of service attacks where the attacker publishes an atom:entry with the atom:id value of an entry from another feed, and perhaps with a falsified atom:source element duplicating the atom:id of the other feed. Atom Processors which, for example, suppress display of duplicate entries by displaying only one entry with a particular atom:id value, or combination of atom:id and atom:updated values, might also take steps to determine whether the entries originated from the same publisher before considering them to be duplicates.
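To make the attack the proposed 8.5 text warns about concrete, a spoofed entry might look like the following. Everything here is invented for illustration: the attacker copies the victim's atom:id and pairs it with a falsified atom:source whose id duplicates the victim's feed id, so a naive aggregator treats it as an update to the original entry.

```xml
<!-- hypothetical attacker's entry: both ids are copied from another
     publisher's feed, and atom:updated is set later than the original -->
<entry xmlns="http://www.w3.org/2005/Atom">
  <id>tag:victim.example,2005:entry-42</id>
  <title>Replacement content supplied by the attacker</title>
  <updated>2005-05-26T00:00:00Z</updated>
  <source>
    <id>tag:victim.example,2005:feed</id>
  </source>
</entry>
```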
Re: Semantics and Belief contexts - was: PaceDuplicateIdsEntryOrigin posted
On Wednesday, May 25, 2005, at 02:26 PM, Henry Story wrote: Since the referents of Superman and Clark Kent are the same, what is true of the one is true of the other. When speaking directly about the world, we can replace any occurrence of "Superman" with "Clark Kent" and still say something true. "Clark Kent is the secret identity of Superman." - "Superman is the secret identity of Superman."

Whether they're perfectly interchangeable or not depends on whether the name is referring to the object or some facet of the object. The second sentence actually works if the first "Superman" refers to the persona, and the second to the person. But getting back to Atom...

Autistic children have great difficulty understanding the difference between what is and how people perceive things to be.

They sure don't have a monopoly on this! Really getting back to Atom!...

So to prevent a DOS attack, best is to have aggregator feeds such as:

  <feed> <!-- aggregator feed -->
    <feed src="http://true.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
        <title>Enter your credit card number here</title>
        ...
      </entry>
    </feed>
    <feed src="http://false.org">
      <id>tag://true.org,2005/feed1</id>
      <entry>
        <title>Enter your credit card number here</title>
        ...
      </entry>
    </feed>
  </feed>

Here all the aggregator feed is claiming is that it has seen entries inside other feeds. ... It will be up to the consumer of such aggregated feeds to decide which to trust.

From the end user's point of view, it's not much different. Somebody still has to make the decision, and the end user doesn't want to be the one doing it--they want the super aggregator or their feed reader or somebody else to do it for them. The feed reader should be doing it anyway, since they won't be getting all of their data through a super aggregator.
But the super aggregator is likely to want to do it too, both to reduce how much data they forward to their clients, and because many feed readers aren't going to do it very well, so handling part of the job for them will improve the end user's experience.

I'm not a fan of feeds of feeds (though I have proposed, and still like, a one-level embedding of feeds in a different top-level element). Plus, I think it's inconceivable that the WG would make this drastic a change at this point. Let's focus on doing what's actually possible, given the WG schedule and temperament, to mitigate this problem.
Re: PaceAtomIdDos posted (was Re: Consensus snapshot, 2005/05/25)
On Wednesday, May 25, 2005, at 02:49 PM, Graham wrote: On 25 May 2005, at 9:01 pm, Antone Roundy wrote:

8.5 Denial of Service Attacks

Atom Processors should be aware of the potential for denial of service attacks where the attacker publishes an atom:entry with the atom:id value of an entry from another feed, and perhaps with a falsified atom:source element duplicating the atom:id of the other feed. Atom Processors which, for example, suppress display of duplicate entries by displaying only one entry with a particular atom:id value or combination of atom:id and atom:updated values, might also take steps to determine whether the entries originated from the same publisher before considering them to be duplicates.

How is this a Denial of service attack? Isn't it just ordinary spoofing/impersonation? Apart from that, +1.

I don't particularly care whether we call it a DOS or something else, as long as we point it out and give implementers something to point to if asked why they're not simply accepting atom:id at face value. But is it not potentially a DOS? The Good Guy publishes an entry. The Bad Guy copies the atom:id of that entry into an entry with different content, gives it a later atom:updated, and publishes it. The aggregator stops publishing/displaying the Good Guy's entry and instead publishes/displays the Bad Guy's entry. Thus, the subscriber doesn't see the Good Guy's entry (unless they saw it before it was replaced).

But you're also right--if they saw it before it was replaced and then, when they see the updated version, they think it was updated by the Good Guy, it becomes a spoof/impersonation. Perhaps we should say "Denial of Service and Spoofing Attacks" and "...potential for denial of service and spoofing attacks..."? How that's worded doesn't really matter to me.
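The Good Guy/Bad Guy scenario in this exchange can be modeled with a toy aggregator. This is a deliberately simplistic sketch, not anyone's actual implementation: entries are plain dicts, "origin" is assumed to be known to the aggregator somehow, and the naive policy keeps, per atom:id, whichever entry has the latest atom:updated--exactly the behavior the attack exploits.

```python
def naive_ingest(store, entry):
    """Naive duplicate suppression: one entry per atom:id, latest
    atom:updated wins. A later spoofed entry displaces the original."""
    current = store.get(entry["id"])
    if current is None or entry["updated"] > current["updated"]:
        store[entry["id"]] = entry

def origin_checked_ingest(store, entry):
    """Mitigation sketch per the proposed 8.5 text: before treating two
    entries with the same atom:id as duplicates, check that they appear
    to come from the same publisher; otherwise leave the original alone."""
    current = store.get(entry["id"])
    if current is not None and current["origin"] != entry["origin"]:
        return  # same atom:id, different apparent origin: don't replace
    if current is None or entry["updated"] > current["updated"]:
        store[entry["id"]] = entry

good = {"id": "urn:e1", "updated": "2005-05-25T12:00:00Z",
        "title": "Good Guy's entry", "origin": "http://good.example/feed"}
bad = {"id": "urn:e1", "updated": "2005-05-25T13:00:00Z",
       "title": "Bad Guy's entry", "origin": "http://bad.example/feed"}

store = {}
naive_ingest(store, good)
naive_ingest(store, bad)  # later atom:updated wins: the original is gone
```

With naive_ingest, the store ends up holding the Bad Guy's entry; with origin_checked_ingest, the spoofed entry from a different origin is simply ignored. (ISO 8601 timestamps in the same zone compare correctly as strings, which keeps the sketch short.)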