Re: Expires extension draft (was Re: Feed History -02)
There is an interesting problem of how this interacts with the history tag. If you set an expires element on a feed:

  <feed>
    <expires>...</expires>
    <entry>...</entry>
    <entry>...</entry>
  </feed>

then what are you setting it on? Not the document, clearly, as you have pointed out, since HTTP headers deal with that. So it must be on the feed. And of course a feed is identified by its id. Now we have to make sure we avoid contradictions where different feed documents describing the same feed state that the feed has different expiry dates. E.g. the main feed:

  <feed>
    <expires>August 25 2006</expires>
    <id>tag:example.com,2000/feed1</id>
    <entry>...</entry>
    <entry>...</entry>
    <history:previous>http://example.com/archive1</history:previous>
  </feed>

The above is a partial description of the feed tag:example.com,2000/feed1. The history:previous link points to another document that gives us more information about that feed, the archive feed:

  <feed>
    <expires>August 15 2006</expires>
    <id>tag:example.com,2000/feed1</id>
    <entry>...</entry>
    <entry>...</entry>
    <history:previous>http://example.com/archive2</history:previous>
  </feed>

Now of course we have one feed id with two expiry dates. Which one is correct? In graph terms we end up with something like this:

  tag:example.com,2000/feed1
    |--expires--> August 25 2006
    |--expires--> August 15 2006
    |--entry...
    |--entry...
    |--entry...
    |--entry...

One has the feeling that the expires relation should be functional, i.e. have only one value. This makes me think again that what I was looking for (that the document in history:previous not change, so that one can work out when to stop fetching documents) can in fact be taken care of entirely by the HTTP expiry dates and cache control mechanism. If this is so, I think it should be noted clearly in the history spec.

(btw. Is it easy to set expiry dates for documents served by Apache?)

Henry Story

On 10 Aug 2005, at 04:46, James M Snell wrote: This is fairly quick and off-the-cuff, but here's an initial draft to get the ball rolling.. 
http://www.snellspace.com/public/draft-snell-atompub-feed-expires.txt - James

Henry Story wrote: To answer my own question [[ Interesting... but why have a limit of one year? For archives, I would like a limit of forever. ]] I found the following in the HTTP spec [[ To mark a response as never expires, an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future. ]] (though that still does not explain why.) Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? Henry Story

On 9 Aug 2005, at 19:09, James M Snell wrote: rules as atom:author elements. Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching The expires and max-age elements look fine. I hesitate at bringing in a caching discussion. I'm much more comfortable leaving the definition of caching rules to the protocol level (HTTP) rather than the format extension level. Namely, I don't want to have to go into defining rules for how HTTP headers that affect caching interact with the expires and max-age elements... IMHO, there is simply no value in that. The expires and max-age extension elements affect the feed/entry on the application level, not the document level. HTTP caching works on the document level. Adding max-age also means defining IntegerConstruct and disallowing white space around it. Formerly, it was OK as a text construct, but the white space issues change that. This is easy enough. Also, we should decide whether cache information is part of the signature. I can see arguments either way. -1. Let's let caching be handled by the transport layer.
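On Henry's Apache aside: with mod_expires enabled this is straightforward. A configuration sketch follows; the FilesMatch pattern is illustrative, not from the thread:

```apache
# Requires mod_expires to be loaded.
ExpiresActive On
# Serve archive documents with an Expires header about a year out,
# the HTTP convention for "effectively never changes":
<FilesMatch "archive.*\.atom$">
    ExpiresDefault "access plus 1 year"
</FilesMatch>
```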
Re: Expires extension draft (was Re: Feed History -02)
--On August 10, 2005 1:56:05 PM +1000 Eric Scheid [EMAIL PROTECTED] wrote: Aside: a perfect example of what sense of 'expires' is in the I-D itself... Network Working Group Internet-Draft Expires: January 2, 2006 Especially perfect because the HTTP header does not reflect the expiration. Honestly, another reason to put expiration inside the feed is that HTTP caching is just not used. Well, except to force reloads and show you new ads. But it is extremely rare to see per-document cache information. wunder -- Walter Underwood Principal Architect, Verity
Re: Feed History -02
So, you're really looking for entry-level, time-based invalidation, no? I guess the simplest way to do this would be to dereference the link and see if you get a 404/410; if you do, you know it's no longer good. That's not terribly efficient, but OTOH managing metadata in multiple places is tricky, and predicting the future doubly so :) Most people get expiration times really wrong. And clock sync becomes an issue as well. I'd think that if you have reasonable control over the polling of the feed, and a solid enough state model (which might include an explicit deletion mechanism), you could have a similar effect by just removing the items from the feed when they expire, with the expectation that when they disappear from the feed, they disappear from the client. Would that work for your use case?

On 09/08/2005, at 9:07 PM, James M Snell wrote: First off, let me stress that I am NOT talking about caching scenarios here... (my use of the terms application layer and transport layer were an unfortunate mistake on my part that only served to confuse my point) Let's get away from the multiprotocol question for a bit (it never leads anywhere constructive anyway)... Let's consider an aggregator scenario. Take an entry from a feed that is supposed to expire after 10 days. The feed document is served up to the aggregator with the proper HTTP headers for expiration. The entry is extracted from the original feed and dumped into an aggregated feed. Suppose each of the entries in the aggregated feed is supposed to have its own distinct expiration. How should the aggregator communicate the appropriate expirations to the subscriber? Specifying expirations on the HTTP level does not allow me to specify expirations for individual entries within a feed. Use case: an online retailer wishes to produce a special offers feed. Each offer in the feed is a distinct entity with its own terms and its own expiration: e.g. 
some offers are valid for a week, other offers are valid for two weeks, etc. The expiration of the offer (a business-level construct) is independent of whether the feed is being cached (a protocol-level construct); publishing a new version of the feed (e.g. by adding a new offer to the feed) should have no impact on the expiration of prior offers published to the feed. Again, I am NOT attempting to reinvent an abstract or transport-neutral caching mechanism, in the same sense that the atom:updated element is not attempting to reinvent Last-Modified or that the via link relation is not attempting to reinvent the Via header, etc. They serve completely different purposes. The expires and max-age extensions I am proposing should NOT be used for cache control of the Atom documents in which they appear. I think we can declare victory here by simply a) using whatever caching mechanism is available, and b) designating a won't change flag. Speaking *strictly* about cache control of Atom documents, +1. No document-level mechanisms for cache control are necessary. - James

Mark Nottingham wrote: HTTP isn't a transport protocol, it's a transfer protocol; i.e., the caching information (and other entity metadata) are *part of* the entity, not something that's conceptually separate. The problem with having an abstract or transport-neutral concept of caching is that it leaves you with an awkward choice; you can either a) exactly replicate the HTTP caching model, which is difficult to do in other protocols, b) dumb down HTTP caching to a subset that's neutral, or c) introduce a contradictory caching model and suffer the clashes between HTTP caching and it. This is the same road that Web services sometimes try to go down, and it's a painful one; coming up with the grand, protocol-neutral abstraction that enables all of the protocol-specific features is hard, and IMO not necessary. 
Ask yourself: are there any situations where you *have* to be able to seamlessly switch between protocols, or is it just a convenience? I think we can declare victory here by simply a) using whatever caching mechanism is available, and b) designating a won't change flag.

On 09/08/2005, at 11:53 AM, James M Snell wrote: Henry Story wrote: Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? On the application level, a feed or entry may expire or age independently of whatever caching mechanisms may be applied at the transport level. For example, imagine a source that publishes special offers in the form of Atom entries that expire at a given point in time. Now suppose that those entries are being distributed via XMPP and HTTP. It
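Mark's suggestion above (dereference the entry's link and treat 404/410 as invalidation) amounts to a tiny status-code rule. A sketch; the function name is illustrative, not from any spec:

```python
def entry_still_valid(status_code: int) -> bool:
    # 404 (Not Found) and 410 (Gone) signal that the entry's link is
    # no longer good; any success or redirect status means it still is.
    if status_code in (404, 410):
        return False
    return 200 <= status_code < 400
```

In practice a client would issue a HEAD request for the entry's link and feed the response status to a check like this.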
Re: nested feeds (was: Feed History -02)
Sorry for taking so long to reply. I have been off on a 700km cycle trip http://blogs.sun.com/roller/page/bblfish/20050807 I don't really want to spend too much time on the top-X discussion, as I am a lot more interested in the feed history itself, but here are some thoughts anyway...

On 29 Jul 2005, at 17:01, Eric Scheid wrote: On 29/7/05 11:39 PM, Henry Story [EMAIL PROTECTED] wrote: Below I think I have worked out how one can in fact have a top20 feed, and I show how this can be combined without trouble with the history:next ... link... On 29 Jul 2005, at 13:12, Eric Scheid wrote: On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote: 1- The top 20 list: here one wants to move to the previous top 20 list and think of them as one thing. The link to the next feed is not meant to be additive. Each feed is to be seen as a whole. I have a little trouble still thinking of these as feeds, but ... What happens if the publisher realises they have a typo and need to emit an update to an entry? Would the set of 20 entries (with one entry updated) be seen as a complete replacement set? Well if it is a typo and this is considered to be an insignificant change then they can change the typo in the feed document and not need to change any updated time stamps. Misspelling the name of the artist for the top 20 songs list is not insignificant. Even worse fubars are possible too -- such as attributing the wrong artist/author to the #1 song/book (and even worse: leaving off a co-author).

Yes, I see this now. This is a problem for my suggestion. The atom:updated field cannot be used to indicate the date at which an entry has a certain position in a chart, for the reason you mention. We could then no longer update that entry for spelling mistakes or other more serious issues. One would have to add an 'about' date or something, and then things get a little more complicated than I care to think about right now. 
The way I see it, maybe a better way would be to have a sliding window feed where each entry points to another Atom Feed Document with its own URI, and it is that second Feed Document which contains the individual items (the top 20 list). This is certainly closer to my intuitions too. A top 20 something is *not* a feed. Feed entries are not ordered, and are not meant to be thought of as a closed collection. At least this is my initial intuition. BUT Not all Atom Feed Documents are feeds, some are static collections of entries. We keep tripping over this :-( I can think of a solution like the following: Let us imagine a top 20 feed where the resources being described by the entries are the positions in the top list. So we have entries with ids such as

  http://TopOfThePops.co.uk/top20/Number1
  http://TopOfThePops.co.uk/top20/Number2
  http://TopOfThePops.co.uk/top20/Number3
  ...

When the song at a given rank changes, the feed will contain a new entry such as

  <entry>
    <title>Top of the pops entry number 1</title>
    <link href="http://TopOfThePops.co.uk/top20/Number1/"/>
    <id>http://TopOfThePops.co.uk/top20/Number1</id>
    <updated>2005-07-05T18:30:00Z</updated>
    <summary>Top of the pops winner for the week starting 5 July 2005</summary>
  </entry>

The problem here is that this doesn't describe the referent, it only describes the reference. I want to see top 20 feeds where each entry links to the referent in question. For example, the Amazon Top 10 Selling Books feed would link to the book-specific page at Amazon, not to some page saying the #3 selling book is at the other end of this link. 
Oh, I don't really want to defend this position too much, but there would be a way around this criticism by simply having the link point to the album like this:

  <entry>
    <title>Top of the pops entry number 1</title>
    <link href="http://www.amazon.fr/exec/obidos/ASIN/B4ULZV/"/>
    <id>http://TopOfThePops.co.uk/top20/Number1</id>
    <updated>2005-07-05T18:30:00Z</updated>
    <summary>Top of the pops winner for the week starting 5 July 2005</summary>
  </entry>

So here the id would be the same for each position from week to week, but the link it points to would change. We would still need to solve the issue of the date at which it had that position, though... And so yes, a feed where the entry is a feed seems easier to work with in this case. The feed would be something like this I suppose:

  <feed>
    <title>top 20 French songs</title>
    ...
    <entry>
      <title>week of August 1 2005</title>
      <id>...?...</id>
      <updated>2005-08-01T18:30:00Z</updated>
      <content src="http://TopOfThePops.fr/top20/2005_08_01/"
               type="application/atom+xml"/>
    </entry>
    <entry>
      <title>week of August 1 2005</title>
      <id>...?...</id>
      <!-- There was an important update -->
      <content src="http://TopOfThePops.fr/top20/2005_08_01/"
               type="application/atom+xml"/>
      <updated>2005-08-02T18:30:00Z</updated>
    </entry>
    <entry>
      <title>week of August 8 2005</title>
      <id>...?...</id>
      <!-- There was an
Re: Feed History -02
On 4 Aug 2005, at 06:27, Mark Nottingham wrote: So, if I read you correctly, it sounds like you have a method whereby a 'top20' feed wouldn't need history:prev to give the kind of history that you're thinking of, right? If that's the case, I'm tempted to just tweak the draft so that history:stateful is optional if history:prev is present. I was considering dropping stateful altogether, but I think something is necessary to explicitly say don't try to keep a history of my feed. My latest use case for this is the RSS feed that Netflix provides to let you keep an eye on your queue (sort of like top20, but more specialised). Sound good?

Sounds good to me. But I would really like some way to specify that the next feed document is an archive (i.e. won't change). This would make it easy for clients to know when to stop following the links, i.e. when they have caught up with the changes since they last looked at the feed. Perhaps something like this:

  <history:prev archive="yes">http://liftoff.msfc.nasa.gov/2003/04/feed.rss</history:prev>

Henry Story
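The "when to stop fetching" behaviour Henry wants can be sketched as a client that walks history:prev links and stops at the first document it already holds an unchanging copy of. This is a hypothetical illustration: the `fetch` callback and the dictionary shape of a fetched document are my own stand-ins, not anything from the draft.

```python
def collect_history(first_url, fetch, seen):
    # Follow prev-links until there is none, or until we reach a
    # document we already hold an archived (unchanging) copy of.
    # 'fetch(url)' returns {"entries": [...], "prev": url-or-None}.
    url, docs = first_url, []
    while url is not None and url not in seen:
        doc = fetch(url)
        docs.append(doc)
        url = doc.get("prev")
    return docs
```

The archive flag (or an HTTP Expires header far in the future) is what justifies putting a document's URL into `seen` permanently.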
Re: Feed History -02
--On August 9, 2005 1:07:29 PM +0200 Henry Story [EMAIL PROTECTED] wrote: But I would really like some way to specify that the next feed document is an archive (i.e. won't change). This would make it easy for clients to know when to stop following the links, i.e. when they have caught up with the changes since they last looked at the feed. I made some proposals for cache control info (expires and max-age). That might work for this. wunder -- Walter Underwood Principal Architect, Verity
Re: Feed History -02
Walter Underwood wrote: --On August 9, 2005 1:07:29 PM +0200 Henry Story [EMAIL PROTECTED] wrote: But I would really like some way to specify that the next feed document is an archive (i.e. won't change). This would make it easy for clients to know when to stop following the links, i.e. when they have caught up with the changes since they last looked at the feed. I made some proposals for cache control info (expires and max-age). That might work for this.

I missed these proposals. I've been giving some thought to an <expires/> and <max-age/> extension myself and was getting ready to write up a draft. Expires is a simple date construct specifying the exact moment (inclusive) at which the entry/feed expires. Max-age is a non-negative integer specifying the number of milliseconds (inclusive) from the moment specified by atom:updated at which the entry/feed expires. The two cannot appear together within a single entry/feed, and they follow the same basic rules as atom:author elements. - James
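James's max-age rule (expiry instant = atom:updated plus max-age milliseconds) can be sketched in a few lines; the function name and the ISO-parsing approach are my own illustration, not part of the proposal:

```python
from datetime import datetime, timedelta, timezone

def expiry_from_max_age(updated_iso: str, max_age_ms: int) -> datetime:
    # Per the proposal: the entry/feed expires max-age milliseconds
    # after the instant given by its atom:updated element.
    updated = datetime.fromisoformat(updated_iso.replace("Z", "+00:00"))
    return updated + timedelta(milliseconds=max_age_ms)

# e.g. an entry updated 2005-08-09T18:30:00Z with a max-age of one day:
deadline = expiry_from_max_age("2005-08-09T18:30:00Z", 86_400_000)
```

Note that this anchors expiry to the publisher's atom:updated timestamp, which is why clock sync between publisher and client matters here.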
Re: Feed History -02
On 9 Aug 2005, at 18:32, Walter Underwood wrote: --On August 9, 2005 9:28:52 AM -0700 James M Snell [EMAIL PROTECTED] wrote: I made some proposals for cache control info (expires and max-age). That might work for this. I missed these proposals. I've been giving some thought to an <expires/> and <max-age/> extension myself and was getting ready to write up a draft. Expires is a simple date construct specifying the exact moment (inclusive) at which the entry/feed expires. Max-age is a non-negative integer specifying the number of milliseconds (inclusive) from the moment specified by atom:updated at which the entry/feed expires. The two cannot appear together within a single entry/feed, and they follow the same basic rules as atom:author elements. Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching

Interesting... but why have a limit of one year? For archives, I would like a limit of forever. But otherwise I suppose this would do. Instead of putting the information in the history link of the linking feed, you would put it in the archive feed. Which sounds good. I suppose we end up with some duplication of information here with the HTTP headers again.

Adding max-age also means defining IntegerConstruct and disallowing white space around it. Formerly, it was OK as a text construct, but the white space issues change that. Also, we should decide whether cache information is part of the signature. I can see arguments either way. wunder -- Walter Underwood Principal Architect, Verity
Re: Feed History -02
Walter Underwood wrote: --On August 9, 2005 9:28:52 AM -0700 James M Snell [EMAIL PROTECTED] wrote: I made some proposals for cache control info (expires and max-age). That might work for this. I missed these proposals. I've been giving some thought to an <expires/> and <max-age/> extension myself and was getting ready to write up a draft. Expires is a simple date construct specifying the exact moment (inclusive) at which the entry/feed expires. Max-age is a non-negative integer specifying the number of milliseconds (inclusive) from the moment specified by atom:updated at which the entry/feed expires. The two cannot appear together within a single entry/feed, and they follow the same basic rules as atom:author elements. Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching

The expires and max-age elements look fine. I hesitate at bringing in a caching discussion. I'm much more comfortable leaving the definition of caching rules to the protocol level (HTTP) rather than the format extension level. Namely, I don't want to have to go into defining rules for how HTTP headers that affect caching interact with the expires and max-age elements... IMHO, there is simply no value in that. The expires and max-age extension elements affect the feed/entry on the application level, not the document level. HTTP caching works on the document level. Adding max-age also means defining IntegerConstruct and disallowing white space around it. Formerly, it was OK as a text construct, but the white space issues change that. This is easy enough. Also, we should decide whether cache information is part of the signature. I can see arguments either way. -1. Let's let caching be handled by the transport layer. - James
Re: Feed History -02
To answer my own question [[ Interesting... but why have a limit of one year? For archives, I would like a limit of forever. ]] I found the following in the HTTP spec [[ To mark a response as never expires, an origin server sends an Expires date approximately one year from the time the response is sent. HTTP/1.1 servers SHOULD NOT send Expires dates more than one year in the future. ]] (though that still does not explain why.)

Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? Henry Story

On 9 Aug 2005, at 19:09, James M Snell wrote: rules as atom:author elements. Here it is: http://www.intertwingly.net/wiki/pie/PaceCaching The expires and max-age elements look fine. I hesitate at bringing in a caching discussion. I'm much more comfortable leaving the definition of caching rules to the protocol level (HTTP) rather than the format extension level. Namely, I don't want to have to go into defining rules for how HTTP headers that affect caching interact with the expires and max-age elements... IMHO, there is simply no value in that. The expires and max-age extension elements affect the feed/entry on the application level, not the document level. HTTP caching works on the document level. Adding max-age also means defining IntegerConstruct and disallowing white space around it. Formerly, it was OK as a text construct, but the white space issues change that. This is easy enough. Also, we should decide whether cache information is part of the signature. I can see arguments either way. -1. Let's let caching be handled by the transport layer.
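The one-year cap Henry quotes from the HTTP spec amounts to a simple clamp on the server side. A sketch under my own naming, not from any spec:

```python
from datetime import datetime, timedelta, timezone

# HTTP/1.1 convention: "never expires" is approximated by an Expires
# date about one year out, and servers SHOULD NOT go further.
ONE_YEAR = timedelta(days=365)

def clamp_expires(requested: datetime, now: datetime) -> datetime:
    # Cap a desired expiry (e.g. "forever" for an archive) at now + 1 year.
    return min(requested, now + ONE_YEAR)

now = datetime(2005, 8, 10, tzinfo=timezone.utc)
forever = datetime(2100, 1, 1, tzinfo=timezone.utc)
capped = clamp_expires(forever, now)  # one year from 'now'
```

So an "archive forever" policy degrades, under HTTP, into re-validating roughly once a year.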
Re: Feed History -02
Henry Story wrote: Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? On the application level, a feed or entry may expire or age independently of whatever caching mechanisms may be applied at the transport level. For example, imagine a source that publishes special offers in the form of Atom entries that expire at a given point in time. Now suppose that those entries are being distributed via XMPP and HTTP. It is helpful to have a transport-independent expiration/max-age mechanism whose semantics operate on the application layer rather than the transport layer. - James
Re: Feed History -02
On 09/08/2005, at 4:07 AM, Henry Story wrote: But I would really like some way to specify that the next feed document is an archive (i.e. won't change). This would make it easy for clients to know when to stop following the links, i.e. when they have caught up with the changes since they last looked at the feed. Perhaps something like this:

  <history:prev archive="yes">http://liftoff.msfc.nasa.gov/2003/04/feed.rss</history:prev>

I'd think that would be more appropriate as an extension to the archive itself, wouldn't it? That way, the metadata (the fact that it's an archive) is part of the data (the archive feed). E.g.,

  <atom:feed>
    ...
    <archive:yes_im_an_archive/>
  </atom:feed>

By (current) definition, anything that history:prev points to is an archive. Cheers, -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
HTTP isn't a transport protocol, it's a transfer protocol; i.e., the caching information (and other entity metadata) are *part of* the entity, not something that's conceptually separate. The problem with having an abstract or transport-neutral concept of caching is that it leaves you with an awkward choice; you can either a) exactly replicate the HTTP caching model, which is difficult to do in other protocols, b) dumb down HTTP caching to a subset that's neutral, or c) introduce a contradictory caching model and suffer the clashes between HTTP caching and it. This is the same road that Web services sometimes try to go down, and it's a painful one; coming up with the grand, protocol-neutral abstraction that enables all of the protocol-specific features is hard, and IMO not necessary. Ask yourself: are there any situations where you *have* to be able to seamlessly switch between protocols, or is it just a convenience? I think we can declare victory here by simply a) using whatever caching mechanism is available, and b) designating a won't change flag.

On 09/08/2005, at 11:53 AM, James M Snell wrote: Henry Story wrote: Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? On the application level, a feed or entry may expire or age independently of whatever caching mechanisms may be applied at the transport level. For example, imagine a source that publishes special offers in the form of Atom entries that expire at a given point in time. Now suppose that those entries are being distributed via XMPP and HTTP. It is helpful to have a transport-independent expiration/max-age mechanism whose semantics operate on the application layer rather than the transport layer. - James -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
Re: Expires extension draft (was Re: Feed History -02)
On 10/8/05 12:46 PM, James M Snell [EMAIL PROTECTED] wrote: This is fairly quick and off-the-cuff, but here's an initial draft to get the ball rolling.. http://www.snellspace.com/public/draft-snell-atompub-feed-expires.txt Looks good, I think it does need a little bit of prose explaining that this has nothing to do with caching, and should not be used in scheduling when to revisit/refresh/expire local copies of the resource. Similarly, if I understand correctly, when you write The 'max-age' extension element is used to indicate the maximum age of a feed or entry. you are referring to the max-age until the informational content of the respective feed or entry expires. And similarly with age:expires. Yes? Aside: a perfect example of what sense of 'expires' is in the I-D itself... Network Working Group Internet-Draft Expires: January 2, 2006 :-) e.
Re: Feed History -02
First off, let me stress that I am NOT talking about caching scenarios here... (my use of the terms application layer and transport layer were an unfortunate mistake on my part that only served to confuse my point) Let's get away from the multiprotocol question for a bit (it never leads anywhere constructive anyway)...

Let's consider an aggregator scenario. Take an entry from a feed that is supposed to expire after 10 days. The feed document is served up to the aggregator with the proper HTTP headers for expiration. The entry is extracted from the original feed and dumped into an aggregated feed. Suppose each of the entries in the aggregated feed is supposed to have its own distinct expiration. How should the aggregator communicate the appropriate expirations to the subscriber? Specifying expirations on the HTTP level does not allow me to specify expirations for individual entries within a feed.

Use case: an online retailer wishes to produce a special offers feed. Each offer in the feed is a distinct entity with its own terms and its own expiration: e.g. some offers are valid for a week, other offers are valid for two weeks, etc. The expiration of the offer (a business-level construct) is independent of whether the feed is being cached (a protocol-level construct); publishing a new version of the feed (e.g. by adding a new offer to the feed) should have no impact on the expiration of prior offers published to the feed. Again, I am NOT attempting to reinvent an abstract or transport-neutral caching mechanism, in the same sense that the atom:updated element is not attempting to reinvent Last-Modified or that the via link relation is not attempting to reinvent the Via header, etc. They serve completely different purposes. The expires and max-age extensions I am proposing should NOT be used for cache control of the Atom documents in which they appear. 
I think we can declare victory here by simply a) using whatever caching mechanism is available, and b) designating a won't change flag. Speaking *strictly* about cache control of Atom documents, +1. No document-level mechanisms for cache control are necessary. - James

Mark Nottingham wrote: HTTP isn't a transport protocol, it's a transfer protocol; i.e., the caching information (and other entity metadata) are *part of* the entity, not something that's conceptually separate. The problem with having an abstract or transport-neutral concept of caching is that it leaves you with an awkward choice; you can either a) exactly replicate the HTTP caching model, which is difficult to do in other protocols, b) dumb down HTTP caching to a subset that's neutral, or c) introduce a contradictory caching model and suffer the clashes between HTTP caching and it. This is the same road that Web services sometimes try to go down, and it's a painful one; coming up with the grand, protocol-neutral abstraction that enables all of the protocol-specific features is hard, and IMO not necessary. Ask yourself: are there any situations where you *have* to be able to seamlessly switch between protocols, or is it just a convenience? I think we can declare victory here by simply a) using whatever caching mechanism is available, and b) designating a won't change flag.

On 09/08/2005, at 11:53 AM, James M Snell wrote: Henry Story wrote: Now I am wondering if the HTTP mechanism is perhaps all that is needed for what I want with the unchanging archives. If it is, then perhaps this could be explained in the Feed History RFC. Or are there other reasons to add an expires tag to the document itself? On the application level, a feed or entry may expire or age independently of whatever caching mechanisms may be applied at the transport level. For example, imagine a source that publishes special offers in the form of Atom entries that expire at a given point in time. 
Now suppose that those entries are being distributed via XMPP and HTTP. It is helpful to have a transport-independent expiration/max-age mechanism whose semantics operate on the application layer rather than the transport layer. - James -- Mark Nottingham Principal Technologist Office of the CTO BEA Systems
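James's aggregator scenario can be made concrete with a small sketch. The entry structure and field names below are hypothetical, standing in for entries that carry the proposed per-entry expires extension:

```python
from datetime import datetime, timezone

# Hypothetical aggregated entries, each carrying its own application-level
# expiry (the proposed expires extension), independent of HTTP caching
# of the feed document itself.
entries = [
    {"id": "offer-week", "expires": datetime(2005, 8, 16, tzinfo=timezone.utc)},
    {"id": "offer-fortnight", "expires": datetime(2005, 8, 23, tzinfo=timezone.utc)},
]

def live_entries(entries, now):
    # Keep only offers whose own expiry has not yet passed; adding a new
    # offer to the feed has no effect on these per-entry deadlines.
    return [e for e in entries if e["expires"] > now]
```

This is exactly the distinction in the thread: the feed document's cacheability is an HTTP matter, while each offer's validity window travels with the entry.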
Re: Feed History -02
So, if I read you correctly, it sounds like you have a method whereby a 'top20' feed wouldn't need history:prev to give the kind of history that you're thinking of, right? If that's the case, I'm tempted to just tweak the draft so that history:stateful is optional if history:prev is present. I was considering dropping stateful altogether, but I think something is necessary to explicitly say don't try to keep a history of my feed. My latest use case for this is the RSS feed that Netflix provides to let you keep an eye on your queue (sort of like top20, but more specialised). Sound good?

On 29/07/2005, at 6:39 AM, Henry Story wrote: Below I think I have worked out how one can in fact have a top20 feed, and I show how this can be combined without trouble with the history:next ... link... On 29 Jul 2005, at 13:12, Eric Scheid wrote: On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote: 1- The top 20 list: here one wants to move to the previous top 20 list and think of them as one thing. The link to the next feed is not meant to be additive. Each feed is to be seen as a whole. I have a little trouble still thinking of these as feeds, but ... What happens if the publisher realises they have a typo and need to emit an update to an entry? Would the set of 20 entries (with one entry updated) be seen as a complete replacement set? Well if it is a typo and this is considered to be an insignificant change then they can change the typo in the feed document and not need to change any updated time stamps. The way I see it, maybe a better way would be to have a sliding window feed where each entry points to another Atom Feed Document with its own URI, and it is that second Feed Document which contains the individual items (the top 20 list). This is certainly closer to my intuitions too. A top 20 something is *not* a feed. Feed entries are not ordered, and are not meant to be thought of as a closed collection. At least this is my initial intuition. 
BUT I can think of a solution like the following: Let us imagine a top 20 feed where the resources being described by the entries are the positions in the top list. So we have entries with ids such as

   http://TopOfThePops.co.uk/top20/Number1
   http://TopOfThePops.co.uk/top20/Number2
   http://TopOfThePops.co.uk/top20/Number3
   ...

Each of these resources describes the song that is at a certain rank in the top of the pops chart. Each week the song in that rank may change. When a change occurs in the song at a certain rank, the top 20 feed with id http://TopOfThePops.co.uk/top20/Feed will contain a new entry such as

<entry>
  <title>Top of the pops entry number 1</title>
  <link href="http://TopOfThePops.co.uk/top20/Number1"/>
  <id>http://TopOfThePops.co.uk/top20/Number1</id>
  <updated>2005-07-05T18:30:00Z</updated>
  <summary>Top of the pops winner for the week starting 5 July 2005</summary>
</entry>

A client that subscribed to such a feed would automatically get updates every week for each of the top 20 resources. But the feed could be structured exactly like I suggest in 2. So for a top 2 feed (20 is a bit too long for me):

<feed>
  <title type="text">My top 2 Software Books</title>
  <id>http://bblfish.net/blog/top2</id>
  <history:prev>http://bblfish.net/blog/top2/Archive-2005-07-18.atom</history:prev>
  ...
  <entry>
    <title>My Top 1 book</title>
    <link href="http://bblfish.net/blog/top2/Number1"/>
    <id>http://bblfish.net/blog/top2/Number1</id>
    <updated>2005-07-25T18:30:00Z</updated>
    <summary>My top 1 book is Service Oriented Computing by Wiley</summary>
  </entry>
  <entry>
    <title>My Top 2 book</title>
    <link href="http://bblfish.net/blog/top2/Number2"/>
    <id>http://bblfish.net/blog/top2/Number2</id>
    <updated>2005-07-25T18:30:00Z</updated>
    <summary>My second top book is xml in a Nutshell</summary>
  </entry>
</feed>

The above representation of the http://bblfish.net/blog/top2 feed points to the archive feed http://bblfish.net/blog/top2/Archive-2005-07-18.atom:

<feed>
  <title type="text">My top 2 Software Books</title>
  <id>http://bblfish.net/blog/top2</id>
  ...
  <entry>
    <title>My Top 1 book</title>
    <link href="http://bblfish.net/blog/top2/Number1"/>
    <id>http://bblfish.net/blog/top2/Number1</id>
    <updated>2005-07-18T18:30:00Z</updated>
    <summary>My top 1 book is Java 2D Graphics</summary>
  </entry>
  <entry>
    <title>My Top 2 book</title>
    <link href="http://bblfish.net/blog/top2/Number2"/>
    <id>http://bblfish.net/blog/top2/Number2</id>
    <updated>2005-07-18T18:30:00Z</updated>
    <summary>My second top book is xml in a Nutshell</summary>
  </entry>
</feed>

As you notice, only the Number1 entry has changed from the first week, not the second, and yet in both feeds there is an entry for both. A non-change can sometimes be an important event. But this is really up to the feed creator to choose how to present his feeds. He could easily have only had the entry for http://bblfish.net/blog/top2/Number1 in the first feed.
Re: Feed History -02
I get the feeling that we should perhaps first list the main types of history archives, and then deal with each one separately. I can see 3:

1- The top 20 list: here one wants to move to the previous top 20 list and think of them as one thing. The link to the next feed is not meant to be additive. Each feed is to be seen as a whole. I have a little trouble still thinking of these as feeds, but ...

2- The history of changes link: the idea here is that a feed is a list of state changes to web resources. The link to the history just increases the length of the feed by pointing to the next feed page. These two pages could have been merged to create one feed page with the same end result. By moving back through the feed history one gets to see the complete history of changes to the resources described by the feed.

3- A listing of the current states of the resources: here the end user is searching for all the blog entries as they currently are. No feed id is ever duplicated across the chain.

I think each of these has its uses.

1. This use case is clear and well understood.

2. This is very useful for 'static' blog feeds written from a remote server, as once a feed has been archived it no longer needs to be rewritten. It is also very useful in helping a client synchronize with the changes to the feed. If a client is offline for a few months, or if there are a lot of changes in a feed, then the client just needs to follow this type of link to catch up with all the changes. If the entries are describing state changes to resources such as stock market valuations, then this would allow one to get all the valuations for a company going back in time.

3. It is easier for this to be generated dynamically upon request. It is not really a history of changes, but rather a list of resources. In a feed concerning a blog these two can of course easily be confused, as we think of blogs as being created sequentially.
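Use case 2 above treats the head feed plus its archives as one long list of state changes. A reader reconstructing "current state" from such a list keeps, for each atom:id, only the most recent entry. A minimal sketch, where entries are (id, updated, content) tuples (a shape assumed here purely for illustration, not anything the draft defines):

```python
def current_state(entries):
    """Reduce a list of state-change entries to the latest state per id.

    entries: iterable of (entry_id, updated, content) tuples; the updated
    values are ISO 8601 strings, so lexical comparison orders them in time.
    """
    state = {}
    for entry_id, updated, content in entries:
        prior = state.get(entry_id)
        if prior is None or updated > prior[0]:
            state[entry_id] = (updated, content)
    return state
```

With this model, a stock-valuation feed going back in time is just a longer input list; the reduction is the same.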
We then think of the feed history as the history of the blog entry publication dates. If an old blog entry gets changed then the page on which that blog entry originally appeared in the feed would get changed, just as the permalink to the blog post itself gets changed. This type of list is of course not very useful in keeping someone updated as to the changes in a feed, as the whole chain of linked feeds would need to be stepped through to see if anything has changed. I think it is with this version that the proposal of having a separate document that points to all the entries makes the most sense. And this case also seems to be worked on in the API, I think.

more below...

On 23 Jul 2005, at 18:14, Mark Nottingham wrote:

On 19/07/2005, at 2:04 AM, Henry Story wrote:

Clearly the archive feed will work best if archive documents, once completed (containing a given number of entries), never change. Readers of the archive will have a simple way to know when to stop reading: there should never be a need to re-read an archive page - they just never change. The archive provides a history of the feed's evolution. Earlier changes to the resources described by the feed will be found in older archive documents and newer changes in the later ones. One should expect some entries to be referenced in multiple archive feed documents. These will be entries that have been changed over time. Archives *should not* change. I think any librarian will agree with that.

I very much agree that this is the ideal that should be striven for. However, there are some practical problems with doing it in this proposal. First of all, I'm very keen to make it possible to implement history with currently-deployed feed-generating software; e.g., Movable Type. MT isn't capable of generating a feed where changed entries are repeated at the top out of the box, AFAIK.

So is it that MT can only generate feed links such as described in 3 above?
Even if it (and other software) were, it would be very annoying to people whose feed software doesn't understand this extension; their "show me the latest entries in the blog" feed would become "show me the latest changed entries in the blog", and every time an entry was modified or spell-checked, it would show up at the top.

Well, in many cases this is exactly what I want. If someone makes a big change to a resource that was created one year ago (say your first blog post) then I think this is well worth knowing about. Especially if I have written a comment about it. Of course if the change is not *significant* then the change to the old feed archive would not be significant either. So perhaps I should have said that archives should not change in any *significant* way.

So, it's a matter of enabling graceful deployment. Most of the reason I have the fh:stateful flag in there is to allow people to explicitly say "I don't want you to do history on this feed", because so many aggregators are already doing history in their own way.
Re: Feed History -02
Below I think I have worked out how one can in fact have a top20 feed, and I show how this can be combined without trouble with the <history:next ...> link...

On 29 Jul 2005, at 13:12, Eric Scheid wrote:

On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote:

1- The top 20 list: here one wants to move to the previous top 20 list and think of them as one thing. The link to the next feed is not meant to be additive. Each feed is to be seen as a whole. I have a little trouble still thinking of these as feeds, but ...

What happens if the publisher realises they have a typo and need to emit an update to an entry? Would the set of 20 entries (with one entry updated) be seen as a complete replacement set?

Well, if it is a typo and this is considered to be an insignificant change, then they can change the typo in the feed document and not need to change any updated time stamps.

The way I see it, maybe a better way would be to have a sliding window feed where each entry points to another Atom Feed Document with its own URI, and it is that second Feed Document which contains the individual items (the top 20 list).

This is certainly closer to my intuitions too. A top 20 something is *not* a feed. Feed entries are not ordered, and are not meant to be thought of as a closed collection. At least this is my initial intuition.

BUT I can think of a solution like the following: Let us imagine a top 20 feed where the resources being described by the entries are the positions in the top list. So we have entries with ids such as

   http://TopOfThePops.co.uk/top20/Number1
   http://TopOfThePops.co.uk/top20/Number2
   http://TopOfThePops.co.uk/top20/Number3
   ...

Each of these resources describes the song that is at a certain rank in the top of the pops chart. Each week the song in that rank may change.
When a change occurs in the song at a certain rank, the top 20 feed with id http://TopOfThePops.co.uk/top20/Feed will contain a new entry such as

<entry>
  <title>Top of the pops entry number 1</title>
  <link href="http://TopOfThePops.co.uk/top20/Number1"/>
  <id>http://TopOfThePops.co.uk/top20/Number1</id>
  <updated>2005-07-05T18:30:00Z</updated>
  <summary>Top of the pops winner for the week starting 5 July 2005</summary>
</entry>

A client that subscribed to such a feed would automatically get updates every week for each of the top 20 resources. But the feed could be structured exactly like I suggest in 2. So for a top 2 feed (20 is a bit too long for me):

<feed>
  <title type="text">My top 2 Software Books</title>
  <id>http://bblfish.net/blog/top2</id>
  <history:prev>http://bblfish.net/blog/top2/Archive-2005-07-18.atom</history:prev>
  ...
  <entry>
    <title>My Top 1 book</title>
    <link href="http://bblfish.net/blog/top2/Number1"/>
    <id>http://bblfish.net/blog/top2/Number1</id>
    <updated>2005-07-25T18:30:00Z</updated>
    <summary>My top 1 book is Service Oriented Computing by Wiley</summary>
  </entry>
  <entry>
    <title>My Top 2 book</title>
    <link href="http://bblfish.net/blog/top2/Number2"/>
    <id>http://bblfish.net/blog/top2/Number2</id>
    <updated>2005-07-25T18:30:00Z</updated>
    <summary>My second top book is xml in a Nutshell</summary>
  </entry>
</feed>

The above representation of the http://bblfish.net/blog/top2 feed points to the archive feed http://bblfish.net/blog/top2/Archive-2005-07-18.atom:

<feed>
  <title type="text">My top 2 Software Books</title>
  <id>http://bblfish.net/blog/top2</id>
  ...
  <entry>
    <title>My Top 1 book</title>
    <link href="http://bblfish.net/blog/top2/Number1"/>
    <id>http://bblfish.net/blog/top2/Number1</id>
    <updated>2005-07-18T18:30:00Z</updated>
    <summary>My top 1 book is Java 2D Graphics</summary>
  </entry>
  <entry>
    <title>My Top 2 book</title>
    <link href="http://bblfish.net/blog/top2/Number2"/>
    <id>http://bblfish.net/blog/top2/Number2</id>
    <updated>2005-07-18T18:30:00Z</updated>
    <summary>My second top book is xml in a Nutshell</summary>
  </entry>
</feed>

As you notice, only the Number1 entry has changed from the first week, not the second, and yet in both feeds there is an entry for both. A non-change can sometimes be an important event. But this is really up to the feed creator to choose how to present his feeds. He could easily have only had the entry for http://bblfish.net/blog/top2/Number1 in the first feed.

Looking at it this way, there really seems to be no incompatibility between a top 20 feed and the <history:next ...> link. My talk about archives not changing should more precisely be about archives not changing in any significant way. And this advice could be moved to an implementors' section and be encoded in HTTP by simply giving archive pages an infinitely long expiry date.

Someone could subscribe to that second feed and poll for updates, and all they'll ever see are updates to the 20 items there, not the 20 items from the next week/whatever.

The idea of feeds linked to feeds has lots of utility --
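Henry's point that "a non-change can sometimes be an important event" can be made concrete: given two weekly snapshots of the positional feed, a client can diff them by entry id to see which positions actually changed. The dict shape below ({position id: summary}) is an illustrative assumption, not part of the proposal:

```python
def changed_positions(older: dict, newer: dict):
    """Return the position ids whose content differs between two snapshots.

    older/newer map a position id (e.g. "Number1") to its summary text.
    Positions present only in the newer snapshot also count as changed.
    """
    return sorted(pid for pid in newer if older.get(pid) != newer[pid])
```

Applied to the two example feeds above, only Number1 would be reported, while Number2's entry (present in both weeks) is correctly seen as unchanged.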
Re: nested feeds (was: Feed History -02)
On 29/7/05 11:39 PM, Henry Story [EMAIL PROTECTED] wrote:

Below I think I have worked out how one can in fact have a top20 feed, and I show how this can be combined without trouble with the <history:next ...> link...

On 29 Jul 2005, at 13:12, Eric Scheid wrote:

On 29/7/05 7:57 PM, Henry Story [EMAIL PROTECTED] wrote:

1- The top 20 list: here one wants to move to the previous top 20 list and think of them as one thing. The link to the next feed is not meant to be additive. Each feed is to be seen as a whole. I have a little trouble still thinking of these as feeds, but ...

What happens if the publisher realises they have a typo and need to emit an update to an entry? Would the set of 20 entries (with one entry updated) be seen as a complete replacement set?

Well, if it is a typo and this is considered to be an insignificant change, then they can change the typo in the feed document and not need to change any updated time stamps.

Misspelling the name of the artist for the top 20 songs list is not insignificant. Even worse fubars are possible too -- such as attributing the wrong artist/author to the #1 song/book (and even worse: leaving off a co-author).

The way I see it, maybe a better way would be to have a sliding window feed where each entry points to another Atom Feed Document with its own URI, and it is that second Feed Document which contains the individual items (the top 20 list).

This is certainly closer to my intuitions too. A top 20 something is *not* a feed. Feed entries are not ordered, and are not meant to be thought of as a closed collection. At least this is my initial intuition. BUT

Not all Atom Feed Documents are feeds, some are static collections of entries. We keep tripping over this :-(

I can think of a solution like the following: Let us imagine a top 20 feed where the resources being described by the entries are the positions in the top list.
So we have entries with ids such as

   http://TopOfThePops.co.uk/top20/Number1
   http://TopOfThePops.co.uk/top20/Number2
   http://TopOfThePops.co.uk/top20/Number3
   ...

will contain a new entry such as

<entry>
  <title>Top of the pops entry number 1</title>
  <link href="http://TopOfThePops.co.uk/top20/Number1"/>
  <id>http://TopOfThePops.co.uk/top20/Number1</id>
  <updated>2005-07-05T18:30:00Z</updated>
  <summary>Top of the pops winner for the week starting 5 July 2005</summary>
</entry>

The problem here is that this doesn't describe the referent, it only describes the reference. I want to see top 20 feeds where each entry links to the referent in question. For example, the Amazon Top 10 Selling Books feed would link to the book-specific page at Amazon, not to some page saying "the #3 selling book is at the other end of this link".

The idea of feeds linked to feeds has lots of utility -- feeds of comments for one, and even a feed of feeds available on the site.

I completely agree. And remember, for any two things there is at least one way they are related. And there are many different ways feeds can be related to each other. A feed may be an archival continuation of one -- which is what the <history:next ...> link in my opinion addresses -- but there are many other ways one can relate feeds.

I did a grep of an archive of atom-syntax messages .. lots of interesting possibilities in there. James has set a nice example to copy from, so expect a few I-Ds from me.

Of the above, the mechanism of a single URI which redirects to the current issue is a situation which would still need a flag indicating that the appropriate thing to do is to not persist older entries.

I am starting to wonder whether this is really needed now that I have looked at the top20 example I gave above.

Consider the Nature case study. They have separate feed documents for each issue, but just one public URI published.
The things which are entries are not Top N things but entries in a Table of Contents, and it's useful to be able to aggregate that list of articles over time.

The other structure of feeds linking to feeds would require that the aggregator be able to do something useful with such links, but this can be generalised and thus be useful for many purposes. As it is, right now with NNW I can do something useful with such a feed: drag & drop the item headline link to my subscriptions pane to subscribe to that feed and view the entries therein.

I myself have no problem with feeds being entries, feeds pointing to other feeds, or anything like that. A feed is a resource. It can change. A feed is simply a set of state changes to resources. It is that general.

What's needed is a sea change in the concept model aggregators have regarding feed URIs. Right now it's exactly as if you had to bookmark a web page before you could view it -- whereas I'd like to see a feed browser where I can clickity click to wherever without having to first clutter my subscriptions list, but if I happen to like the
Re: Feed History -02
On Sat, 2005-07-23 at 09:14 -0700, Mark Nottingham wrote:

Archives *should not* change. I think any librarian will agree with that.

I very much agree that this is the ideal that should be striven for. The underlying problem, I think, is that different feeds have different semantics.

I think that's the bottom line, Mark. No matter what, people are going to do different things with 'history'. Trying to pin down how it should all work seems to increase the complexity by an order of magnitude, never a good thing IMHO. How about: you want my history, you make of it what you will; I only guarantee it's valid Atom? At least that way, processing older feed material can develop based on a sound (and clearly understood) foundation.

regards DaveP
Re: Feed History -02
On 19/07/2005, at 2:04 AM, Henry Story wrote:

Clearly the archive feed will work best if archive documents, once completed (containing a given number of entries), never change. Readers of the archive will have a simple way to know when to stop reading: there should never be a need to re-read an archive page - they just never change. The archive provides a history of the feed's evolution. Earlier changes to the resources described by the feed will be found in older archive documents and newer changes in the later ones. One should expect some entries to be referenced in multiple archive feed documents. These will be entries that have been changed over time. Archives *should not* change. I think any librarian will agree with that.

I very much agree that this is the ideal that should be striven for. However, there are some practical problems with doing it in this proposal.

First of all, I'm very keen to make it possible to implement history with currently-deployed feed-generating software; e.g., Movable Type. MT isn't capable of generating a feed where changed entries are repeated at the top out of the box, AFAIK. Even if it (and other software) were, it would be very annoying to people whose feed software doesn't understand this extension; their "show me the latest entries in the blog" feed would become "show me the latest changed entries in the blog", and every time an entry was modified or spell-checked, it would show up at the top.

So, it's a matter of enabling graceful deployment. Most of the reason I have the fh:stateful flag in there is to allow people to explicitly say "I don't want you to do history on this feed", because so many aggregators are already doing history in their own way.

The underlying problem, I think, is that different feeds have different semantics. Some will want every change to be included, others won't; for example, a blog probably doesn't need every single spelling correction propagated.
There are some fundamental questions about the nature of a feed that need to be answered (and, more importantly, agreed upon) before we get there; for example, we now say that the ordering isn't significant by default; while that's nice, most software is going to infer something from it, so we need an extension to say 'sort by this', *and* have that extension widely deployed. I tried to approach these problems when I wrote the original proposal for this in Pace form; I got strong pushback on defining a single model for a feed's state. Given that, as well as the deployment issues, I intentionally de-coupled the state reconstruction (this proposal) from the state model (e.g., ordering, deletion, exact semantics of an archive feed, etc.), so that they could be separately defined. Cheers, -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
Am 21.07.2005 um 16:13 schrieb Mark Nottingham:

On 19/07/2005, at 1:48 AM, Stefan Eissing wrote: [...]

I have the feeling that clients will need to protect themselves from servers with almost infinite histories. So a client will probably offer an "XX days into the past, max NN entries" setting in its UI. Maybe that is all that's needed.

Good question. I was thinking roughly along these lines in the Security Considerations, but didn't want to lead people too much.

How about: In case feeds are served via HTTP, server implementations SHOULD offer ETag and Last-Modified headers on history documents (see RFC 2616 xxx). Clients SHOULD persist ETag and Last-Modified information and use If-* headers to ease server load on history synchronization.

Hmm. Maybe not SHOULDs, but some prose might lead people in the right direction, perhaps an 'Implementer Notes' section?

Ok, a SHOULD is not really required. Let common sense prevail.

The remaining question for me is if the fh:prev links are pointing to history records or if they are just a way to split a large feed into chunks. The more I think about it, the more I lean towards leaving the spec as is. Since clients do not trust servers very much, a clever client will most likely implement a synch strategy which frequently checks if the prev URIs change, and once a day / after a week's absence traverse the prev chain and retrieve all documents.

//Stefan
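Stefan's suggestion amounts to ordinary conditional-GET bookkeeping: the client stores the ETag and Last-Modified it saw for each history document and replays them as If-None-Match / If-Modified-Since on the next fetch, letting the server answer 304 Not Modified. A minimal sketch of that bookkeeping (the cache shape and helper names are invented here, not from the draft):

```python
def conditional_headers(cache: dict, uri: str) -> dict:
    """Build If-* request headers from previously stored validators."""
    headers = {}
    entry = cache.get(uri)
    if entry:
        if "etag" in entry:
            headers["If-None-Match"] = entry["etag"]
        if "last_modified" in entry:
            headers["If-Modified-Since"] = entry["last_modified"]
    return headers

def remember(cache: dict, uri: str, response_headers: dict) -> None:
    """Persist validators from a 200 response for the next synchronization."""
    entry = {}
    if "ETag" in response_headers:
        entry["etag"] = response_headers["ETag"]
    if "Last-Modified" in response_headers:
        entry["last_modified"] = response_headers["Last-Modified"]
    if entry:
        cache[uri] = entry
```

If archive documents really never change, every re-fetch of a known archive then costs one 304 round trip instead of a full transfer.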
Re: Feed History -02
Am 18.07.2005 um 23:21 schrieb Mark Nottingham:

On 18/07/2005, at 2:17 PM, Stefan Eissing wrote:

On a more semantic issue: The described sync algorithm will work. In most scenarios the abort condition (e.g. all items on a historical feed are known) will also do the job. However this still means that clients need to check the first fh:prev document if they know all entries there - if my understanding is correct.

This is one of the unanswered questions that I left out of scope. The consumer can examine the previous archive's URI and decide as to whether it's seen it or not before, and therefore avoid fetching it if it already has seen it. However, in this approach, it won't see changes that are made in the archive (e.g., if a revision -- even a spelling correction -- is made to an old entry); to do that it either has to walk back the *entire* archive each time, or the feed has to publish all changes -- even to old entries -- at the head of the feed. I left it out because it has more to do with questions about entry deleting and ordering than with recovering state. It's an arbitrary decision (I had language about this in the original Pace I made), but it seemed like a good trade-off between complexity and capability.

It is a valid starting point. I am just wondering what consequences it has on client implementations. Let's say CNN goes stateful: how would a client handle a history which soon consists of thousands of entries? How would a server best offer such a history to avoid clients retrieving it over and over again? Probably nobody has a good idea on that one, or?

I have the feeling that clients will need to protect themselves from servers with almost infinite histories. So a client will probably offer an "XX days into the past, max NN entries" setting in its UI. Maybe that is all that's needed.

How about: In case feeds are served via HTTP, server implementations SHOULD offer ETag and Last-Modified headers on history documents (see RFC 2616 xxx).
Clients SHOULD persist ETag and Last-Modified information and use If-* headers to ease server load on history synchronization. //Stefan
Re: Feed History -02
On 18 Jul 2005, at 23:21, Mark Nottingham wrote: On 18/07/2005, at 2:17 PM, Stefan Eissing wrote: On a more semantic issue: The described sync algorithm will work. In most scenarios the abort condition (e.g. all items on a historical feed are known) will also do the job. However this still means that clients need to check the first fh:prev document if they know all entries there - if my understanding is correct. This is one of the unanswered questions that I left out of scope. The consumer can examine the previous archive's URI and decide as to whether it's seen it or not before, and therefore avoid fetching it if it already has seen it. However, in this approach, it won't see changes that are made in the archive (e.g., if a revision -- even a spelling correction -- is made to an old entry); to do that it either has to walk back the *entire* archive each time, or the feed has to publish all changes -- even to old entries -- at the head of the feed. Clearly the archive feed will work best if archive documents, once completed (containing a given number of entries) never change. Readers of the archive will have a simple way to know when to stop reading: there should never be a need to re-read an archive page - they just never change. The archive provides a history of the feed's evolution. Earlier changes to the resources described by the feed will be found in older archive documents and newer changes in the later ones. One should expect some entries to be referenced in multiple archive feed documents. These will be entries that have been changed over time. Archives *should not* change. I think any librarian will agree with that. I left it out because it has more to do with questions about entry deleting and ordering than with recovering state. it's an arbitrary decision (I had language about this in the original Pace I made), but it seemed like a good trade-off between complexity and capability. Does that make sense, or am I way off-base? 
Is it worthwhile to think of something to spare clients and servers this lookup? Are the HTTP caching and If-* header mechanisms good enough to save network bandwidth? An alternate strategy would be to require that fh:prev documents never change once created. Then a client can terminate the sync once it sees a URI it already knows. And most clients would not do more lookups than they are doing now...

I think this would be the correct strategy.

Henry Story
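The termination strategy endorsed here can be sketched in a few lines: always refetch the head document, then walk the fh:prev chain only until a previously seen (and, by the immutability assumption, unchanged) archive URI is reached. fetch() is a stand-in for retrieving and parsing a feed document; its (entries, prev_uri) return shape is an assumption of this sketch:

```python
def sync(head_uri, fetch, seen_archives):
    """Collect entries from the head back through unseen archive pages.

    fetch(uri) -> (list_of_entries, prev_uri_or_None)
    seen_archives is a mutable set of archive URIs already retrieved;
    archives are assumed immutable, so a known URI terminates the walk.
    """
    entries, prev = fetch(head_uri)
    entries = list(entries)
    while prev is not None and prev not in seen_archives:
        page_entries, older = fetch(prev)
        entries.extend(page_entries)
        seen_archives.add(prev)
        prev = older
    return entries
```

On the first sync the whole chain is walked once; on every later sync only the head document (plus any newly created archive pages) is fetched, which is exactly the "no more lookups than they are doing now" property Stefan describes.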
Re: Feed History -02
On 19 Jul 2005, at 01:52, A. Pagaltzis wrote:

* Mark Nottingham [EMAIL PROTECTED] [2005-07-18 23:30]:

This is one of the unanswered questions that I left out of scope. The consumer can examine the previous archive's URI and decide as to whether it's seen it or not before, and therefore avoid fetching it if it already has seen it. However, in this approach, it won't see changes that are made in the archive (e.g., if a revision -- even a spelling correction -- is made to an old entry); to do that it either has to walk back the *entire* archive each time, or the feed has to publish all changes -- even to old entries -- at the head of the feed.

These are the kinds of things my “hub archive feed” situation was supposed to address. Because the links are all in one place, the consumer only has to suck down one document in order to be informed of all archive feeds, and to decide which ones he wants to re-/get.

I wonder if what you are trying to describe here is not a different concept altogether from an archive feed. I guess that both are completely orthogonal concepts. Feeds tend to specialize in a number of resources they track. What would also be useful would be a document that described the resources tracked by a feed. This would be closer to a directory listing. It would help point to the current state of the resources tracked by the feed. So when one subscribed to a feed one could then quickly get a list of all the resources that the feed had responsibility for. As this could be quite large, some form of navigation may be necessary. Perhaps this is the type of thing that the protocol group is working on.

Henry Story

Regards, -- Aristotle Pagaltzis // http://plasmasturm.org/
Re: Feed History -02
On Monday, July 18, 2005, at 01:59 AM, Stefan Eissing wrote:

Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. As an alternative one could drop fh:stateful and define that an empty fh:prev (referring to itself) is the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev.

The problem is that an empty @href in fh:prev is subject to xml:base processing, and who knows what the current xml:base is going to be when you get to it. Is there a way to explicitly make xml:base undefined? If I'm not mistaken, xml:base="" doesn't do it--it just adds nothing to the existing xml:base. If there is a way, you could say <link rel="fh:prev" href="" xml:base="[whatever value sets it to undefined]" />, but otherwise, using an empty @href is probably overloading the wrong attribute. A different @rel value like fh:noprev (with an empty link, since it doesn't matter what it actually points to) might be a step up, but using any kind of link to indicate the lack of a link is a little odd.
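Antone's worry can be demonstrated directly: under RFC 3986 reference resolution, an empty reference resolves to the in-scope base URI, so an "empty" href silently acquires a concrete target (whatever xml:base or the document URI happens to be). A small check, with a hypothetical base URI:

```python
from urllib.parse import urljoin

# Hypothetical in-scope base (xml:base or the retrieval URI of the feed).
base = "http://example.org/archives/2005/07.atom"

# An empty relative reference resolves to the base itself, so
# <link href=""/> ends up pointing back at the current document.
resolved = urljoin(base, "")
```

This is why an empty @href cannot reliably mean "no previous document": it always means *some* document, determined by context.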
Re: Feed History -02
On Tuesday, July 19, 2005, at 12:29 PM, Antone Roundy wrote:

On Monday, July 18, 2005, at 01:59 AM, Stefan Eissing wrote:

Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. As an alternative one could drop fh:stateful and define that an empty fh:prev (referring to itself) is the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev.

The problem is that an empty @href in fh:prev is subject to xml:base processing, and who knows what the current xml:base is going to be when you get to it. Is there a way to explicitly make xml:base undefined? If I'm not mistaken, xml:base="" doesn't do it--it just adds nothing to the existing xml:base. If there is a way, you could say <link rel="fh:prev" href="" xml:base="[whatever value sets it to undefined]" />, but otherwise, using an empty @href is probably overloading the wrong attribute. A different @rel value like fh:noprev (with an empty link, since it doesn't matter what it actually points to) might be a step up, but using any kind of link to indicate the lack of a link is a little odd.

Yikes, I should have caught up on the xml:base thread first! Looks like the jury's out, or at least hung, on this issue.
Re: Feed History -02
Sorry I did not participate in the previous discussion for format 00. I only just realized this was going on. What is clear is that this is really needed!

I agree with Stefan Eissing's random thought that it may not be a good idea to use Atom for a top 10 feed. Atom entries are not ordered in a feed, for one. Also, as I understand it, an entry in a feed is best thought of as a state of an external resource at a time. Making a feed of the top x entries is to use the feed as a closed collection, whereas I think it is correctly interpreted as an open one. If that is right, and so fh:stateful is not needed, then would it not be simpler to extend the link element in the following really simple way:

<link rel="http://purl.org/syndication/history/1.0"
      type="application/atom+xml"
      href="http://example.org/2003/11/index.atom" />

Just a thought. In any case I really look forward to having this functionality. Thanks a lot for the huge effort you have put into presenting this idea so clearly and with such patience.

Henry Story

On 18 Jul 2005, at 09:59, Stefan Eissing wrote:

Am 16.07.2005 um 17:57 schrieb Mark Nottingham:

The Feed History draft has been updated to -02; http://ftp.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-02.txt The most noticeable change in this version is the inclusion of a namespace URI, to allow implementation. I don't intend to update it for a while, so as to gather implementation feedback.

Just a couple of thoughts on reading the document:

Ch 3. fh:stateful seems to be only needed for a newborn stateful feed. As an alternative one could drop fh:stateful and define that an empty fh:prev (referring to itself) is the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev.

Ch 5. inserting pseudo-entries into an incomplete feed: would it make sense to have a general way to indicate such pseudo entries?
A feed entry can also get lost at the publisher, and the publisher might want to indicate that there once was a feed entry, but that he no longer has the (complete) document. //Stefan Random thoughts: The example of a top 10 feed (Ch 1) needs some thinking: there are quite a few people interested in the history of a top 10 when it comes to music charts. One _could_ make this an atom feed and use the feed history to go back in time. But the underlying model is different from the one atom has, so maybe it's not such a good idea after all. (Is there any ordering in a feed, btw? I know a client can sort by date, but does anyone rely on document order of XML elements?)
Re: Feed History -02
Henry Story wrote: Sorry I did not participate in the previous discussion for format 00. I only just realized this was going on. What is clear is that this is really needed! I agree with Stefan Eissing's random thought that it may not be a good idea to use Atom for a top 10 feed. Atom entries are not ordered in a feed, for one. Also, as I understand it, an entry in a feed is best thought of as a state of an external resource at a time. Making a feed of the top x entries is to use the feed as a closed collection, whereas I think it is correctly interpreted as an open one. I disagree. Atom could be used very easily for a top 10 feed. What is needed is a simple extension that provides rank orderings for entries. Something as simple as the following would work... <feed> ... <entry> ... <r:index>1</r:index> </entry> <entry> <r:index>2</r:index> </entry> <entry> <r:index>3</r:index> </entry> </feed> If that is right, and so fh:stateful is not needed, then would it not be simpler to extend the link element in the following really simple way: <link rel="http://purl.org/syndication/history/1.0" type="application/atom+xml" href="http://example.org/2003/11/index.atom"/> I actually can't recall what my opinion on this used to be :-( but right now I'm thinking that a custom link relation is the right approach. - James
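The rank-extension idea sketched above can be exercised in a few lines: a client ignores document order and sorts entries by the extension element instead. The `r:` namespace URI below is invented purely for illustration; the draft never defined one.

```python
import xml.etree.ElementTree as ET

ATOM = "http://www.w3.org/2005/Atom"
RANK = "http://example.org/rank"  # hypothetical namespace for r:index

# A tiny top-3 feed with entries deliberately out of document order.
doc = f"""<feed xmlns="{ATOM}" xmlns:r="{RANK}">
  <entry><title>Song B</title><r:index>2</r:index></entry>
  <entry><title>Song A</title><r:index>1</r:index></entry>
  <entry><title>Song C</title><r:index>3</r:index></entry>
</feed>"""

feed = ET.fromstring(doc)
entries = feed.findall(f"{{{ATOM}}}entry")
# Document order carries no meaning in Atom; sort by the rank extension.
entries.sort(key=lambda e: int(e.findtext(f"{{{RANK}}}index")))
titles = [e.findtext(f"{{{ATOM}}}title") for e in entries]
print(titles)  # ['Song A', 'Song B', 'Song C']
```

This is the crux of James's point: the ordering lives in the entries themselves, so the feed stays an ordinary open Atom collection.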
Re: Feed History -02
On 18/07/2005, at 11:10 AM, James M Snell wrote: Ch 3. fh:stateful seems to be needed only for a newborn stateful feed. As an alternative, one could drop fh:stateful and define that an empty fh:prev (referring to itself) marks the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev. +1. After going through this, fh:stateful really doesn't seem to be necessary. The presence of fh:prev would be sufficient to indicate that the feed has a history, and a blank fh:prev would work fine to indicate the end of the history. I thought about the comments on the plane yesterday, and I agree. However, I'm wary of special URI values; also, I want to preserve stateful=false. So, what about saying that you can omit fh:stateful *if* fh:prev is in the feed? -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
Mark Nottingham wrote: On 18/07/2005, at 11:10 AM, James M Snell wrote: Ch 3. fh:stateful seems to be needed only for a newborn stateful feed. As an alternative, one could drop fh:stateful and define that an empty fh:prev (referring to itself) marks the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev. +1. After going through this, fh:stateful really doesn't seem to be necessary. The presence of fh:prev would be sufficient to indicate that the feed has a history, and a blank fh:prev would work fine to indicate the end of the history. I thought about the comments on the plane yesterday, and I agree. However, I'm wary of special URI values; also, I want to preserve stateful=false. So, what about saying that you can omit fh:stateful *if* fh:prev is in the feed? -- Mark Nottingham http://www.mnot.net/ I would say that if fh:prev is present, stateful=true is assumed. If fh:prev is not present, stateful=false is assumed. Omit fh:prev in the final feed in the chain and you know you've reached the end. - James
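James's implied-value rule is simple enough to state as code. A minimal sketch, assuming a hypothetical document shape (a dict with an optional "fh:prev" key standing in for the parsed feed):

```python
def interpret(feed_doc):
    """Return (stateful, end_of_chain) under the implied-value rule:
    fh:prev present => stateful=true; absent => stateful=false, and
    the final feed in the chain is the one that omits fh:prev."""
    prev = feed_doc.get("fh:prev")
    stateful = prev is not None
    end_of_chain = prev is None
    return stateful, end_of_chain

print(interpret({"fh:prev": "http://example.org/archive1"}))  # (True, False)
print(interpret({}))                                          # (False, True)
```

Note the wrinkle Stefan raises later in the thread: under this rule a brand-new stateful feed with no history yet is indistinguishable from a stateless one, which is exactly why Mark wants to keep an explicit stateful=false available.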
Re: Feed History -02
There is precedent for using atom:link in RSS feeds. See http://feeds.feedburner.com/ITConversations-EverythingMP3. I really don't think it's a problem. Mark Nottingham wrote: That's what I originally did, but I have a rather strong preference to make a single syntax work in RSS and Atom. atom:link is (naturally) specific to Atom, and people will balk at using the atom namespace in RSS feeds. That's not to say that every Atom extension should be usable in RSS, but I think this one is simple -- and valuable -- enough to do it; and, I don't see any technical benefit to using the link relation. Cheers, On 18/07/2005, at 12:32 PM, Henry Story wrote: If that is right, and so fh:stateful is not needed, then would it not be simpler to extend the link element in the following really simple way: <link rel="http://purl.org/syndication/history/1.0" type="application/atom+xml" href="http://example.org/2003/11/index.atom"/> Just a thought. -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
On 18/07/2005, at 2:17 PM, Stefan Eissing wrote: On a more semantic issue: The described sync algorithm will work. In most scenarios the abort condition (e.g. all items on a historical feed are known) will also do the job. However, this still means that clients need to check the first fh:prev document if they know all entries there - if my understanding is correct. This is one of the unanswered questions that I left out of scope. The consumer can examine the previous archive's URI and decide whether it has seen it before, and therefore avoid fetching it if it already has. However, in this approach, it won't see changes that are made in the archive (e.g., if a revision -- even a spelling correction -- is made to an old entry); to do that it either has to walk back the *entire* archive each time, or the feed has to publish all changes -- even to old entries -- at the head of the feed. I left it out because it has more to do with questions about entry deletion and ordering than with recovering state. It's an arbitrary decision (I had language about this in the original Pace I made), but it seemed like a good trade-off between complexity and capability. Does that make sense, or am I way off-base? Is it worthwhile to think of something to spare clients and servers this lookup? Are the HTTP caching and If-* header mechanisms good enough to save network bandwidth? An alternate strategy would be to require that fh:prev documents never change once created. Then a client can terminate the sync once it sees a URI it already knows. And most clients would not do more lookups than they are doing now... -- Mark Nottingham http://www.mnot.net/
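Stefan's alternate strategy -- immutable archives, stop at the first known URI -- can be sketched as follows. The `fetch` callable and the document shape (a dict with "entries" and "prev" keys) are hypothetical stand-ins for fetching and parsing a feed document:

```python
def sync(head_uri, fetch, seen_archives):
    """Walk the fh:prev chain from the head feed, collecting entries.

    fetch(uri) -> {"entries": [...], "prev": uri_or_None}
    seen_archives: set of archive URIs already processed; since archives
    are assumed immutable, reaching one of these ends the walk.
    """
    entries = []
    uri = head_uri
    while uri is not None:
        if uri != head_uri and uri in seen_archives:
            break  # immutable archive already known: nothing new beyond here
        doc = fetch(uri)
        entries.extend(doc["entries"])
        if uri != head_uri:
            seen_archives.add(uri)
        uri = doc["prev"]
    return entries

# Hypothetical three-document chain: head -> a2 -> a1
docs = {
    "head": {"entries": ["e5", "e4"], "prev": "a2"},
    "a2":   {"entries": ["e3", "e2"], "prev": "a1"},
    "a1":   {"entries": ["e1"],       "prev": None},
}
seen = {"a1"}  # the client processed a1 on an earlier sync
got = sync("head", docs.__getitem__, seen)
print(got)  # a1 is skipped, so e1 is not re-fetched
```

The trade-off Mark describes falls out directly: the walk never revisits a known archive, so any later revision to an old entry inside that archive goes unseen unless it is republished at the head.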
Re: Feed History -02
On a more semantic issue: The described sync algorithm will work. In most scenarios the abort condition (e.g. all items on a historical feed are known) will also do the job. However, this still means that clients need to check the first fh:prev document if they know all entries there - if my understanding is correct. Is it worthwhile to think of something to spare clients and servers this lookup? Are the HTTP caching and If-* header mechanisms good enough to save network bandwidth? An alternate strategy would be to require that fh:prev documents never change once created. Then a client can terminate the sync once it sees a URI it already knows. And most clients would not do more lookups than they are doing now... //Stefan
Re: Feed History -02
On 18/07/2005, at 1:29 PM, Stefan Eissing wrote: I agree that special URIs are not that great either. Another idea might be nested elements: stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history> stateful initial feed: <fh:history/> stateless feed: <fh:history><fh:none/></fh:history> Hmm. My thinking was that allowing stateful to be omitted would be concise and unambiguous; to compare, stateful feed: <fh:prev>http://example.org/thingie1.1</fh:prev> stateful initial feed: <fh:stateful>true</fh:stateful> stateless feed: <fh:stateful>false</fh:stateful> -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
On 18.07.2005 at 19:33, Mark Nottingham wrote: On 18/07/2005, at 1:29 PM, Stefan Eissing wrote: I agree that special URIs are not that great either. Another idea might be nested elements: stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history> stateful initial feed: <fh:history/> stateless feed: <fh:history><fh:none/></fh:history> Hmm. My thinking was that allowing stateful to be omitted would be concise and unambiguous; to compare, stateful feed: <fh:prev>http://example.org/thingie1.1</fh:prev> stateful initial feed: <fh:stateful>true</fh:stateful> stateless feed: <fh:stateful>false</fh:stateful> Fine with me. As I said, the discussion has reached the syntactic sugar level, and your proposal has the same semantics and no nested elements. To be clear, I would advise clients that fh:prev takes precedence over any fh:stateful information; then any ambiguity is resolved. //Stefan
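Stefan's precedence rule is easy to pin down as a small decision function. A sketch, with hypothetical argument shapes (the raw fh:stateful text, or None when the element is absent, plus a flag for fh:prev):

```python
def resolve_stateful(stateful_value, prev_present):
    """Resolve a feed's statefulness, letting fh:prev win.

    stateful_value: 'true', 'false', or None if fh:stateful is absent.
    prev_present: whether an fh:prev element occurs in the document.
    """
    if prev_present:
        return True  # fh:prev overrides any contradictory fh:stateful
    if stateful_value is not None:
        return stateful_value == "true"
    return False  # neither marker present: assume stateless

print(resolve_stateful("false", True))   # True: fh:prev takes precedence
print(resolve_stateful("true", False))   # True: newborn stateful feed
print(resolve_stateful(None, False))     # False
```

With this rule in place, the contradictory case ("stateful=false" alongside an fh:prev) resolves deterministically, which is the ambiguity Stefan wants closed.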
Re: Feed History -02
Heh... the same questions could be asked about a lot of stuff embedded in RSS, but that's not the issue ;-) ... fh:prev works fine. There really isn't a strong argument in favor of link. I have my own personal preferences but those are actually quite irrelevant :-) I'll still maintain that fh:stateful is quite unnecessary but doesn't hurt anything if it's used -- you'll just need to be explicit about such things as what happens if a fh:prev is used without a fh:stateful being present. Mark Nottingham wrote: Not a problem, as such, but I don't see any benefit to reuse. It also begs the question of what an atom element in a non-Atom document means, how it's processed, etc. On 18/07/2005, at 1:03 PM, James M Snell wrote: There is precedent for using atom:link in RSS feeds. See http://feeds.feedburner.com/ITConversations-EverythingMP3. I really don't think it's a problem. -- Mark Nottingham http://www.mnot.net/
Re: Feed History -02
On 18.07.2005 at 18:59, James M Snell wrote: Mark Nottingham wrote: On 18/07/2005, at 11:10 AM, James M Snell wrote: Ch 3. fh:stateful seems to be needed only for a newborn stateful feed. As an alternative, one could drop fh:stateful and define that an empty fh:prev (referring to itself) marks the last document in a stateful feed. That would eliminate the cases of wrong mixes of fh:stateful and fh:prev. +1. After going through this, fh:stateful really doesn't seem to be necessary. The presence of fh:prev would be sufficient to indicate that the feed has a history, and a blank fh:prev would work fine to indicate the end of the history. I thought about the comments on the plane yesterday, and I agree. However, I'm wary of special URI values; also, I want to preserve stateful=false. So, what about saying that you can omit fh:stateful *if* fh:prev is in the feed? -- Mark Nottingham http://www.mnot.net/ I would say that if fh:prev is present, stateful=true is assumed. If fh:prev is not present, stateful=false is assumed. Omit fh:prev in the final feed in the chain and you know you've reached the end. stateful gives a hint to a client about caching entries and maybe their representation in a user interface. It may be desirable to see that a new feed is stateful (or not) even if it has no history yet. That is why I came up with the empty prev link as a suggestion. I agree that special URIs are not that great either. Another idea might be nested elements: stateful feed: <fh:history><fh:prev>http://example.org/thingie1.1</fh:prev></fh:history> stateful initial feed: <fh:history/> stateless feed: <fh:history><fh:none/></fh:history> So much for the syntactic sugar... //Stefan
Feed History -02
The Feed History draft has been updated to -02; http://ftp.ietf.org/internet-drafts/draft-nottingham-atompub-feed-history-02.txt The most noticeable change in this version is the inclusion of a namespace URI, to allow implementation. I don't intend to update it for a while, so as to gather implementation feedback. Cheers, -- Mark Nottingham http://www.mnot.net/