Re: PaceFeedState

Mark Nottingham Sun, 21 Nov 2004 08:52:47 -0800

On Nov 20, 2004, at 8:56 PM, Bill Kearney wrote:


Are you proposing that something publishing a feed keep track of what
entries were in previous versions?  As in, feed X retrieved just now,
contains entries a,b,c and feed Y was published last containing d,e,f.


I think you mean 'feed document' instead of 'feed' here, correct?

How you do this is implementation-specific; see below. All I'm really suggesting is that Atom Feed Documents are in fact persistent (in that there's a URI for the particular instance you're viewing); they can be referred to by other Atom Feed Documents to reconstruct an Atom Feed.

What sort of headers are you suggesting for providing info on uninformed requests? That is, from an aggregator that's never tried connecting before and doesn't have any previous knowledge of the feed? This being how most RSS aggregators behave currently.

If they want to try to get the whole state of the feed, they can do so by either of the methods described. If they can't reconstruct it, they'll inform the user that they can't, but can show the best guess. Do you think more is necessary?
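As a rough illustration of the client side, here is a minimal sketch of an aggregator walking 'prev' links back from the current Feed Document. The `fetch` callable, the dict-shaped documents with `prev` and `entries` keys, and the `max_hops` limit are all assumptions made for the example; none of them comes from the proposal itself.

```python
def reconstruct(fetch, start_uri, last_seen=None, max_hops=50):
    """Walk 'prev' links from start_uri; return (entries, complete).

    fetch(uri) is a hypothetical callable returning a dict with 'prev'
    and 'entries' keys, or None if that document is no longer published.
    """
    entries = []
    uri = start_uri
    for _ in range(max_hops):
        if uri is None or uri == last_seen:
            # Reached the start of the chain, or a document already seen.
            return entries, True
        doc = fetch(uri)
        if doc is None:
            # An older document was deleted: best guess only.
            return entries, False
        entries.extend(doc["entries"])
        uri = doc.get("prev")
    # Gave up after max_hops; treat the result as incomplete.
    return entries, False


# Three in-memory Feed Documents standing in for published files.
docs = {
    "u3": {"prev": "u2", "entries": ["e", "f"]},
    "u2": {"prev": "u1", "entries": ["c", "d"]},
    "u1": {"prev": None, "entries": ["a", "b"]},
}
full_entries, full_complete = reconstruct(docs.get, "u3")
```

When `complete` comes back false, the aggregator would show what it has and tell the user the feed state could not be fully reconstructed.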

While I find the idea of tracking previous instances interesting, I'm not sure how many content sources are going to be capable of, let alone interested in, doing so.

How hard is it to keep the last n feed documents you've published? Conceptually, for a feed called myfeed.atom, all you need to do is:

1. extract the 'this' URI from myfeed.atom to $old
2. create a feed document containing the new entries, with a 'this' URI of $new
3. publish the new document at $new with a [EMAIL PROTECTED]'prev']/@this of $old
4. make myfeed.atom a symlink or HTTP redirect to $new
5. delete older feed files if necessary.
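The steps above can be sketched roughly as follows. This is only an illustration under several assumptions: documents are placeholder text files whose first two lines are `this=` and `prev=` rather than real Atom XML, file names of the form `feed-NNNN.atom` sort in publication order, the symlink option is used (so a POSIX-style filesystem), and `publish`, `read_this` and `keep` are hypothetical names.

```python
import os
import tempfile


def read_this(path):
    """Return the 'this' value from the first line of a placeholder document."""
    with open(path, encoding="utf-8") as f:
        return f.readline().strip().split("=", 1)[1]


def publish(feed_dir, new_name, body, keep=3, alias="myfeed.atom"):
    """Publish body as new_name, repoint alias at it, and return $old."""
    alias_path = os.path.join(feed_dir, alias)
    # Step 1: extract the 'this' URI of the current document ($old), if any.
    old = read_this(alias_path) if os.path.exists(alias_path) else None
    # Steps 2-3: write the new document with 'this' = $new and 'prev' = $old.
    with open(os.path.join(feed_dir, new_name), "w", encoding="utf-8") as f:
        f.write(f"this={new_name}\nprev={old or ''}\n{body}\n")
    # Step 4: repoint the alias by creating a temporary link and renaming it
    # over the old one, so readers never see a missing myfeed.atom.
    tmp = alias_path + ".tmp"
    if os.path.lexists(tmp):
        os.remove(tmp)
    os.symlink(new_name, tmp)
    os.replace(tmp, alias_path)
    # Step 5: delete all but the newest `keep` documents (keep >= 1 assumed).
    docs = sorted(n for n in os.listdir(feed_dir) if n.startswith("feed-"))
    for name in docs[:max(len(docs) - keep, 0)]:
        os.remove(os.path.join(feed_dir, name))
    return old


# Publish two documents into a scratch directory.
demo_dir = tempfile.mkdtemp()
publish(demo_dir, "feed-0001.atom", "entries a, b, c")
publish(demo_dir, "feed-0002.atom", "entries d, e, f")
```

An HTTP redirect in place of the symlink would leave steps 1, 2, 3 and 5 unchanged.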

I'd be more inclined to want semantics on what kinds of ranges a source can provide, be they date, numeric index or other forms; for instance, informing the aggregator that the source provides date range bounding, DNS-like serial number increments, last ETag or whatever. But tracking what was the "previous" feed seems unlikely when something like a dynamic script source is being used. There is no "previous" instance, only what the script detected as 'relevant' items at that time.

I considered a number of approaches like this, where you expose an interface that clients can query for specific entries. However, a significant number of users will not want to use a script, for both performance and deployment considerations.

As such, I wanted to have a system that doesn't require special handling on the server side. Nothing in the proposal precludes you from layering in a more complex solution. OTOH, a script can easily handle this approach if it keeps a tiny bit of extra state (which 'this' URI(s) an entry is associated with*) around.

Cheers,

* Alternatively, they could keep a list of entry IDs that the last N Feed Documents are associated with; this doesn't require per-Entry metadata, and wouldn't take as much room.**

** Or they could encode the entries a Feed Document is associated with in the 'this' URI if they didn't want to keep this state at all; e.g., link rel='this' href='http://www.example.com/entry;id1;id2;id3;id4'. This approach doesn't require any extra state.
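
A minimal sketch of that last idea, assuming a ';'-separated list appended to a base URI as in the example href. The function names are hypothetical, and the percent-encoding of each ID is an addition of this sketch so that IDs containing ';' survive the round trip; it also assumes the base URI itself contains no ';'.

```python
from urllib.parse import quote, unquote


def encode_this(base, entry_ids):
    """Build a 'this' URI that carries the entry IDs of a Feed Document."""
    # Percent-encode every reserved character, including ';', in each ID.
    return ";".join([base] + [quote(i, safe="") for i in entry_ids])


def decode_this(uri):
    """Recover (base, entry_ids) from a URI built by encode_this."""
    base, *encoded = uri.split(";")
    return base, [unquote(i) for i in encoded]


this_uri = encode_this("http://www.example.com/entry", ["id1", "id2", "id3", "id4"])
```

A script receiving a request for such a URI can regenerate that Feed Document from the IDs alone, which is why no stored state is needed.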

--
Mark Nottingham     http://www.mnot.net/
