Re: PaceFeedState

Joe Gregorio Wed, 24 Nov 2004 08:38:23 -0800

On Wed, 24 Nov 2004 08:09:26 -0800, Mark Nottingham <[EMAIL PROTECTED]> wrote:
> Hi Joe,
> 
> 
> 
> > I think a simple <link rel="prev"/> in the head of a feed which points
> > to the 'previous' feed would be all that is required. The client can
> > then, at their discretion, keep following 'prev's back until they are
> > satisfied. Leave it up to the client what to do with duplicate entry
> > id's if it encounters them (but note in the Pace that it could
> > happen).
> >
> > The entire discussion of Feed State Model can be dropped, the heart of
> > the Pace being:
> [...]
> > But I would drop the part about "until it encounters a link to a
> > document it already has seen". That may not be a good metric to go by.
> 
> I disagree. If clients have their own criteria for how far back they
> should look, or for how they combine the entries they see into a set,
> they'll act differently, and consistency is important here. One of the
> biggest complaints I have about RSS is that different aggregators have
> different concepts of what my feed is.


This may be where our disagreement hinges. When I say 'client'
I am referring to a much larger range of applications than just
'aggregators'.

> By having a well-specified model
> of how to reconstruct the feed, as well as a model for what a feed is,
> we can assure that all consumers see the same set of entries.
> 
> If we just leave it up to the consumer to decide whether they've seen
> all of the entries, they'll use heuristics to do it, and they'll fall
> into traps in figuring it out. I'd rather have one algorithm that's
> well-tested and known to work.
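
To make the rule under discussion concrete, here is a minimal sketch of
"follow 'prev' until you reach a document you have already seen". The
FeedDoc shape, the fetch callable and the sample URLs are assumptions made
up for illustration; neither the Pace nor this thread defines them. On
duplicate entry ids it keeps the newest copy, which is only one possible
policy.

```python
from typing import Callable, Dict, List, Optional, Set, Tuple

# Hypothetical in-memory shape of a fetched feed document:
# (list of (entry_id, content) pairs, URL of the 'prev' document or None).
FeedDoc = Tuple[List[Tuple[str, str]], Optional[str]]


def reconstruct(start_url: str,
                fetch: Callable[[str], FeedDoc],
                seen_docs: Set[str]) -> Dict[str, str]:
    """Walk 'prev' links from start_url, newest document first.

    Stops when a document has no 'prev' link or when the next URL is
    already in seen_docs. For duplicate entry ids the newest copy wins.
    """
    entries: Dict[str, str] = {}
    url: Optional[str] = start_url
    while url is not None and url not in seen_docs:
        doc_entries, prev_url = fetch(url)
        seen_docs.add(url)
        for entry_id, content in doc_entries:
            # Documents are visited newest first, so keep the first copy.
            entries.setdefault(entry_id, content)
        url = prev_url
    return entries


# Made-up documents: feed3 -> feed2 -> feed1, with entry e4 revised in feed3.
docs: Dict[str, FeedDoc] = {
    "feed3": ([("e5", "five"), ("e4", "four, revised")], "feed2"),
    "feed2": ([("e4", "four"), ("e3", "three")], "feed1"),
    "feed1": ([("e2", "two"), ("e1", "one")], None),
}
seen = {"feed1"}  # the client fetched feed1 on an earlier visit
result = reconstruct("feed3", docs.__getitem__, seen)
```

In this run the walk stops before refetching feed1, so result holds only
the entries from feed3 and feed2.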

To me, different aggregators working differently isn't such a bad
thing. For example, if an item gets updated, does the aggregator
display the updated item as new? Suppress displaying it? Display diffs
between the versions of the entry?
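
As a rough illustration of where detection ends and policy begins, the
sketch below only classifies entries; what to do with an "updated" entry
is left to the application. It assumes each entry carries some
modification stamp that changes when the entry changes, which is an
assumption for this example and not something settled in this thread.

```python
from typing import Dict


def classify(stored: Dict[str, str], fetched: Dict[str, str]) -> Dict[str, str]:
    """Map each fetched entry id to 'new', 'updated' or 'unchanged'.

    Both arguments map entry id -> modification stamp (hypothetical field).
    """
    status: Dict[str, str] = {}
    for entry_id, stamp in fetched.items():
        if entry_id not in stored:
            status[entry_id] = "new"
        elif stored[entry_id] != stamp:
            status[entry_id] = "updated"
        else:
            status[entry_id] = "unchanged"
    return status


# Made-up stamps: e2 was edited since the last fetch, e3 is new.
stored = {"e1": "2004-11-20", "e2": "2004-11-21"}
fetched = {"e1": "2004-11-20", "e2": "2004-11-24", "e3": "2004-11-24"}
status = classify(stored, fetched)
```

One client might show the "updated" entry as new, another might hide it,
and a third might diff the two versions; the classification is the same in
all three cases.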

 
> For example, if a client decides that it's satisfied if the set of
> entries is the same as the last time it saw the feed, it won't go and
> look one further back. However, what if there were a series of
> snapshots that looked like this?
> 
> entry1
> entry2
> entry3
> ---
> entry4
> entry5
> entry6
> ---
> entry1
> entry2
> entry3

OK, that veers wildly from what I thought a series of snapshots would
look like. I was considering a 'prev' hopping back in time by either a
week or a month. I don't know if designing for a case like the one you
have outlined above is worth the effort.


> A client that only saw the first one would look at the last one and
> miss the fact that 4,5 and 6 were in the middle.
> 
> Likewise, if we don't say how to combine entries into a set, clients
> will use different rules. I actually think we need more guidance here;
> e.g., how to detect changed entries.
> 
> 
> > For example, what if I have my top level feed with the last 10 items
> > in it, and each feed's 'prev' link points back to the previous 10
> > entries? That means that if I have 100 entries on my site then I've
> > got 9 'prev' links.
> >
> >     http://example.org/feed.cgi?start=100
> >     http://example.org/feed.cgi?start=90
> >     http://example.org/feed.cgi?start=10
> >
> > Now what if I add another entry to my site, 101. Then I have 10 *new*
> > 'prev' links:
> >
> >     http://example.org/feed.cgi?start=101
> >     http://example.org/feed.cgi?start=91
> >     http://example.org/feed.cgi?start=11
> >     http://example.org/feed.cgi?start=1
> >
> > Not the most efficient mechanism, but certainly plausible and it
> > causes problems with your spidering heuristic.
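
The shifting-offset effect can be checked with a few lines. This sketch
assumes 'start' names the newest entry in a document and that each
document holds 10 entries, which is one reading of the URLs above; the
function name feed_urls is made up for the example.

```python
from typing import List


def feed_urls(total_entries: int, page_size: int = 10,
              base: str = "http://example.org/feed.cgi") -> List[str]:
    """URLs of every feed document, newest first, one per page_size entries."""
    return [f"{base}?start={start}"
            for start in range(total_entries, 0, -page_size)]


before = feed_urls(100)   # start=100, 90, ..., 10
after = feed_urls(101)    # start=101, 91, ..., 11, 1
shared = set(before) & set(after)
```

Under these assumptions shared is empty: adding one entry changes every
document URL, so a client that stops at "a document it has already seen"
never finds one.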
> 
> I agree that this is a problem with the approach I described earlier;
> thanks for pointing it out. Rather than take that approach, a "fully
> dynamic" server will need to keep a table in this form;
> 
> [ 'snapshot15': ['entry111','entry112'],
>    'snapshot14': ['entry100' ... 'entry110'],
>    ...
> ]
> 
> where each snapshot corresponds to a Feed Document Resource (FDR?).
> Once enough entries are added to the most recent snapshot (15 here),
> another is created. So, when someone requests the latest feed, it will
> get a 'this' of http://www.example.com/feeddb?id=snapshot15 and a
> 'prev' of http://www.example.com/feeddb?id=snapshot14.
> 
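
For reference, a minimal sketch of the snapshot table described above. The
class name, the capacity parameter and the 1-based snapshot numbering are
assumptions for illustration; only the 'this'/'prev' URL pattern comes
from the quoted text.

```python
from typing import Dict, List, Optional


class SnapshotFeed:
    """Append-only snapshots: a full snapshot is never modified again."""

    def __init__(self, capacity: int,
                 base: str = "http://www.example.com/feeddb") -> None:
        self.capacity = capacity            # entries per snapshot (assumed)
        self.base = base
        self.snapshots: List[List[str]] = [[]]

    def add_entry(self, entry_id: str) -> None:
        # Start a new snapshot once the most recent one is full.
        if len(self.snapshots[-1]) >= self.capacity:
            self.snapshots.append([])
        self.snapshots[-1].append(entry_id)

    def url(self, number: int) -> str:
        return f"{self.base}?id=snapshot{number}"

    def latest_links(self) -> Dict[str, Optional[str]]:
        latest = len(self.snapshots)        # snapshots are numbered from 1
        prev = self.url(latest - 1) if latest > 1 else None
        return {"this": self.url(latest), "prev": prev}


feed = SnapshotFeed(capacity=2)
for n in range(1, 6):
    feed.add_entry(f"entry{n}")
links = feed.latest_links()
```

Here five entries with a capacity of two give three snapshots, so 'this'
is snapshot3 and 'prev' is snapshot2. Unlike the offset-based URLs, the
URLs of full snapshots stay stable as entries are added.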

What makes me nervous about this is that it seems to be veering
closer to Atom protocol territory and not just syndication.

    -joe

-- 
Joe Gregorio        http://bitworking.org
