James M Snell wrote:
> Brian Smith wrote:
> > James Snell wrote at http://www.snellspace.com/wp/?p=818:
> >> And Because the entire snapshot is contained within a single feed 
> >> document, we do not have to worry about the race condition that is 
> >> inherent in the rolling window feed paging model.
> > 
> > The race condition is not an inherent problem in the paging 
> > model; it is an implementation issue. In my implementation, I 
> > do all the paging in my collection feed using links like:
> 
> Note that I specifically indicated the rolling window feed 
> paging model.
>  Yes, there are ways of doing the paging where the pages are 
> stable, and if we can rely on folks doing it that way 
> consistently then paging is fine.  The key requirement is 
> that the snapshot has to remain stable throughout the sync process.

To be clear, I don't provide any kind of snapshot mechanism. The client needs 
to go back to the start of the feed (the most recently edited entries) after it 
has paged through all the older entries, to see whether some older entry has been 
modified during the paging process. If some entry or entries are constantly 
being edited, then it is possible that the client will never see them. On the 
other hand, what good does it do for the client to receive an old 
representation of an entry that is currently being relentlessly edited? The 
cost of implementing snapshotting seems to nearly always outweigh any benefits 
it might have.
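The two-pass strategy above can be sketched in a few lines. This is a minimal illustration, not Brian's actual implementation: the page map, the `fetch()` stand-in for an HTTP GET, and the link names are all made up for the example.

```python
# Hypothetical feed, newest-first: each page is (entries, prev_link),
# where prev_link points at older entries and is None on the last page.
# fetch() stands in for an HTTP GET of one feed page.
PAGES = {
    "/feed":          (["e9", "e8", "e7"], "/feed?until=t7"),
    "/feed?until=t7": (["e6", "e5", "e4"], "/feed?until=t4"),
    "/feed?until=t4": (["e3", "e2", "e1"], None),
}

def fetch(link):
    return PAGES[link]

def sync():
    """Walk every page via 'prev' links, then re-read the feed head:
    any entry edited while we were paging moves to the start of the
    feed, so a second look at the head picks it up."""
    seen = set()
    link = "/feed"
    while link is not None:
        entries, link = fetch(link)
        seen.update(entries)
    # Second pass over the head to catch entries edited during paging.
    head, _ = fetch("/feed")
    seen.update(head)
    return sorted(seen)

assert sync() == ["e1", "e2", "e3", "e4", "e5", "e6", "e7", "e8", "e9"]
```

As the paragraph notes, an entry that is edited continuously can keep moving to the head faster than the client re-reads it; the loop converges only when edits pause.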

The only guarantees that I make beyond what AtomPub requires are: (1) if any 
entries are skipped during paging, they will appear at the start of the feed, 
(2) navigating to each "prev" page will always result in older entries than the 
ones in the current page, and (3) navigating to each "next" page will result in 
an empty page or in entries newer than the ones in the current page. 
Implementations that do paging by numbering each page ("?page=2", "?page=3") 
often cannot make those guarantees when entries are being added/edited faster 
than the client can page through the feed.
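The difference between timestamp-pinned links and numbered pages can be shown concretely. The sketch below assumes a feed sorted newest-first by edit time; the store, page size, and link scheme are invented for the example, but the point is general: a "prev" link pinned to an edit timestamp keeps naming the same entries after new entries arrive, whereas a `?page=2` link would shift by one.

```python
from datetime import datetime, timedelta

# Hypothetical in-memory store: (entry_id, edited_timestamp) pairs,
# kept sorted newest-first, as a collection feed would present them.
T0 = datetime(2007, 1, 1)
entries = [("e%d" % i, T0 + timedelta(hours=i)) for i in range(10)]
entries.sort(key=lambda e: e[1], reverse=True)

PAGE_SIZE = 4

def page(until=None):
    """Return one feed page: entries edited strictly before `until`,
    newest-first, plus the 'prev' link's until= value (the oldest
    edit time on this page), or None when this is the last page."""
    window = [e for e in entries if until is None or e[1] < until]
    body = window[:PAGE_SIZE]
    prev_until = body[-1][1] if len(window) > PAGE_SIZE else None
    return body, prev_until

# Page through once, recording the second page.
first, cursor = page()
second, _ = page(cursor)

# A new entry arrives at the head of the feed...
entries.insert(0, ("e10", T0 + timedelta(hours=10)))

# ...but the same 'prev' link still yields exactly the same second
# page: older pages are stable. A numbered '?page=2' link would have
# shifted by one entry instead.
second_again, _ = page(cursor)
assert second_again == second
```

Skipped entries (those edited mid-pass) satisfy guarantee (1): their new edit time moves them ahead of any `until=` boundary, so they reappear at the start of the feed.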

> That said, however, once the initial sync is completed, 
> subsequent incremental syncs are much more efficient if 
> paging is not used (e.g. give me one feed document describing
> just the things that have changed over the last twelve hours, etc).

On the other hand, if there were hundreds of entries updated in the last 12 
hours (e.g. a server upgrade which modifies the representation of every entry, 
or whatever else WordPress users do that causes every entry to appear updated), 
the client and the server will probably both benefit from chunking that up. 
Especially on mobile phones, the larger the response, the less likely it is 
that the service provider's proxies will let it through to the device.

> Again, this would work so long as the pages remain stable; 
> however, I would very much like to see a comparison of how 
> the two approaches compare in terms of performance / efficiency.

Well, obviously paging requires more round trips whenever a sync spans multiple 
pages; if the server uses a single page per resync, the two approaches are 
equivalent. In fact, disregarding snapshotting, my solution is the same as 
yours, except that in mine the server controls the values of the "since" and 
"until" parameters, whereas in yours the client controls them, and mine allows 
the server to break the response up into multiple (cacheable) pages.
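To illustrate what server-chosen boundaries buy: when both ends of a window are fixed by the server, every link names an immutable set of entries and can be cached indefinitely; only the final, open-ended page ever changes. This is a sketch under invented names (`window_links`, the `/feed` URL, the `%H:%M` timestamp format), not a description of either implementation.

```python
from datetime import datetime, timedelta

def window_links(feed_url, since, until, page_hours=4):
    """Break the changes-since-`since` window into fixed sub-windows.
    Because both boundaries of every link are server-chosen and
    closed, each URL always names the same set of entries and is
    safely cacheable; a client-chosen open-ended since= query is not."""
    links, lo = [], since
    while lo < until:
        hi = min(lo + timedelta(hours=page_hours), until)
        links.append("%s?since=%s&until=%s" % (
            feed_url, lo.strftime("%H:%M"), hi.strftime("%H:%M")))
        lo = hi
    return links

# A twelve-hour resync, chunked into three cacheable pages.
links = window_links("/feed",
                     datetime(2007, 1, 1, 0), datetime(2007, 1, 1, 12))
assert links == ["/feed?since=00:00&until=04:00",
                 "/feed?since=04:00&until=08:00",
                 "/feed?since=08:00&until=12:00"]
```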

- Brian

