> +1. The timestamp can still be useful when dealing with more complex
> sync use cases.
>
> In particular, without the timestamp(s) it is hard to see how the
> tombstones fit in with feed paging and date-range-based queries.


This concerned me as well. Without the timestamps, the end result is a server spitting out tombstones for every item that's ever been deleted on every feed request, since it has no knowledge of what the client is holding or what the client is interested in holding/syncing.

I was also concerned about this during the prev/next discussions, but bit my tongue when folks suggested that a page they've seen before shouldn't ever change. I don't agree. I just edited an entry from 1968. It might be buried on either page=279 or weblog/AUG-1968, but either way, it changed, and a client that really wants to stay in sync should be made aware that a change was made. I also recently removed some copyrighted material from my site, but couldn't remove said items from the perma-cache of a certain news reader whose company motto is 'Do No Evil'. Looks like a DMCA take-down is the only way to get them to flush the nuked entries.

How does one sync this stuff? The only thing that makes sense is for the client to tell the server what the client is holding and what it is interested in syncing. Then one can look at modification times and gather a list of changes to push out. In the case of a file from 1968 that was modified, a collection of updated entries starting at a client-specified time could be pushed.
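As a rough sketch of what that would look like (all names and ids here are hypothetical, not from any actual spec): the client hands over the time of its last sync, and the server pushes back every entry modified since then, however old the entry's publication date is.

```python
from datetime import datetime, timezone

# Hypothetical in-memory "server": entries keyed by id, each carrying a
# modification time. The 1968 entry was edited recently, so a sync should
# still surface it even though it is decades old.
entries = {
    "urn:entry:1968-08-01": {"modified": datetime(2004, 7, 1, tzinfo=timezone.utc),
                             "title": "Edited entry from 1968"},
    "urn:entry:2004-06-30": {"modified": datetime(2004, 6, 30, tzinfo=timezone.utc),
                             "title": "Recent entry"},
}

def changes_since(last_sync):
    """Return ids of entries modified after the client's last sync time."""
    return [eid for eid, e in entries.items() if e["modified"] > last_sync]
```

A client that last synced at noon on 2004-06-30 would get back only the 1968 entry's id, since that is the only entry modified after that instant.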

In the case of deletions, there are a few possibilities. In theory it would actually be a good idea for the client to tell the server what IDs it already has. All of them. Then the server can respond saying that id 1234 was deleted. This is hardly efficient, but it removes any need for the server to maintain state about deleted items. So I can concede that the quick and dirty solution is for the server to keep a tombstone list, even if I find it personally disturbing for the server to maintain this state. Given these constraints, the only thing that makes sense in terms of normal 'feeds' is to emit these in the proper 'order' with whatever other entries are being emitted. But there's no historical starting point. No knowledge of what the client is holding. So you've gotta send out all deletions. All of them. Every time a client connects. Forever. At least if they're in item order, you can limit the deleted-item list to the range of items currently being emitted. This might work in the classic sense of 'working'; at least there's some means of communicating deletions, but it doesn't solve the copyright take-down problem for items that fall outside the currently requested feed range.
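The stateless "client sends all its IDs" variant is just a set difference on the server side. A minimal sketch (the function name and the ids are mine, for illustration):

```python
# The client ships every id it holds; the server answers with the ids it
# no longer has. No tombstone list is kept server-side -- the cost is that
# the full id list crosses the wire on every sync.
def deleted_ids(client_ids, server_ids):
    """Ids the client holds that no longer exist on the server."""
    return sorted(set(client_ids) - set(server_ids))
```

So if the client holds ids "a", "b", and "1234" and the server only still has "a" and "b", the response is that "1234" was deleted.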

The right way to do it is to emit these via a 'sync' operation where the client specifies a start date of interest and the date of the last request it made for changes. Then you can figure out that the client's earliest cared-about entry is in 1943, and you can send back the message from 1968 that was updated last week, as well as the deletion of a copyrighted message from 1982 that happened yesterday. When the client connects again next week, you'll have another set of changes that occurred since now. This is the most efficient way of doing it (ignoring the inefficiencies of protocol chatter).
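The two client-supplied dates above can be sketched as a single server-side filter (a hypothetical shape, not any real protocol: `published`, `modified`, and `deleted` are assumed field names):

```python
from datetime import datetime, timezone

# Client-driven sync: start_of_interest bounds which entries the client
# cares about at all (e.g. everything since 1943); last_checked bounds
# which *changes* to report (e.g. everything that happened this week).
def sync(entries, tombstones, start_of_interest, last_checked):
    updated = [e for e in entries
               if e["published"] >= start_of_interest
               and e["modified"] > last_checked]
    deleted = [t for t in tombstones
               if t["published"] >= start_of_interest
               and t["deleted"] > last_checked]
    return {"updated": updated, "deleted": deleted}
```

With the example from the text: an entry published in 1968 but modified last week comes back as an update, and a 1982 entry deleted yesterday comes back as a tombstone, because both fall inside the 1943-onward window and both changed since the last check.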

So we're back to square one. To really do it right, it needs to be a client-initiated operation, because only the client knows what it has, what it cares about, and when it last checked. Anything else is a hack that will likely not survive.

mike
