Time To Live

Richard Newman Wed, 24 Jul 2013 09:58:46 -0700

One of the decisions that Weave made was to use record expiry (and pruning 
scripts) on the server.


Records uploaded by a client can have a TTL set. History and forms expire after 
60 days, tabs after 7 days, and clients after 21 days. There is a large default 
that applies to other records, but it's so large that we can ignore it.

Once a record expires, it won't be returned from queries, and eventually a 
pruning script will delete it from the database.

Uploading a new version of a record will reset the TTL and refresh the object.
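
To make the lifecycle above concrete, here is a minimal in-memory sketch. It is 
not the Weave server code: the TTLStore class, its method names, and the 
explicit "now" argument are all hypothetical, chosen only to show the three 
behaviours described (expired records hidden from queries, pruning as a 
separate later step, and re-upload resetting the TTL). Only the day counts come 
from this message.

```python
from dataclasses import dataclass

DAY = 24 * 60 * 60  # seconds

# Per-collection TTLs quoted in this message, in days.
TTL_DAYS = {"history": 60, "forms": 60, "tabs": 7, "clients": 21}


@dataclass
class Record:
    payload: str
    expires_at: float  # absolute time (seconds) at which the record expires


class TTLStore:
    """Hypothetical stand-in for one user's server-side storage."""

    def __init__(self):
        self._records = {}  # (collection, record_id) -> Record

    def put(self, collection, record_id, payload, ttl, now):
        # The client supplies the TTL; uploading again replaces the record,
        # which resets its expiry.
        self._records[(collection, record_id)] = Record(payload, now + ttl)

    def query(self, collection, now):
        # Expired records are hidden even if they have not been pruned yet.
        return {
            rid: rec.payload
            for (coll, rid), rec in self._records.items()
            if coll == collection and rec.expires_at > now
        }

    def prune(self, now):
        # The separate, expensive step: physically delete expired records.
        expired = [k for k, rec in self._records.items() if rec.expires_at <= now]
        for key in expired:
            del self._records[key]
        return len(expired)

    def stored_count(self):
        return len(self._records)
```

In this sketch a tab record uploaded at time 0 with a 7-day TTL stops appearing 
in query() results on day 7, but stored_count() still includes it until prune() 
runs. Uploading it again before or after that point gives it a fresh 7 days.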


The purpose of a TTL, as I understand it, was threefold:

* Clients can disappear without warning; because there is no strong concept of 
a connected device, Sync relies on periodic refreshes to simulate disconnection 
and cleanup.

* Sync clients don't all have the same concept of expiration, and furthermore 
they don't propagate bulk-clear events. Without a TTL, records -- even ones 
that all clients have forgotten about -- would live on the server forever, even 
when a client wiped its history or encountered automatic expiration.

* Clients produce a lot of history. A TTL helps to reduce overall space usage 
and query response sizes. And it's safe to do so, given the assumption that 
clients were canonical, and thus wiping really old stuff off the "whiteboard" 
was fine.


It has downsides:

* The pruning scripts are expensive, and we have to disable them during periods 
of high load.

* It results in extra writes: each client rewrites its own client record once a 
week to ensure that it doesn't expire.

* It doesn't recover enough space to be massively important. To quote telliott, 
"is it a gigantic win? no".

And to summarize:

09:47:56 < telliott> I'd assert that pruning never really lived up to its 
promise
09:48:04 < telliott> (or ttls, for that matter)
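
As a rough illustration of the extra-writes downside, the arithmetic can be 
sketched as below. The function names are hypothetical; the 21-day client TTL 
and the once-a-week refresh are the figures from this message, and the model 
assumes a record expires exactly when its TTL has elapsed.

```python
CLIENT_TTL_DAYS = 21       # client record TTL from this message
REFRESH_INTERVAL_DAYS = 7  # clients rewrite their own record once a week


def keepalive_writes(days_active, interval_days=REFRESH_INTERVAL_DAYS):
    """Writes made purely to stay alive, not counting the initial upload."""
    return days_active // interval_days


def still_visible(days_since_last_write, ttl_days=CLIENT_TTL_DAYS):
    """Whether the client record would still be returned from queries."""
    return days_since_last_write < ttl_days


print(keepalive_writes(365))  # refresh writes per client over one year
print(still_visible(20))      # a client that has been quiet for 20 days
print(still_visible(21))      # the same client one day later
```

Under these assumptions a client pays 52 refresh writes a year to avoid 
expiring, and any client that is quiet for three weeks drops out of view 
whether or not it has actually gone away.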



I don't think the TTL approach works for Sync.next, for several reasons.

* We plan to have a strong concept of attached clients, and device management 
outside of storage. The storage server shouldn't be making the time-based 
decision that a client has disappeared, and it's questionable whether there's 
value in doing so.

* We're aiming for consistent storage, which is mostly incompatible with some 
old records just disappearing without client action!

* We are moving in the direction of durable, if not entirely canonical, server 
storage. This somewhat implies shared state, rather than Sync's non-propagating 
model -- "profile in the cloud", not "whiteboard". The decision was already 
made for Sync 2.0 to propagate wipes: Bug 578694. That means that a *client* 
should decide when data should go away, and existing clients should have the 
same view of the world as a new client learning all it knows from the server.

* TTLs are broadly incompatible with extended offline usage or simple recovery 
scenarios. (There are tradeoffs here: what if you just stop using an old phone? 
Do we keep data around, and stick to old data formats, to make it possible for 
that phone to sync more easily? I think my point is that we shouldn't be 
routinely deciding that a client has gone AWOL.)

* Much of the win from TTLs is improved query speed when iterating over whole 
collections. I think all of our proposed storage mechanisms will provide query 
mechanisms that avoid whole-collection iterations.


Thoughts, folks?

-R
_______________________________________________
Sync-dev mailing list
[email protected]
https://mail.mozilla.org/listinfo/sync-dev
