Looking at database backends for the data we’re talking about storing, these are the ranked concerns that I have heard:
1. availability
2. performance
3. consistency and partition tolerance

Basically, if the service is up, that’s good. The data that we hold here is either very short lived (the data related to a push) or easily repaired (device registrations of push URLs). It’s a little more difficult when we get to revocation of URIs, or storing of state for URIs, but I’m going to leave that for a later discussion.

The main concern that I have here is service availability in light of data centre failures. In my experience, these happen far more often than is acceptable. For reference, when I was at Skype, Microsoft had numbers that were pretty close to their promise of 99.9% availability. But I need to emphasise this - 99.9% is not good enough for a whole system. 99.99% might be. And because the cloud platform is only one component of a system, the overall availability will be lower: if every request needs two components that are each up 99.9% of the time, and they fail independently, the pair is only up about 99.8% of the time.

So geographic distribution of data is critical. This is managed with varying degrees of sophistication by the different storage systems. The biggest part is how they trade off responsiveness and fault recovery. If updating a row requires a cross-geography request, that’s going to be slow, but it means that you get very good failure characteristics. At the same time, it can also mean that you are unable to perform updates in certain types of failure scenario.

The key feature - one that most databases provide - is the ability to tune this to application needs. We’re going to want to tune this initially so that updates don’t depend on full replication, since we care more about availability and performance than consistency and partition tolerance.

Run-down of options

Now, I have only a small amount of background with these specific items, unfortunately; my experience is with stuff that I believe some of you might think to be poisonous, plus some stuff we just can’t use.
Nonetheless, here are what I believe to be the high runner options:

MongoDB

This is very widely used, readily deployed to the cloud platform of your choice and it has a pretty good story when it comes to performance. Mongo does have a geographic redundancy story. It’s not covered in glory, but nor is it entirely embarrassing. I’m less encouraged by the characteristics of the geographic redundancy features; replica sets can only be statically configured to prefer local communication, which means that a failure in the local cluster will result in cross-geo operations for all requests that block on replication. I’m also a little concerned about how placement is controlled within a cluster.

http://docs.mongodb.org/manual/core/replica-set-architectures/

Cassandra

Cassandra is also widely used, with similar characteristics to Mongo. This has a larger number of options with respect to geographic redundancy. The topology aware replication mode offers some pretty good opportunities, particularly when deployed with a “snitch” that is aware of the deployment layout.

http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architectureDataDistributeReplication_c.html

Redis

Redis is probably the simplest option here. Its high availability option is still unstable, so I’m not going to recommend it.

Dynamo

This would tie us to AWS, but it seems like a capable DB. The problem is that in the short time I looked, I couldn’t uncover any details on their geographic redundancy story. That is not encouraging.

Roll your own geo redundancy

This remains a viable option…if you really need the performance/availability/other characteristic. Typically you take an existing store (pick one, any one) and you add your own brand of geographic redundancy to suit your needs. This has the advantage of being exactly what you need, but the disadvantage of it being a bunch of extra work.
I’m not going to recommend this either; but it’s an option that may become worth considering later, unless our database friends really pick up their collective game.

Others

I could provide info on memcached(b), Azure, Riak, etc., all of which have their merits, but none of which are really strong contenders for the crown.

Summary

I think that we could make either Mongo or Cassandra work for us. If we were truly serious about storing hard state, then I think Cassandra offers more control, but I think that it would be a harder challenge from an operational perspective to use and deploy.

At this stage, I’m going to suggest that we pick Mongo, even though I think Cassandra might be functionally superior. As long as we create and maintain a good abstraction for our data store, I think that we can switch if needs change.

…always create and maintain a good abstraction for stuff like this.

_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
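
As a footnote to the summary, here is a minimal sketch of what a data store abstraction could look like. Everything in it is an assumption made for illustration: the names RegistrationStore, InMemoryRegistrationStore and reregister are hypothetical, and the three operations are only a guess at what the device registration data needs. It is not a proposed API, just a picture of the idea that application code talks to an interface and the backend sits behind it.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class RegistrationStore(ABC):
    """Hypothetical storage interface; all names here are illustrative."""

    @abstractmethod
    def put_registration(self, device_id: str, push_url: str) -> None:
        """Store or overwrite the push URL registered for a device."""

    @abstractmethod
    def get_registration(self, device_id: str) -> Optional[str]:
        """Return the push URL for a device, or None if it is unknown."""

    @abstractmethod
    def delete_registration(self, device_id: str) -> None:
        """Forget a device; deleting an unknown device is a no-op."""


class InMemoryRegistrationStore(RegistrationStore):
    """Stand-in backend for tests. A Mongo or Cassandra backend would
    implement the same three methods (assumption, not shown here)."""

    def __init__(self) -> None:
        self._urls: Dict[str, str] = {}

    def put_registration(self, device_id: str, push_url: str) -> None:
        self._urls[device_id] = push_url

    def get_registration(self, device_id: str) -> Optional[str]:
        return self._urls.get(device_id)

    def delete_registration(self, device_id: str) -> None:
        self._urls.pop(device_id, None)


def reregister(store: RegistrationStore, device_id: str,
               push_url: str) -> Optional[str]:
    """Application code depends only on the interface, not the backend."""
    store.put_registration(device_id, push_url)
    return store.get_registration(device_id)
```

Because reregister only sees RegistrationStore, switching backends later would mean writing one new subclass and leaving the callers alone, which is the whole point of keeping the abstraction tidy.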

