WebRTC service database selection

Martin Thomson Thu, 20 Feb 2014 12:31:22 -0800

Looking at database backends for the data we’re talking about storing, these 
are the ranked concerns that I have heard:

1. availability
2. performance
3. consistency and partition tolerance

Basically, if the service is up, that’s good. The data that we are holding here
is either very short-lived (that related to the push) or easily repaired
(device registrations of push URLs).

It’s a little more difficult when we get to revocation of URIs, or storing of
state for URIs, but I’m going to leave that for a later discussion.

The main concern that I have here is service availability in light of data
centre failures. In my experience, these happen far more often than is
acceptable. For reference, when I was at Skype, Microsoft had numbers that
were pretty close to their promise of 99.9% availability.

But I need to emphasise this: 99.9% is not good enough for a whole system.
99.99% might be. And because the cloud platform is only one component of a
system, the overall availability will be lower.
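
To put rough numbers on that, here is a minimal sketch of the arithmetic. It
assumes that components fail independently and that the system needs all of
them to be up; the second 99.9% component is a made-up figure for
illustration, not a measurement.

```python
from math import prod

MINUTES_PER_YEAR = 365 * 24 * 60


def serial_availability(components):
    """Availability of a system that needs every component to be up."""
    return prod(components)


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per (non-leap) year."""
    return (1 - availability) * MINUTES_PER_YEAR


def redundant_availability(availability, copies):
    """Availability when any one of `copies` independent copies suffices."""
    return 1 - (1 - availability) ** copies


if __name__ == "__main__":
    platform = 0.999      # the cloud platform's promised availability
    our_service = 0.999   # hypothetical figure for everything we add on top
    combined = serial_availability([platform, our_service])
    print(f"combined availability: {combined:.6f}")
    print(f"downtime, min/year:    {downtime_minutes_per_year(combined):.1f}")
    print(f"two independent sites: {redundant_availability(platform, 2):.6f}")
```

Under those assumptions, 99.9% works out to roughly 8.8 hours of downtime a
year, and 99.99% to a little under an hour. The last function shows why
spreading data over independent sites helps, although real failures are often
not fully independent.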

So geographic distribution of data is critical. This is managed with varying
degrees of sophistication by the different storage systems. The biggest part
is how they trade off responsiveness and fault recovery. If updating a row
requires a cross-geography request, that’s going to be slow, but it means that
you get very good failure characteristics. At the same time, it can also mean
that you are unable to perform updates in certain types of failure scenario.

The key feature - one that most databases provide - is the ability to tune this
to application needs. We’re going to want to tune this initially so that
updates don’t depend on full replication, since we care more about availability
and performance than consistency and partition tolerance.
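
One way to picture that tuning knob is the toy model below. It is not any
particular database's API: the function names and the latency figures are
invented for illustration, and it assumes that replicas are written in
parallel.

```python
def write_latency_ms(replica_latencies_ms, acks_required):
    """Latency of a write that returns once `acks_required` replicas confirm.

    Replicas are contacted in parallel, so the write completes when the
    slowest of the fastest `acks_required` replicas has answered.
    """
    if not 1 <= acks_required <= len(replica_latencies_ms):
        raise ValueError("acks_required must be between 1 and replica count")
    return sorted(replica_latencies_ms)[acks_required - 1]


def can_write(replicas_up, acks_required):
    """True if enough replicas are reachable to acknowledge a write."""
    return sum(replicas_up) >= acks_required


if __name__ == "__main__":
    # Hypothetical round-trip times: two local replicas, two remote ones.
    latencies = [2, 3, 80, 150]
    for acks in range(1, len(latencies) + 1):
        print(acks, "acks ->", write_latency_ms(latencies, acks), "ms")

    # With both remote replicas unreachable, a write that needs three
    # acknowledgements cannot complete; one that needs a single ack can.
    up = [True, True, False, False]
    print(can_write(up, 3), can_write(up, 1))
```

Requiring fewer acknowledgements keeps writes local and fast and keeps the
service writable during a partition, at the cost of remote copies lagging
behind. That is the trade we are choosing to make initially.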

Run-down of options

Now, I have only a small amount of background with these specific items.
Unfortunately, my experience is with stuff that I believe some of you might
think to be poisonous, plus some stuff we just can’t use. Nonetheless, here
are what I believe to be the high-runner options:

MongoDB

This is very widely used, readily deployed to the cloud platform of your
choice, and it has a pretty good story when it comes to performance.

Mongo does have a geographic redundancy story. It’s not covered in glory, but
nor is it entirely embarrassing.

I’m less encouraged by the characteristics of the geographic redundancy
features; replica sets can only be statically configured to prefer local
communication, which means that a failure in the local cluster will result in
cross-geo operations for all requests that block on replication. I’m also a
little concerned about how placement is controlled within a cluster.

http://docs.mongodb.org/manual/core/replica-set-architectures/

Cassandra

Cassandra is also widely used, with similar characteristics to Mongo.

Cassandra has a larger number of options with respect to geographic
redundancy. The topology-aware replication mode offers some pretty good
opportunities, particularly when deployed with a “snitch” that is aware of the
deployment layout.

http://www.datastax.com/documentation/cassandra/1.2/cassandra/architecture/architectureDataDistributeReplication_c.html

Redis

Redis is probably the simplest option here. Its high availability option is
still unstable, so I’m not going to recommend it.

Dynamo

This would tie us to AWS, but it seems like a capable DB. The problem is that
in the short time I looked, I couldn’t uncover any details on their geographic
redundancy story. That is not encouraging.

Roll your own geo redundancy

This remains a viable option…if you really need the
performance/availability/other characteristic. Typically you take an existing
store (pick one, any one) and you add your own brand of geographic redundancy
to suit your needs. This has the advantage of being exactly what you need, but
the disadvantage of being a bunch of extra work. I’m not going to recommend
this either, but it’s an option that may become worth considering later, unless
our database friends really pick up their collective game.

Others

I could provide info on memcached(b), Azure, Riak, etc., all of which have
their merits, but none of which is really a strong contender for the crown.

Summary

I think that we could make either Mongo or Cassandra work for us. If we were
truly serious about storing hard state, then I think Cassandra offers more
control, but I think that it would be a harder challenge from an operational
perspective to use and deploy.

At this stage, I’m going to suggest that we pick Mongo, even though I think
Cassandra might be functionally superior.

As long as we create and maintain a good abstraction for our data store, I
think that we can switch if needs change. …always create and maintain a good
abstraction for stuff like this.
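
As a sketch of what such an abstraction might look like: the interface and
class names below (RegistrationStore, InMemoryRegistrationStore) and the choice
of a device ID to push URL mapping are assumptions for illustration, not a
proposed design.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class RegistrationStore(ABC):
    """Hypothetical storage interface; callers never see the backend."""

    @abstractmethod
    def put(self, device_id: str, push_url: str) -> None:
        """Create or overwrite the push URL registered for a device."""

    @abstractmethod
    def get(self, device_id: str) -> Optional[str]:
        """Return the registered push URL, or None if there is none."""

    @abstractmethod
    def delete(self, device_id: str) -> None:
        """Remove a registration; a missing entry is not an error."""


class InMemoryRegistrationStore(RegistrationStore):
    """Dictionary-backed implementation, useful for tests."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def put(self, device_id: str, push_url: str) -> None:
        self._data[device_id] = push_url

    def get(self, device_id: str) -> Optional[str]:
        return self._data.get(device_id)

    def delete(self, device_id: str) -> None:
        self._data.pop(device_id, None)


if __name__ == "__main__":
    store: RegistrationStore = InMemoryRegistrationStore()
    store.put("device-1", "https://push.example/abc")
    print(store.get("device-1"))
    store.delete("device-1")
    print(store.get("device-1"))
```

A Mongo-backed or Cassandra-backed class would implement the same three
methods, so switching later would mean swapping one class and leaving the
callers alone.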
_______________________________________________
dev-media mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-media
