Gary and I are attending a course on Cassandra, which several teams in Canonical are evaluating for use.
We have two more days during which we get to pick the teacher's brains - mdennis - on modelling, operational issues and so forth. The first day was the 0-60 course that Riptano offered; the next two will be more freeform.

Gary and I traded notes after the course and it seemed to me that many of you will have concerns or opinions that could be very relevant. So here's my (from memory - all errors are mine) summary of Cassandra and how I see it as being relevant to Launchpad today. I've skipped most of the ops-specific stuff because while it's interesting I think it's not hugely relevant from a development perspective.

Cassandra is essentially a highly parallel - at the unit of 'row' - database server. It doesn't model relationships in any declarative form - instead you get the basic primitives: a thing called a ColumnFamily (CF), which contains rows, and rows contain N columns with one value per column. Typically one would have a ColumnFamily per single sort of object in the system - User, EmailPart, Email, Bug, BugSubscription etc.

Cassandra itself has no single-point-of-failure components.

Each row can have up to billions of columns - and the number is *per-row*, so it's possible to have one row with (say) 2 columns and the next with 1000. A common pattern is to transform 1-to-many join tables into very long rows in a shorter CF.

The schema is dynamic: at the database level new columns can be defined without downtime. However, the database has no transactions: the strongest guarantee you can get is that a single write group to a single partition will all get put into a single write-ahead-log.

Cassandra is less efficient than a single-server database like PostgreSQL or MySQL, but you can get greater performance out of it due to near-linear scaling as machines are added: both reads and writes benefit. It's better at writes than reads (because it has an append-only store, which does automatic compaction - rather like bzr).
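
To make the data model above a little more concrete, here is a rough sketch in Python using plain dicts. It is only a toy in-memory picture, not the Cassandra client API, and the names (user, join_table, to_wide_rows, bug_subscribers) and the sample data are made up for illustration. It shows rows in one CF carrying different columns, and a 1-to-many join table folded into wide rows.

```python
# Toy in-memory picture of the data model; plain dicts, NOT the Cassandra API.
# All names and sample values here are hypothetical.
from collections import defaultdict

# A ColumnFamily maps row key -> {column name: value}.
# Columns are per-row, so two rows in the same CF need not share columns.
user = {
    "alice": {"displayname": "Alice", "email": "alice@example.com"},
    "bob": {
        "displayname": "Bob",
        "email": "bob@example.com",
        "irc": "bob_",
        "timezone": "UTC",
    },
}

# Relational form of a 1-to-many join table: one (bug, subscriber) tuple per link.
join_table = [(1, "alice"), (1, "bob"), (2, "alice")]


def to_wide_rows(pairs):
    """Fold (row_key, member) pairs into one row per key, one column per member."""
    cf = defaultdict(dict)
    for row_key, member in pairs:
        # The column name carries the data; the value is left empty.
        cf[row_key][member] = ""
    return dict(cf)


bug_subscribers = to_wide_rows(join_table)
```

Reading all subscribers of a bug is then a single row lookup (bug_subscribers[1]) rather than a join, which is the point of the long-row pattern.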

If we fit our system on a single DB server *and expect to do so indefinitely* then staying in a relational single-server model is ideal. (We've outgrown a single server for reads, but not for writes - and we have headroom there.)

Clients talk to any arbitrary Cassandra

Counting - assigning numbers based on data in the database - is tricky, and there are a few techniques to do it. Running a counting service - a single point of failure that manages a lock and can issue numbers - is something we'd probably need to do to allocate bugids, were we to migrate to Cassandra.

In Cassandra, most indexes are a CF whose row keys are either the key [or some named value] from another CF, and whose values are the key into another CF. E.g. BugSubscription might have a key of bugid, and in every row a column called 'emailaddress' with value being the email address subscribed to it. I chose this deliberately to emphasise how we might denormalise to make calculating notifications absolutely trivial. When someone changes their email address, we'd find their subscribed bugs (via a secondary index which would index the emailaddress column in BugSubscription) and update those subscriptions.

Costs of using Cassandra:
- more servers are needed vs the existing thing being replaced [because it's less efficient and needs parallelism]
- we'd need to write supporting ware of some sort to automate things that are simple SQL now, like creating indexes [change the schema, generate an automated script to populate the index, update our data definition to cause writes to the index]
- writes need to change from ACID - where we roll back in the event of error - to BASE - where everything we write is correct as far as it goes and things get made sensible eventually. (Eventually might be milliseconds, but it's not instant.)
- it's a pain to package, so we'd need to gain some Java glue in buildout.
- more operational complexity than we have today (JVM vs CPython)

Potential benefits of using Cassandra:
- highly available, scalable platform
- real twisted support, should we want that - native async library support
- parallelism within single queries
- online schema changes [no downtime!]

Places where Cassandra may make sense for us [short term]:
- librarian storage [nb most folk doing s3-like things use simple files on N disks for the backing store, metadata in Cassandra: in that model we'd just stay with pgsql]
- a backend for solr/lucene, the search engine at the top of my list for fixing our search story (LEP/Search)
- our OOPS system would be a decent place to experiment if we want to learn about bringing Cassandra higher up the foodchain.
- session storage would fit Cassandra very neatly: run with a low (e.g. 2) replication factor (the number of copies of each row) and insert with a TTL - it will auto cleanup after itself.
- librarian token storage would also fit very very nicely - set a 24 hour TTL, it supports very high volume writes, the tokens are naturally write-once
- could replace memcached, which would give us a higher hit rate (because we would be sharing one effective cache)

What about Cassandra for the main Launchpad system? Or even significant components? There are broadly speaking three challenges I see here: how can we model what we need in Cassandra? How would we migrate incrementally? How would we prevent a nightmare mix of SQL, domain and Cassandra logic in our classes?

In terms of modelling, please throw specific use cases (queries) at me and I'll discuss them here. mdennis is convinced that other users have not had trouble modelling stuff (once they wrapped their heads around the basics) - and I believe him. Still, skill needs to be acquired.
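
As a small aside on the session and librarian token items in the short-term list above, here is a rough Python sketch of what "insert with a TTL and let it clean up after itself" looks like from the application's point of view. It is a toy stand-in, not the Cassandra API: the TTLStore class, its method names and the sample token are all hypothetical, and in Cassandra the expiry is handled by the database rather than by client code like this.

```python
# Toy stand-in for a store with per-write TTLs; NOT the Cassandra API.
# TTLStore and everything in it is hypothetical, for illustration only.
import time


class TTLStore:
    """Keeps values until their time-to-live elapses, then drops them on access."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock  # injectable so expiry can be exercised without sleeping
        self._rows = {}      # key -> (value, expiry timestamp)

    def insert(self, key, value, ttl_seconds):
        self._rows[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._rows.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._rows[key]  # expired: clean up and report as missing
            return None
        return value


# Usage with a fake clock: a write-once token with a 24 hour TTL.
now = [0.0]
tokens = TTLStore(clock=lambda: now[0])
tokens.insert("token-abc", "file-123", ttl_seconds=24 * 60 * 60)
before_expiry = tokens.get("token-abc")
now[0] += 24 * 60 * 60
after_expiry = tokens.get("token-abc")
```

The attraction for us is that no cleanup job or DELETE pass is needed: writers set the lifetime once and readers simply stop seeing the row.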

For incremental migration there are three broad approaches:
- migrate isolated sets of tables at a time; join in appservers by querying multiple sources - have to have manageable temporary result sizes: can't depend on major filtering happening in tables that are on the sql|cassandra side. We have few such things, but - as a for instance - the live data about builds, or importd status - are good examples.
- double-write to both systems from appservers and once particular tables are completely in sync and not needed on the source system, start dropping them.
- use a write-through approach: write a pgsql->cassandra trigger-based system, or vice versa a cassandra->sql thing that watches for new data and inserts to pgsql.

The biggest concerns I have though are around keeping sane: our code base is very hectic at the moment, with query logic and processing logic all intertwined. Migrating anything in that environment is hard because there isn't a single inflection point where we can add any of these approaches in and be confident that things would be comprehensively covered.

At this point I think Cassandra would be beneficial in some or all of the short term items I listed earlier - and *if* at the end of the week Gary and I think Cassandra is worth exploring in significantly more depth for things currently in postgresql, then I'd really want us to clean up our persistence layer - that is, to have one - before we start working directly on Cassandra (outside of the short term items).

P.S. send me your use cases!

-Rob

_______________________________________________
Mailing list: https://launchpad.net/~launchpad-dev
Post to     : [email protected]
Unsubscribe : https://launchpad.net/~launchpad-dev
More help   : https://help.launchpad.net/ListHelp

