Someone asked on IRC if there is a roadmap for Cassandra. This is a good discussion to have. :)
Personally my priority list looks like this:

High priority:
1. range queries [which requires the partitioner changes we've been discussing]
2. make Cassandra not allow itself to run out of memory during sustained inserts
3. fix distributed remove issues
4. support unicode keys

Medium priority:
5. pre-emptive repair (what the Dynamo paper calls anti-entropy)
6. load balancing

(1) is substantially done but will probably need some tweaking during code review. And then the client api will probably need some fleshing out (right now you just get a list of keys back, so that's not very efficient if you want to get columns for each of those too.)

(2) has workarounds like binarymemtable but I'd really like to get the main insert path able to handle large insert volume without falling over. My co-worker is just starting to look into this. I'm hoping there will be some straightforward improvements to make here.

I outlined an approach to (3) that I think will work here:
http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200903.mbox/%3ce06563880903301519h922840ds72ef6f9a8d95e...@mail.gmail.com%3e
I'm waiting for Avinash's feedback but as outlined it is not much code.

(4) is a Thrift issue, not Cassandra per se (see https://issues.apache.org/jira/browse/THRIFT-395), but it is on my plate so I thought I'd throw that out there.

I have not started (5) or (6). There are some stubs for load balancing in the code, which is why I said in another thread that the Facebook developers have probably thought more about this.

I know Avinash is currently finishing up multiget support. Hopefully he will chime in about what his and Prashant's plans are next.

-Jonathan