Someone asked on IRC if there is a roadmap for Cassandra.  This is a
good discussion to have. :)

Personally my priority list looks like this:

High priority:
 1. range queries [which requires the partitioner changes we've been discussing]
 2. keep Cassandra from running out of memory during sustained inserts
 3. fix distributed remove issues
 4. support Unicode keys

Medium priority:
 5. pre-emptive repair (what the Dynamo paper calls anti-entropy)
 6. load balancing

(1) is substantially done but will probably need some tweaking during
code review.  After that, the client API will need some fleshing out:
right now you just get a list of keys back, which is not very
efficient if you also want the columns for each of those keys.
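
To illustrate the inefficiency, here is a rough sketch against a toy
in-memory store.  Everything in it (ToyStore, list_keys_in_range,
get_columns, range_with_columns) is a hypothetical name made up for
this example, not the actual Cassandra client API.  It only counts
simulated round trips: a keys-only range call costs one trip plus one
more per key, while a call that returns columns along with the keys
costs one.

```python
from bisect import bisect_left


class ToyStore:
    """Hypothetical in-memory stand-in for a cluster (not Cassandra's API)."""

    def __init__(self, rows):
        self.rows = dict(rows)              # key -> {column name: value}
        self.sorted_keys = sorted(self.rows)
        self.calls = 0                      # simulated client round trips

    def _slice(self, start, end):
        # Keys in the half-open range [start, end), in sorted order.
        lo = bisect_left(self.sorted_keys, start)
        hi = bisect_left(self.sorted_keys, end)
        return self.sorted_keys[lo:hi]

    def list_keys_in_range(self, start, end):
        """Keys-only range query: one round trip, no column data."""
        self.calls += 1
        return self._slice(start, end)

    def get_columns(self, key):
        """Fetch the columns for a single key: one round trip."""
        self.calls += 1
        return self.rows[key]

    def range_with_columns(self, start, end):
        """Assumed richer call: keys and columns in one round trip."""
        self.calls += 1
        return {k: self.rows[k] for k in self._slice(start, end)}


def fetch_keys_then_columns(store, start, end):
    """What a client has to do when the range query returns only keys."""
    keys = store.list_keys_in_range(start, end)
    return {k: store.get_columns(k) for k in keys}
```

With N keys in the range, fetch_keys_then_columns makes N + 1 calls
where range_with_columns makes one; both return the same data.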

(2) has workarounds like binarymemtable but I'd really like to get the
main insert path able to handle large insert volume without falling
over.  My co-worker is just starting to look into this.  I'm hoping
there will be some straightforward improvements to make here.

I think the approach to (3) that I outlined in this message will work:
http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200903.mbox/%3ce06563880903301519h922840ds72ef6f9a8d95e...@mail.gmail.com%3e

I'm waiting for Avinash's feedback, but as outlined it is not much code.

(4) is a Thrift issue, not a Cassandra one per se (see
https://issues.apache.org/jira/browse/THRIFT-395), but it is on my
plate so I thought I'd mention it.

I have not started (5) or (6).  There are some stubs for load
balancing in the code, which is why I said in another thread that the
Facebook developers have probably thought more about this than I have.

I know Avinash is currently finishing up multiget support.  Hopefully
he will chime in about what he and Prashant plan to work on next.

-Jonathan
