Re: Roadmap

2009-04-16 Thread Per Mellqvist
Great to see a target for a release!

Personally, I think the momentum of the project would benefit more from
having a release to refer to than from any (other) new feature or
improvement. I understand range queries are a priority for you,
Jonathan. I still wonder if it would not be better to limit 0.3 to
only bug fixes (priority major or above)?

// Per

On Thu, Apr 16, 2009 at 12:02 AM, Jonathan Ellis jbel...@gmail.com wrote:
 I went all Enterprise on our JIRA and assigned issues to version 0.3
 that I'd like to get done in the relatively near future for our first
 official release.

 The list of issues is here:
 https://issues.apache.org/jira/secure/IssueNavigator.jspa?reset=true&mode=hide&sorter/order=DESC&sorter/field=priority&resolution=-1&pid=12310865&fixfor=12313861

 Note that many issues are marked "Patch Available", which means we just
 need to complete the review process for those.

 If you want to grab one of the unassigned ones that would be awesome.
 If you want to grab one of the ones I assigned to myself, that's
 awesome too, but give me a heads up first so I don't duplicate your
 effort. :)

 Also, if there are other issues that you think should be on the 0.3 list,
 feel free to add them.  (Correctness issues especially.)  But IMO we
 should not let scope creep too much for our first Apache release.

 -Jonathan

 On Thu, Apr 2, 2009 at 12:51 PM, Jonathan Ellis jbel...@gmail.com wrote:
 Someone asked on IRC if there is a roadmap for Cassandra.  This is a
 good discussion to have. :)

 Personally my priority list looks like this:

 High priority:
  1. range queries [which requires the partitioner changes we've been
     discussing]
  2. make Cassandra not allow itself to run out of memory during
     sustained inserts
  3. fix distributed remove issues
  4. support Unicode keys

 Medium priority:
  5. pre-emptive repair (what the Dynamo paper calls anti-entropy)
  6. load balancing

 (1) is substantially done but will probably need some tweaking during
 code review.  And then the client API will probably need some fleshing
 out (right now you just get a list of keys back, which is not very
 efficient if you also want to get columns for each of those).

 (2) has workarounds like binarymemtable but I'd really like to get the
 main insert path able to handle large insert volume without falling
 over.  My co-worker is just starting to look into this.  I'm hoping
 there will be some straightforward improvements to make here.

 I outlined an approach to (3) that I think will work here:
 http://mail-archives.apache.org/mod_mbox/incubator-cassandra-dev/200903.mbox/%3ce06563880903301519h922840ds72ef6f9a8d95e...@mail.gmail.com%3e

 I'm waiting for Avinash's feedback, but as outlined it is not much code.

 (4) is a Thrift issue, not Cassandra per se (see
 https://issues.apache.org/jira/browse/THRIFT-395), but it is on my
 plate so I thought I'd throw that out there.

 I have not started (5) or (6).  There are some stubs for load
 balancing in the code, which is why I said in another thread that the
 Facebook developers have probably thought more about this.

 I know Avinash is currently finishing up multiget support.  Hopefully
 he will chime in about what his and Prashant's plans are next.

 -Jonathan




Re: Roadmap

2009-04-16 Thread Jonathan Ellis
Range queries aren't going to block us.  (The code is already written;
I just need to rebase it, and I'm waiting on #65 for that.)


But in principle I agree.

-Jonathan

On Apr 16, 2009, at 1:42 AM, Per Mellqvist p...@mellqvist.name wrote:


Great to see a target for a release!

Personally, I think the momentum of the project would benefit more from
having a release to refer to than from any (other) new feature or
improvement. I understand range queries are a priority for you,
Jonathan. I still wonder if it would not be better to limit 0.3 to
only bug fixes (priority major or above)?

// Per
