On Tue, May 4, 2010 at 4:55 PM, David Rosenstrauch <dar...@darose.net> wrote:
> I've had some neat ideas that I'd like to tinker with for a distributed DB
> that implements a very different data model than Cassandra. However, I
> obviously don't want to reinvent the wheel - particularly because in the
> case of distributed systems, the wheel is quite complicated and hard to get
> right.
>
> What I'm thinking would make more sense then is to build on top of the
> Cassandra core (since it's obviously been implemented well and has been
> proven to scale quite nicely) and then implement my own middle/top layer(s).
>
> So I'm wondering:
>
> * Anyone know if such a thing has been attempted before? (And, if so, links
> to any stories about success / failure / tips.)
I believe Jun Rao and Sandeep Tata built a kind of chain replication
starting from Cassandra 0.4-ish. I don't think the code is available.

> * Would there happen to be any docs/blogs/emails providing useful tech info
> for such an effort?

I don't know of any, short of the articles about Cassandra's code
itself. Ran Tavory wrote an excellent survey piece:
http://prettyprint.me/2010/05/02/understanding-cassandra-code-base/

> * What I should include/exclude from the Cassandra source code to start
> building on? Or, in other words, which package(s) from the source would be
> considered to constitute the core layer?

I don't see any shortcuts here. You need to understand the code
enough to answer that question yourself.

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com