On Tue, Jun 1, 2010 at 07:43, Rob Stewart <[email protected]> wrote:
> Hi, CouchDB devs.

Hi, Rob. Welcome!

>
> So, my first, naive, questions might be something like:
> 1. Is the proposal mentioned in the CouchDB wiki page still a valid problem
> (database partitioning).

Yes. That was an easy answer! :)

> 2. If so, are the plans underway to solve this problem? I notice that there
> was a proposal for the Google Summer of Code in 2009 to provide a solution:
> http://socghop.appspot.com/document/show/user/rleeds/couchdb_cluster .

That's me! Since the time I wrote that proposal I've gone to work for Meebo as one of the developers of the Lounge project, the canonical source of which is my repository on github[1]. The Lounge deviates from both my GSoC proposal and the one outlined on the wiki, though. As in the wiki proposal, the Lounge uses a tree-like structure of CouchDB databases created through a proxy layer that handles the hashing and distribution of keys. However, unlike both proposals, the proxies work on the HTTP layer and do not communicate via Erlang message passing. This solution incurs the cost of extra JSON overhead in exchange for keeping the software relatively simple and completely separate from CouchDB itself.

In addition to the Lounge, Cloudant[2] is offering clustered CouchDB hosting using in-house modifications to the CouchDB code. I cannot speak authoritatively on their work, so I won't try to compare it to the Lounge other than to say that I believe it is written in Erlang. For this reason it's possible that pieces of their system could wind up in CouchDB some day if they decide to license it for inclusion.

I've had a few discussions with Benoît Chesneau about implementing an Erlang solution, but as I recall they mostly revolved around what architectural changes we'd want to see to the internal APIs to make the addition of partitioning as clean as possible.
Little to no code has been produced to this end on our part, though Paul Davis has done a little bit of hacking[3] toward separating the HTTP layer more cleanly while replacing MochiWeb with Basho's webmachine[4].

Finally, I've toyed around with the idea of re-implementing the Lounge using Node.js[5], and Robert Newson has recently started to hack on it as well. There is some (mostly useless so far) code on github[6].

> 3. If this is still an open problem for the CouchDB dev team, how would one
> get involved in the design of a partitioning architecture for CouchDB ?

Since there has been no consensus on the best way to go forward, there is clearly room for different approaches and for several projects to fulfill different requirements. For my part, I help maintain the Lounge for the day-to-day operations at Meebo. However, I would like to see a project that tackles CouchDB clustering with a peer-to-peer structure instead of a fixed tree, eliminating the operational headache of manually distributing a fixed number of shards and taking some lessons from Dynamo, Cassandra and Riak. My work on Lode[6] is mostly stalled while I hack on a structured overlay project for Node.js, though I haven't released any source.

To get involved, keep the conversation going here or come to #couchdb on freenode. Everyone I mentioned tends to frequent that channel. My nick is the same as my github account: tilgovi.

I think that's a good overview of the state of CouchDB partitioning solutions. Bring on the questions and discussion!

Kind regards,
Randall

[1] http://github.com/tilgovi/couchdb-lounge
[2] https://cloudant.com/
[3] http://github.com/davisp/couchdb/tree/webmachine
[4] http://webmachine.basho.com/
[5] http://nodejs.org
[6] http://github.com/tilgovi/lode
