Re: Couch clustering/partitioning Re: CouchSpray - Thoughts?

Jan Lehnardt Fri, 20 Feb 2009 02:18:54 -0800


On 20 Feb 2009, at 02:34, Shaun Lindsay wrote:

Hi all,
So, a couple months ago we implemented almost exactly the couch
clustering/partitioning solution described below.


Shaun, this sounds fantastic! :) I hope you can release the code for
this.

Cheers
Jan
--

The couch cluster (which
we called 'The Lounge') sits behind nginx running a custom module that farms
out the GETs and PUTs to the appropriate node/shard and the views to a
Python proxy daemon which handles reducing the view results from the
individual shards and returning the full view. We have replication working
between the cluster nodes so the shards exist in multiple places and, in the
case of one of the nodes going down, the various proxies fail over to the
backup shards.
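
To make the routing idea concrete, here is a minimal Python sketch of mapping a document id to a shard and falling back to a backup node. It is not The Lounge's code: the node URLs, the CRC32 hash, the `SHARDS` layout, and the `is_up` health check are all assumptions made for illustration.

```python
import zlib

# Hypothetical layout: each shard lists its primary node first, then its backup.
SHARDS = [
    ["http://node1:5984", "http://node2:5984"],
    ["http://node2:5984", "http://node3:5984"],
    ["http://node3:5984", "http://node1:5984"],
]


def shard_for(doc_id, num_shards=len(SHARDS)):
    """Map a document id to a shard index using a stable hash (CRC32)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_shards


def pick_node(doc_id, is_up, shards=SHARDS):
    """Return the first live node holding the document's shard, or None.

    `is_up` is a caller-supplied health check (assumed, not a real API).
    """
    for node in shards[shard_for(doc_id, len(shards))]:
        if is_up(node):
            return node
    return None
```

A proxy built this way would send a GET or PUT for `doc_id` to `pick_node(doc_id, is_up)`, so a request only reaches a backup when the primary for that shard is reported down.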
This clustering setup has been running in full production for several months
now with minimal problems.
We're looking to release all the code back to the community, but we need to
clear it with our legal team first to make sure we're not compromising any
of our more business-specific, proprietary code.

In total, we have:
an nginx module specifically set up for sharding databases
a 'smartproxy', written in Python/Twisted, for sharding views
and a few other ancillary pieces (replication notification, view updating,
etc)
Mainly, I just wanted to keep people from duplicating the work we've done --
hopefully we can release something back to the community in the next several
weeks.
We're having a meeting tomorrow morning to figure out what we can release
right now (probably the nginx module, at the least). I'll let everyone know
what our timeline looks like.

--Shaun Lindsay
Meebo.com
On Thu, Feb 19, 2009 at 4:48 PM, Chris Anderson <[email protected]> wrote:
On Thu, Feb 19, 2009 at 4:35 PM, Ben Browning <[email protected]> wrote:
So, I started thinking about partitioning with CouchDB and realized
that since views are just map/reduce, we can do some magic that's
harder if not impossible with other database systems. The idea in a
nutshell is to create a proxy that sits in front of multiple servers
and "sprays" the view queries to all servers, merging the results -
hence CouchSpray. This would give us storage and processing
scalability and could, with some extra logic, provide data redundancy
and failover.
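
A rough Python sketch of the merge step such a proxy would need, under stated assumptions: each shard returns its view rows already sorted by key, the keys compare correctly with Python's ordering (CouchDB's own collation rules are more involved), and the reduce function is a plain sum. The function names are hypothetical.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_map_rows(shard_rows):
    """Merge per-shard row lists, each already sorted by key, into one sorted list."""
    return list(heapq.merge(*shard_rows, key=itemgetter("key")))


def rereduce_sum(shard_rows):
    """Combine per-shard reduce rows by summing the values that share a key."""
    merged = heapq.merge(*shard_rows, key=itemgetter("key"))
    return [
        {"key": key, "value": sum(row["value"] for row in group)}
        for key, group in groupby(merged, key=itemgetter("key"))
    ]
```

Because every shard's rows arrive sorted, the merge can stream rows instead of collecting and re-sorting them. A reduce other than a sum would need its own combining step in place of `sum`.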
There are plans in CouchDB's future to take care of data partitioning,
as well as querying views from a cluster. Theoretically, it should be
pretty simple. There are a few small projects that have started down
the road of writing code in this area.

https://code.launchpad.net/~dreid/sectional/trunk

Sectional is an Erlang http proxy that implements consistent hashing
for docs. I'm not sure how it handles view queries.
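
For readers new to the term, here is a small Python sketch of consistent hashing for document ids. It is only an illustration of the general technique, not Sectional's implementation: the `HashRing` class, the SHA-256 hash, and the count of 64 virtual points per node are all assumptions.

```python
import bisect
import hashlib


class HashRing:
    """Consistent-hash ring that places each node at several virtual points."""

    def __init__(self, nodes, replicas=64):
        # One (point, node) pair per virtual point, sorted by position on the ring.
        self._ring = sorted(
            (self._hash(f"{node}#{i}"), node)
            for node in nodes
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)

    def node_for(self, doc_id):
        """Return the node owning the first ring point after the id's hash."""
        if not self._ring:
            raise ValueError("ring has no nodes")
        idx = bisect.bisect_right(self._points, self._hash(doc_id)) % len(self._ring)
        return self._ring[idx][1]
```

The point of the ring over a plain `hash % N` scheme is that adding a node only moves the documents that now fall to the new node; every other document stays where it was.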

There's also a project to provide partitioning around the basic
key/value PUT and GET store using Nginx:

http://github.com/dysinger/nginx/tree/nginx_upstream_hash

If you're interested in digging into this stuff, please join d...@. We
plan to include clustering in CouchDB, so if you're interested in
implementing it, we could use your help.

Chris

--
Chris Anderson
http://jchris.mfdz.com
