On Thu, Jan 19, 2012 at 19:41, kzhang <[email protected]> wrote: > Hi, > > If I have three nodes to set up a cluster, what difference does it make if > 1) I set up three shards, with each of the three nodes being the primary of > one shard, with replication running among them, so if one node goes down, > the other two can fully take over; v.s. 2) if still the same three nodes, I > set up six shards, still running replication among them. > > I know it sounds more overhead in 2). But is there any performance gain/loss > associated with either approach?
The overhead should be minimal, especially with BigCouch, since it's limited to the connection and request overhead. With BIgCouch, the requests between servers are not HTTP, so there's no HTTP overhead and I would bet the connections are persistent, so there isn't any added overhead there. The extra overhead is additional resources for the additional connections between servers, but that's negligible, and making the individual shards' responses smaller might actually mean memory usage is more stable since the JSON is split into more manageable chunks for the serializer. In general, over-sharding is a great tool and will make your life a lot easier when you decide you need more than 3 nodes. Someone from Cloudant can chime in if I'm wrong, but I wouldn't hesitate to set up 10x or 20x that many shards, depending on how quickly you expect to scale out. As an added bonus it will increase parallelism in your view index generation. -Randall > > I guess I am not sure what is the standard/ideal config for No. of shards/ > No. of Nodes. > > -- > View this message in context: > http://couchdb-development.1959287.n2.nabble.com/bigcouch-sharding-uestion-tp7206277p7206277.html > Sent from the CouchDB Development mailing list archive at Nabble.com.
