On Feb 27, 2011, at 8:56 PM, Isaac Force wrote:

> On Sun, Feb 27, 2011 at 4:13 PM, niall el-assaad <[email protected]> wrote:
>> I'm looking at developing an application that will have a couple of central
>> nodes (data centre) and around 2,000 remote nodes (in branch offices).
>>
>> I'd like to know if couchdb can scale to having this many nodes working in a
>> cluster.
>
> For the list to write a more meaningful answer to address your needs,
> it would be helpful if you gave more detail about the nature of the
> problem you want to solve.
>
> * Replication topology: is the plan to have replication from the branch office
> nodes to your centralized data center? (n:1)
> * Replication type: continuous or triggered manually/programmatically?
> * Scope of data set: I would be more concerned with writes than reads.
> You'll need to have an idea of what your current aggregate average and
> peak writes per second are, how much data is written for a given
> period of time, and how far you think you will need this rate to scale in
> the future.
> * Why Couch: is CouchDB going to be addressing a brand new need, or
> is it going to replace existing systems for known reasons? If it's the
> latter, what is it about your current systems that isn't meeting your
> demands, and what do you hope Couch will provide that will fill the gap?
> (Specifically looking for performance data that you might have already
> collected, and if Couch is going to be living on your existing hardware
> or new hardware.)
>
> I haven't dealt with large distributed Couch systems, but my instinct
> would be that Couch wouldn't have any problem with a 2000:1 replicated
> system. (See Ubuntu One as an example of a large CouchDB system with
> many external replicators.) The ability to handle it would come down
> to how well the aggregate data set matches the size of hardware and
> replication layout in your data center, and of course available
> ingress bandwidth.
>
> -Isaac

Well said, Isaac.
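
To make the n:1 topology and the continuous-versus-triggered question concrete, here is a rough sketch in Python of the JSON bodies one might POST to CouchDB's /_replicate endpoint, one per branch node. It only builds the request bodies and sends nothing. The host names (branch-0001.example.com, dc.example.com), the port, and the database name "orders" are made up for illustration, and the helper names are hypothetical, not part of any CouchDB client library.

```python
import json


def replication_body(source_url, target_url, continuous=False):
    """Build a JSON body for a POST to CouchDB's /_replicate endpoint."""
    body = {"source": source_url, "target": target_url}
    if continuous:
        # Omitting this key gives a one-off (triggered) replication.
        body["continuous"] = True
    return body


def branch_to_central_jobs(branch_hosts, central_url, db, continuous=False):
    """One replication per branch node, all targeting the central database (n:1)."""
    return [
        replication_body(f"http://{host}:5984/{db}", f"{central_url}/{db}", continuous)
        for host in branch_hosts
    ]


# Hypothetical hosts and database name, for illustration only.
branches = [f"branch-{i:04d}.example.com" for i in range(1, 2001)]
jobs = branch_to_central_jobs(
    branches, "http://dc.example.com:5984", "orders", continuous=True
)

print(len(jobs))
print(json.dumps(jobs[0], indent=2))
```

Because both source and target are full URLs, each body could be posted either to the branch node (push) or to the central node (pull); which side drives replication is one of the layout decisions Isaac is asking about.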
