Sounds like fun. Let me know if you guys need help. I might be able to contribute a few nodes for the test.
D :)

On Sun, Feb 27, 2011 at 6:20 PM, Adam Kocoloski <[email protected]> wrote:
> On Feb 27, 2011, at 8:56 PM, Isaac Force wrote:
>
>> On Sun, Feb 27, 2011 at 4:13 PM, niall el-assaad <[email protected]> wrote:
>>> I'm looking at developing an application that will have a couple of central
>>> nodes (data centre) and around 2,000 remote nodes (in branch offices).
>>>
>>> I'd like to know if CouchDB can scale to having this many nodes working in a
>>> cluster.
>>
>> For the list to write a more meaningful answer that addresses your needs,
>> it would help if you gave more detail about the nature of the problem you
>> want to solve.
>>
>> * Replication topology: is the plan to replicate from the branch office
>>   nodes to your centralized data center? (n:1)
>> * Replication type: continuous, or triggered manually/programmatically?
>> * Scope of data set: I would be more concerned with writes than reads.
>>   You'll need an idea of your current aggregate average and peak writes
>>   per second, how much data is written in a given period of time, and how
>>   far you think you will need that rate to scale in the future.
>> * Why Couch: is CouchDB going to address a brand-new need, or is it going
>>   to replace existing systems for known reasons? If it's the latter, what
>>   is it about your current systems that isn't meeting your demands, and
>>   what do you hope Couch will provide to fill the gap? (I'm specifically
>>   looking for any performance data you might have already collected, and
>>   whether Couch will be living on your existing hardware or on new hardware.)
>>
>> I haven't dealt with large distributed Couch systems, but my instinct is
>> that Couch wouldn't have any problem with a 2000:1 replicated system.
>> (See Ubuntu One as an example of a large CouchDB system with many external
>> replicators.) The ability to handle it comes down to how well the aggregate
>> data set matches the size of the hardware and the replication layout in
>> your data center, and of course the available ingress bandwidth.
>>
>> -Isaac
>
> Well said, Isaac.
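
P.S. On Isaac's continuous-vs-triggered question: in case it helps with planning
the test, here's a rough sketch of what the n:1 fan-in looks like from the central
side. It just posts to CouchDB's _replicate endpoint; the host names and database
name are made up, so treat it as an illustration rather than a tested config.

    # Sketch: ask the central node to continuously pull changes from one branch
    # database via CouchDB's _replicate endpoint. Hosts/db name are placeholders.
    import json
    import requests

    def start_continuous_replication(central_url, branch_url, db_name):
        """Start a continuous pull replication from a branch into the central node."""
        body = {
            "source": "%s/%s" % (branch_url, db_name),  # remote branch database
            "target": db_name,                          # database local to the central node
            "continuous": True,                         # keep pulling as changes arrive
        }
        resp = requests.post("%s/_replicate" % central_url,
                             data=json.dumps(body),
                             headers={"Content-Type": "application/json"})
        resp.raise_for_status()
        return resp.json()

    # One call per branch; 2,000 branches means 2,000 replication tasks on the
    # central cluster, which is exactly the fan-in the test would be exercising.
    print(start_continuous_replication("http://central.example.com:5984",
                                       "http://branch-001.example.com:5984",
                                       "branch_data"))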
