The 'q' value is the number of unique shards that comprise the single logical database. BigCouch will keep 'n' replicas of each shard, so in total a database will have q*n .couch files associated with it.
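To make the arithmetic concrete, here is a minimal sketch that enumerates one entry per shard copy. The function name `shard_files` and the values q=8, n=3 are made up for illustration; they are not BigCouch defaults or part of its API.

```python
def shard_files(q: int, n: int) -> list[tuple[int, int]]:
    """Enumerate (shard, replica) pairs; each pair stands for one .couch file.

    Hypothetical helper for illustration only, not a BigCouch function.
    """
    if q < 1 or n < 1:
        raise ValueError("q and n must be positive integers")
    return [(shard, replica) for shard in range(q) for replica in range(n)]


# Illustrative values: 8 unique shards, 3 copies of each.
files = shard_files(q=8, n=3)
print(len(files))  # q * n = 24
```

So a database created with q=8 and n=3 would have 24 .couch files spread across the cluster.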

The number of nodes in the cluster is an independent concern; BigCouch will happily store multiple shards on a single machine or leave some machines in the cluster without any data from a particular clustered database. The only requirement is that the number of nodes be greater than or equal to the largest 'n' value of any database, because BigCouch will not store multiple copies of a shard on the same node in the cluster.

Adam

On Aug 4, 2013, at 4:32 PM, Stanley Iriele <[email protected]> wrote:

> Thanks joan,
>
> I worded part of that poorly...When I said hard drives I really meant
> physical machines.. But the bottleneck was disk space; so I said hard
> disks... But That completely answered my question actually.... I appreciate
> that...what is this 'q' value?..is that the number of nodes?.. Or is that
> the r + w > n thing... Either way that answers questions thank you!
> On Aug 4, 2013 7:18 AM, "Joan Touzet" <[email protected]> wrote:
>
>> Hi Stanley,
>>
>> Let me provide a simplistic explanation, and others can help refine it
>> as necessary.
>>
>> On Sat, Aug 03, 2013 at 09:34:48AM -0700, Stanley Iriele wrote:
>>> How then does distributed map/reduce work?
>>
>> Each BigCouch node with a shard of the database also keeps that shard of
>> the view. When a request is made for a view, sufficient nodes are
>> queried to retrieve the view result, with the reduce operation occurring
>> as part of the return.
>>
>>> if not all nodes have replications of all things how does that coordination
>>> happen during view building?
>>
>> This is not true, all nodes do not have replications of all things. If
>> you ask a node for a view on a database it does not have at all, it will
>> use information in the partition map to ask that question of a node that
>> has at least one shard of the database in question, which will in turn
>> complete the scatter/gather request to other nodes participating in that
>> database.
>>
>>> also its sharded right so certain nodes have a certain range of keys.
>>
>> Right, see above.
>>
>>> My problem is this. I need a solution that can incrementally scale across
>>> many hard disks...does big couch do this? with views and such?..if
>>> so..how?...and what are the drawbacks?
>>
>> I wouldn't necessarily recommend running 1 BigCouch process per HD you
>> have on a single machine, but there's no reason that it wouldn't work.
>>
>> The real challenge is that a database's partition map is determined
>> statically at the time of database creation. If you choose to add more HDs
>> after this creation time, you will have to create a new database with
>> more shards, then replicate data to this new database to use the new
>> disks. The other option would be to use a very high number for 'q', then
>> rebalance the shard map onto the new disks and BigCouch processes. There
>> is a StackOverflow answer from Robert Newson that describes the process
>> for performing this operation.
>>
>> In short, neither is trivial nor automated. For a single-machine system,
>> you'd do far better to use some sort of Logical Volume Manager to deal
>> with expanding storage over time, such as Linux's lvm, some HW raid
>> cards, ZFS or similar features in OSX and Windows.
>>
>>> Thanks for any kind of response.
>>>
>>> Regards,
>>>
>>> Stanley
>>
>> --
>> Joan Touzet | [email protected] | wohali everywhere else
>>
