OK, I have 3 nodes, all load balanced with HAProxy: CentOS 5.8 (virtualised), 2 cores, 2GB RAM each.

I'm trying to replicate about 75K documents, which total 6GB when compacted (on CouchDB 1.2, which has compression turned on). I'm told they are fairly large documents. When it goes pear-shaped, vmstat shows the failing node eating a lot of memory:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free  buff  cache    si    so    bi    bo   in   cs us sy id wa st
 1  2 570576   8808   140   7208  2998  2249  3154  2249 1234  569  1  6  2 91  0
 0  2 569656   9156   156   7504  2330  1899  2405  1904 1246  595  1  5  9 85  0
 1  1 575412   9516   236  14928  1549  2261  3242  2261 1237  593  1  7  1 91  0
 0  2 607092  13220   168   8156  3772  9012  3871  9017 1284  714  1 10  4 85  0
 1  0 444336 857004   220  10212  5781     0  6202     0 1574 1010 13  7 33 47  0
 1  0 442176 870684   428  11052  2049     0  2208   140 2561 1541 17  8 49 26  0
 0  0 442176 813140   460  11968   170     0   348     0 2672 1565 25  9 61  4  0
 0  1 442176 744972   484  12224  5440     0  5493     7 2432  900  8  4 49 40  0
 0  1 442176 714048   484  12296  4547     0  4547     0 1799  827  4  2 50 44  0
 0  1 442176 686304   496  12688  5128     0  5222     0 1696  999  9  2 50 40  0
 0  3 444000   8712   444  12876   299   368   331   380 1294  188 22 20 36 23  0
 0  3 469340  10040   116   7336    29  5087    74  5090 1232  268  3 22  0 75  0
 1  2 584356  10220   124   6744 11367 28722 11370 28722 1643 1300  5 19 17 59  0
 0  1 624908  10640   132   7036  6518 12879  6590 12884 1296  717  3 10 29 58  0
 0  2 652556  10948   252  14776  3799  9494  5459  9494 1294  646  2  9 32 57  0
 0  2 677784  10648   244  14528  3819  8196  3819  8201 1274  588  2  7 30 61  0
 0  2 688460   9512   212   8224  3013  4522  3125  4522 1379  519  2  7  6 84  0
 0  3 699164   9888   208   8468  2192  4014  2228  4014 1302  495  1  6 11 83  0
 2  0 713104   9004   144   9192  2606  4490  2848  4490 1350  487  1  8 16 75  0

It only ever takes out one node at a time, and the other nodes seem to be doing very little while that one runs out of memory. If I kick the replication off again, it processes some more documents and then spikes the memory and fails again.
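In case it's relevant, I'm kicking the replication off with a plain _replicate POST against one of the cluster nodes, something like the following (hostnames and database name here are placeholders, not our real ones):

  curl -X POST http://bigcouch-node1:5984/_replicate \
       -H 'Content-Type: application/json' \
       -d '{"source":"http://standalone-couch:5984/docs","target":"docs","create_target":true}'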
Thanks

Mike

PS: hope you enjoyed your CouchDB get-together!

-----Original Message-----
From: Robert Newson [mailto:[email protected]]
Sent: 12 April 2012 17:28
To: [email protected]
Subject: Re: BigCouch - Replication failing with Cannot Allocate memory

What kind of load were you putting on the machine?

On 12 April 2012 17:24, Robert Newson <[email protected]> wrote:
> Could you show your vm.args file?
>
> On 12 April 2012 17:23, Robert Newson <[email protected]> wrote:
>> Unfortunately your request for help coincided with the two-day CouchDB
>> Summit. #cloudant and the Issues tab on cloudant/bigcouch are other
>> ways to get BigCouch support, but we happily answer queries here too,
>> when not at the Model UN of CouchDB. :D
>>
>> B.
>>
>> On 12 April 2012 17:10, Mike Kimber <[email protected]> wrote:
>>> Looks like this isn't the right place based on the responses so far.
>>> Shame; I'd hoped this was going to help solve our index/view rebuild
>>> times etc.
>>>
>>> Mike
>>>
>>> -----Original Message-----
>>> From: Mike Kimber [mailto:[email protected]]
>>> Sent: 10 April 2012 09:20
>>> To: [email protected]
>>> Subject: BigCouch - Replication failing with Cannot Allocate memory
>>>
>>> I'm not sure if this is the correct place to raise an issue I'm having
>>> replicating a standalone CouchDB 1.1.1 to a 3-node BigCouch cluster.
>>> If it isn't, please point me in the right direction; if it is, does
>>> anyone have any ideas why I keep getting the following error message
>>> when I kick off a replication:
>>>
>>> eheap_alloc: Cannot allocate 1459620480 bytes of memory (of type "heap").
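>>>
>>> (For scale, that single failed heap allocation is most of the physical
>>> memory on the node:
>>>
>>>   echo $((1459620480 / 1024 / 1024))   # prints 1392, i.e. ~1.4GB
>>>
>>> which is roughly 70% of the 2GB these machines have.)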
>>>
>>> My set-up is:
>>>
>>> Standalone CouchDB 1.1.1 running on CentOS 5.7.
>>>
>>> 3-node BigCouch cluster running on CentOS 5.8 with the following
>>> local.ini overrides, pulling from the standalone CouchDB (78K
>>> documents):
>>>
>>> [httpd]
>>> bind_address = XXX.XX.X.XX
>>>
>>> [cluster]
>>> ; number of shards for a new database
>>> q = 9
>>> ; number of copies of each shard
>>> n = 1
>>>
>>> [couchdb]
>>> database_dir = /other/bigcouch/database
>>> view_index_dir = /other/bigcouch/view
>>>
>>> The error is always generated on the third node in the cluster, and
>>> that server basically maxes out on memory beforehand. The other nodes
>>> seem to be doing very little, but they are getting data, i.e. their
>>> shard sizes are growing. I've put the copies per shard (n) down to 1,
>>> as I'm currently not interested in resilience (see the PS below about
>>> setting q and n per database).
>>>
>>> Any help would be greatly appreciated.
>>>
>>> Mike
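>>>
>>> PS: if I understand the sharding options correctly, q and n can also
>>> be set per database at creation time, rather than cluster-wide in
>>> local.ini, along the lines of (database name is a placeholder):
>>>
>>>   curl -X PUT 'http://XXX.XX.X.XX:5984/docs?q=9&n=1'
>>>
>>> For now I've set them in local.ini as above.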
