RE: Unbalanced ring with C* 2.0.3 and vnodes after adding additional nodes
Hi Aaron,

by seed list I assume you mean the seed_provider setting in cassandra.yaml. The current setting for vm1-vm6 is:

    seed_provider = vm1,vm2,vm3,vm4

This setting was also in place when vm5 and vm6 were added. I checked the read repair metrics; the mean is about 20/s on vm5 and vm6.

I tried to investigate the real distribution of tokens again and did, on vm1:

1. nodetool describering marketdata > /tmp/ring.txt
2. for node in vm1 vm2 vm3 vm4 vm5 vm6; do
     echo -n "$node: "
     grep -c "$(getent hosts "$node" | awk '{print $1}')" /tmp/ring.txt
   done

This prints the number of times each node was listed as an endpoint:

vm1: 303
vm2: 312
vm3: 332
vm4: 311
vm5: 901
vm6: 913

So this shows that we really are unbalanced.

1. Is there any way we can fix that on a running production cluster?
2. Our backup plan is to snapshot all data, bring up a completely fresh 6-node cluster, and stream the data in using sstableloader. Are there any objections to that plan from your point of view?

Thanks in advance!
Andi

From: Aaron Morton [aa...@thelastpickle.com]
Sent: Wednesday, December 18, 2013 3:14 AM
To: Cassandra User
Subject: Re: Unbalanced ring with C* 2.0.3 and vnodes after adding additional nodes

> Node: 4 CPU, 6 GB RAM, virtual appliance
> Cassandra: 3 GB heap, 256 vnodes

FWIW that's a very low powered node.

> Maybe we forgot necessary actions during or after the cluster expansion process. We are open to every idea.

Were the nodes in the seed list when they joined the cluster? If so, they do not bootstrap. The extra writes on nodes 5 and 6 could be from read repair writing to them.

Cheers
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting
http://www.thelastpickle.com

On 12/12/2013, at 11:49 pm, Andreas Finke <andreas.fi...@solvians.com> wrote:

Hi,

after adding 2 more nodes to a previously 4-node cluster, we are experiencing high load on both new nodes. After some investigation we found out the following:

- High CPU load on vm5+6
- Higher data load on vm5+6
- Write requests are evenly distributed to all 6 nodes by our client application (OpsCenter - metrics - WriteRequests)
- Local writes are about twice as high on vm5+6 (vm1-4: ~2800/s, vm5-6: ~6800/s)
- nodetool output:

UN  vm1  9.51 GB   256  20,7%  13fa7bb7-19cb-44f5-af83-71a72e04993a  X1
UN  vm2  9.41 GB   256  20,0%  b71c2d3d-4721-4dde-a418-802f1af4b7a1  D1
UN  vm3  9.37 GB   256  18,9%  8ce4c419-d79c-4ef1-b3fd-8936bff3e44f  X1
UN  vm4  9.23 GB   256  19,5%  17974f20-5756-4eba-a377-52feed3a1b10  D1
UN  vm5  15.95 GB  256  10,7%  0c6db9ea-4c60-43f6-a12e-51a7d76f8e80  X1
UN  vm6  14.86 GB  256  10,2%  f64d1909-dd96-442b-b602-efee29eee0a0  D1

Although the ownership is lower on vm5-6 (which is already not right), the data load is way higher.

Some cluster facts:

Node: 4 CPU, 6 GB RAM, virtual appliance
Cassandra: 3 GB heap, 256 vnodes
Schema: NetworkTopologyStrategy, RF: 2

Has anyone an idea what could be the cause of the unbalancing? Maybe we forgot necessary actions during or after the cluster expansion process. We are open to every idea.

Regards
Andi
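For the record, the backup plan in (2) above would look roughly like this, assuming the packaged data directory layout and a column family named "quotes" (the CF name is only a placeholder):

    # on each old node: flush memtables and take a named snapshot
    nodetool flush marketdata
    nodetool snapshot -t migration marketdata

    # copy each snapshot directory to a staging box, laid out as
    # <staging>/marketdata/quotes/, then stream it into the new ring;
    # sstableloader discovers the remaining nodes from the one given
    sstableloader -d new-vm1 /staging/marketdata/quotes/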
Re: Unbalanced ring mystery multi-DC issue with 1.1.11
Check the logs for messages about nodes going up and down, and also look at the MessagingService MBean for timeouts. If the node in DC2 times out replying to DC1, the DC1 node will store a hint.

Also, when hints are stored they are TTL'd to the gc_grace_seconds for the CF (IIRC). If that's low, the hints may not have been delivered. I am not aware of any specific tracking for failed hints other than log messages.

A
- Aaron Morton, New Zealand, @aaronmorton
Co-Founder & Principal Consultant, Apache Cassandra Consulting
http://www.thelastpickle.com

On 28/09/2013, at 12:01 AM, Oleg Dulin <oleg.du...@gmail.com> wrote:

Here is some more information. I am running a full repair on one of the nodes and I am observing strange behavior. Both DCs were up during the data load, but repair is reporting a lot of out-of-sync data. Why would that be? Is there a way for me to tell whether the WAN may be dropping hinted handoff traffic?

Regards,
Oleg
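A couple of quick node-side checks along those lines, as a sketch (run on a DC1 node):

    # dropped mutations and hinted handoff thread-pool activity
    nodetool tpstats | grep -Ei 'hinted|dropped|mutation'

    # pending inter-node commands/responses and streams in flight
    nodetool netstats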
Re: Unbalanced ring mystery multi-DC issue with 1.1.11
Wanted to add one more thing. I can also tell that the numbers are not consistent across DCs this way: I have a column family with really wide rows (a couple million columns). DC1 reports higher column counts than DC2. DC2 only becomes consistent after I run the count a couple of times and trigger a read repair. But why would the nodetool repair logs show that everything is in sync?

Regards,
Oleg

On 2013-09-27 10:23:45 +0000, Oleg Dulin said:

Consider this output from nodetool ring:

Address  DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                   127605887595351923798765477786913079396
dc1.5    DC1  RAC1  Up      Normal  32.07 GB  50.00%               0
dc2.100  DC2  RAC1  Up      Normal  8.21 GB   50.00%               100
dc1.6    DC1  RAC1  Up      Normal  32.82 GB  50.00%               42535295865117307932921825928971026432
dc2.101  DC2  RAC1  Up      Normal  12.41 GB  50.00%               42535295865117307932921825928971026532
dc1.7    DC1  RAC1  Up      Normal  28.37 GB  50.00%               85070591730234615865843651857942052864
dc2.102  DC2  RAC1  Up      Normal  12.27 GB  50.00%               85070591730234615865843651857942052964
dc1.8    DC1  RAC1  Up      Normal  27.34 GB  50.00%               127605887595351923798765477786913079296
dc2.103  DC2  RAC1  Up      Normal  13.46 GB  50.00%               127605887595351923798765477786913079396

I concealed IPs and DC names for confidentiality. All of the data loading was happening against DC1 at a pretty brisk rate of, say, 200K writes per minute. Note how my tokens are offset by 100. Shouldn't that mean that the load on each node should be roughly identical? In DC1 it is roughly 30 GB on each node; in DC2 it is almost 1/3rd of the nearest DC1 node by token range.

To verify that the nodes are in sync, I ran nodetool -h localhost repair MyKeySpace --partitioner-range on each node in DC2. Watching the logs, I see that the repair went really quickly and all column families are in sync!

I need help making sense of this. Is this because DC1 is not fully compacted? Is it because DC2 is not fully synced and I am not checking correctly? How can I tell whether replication is still in progress (note, I started my load yesterday at 9:50 am)?

--
Regards,
Oleg Dulin
http://www.olegdulin.com
Re: Unbalanced ring mystery multi-DC issue with 1.1.11
Here is some more information. I am running a full repair on one of the nodes and I am observing strange behavior. Both DCs were up during the data load, but repair is reporting a lot of out-of-sync data. Why would that be? Is there a way for me to tell whether the WAN may be dropping hinted handoff traffic?

Regards,
Oleg
Re: unbalanced ring
Maybe people think that 1.2 = vnodes, when vnodes are actually not mandatory; the advice is to upgrade first and then, after a while, when everything is running smoothly, eventually switch to vnodes...

2013/2/13 Brandon Williams <dri...@gmail.com>

On Tue, Feb 12, 2013 at 6:13 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Are vnodes on by default? It seems that many on the list are using this feature with small clusters.

They are not.

-Brandon
Re: unbalanced ring
Are vnodes on by default? It seems that many on the list are using this feature with small clusters. I know these days anything named "virtual" is sexy, but are they even useful for small clusters? I do not see why people are using them.

On Monday, February 11, 2013, aaron morton <aa...@thelastpickle.com> wrote:

> So when you say to do this with a "clean" setup, what are you asking me to do?

Yup: clear /var/lib/cassandra/data, /commitlog and /saved_caches, start the cluster, and use nodetool ring.

You may also want to play with https://github.com/pcmanus/ccm to create a local 3 node cluster to see the difference. Note that ccm's updateconf command cannot remove a config setting, so you will need to edit the yaml for the nodes.

Cheers
- Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton
http://www.thelastpickle.com

On 12/02/2013, at 7:57 AM, stephen.m.thomp...@wellsfargo.com wrote:

Aaron, thanks for your feedback.

.125  num_tokens: 256   # initial_token:
.126  num_tokens: 256   # initial_token:
.127  num_tokens: 256   # initial_token:

This all looks correct. So when you say to do this with a "clean" setup, what are you asking me to do? Is it enough to blow away /var/lib/cassandra and reload the data? Also destroy my Cassandra install (which is just un-tar) and reinstall from nothing?

Stephen Thompson
Wells Fargo Corporation
Internet Authentication & Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)
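The ccm experiment Aaron suggests would look roughly like this (a sketch; the exact flags are from memory, so double-check against ccm's help):

    # a throwaway 3-node 1.2.1 cluster on localhost, with vnodes
    ccm create vnodetest -v 1.2.1
    ccm populate -n 3
    ccm updateconf 'num_tokens: 256'
    ccm start
    ccm node1 ring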
Re: unbalanced ring
I take that back. vnodes are useful for any size cluster, but I do not see them as a day-one requirement. It seems like many people are stumbling over this.

On Tuesday, February 12, 2013, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Are vnodes on by default? It seems that many on the list are using this feature with small clusters. I know these days anything named "virtual" is sexy, but are they even useful for small clusters? I do not see why people are using them. [...]
Re: unbalanced ring
On Tue, Feb 12, 2013 at 6:13 PM, Edward Capriolo <edlinuxg...@gmail.com> wrote:

> Are vnodes on by default? It seems that many on the list are using this feature with small clusters.

They are not.

-Brandon
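For context, vnodes are opt-in via cassandra.yaml. A sketch of the relevant 1.2-era settings (leaving num_tokens unset keeps the classic one-token-per-node behaviour):

    # enable vnodes by giving the node many tokens
    num_tokens: 256

    # must remain unset/commented when num_tokens is used
    # initial_token: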
RE: unbalanced ring
Aaron, thanks for your feedback.

.125  num_tokens: 256   # initial_token:
.126  num_tokens: 256   # initial_token:
.127  num_tokens: 256   # initial_token:

This all looks correct. So when you say to do this with a "clean" setup, what are you asking me to do? Is it enough to blow away /var/lib/cassandra and reload the data? Also destroy my Cassandra install (which is just un-tar) and reinstall from nothing?

Stephen Thompson
Wells Fargo Corporation
Internet Authentication & Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)

From: aaron morton [mailto:aa...@thelastpickle.com]
Sent: Monday, February 11, 2013 12:51 PM
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

The tokens are not right, not right at all. Some are too short and some are too tall. More technically, they do not appear to be randomly arranged: the tokens for the .125 node all start with -3, the .126 node only has negative tokens, and the .127 node mostly has positive tokens.

Check that on each node the initial_token yaml setting is commented out and that num_tokens is set to 256. If you can reproduce this fault with a clean setup, please raise a ticket at https://issues.apache.org/jira/browse/CASSANDRA

Cheers
- Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton
http://www.thelastpickle.com

On 8/02/2013, at 10:36 AM, stephen.m.thomp...@wellsfargo.com wrote:

I found when I tried to do queries after sending this that although it shows a ton of data, it would no longer return ANYTHING for any query ... always 0 rows. So something was severely hosed. I blew away the data and reloaded from the database ... the data set is a little smaller than before. It shows up somewhat more balanced, although I'm still curious why the third node is so much smaller than the first two.

[root@Config3482VM1 apache-cassandra-1.2.1]# bin/nodetool status
Datacenter: 28
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.28.205.125  994.89 MB  255     33.7%             3daab184-61f0-49a0-b076-863f10bc8c6c  205
UN  10.28.205.126  966.17 MB  256     99.9%             55bbd4b1-8036-4e32-b975-c073a7f0f47f  205
UN  10.28.205.127  699.79 MB  257     66.4%             d240c91f-4901-40ad-bd66-d374a0ccf0b9  205
[root@Config3482VM1 apache-cassandra-1.2.1]#

And yes, that is the entire content of the output from the status call, unedited. I have attached the output from nodetool ring. To answer a couple of the questions below from Eric:

* One data center (28)? One rack (205)? Three nodes? Yes, that's right. We're just doing a proof of concept at the moment, so this is three VMware servers.

* How many keyspaces, and what are the replication strategies? There is one keyspace, and it has only one CF at this point.

[default@KEYSPACE_NAME] describe;
Keyspace: KEYSPACE_NAME:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [28:2]

* TL;DR What Aaron Said(tm). In the absence of rack/dc aware replication, your allocation is suspicious. -- I'm not sure what you mean by this.

Steve
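A quick way to verify the settings Aaron mentions on all three nodes (a sketch; the ssh access and the yaml path under the un-tarred install are assumptions):

    for h in 10.28.205.125 10.28.205.126 10.28.205.127; do
      echo "== $h"
      ssh root@$h "grep -E '^[# ]*(num_tokens|initial_token)' \
        /root/apache-cassandra-1.2.1/conf/cassandra.yaml"
    done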
Re: unbalanced ring
On Wed, Feb 6, 2013 at 2:02 PM, stephen.m.thomp...@wellsfargo.com wrote:

> Thanks Aaron. I ran the cassandra-shuffle job and did a rebuild and compact on each of the nodes.
>
> [root@Config3482VM1 apache-cassandra-1.2.1]# bin/nodetool status
> Datacenter: 28
> ==============
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
> UN  10.28.205.125  1.7 GB     255     33.7%             3daab184-61f0-49a0-b076-863f10bc8c6c  205
> UN  10.28.205.126  591.44 MB  256     99.9%             55bbd4b1-8036-4e32-b975-c073a7f0f47f  205
> UN  10.28.205.127  112.28 MB  257     66.4%             d240c91f-4901-40ad-bd66-d374a0ccf0b9  205

Sorry, I have to ask: is this the complete output? Have you perhaps sanitized it in some way? It seems like there is some piece of missing context here. Can you tell us:

* Is this a cluster that was upgraded to virtual nodes (that would include a 1.2.x cluster initialized with one token per node, and num_tokens set after the fact)? If so, what did the initial token map look like?
* Was initial_token used at any point along the way (either to supply a single token, or a CSV list of them), on any or all of the nodes in this cluster, at any time?
* One data center (28)? One rack (205)? Three nodes?
* How many keyspaces, and what are the replication strategies?
* What does the full output of `nodetool ring' look like now? Can you attach it?

> So this is a little better. At least node 3 has some content, but they are still far from balanced. If I am understanding this correctly, this is the distribution I would expect if the tokens were set at 15/5/1 rather than equal. As configured, I would expect roughly equal amounts of data on each node. Is that right? Do you have any suggestions for what I can look at to get there?

Shuffle should only be required if you started out with 1-token-per-node. In that case, your existing ranges are evenly divided num_tokens ways, and so should be exceptionally consistent with one another (assuming of course that the existing ranges were evenly sized). The shuffle op merely shuffles the ranges you have to (random) other nodes in the cluster.

If this cluster were started from scratch with num_tokens = 256, then a total of 768 tokens would have been randomly generated from within the murmur3 hash-space. Random assignment isn't perfect, but with 768 tokens (256 per node), it should work out to be reasonably close on average.

TL;DR What Aaron Said(tm). In the absence of rack/dc aware replication, your allocation is suspicious.

> I have about 11M rows of data in this keyspace and none of them are exceptionally long ... it's data pulled from Oracle and didn't include any BLOBs, etc.

[ ... ]
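One way to sanity-check the random assignment Eric describes is to count how many of the 768 tokens each endpoint actually holds, straight from the flat ring listing (a sketch; the address prefix matches this cluster):

    bin/nodetool ring | awk '$1 ~ /^10\.28\./ {count[$1]++}
                             END {for (ip in count) print ip, count[ip]}'

With num_tokens: 256 on each node this should print three counts close to 255-257, matching the Tokens column of nodetool status.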
RE: unbalanced ring
Thanks Aaron. I ran the cassandra-shuffle job and did a rebuild and compact on each of the nodes.

[root@Config3482VM1 apache-cassandra-1.2.1]# bin/nodetool status
Datacenter: 28
==============
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load       Tokens  Owns (effective)  Host ID                               Rack
UN  10.28.205.125  1.7 GB     255     33.7%             3daab184-61f0-49a0-b076-863f10bc8c6c  205
UN  10.28.205.126  591.44 MB  256     99.9%             55bbd4b1-8036-4e32-b975-c073a7f0f47f  205
UN  10.28.205.127  112.28 MB  257     66.4%             d240c91f-4901-40ad-bd66-d374a0ccf0b9  205

So this is a little better. At least node 3 has some content, but they are still far from balanced. If I am understanding this correctly, this is the distribution I would expect if the tokens were set at 15/5/1 rather than equal. As configured, I would expect roughly equal amounts of data on each node. Is that right? Do you have any suggestions for what I can look at to get there?

I have about 11M rows of data in this keyspace and none of them are exceptionally long ... it's data pulled from Oracle and didn't include any BLOBs, etc.

Stephen Thompson
Wells Fargo Corporation
Internet Authentication & Fraud Prevention
704.427.3137 (W) | 704.807.3431 (C)
Re: unbalanced ring
Use nodetool status with vnodes: http://www.datastax.com/dev/blog/upgrading-an-existing-cluster-to-vnodes

The different load can be caused by rack affinity; are all the nodes in the same rack? Another simple check: have you created some very big rows?

Cheers
- Aaron Morton, Freelance Cassandra Developer, New Zealand, @aaronmorton
http://www.thelastpickle.com

On 6/02/2013, at 8:40 AM, stephen.m.thomp...@wellsfargo.com wrote:

So I have three nodes in a ring in one data center. My configuration has num_tokens: 256 set and initial_token commented out. When I look at the ring, it shows me all of the token ranges of course, and basically identical data for each range on each node. Here is the Cliff's Notes version of what I see:

[root@Config3482VM2 apache-cassandra-1.2.0]# bin/nodetool ring
Datacenter: 28
==============
Replicas: 1
Address        Rack  Status  State   Load      Owns     Token
                                                        9187343239835811839
10.28.205.125  205   Up      Normal  2.85 GB   33.69%   -3026347817059713363
10.28.205.125  205   Up      Normal  2.85 GB   33.69%   -3026276684526453414
10.28.205.125  205   Up      Normal  2.85 GB   33.69%   -3026205551993193465
(etc)
10.28.205.126  205   Up      Normal  1.15 GB   100.00%  -9187343239835811840
10.28.205.126  205   Up      Normal  1.15 GB   100.00%  -9151314442816847872
10.28.205.126  205   Up      Normal  1.15 GB   100.00%  -9115285645797883904
(etc)
10.28.205.127  205   Up      Normal  69.13 KB  66.30%   -9223372036854775808
10.28.205.127  205   Up      Normal  69.13 KB  66.30%   36028797018963967
10.28.205.127  205   Up      Normal  69.13 KB  66.30%   72057594037927935
(etc)

So at this point I have a number of questions. The biggest question is of Load. Why does the .125 node have 2.85 GB, .126 have 1.15 GB, and .127 only 69.13 KB? These boxes are all comparable and all configured identically.

partitioner: org.apache.cassandra.dht.Murmur3Partitioner

I'm sorry to ask so many questions; I'm having a hard time finding documentation that explains this stuff.

Stephen
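To check the "very big rows" theory, the per-CF statistics are a quick read (a sketch; the keyspace and column family names are placeholders):

    # max/mean compacted row sizes for every CF on this node
    bin/nodetool cfstats | grep -Ei 'column family:|row.*size'

    # distribution of row sizes and column counts for one CF
    bin/nodetool cfhistograms KEYSPACE_NAME COLUMN_FAMILY_NAME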
Re: unbalanced ring
Tamar, be careful: Datastax doesn't recommend major compactions in a production environment.

If I got it right, performing a major compaction will merge all your SSTables into one big one, improving your read performance substantially, at least for a while... The problem is that it will effectively disable minor compactions too (because of the difference in size between this big SSTable and the new ones, if I remember well). So your read performance will decrease until your other SSTables reach the size of the big one you created, or until you run another major compaction, turning it into a normal maintenance process like repair is.

But, knowing that, I still don't know if we both (Tamar and I) shouldn't run it anyway (in my case it would greatly decrease the size of my data, 133 GB -> 35 GB, and maybe load the cluster evenly...).

Alain

2012/10/10 B. Todd Burruss <bto...@gmail.com>

> it should not have any other impact except increased usage of system resources. and i suppose, cleanup would not have an effect (over normal compaction) if all nodes contain the same data [...]
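For reference, the operation under discussion, with a way to watch it and to confirm the result (a sketch; the keyspace and column family names are placeholders):

    # merge all SSTables of one CF into a single SSTable
    nodetool compact MyKeyspace MyColumnFamily

    # watch it run
    nodetool compactionstats

    # afterwards the CF's "SSTable count" should read 1
    nodetool cfstats | grep -Ei 'column family:|sstable count'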
RE: unbalanced ring
To run, or not to run? All this depends on the use case. There are no problems running major compactions in one case (we do it nightly); there could be problems in another. You just need to understand how everything works.

Best regards / Pagarbiai
Viktor Jevdokimov, Senior Developer
Adform, J. Jasinskio 16C, LT-01112 Vilnius, Lithuania

From: Alain RODRIGUEZ [mailto:arodr...@gmail.com]
Sent: Thursday, October 11, 2012 09:17
To: user@cassandra.apache.org
Subject: Re: unbalanced ring

> Tamar, be careful: Datastax doesn't recommend major compactions in a production environment. [...]
Re: unbalanced ring
Hi!

I am re-posting this, now that I have more data and still an unbalanced ring:

3 nodes, RF=3, RCL=WCL=QUORUM

Address  DC       Rack  Status  State   Load      Owns    Token
                                                          113427455640312821154458202477256070485
x.x.x.x  us-east  1c    Up      Normal  24.02 GB  33.33%  0
y.y.y.y  us-east  1c    Up      Normal  33.45 GB  33.33%  56713727820156410577229101238628035242
z.z.z.z  us-east  1c    Up      Normal  29.85 GB  33.33%  113427455640312821154458202477256070485

repair runs weekly. I don't run nodetool compact, as I read that this may cause the regular minor compactions not to run, and then I would have to run compact manually. Is that right?

Any idea if this means something is wrong, and if so, how to solve it?

Thanks,
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736 | Mob: +972 54 8356490 | Fax: +972 2 5612956

On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel <ta...@tok-media.com> wrote:

Thanks, I will wait and see as data accumulates.

Tamar

On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen <ro...@us2.nl> wrote:

Cassandra is built to store tons and tons of data. In my opinion roughly ~6MB per node is not enough data to allow it to become a fully balanced cluster.

2012/3/27 Tamar Fraenkel <ta...@tok-media.com>

This morning I have nodetool ring -h localhost

Address        DC       Rack  Status  State   Load     Owns    Token
                                                               113427455640312821154458202477256070485
10.34.158.33   us-east  1c    Up      Normal  5.78 MB  33.33%  0
10.38.175.131  us-east  1c    Up      Normal  7.23 MB  33.33%  56713727820156410577229101238628035242
10.116.83.10   us-east  1c    Up      Normal  5.02 MB  33.33%  113427455640312821154458202477256070485

Version is 1.0.8.

On Tue, Mar 27, 2012 at 4:05 AM, Maki Watanabe <watanabe.m...@gmail.com> wrote:

What version are you using? Anyway, try nodetool repair & compact.

maki

2012/3/26 Tamar Fraenkel <ta...@tok-media.com>

Hi!

I created an Amazon ring using the DataStax image and started filling the db. The cluster seems unbalanced.

nodetool ring returns:

Address        DC       Rack  Status  State   Load       Owns    Token
                                                                 113427455640312821154458202477256070485
10.34.158.33   us-east  1c    Up      Normal  514.29 KB  33.33%  0
10.38.175.131  us-east  1c    Up      Normal  1.5 MB     33.33%  56713727820156410577229101238628035242
10.116.83.10   us-east  1c    Up      Normal  1.5 MB     33.33%  113427455640312821154458202477256070485

[default@tok] describe;
Keyspace: tok:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:2]

[default@tok] describe cluster;
Cluster Information:
  Snitch: org.apache.cassandra.locator.Ec2Snitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions:
    4687d620-7664-11e1--1bcb936807ff: [10.38.175.131, 10.34.158.33, 10.116.83.10]

Any idea what is the cause? I am running similar code on a local ring and it is balanced. How can I fix this?

Thanks,
Tamar

--
With kind regards,
Robin Verlangen
www.robinverlangen.nl
Re: unbalanced ring
Hi,

Same thing here: 2 nodes, RF = 2, RCL = 1, WCL = 1. Like Tamar I never ran a major compaction, and repair runs once a week on each node.

10.59.21.241  eu-west  1b  Up  Normal  133.02 GB  50.00%  0
10.58.83.109  eu-west  1b  Up  Normal  98.12 GB   50.00%  85070591730234615865843651857942052864

What phenomena could explain the result above?

By the way, I have copied the data and imported it into a one-node dev cluster, where I ran a major compaction, and the size of my data was significantly reduced (to about 32 GB instead of 133 GB). How is that possible?

Do you think that if I run a major compaction on both nodes it will balance the load evenly? Should I run major compaction in production?

2012/10/10 Tamar Fraenkel <ta...@tok-media.com>

> Hi! I am re-posting this, now that I have more data and still an unbalanced ring: 3 nodes, RF=3, RCL=WCL=QUORUM. [...]
Re: unbalanced ring
major compaction in production is fine, however it is a heavy operation on the node and will take I/O and some CPU.

the only time i have seen this happen is when i have changed the tokens in the ring, like nodetool movetoken. cassandra does not auto-delete data that it doesn't use anymore, just in case you want to move the tokens again or otherwise undo. try nodetool cleanup

On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi, Same thing here: 2 nodes, RF = 2, RCL = 1, WCL = 1. Like Tamar I never ran a major compaction, and repair runs once a week on each node.
>
> 10.59.21.241  eu-west  1b  Up  Normal  133.02 GB  50.00%  0
> 10.58.83.109  eu-west  1b  Up  Normal  98.12 GB   50.00%  85070591730234615865843651857942052864
>
> What phenomena could explain the result above? [...]
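The move-then-cleanup sequence Todd describes, as a sketch (the token value is only a placeholder):

    # re-assign this node's position on the ring...
    nodetool move 56713727820156410577229101238628035242

    # ...then drop the data it no longer owns
    nodetool cleanup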
Re: unbalanced ring
Hi!

Apart from being a heavy load (the compact), will it have other effects? Also, will cleanup help if I have replication factor = number of nodes?

Thanks
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com
Tel: +972 2 6409736 | Mob: +972 54 8356490 | Fax: +972 2 5612956

On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss <bto...@gmail.com> wrote:

> major compaction in production is fine, however it is a heavy operation on the node and will take I/O and some CPU. the only time i have seen this happen is when i have changed the tokens in the ring, like nodetool movetoken. cassandra does not auto-delete data that it doesn't use anymore, just in case you want to move the tokens again or otherwise undo. try nodetool cleanup [...]
Re: unbalanced ring
it should not have any other impact except increased usage of system resources. and i suppose, cleanup would not have an affect (over normal compaction) if all nodes contain the same data On Wed, Oct 10, 2012 at 12:12 PM, Tamar Fraenkel ta...@tok-media.comwrote: Hi! Apart from being heavy load (the compact), will it have other effects? Also, will cleanup help if I have replication factor = number of nodes? Thanks *Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Wed, Oct 10, 2012 at 6:12 PM, B. Todd Burruss bto...@gmail.com wrote: major compaction in production is fine, however it is a heavy operation on the node and will take I/O and some CPU. the only time i have seen this happen is when i have changed the tokens in the ring, like nodetool movetoken. cassandra does not auto-delete data that it doesn't use anymore just in case you want to move the tokens again or otherwise undo. try nodetool cleanup On Wed, Oct 10, 2012 at 2:01 AM, Alain RODRIGUEZ arodr...@gmail.comwrote: Hi, Same thing here: 2 nodes, RF = 2. RCL = 1, WCL = 1. Like Tamar I never ran a major compaction and repair once a week each node. 10.59.21.241eu-west 1b Up Normal 133.02 GB 50.00% 0 10.58.83.109eu-west 1b Up Normal 98.12 GB 50.00% 85070591730234615865843651857942052864 What phenomena could explain the result above ? By the way, I have copy the data and import it in a one node dev cluster. There I have run a major compaction and the size of my data has been significantly reduced (to about 32 GB instead of 133 GB). How is that possible ? Do you think that if I run major compaction in both nodes it will balance the load evenly ? Should I run major compaction in production ? 2012/10/10 Tamar Fraenkel ta...@tok-media.com Hi! I am re-posting this, now that I have more data and still *unbalanced ring*: 3 nodes, RF=3, RCL=WCL=QUORUM Address DC RackStatus State Load OwnsToken 113427455640312821154458202477256070485 x.x.x.xus-east 1c Up Normal 24.02 GB 33.33% 0 y.y.y.y us-east 1c Up Normal 33.45 GB 33.33% 56713727820156410577229101238628035242 z.z.z.zus-east 1c Up Normal 29.85 GB 33.33% 113427455640312821154458202477256070485 repair runs weekly. I don't run nodetool compact as I read that this may cause the minor regular compactions not to run and then I will have to run compact manually. Is that right? Any idea if this means something wrong, and if so, how to solve? Thanks, * Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Tue, Mar 27, 2012 at 9:12 AM, Tamar Fraenkel ta...@tok-media.comwrote: Thanks, I will wait and see as data accumulates. Thanks, *Tamar Fraenkel * Senior Software Engineer, TOK Media [image: Inline image 1] ta...@tok-media.com Tel: +972 2 6409736 Mob: +972 54 8356490 Fax: +972 2 5612956 On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen ro...@us2.nl wrote: Cassandra is built to store tons and tons of data. In my opinion roughly ~ 6MB per node is not enough data to allow it to become a fully balanced cluster. 
Re: Unbalanced ring in Cassandra 0.8.4
> Does cleanup only clean up keys that no longer belong to that node?
Yes.

I guess it could be an artefact of the bulk load. It's not been reported previously, though. Try the cleanup and see how it goes.

Cheers
- Aaron Morton, Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 21/06/2012, at 1:34 AM, Raj N wrote:
Nick, thanks for the response. Does cleanup only clean up keys that no longer belong to that node? Just to add more color: when I bulk loaded all my data into these 6 nodes, all of them had the same amount of data. After the first nodetool repair, the first node started having more data than the rest of the cluster, and since then it has never come back down. When I run cfstats on the node, the amount of data for every column family is almost 2 times the amount on the other nodes. This is true for the number-of-keys estimate as well. For one CF I see more than double the number of keys, and that is the largest CF too, with 34 GB of data. Thanks -Rajesh

On Wed, Jun 20, 2012 at 12:32 AM, Nick Bailey n...@datastax.com wrote:
No. Cleanup will scan each sstable to remove data that is no longer owned by that specific node. It won't compact the sstables together, however.

On Tue, Jun 19, 2012 at 11:11 PM, Raj N raj.cassan...@gmail.com wrote:
But won't that also run a major compaction, which is not recommended anymore? -Raj

On Sun, Jun 17, 2012 at 11:58 PM, aaron morton aa...@thelastpickle.com wrote:
Assuming you have been running repair, it can't hurt.

On 17/06/2012, at 4:06 AM, Raj N wrote:
Nick, do you think I should still run cleanup on the first node? -Rajesh

On Fri, Jun 15, 2012 at 3:47 PM, Raj N raj.cassan...@gmail.com wrote:
I did run nodetool move, but that was when I was setting up the cluster, which means I didn't have any data at that time. -Raj

On Fri, Jun 15, 2012 at 1:29 PM, Nick Bailey n...@datastax.com wrote:
Did you start all your nodes at the correct tokens, or did you balance by moving them? Moving nodes around won't delete unneeded data after the move is done. Try running 'nodetool cleanup' on all of your nodes.

On Fri, Jun 15, 2012 at 12:24 PM, Raj N raj.cassan...@gmail.com wrote:
Actually I am not worried about the percentage; it's the data I am concerned about. Look at the first node: it has 102.07 GB of data, and the other nodes have around 60 GB (one has 69, but let's ignore that one). I am not understanding why the first node has almost double the data. Thanks -Raj

On Fri, Jun 15, 2012 at 11:06 AM, Nick Bailey n...@datastax.com wrote:
This is just a known problem with the nodetool output and multiple DCs. Your configuration is correct. The problem with nodetool is fixed in 1.1.1: https://issues.apache.org/jira/browse/CASSANDRA-3412

On Fri, Jun 15, 2012 at 9:59 AM, Raj N raj.cassan...@gmail.com wrote:
Hi experts, I have a 6-node cluster across 2 DCs (DC1:3, DC2:3).
I have assigned tokens using the first strategy (adding 1) mentioned here: http://wiki.apache.org/cassandra/Operations?#Token_selection But when I run nodetool ring on my cluster, this is the result I get:

Address        DC    Rack    Status   State    Load        Owns     Token
                                                                    113427455640312814857969558651062452225
172.17.72.91   DC1   RAC13   Up       Normal   102.07 GB   33.33%   0
45.10.80.144   DC2   RAC5    Up       Normal   59.1 GB     0.00%    1
172.17.72.93   DC1   RAC18   Up       Normal   59.57 GB    33.33%   56713727820156407428984779325531226112
45.10.80.146   DC2   RAC7    Up       Normal   59.64 GB    0.00%    56713727820156407428984779325531226113
172.17.72.95   DC1   RAC19   Up       Normal   69.58 GB    33.33%   113427455640312814857969558651062452224
45.10.80.148   DC2   RAC9    Up       Normal   59.31 GB    0.00%    113427455640312814857969558651062452225

As you can see, the first node has considerably more load than the others (almost double), which is surprising since all these are replicas of each other. I am running Cassandra 0.8.4. Is there an explanation for this behaviour? Could https://issues.apache.org/jira/browse/CASSANDRA-2433 be the cause for this? Thanks -Raj
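For reference, a sketch of how the "offset by 1" token scheme above can be computed for RandomPartitioner (my own illustration, not from the thread; the exact integers may differ slightly from Raj's depending on the generator used):

    #!/bin/sh
    # Evenly spaced RandomPartitioner tokens for DC1; DC2 gets the same
    # tokens offset by 1, per the wiki's first token-selection strategy.
    NODES=3
    i=0
    while [ $i -lt $NODES ]; do
      t=$(echo "$i * 2^127 / $NODES" | bc)   # arbitrary precision via bc
      echo "DC1 node $i: initial_token = $t"
      echo "DC2 node $i: initial_token = $(echo "$t + 1" | bc)"
      i=$((i + 1))
    done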
Re: unbalanced ring
This morning I have nodetool ring -h localhost:

Address         DC        Rack   Status   State    Load      Owns     Token
                                                                      113427455640312821154458202477256070485
10.34.158.33    us-east   1c     Up       Normal   5.78 MB   33.33%   0
10.38.175.131   us-east   1c     Up       Normal   7.23 MB   33.33%   56713727820156410577229101238628035242
10.116.83.10    us-east   1c     Up       Normal   5.02 MB   33.33%   113427455640312821154458202477256070485

Version is 1.0.8.
Tamar
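As an aside, a quick way to eyeball load versus ownership from that output (my own sketch; the field positions assume the 1.0.x ring format shown above and may differ in other versions):

    # Print address, load, and ownership for each node in Normal state.
    nodetool -h localhost ring | awk '/Normal/ {print $1, $6 $7, $8}'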
Re: unbalanced ring
Thanks, I will wait and see as data accumulates.
Tamar

On Tue, Mar 27, 2012 at 9:00 AM, R. Verlangen ro...@us2.nl wrote:
Cassandra is built to store tons and tons of data. In my opinion, roughly ~6 MB per node is not enough data to allow it to become a fully balanced cluster.

--
With kind regards,
Robin Verlangen
www.robinverlangen.nl
Re: unbalanced ring
> How can I fix this?
Add more data. 1.5 MB is not enough to get reliable reports.
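If it helps anyone searching the archives, one way to generate enough test data is the stress tool that ships with Cassandra (a sketch assuming the 1.0-era tools/stress binary; the path and flags may differ between versions):

    # Insert a million generated rows against one of the nodes above,
    # then re-check the ring distribution.
    tools/stress/bin/stress -d 10.34.158.33 -n 1000000 -o insert
    nodetool -h localhost ring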
Re: unbalanced ring
What version are you using? Anyway, try nodetool repair and compact.
maki

2012/3/26 Tamar Fraenkel ta...@tok-media.com
Hi! I created an Amazon ring using the DataStax image and started filling the db. The cluster seems unbalanced. nodetool ring returns:

Address         DC        Rack   Status   State    Load        Owns     Token
                                                                        113427455640312821154458202477256070485
10.34.158.33    us-east   1c     Up       Normal   514.29 KB   33.33%   0
10.38.175.131   us-east   1c     Up       Normal   1.5 MB      33.33%   56713727820156410577229101238628035242
10.116.83.10    us-east   1c     Up       Normal   1.5 MB      33.33%   113427455640312821154458202477256070485

[default@tok] describe;
Keyspace: tok:
  Replication Strategy: org.apache.cassandra.locator.SimpleStrategy
  Durable Writes: true
  Options: [replication_factor:2]
[default@tok] describe cluster;
Cluster Information:
  Snitch: org.apache.cassandra.locator.Ec2Snitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions:
    4687d620-7664-11e1--1bcb936807ff: [10.38.175.131, 10.34.158.33, 10.116.83.10]

Any idea what is the cause? I am running similar code on a local ring and it is balanced. How can I fix this? Thanks,
Tamar Fraenkel
Senior Software Engineer, TOK Media
ta...@tok-media.com