Re: Nodetool ring and Replicas after 1.2 upgrade
Thanks Jason. No errors in the log. Also, the nodes do have a consistent schema for the keyspace (although this was a problem during the upgrade that we resolved using the procedure specified here: https://wiki.apache.org/cassandra/FAQ#schema_disagreement). -Mike
Re: Nodetool ring and Replicas after 1.2 upgrade
Hi Michael, I can barely access the internet right now and was not able to check outputs on my computer, but the first thing that comes to mind is that since 1.2.x (and vnodes) I rather use nodetool status instead. What is the nodetool status output? Also, did you try to specify the keyspace? Since RF is a per-keyspace value, maybe this would help. Other than that, I don't have any idea. I don't remember anything similar, but it was a while ago. I have to ask... why stay so far behind the current stable / production-ready version? C*heers, Alain
Re: Nodetool ring and Replicas after 1.2 upgrade
After looking at the Cassandra code a little, I believe this is not really an issue. After the upgrade to 1.2, we still see the issue described in this bug I filed: https://issues.apache.org/jira/browse/CASSANDRA-5264. The Replicas value is calculated by adding up the effective ownership of all the nodes and chopping off the remainder. So if your effective ownership is 299.99%, the code will report the number of replicas as 2. This might become reliably 3 after I complete running repairs after the upgrade. Thanks for your time, -Mike
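For reference, a minimal sketch of the arithmetic Mike describes (hypothetical Python, not the actual Cassandra code):

import math

def replicas_reported(effective_ownership_pcts):
    # nodetool ring's Replicas field, per Mike's reading of the code:
    # sum each node's effective ownership, then truncate the total.
    total = sum(effective_ownership_pcts)
    return math.floor(total / 100.0)

print(replicas_reported([100.0, 100.0, 99.99]))   # -> 2 (pre-repair rounding)
print(replicas_reported([100.0, 100.0, 100.0]))   # -> 3 (after repair)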
Re: Nodetool ring and Replicas after 1.2 upgrade
Maybe check the system.log to see if there is any exception and/or error? Check as well whether they have a consistent schema for the keyspace? hth, jason
Nodetool ring and Replicas after 1.2 upgrade
Hello, We (finally) have just upgraded from Cassandra 1.1 to Cassandra 1.2.19. Everything appears to be up and running normally; however, we have noticed unusual output from nodetool ring. There is a new (to us) field, Replicas, in the nodetool output, and this field, seemingly at random, is changing from 2 to 3 and back to 2. We are using the byte-ordered partitioner (we hash our own keys) and have a replication factor of 3. We are also on AWS and use the Ec2Snitch in a single datacenter. Other calls appear to be normal: nodetool getendpoints returns the proper endpoints when querying various keys, and nodetool ring and status report that all nodes appear healthy. Anyone have any hints on what may be happening, or whether this is a problem we should be concerned with? Thanks, -Mike
How to use nodetool ring only for one data center
Hi, I wanted to know how we can get the token ring information for only one data center when using vnodes and multiple data centers. Thanks, Surbhi
Re: How to use nodetool ring only for one data center
Do you want this for some sort of reporting requirement? If so, you may be able to write a quick shell script using grep to remove the unwanted data. Rahul
Re: How to use nodetool ring only for one data center
When we have multiple datacenters, the datacenter (DC1, etc.) in the output is not easy to associate with each node the way the rack is.
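For what it's worth, Rahul's grep idea sketched in Python instead, keeping only one datacenter's block from nodetool status (the usual "Datacenter: NAME" section headers are assumed; the exact layout varies by version):

import subprocess

def status_for_dc(dc_name):
    out = subprocess.run(["nodetool", "status"],
                         capture_output=True, text=True).stdout
    keep, in_dc = [], False
    for line in out.splitlines():
        if line.startswith("Datacenter:"):
            in_dc = line.split(":", 1)[1].strip() == dc_name
        if in_dc:
            keep.append(line)
    return "\n".join(keep)

print(status_for_dc("DC1"))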
Re: Nodetool ring
Owns is how much of the entire, cluster-wide data set the node has. In both your examples every node has a full copy of the data. If you had 6 nodes and RF 3, they would each have 50%.

Cheers
-
Aaron Morton
New Zealand
@aaronmorton
Co-Founder
Principal Consultant
Apache Cassandra Consulting
http://www.thelastpickle.com
Nodetool ring
Hi, I am trying to understand Owns here. AFAIK, it is a range (part of the keyspace). I am not able to understand why it is shown as 100%. Is it because of effective ownership?

Address  Rack  Status  State   Load      Owns     Token
                                                  -3074457345618258503
x.x.x.x  3     Up      Normal  91.28 MB  100.00%  3074457345618258702
x.x.x.x  1     Up      Normal  83.45 MB  100.00%  -9223372036854775708
x.x.x.x  2     Up      Normal  90.11 MB  100.00%  -3074457345618258503

Any suggestions? -Vivek
Re: Nodetool ring
When RF=N, effective ownership for each node is 100%. This is almost certainly what you are seeing, given a 3 node cluster (which probably has RF=3...). =Rob
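A quick back-of-envelope check on those numbers (a sketch that assumes evenly balanced tokens in a single DC; it is not Cassandra's own computation):

def effective_ownership(rf, nodes):
    # with evenly spaced tokens, each node stores RF out of N ranges
    return min(rf, nodes) / nodes * 100

print(effective_ownership(3, 3))  # 100.0 -> RF=N, every node holds everything
print(effective_ownership(3, 6))  # 50.0  -> Aaron's 6-node, RF=3 example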
Re: Nodetool ring
Thanks for your quick reply. Even with 2 data centers with 3 nodes each, I am seeing 100% on the nodes in both data centers. -Vivek
Re: Nodetool ring
Do you have RF=3 in both? =Rob
Re: Nodetool ring
Yes.
Output of nodetool ring with virtual nodes
Hello, I recently did the "Enabling virtual nodes on an existing production cluster" procedure (http://www.datastax.com/documentation/cassandra/1.2/webhelp/cassandra/configuration/configVnodesProduction_t.html), and noticed that the output of the command nodetool ring changes significantly when virtual nodes are enabled in a new data center. Before, it showed only 1 token per node; now it shows 256 tokens per node (output below). So that means 256*N entries, which makes the command unreadable, while before it was pretty useful to check the cluster status in a human-readable format. Moreover, the command is taking much longer to execute. Is this expected behavior, or did I make any mistake during the procedure? Cassandra version: 1.2.10

Before it was like this:

Datacenter: VNodesDisabled
==========
Replicas: 3
Address         Rack  Status  State   Load       Owns    Token
                                                         28356863910078205239614050619314017619
AAA.BBB.CCC.1   x     Up      Normal  236.49 GB  20.83%  113427455640312821154458002477256070480
AAA.BBB.CCC.2   x     Up      Normal  347.6 GB   29.17%  77981375752715064543690004203113548455
AAA.BBB.CCC.3   x     Up      Normal  332.46 GB  37.50%  106338614526609105785626408013334622686
AAA.BBB.CCC.4   x     Up      Normal  198.94 GB  20.83%  141784319550391026443072753090570088104
AAA.BBB.CCC.5   x     Up      Normal  330.68 GB  33.33%  92159807707754167187997289512070557265
AAA.BBB.CCC.6   x     Up      Normal  268.64 GB  25.00%  155962751505430129087380028400227096915
AAA.BBB.CCC.7   x     Up      Normal  262.43 GB  25.00%  163051967482949680409533666060055601314
AAA.BBB.CCC.8   x     Up      Normal  200.18 GB  16.67%  1
AAA.BBB.CCC.9   x     Up      Normal  189.13 GB  16.67%  120516671617832372476611040132084574885
AAA.BBB.CCC.10  x     Up      Normal  220.7 GB   25.00%  42535295865117307932921025928971026429
AAA.BBB.CCC.11  x     Up      Normal  259.36 GB  25.00%  35446079887597756610768088274142522024
AAA.BBB.CCC.12  x     Up      Normal  270.32 GB  25.00%  28356863910078205088614550619314017619

Now it is like this:

Datacenter: VNodesEnabled
==========
Replicas: 3
Address        Rack  Status  State   Load       Owns   Token
                                                       168998414504718061309167200639854699955
XXX.YYY.ZZZ.1  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.1  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.1  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.1  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
XXX.YYY.ZZZ.2  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.2  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.2  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.2  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
XXX.YYY.ZZZ.3  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.3  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.3  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.3  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
XXX.YYY.ZZZ.4  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.4  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.4  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.4  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
XXX.YYY.ZZZ.5  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.5  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.5  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.5  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
XXX.YYY.ZZZ.6  y     Up      Normal  122.84 KB  0.00%  4176479009577065052560790400565254
XXX.YYY.ZZZ.6  y     Up      Normal  122.84 KB  0.00%  291517050854558940844583227825291566
XXX.YYY.ZZZ.6  y     Up      Normal  122.84 KB  0.00%  389126351568277133928956802249918052
XXX.YYY.ZZZ.6  y     Up      Normal  122.84 KB  0.00%  504218791605899949008255495493335240
Re: Output of nodetool ring with virtual nodes
Hi Paulo, Yes, that is expected. Now that you are using virtual nodes you should use nodetool status to see an output similar to what you saw with nodetool ring before you enabled virtual nodes. -Ike Walker
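One quick way to make nodetool ring digestible again with vnodes is to count tokens per endpoint instead of reading 256*N rows. A rough Python sketch (the column layout and numeric-looking addresses are assumptions; adjust the filter for your output):

import subprocess
from collections import Counter

out = subprocess.run(["nodetool", "ring"],
                     capture_output=True, text=True).stdout
counts = Counter(line.split()[0] for line in out.splitlines()
                 if line.strip() and line.split()[0][0].isdigit())
for addr, n in sorted(counts.items()):
    print(addr, n)  # expect 256 tokens per node with the default num_tokens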
Re: Output of nodetool ring with virtual nodes
It's expected. I think nodetool status is meant to replace nodetool ring.
Re: Output of nodetool ring with virtual nodes
That's cool! Many thanks! :-)
Re: nodetool ring showing different 'Load' size
Ok. Thank you all, guys.

Att.
Rodrigo Felix de Almeida
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP
Re: nodetool ring showing different 'Load' size
Thanks Eric. Is there a way to start compaction operations manually? I'm thinking about doing it after loading the data and before starting the run phase of the benchmark. Thanks.

Att.
Rodrigo Felix de Almeida
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP
Re: nodetool ring showing different 'Load' size
You can start compaction via JMX if you need it and you know what you're doing: find the org.apache.cassandra.db:type=CompactionManager MBean and the forceUserDefinedCompaction operation in it. The first argument is the keyspace name; the second one is a comma-separated list of SSTables (filenames) to compact. You can also perform a major compaction via nodetool compact (for SizeTieredCompaction), but, again, you really should not do it unless you're sure about what you're doing, as it compacts all the SSTables together, which is not something you want in most cases. M.
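For Rodrigo's benchmark use case, one way to script the nodetool route Michal mentions between the load and run phases (a sketch; the keyspace and column family names are placeholders):

import subprocess

def major_compact(keyspace, column_family, host="127.0.0.1"):
    # Forces a major compaction (SizeTiered), merging all SSTables of the
    # column family into one. Heed Michal's warning before using this.
    subprocess.run(["nodetool", "-h", host, "compact",
                    keyspace, column_family], check=True)

major_compact("usertable", "data")  # hypothetical benchmark schema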
Re: nodetool ring showing different 'Load' size
If you do that and discover you did not want to: https://github.com/pcmanus/cassandra/tree/sstable_split will enable you to split your monolithic sstable back into smaller sstables. =Rob PS - @pcmanus, here's that reminder we discussed @ summit to merge this tool into upstream! :D
nodetool ring showing different 'Load' size
Hi, I've been running a benchmark on Cassandra and I'm facing a problem regarding the size of the database. I performed a load phase and then, when running nodetool ring, I got the following output:

ubuntu@domU-12-31-39-0E-11-F1:~/cassandra$ bin/nodetool ring
Address        DC           Rack   Status  State   Load     Effective-Ownership  Token
                                                                                 85070591730234615865843651857942052864
10.192.18.3    datacenter1  rack1  Up      Normal  2.07 GB  50.00%               0
10.85.135.169  datacenter1  rack1  Up      Normal  2.09 GB  50.00%               85070591730234615865843651857942052864

After that I executed, for about one hour, a workload with scan and insert queries. Then, after finishing the workload execution, I ran nodetool ring again and got the following:

ubuntu@domU-12-31-39-0E-11-F1:~/cassandra$ bin/nodetool ring
Address        DC           Rack   Status  State   Load     Effective-Ownership  Token
                                                                                 85070591730234615865843651857942052864
10.192.18.3    datacenter1  rack1  Up      Normal  1.07 GB  50.00%               0
10.85.135.169  datacenter1  rack1  Up      Normal  2.15 GB  50.00%               85070591730234615865843651857942052864

Any idea why a node had its size reduced if no record was removed? No machine was added or removed during this workload. Is this related to any kind of compression? If yes, is there a command to confirm that? I also faced a problem where a node had its size increased from about 2 GB to about 4 GB. In this last scenario, I both added and removed nodes during the workload depending on the load (CPU). Thanks in advance for any help.

Att.
Rodrigo Felix de Almeida
LSBD - Universidade Federal do Ceará
Project Manager
MBA, CSM, CSPO, SCJP
Re: nodetool ring showing different 'Load' size
Load is the size of the storage on disk, as I understand it. This can fluctuate during normal usage even if records are not being added or removed; a node's load may be reduced during compaction, for example. During compaction, especially if you use the Size Tiered Compaction strategy (the default), load may temporarily double for a column family.
Re: nodetool ring generate strange info
Looks like you are using vnodes; use nodetool status instead.

Cheers
-
Aaron Morton
Freelance Cassandra Consultant
New Zealand
@aaronmorton
http://www.thelastpickle.com

On Fri, May 10, 2013 at 10:19 AM, 杨辉强 huiqiangy...@yunrang.com wrote:
Hi, all. I use ./bin/nodetool -h 10.21.229.32 ring. It generates lots of info for the same host, like this:

10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  8875305964978355793
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  8875770246221977199
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  8875903273282028661
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  9028992266297813652
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  9130157610675408105
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  9145604352014775913
10.21.229.32  rack1  Up  Normal  928.3 MB  24.80%  9182228238626921304

Is this normal?
Re: nodetool ring generate strange info
Same host, multiple Cassandra instances? But it looks wrong; what Cassandra version?
Re: nodetool ring generate strange info
Do you use vnodes?
misreports on nodetool ring command on 1.1.4
I just installed 1.1.4 as I need to test the upgrade to 1.2.2. I have an existing 6 node cluster which shows 50% ownership on each node, which makes sense since RF=3 on everything I have. I brought up all 4 nodes in this cluster and ran nodetool ring, and it shows every node with 25%. Then I create a keyspace, run it again, and it shows every node at 0%. Exact output was the following:

[cassandra@sdi-prod-04 ~]$ nodetool ring
Note: Ownership information does not include topology, please specify a keyspace.
Address     DC   Rack  Status  State   Load      Owns    Token
                                                         127605887595351923798765477786913079295
10.20.5.82  DC1  RAC1  Up      Normal  11.37 KB  25.00%  0
10.20.5.83  DC1  RAC1  Up      Normal  11.14 KB  25.00%  42535295865117307932921825928971026431
10.20.5.84  DC1  RAC1  Up      Normal  11.14 KB  25.00%  85070591730234615865843651857942052863
10.20.5.85  DC1  RAC1  Up      Normal  15.65 KB  25.00%  127605887595351923798765477786913079295

After I created the keyspace, I ran it again:

[cassandra@sdi-prod-04 ~]$ nodetool ring
Address     DC   Rack  Status  State   Load      Effective-Ownership  Token
                                                                      127605887595351923798765477786913079295
10.20.5.82  DC1  RAC1  Up      Normal  15.95 KB  0.00%                0
10.20.5.83  DC1  RAC1  Up      Normal  15.73 KB  0.00%                42535295865117307932921825928971026431
10.20.5.84  DC1  RAC1  Up      Normal  15.73 KB  0.00%                85070591730234615865843651857942052863
10.20.5.85  DC1  RAC1  Up      Normal  20.24 KB  0.00%                127605887595351923798765477786913079295

Thanks, Dean
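Those tokens are the usual evenly spaced RandomPartitioner layout, give or take one, which changes nothing about balance. A sketch of the classic initial_token recipe:

def balanced_tokens(node_count):
    # i * 2**127 / N, the standard formula for the RandomPartitioner
    return [i * (2**127) // node_count for i in range(node_count)]

for t in balanced_tokens(4):
    print(t)
# 0
# 42535295865117307932921825928971026432
# 85070591730234615865843651857942052864
# 127605887595351923798765477786913079296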
Re: misreports on nodetool ring command on 1.1.4
What are the replication settings for the keyspace you created? Perhaps you used NTS with a bad DC name?

--
Tyler Hobbs
DataStax
http://datastax.com/
Re: misreports on nodetool ring command on 1.1.4
I finally gave up, as it was supposed to be creating SimpleStrategy by default but was creating NTS by default, so eventually I forced it to SimpleStrategy, which did not have the issue. I never really figured out what was wrong there, but my SimpleStrategy keyspace correctly shows every node owns 75%, which is what I would expect for RF=3. Thanks, Dean
Re: new nodetool ring output and unbalanced ring?
Out of interest, why -100 and not -1 or +1? Any particular reason?
Re: new nodetool ring output and unbalanced ring?
It leaves some breathing room for fixing mistakes, adding DCs, etc. The set of data in a 100 token range is basically the same as a 1 token range: nothing, statistically speaking.

--
Tyler Hobbs
DataStax
http://datastax.com/
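Tyler's point made concrete (a one-off check, assuming the RandomPartitioner's 2**127 token range):

fraction = 100 / 2**127
print(f"{fraction:.3e}")  # ~5.877e-37 of the ring, i.e. statistically nothing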
new nodetool ring output and unbalanced ring?
Hi, I recently upgraded from 0.8.x to 1.1.x (through 1.0 briefly) and nodetool ring seems to have changed from "owns" to "effectively owns". Effectively owns seems to account for replication factor (RF). I'm ok with all of this, yet I still can't figure out what's up with my cluster. I have a NetworkTopologyStrategy with two data centers (DCs) with these RF / number-of-nodes combinations:

DC Name, RF, # in DC
analytics, 1, 2
us-east, 3, 4

So I'd expect 50% on each analytics node and 75% on each us-east node. Instead, I have two nodes in us-east with 50/100??? (the other two are 75/75 as expected). Here is the output of nodetool (all nodes report the same thing):

Address  DC         Rack  Status  State   Load       Effective-Ownership  Token
                                                                          127605887595351923798765477786913079296
x.x.x.x  us-east    1c    Up      Normal  94.57 GB   75.00%               0
x.x.x.x  analytics  1c    Up      Normal  60.64 GB   50.00%               1
x.x.x.x  us-east    1c    Up      Normal  131.76 GB  75.00%               42535295865117307932921825928971026432
x.x.x.x  us-east    1c    Up      Normal  43.45 GB   50.00%               85070591730234615865843651857942052864
x.x.x.x  analytics  1d    Up      Normal  60.88 GB   50.00%               85070591730234615865843651857942052865
x.x.x.x  us-east    1d    Up      Normal  98.56 GB   100.00%              127605887595351923798765477786913079296

If I use cassandra-cli to do "show keyspaces;" I get (and again, all nodes report the same thing):

Keyspace: civicscience:
  Replication Strategy: org.apache.cassandra.locator.NetworkTopologyStrategy
  Durable Writes: true
  Options: [analytics:1, us-east:3]

I removed the output about all of my column families (CFs); hopefully that doesn't matter. Did I compute the tokens wrong? Is there a combination of nodetool commands I can run to migrate the data around to rebalance to 75/75/75/75? I routinely run repair already, and as the release notes required, I ran upgradesstables during the upgrade process. Before the upgrade, I was getting analytics = 0% and us-east = 25% on each node, which I expected for "owns". will
Re: new nodetool ring output and unbalanced ring?
The main issue is that one of your us-east nodes is in rack 1d, while the rest are in rack 1c. With NTS and multiple racks, Cassandra will try to use one node from each rack as a replica for a range until it either meets the RF for the DC or runs out of racks, in which case it just picks nodes sequentially going clockwise around the ring (starting from the range being considered, not the last node that was chosen as a replica). To fix this, you'll either need to make the 1d node a 1c node, or make 42535295865117307932921825928971026432 a 1d node so that you're alternating racks within that DC.

--
Tyler Hobbs
DataStax
http://datastax.com/
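A toy model of the selection rule Tyler describes (this is not Cassandra's actual code, just the behavior as stated):

def nts_replicas(ring, rf):
    # ring: (node, rack) pairs in token order, starting at the range
    replicas, seen_racks, skipped = [], set(), []
    for node, rack in ring:
        if len(replicas) == rf:
            break
        if rack not in seen_racks:
            replicas.append(node)
            seen_racks.add(rack)
        else:
            skipped.append(node)
    # racks exhausted: fill the remaining slots clockwise from the start
    for node in skipped:
        if len(replicas) == rf:
            break
        replicas.append(node)
    return replicas

# William's us-east layout for one range: three 1c nodes, then the 1d node
print(nts_replicas([("A", "1c"), ("B", "1c"), ("C", "1c"), ("D", "1d")], 3))
# -> ['A', 'D', 'B']: D is picked early as the only 1d node, so C is
# skipped for this range, which is how the 50/100 imbalance shows up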
Re: new nodetool ring output and unbalanced ring?
Didn't notice the racks! Of course. If I change a 1c to a 1d, what would I have to do to make sure data shuffles around correctly? Repair everywhere? will
--
Will Oberman
Civic Science, Inc.
3030 Penn Avenue., First Floor
Pittsburgh, PA 15201
(M) 412-480-7835
(E) ober...@civicscience.com
Re: new nodetool ring output and unbalanced ring?
To minimize the impact on the cluster, I would bootstrap a new 1d node at (42535295865117307932921825928971026432 - 100), then decommission the 1c node at 42535295865117307932921825928971026432 and run cleanup on your us-east nodes.
--
Tyler Hobbs
DataStax
http://datastax.com/
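Spelled out, Tyler's sequence looks roughly like this (a sketch; the hostnames are placeholders, and the yaml step assumes the standard initial_token bootstrap workflow):

# 1. On the new 1d node, set its token in cassandra.yaml before first start,
#    then start it and let it bootstrap:
#      initial_token: 42535295865117307932921825928971026332
# 2. Once the new node shows Up/Normal, retire the old 1c node:
$ nodetool -h <old-1c-node> decommission
# 3. Remove data that the remaining us-east nodes no longer own:
$ nodetool -h <each-us-east-node> cleanup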
Re: Unreachable node, not in nodetool ring
Hi, I finally successfully removed the ghost node using unsafeAssassinateEndpoint() as described there: http://tumblr.doki-pen.org/post/22654515359/assinating-cassandra-nodes, I hope this can help more people. Nodetool gossipinfo now gives me the following info for the ghost node:
/10.56.62.211
  RELEASE_VERSION:1.1.2
  RPC_ADDRESS:0.0.0.0
  REMOVAL_COORDINATOR:REMOVER,85070591730234615865843651857942052864
  SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
  STATUS:LEFT,42529904547457370790386101505459979624,1344611213445
  LOAD:11594.0
  DC:eu-west
  RACK:1b
Instead of:
/10.56.62.211
  RELEASE_VERSION:1.1.2
  LOAD:11594.0
  RACK:1b
  SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
  DC:eu-west
  REMOVAL_COORDINATOR:REMOVER,85070591730234615865843651857942052864
  STATUS:removed,170141183460469231731687303715884105727,1342453967415
  RPC_ADDRESS:0.0.0.0
Cassandra-cli describe cluster now doesn't show me any unreachable node. The only issue that remains is that my nodes aren't well load balanced yet... After repairing, cleaning up, and restarting all nodes I still have the following ring:
Address       DC       Rack  Status  State   Load       Owns    Token
                                                                85070591730234615865843651857942052864
10.59.21.241  eu-west  1b    Up      Normal  103.19 GB  50.00%  0
10.58.83.109  eu-west  1b    Up      Normal  62.62 GB   50.00%  85070591730234615865843651857942052864
Any idea why I can't get the load well balanced in this cluster? Alain
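For anyone else chasing a ghost node: the unsafeAssassinateEndpoint() call from that post is a JMX operation on the Gossiper MBean, not a nodetool command. A sketch of invoking it with jmxterm (the jar name and JMX port are assumptions; adjust for your setup):

$ java -jar jmxterm.jar -l localhost:7199
$> bean org.apache.cassandra.net:type=Gossiper
$> run unsafeAssassinateEndpoint 10.56.62.211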
Re: Unreachable node, not in nodetool ring
Hi again, Nobody has a clue about this issue ? I'm still facing this problem. Alain
Re: Unreachable node, not in nodetool ring
Does anyone know how to totally remove a dead node that only appears when doing a describe cluster from the cli? I still have this issue in my production cluster. Alain
Re: Unreachable node, not in nodetool ring
I would:
* run repair on 10.58.83.109
* run cleanup on 10.59.21.241 (I assume this was the first node).
It looks like 10.56.62.211 is out of the cluster.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
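Spelled out, Aaron's two suggestions (a sketch, run from any box that can reach both nodes):

$ nodetool -h 10.58.83.109 repair
$ nodetool -h 10.59.21.241 cleanup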
Re: Unreachable node, not in nodetool ring
Hi Aaron, I have repaired and cleaned up both nodes already, and I did it after every change on my ring (it took me a while btw :)). The node *.211 is actually out of the ring and out of my control 'cause I don't have the server anymore (EC2 instance terminated a few days ago). Alain
Unreachable node, not in nodetool ring
Hi, I tried to add a node a few days ago and it failed. I finally made it work with another node, but now when I describe cluster on cli I get this:
Cluster Information:
  Snitch: org.apache.cassandra.locator.Ec2Snitch
  Partitioner: org.apache.cassandra.dht.RandomPartitioner
  Schema versions:
    UNREACHABLE: [10.56.62.211]
    e7e0ec6c-616e-32e7-ae29-40eae2b82ca8: [10.59.21.241, 10.58.83.109]
And nodetool ring gives me:
Address       DC       Rack  Status  State   Load       Owns    Token
                                                                85070591730234615865843651857942052864
10.59.21.241  eu-west  1b    Up      Normal  101.17 GB  50.00%  0
10.58.83.109  eu-west  1b    Up      Normal  55.27 GB   50.00%  85070591730234615865843651857942052864
The point, as you can see, is that one of my nodes has twice the data of the second one. I have a RF = 2 defined. My guess is that the token 0 node keeps data for the unreachable node. The IP of the unreachable node doesn't belong to me anymore; I have no access to this ghost node. Does someone know how to completely remove this ghost node from my cluster? Thank you. Alain
INFO:
On ubuntu (AMI Datastax 2.1 and 2.2)
Cassandra 1.1.2 (upgraded from 1.0.9)
2 node cluster (+ the ghost one)
RF = 2
Re: Unreachable node, not in nodetool ring
I got that a couple of times (due to DNS issues in our infra). What you could try:
- check in the cassandra log. It is possible you see a log line telling you 10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same token
- if 10.56.62.211 is up, try decommission (via nodetool)
- if not, move 10.59.21.241 or 10.58.83.109 to current token + 1
- use removetoken (via nodetool) to remove the token associated with 10.56.62.211. In case of failure, you can use removetoken -f instead.
Then, the unreachable IP should have disappeared. HTH
--
Olivier Mallassi
OCTO Technology
50, Avenue des Champs-Elysées
75008 Paris
Mobile: (33) 6 28 70 26 61
Tél: (33) 1 58 56 10 00
Fax: (33) 1 58 56 10 01
http://www.octo.com
Octo Talks! http://blog.octo.com
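A sketch of that sequence with this thread's addresses and token; note that in some 1.0/1.1 releases the forced removal is spelled "removetoken force" rather than "-f", so treat the exact syntax as an assumption and check nodetool's usage output:

# move one of the live nodes off the conflicting token, if any:
$ nodetool -h 10.59.21.241 move 1
# remove the ghost node's token:
$ nodetool -h localhost removetoken 170141183460469231731687303715884105727
# if the removal hangs or fails, force completion of the pending removal:
$ nodetool -h localhost removetoken force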
Re: Unreachable node, not in nodetool ring
Hi, I wasn't able to see the token currently used by 10.56.62.211 (the ghost node). I already removed the token 6 days ago:
- Removing token 170141183460469231731687303715884105727 for /10.56.62.211
- check in cassandra log. It is possible you see a log line telling you 10.56.62.211 and 10.59.21.241 or 10.58.83.109 share the same token
Nothing like that in the logs.
I tried the following without success:
$ nodetool -h localhost removetoken 170141183460469231731687303715884105727
Exception in thread main java.lang.UnsupportedOperationException: Token not found.
...
I really thought this was going to work :-). Any other ideas? Alain
PS: I heard that Octo is a nice company and you use Cassandra so I guess you're fine in there :-). I wish you the best, thanks for your help.
Re: Unreachable node, not in nodetool ring
Not sure if this may help:
nodetool -h localhost gossipinfo
/10.58.83.109
  RELEASE_VERSION:1.1.2
  RACK:1b
  LOAD:5.9384978406E10
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  DC:eu-west
  STATUS:NORMAL,85070591730234615865843651857942052864
  RPC_ADDRESS:0.0.0.0
/10.248.10.94
  RELEASE_VERSION:1.1.2
  LOAD:3.0128207422E10
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  STATUS:LEFT,0,1342866804032
  RPC_ADDRESS:0.0.0.0
/10.56.62.211
  RELEASE_VERSION:1.1.2
  LOAD:11594.0
  RACK:1b
  SCHEMA:59adb24e-f3cd-3e02-97f0-5b395827453f
  DC:eu-west
  REMOVAL_COORDINATOR:REMOVER,85070591730234615865843651857942052864
  STATUS:removed,170141183460469231731687303715884105727,1342453967415
  RPC_ADDRESS:0.0.0.0
/10.59.21.241
  RELEASE_VERSION:1.1.2
  RACK:1b
  LOAD:1.08667047094E11
  SCHEMA:e7e0ec6c-616e-32e7-ae29-40eae2b82ca8
  DC:eu-west
  STATUS:NORMAL,0
  RPC_ADDRESS:0.0.0.0
Story: I had a 2 node cluster:
10.248.10.94 Token 0
10.59.21.241 Token 85070591730234615865843651857942052864
Had to replace node 10.248.10.94, so I added 10.56.62.211 on token 0 - 1 (170141183460469231731687303715884105727). This failed, so I removed the token. I repeated the previous operation with the node 10.59.21.241 and it went fine. Next I decommissioned the node 10.248.10.94 and moved 10.59.21.241 to token 0. Now I am in the situation described before. Alain
Re: nodetool ring runs very slow
Your node is under significant memory pressure. Look into:
* reducing caches
* reducing the JVM heap to 8GB
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
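A minimal sketch of the heap change, assuming the packaged conf/cassandra-env.sh (the new-gen size is an assumption; the file's own comments suggest roughly 100MB per core):

MAX_HEAP_SIZE="8G"
HEAP_NEWSIZE="800M"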
Re: nodetool ring runs very slow
Haven't received any reply. Resend... Feng Qu
Re: nodetool ring runs very slow
Hi Jonathan, similar problem happens again and there is only one GC running at that time per system.log. This is one node of a 6-node 0.8.6 ring. Heap size on this host is 16GB.
[fengqu@slcdbx1035 cassandra]$ date; time cnt ring
Tue Mar 27 09:24:20 GMT+7 2012
Address      DC   Rack   Status  State   Load       Owns    Token
                                                            141784319550391026443072753096570088106
10.89.74.60  slc  rack0  Up      Normal  343.97 GB  16.67%  0
10.2.128.55  phx  rack0  Up      Normal  485.48 GB  16.67%  28356863910078205288614550619314017621
10.89.74.67  slc  rack0  Up      Normal  252.82 GB  16.67%  56713727820156410577229101238628035242
10.2.128.56  phx  rack0  Up      Normal  258.49 GB  16.67%  85070591730234615865843651857942052864
10.89.74.62  slc  rack0  Up      Normal  251.02 GB  16.67%  113427455640312821154458202477256070485
10.2.128.57  phx  rack0  Up      Normal  451.97 GB  16.67%  141784319550391026443072753096570088106
real 0m25.999s
user 0m0.309s
sys 0m0.048s
WARN [ScheduledTasks:1] 2012-03-27 09:22:39,499 GCInspector.java (line 143) Heap is 0.8716646568744459 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2012-03-27 09:23:21,637 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 1383 ms for 2 collections, 14756794512 used; max is 16928210944
WARN [ScheduledTasks:1] 2012-03-27 09:23:21,637 GCInspector.java (line 143) Heap is 0.8717279434204102 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2012-03-27 09:24:04,844 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 3090 ms for 2 collections, 14782314472 used; max is 16928210944
WARN [ScheduledTasks:1] 2012-03-27 09:24:04,851 GCInspector.java (line 143) Heap is 0.8732354837083013 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2012-03-27 09:24:46,610 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 3057 ms for 2 collections, 14757982328 used; max is 16928210944
WARN [ScheduledTasks:1] 2012-03-27 09:24:46,616 GCInspector.java (line 143) Heap is 0.871798111260587 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
INFO [ScheduledTasks:1] 2012-03-27 09:25:36,371 GCInspector.java (line 122) GC for ConcurrentMarkSweep: 7430 ms for 2 collections, 14719911032 used; max is 16928210944
WARN [ScheduledTasks:1] 2012-03-27 09:25:36,377 GCInspector.java (line 143) Heap is 0.8695491260532345 full. You may need to reduce memtable and/or cache sizes. Cassandra will now flush up to the two largest memtables to free up memory. Adjust flush_largest_memtables_at threshold in cassandra.yaml if you don't want Cassandra to do this automatically
Feng Qu
Re: nodetool ring runs very slow
Read the server log and look for GCInspector output.
On Fri, Feb 24, 2012 at 11:02 AM, Feng Qu mail...@gmail.com wrote:
Hi Jonathan, how do I check whether it's in a GC storm? This server crashed a few times due to Java heap out of memory. We use an 8GB heap on a server with 96GB ram. This is the first node in a 6-node ring and it has opscenter community running on it. Feng Qu
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
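A quick way to do that (the log path assumes a Debian/RPM package layout):

$ grep GCInspector /var/log/cassandra/system.log | tail -20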
Re: 1.0.2 - nodetool ring and info reports wrong load after compact
Thanks for the info. Upgrade within the 1.0.x branch is simply a rolling restart, right? Bill
On Thu, Feb 16, 2012 at 9:20 PM, Jonathan Ellis jbel...@gmail.com wrote: CASSANDRA-3496, fixed in 1.0.4+
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: nodetool ring runs very slow
The only time I've seen nodetool be that slow is when it was talking to a machine that was either swapping or deep into (JVM) GC storming.
--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com
Re: 1.0.2 - nodetool ring and info reports wrong load after compact
On Thu, Feb 23, 2012 at 7:29 AM, Bill Au bill.w...@gmail.com wrote: Upgrade within the 1.0.x branch is simply a rolling restart, right? Generally, but you should always read NEWS.txt before upgrading. -- Tyler Hobbs DataStax http://datastax.com/
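A sketch of such a rolling restart, one node at a time (the service name assumes a package install; as Tyler says, read NEWS.txt first):

$ nodetool -h <node> drain          # flush memtables; node stops accepting writes
$ sudo service cassandra restart
# wait until the node is back Up/Normal in nodetool ring before the next one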
nodetool ring runs very slow
We noticed that nodetool ring sometimes returns in 17-20 sec while it normally runs in less than a sec. There was some compaction running when it happened. Did compaction cause the nodetool slowness? Anything else I should check?
time nodetool -h hostname ring
real 0m17.595s
user 0m0.339s
sys 0m0.054s
Feng Qu
1.0.2 - nodetool ring and info reports wrong load after compact
I am running 1.0.2 with the default tiered compaction. After running a nodetool compact, I noticed that on about half of the machines in my cluster, both nodetool ring and nodetool info report that the load is actually higher than before, when I expected it to be lower. It is almost twice as much as before. I did a du command on the data directory and found that the actual disk usage is only about half of what's being reported by nodetool. Since I am running 1.0.2, there are no compacted sstables waiting to be removed. I manually triggered a full GC in the JVM but that made no difference. When I restarted Cassandra, nodetool once again reported the correct load. Is this a known problem? Bill
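The same cross-check Bill describes, spelled out (the data path assumes the default layout):

$ nodetool -h localhost info | grep -i load
$ du -sh /var/lib/cassandra/data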
Re: 1.0.2 - nodetool ring and info reports wrong load after compact
Are you using compression? I remember some issues with compression and reported load, cannot remember the details.
Cheers
-
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com
Re: 1.0.2 - nodetool ring and info reports wrong load after compact
No, I am not using compression. Bill
Multiple data center nodetool ring output display 0% owns
Hi, I have deployed Cassandra 1.0.6 to 2 data centers, one data center (DC1) having one node and the other data center (DC2) having two nodes. But when I do a nodetool ring using one IP, the output says the DC1 node owns 0%. Please see the output below.
# sh nodetool -h 10.XXX.XXX.XX ring
Address          DC   Rack  Status  State   Load      Owns    Token
                                                              85070591730234615865843651857942052864
XXX.XXX.XXX.XXX  DC2  RAC1  Up      Normal  83.89 KB  50.00%  0
YYY.YYY.YY.YYY   DC1  RAC1  Up      Normal  71.4 KB   0.00%   1
ZZ.ZZZ.ZZZ.ZZ    DC2  RAC1  Up      Normal  65.94 KB  50.00%  85070591730234615865843651857942052864
Could someone explain this behavior?
Re: Multiple data center nodetool ring output display 0% owns
There was a thread on this a couple days ago -- short answer, the 'owns %' column is effectively incorrect when you're using multiple DCs. If you had all 3 servers in 1 DC, since server XXX has token 0 and server YYY has token 1, then server YYY would truly 'own' 0% (actually, 1/(2^128) :) ), and depending on your replication factor might have no data (if replication were 1). But two data centers is effectively like two rings - since the two nodes in DC2 are token-balanced, you'd see even distribution between them, and DC1 has only one server, so that would be even as well (assuming you replicate to both sites). --DRS
Re: Multiple data center nodetool ring output display 0% owns
It's also incorrect for rack awareness if your topology is such that the rack awareness changes ownership (see https://issues.apache.org/jira/browse/CASSANDRA-3810).
--
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)
Nodetool ring and multiple dc
Hi, I was trying to set up a backup DC from an existing DC. State of the existing DC with SimpleStrategy, rep_factor=1:

./nodetool -h localhost ring
Address   DC   Rack  Status  State   Load       Owns     Token
                                                         85070591730234615865843651857942052864
XXX.YYY   DC1  RAC1  Up      Normal  187.69 MB  50.00%   0
XXX.ZZZ   DC1  RAC1  Up      Normal  187.77 MB  50.00%   85070591730234615865843651857942052864

After adding the backup DC with NetworkTopologyStrategy {DC1:1,DC2:1}, the output is as follows:

./nodetool -h localhost ring
Address   DC   Rack  Status  State   Load       Owns     Token
                                                         85070591730234615865843651857942052864
XXX.YYY   DC1  RAC1  Up      Normal  187.69 MB  50.00%   0
AAA.BBB   DC2  RAC1  Up      Normal  374.59 MB  11.99%   20392907958956928593056220689159358496
XXX.ZZZ   DC1  RAC1  Up      Normal  187.77 MB  38.01%   85070591730234615865843651857942052864

As per our app rules, all writes will first go through DC1 and then find their way to DC2. Since the Owns percentage has drastically changed, will it mean that the DC1 nodes will become unbalanced for future writes? We have a very balanced ring in our production, with all nodes serving almost equal volumes of data as of now in DC1. Will setting up a backup DC2 disturb the balance? Thanks and Regards, Ravi
Re: Nodetool ring and multiple dc
nodetool ring is, IMHO, quite confusing in the case of multiple datacenters. Might be easier to think of it as two rings: in your DC1 ring you have two nodes, and since the tokens are balanced, assuming your rows are randomly distributed you'll have half the data on each, since your replication factor in DC1 is 1. In your DC2 'ring' you have one node, and with a replication factor of 1 in DC2, all data will go on that node. So you would expect to have n MB of data on XXX.YYY and XXX.ZZZ and 2n MB of data on AAA.BBB, and that's what you have, to a T. :) In other words, even though you injected node AAA.BBB with a token that seems to divide the ring into uneven portions, because the DC1 ring is only DC1, it's not left unbalanced by the new node. If you added a second node to DC2 you would want to give it a token of something like 106338239662793269832304564822427565952 so that DC2 is also evenly balanced. --DRS On Feb 9, 2012, at 11:00 PM, Ravikumar Govindarajan wrote: Hi, I was trying to set up a backup DC from an existing DC.
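As a sketch of the arithmetic behind balanced tokens (assuming the default RandomPartitioner, whose range is 0..2^127): evenly spaced tokens for an N-node ring are i * 2^127 / N, and one common convention for a second DC is to reuse the same spacing with a small offset so tokens don't collide. With bc:

$ for i in 0 1; do echo "$i * 2^127 / 2" | bc; done
0
85070591730234615865843651857942052864

Those are exactly the two DC1 tokens above; a two-node DC2 could take the same values plus an offset (e.g. +1). Exact token choices vary by convention, so treat this as one scheme, not necessarily the one behind the number suggested above.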
Re: Nodetool ring and multiple dc
Thanks David, for the clarification. I feel it would be better if nodetool ring reported per-DC token-space ownership, to correctly reflect what Cassandra is internally doing, instead of global token-space ownership. - Ravi On Fri, Feb 10, 2012 at 12:42 PM, David Schairer dschai...@humbaba.net wrote: nodetool ring is, IMHO, quite confusing in the case of multiple datacenters.
Re: nodetool ring question
I will have a look very soon and if I find something I'll let you know. Thank you in advance! 2012/1/19 aaron morton aa...@thelastpickle.com: Michael, Robin: Let us know if the reported live load is increasing and diverging from the on-disk size. If it is, can you check nodetool cfstats and find an example of a particular CF where Space Used Live has diverged from the on-disk size. Then provide the schema for the CF and any other info that may be handy. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com
Re: nodetool ring question
Good idea Jeremiah, are you using compression Michael ? Scanning through the CF stats this jumps out… Column Family: Attractions SSTable count: 3 Space used (live): 27542876685 Space used (total): 1213220387 That's 25GB of live data but only 1.2GB total. Otherwise want to see if a restart fixes it :) Would be interesting to know if it's wrong from the start or drifts during streaming or compaction. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 12:04 PM, Jeremiah Jordan wrote: There were some nodetool ring load reporting issues with early versions of 1.0.x; I don't remember when they were fixed, but that could be your issue. Are you using compressed column families? A lot of the issues were with those. Might update to 1.0.7. -Jeremiah
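A way to spot such a divergence across all CFs at once is to grep cfstats and compare against the files on disk. A minimal sketch; the keyspace directory is the 1.0-era default layout and a placeholder, and the numbers echo the ones quoted above:

$ nodetool -h localhost cfstats | grep -E 'Column Family:|Space used'
Column Family: Attractions
    Space used (live): 27542876685
    Space used (total): 1213220387
$ du -sb /var/lib/cassandra/data/MyKeyspace

'Space used (total)' should track the bytes on disk; a 'live' figure far above it points at stale in-memory accounting rather than real data growth.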
Re: nodetool ring question
I also have this problem. My data on nodes grows to roughly 30GB. After a restart only 5GB remains. Is a factor 6 common for Cassandra? 2012/1/18 aaron morton aa...@thelastpickle.com: Good idea Jeremiah, are you using compression Michael ? Scanning through the CF stats this jumps out… Column Family: Attractions SSTable count: 3 Space used (live): 27542876685 Space used (total): 1213220387 That's 25GB of live data but only 1.2GB total. Otherwise want to see if a restart fixes it :) Would be interesting to know if it's wrong from the start or drifts during streaming or compaction. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com
RE: nodetool ring question
I did restart the cluster and now it is back to a normal 5GB. From: R. Verlangen [mailto:ro...@us2.nl] Sent: Wednesday, January 18, 2012 11:32 AM To: user@cassandra.apache.org Subject: Re: nodetool ring question I also have this problem. My data on nodes grows to roughly 30GB. After a restart only 5GB remains. Is a factor 6 common for Cassandra?
Re: nodetool ring question
Michael, Robin: Let us know if the reported live load is increasing and diverging from the on-disk size. If it is, can you check nodetool cfstats and find an example of a particular CF where Space Used Live has diverged from the on-disk size. Then provide the schema for the CF and any other info that may be handy. Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 18/01/2012, at 10:58 PM, Michael Vaknine wrote: I did restart the cluster and now it is back to a normal 5GB.
Re: nodetool ring question
There were some nodetool ring load reporting issues with early versions of 1.0.x; I don't remember when they were fixed, but that could be your issue. Are you using compressed column families? A lot of the issues were with those. Might update to 1.0.7. -Jeremiah On 01/16/2012 04:04 AM, Michael Vaknine wrote: Hi, I have a 4-node cluster on version 1.0.3. This is what I get when I run nodetool ring: Address DC Rack Status State Load Owns Token 127605887595351923798765477786913079296 10.8.193.87 datacenter1 rack1 Up Normal 46.47 GB 25.00% 0 10.5.7.76 datacenter1 rack1 Up Normal 48.01 GB 25.00% 42535295865117307932921825928971026432 10.8.189.197 datacenter1 rack1 Up Normal 53.7 GB 25.00% 85070591730234615865843651857942052864 10.5.3.17 datacenter1 rack1 Up Normal 43.49 GB 25.00% 127605887595351923798765477786913079296 I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders. My question is why does nodetool report almost 50 GB on each node? Thanks Michael
nodetool ring question
Hi, I have a 4-node cluster on version 1.0.3. This is what I get when I run nodetool ring:

Address       DC           Rack   Status  State   Load      Owns     Token
                                                                     127605887595351923798765477786913079296
10.8.193.87   datacenter1  rack1  Up      Normal  46.47 GB  25.00%   0
10.5.7.76     datacenter1  rack1  Up      Normal  48.01 GB  25.00%   42535295865117307932921825928971026432
10.8.189.197  datacenter1  rack1  Up      Normal  53.7 GB   25.00%   85070591730234615865843651857942052864
10.5.3.17     datacenter1  rack1  Up      Normal  43.49 GB  25.00%   127605887595351923798765477786913079296

I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders. My question is why does nodetool report almost 50 GB on each node? Thanks Michael
Re: nodetool ring question
You can cross-check the load with the SSTable Live metric for each CF in nodetool cfstats. Can you also double check what you are seeing on disk? (sorry, got to ask :) ) Finally, compare du -h and df -h to make sure they match. (Sure they will, just a simple way to check the disk usage makes sense.) Cheers - Aaron Morton Freelance Developer @aaronmorton http://www.thelastpickle.com On 16/01/2012, at 11:04 PM, Michael Vaknine wrote: Hi, I have a 4-node cluster on version 1.0.3. This is what I get when I run nodetool ring: Address DC Rack Status State Load Owns Token 127605887595351923798765477786913079296 10.8.193.87 datacenter1 rack1 Up Normal 46.47 GB 25.00% 0 10.5.7.76 datacenter1 rack1 Up Normal 48.01 GB 25.00% 42535295865117307932921825928971026432 10.8.189.197 datacenter1 rack1 Up Normal 53.7 GB 25.00% 85070591730234615865843651857942052864 10.5.3.17 datacenter1 rack1 Up Normal 43.49 GB 25.00% 127605887595351923798765477786913079296 I have finished running repair on all 4 nodes. I have less than 10 GB in the /var/lib/cassandra/data/ folders. My question is why does nodetool report almost 50 GB on each node? Thanks Michael
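Concretely, the cross-check Aaron describes looks something like this (output values are illustrative; point both commands at whatever holds your data directory):

$ du -sh /var/lib/cassandra/data
9.8G    /var/lib/cassandra/data
$ df -h /var/lib/cassandra/data
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       500G   12G  488G   3% /var

If du and df broadly agree, the discrepancy is in what nodetool reports, not in hidden files. One case where they legitimately differ: deleted-but-still-open sstables count in df but not in du.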
Re: sstable count=0, why nodetool ring is not 0
In the Cassandra wiki, I found this: ColumnFamilyStoreMBean exposes sstable space used as getLiveDiskSpaceUsed (which only includes the size of non-obsolete files) and getTotalDiskSpaceUsed (which includes everything). Maybe this is the answer. On Wed, Dec 7, 2011 at 11:57 AM, 祝海通 zhuhait...@gmail.com wrote: hi all, We are using Cassandra 1.0.2. I am testing TTL by loading 400GB. When all the data had expired, I waited for some hours. Later, nodetool ring still showed 90GB, so I ran a major compaction. Then nodetool ring showed 30GB. When I looked at the file system, I found there were zero sstables. I don't know where the 30GB comes from. Best Regards
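If I read the wiki right, those two MBean attributes are what nodetool cfstats prints as the two 'Space used' lines, so you can watch them without a JMX client (a sketch; the numbers are illustrative):

$ nodetool -h localhost cfstats | grep 'Space used'
    Space used (live): 32212254720
    Space used (total): 32212254720

'total' additionally counts obsolete (compacted, not-yet-deleted) sstables, while 'live' should track the current files; a nonzero 'live' with zero sstables on disk would mean the counter has drifted.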
Re: sstable count=0, why nodetool ring is not 0
Hi, What kind of process did you use for loading 400GB of data? Thanks -- Dotan, @jondot http://twitter.com/jondot On Wed, Dec 7, 2011 at 5:57 AM, 祝海通 zhuhait...@gmail.com wrote: hi all, We are using Cassandra 1.0.2. I am testing TTL by loading 400GB. When all the data had expired, I waited for some hours. Later, nodetool ring still showed 90GB, so I ran a major compaction. Then nodetool ring showed 30GB. When I looked at the file system, I found there were zero sstables. I don't know where the 30GB comes from. Best Regards
Re: sstable count=0, why nodetool ring is not 0
We are testing the performance of Cassandra for Big Data. Now I also have this problem. From nodetool cfstats, the Space used (live) is 7 times the Space used (total). Why? thx On Wed, Dec 7, 2011 at 5:58 PM, Dotan N. dip...@gmail.com wrote: Hi, What kind of process did you use for loading 400GB of data? Thanks -- Dotan, @jondot http://twitter.com/jondot
sstable count=0, why nodetool ring is not 0
hi all, We are using Cassandra 1.0.2. I am testing TTL by loading 400GB. When all the data had expired, I waited for some hours. Later, nodetool ring still showed 90GB, so I ran a major compaction. Then nodetool ring showed 30GB. When I looked at the file system, I found there were zero sstables. I don't know where the 30GB comes from. Best Regards
Re: nodetool ring Load column
Are you using compressed sstables? Or leveled sstables? Make sure you include how you are configured in any JIRA you make; someone else was seeing a similar issue with compression turned on. -Jeremiah On Oct 14, 2011, at 1:13 PM, Ramesh Natarajan wrote: What does the Load column in nodetool ring mean? From the output below it shows 101.62 GB. However if I do a disk usage it is about 6 GB. thanks Ramesh
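One quick way to tell whether a column family is using compressed sstables without touching the schema: compressed sstables (1.0+) carry a CompressionInfo component on disk. A sketch; the keyspace directory, CF name, and generation number are placeholders:

$ ls /var/lib/cassandra/data/MyKeyspace/ | grep CompressionInfo
MyCF-hc-12-CompressionInfo.db

No matches would suggest no compressed sstables for that keyspace.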
Re: nodetool ring Load column
I don't use compressed sstables. I also use the default compaction strategy. I will look at JIRA and see if there are any similarities. thanks Ramesh On Fri, Oct 21, 2011 at 6:51 AM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: Are you using compressed sstables? Or leveled sstables? Make sure you include how you are configured in any JIRA you make; someone else was seeing a similar issue with compression turned on. -Jeremiah
Re: nodetool ring Load column
It's the live data load: the SSTables which have not been compacted. There may be compacted SSTables still on disk. Normally the live data load reported by nodetool is less than the usage from du. Can you double check? Perhaps raise a Jira ticket if you have some repro steps or some other data. Thanks - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 15/10/2011, at 7:13 AM, Ramesh Natarajan wrote: What does the Load column in nodetool ring mean? From the output below it shows 101.62 GB. However if I do a disk usage it is about 6 GB. thanks Ramesh
nodetool ring Load column
What does the Load column in nodetool ring mean? From the output below it shows 101.62 GB. However, if I do a disk usage check it is about 6 GB. thanks Ramesh

[root@CAP2-CNode1 cassandra]# ~root/apache-cassandra-1.0.0-rc2/bin/nodetool -h localhost ring
Address       DC           Rack   Status  State   Load       Owns     Token
                                                                      148873535527910577765226390751398592512
10.19.102.11  datacenter1  rack1  Up      Normal  101.62 GB  12.50%   0
10.19.102.12  datacenter1  rack1  Up      Normal  84.42 GB   12.50%   21267647932558653966460912964485513216
10.19.102.13  datacenter1  rack1  Up      Normal  95.47 GB   12.50%   42535295865117307932921825928971026432
10.19.102.14  datacenter1  rack1  Up      Normal  91.25 GB   12.50%   63802943797675961899382738893456539648
10.19.103.11  datacenter1  rack1  Up      Normal  93.98 GB   12.50%   85070591730234615865843651857942052864
10.19.103.12  datacenter1  rack1  Up      Normal  100.33 GB  12.50%   106338239662793269832304564822427566080
10.19.103.13  datacenter1  rack1  Up      Normal  74.1 GB    12.50%   127605887595351923798765477786913079296
10.19.103.14  datacenter1  rack1  Up      Normal  93.96 GB   12.50%   148873535527910577765226390751398592512

[root@CAP2-CNode1 cassandra]# du -hs /var/lib/cassandra/data/
6.0G    /var/lib/cassandra/data/
Re: Nodetool ring not showing all nodes in cluster
I finally solved this by using nodetool move and specifying the token number explicitly. Thanks for all your help !! Cheers, Aishwarya On Tue, Aug 2, 2011 at 5:31 PM, aaron morton aa...@thelastpickle.com wrote: initial_token is read from the yaml file once only, during bootstrap. It is then stored in the LocationInfo system CF and used from there. It sounds like when you did the move you deleted these files, but then started the nodes each with their own seed. So you created 3 separate clusters; when each one bootstrapped it auto-allocated itself an initial token and stored it in LocationInfo.
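For reference, the fix described at the top of this message (moving each node to an explicit token) looks roughly like this in 0.8, run against one node at a time. The host names are the placeholders used earlier in the thread, and the tokens are the evenly spaced values i * 2^127 / 3 for a 3-node RandomPartitioner ring (node A stays at token 0):

$ nodetool -h B_ipaddr move 56713727820156410577229101238628035242
$ nodetool -h C_ipaddr move 113427455640312821154458202477256070485

In this era nodetool move is effectively a decommission plus re-bootstrap at the new token, so let each move finish (check nodetool ring) before starting the next.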
Nodetool ring not showing all nodes in cluster
Hello, I recently migrated 400 GB of data that was on a different Cassandra cluster (3 nodes with RF=3) to a new cluster. I have a 3-node cluster with replication factor set to three. When I run nodetool ring, it does not show me all the nodes in the cluster. It always keeps showing only one node and mentions that it is handling 100% of the load. But when I look at the logs, the nodes are able to talk to each other via the gossip protocol. Why does this happen? Can you tell me what I am doing wrong? Thanks, Aishwarya
Re: Nodetool ring not showing all nodes in cluster
Hi, Until someone answers with more details, a few questions: 1. Did you move the system keyspace as well? 2. Are the gossip IPs of the new nodes the same as the old ones? 3. Which Cassandra version are you running? If 1 is yes and 2 is no, for a quick fix: take down the cluster, remove the system keyspace, bring the cluster up and bootstrap the nodes. Kind regards, Sorin On Tue, Aug 2, 2011 at 2:53 PM, Aishwarya Venkataraman cyberai...@gmail.com wrote: Hello, I recently migrated 400 GB of data that was on a different Cassandra cluster (3 nodes with RF=3) to a new cluster.
Re: Nodetool ring not showing all nodes in cluster
Replies inline. Thanks, Aishwarya On Tue, Aug 2, 2011 at 7:12 AM, Sorin Julean sorin.jul...@gmail.com wrote: Hi, Until someone answers with more details, a few questions: 1. Did you move the system keyspace as well? Yes. But I deleted the LocationInfo* files under the system folder. Shall I go ahead and delete the entire system folder? 2. Are the gossip IPs of the new nodes the same as the old ones? No. The IP is different. 3. Which Cassandra version are you running? I am using 0.8.1. If 1 is yes and 2 is no, for a quick fix: take down the cluster, remove the system keyspace, bring the cluster up and bootstrap the nodes. Kind regards, Sorin
Re: Nodetool ring not showing all nodes in cluster
Also I see these in the logs: ERROR 08:53:47,678 Internal error processing batch_mutate java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (1)
Re: Nodetool ring not showing all nodes in cluster
ERROR 08:53:47,678 Internal error processing batch_mutate java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (1) You already answered: It always keeps showing only one node and mentions that it is handling 100% of the load.
Re: Nodetool ring not showing all nodes in cluster
ERROR 08:53:47,678 Internal error processing batch_mutate java.lang.IllegalStateException: replication factor (3) exceeds number of endpoints (1) You already answered: It always keeps showing only one node and mentions that it is handling 100% of the load. The cluster thinks only one node is present in the ring; it doesn't agree with RF=3, it is expecting RF=1. Original Q: I'm not exactly sure what the problem is. But does nodetool ring show all the hosts? What is your seed list? Does the bootstrapped node have a seed IP of its own? AFAIK gossip works even without actively joining a ring.
Re: Nodetool ring not showing all nodes in cluster
Nodetool does not show me all the nodes. Assuming I have three nodes A, B and C: the seedlist of A is localhost, the seedlist of B is localhost, A_ipaddr, and the seedlist of C is localhost, B_ipaddr, A_ipaddr. I have autobootstrap set to false for all 3 nodes since they all have the correct data and do not have to migrate data from any particular node. My problem here is why doesn't nodetool ring show me all nodes in the ring? I agree that the cluster thinks that only one node is present. How do I fix this? Thanks, Aishwarya On Tue, Aug 2, 2011 at 9:56 AM, samal sa...@wakya.in wrote: The cluster thinks only one node is present in the ring; it doesn't agree with RF=3, it is expecting RF=1.
Re: Nodetool ring not showing all nodes in cluster
The seedlist of A is localhost. Seedlist of B is localhost, A_ipaddr and seedlist of C is localhost, B_ipaddr, A_ipaddr. Using localhost (or a node's own IP address for non-seed nodes) is not good practice. Try: seedlist of A: A_ipaddr; seedlist of B: A_ipaddr; seedlist of C: A_ipaddr.
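In cassandra.yaml terms (0.8-era seed_provider syntax; the config path and address are placeholders), every node would carry the same seeds line:

$ grep -A3 'seed_provider' /etc/cassandra/cassandra.yaml
seed_provider:
    - class_name: org.apache.cassandra.locator.SimpleSeedProvider
      parameters:
          - seeds: "A_ipaddr"

The point is that the seeds value is identical on A, B, and C, and is never 127.0.0.1 in a multi-node cluster.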
Re: Nodetool ring not showing all nodes in cluster
On Tue, Aug 2, 2011 at 10:30 AM, Adi adi.pan...@gmail.com wrote: Using localhost (or a node's own IP address for non-seed nodes) is not good practice. Try: seedlist of A: A_ipaddr; seedlist of B: A_ipaddr; seedlist of C: A_ipaddr. That does not work either. Once I do the above and invoke nodetool ring, it still shows only one node. Thanks, Aishwarya
Re: Nodetool ring not showing all nodes in cluster
All of the nodes should have the same seedlist. Don't use localhost as one of the items in it if you have multiple nodes. On Tue, 2011-08-02 at 10:10 -0700, Aishwarya Venkataraman wrote: Nodetool does not show me all the nodes. Assuming I have three nodes A, B and C: the seedlist of A is localhost, the seedlist of B is localhost, A_ipaddr, and the seedlist of C is localhost, B_ipaddr, A_ipaddr.
Re: Nodetool ring not showing all nodes in cluster
Yes. Different cluster names could also cause this. On Tue, Aug 2, 2011 at 4:21 PM, Jeremiah Jordan jeremiah.jor...@morningstar.com wrote: All of the nodes should have the same seedlist. Don't use localhost as one of the items in it if you have multiple nodes. -- Jonathan Ellis Project Chair, Apache Cassandra co-founder of DataStax, the source for professional Cassandra support http://www.datastax.com
Re: Nodetool ring not showing all nodes in cluster
I corrected the seed list and checked the cluster name. They are all good now. But still nodetool ring shows only one node. INFO 21:36:59,735 Starting Messaging Service on port 7000 INFO 21:36:59,748 Using saved token 113427455640312814857969558651062452224 Nodes a_ipadrr and b_ipaddr have the same token 113427455640312814857969558651062452224. a_ipadrr is the new owner. All the nodes seem to be using the same initial token, despite me specifying an initial_token in the config file. Is this an issue? How do I force Cassandra to use the token in the cassandra.yaml file? Thanks, Aishwarya On Tue, Aug 2, 2011 at 2:34 PM, Jonathan Ellis jbel...@gmail.com wrote: Yes. Different cluster names could also cause this.
Re: Nodetool ring not showing all nodes in cluster
initial_token is read from the yaml file once only, during bootstrap. It is then stored in the LocationInfo system CF and used from there. It sounds like when you did the move you deleted these files, but then started the nodes each with their own seed. So you created 3 separate clusters; when each one bootstrapped it auto-allocated itself an initial token and stored it in LocationInfo. You have 3 clusters, each with one node, and each node has the same token as it was the first node into a new empty cluster. Try this:
1 - Shut down the two (let's call them B and C) you want to join the first one (called A).
2 - Delete their LocationInfo CF.
3 - Ensure the seed list for ALL nodes points to node A.
4 - Ensure the initial token is set correctly for B and C.
5 - Start B and C one at a time, and make sure nodetool ring and describe cluster; in the CLI agree before starting the next.
*IF* node A has the incorrect token I would fix this after you get B and C back into the ring. Hope that helps. - Aaron Morton Freelance Cassandra Developer @aaronmorton http://www.thelastpickle.com On 3 Aug 2011, at 09:43, Aishwarya Venkataraman wrote: I corrected the seed list and checked the cluster name. They are all good now. But still nodetool ring shows only one node.
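Scripted out, Aaron's steps for node B look roughly like this. A sketch under 0.8-era defaults: the paths, the service command, and the token value are assumptions to adapt, not taken from the thread:

# on node B
$ sudo /etc/init.d/cassandra stop
$ rm /var/lib/cassandra/data/system/LocationInfo*    # forget the saved (wrong) token
# edit cassandra.yaml: seeds -> A_ipaddr, initial_token -> the token planned for B
$ sudo /etc/init.d/cassandra start
$ nodetool -h localhost ring                         # confirm B joined before repeating on C

Deleting the LocationInfo sstables is what makes the node re-read initial_token from the yaml, since (as Aaron notes) that setting is only consulted at bootstrap.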
Re: Setting up cluster and nodetool ring in 0.8.0
Just to close this out, in case anyone was interested... my problem was firewall related, in that I didn't have my messaging/data port (7000) open on my seed node. Allowing traffic on this port resolved my issues. (A sample firewall rule follows the quoted thread below.)

On Fri, Jun 3, 2011 at 1:43 PM, David McNelis dmcne...@agentisenergy.com wrote:

Thanks, Jonathan. Both machines do have the exact same seed list.

On Fri, Jun 3, 2011 at 1:39 PM, Jonathan Ellis jbel...@gmail.com wrote:

On Fri, Jun 3, 2011 at 11:21 AM, David McNelis dmcne...@agentisenergy.com wrote:

I want to make sure I'm not seeing things from a weird perspective. I have two Cassandra instances where one is set to be the seed, with autobootstrap disabled and its seed being 127.0.0.1. The second instance has autobootstrap enabled and the seed IP set to the IP of the first node.

Seed lists should _always_ be identical on each machine (which implies they should _never_ be localhost, in a multinode configuration).

--
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of DataStax, the source for professional Cassandra support
http://www.datastax.com

--
David McNelis
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395 c: 219.384.5143
A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.
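For reference, a rule along these lines is one way to open the storage port on a Linux host with iptables; the source subnet is a placeholder for your cluster's network, and your firewall tooling may differ:

    # allow inter-node gossip/storage traffic on storage_port (7000 by default)
    iptables -A INPUT -p tcp --dport 7000 -s 10.0.0.0/24 -j ACCEPT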
Setting up cluster and nodetool ring in 0.8.0
I want to make sure I'm not seeing things from a weird perspective. I have two Cassandra instances where one is set to be the seed, with autobootstrap disabled and its seed being 127.0.0.1. The second instance has autobootstrap enabled and the seed IP set to the IP of the first node. I start the first node, then the second, with no errors. However, when I run:

bin/nodetool -h localhost ring

my output shows me only the local machine in my ring. When I run:

bin/nodetool -h localhost join seedNodeIP

it tells me I'm already a part of the ring. My question is: which is correct? I thought, from the documentation, that both of my nodes would show up in the ring if I ran 'ring' in nodetool. This is a new cluster.

--
David McNelis
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395 c: 219.384.5143
A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.
Re: Setting up cluster and nodetool ring in 0.8.0
On Fri, Jun 3, 2011 at 12:21 PM, David McNelis dmcne...@agentisenergy.com wrote:

I want to make sure I'm not seeing things from a weird perspective. I have two Cassandra instances where one is set to be the seed, with autobootstrap disabled and its seed being 127.0.0.1. The second instance has autobootstrap enabled and the seed IP set to the IP of the first node. I start the first node, then the second, with no errors. However, when I run 'bin/nodetool -h localhost ring', my output shows me only the local machine in my ring. When I run 'bin/nodetool -h localhost join seedNodeIP', it tells me I'm already a part of the ring. My question is: which is correct? I thought, from the documentation, that both of my nodes would show up in the ring if I ran 'ring' in nodetool. This is a new cluster.

--
David McNelis
Lead Software Engineer
Agentis Energy
www.agentisenergy.com
o: 630.359.6395 c: 219.384.5143
A Smart Grid technology company focused on helping consumers of energy control an often under-managed resource.

Don't use 127.0.0.1 as a seed (except for single-node test clusters). Use a routable IP that other cluster nodes can use to reach that node.
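To make that advice concrete: both nodes should carry the same, routable seed list in cassandra.yaml. A sketch for a two-node 0.8 cluster, with 10.0.0.1 and 10.0.0.2 as placeholder addresses (the exact yaml layout varies slightly between releases, so check your own file):

    # identical on BOTH nodes
    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              - seeds: "10.0.0.1"

    # per node: its own routable address, never 127.0.0.1 in a multinode setup
    listen_address: 10.0.0.1    # 10.0.0.2 on the second node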
Questions about the nodetool ring.
I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090 ring
Address         Status State   Load       Owns    Token
                                                  109028275973926493413574716008500203721
192.168.1.25    Up     Normal  157.25 MB  69.92%  57856537434773737201679995572503935972
192.168.1.27    Up     Normal  201.71 MB  24.28%  99165710459060760249270263771474737125
192.168.1.28    Up     Normal  68.12 MB   5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to balance the three nodes? Thanks.

--
Dikang Gu
0086 - 18611140205
Re: Questions about the nodetool ring.
This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence of this, the nodes will be out of balance. In other words, if you had doubled the number of nodes you would not have this problem.

The best way to rebalance the cluster is to generate new tokens and use the 'nodetool move new-token' command to rebalance the nodes, one at a time. After rebalancing you can run cleanup so the nodes get rid of data they are no longer responsible for.

links:
http://wiki.apache.org/cassandra/Operations#Range_changes
http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes
http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity

On Apr 12, 2011, at 11:00 AM, Dikang Gu wrote:

I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090 ring
Address         Status State   Load       Owns    Token
                                                  109028275973926493413574716008500203721
192.168.1.25    Up     Normal  157.25 MB  69.92%  57856537434773737201679995572503935972
192.168.1.27    Up     Normal  201.71 MB  24.28%  99165710459060760249270263771474737125
192.168.1.28    Up     Normal  68.12 MB   5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to balance the three nodes? Thanks.

--
Dikang Gu
0086 - 18611140205
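Concretely, for the 3-node ring above, the rebalance might look like the following, reusing the host and JMX port from Dikang's output. The tokens assume the RandomPartitioner and a fresh even three-way split, so treat this as a sketch rather than a recipe:

    # move each node to its computed token, one node at a time,
    # letting each move complete before starting the next
    bin/nodetool -h 192.168.1.25 -p 8090 move 0
    bin/nodetool -h 192.168.1.27 -p 8090 move 56713727820156410577229101238628035242
    bin/nodetool -h 192.168.1.28 -p 8090 move 113427455640312821154458202477256070485

    # then drop the data each node no longer owns
    bin/nodetool -h 192.168.1.25 -p 8090 cleanup
    bin/nodetool -h 192.168.1.27 -p 8090 cleanup
    bin/nodetool -h 192.168.1.28 -p 8090 cleanup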
Re: Questions about the nodetool ring.
The 3 nodes were added to the cluster at the same time, so I'm not sure why the data vary. I calculated the tokens and got:

node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485

So I should set these tokens on the three nodes? And while I execute the nodetool move commands, can the cassandra servers serve the front-end requests at the same time? Is the data safe? Thanks.

On Tue, Apr 12, 2011 at 5:15 PM, Jonathan Colby jonathan.co...@gmail.com wrote:

This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence of this, the nodes will be out of balance. In other words, if you had doubled the number of nodes you would not have this problem. The best way to rebalance the cluster is to generate new tokens and use the 'nodetool move new-token' command to rebalance the nodes, one at a time. After rebalancing you can run cleanup so the nodes get rid of data they are no longer responsible for.

links:
http://wiki.apache.org/cassandra/Operations#Range_changes
http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes
http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity

On Apr 12, 2011, at 11:00 AM, Dikang Gu wrote:

I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090 ring
Address         Status State   Load       Owns    Token
                                                  109028275973926493413574716008500203721
192.168.1.25    Up     Normal  157.25 MB  69.92%  57856537434773737201679995572503935972
192.168.1.27    Up     Normal  201.71 MB  24.28%  99165710459060760249270263771474737125
192.168.1.28    Up     Normal  68.12 MB   5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to balance the three nodes? Thanks.

--
Dikang Gu
0086 - 18611140205
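Those calculated values match the standard even spacing for the RandomPartitioner, token_i = i * 2^127 / N. A quick way to reproduce them from the shell (a sketch; any Python will do):

    python -c "
    N = 3
    for i in range(N):
        print(i * (2 ** 127) // N)
    "

This prints 0, 56713727820156410577229101238628035242 and 113427455640312821154458202477256070485, the same three tokens computed above.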
Re: Questions about the nodetool ring.
After the nodetool move, I got this:

[root@server3 apache-cassandra-0.7.4]# bin/nodetool -h 10.18.101.213 ring
Address         Status State   Load      Owns    Token
                                                 113427455640312821154458202477256070485
10.18.101.211   ?      Normal  82.31 MB  33.33%  0
10.18.101.212   ?      Normal  84.24 MB  33.33%  56713727820156410577229101238628035242
10.18.101.213   Up     Normal  54.44 MB  33.33%  113427455640312821154458202477256070485

Is this correct? Why is the status '?'? Thanks.

On Tue, Apr 12, 2011 at 5:43 PM, Dikang Gu dikan...@gmail.com wrote:

The 3 nodes were added to the cluster at the same time, so I'm not sure why the data vary. I calculated the tokens and got:

node 0: 0
node 1: 56713727820156410577229101238628035242
node 2: 113427455640312821154458202477256070485

So I should set these tokens on the three nodes? And while I execute the nodetool move commands, can the cassandra servers serve the front-end requests at the same time? Is the data safe? Thanks.

On Tue, Apr 12, 2011 at 5:15 PM, Jonathan Colby jonathan.co...@gmail.com wrote:

This is normal when you just add single nodes. When no token is assigned, the new node takes a portion of the ring from the most heavily loaded node. As a consequence of this, the nodes will be out of balance. In other words, if you had doubled the number of nodes you would not have this problem. The best way to rebalance the cluster is to generate new tokens and use the 'nodetool move new-token' command to rebalance the nodes, one at a time. After rebalancing you can run cleanup so the nodes get rid of data they are no longer responsible for.

links:
http://wiki.apache.org/cassandra/Operations#Range_changes
http://wiki.apache.org/cassandra/Operations#Moving_or_Removing_nodes
http://www.datastax.com/docs/0.7/operations/clustering#adding-capacity

On Apr 12, 2011, at 11:00 AM, Dikang Gu wrote:

I have 3 cassandra 0.7.4 nodes in a cluster, and I get the ring stats:

[root@yun-phy2 apache-cassandra-0.7.4]# bin/nodetool -h 192.168.1.28 -p 8090 ring
Address         Status State   Load       Owns    Token
                                                  109028275973926493413574716008500203721
192.168.1.25    Up     Normal  157.25 MB  69.92%  57856537434773737201679995572503935972
192.168.1.27    Up     Normal  201.71 MB  24.28%  99165710459060760249270263771474737125
192.168.1.28    Up     Normal  68.12 MB   5.80%   109028275973926493413574716008500203721

The load and owns vary on each node; is this normal? And is there a way to balance the three nodes? Thanks.

--
Dikang Gu
0086 - 18611140205