Re: decommissioning a cassandra node
As I see, the state of 162.243.109.94 is UL (Up/Leaving), so maybe this is causing the problem.

On Sunday, October 26, 2014 11:57 PM, Tim Dunphy bluethu...@gmail.com wrote:
> Hey all,
>
> I'm trying to decommission a node. First I'm getting a status:
>
> [root@beta-new:/usr/local] #nodetool status
> Note: Ownership information does not include topology; for complete information, specify a keyspace
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address         Load     Tokens  Owns   Host ID                               Rack
> UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
> UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
>
> But when I try to decommission the node I get this message:
>
> [root@beta-new:/usr/local] #nodetool -h 162.243.86.41 decommission
> nodetool: Failed to connect to '162.243.86.41:7199' - NoSuchObjectException: 'no such object in table'.
>
> Yet I can telnet to that host on that port just fine:
>
> [root@beta-new:/usr/local] #telnet 162.243.86.41 7199
> Trying 162.243.86.41...
> Connected to 162.243.86.41.
> Escape character is '^]'.
>
> And I have verified that cassandra is running and accessible via cqlsh on the other machine. What could be going wrong?
>
> Thanks
> Tim
>
> --
> GPG me!!
> gpg --keyserver pool.sks-keyservers.net --recv-keys F186197B
Re: decommissioning a cassandra node
On Mon, Oct 27, 2014 at 5:46 AM, jivko donev jivko_...@yahoo.com wrote:
> As I see the state 162.243.109.94 is UL(Up/Leaving) so maybe this is causing the problem.

OK, that's an interesting observation. How do you fix a node that is in a UL state? What causes this?

Also, is there any document that explains what all the nodetool abbreviations (UN, UL) stand for?
Re: decommissioning a cassandra node
On Mon, Oct 27, 2014 at 2:42 PM, Tim Dunphy bluethu...@gmail.com wrote:
> Also, is there any document that explains what all the nodetool abbreviations (UN, UL) stand for?

The documentation is in the command output itself, in the Status and State legend lines:

Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load     Tokens  Owns   Host ID                               Rack
UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1

The first letter is the status: U = Up, D = Down.
The second letter is the state: N = Normal, L = Leaving, J = Joining and M = Moving.
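As a small illustration of the legend above, here is a Python sketch that expands a two-letter code from the first column of nodetool status into words. The function name decode_node_state and the two lookup tables are made up for this example and are not part of nodetool; only the letter meanings come from the legend printed by the command.

```python
# Hypothetical helper, not part of nodetool: expands the two-letter code
# from the first column of `nodetool status` output.
STATUS = {"U": "Up", "D": "Down"}
STATE = {"N": "Normal", "L": "Leaving", "J": "Joining", "M": "Moving"}


def decode_node_state(code: str) -> str:
    """Return e.g. 'Up/Leaving' for 'UL'; raise ValueError on unknown codes."""
    if len(code) != 2 or code[0] not in STATUS or code[1] not in STATE:
        raise ValueError(f"unrecognized nodetool status code: {code!r}")
    return f"{STATUS[code[0]]}/{STATE[code[1]]}"


if __name__ == "__main__":
    # The two codes that appear in this thread.
    for code in ("UN", "UL"):
        print(code, "=", decode_node_state(code))
```

The first letter is looked up as the status and the second as the state, which mirrors how the two legend lines are stacked above the column.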
Re: decommissioning a cassandra node
On Mon, Oct 27, 2014 at 9:46 AM, DuyHai Doan doanduy...@gmail.com wrote:
> The documentation is in the command output itself
> U = Up, D = Down
> N = Normal, L = Leaving, J = Joining and M = Moving

Ok, got it, thanks! Can someone suggest a good way to fix a node that is in a UL state?

Thanks
Tim
Re: decommissioning a cassandra node
Hi Tim,

The node with the IP ending in .94 (162.243.109.94) is leaving. Something may have gone wrong while it was streaming data. You could run nodetool netstats on both nodes to check whether any streaming connection is stuck.

Alternatively, you could force-remove the leaving node by shutting it down directly and then running nodetool removenode to remove the dead node. But you should understand that you risk losing data if the replication factor (RF) in the cluster is lower than 3 and the data has not been fully synced. Therefore, remember to sync the data using repair before you remove or decommission a node in the cluster.

Thanks!

On Mon, Oct 27, 2014 at 9:55 PM, Tim Dunphy bluethu...@gmail.com wrote:
> Ok, got it, thanks! Can someone suggest a good way to fix a node that is in a UL state?
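For the removenode route described above, the dead node is identified by its Host ID, which is the UUID column in nodetool status. The Python sketch below shows one way to pull the address and Host ID of any node in the Leaving state out of that output. The function name leaving_nodes, the regular expression, and the SAMPLE text (copied from the status output earlier in this thread) are illustrative assumptions; the sketch only parses text and does not run nodetool or change the cluster.

```python
# Hypothetical helper: scans `nodetool status` text for rows whose state
# letter is L (Leaving) and returns their address and Host ID.
import re
from typing import List, Tuple

# Two-letter code, an IPv4 address, then (later on the row) a UUID-shaped Host ID.
ROW = re.compile(
    r"^(?P<code>[UD][NLJM])\s+(?P<address>\d+\.\d+\.\d+\.\d+)\s+.*?"
    r"(?P<host_id>[0-9a-f]{8}(?:-[0-9a-f]{4}){3}-[0-9a-f]{12})"
)


def leaving_nodes(status_output: str) -> List[Tuple[str, str]]:
    """Return (address, host_id) for every row whose state letter is L."""
    found = []
    for line in status_output.splitlines():
        match = ROW.match(line.strip())
        if match and match.group("code")[1] == "L":
            found.append((match.group("address"), match.group("host_id")))
    return found


# Sample text taken from the status output quoted in this thread.
SAMPLE = """\
Datacenter: datacenter1
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address         Load     Tokens  Owns   Host ID                               Rack
UN  162.243.86.41   1.08 MB  1       0.1%   e945f3b5-2e3e-4a20-b1bd-e30c474a7634  rack1
UL  162.243.109.94  1.28 MB  256     99.9%  fd2f76ae-8dcf-4e93-a37f-bf1e9088696e  rack1
"""

if __name__ == "__main__":
    for address, host_id in leaving_nodes(SAMPLE):
        print(f"{address} is leaving; Host ID: {host_id}")
```

The Host ID found this way is the value that would be passed to nodetool removenode once the node has been shut down, and only after weighing the data-loss caveat about RF and repair given above.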
Re: decommissioning a cassandra node
On Mon, Oct 27, 2014 at 11:17 AM, Colin Kuo colinkuo...@gmail.com wrote:
> The node with IP 94 is leaving. Maybe something wrong happens during streaming data. You could use nodetool netstats on both nodes to monitor if there is any streaming connection stuck.
> [...]
> Therefore, remember to sync data using repair before you're going to remove/decommission the node in cluster.

Hi Colin,

Ok, that's good advice. Thanks for your help! I'll give that a shot and see what I can do.

Thanks
Tim