Re: Recommissioned node is much smaller
On Wed, Dec 3, 2014 at 10:10 AM, Robert Wille rwi...@fold3.com wrote:

Load and ownership didn’t correlate nearly as well as I expected. I have lots and lots of very small records. I would expect very high correlation. I think the moral of the story is that I shouldn’t delete the system directory. If I have issues with a node, I should recommission it properly.

If you always specify initial_token in cassandra.yaml, then you are protected from some cases similar to the one that you seem to have just encountered. I wish I had actually managed to post this on a blog, but:

--- cut ---
An example of why: https://issues.apache.org/jira/browse/CASSANDRA-5571

11:22 rcoli: but basically, explicit is better than implicit
11:22 rcoli: the only reason people let Cassandra pick tokens is that it's semi-complex to do right with vnodes
11:22 rcoli: but once it has picked tokens, you know what they are
11:22 rcoli: why have a risky conf file that relies on implicit state?
11:23 rcoli: just put the tokens in the conf file. Done.
11:23 rcoli: then you can use auto_bootstrap: false even if you lose the system keyspace, etc.

I plan to write a short blog post about this, but... I recommend that anyone using Cassandra, vnodes or not, always explicitly populate the initial_token line in cassandra.yaml. There are a number of cases where you will lose if you do not do so, and AFAICT no cases where you lose by doing so.

If one is using vnodes and wants to do this, the process goes like:

1) Set num_tokens to the desired number of vnodes.
2) Start the node and let it bootstrap.
3) Use a one-liner like jeffj's to get a comma-delimited list of the node's vnode tokens:

   nodetool info -T | grep ^Token | awk '{ print $3 }' | tr \\n , | sed -e 's/,$/\n/'

4) Insert this comma-delimited list into initial_token, and comment out num_tokens (though leaving it in is a no-op).
--- cut ---

=Rob
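The token-extraction pipeline in step 3 can be tried without a live cluster; the sketch below feeds it a hypothetical fragment of nodetool info -T output (the token values are invented for illustration, and GNU sed is assumed for the \n in the replacement):

```shell
# Hypothetical sample of "nodetool info -T" output; the real command
# prints one "Token : <value>" line per vnode. Values here are invented.
cat > /tmp/nodetool-info-sample.txt <<'EOF'
ID                     : 11111111-2222-3333-4444-555555555555
Token                  : -9182346712345678901
Token                  : -3456789012345678901
Token                  : 1234567890123456789
Token                  : 8123456789012345678
EOF

# jeffj's one-liner from the thread, reading the sample file instead of
# piping from nodetool: keep the Token lines, take the third field, join
# the values with commas, and turn the trailing comma into a newline.
grep ^Token /tmp/nodetool-info-sample.txt | awk '{ print $3 }' | tr \\n , | sed -e 's/,$/\n/'
# -> -9182346712345678901,-3456789012345678901,1234567890123456789,8123456789012345678
```

The resulting comma-delimited list is what goes on the initial_token line in cassandra.yaml.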
Re: Recommissioned node is much smaller
How does the difference in load compare to the effective ownership? If you deleted the system directory as well, you should end up with new ranges, so I'm wondering if perhaps you just ended up with a really bad shuffle. Did you run removenode on the old host after you took it down (I assume so, since all nodes are in UN status)? Is the test node in its own seeds list?

On Tue, Dec 2, 2014 at 4:10 PM, Robert Wille rwi...@fold3.com wrote:

I didn’t do anything except kill the server process, delete /var/lib/cassandra, and start it back up again. nodetool status shows all nodes as UN and doesn’t display any unexpected nodes.

I don’t know if this sheds any light on the issue, but I’ve added a considerable amount of data to the cluster since I did the aforementioned test. The difference in size between the nodes is shrinking: the other nodes are growing more slowly than the one I recommissioned. That was definitely not something that I expected, and I don’t have an explanation for that either.

Robert

On Dec 2, 2014, at 3:38 PM, Tyler Hobbs ty...@datastax.com wrote:

On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it.

Did you decommission or removenode it when you took it down? If you didn't, the old node is still in the ring and affects the replication.

--
Tyler Hobbs
DataStax http://datastax.com/
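The load-versus-ownership comparison asked about here can be eyeballed with a little arithmetic over nodetool status output. This is only a sketch: the addresses, sizes, and ownership figures below are invented, and the exact column layout of nodetool status varies by Cassandra version, so the awk field numbers may need adjusting:

```shell
# Hypothetical "nodetool status" data rows (status, address, load, unit,
# tokens, effective ownership, host id, rack). All values are invented.
cat > /tmp/status-sample.txt <<'EOF'
UN  10.0.0.1  120.5 GB  256  33.4%  aaaa  rack1
UN  10.0.0.2  118.9 GB  256  33.3%  bbbb  rack1
UN  10.0.0.3   90.2 GB  256  33.3%  cccc  rack1
EOF

# Load divided by effective ownership, per node. With many small,
# uniformly sized rows these ratios should come out roughly equal;
# an outlier (like the third node here) suggests missing data.
awk '{ gsub(/%/, "", $6); printf "%s %.1f\n", $2, $3 / $6 }' /tmp/status-sample.txt
```

In this invented sample the third node's ratio stands out low, which is the shape of the symptom described in this thread.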
Re: Recommissioned node is much smaller
Load and ownership didn’t correlate nearly as well as I expected. I have lots and lots of very small records. I would expect very high correlation. I think the moral of the story is that I shouldn’t delete the system directory. If I have issues with a node, I should recommission it properly.

Robert

On Dec 3, 2014, at 10:23 AM, Eric Stevens migh...@gmail.com wrote:

How does the difference in load compare to the effective ownership? If you deleted the system directory as well, you should end up with new ranges, so I'm wondering if perhaps you just ended up with a really bad shuffle. Did you run removenode on the old host after you took it down (I assume so, since all nodes are in UN status)? Is the test node in its own seeds list?

On Tue, Dec 2, 2014 at 4:10 PM, Robert Wille rwi...@fold3.com wrote:

I didn’t do anything except kill the server process, delete /var/lib/cassandra, and start it back up again. nodetool status shows all nodes as UN and doesn’t display any unexpected nodes.

I don’t know if this sheds any light on the issue, but I’ve added a considerable amount of data to the cluster since I did the aforementioned test. The difference in size between the nodes is shrinking: the other nodes are growing more slowly than the one I recommissioned. That was definitely not something that I expected, and I don’t have an explanation for that either.

Robert

On Dec 2, 2014, at 3:38 PM, Tyler Hobbs ty...@datastax.com wrote:

On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it.

Did you decommission or removenode it when you took it down? If you didn't, the old node is still in the ring and affects the replication.

--
Tyler Hobbs
DataStax http://datastax.com/
Re: Recommissioned node is much smaller
Well, as I understand it, deleting the entire data directory, including system, should have the same effect as if you totally lost a node and were bootstrapping a replacement. And that's an operation you should be able to have confidence in.

I wonder what your load does if you run nodetool cleanup on another node - maybe you just have a lot of old unowned data sitting around on nodes.

On Wed, Dec 3, 2014 at 11:10 AM, Robert Wille rwi...@fold3.com wrote:

Load and ownership didn’t correlate nearly as well as I expected. I have lots and lots of very small records. I would expect very high correlation. I think the moral of the story is that I shouldn’t delete the system directory. If I have issues with a node, I should recommission it properly.

Robert

On Dec 3, 2014, at 10:23 AM, Eric Stevens migh...@gmail.com wrote:

How does the difference in load compare to the effective ownership? If you deleted the system directory as well, you should end up with new ranges, so I'm wondering if perhaps you just ended up with a really bad shuffle. Did you run removenode on the old host after you took it down (I assume so, since all nodes are in UN status)? Is the test node in its own seeds list?

On Tue, Dec 2, 2014 at 4:10 PM, Robert Wille rwi...@fold3.com wrote:

I didn’t do anything except kill the server process, delete /var/lib/cassandra, and start it back up again. nodetool status shows all nodes as UN and doesn’t display any unexpected nodes.

I don’t know if this sheds any light on the issue, but I’ve added a considerable amount of data to the cluster since I did the aforementioned test. The difference in size between the nodes is shrinking: the other nodes are growing more slowly than the one I recommissioned. That was definitely not something that I expected, and I don’t have an explanation for that either.

Robert

On Dec 2, 2014, at 3:38 PM, Tyler Hobbs ty...@datastax.com wrote:

On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it.

Did you decommission or removenode it when you took it down? If you didn't, the old node is still in the ring and affects the replication.

--
Tyler Hobbs
DataStax http://datastax.com/
Recommissioned node is much smaller
As a test, I took down a node, deleted /var/lib/cassandra and restarted it. After it joined the cluster, it’s about 75% the size of its neighbors (both in terms of bytes and number of keys). Prior to my test it was approximately the same size.

I have no explanation for why that node would shrink so much, other than data loss. I have no deleted data and no TTLs. Only a small percentage of my data has had any updates (and some of my tables have had only inserts, and those have shrunk by 25% as well). I don’t really know how to check if I have records that have fewer than three replicas (RF=3).

Any thoughts would be greatly appreciated.

Thanks

Robert
Re: Recommissioned node is much smaller
On Tue, Dec 2, 2014 at 12:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it. After it joined the cluster, it’s about 75% the size of its neighbors (both in terms of bytes and number of keys). Prior to my test it was approximately the same size. I have no explanation for why that node would shrink so much, other than data loss. I have no deleted data and no TTLs. Only a small percentage of my data has had any updates (and some of my tables have had only inserts, and those have shrunk by 25% as well). I don’t really know how to check if I have records that have fewer than three replicas (RF=3).

Sounds suspicious, actually. I would suspect a partial bootstrap. To determine if you have under-replicated data, run repair. That's what it's for.

=Rob
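The "run repair" advice as a concrete invocation, for reference. This is only a sketch: it must be run against a live node, the keyspace name "mykeyspace" is a placeholder, and the available flags differ across Cassandra versions:

```shell
# Repair only this node's primary ranges (-pr); running the same command
# on every node in turn repairs the whole ring exactly once.
# "mykeyspace" is a placeholder keyspace name.
nodetool repair -pr mykeyspace
```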
Re: Recommissioned node is much smaller
I meant to mention that I had run repair, but neglected to do so. Sorry about that. Repair runs pretty quickly (a fraction of the time that compaction takes) and doesn’t seem to do anything.

On Dec 2, 2014, at 1:44 PM, Robert Coli rc...@eventbrite.com wrote:

On Tue, Dec 2, 2014 at 12:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it. After it joined the cluster, it’s about 75% the size of its neighbors (both in terms of bytes and number of keys). Prior to my test it was approximately the same size. I have no explanation for why that node would shrink so much, other than data loss. I have no deleted data and no TTLs. Only a small percentage of my data has had any updates (and some of my tables have had only inserts, and those have shrunk by 25% as well). I don’t really know how to check if I have records that have fewer than three replicas (RF=3).

Sounds suspicious, actually. I would suspect a partial bootstrap. To determine if you have under-replicated data, run repair. That's what it's for.

=Rob
Re: Recommissioned node is much smaller
On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it.

Did you decommission or removenode it when you took it down? If you didn't, the old node is still in the ring and affects the replication.

--
Tyler Hobbs
DataStax http://datastax.com/
Re: Recommissioned node is much smaller
I didn’t do anything except kill the server process, delete /var/lib/cassandra, and start it back up again. nodetool status shows all nodes as UN and doesn’t display any unexpected nodes.

I don’t know if this sheds any light on the issue, but I’ve added a considerable amount of data to the cluster since I did the aforementioned test. The difference in size between the nodes is shrinking: the other nodes are growing more slowly than the one I recommissioned. That was definitely not something that I expected, and I don’t have an explanation for that either.

Robert

On Dec 2, 2014, at 3:38 PM, Tyler Hobbs ty...@datastax.com wrote:

On Tue, Dec 2, 2014 at 2:21 PM, Robert Wille rwi...@fold3.com wrote:

As a test, I took down a node, deleted /var/lib/cassandra and restarted it.

Did you decommission or removenode it when you took it down? If you didn't, the old node is still in the ring and affects the replication.

--
Tyler Hobbs
DataStax http://datastax.com/