Re: Newbie Replication/Cluster Question
On Thu, Jan 13, 2011 at 2:32 PM, Mark Moseley moseleym...@gmail.com wrote:

On Thu, Jan 13, 2011 at 1:08 PM, Gary Dusbabek gdusba...@gmail.com wrote:

It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor. The cluster as it stands doesn't contain all the data you are asking it to replicate on the new node.

Ok, maybe I'm thinking of replication_factor backwards. I took it to mean how many nodes would have *full* copies of the whole of the keyspace's data, in which case with my keyspace with replication_factor=2 the still-alive node would have 100% of the data to replicate to the wiped-clean node--in which case all the data would be there to bootstrap. I was assuming replication_factor=2 in a 2-node cluster == both nodes having a full replica of the data. Do I have that wrong?

What's also confusing is that I did this same test on a clean node that wasn't clustered yet (which is interesting that it doesn't complain then about replication_factor # of nodes), so unless it was throwing away data as I was inserting it, it'd all be there.

Is the general rule then that the max. replication factor must be #_of_nodes-1 then? If replication_factor==#_of_nodes, then if you lost a box, it seems like your cluster would be toast.

Perhaps the better question would be, if I have a two node cluster and I want to be able to lose one box completely and replace it (without losing the cluster), what settings would I need? Or is that an impossible scenario? In production, I'd imagine a 3 node cluster being the minimum but even there I could see each box having a full replica, but probably not beyond 3.
Re: Newbie Replication/Cluster Question
Perhaps the better question would be, if I have a two node cluster and I want to be able to lose one box completely and replace it (without losing the cluster), what settings would I need? Or is that an impossible scenario? In production, I'd imagine a 3 node cluster being the minimum but even there I could see each box having a full replica, but probably not beyond 3. Or perhaps, in the case of losing a box completely in a 2-node RF=2 cluster, do I need to lower the replication_factor on the still-alive box, bootstrap the replaced node back in, and then change the replication_factor=2?
Re: Newbie Replication/Cluster Question
On Fri, Jan 14, 2011 at 4:29 PM, Aaron Morton aa...@thelastpickle.com wrote:

Here are some slides I did last year that have a simple explanation of RF: http://www.slideshare.net/mobile/aaronmorton/well-railedcassandra24112010-5901169

Short version: generally no single node contains all the data in the db. Normally the RF is going to be less than the number of nodes, and the higher the RF, the more concurrent node failures you can handle (when writing at Quorum).

- At RF 3 you can keep reading and writing with 1 node down. If you lose a second node the cluster will appear to be down for a portion of the keys. The portion depends on the total number of nodes.
- At RF 5 the cluster will be up for all keys if you have 2 nodes down. If you have 3 down the cluster will appear down for only a portion of the keys; again, the portion depends on the total number of nodes.

It's a bit more complicated though: when I say 'node is down' I mean one of the nodes that the key would have been written to is down (the 3 or 5 above). So if you had 10 nodes, RF 5, you could have 4 nodes down and the cluster be available for all keys, so long as there are still 3 natural endpoints for each key.

Hope that helps.
Aaron

On 15/01/2011, at 8:52 AM, Mark Moseley moseleym...@gmail.com wrote:

Perhaps the better question would be, if I have a two node cluster and I want to be able to lose one box completely and replace it (without losing the cluster), what settings would I need? Or is that an impossible scenario? In production, I'd imagine a 3 node cluster being the minimum but even there I could see each box having a full replica, but probably not beyond 3.

Or perhaps, in the case of losing a box completely in a 2-node RF=2 cluster, do I need to lower the replication_factor on the still-alive box, bootstrap the replaced node back in, and then change the replication_factor=2?

Excellent, thanks! I'll definitely be checking those out. I just want to make sure I've got the hang of DR before we start deploying Cassandra, and I'd hate to figure all this out later on with angry customers standing over my shoulder :)
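
Editor's note: the RF 3 and RF 5 figures in Aaron's reply follow from the quorum size, which is floor(RF / 2) + 1. The sketch below is only an illustration of that arithmetic, not Cassandra code; the function names are made up for this example.

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerated_replica_failures(rf: int) -> int:
    """How many of a key's RF replicas can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


if __name__ == "__main__":
    for rf in (2, 3, 5):
        print(f"RF={rf}: quorum={quorum(rf)}, "
              f"replicas that may be down={tolerated_replica_failures(rf)}")
```

For RF 5 the quorum is 3, which is the "3 natural endpoints for each key" mentioned above. For RF 2 the quorum is 2, so a 2-node, RF=2 cluster cannot serve QUORUM requests with one node down.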
Re: Newbie Replication/Cluster Question
It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor. The cluster as it stands doesn't contain all the data you are asking it to replicate on the new node.

Gary.

On Thu, Jan 13, 2011 at 13:13, Mark Moseley moseleym...@gmail.com wrote:

I'm just starting to play with Cassandra, so this is almost certainly a conceptual problem on my part, so apologies in advance.

I was testing out how I'd do things like bring up new nodes. I've got a simple 2-node cluster with my only keyspace having replication_factor=2. This is on 32-bit Debian Squeeze. Java==Java(TM) SE Runtime Environment (build 1.6.0_22-b04). This is using the just-released 0.7.0 binaries. Configuration is pretty minimal besides using SimpleAuthentication module.

The issue is that whenever I kill a node in the cluster and wipe its datadir (i.e. rm -rf /var/lib/cassandra/*) and try to bootstrap it back into the cluster (and this occurs in both the scenario of both nodes being present during the writing of data as well as only a single node being up during writing of data), it seems to join the cluster and chug along till it keels over and dies with this:

INFO [main] 2011-01-13 13:56:23,385 StorageService.java (line 399) Bootstrapping
ERROR [main] 2011-01-13 13:56:23,402 AbstractCassandraDaemon.java (line 234) Exception encountered during startup.
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
    at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
    at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:204)
    at org.apache.cassandra.dht.BootStrapper.getRangesWithSources(BootStrapper.java:198)
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:83)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:417)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:361)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:161)
    at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:217)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)
Exception encountered during startup.
java.lang.IllegalStateException: replication factor (2) exceeds number of endpoints (1)
    at org.apache.cassandra.locator.SimpleStrategy.calculateNaturalEndpoints(SimpleStrategy.java:60)
    at org.apache.cassandra.locator.AbstractReplicationStrategy.getRangeAddresses(AbstractReplicationStrategy.java:204)
    at org.apache.cassandra.dht.BootStrapper.getRangesWithSources(BootStrapper.java:198)
    at org.apache.cassandra.dht.BootStrapper.bootstrap(BootStrapper.java:83)
    at org.apache.cassandra.service.StorageService.bootstrap(StorageService.java:417)
    at org.apache.cassandra.service.StorageService.initServer(StorageService.java:361)
    at org.apache.cassandra.service.AbstractCassandraDaemon.setup(AbstractCassandraDaemon.java:161)
    at org.apache.cassandra.thrift.CassandraDaemon.setup(CassandraDaemon.java:55)
    at org.apache.cassandra.service.AbstractCassandraDaemon.activate(AbstractCassandraDaemon.java:217)
    at org.apache.cassandra.thrift.CassandraDaemon.main(CassandraDaemon.java:134)

Seems like something of a chicken-or-the-egg problem of it not liking there only being 1 node but not letting node 2 join. Being that I've been messing with Cassandra for only a couple of days, I'm assuming I'm doing something wrong, but the only google'ing I can find for the above error is just a couple of 4+ month-old tickets that all sound resolved.

It's probably worth mentioning that if both nodes are started when I create the keyspace, the cluster appears to work just fine and I can start/stop either node and get at any piece of data.
The nodetool ring output looks like this:

Prior to starting 10.1.58.4 and then for a while after startup

Address         Status State   Load            Owns    Token
10.1.58.3       Up     Normal  524.99 KB       100.00% 74198390702807803312208811144092384306

10.1.58.4 seems to be joining

Address         Status State   Load            Owns    Token
                                                       74198390702807803312208811144092384306
10.1.58.4       Up     Joining 72.06 KB        56.66%  460947270041113367229815744049079597
10.1.58.3       Up     Normal  524.99 KB       43.34%  74198390702807803312208811144092384306

Java exception, back to just 10.1.58.3

Address         Status State   Load            Owns    Token
10.1.58.3       Up     Normal  524.99 KB       100.00% 74198390702807803312208811144092384306
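
Editor's note: the stack trace points at SimpleStrategy.calculateNaturalEndpoints. As a rough illustration of why the check fires, here is a simplified sketch of ring-walk replica placement: start at the first node whose token is at or after the key's token and take the next RF nodes clockwise. This is not the actual Cassandra source; the function name and the ValueError are stand-ins, and it assumes the joining node is not yet counted as a ring member, which is what the "(1)" in the error message suggests. The tokens and addresses are the ones from the nodetool output above.

```python
from bisect import bisect_left


def natural_endpoints(ring: dict, key_token: int, rf: int) -> list:
    """Pick rf replicas by walking the ring clockwise from the key's token.

    ring maps each member node's token to its address (hypothetical structure).
    """
    if rf > len(ring):
        # Mirrors the wording of the error seen in the log above.
        raise ValueError(
            f"replication factor ({rf}) exceeds number of endpoints ({len(ring)})"
        )
    tokens = sorted(ring)
    # First node whose token is >= the key's token, wrapping around the ring.
    start = bisect_left(tokens, key_token) % len(tokens)
    return [ring[tokens[(start + i) % len(tokens)]] for i in range(rf)]


# Both nodes are full ring members: RF=2 places every key on both nodes.
two_nodes = {
    74198390702807803312208811144092384306: "10.1.58.3",
    460947270041113367229815744049079597: "10.1.58.4",
}
print(natural_endpoints(two_nodes, 0, 2))

# Only the surviving node is a member while the wiped node is still joining.
one_node = {74198390702807803312208811144092384306: "10.1.58.3"}
try:
    natural_endpoints(one_node, 0, 2)
except ValueError as err:
    print(err)
```

In this simplified model, RF=2 on two member nodes does mean each node holds every key, which matches Mark's reading; the failure comes from computing placement against a ring that has only one member at that moment.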
Re: Newbie Replication/Cluster Question
On Thu, Jan 13, 2011 at 1:08 PM, Gary Dusbabek gdusba...@gmail.com wrote:

It is impossible to properly bootstrap a new node into a system where there are not enough nodes to satisfy the replication factor. The cluster as it stands doesn't contain all the data you are asking it to replicate on the new node.

Ok, maybe I'm thinking of replication_factor backwards. I took it to mean how many nodes would have *full* copies of the whole of the keyspace's data, in which case with my keyspace with replication_factor=2 the still-alive node would have 100% of the data to replicate to the wiped-clean node--in which case all the data would be there to bootstrap. I was assuming replication_factor=2 in a 2-node cluster == both nodes having a full replica of the data. Do I have that wrong?

What's also confusing is that I did this same test on a clean node that wasn't clustered yet (which is interesting that it doesn't complain then about replication_factor # of nodes), so unless it was throwing away data as I was inserting it, it'd all be there.

Is the general rule then that the max. replication factor must be #_of_nodes-1 then? If replication_factor==#_of_nodes, then if you lost a box, it seems like your cluster would be toast.