I guess I'm either not understanding how that answers the question, or I've just done a terrible job of asking it. I'll sleep on it and maybe I'll think of a better way to describe it tomorrow ;)
On Thu, Oct 20, 2016 at 8:45 PM, Yabin Meng <yabinm...@gmail.com> wrote:

> I believe you're using vnodes (because a token range change doesn't make sense for a single-token setup unless you change it explicitly). If you bootstrap a new node with vnodes, I think the way the token ranges are assigned to the node is random (I'm not 100% sure here, but it should be so logically). If so, the ownership of the data that each node is responsible for will change. The part of the data that no longer belongs to the node under the new ownership, however, will still be kept on that node. Cassandra won't remove it automatically unless you run "nodetool cleanup". So to answer your question, I don't think the data has been moved away. More likely you have extra duplicates here:
>
> Yabin
>
> On Thu, Oct 20, 2016 at 6:41 PM, Branton Davis <branton.da...@spanning.com> wrote:
>
>> Thanks for the response, Yabin. However, if there's an answer to my question here, I'm apparently too dense to see it ;)
>>
>> I understand that, since the system keyspace data was not there, they started bootstrapping. What's not clear is whether they took over the token ranges of the previous nodes or got new token ranges. I'm mainly concerned about the latter. We've got the nodes back in place with the original data, but the fear is that some data may have been moved off of other nodes. I think that this is very unlikely, but I'm just looking for confirmation.
>>
>> On Thursday, October 20, 2016, Yabin Meng <yabinm...@gmail.com> wrote:
>>
>>> Most likely the issue is caused by the fact that when you moved the data, you moved the system keyspace data away as well. Because the data was copied into a different location than what C* was expecting, when C* starts it cannot find the system metadata and therefore tries to start as a fresh new node. If you keep the system keyspace data in the right place, you should see all the old info as expected.
>>>
>>> I've seen a few such occurrences from customers. As a best practice, I would always suggest keeping the Cassandra application data directory completely separate from the system keyspace directory (e.g. so they don't share a common parent folder).
>>>
>>> Regards,
>>>
>>> Yabin
>>>
>>> On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis <branton.da...@spanning.com> wrote:
>>>
>>>> Howdy folks. I asked a bit about this in IRC yesterday, but we're hoping to confirm a couple of things for our sanity.
>>>>
>>>> Yesterday, I was performing an operation on a 21-node cluster (vnodes, replication factor 3, NetworkTopologyStrategy, and the nodes are balanced across 3 AZs on AWS EC2). The plan was to swap each node's existing 1TB volume (where all cassandra data, including the commitlog, is stored) with a 2TB volume. The plan for each node (one at a time) was basically:
>>>>
>>>> - rsync while the node is live (repeated until there were only minor differences from new data)
>>>> - stop cassandra on the node
>>>> - rsync again
>>>> - replace the old volume with the new
>>>> - start cassandra
>>>>
>>>> However, there was a bug in the rsync command. Instead of copying the contents of /var/data/cassandra to /var/data/cassandra_new, it copied them to /var/data/cassandra_new/cassandra.
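The exact rsync command isn't quoted in the thread, but this kind of mistake usually comes down to rsync's trailing-slash handling. As a minimal sketch, using the paths from the message above:

    # No trailing slash on the source: rsync copies the directory itself,
    # producing /var/data/cassandra_new/cassandra -- the layout described above.
    rsync -a /var/data/cassandra /var/data/cassandra_new/

    # Trailing slash on the source: rsync copies the directory's contents,
    # which is what the volume-swap plan intended.
    rsync -a /var/data/cassandra/ /var/data/cassandra_new/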
>>>> So, when cassandra was started after the volume swap, there was some behavior that was similar to bootstrapping a new node (data started streaming in from other nodes). But there was also some behavior that was similar to a node replacement (nodetool status showed the same IP address, but a different host ID). This happened with 3 nodes (one from each AZ). The nodes had received 1.4GB, 1.2GB, and 0.6GB of data (whereas the normal load for a node is around 500-600GB).
>>>>
>>>> The cluster was in this state for about 2 hours, at which point cassandra was stopped on those nodes. Later, I moved the data from the original volumes back into place (so it should be in the original state from before the operation) and started cassandra back up.
>>>>
>>>> Finally, the questions. We've accepted the potential loss of new data within those two hours, but our primary concern now is what was happening with the bootstrapping nodes. Would they have taken on the token ranges of the original nodes, or acted like new nodes and gotten new token ranges? If the latter, is it possible that any data moved from the healthy nodes to the "new" nodes, or would restarting them with the original data (and repairing) put the cluster's token ranges back into a normal state?
>>>>
>>>> Hopefully that was all clear. Thanks in advance for any info!
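For reference, the checks and cleanup being discussed map onto a few standard nodetool commands. This is only a sketch: "my_keyspace" is a placeholder, and the commands would be run on each affected node.

    # Compare host IDs, load, and effective ownership before/after restoring the original data.
    nodetool status my_keyspace
    nodetool ring my_keyspace

    # Once the original data is back in place: repair the node's primary ranges,
    # then remove any data the node no longer owns (the "nodetool cleanup" Yabin mentions).
    nodetool repair -pr my_keyspace
    nodetool cleanup my_keyspace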