I guess I'm either not understanding how that answers the question, or I've just done a terrible job of asking it. I'll sleep on it and maybe I'll think of a better way to describe it tomorrow ;)
On Thu, Oct 20, 2016 at 8:45 PM, Yabin Meng <yabinm...@gmail.com> wrote:

> I believe you're using vnodes (because a token range change doesn't make sense for a single-token setup unless you change it explicitly). If you bootstrap a new node with vnodes, I think the way the token ranges are assigned to the node is random (I'm not 100% sure here, but it should be so logically). If so, the ownership of the data that each node is responsible for will change. The part of the data that no longer belongs to the node under the new ownership, however, will still be kept on that node. Cassandra won't remove it automatically unless you run "nodetool cleanup". So to answer your question, I don't think the data has been moved away. More likely you have extra duplicates here:
>
> Yabin
>
> On Thu, Oct 20, 2016 at 6:41 PM, Branton Davis <branton.da...@spanning.com> wrote:
>
>> Thanks for the response, Yabin. However, if there's an answer to my question here, I'm apparently too dense to see it ;)
>>
>> I understand that, since the system keyspace data was not there, they started bootstrapping. What's not clear is whether they took over the token ranges of the previous nodes or got new token ranges. I'm mainly concerned about the latter. We've got the nodes back in place with the original data, but the fear is that some data may have been moved off of other nodes. I think that this is very unlikely, but I'm just looking for confirmation.
>>
>> On Thursday, October 20, 2016, Yabin Meng <yabinm...@gmail.com> wrote:
>>
>>> Most likely the issue is caused by the fact that when you moved the data, you moved the system keyspace data away as well. Because the data was copied into a different location than what C* was expecting, when C* starts it cannot find the system metadata and therefore tries to start as a fresh new node. If you keep the system keyspace data in the right place, you should see all the old info as expected.
>>>
>>> I've seen a few such occurrences from customers. As a best practice, I would always suggest keeping the Cassandra application data directory completely separate from the system keyspace directory (e.g. so they don't share a common parent folder).
>>>
>>> Regards,
>>>
>>> Yabin
>>>
>>> On Thu, Oct 20, 2016 at 4:58 PM, Branton Davis <branton.da...@spanning.com> wrote:
>>>
>>>> Howdy folks. I asked a bit about this in IRC yesterday, but we're hoping to confirm a couple of things for our sanity.
>>>>
>>>> Yesterday, I was performing an operation on a 21-node cluster (vnodes, replication factor 3, NetworkTopologyStrategy, and the nodes are balanced across 3 AZs on AWS EC2). The plan was to swap each node's existing 1TB volume (where all cassandra data, including the commitlog, is stored) with a 2TB volume. The plan for each node (one at a time) was basically:
>>>>
>>>> - rsync while the node is live (repeated until there were only minor differences from new data)
>>>> - stop cassandra on the node
>>>> - rsync again
>>>> - replace the old volume with the new
>>>> - start cassandra
>>>>
>>>> However, there was a bug in the rsync command. Instead of copying the contents of /var/data/cassandra to /var/data/cassandra_new, it copied them to /var/data/cassandra_new/cassandra.
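The exact rsync command isn't quoted in the thread, but this kind of mistake usually comes down to rsync's trailing-slash handling. As a minimal sketch, using the paths from the message above:

    # No trailing slash on the source: rsync copies the directory itself,
    # producing /var/data/cassandra_new/cassandra -- the layout described above.
    rsync -a /var/data/cassandra /var/data/cassandra_new/

    # Trailing slash on the source: rsync copies the directory's contents,
    # which is what the volume-swap plan intended.
    rsync -a /var/data/cassandra/ /var/data/cassandra_new/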
>>>> So, when cassandra was started after the volume swap, there was some behavior that was similar to bootstrapping a new node (data started streaming in from other nodes). But there was also some behavior that was similar to a node replacement (nodetool status showed the same IP address, but a different host ID). This happened with 3 nodes (one from each AZ). The nodes had received 1.4GB, 1.2GB, and 0.6GB of data (whereas the normal load for a node is around 500-600GB).
>>>>
>>>> The cluster was in this state for about 2 hours, at which point cassandra was stopped on those nodes. Later, I moved the data from the original volumes back into place (so it should be in the original state from before the operation) and started cassandra back up.
>>>>
>>>> Finally, the questions. We've accepted the potential loss of new data within those two hours, but our primary concern now is what was happening with the bootstrapping nodes. Would they have taken on the token ranges of the original nodes, or acted like new nodes and gotten new token ranges? If the latter, is it possible that any data moved from the healthy nodes to the "new" nodes, or would restarting them with the original data (and repairing) put the cluster's token ranges back into a normal state?
>>>>
>>>> Hopefully that was all clear. Thanks in advance for any info!
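For reference, the checks and cleanup being discussed map onto a few standard nodetool commands. This is only a sketch: "my_keyspace" is a placeholder, and the commands would be run on each affected node.

    # Compare host IDs, load, and effective ownership before/after restoring the original data.
    nodetool status my_keyspace
    nodetool ring my_keyspace

    # Once the original data is back in place: repair the node's primary ranges,
    # then remove any data the node no longer owns (the "nodetool cleanup" Yabin mentions).
    nodetool repair -pr my_keyspace
    nodetool cleanup my_keyspace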