My short answer is always – there are no rollbacks, we only go forward. Jeff’s 
answer is much more complete and technically precise. You *could* roll back a 
few nodes (depending on topology) by just replacing them as if they had died.

I always upgrade all nodes (the binaries) as quickly as possible (but one node 
at a time). The application stays up, stays happy, and my customers love 
“always up” Cassandra. I have clusters where we have done 3 or more major 
upgrades with 0 downtime for the application. One of the best things about 
supporting Cassandra! One-node-at-a-time upgrades can also be automated (which 
we have done).
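
For illustration only (not the exact tooling described above), a rough sketch of what such automation can look like: upgrade one node, then wait until the ring reports every node as UN before touching the next. Hostnames, the package manager, and the service manager are assumptions.

import subprocess
import time

NODES = ["cass-01.example.com", "cass-02.example.com", "cass-03.example.com"]

def ssh(host, command):
    """Run a command on a remote node and return its stdout."""
    return subprocess.run(["ssh", host, command], check=True,
                          capture_output=True, text=True).stdout

def ring_is_healthy(host):
    """True when nodetool status on `host` reports every node as UN (Up/Normal)."""
    try:
        status = ssh(host, "nodetool status")
    except subprocess.CalledProcessError:
        return False  # node not answering nodetool yet
    node_lines = [l for l in status.splitlines()
                  if l[:2] in ("UN", "DN", "UJ", "DJ", "UL", "DL", "UM", "DM")]
    return bool(node_lines) and all(l.startswith("UN") for l in node_lines)

for node in NODES:
    ssh(node, "nodetool drain")                # flush memtables, stop accepting traffic
    ssh(node, "sudo systemctl stop cassandra")
    ssh(node, "sudo yum -y update cassandra")  # binary upgrade; package manager is an assumption
    ssh(node, "sudo systemctl start cassandra")
    while not ring_is_healthy(node):           # one node at a time: wait before the next node
        time.sleep(30)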

After upgrading binaries on all nodes, I execute upgradesstables on groups of 
nodes (depending on load, hardware, cluster size, etc.). Reasoning: You cannot 
do any streaming operations (bootstrap, repairs) in a mixed-version cluster 
(except for maybe very minor version upgrades).
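
A minimal sketch of the “groups of nodes” part, assuming SSH access to each node; the group size and hostnames are placeholders to tune for your load and hardware.

import subprocess
from concurrent.futures import ThreadPoolExecutor

NODES = ["cass-%02d.example.com" % i for i in range(1, 45)]
GROUP_SIZE = 4  # nodes rewriting sstables at the same time; tune for load/hardware

def upgrade_sstables(host):
    # Rewrite every sstable on the node into the current on-disk format.
    subprocess.run(["ssh", host, "nodetool upgradesstables -a"], check=True)

for start in range(0, len(NODES), GROUP_SIZE):
    group = NODES[start:start + GROUP_SIZE]
    with ThreadPoolExecutor(max_workers=GROUP_SIZE) as pool:
        list(pool.map(upgrade_sstables, group))  # block until the whole group finishes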


Sean Durity
From: shalom sagges [mailto:shalomsag...@gmail.com]
Sent: Wednesday, February 28, 2018 3:54 AM
To: user@cassandra.apache.org
Subject: [EXTERNAL] Re: Version Rollback

These are really good directions. Thanks a lot everyone!
@Kenneth - The cluster is composed of 44 nodes, version 2.0.14, ~2.5TB of data 
per node. It's gonna be a major version upgrade (or upgrades, to be exact... 
version 3.x is the target).

@Jeff, I have a passive DC. What if I upgrade the passive DC and, if all goes 
well, move the applications to work with the passive DC and then upgrade the 
active DC? Is this doable?
Also, would you suggest upgrading one node (binaries), upgrading its SSTables, 
and then moving on to the second node, then the third, etc.? Or first upgrading 
the binaries on all nodes, and only then starting with the SSTables upgrade?
Thanks!


On Tue, Feb 27, 2018 at 7:47 PM, Jeff Jirsa <jji...@gmail.com> wrote:
MOST minor versions support rollback - the exceptions are those where the 
internode protocol changes (3.0.14 being the only one in recent memory) or 
where the sstable format changes (again, rare). No major versions support 
rollback - the only way to do it is to upgrade in a way that lets you 
effectively reinstall the old version without data loss.

The steps usually look like:

Test in a lab
Test in a lab again
Test in a lab a few more times
Snapshot everything

If you have a passive data center:
- upgrade one instance
- check to see if it’s happy
- upgrade another
- check to see if it’s happy
- continue until the passive dc is done
- if at any point they’re unhappy, rebuild the DC from the active DC (wipe, 
reinstall the old version, and re-stream)
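
If you do hit that rebuild path, the per-node mechanics look roughly like the sketch below. This is a sketch only: hostnames, the package name, data paths, and service manager are assumptions, and replace_address/auto_bootstrap handling is omitted.

import subprocess

def rebuild_from_active_dc(host, old_package="cassandra-2.0.14", active_dc="DC_ACTIVE"):
    """Reinstall the old version on one node, wipe its data, and re-stream from the active DC."""
    commands = [
        "sudo systemctl stop cassandra",
        "sudo yum -y downgrade %s" % old_package,   # back to the old binaries
        "sudo rm -rf /var/lib/cassandra/data"
        " /var/lib/cassandra/commitlog"
        " /var/lib/cassandra/saved_caches",         # wipe local state
        "sudo systemctl start cassandra",
        "nodetool rebuild -- %s" % active_dc,       # stream this node's ranges from the active DC
    ]
    for command in commands:
        subprocess.run(["ssh", host, command], check=True)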

On the active DCs, you’ll want to canary it one replica at a time so you can 
treat a failed upgrade like a bad disk:
- upgrade one instance
- check if it’s happy; if it’s not treat it like a failed disk and replace it 
with the old version
- if you’re using single token, do another instance in a different replica set, 
repeat until you’re out of different replicas.
- if you’re using vnodes with a rack-aware snitch and have more racks than your 
RF, do another instance in the same rack as the canary, and repeat until you’re 
out of instances in that rack

This is typically your point of no return - as soon as you have two replicas on 
the new version, rollback is no longer practical.
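
As an illustration of the rack-by-rack bookkeeping, the candidate list can be pulled from driver metadata - a sketch assuming the DataStax Python driver, with the contact point and DC name as placeholders:

from collections import defaultdict
from cassandra.cluster import Cluster

cluster = Cluster(["cass-01.example.com"])
cluster.connect()  # populates cluster.metadata

racks = defaultdict(list)
for host in cluster.metadata.all_hosts():
    if host.datacenter == "DC_ACTIVE":
        racks[host.rack].append(host.address)

# Canary one node, finish its rack, then move to the next rack.
for rack, addresses in sorted(racks.items()):
    print("rack %s: canary %s, then %s" % (rack, addresses[0], addresses[1:]))

cluster.shutdown()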


--
Jeff Jirsa


On Feb 27, 2018, at 9:22 AM, Carl Mueller <carl.muel...@smartthings.com> wrote:
My speculation is that IF (big if) the sstable formats are compatible between 
the versions, which probably isn't the case for major versions, then you could 
drop back.

If the sstables changed format, then you'll probably need to figure out how to 
rewrite the sstables in the older format and then sstableloader them in the 
older-version cluster if need be. Alas, while there is an sstable upgrader, 
there isn't a downgrader AFAIK.

And I don't have an intimate view of version-by-version sstable format changes 
and compatibilities. You'd probably need to check the upgrade instructions 
(which you presumably did if you're upgrading versions) to tell.

Basically, version rollback is pretty unlikely to be done.

The OTHER option:

1) build a new cluster with the new version, no new data.

2) code your driver layer to talk to both clusters. Write to both, but read 
preferentially from the new cluster, then fall through to the old (a rough 
sketch follows below). Yes, that gets hairy on multi-row queries. Port your 
data with sstable loading from the old cluster to the new gradually.

When you've done a full load of all the data from old to new, and you're 
satisfied with the new cluster stability, retire the old cluster.
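
A rough sketch of that dual-cluster pattern, shown with the DataStax Python driver for brevity (the same shape applies with the java-driver); keyspace, table, and contact points are made-up examples.

from cassandra.cluster import Cluster

old_session = Cluster(["old-cass.example.com"]).connect("my_keyspace")
new_session = Cluster(["new-cass.example.com"]).connect("my_keyspace")

INSERT = "INSERT INTO users (user_id, name) VALUES (%s, %s)"
SELECT = "SELECT user_id, name FROM users WHERE user_id = %s"

def write(user_id, name):
    # Write to both clusters so the new one fills in as traffic flows.
    new_session.execute(INSERT, (user_id, name))
    old_session.execute(INSERT, (user_id, name))

def read(user_id):
    # Prefer the new cluster; fall through to the old one on a miss.
    row = new_session.execute(SELECT, (user_id,)).one()
    return row if row is not None else old_session.execute(SELECT, (user_id,)).one()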

For merging two multi-row result sets you'll probably need your queries to 
return the partition hash value (or extract the code that generates the hash), 
keep two simultaneous java-driver ResultSets going, and merge their results, 
providing the illusion of a single database query. You'll need to pay attention 
to both the row key ordering and the column key ordering to ensure the combined 
results are properly ordered.
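
For example, a token-ordered merge of the two result sets might look like the sketch below (Python driver; the java-driver version is the same idea). It assumes both clusters use the same partitioner, and the table, columns, and dedup key are made up.

import heapq

# Full scans come back ordered by partition token, then clustering key,
# so selecting token(pk) lets us merge the two streams in order.
SCAN = "SELECT token(user_id) AS tok, user_id, ts, name FROM events"

def merged_scan(new_session, old_session):
    new_rows = iter(new_session.execute(SCAN))
    old_rows = iter(old_session.execute(SCAN))
    last_key = None
    for row in heapq.merge(new_rows, old_rows, key=lambda r: (r.tok, r.ts)):
        key = (row.user_id, row.ts)
        if key != last_key:  # rows double-written to both clusters show up twice, adjacent
            last_key = key
            yield row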

Writes will be slowed by the double writes, and reads will be bound by the 
worse-performing cluster.

On Tue, Feb 27, 2018 at 8:23 AM, Kenneth Brotman <kenbrot...@yahoo.com.invalid> wrote:
Could you tell us the size and configuration of your Cassandra cluster?

Kenneth Brotman

From: shalom sagges [mailto:shalomsag...@gmail.com]
Sent: Tuesday, February 27, 2018 6:19 AM
To: user@cassandra.apache.org
Subject: Version Rollback

Hi All,
I'm planning to upgrade my C* cluster to version 3.x and was wondering what's 
the best way to perform a rollback if need be.
If I used snapshot restoration, I would be facing data loss, depending on when I 
took the snapshot (a rollback might be required after upgrading half the 
cluster, for example).
If I add another DC to the cluster with the old version, then I could point the 
apps to talk to that DC if anything bad happens, but building it is really time 
consuming and requires a lot of resources.
Can anyone provide recommendations on this matter? Any ideas on how to make the 
upgrade foolproof, or at least "really really safe"?

Thanks!



