Comments below. Hope this helps.

- Dan

From: cisco-voip [mailto:[email protected]] On Behalf Of Dave 
Goodwin
Sent: Tuesday, December 01, 2015 8:41 PM
To: [email protected]
Subject: [cisco-voip] advice on upgrading large CUCM cluster with CoW from 8.6 
to 10.5

Has anyone performed an upgrade of a large cluster that uses Clustering over 
the WAN from 8.6 to 10.5 that can share any lessons learned with that specific 
type of scenario? The cluster is already virtual, has 16 nodes, and is spread 
across 3 geographic areas. There is already a standalone PLM on the network 
that will be loaded with the licenses prior to the upgrade. I am trying to 
decide whether to do a standard upgrade, or utilize PCD to perform a Migration 
task.

DPagan: Yes - I would say the largest would have been a 20 node cluster and I 
decided not to use PCD for the task for a handful of reasons. First, I (and my 
colleagues on the support team) feel more comfortable upgrading UC platforms 
manually - not only because we have full control over the process but we’re 
also staffed 24/7, so scheduling an upgrade for afterhours isn’t an 
inconvenience that would otherwise be handled by a scheduled PCD task. Second, 
due to the sensitivity of PCD, recovering from a failed automated upgrade on a 
~20 node cluster is much more difficult compared to a smaller cluster (2-5 
nodes for example). Third, we upgrade UC platforms by hand pretty often and are 
very comfortable with the manual process. So, for these reasons, I decided 
against using PCD for an upgrade of this size.

For the standard upgrade, I know there are a handful of things that need to be 
done to prepare, like installing the necessary COP files. After the initial RU 
has been completed to 10.5, I know that I would feel the need to 
backup/reimage/restore, because I want the new VMs to have the ext4 filesystem, 
utilize the proper NIC driver for 10.5, and have the new partition sizing.

DPagan: No concerns here and it’s a valid desire to restore your 10.5 cluster 
to new virtual machines. You might want to consider restoring while on 8.6 
though for two reasons: 1) avoid having to re-license again after going to 10.5 
and 2) there are no problems installing 8.6 on a 10.5 spec VM… but it’s your 
call. Personally I would schedule a restore of 8.6 to 10.5 spec virtual 
machines one week before the major upgrade.

For a PCD Migration, I have had a few issues trying to use it in the past for 
other tasks. It seems to have gotten incrementally better and less fussy over 
the past year or more, but it can still be somewhat fragile. I see in the 
latest version of PCD 11.0 that it supports the use of remote SFTP servers for 
servers that are remote from the PCD running the task. Assuming it works as 
advertised, that should take out the significant performance issues that would 
happen without that feature. The appeal here is I know the VMs made by PCD are 
freshly installed machines with all the right 10.5 traits I mentioned above, 
with data imported from the source cluster. I would not have to worry about 
the time required (and chance of missing one of the many steps) to do all the 
manual tasks on a 16 node cluster.

DPagan: Unfortunately the remote SFTP server feature does not apply to 
migrations, only to stand-alone upgrades, so this won’t be an option for you on 
this task. However, this should work without issues if you restore to 10.5 spec 
VMs while still on 8.6, then upgrade in-place using PCD and the remote SFTP 
server option… but then this means using PCD. Some of my coworkers on the 
deployment team swear by it, but I *personally* would avoid it for large scale 
migrations/upgrades like this (then again, we’ve performed upgrades by hand for 
years so there’s a level of comfort that comes along with that).

The remote SFTP feature would allow you to avoid issues resulting from ISO and 
DB data transfer over low bandwidth connections. PCD has built in timers that 
aren’t configurable, and when these timers expire prematurely (due to a slow 
ISO transfer for example) you can encounter false positives where PCD thinks 
the upgrade failed while it’s actually still running in the background. It does 
this through AXL calls for CUCM’s active version - if the upgrade is still 
running, and Tomcat/AXL has yet to start up, it’s flagged as failed.
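
To make the false positive concrete, here is a minimal Python sketch. It is 
not PCD's actual code: the function names, the poll budget, and the version 
strings are all hypothetical, and the AXL query is replaced by a scripted 
probe that returns None while Tomcat/AXL is still down.

```python
from typing import Callable, Iterable, Optional

# Hypothetical stand-in for an AXL "active version" query.
# Returns a version string, or None when Tomcat/AXL is not answering yet.
Probe = Callable[[], Optional[str]]


def scripted_probe(responses: Iterable[Optional[str]]) -> Probe:
    """Build a probe that replays canned responses, then keeps returning None."""
    it = iter(responses)
    return lambda: next(it, None)


def naive_verdict(probe: Probe, target: str, max_polls: int) -> str:
    """Fixed poll budget: running out of polls is always reported as failure."""
    for _ in range(max_polls):
        if probe() == target:
            return "succeeded"
    return "failed"


def cautious_verdict(probe: Probe, target: str, max_polls: int) -> str:
    """Same budget, but 'no answer yet' is kept apart from 'wrong version'."""
    last: Optional[str] = None
    for _ in range(max_polls):
        last = probe()
        if last == target:
            return "succeeded"
    # None means AXL never answered on the final poll: the upgrade may
    # simply still be running, so do not call it a failure.
    return "unknown" if last is None else "mismatch"


if __name__ == "__main__":
    # A slow upgrade: AXL is silent for five polls, then reports the target.
    slow = [None] * 5 + ["10.5"]
    print(naive_verdict(scripted_probe(slow), "10.5", 3))     # failed
    print(cautious_verdict(scripted_probe(slow), "10.5", 3))  # unknown
```

With a budget of three polls the naive check reports a failure even though 
the same node would have answered correctly on the sixth poll, which is the 
situation described above.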

My suggestion, if you migrate before restore, is to upgrade manually using ISOs 
mounted on local datastores as opposed to sourcing them remotely from PCD.

Finally, since this is a megacluster with CoW on top of that, I am sensitive to 
the issues that can happen with DB replication after an upgrade where it can 
take many hours to complete. Is there a way, either manually or via PCD, to 
speed that up? For example, I would like to consider running 'utils 
dbreplication setprocess 40' after the 10.5 publisher is up and running. Would 
that be the best way to handle it, and will it affect replication setups 
that have already begun? Does the server or any services need to be restarted 
after running it? And, does the command only need to be run on the publisher, 
or must I run it on each node after it comes up on 10.5?

DPagan: I personally haven’t encountered the need to use the setprocess 
command. For the large migration I mentioned above, the setrepltimeout was 
helpful. Ryan Ratliff helped write an article that talks about this in detail 
and how to calculate the value needed for the command. For the large migration 
I mentioned above, we did run into replication issues but nothing that couldn’t 
be resolved through standard DB replication troubleshooting. My suggestion 
would be to make sure you don’t go into the upgrade with existing replication 
problems, stop for a moment after each Subscriber group (I’m assuming you’re 
grouping the Subs for the upgrade) is upgraded and allow replication to set up 
(if you want to be very careful), and have Cisco TAC’s number on speed dial.
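
As a rough illustration of the setrepltimeout calculation, the Python sketch 
below assumes the commonly cited rule of thumb of 1 minute for each of 
servers 1-5, 2 minutes for each of servers 6-10, and 3 minutes for each 
server beyond 10. That rule is an assumption here, not something stated in 
this thread, so check the article for the authoritative numbers before 
relying on the result. The function name is hypothetical.

```python
def repl_timeout_seconds(node_count: int) -> int:
    """Estimate a setrepltimeout value (seconds) for a cluster of node_count.

    Assumed rule of thumb (verify against Cisco's article):
      servers 1-5   -> 1 minute each
      servers 6-10  -> 2 minutes each
      servers 11+   -> 3 minutes each
    """
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    minutes = 0
    for n in range(1, node_count + 1):
        if n <= 5:
            minutes += 1
        elif n <= 10:
            minutes += 2
        else:
            minutes += 3
    return minutes * 60


if __name__ == "__main__":
    seconds = repl_timeout_seconds(16)  # the 16 node cluster in question
    # 5*1 + 5*2 + 6*3 = 33 minutes = 1980 seconds
    print(f"utils dbreplication setrepltimeout {seconds}")
```

Under that assumed rule, a 16 node cluster works out to 33 minutes (1980 
seconds) and a 20 node cluster to 45 minutes (2700 seconds).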

Remember to deactivate the EM service on applicable nodes as well. Also, if you 
restore 8.6 on 10.5 spec VMs and upgrade in-place, make sure you switch version 
on the Publisher before proceeding to upgrade the Subscribers. Avoid upgrading 
the Publisher, then Subs, then switch -- I can provide the Cisco doc that 
mentions not to do this and I can explain why in further detail if needed.
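
The ordering can be written down as a simple checklist generator. This is 
only a sketch of the sequence described above: the hostnames and step labels 
are hypothetical, and switching each Subscriber group right after it is 
upgraded is an assumption on top of what is stated here.

```python
from typing import List, Sequence


def build_plan(publisher: str, groups: Sequence[Sequence[str]]) -> List[str]:
    """Return ordered steps: Publisher upgraded AND switched before any Sub."""
    steps = [
        "deactivate EM service on applicable nodes",
        f"upgrade {publisher}",
        f"switch-version {publisher}",
    ]
    for group in groups:
        steps += [f"upgrade {node}" for node in group]
        # Assumption: each group is switched before moving to the next one.
        steps += [f"switch-version {node}" for node in group]
        steps.append("pause: utils dbreplication runtimestate")
    return steps


def publisher_switched_first(steps: Sequence[str], publisher: str) -> bool:
    """True if the Publisher switch precedes every Subscriber upgrade."""
    switch_at = steps.index(f"switch-version {publisher}")
    sub_upgrades = [
        i for i, s in enumerate(steps)
        if s.startswith("upgrade ") and s != f"upgrade {publisher}"
    ]
    return all(i > switch_at for i in sub_upgrades)


if __name__ == "__main__":
    # Hypothetical hostnames, grouped by site.
    plan = build_plan("pub", [["sub-a1", "sub-a2"], ["sub-b1"]])
    for number, step in enumerate(plan, start=1):
        print(number, step)
```

A plan that upgrades the Publisher, then the Subs, and only then switches 
would fail the publisher_switched_first check.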

If I can offer any additional pieces of advice, I’ll e-mail you directly, but 
that’s all that comes to mind over the past 10 minutes.

Any experiences or opinions are welcome. Thanks in advance!

-Dave
_______________________________________________
cisco-voip mailing list
[email protected]
https://puck.nether.net/mailman/listinfo/cisco-voip
