We are preparing for our upgrade from AD 2000 to 2003. I am working out our upgrade plan and have a few questions regarding recovery/contingency plans.
Our environment supports about 1700 desktops and 120+ servers. We have two AD sites and four domain controllers. All DCs are GCs.

High-level review of the steps that we will be taking to prepare for recovery in the event that the entire upgrade goes south:

* System state backups of all domain controllers.
* Disk image of our DC holding all FSMO roles. This machine will have the hotfix related to the USN rollback applied before imaging. The image will be loaded onto identical hardware and run offline to confirm that it is good.
* Addition of one domain controller running as a virtual machine in a different site, which will be copied offline for disaster recovery purposes.
* We have successfully performed the schema update against our AD in a lab environment and did not run into any problems.
* In the event of problems during the upgrade process, our plans call for contacting PSS and working through normal recovery processes. The disk image and virtual machine copy are intended for use in the event that normal recovery attempts have failed and we need to recover from scratch.
* We are running the full gamut of health checks to make sure we have a healthy AD before beginning any upgrade tasks in our production environment.

Questions:

It is my feeling that the schema update is a more significant step than adding the first Server 2003 DC. Is this correct? Does the process of adding a Server 2003 domain controller present any level of risk greater than adding a W2K DC?

We are considering adding a lag site, performing the schema update there, and ensuring that it replicates successfully within the lag site before letting it hit the rest of the domain controllers. Considering the size of our environment, is the process of upgrading the schema in a lag site going to add unnecessary complication to the process? I know that may be very subjective.

Is it a worthwhile strategy to add a lag site for the purpose of recovery during the upgrade process?
We are not otherwise using lag sites at this time.

If we add a lag site for the schema update, do we need to physically disconnect it from the other sites when the schema is updated to prevent the replication from occurring after the update?

Thanks in advance for any input.

Devin

List info : http://www.activedir.org/List.aspx
List FAQ : http://www.activedir.org/ListFAQ.aspx
List archive: http://www.activedir.org/ml/threads.aspx
