Yes, if you are talking about corruption, then you would need snapshots to go back to. Recovery would be simpler if the Ambari Server hostname does not change (IP address changes should not matter).
One more step that I forgot to mention: you would need to delete /var/lib/ambari-agent/keys/* from each agent before restarting it.

Yusaku

From: Clark Breyman <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, June 26, 2015 5:22 PM
To: "[email protected]" <[email protected]>
Subject: Re: Ambari data corruption/recovery process

Thanks Yusaku for the quick response. For our production systems, we're planning on using Postgres replication to ensure backups, though that doesn't defend against data corruption. Perhaps snapshots will be required.

Is there any documentation on restoring to a newly provisioned host? Is there any reason to use a DNS A record instead of a CNAME alias to simplify the recovery process?

On Fri, Jun 26, 2015 at 5:14 PM, Yusaku Sako <[email protected]> wrote:

Ambari DB should be backed up on a regular basis. This is the most important piece of information. It is also advisable to back up /etc/ambari-server/conf/ambari-server.properties. If you have these two, you can restore Ambari Server back to a running condition on a different host.

If the hostname of the Ambari Server changes, then you would have to update /etc/ambari-agent/conf/ambari-agent.ini to point to the new Ambari Server hostname and restart the agent.

Yusaku

From: Clark Breyman <[email protected]>
Reply-To: "[email protected]" <[email protected]>
Date: Friday, June 26, 2015 5:10 PM
To: "[email protected]" <[email protected]>
Subject: Ambari data corruption/recovery process

I'm wondering if anyone can share pointers/procedures/best practices to handle the scenarios where:

a) The SQL database becomes corrupt. (Bugs, ...)
b) The Ambari service host is lost (e.g. EC2 instance termination, physical hardware loss, ...)
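
The agent-side steps mentioned in this thread (pointing /etc/ambari-agent/conf/ambari-agent.ini at the new Ambari Server hostname and clearing /var/lib/ambari-agent/keys/* before restarting the agent) could be scripted roughly as below. This is only a sketch, not an official Ambari procedure. It assumes the server hostname is stored as a "hostname" key in a "[server]" section of ambari-agent.ini, and the helper names repoint_agent and clear_agent_keys are made up for illustration. Note that rewriting the file with configparser drops any comments it contains, so keep a copy of the original first.

```python
import configparser
from pathlib import Path


def repoint_agent(ini_path, new_server_hostname):
    """Set the hostname key in the [server] section of an agent ini file.

    Assumption: the Ambari Server hostname lives under [server] hostname.
    Comments in the file are not preserved by configparser.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    with open(ini_path) as f:
        parser.read_file(f)
    if not parser.has_section("server"):
        parser.add_section("server")
    parser.set("server", "hostname", new_server_hostname)
    with open(ini_path, "w") as f:
        parser.write(f)


def clear_agent_keys(keys_dir):
    """Delete the regular files directly under keys_dir.

    Returns the sorted list of removed file names; a missing directory
    is treated as "nothing to remove".
    """
    directory = Path(keys_dir)
    removed = []
    if not directory.is_dir():
        return removed
    for entry in sorted(directory.iterdir()):
        if entry.is_file():
            entry.unlink()
            removed.append(entry.name)
    return removed
```

On each agent host this might be run as root with something like repoint_agent("/etc/ambari-agent/conf/ambari-agent.ini", "new-ambari.example.com") followed by clear_agent_keys("/var/lib/ambari-agent/keys"), and then the agent restarted (for example with "ambari-agent restart"). The hostname shown here is a placeholder.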
