On 12/03/14 10:53, Erik P. Ostlyngen wrote:
> On 03/10/2014 02:59 PM, Matthijs Mekking wrote:
>> Just some ideas of how we can fix it in the future. For a short
>> term work around, I assume monitoring is your friend.
>
> This brings up the question of how to handle the various emergency
> situations.
>
> In the scenario described by Antti (lets say the problems with the
> zone provisioning system is recognized, but cannot be repaired within
> the scheduled key publishing time interval), would the following be a
> safe way to handle the situation?
>
> 1) Stop the enforcer
> 2) Fix the zone provisioning problems and publish the zone, which
>    includes the new key
> 3) Wait for the new zone to propagate all the way to the caching
>    resolvers
> 4) Re-start the enforcer and let it complete the rollover
>
> Are there other recommended ways to freeze or postpone an ongoing key
> rollover?
>
> And are there good reasons for not using very long PublishSafety and
> RetireSafety intervals, lets say weeks? It would make a controlled
> emergency key rollover terribly slow, but in that case it should be
> possible to change the policy before initiating the rollover.
One thing first: even a static zone is re-signed and published on the timescales defined by your signature Resign and Refresh parameters. So even if the system that creates the unsigned zones breaks, a key rollover can happily progress. So let's assume that we are talking about an issue that prevents a new _signed_ zone from being created...

Also, let's assume that we are talking about the ZSK here. This is not unreasonable, as a KSK rollover relies on the user indicating that the new key's DS has appeared at the parent. If you are worried about a KSK roll, then just wait before you issue this command.

Pre- and post-publishing the new/old ZSK is a very low-cost action, as the key is not used to generate signatures while it is only published (and so the zone size increase is minimal). So these safety margins are designed to cover the sorts of eventualities that might delay publication of the zone. (There is also a propagation delay, which is intended to cover "normal conditions".) Indeed, days or weeks can be used.

To answer your specific question: stopping the enforcerd will freeze any rollover, which is a good first step. However, you will need to be a bit careful when restarting it, depending on how long it has been stopped and where in a rollover it was when you stopped it.

If it has been offline for a period of the order of the safety margin you have built in, then I would recommend increasing that margin in the appropriate policy, running "ods-ksmutil update kasp" to push the change through, and then starting the enforcer.

If it were me doing this, I would first make sure on an offline system that this has had the desired effect, using "ods-ksmutil key list" to see either:

1) the timer on the new key becoming active; this is the critical time if the rollover was stopped while introducing the new key, or
2) the timer for the old key becoming "dead"; this is the critical time if the rollover was stopped while the old key was "retired".
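As a rough illustration of the "of the order of the safety margin" check, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names are hypothetical, the 50% threshold is just one reading of "of the order of", and adding the downtime to the existing margin is one plausible way to pick a new value. OpenDNSSEC does not do this calculation for you; you would still edit the policy by hand and push it with "ods-ksmutil update kasp".

```python
from datetime import timedelta


def needs_larger_margin(downtime: timedelta,
                        safety_margin: timedelta,
                        fraction: float = 0.5) -> bool:
    """Return True when the enforcer downtime is comparable to the margin.

    `fraction` is an assumed threshold: downtime of at least half the
    configured PublishSafety/RetireSafety counts as "of the order of" it.
    """
    return downtime >= safety_margin * fraction


def suggested_margin(downtime: timedelta,
                     safety_margin: timedelta) -> timedelta:
    """Suggest a margin to set in the policy before restarting the enforcer.

    Assumed rule: if the downtime ate into the margin, extend the margin
    by the downtime; otherwise leave it unchanged.
    """
    if needs_larger_margin(downtime, safety_margin):
        return safety_margin + downtime
    return safety_margin


if __name__ == "__main__":
    # Example: enforcer stopped for 20 hours, policy margin of 1 day.
    print(suggested_margin(timedelta(hours=20), timedelta(days=1)))
```

The value it suggests is only a starting point; the timers shown by "ods-ksmutil key list" remain the thing to check before trusting it.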
Once the roll has happened you can put the safety margin back down if you wish to.

Sion
