> That said, while I agree it's an annoying edge case and a lot of code, I'd
> vote against this approach since:
>
> 1. Having an "offline" operation fail is not unusual, especially
>    given how poor the administrative interfaces are to map
>    hardware resources (NICs, memory boards, ...) to hardware
>    slots. If the administrator names the wrong slot, the offline
>    operation will likely fail, but any network configuration on
>    the affected NICs that has been changed since boot will not be
>    reapplied.
>
> 2. Failed offline handling should be consistent for all resources.
>    For instance, suppose both disk and NIC configuration must be
>    restored after a failed offline. Having the disk RCM code
>    restore the mounts that used to exist (regardless of whether
>    they were present at boot), but having the network RCM code
>    restore the persistent configuration, seems confusing and hard
>    to justify administratively.
>
> 3. In the future, there's a strong possibility we will allow link
>    and IP configuration to remain in a "detached" state across a DR
>    event, and then allow newly-inserted hardware to attach to that
>    link and IP configuration. That would move our model away from
>    using persistent configuration across a DR -- but what you've
>    proposed does the opposite, by further entrenching the use of
>    persistent configuration.
>
> Note that if (3) is realized, we would actually end up with less code,
> since we would no longer need to tear down the network configuration at
> all as part of a DR event -- and since nothing is torn down, nothing would
> need to be restored on failure.

That makes sense. I will do the undo_offline based on the "active" configuration.
Thanks - Cathy
