Ryan Lane wrote in response to Soren Hansen:

>> That's the whole point. For most interesting applications, "fast"
>> automatic migration isn't anywhere near fast enough. Don't try to
>> avoid failure. Expect it and design around it.
> This assumes all application designers are doing this. Most web
> applications do this fairly well, but most enterprise applications do
> this very poorly. Hardware HA is useful for more than just poorly
> designed applications though.
>
> I have a cloud instance that runs my personal website. I don't want to
> pay for two (or more, realistically) instances just to ensure that if
> my host dies that my site will continue to run. My provider should
> automatically detect the hardware failure and re-launch my instance on
> another piece of hardware; it should also notify me that it happened,
> but that's a different story ;).

There are techniques to migrate VMs between non-HA hosts, and there are
techniques that allow applications to be written so that any instance of
the server can be lost without impairing the application (you just start
a new instance of the server, rather than migrating the server). But
neither of those solves the problem as well as Hardware High
Availability. Whether Hardware HA is a cost-effective solution is
something that customers will ultimately have to determine.

A successful proposal would need to include a way of identifying when a
VM wants or needs to be hosted on a Hardware-HA enhanced host, a method
of identifying the Hardware-HA enhanced hosts, and the ability to track
when a Hardware-HA host is in degraded mode (i.e., it is currently one
resource failure away from an absolute failure).

I think those features can be designed in a way that does not impose too
heavy a burden on the core scheduling algorithm, as long as it isn't
required to evaluate a long list of "Hardware HA QoS metrics" to do
optimal guest-to-host assignments.

This is virtually the same issue as the Object Storage support for
self-healing mirroring (via ZFS) that we have proposed for Swift. It
defines an enhanced capability for specific servers that can be
characterized in a way that the generic control and management plane
algorithms can understand.
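To make the three requirements concrete, here is a rough sketch of how
little the scheduler would need to know. It is only an illustration, not
existing Nova or Swift code: the names Host, Instance, HostStatus,
hardware_ha, requires_ha, free_slots, pick_host and hosts_to_drain are
all made up for this example, and the policy of keeping HA hosts for
guests that asked for them is my own assumption.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class HostStatus(Enum):
    UP = "up"
    DEGRADED = "degraded"  # still serving, but one failure away from "down"
    DOWN = "down"


@dataclass
class Host:
    # All fields are hypothetical; a real scheduler tracks far more.
    name: str
    hardware_ha: bool = False
    status: HostStatus = HostStatus.UP
    free_slots: int = 1


@dataclass
class Instance:
    name: str
    requires_ha: bool = False  # the guest's single "I want Hardware HA" flag


def eligible(host: Host, instance: Instance) -> bool:
    """One boolean check per host, not a list of HA QoS metrics."""
    # Degraded and down hosts take no new guests.
    if host.status is not HostStatus.UP or host.free_slots < 1:
        return False
    return host.hardware_ha or not instance.requires_ha


def pick_host(hosts: List[Host], instance: Instance) -> Optional[Host]:
    candidates = [h for h in hosts if eligible(h, instance)]
    if not candidates:
        return None
    # Assumed policy: keep scarce HA capacity for guests that asked for it,
    # so non-HA guests land on plain hosts first.
    candidates.sort(key=lambda h: h.hardware_ha != instance.requires_ha)
    return candidates[0]


def hosts_to_drain(hosts: List[Host]) -> List[Host]:
    """Degraded hosts whose guests should start moving elsewhere."""
    return [h for h in hosts if h.status is HostStatus.DEGRADED]
```

The point of the sketch is that the core placement decision stays a
cheap per-host filter: one capability flag on the host, one request flag
on the guest, and a three-valued status. Anything richer could live
outside the core algorithm.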
The hardest part of making the generic algorithms understand this, in
both cases, is the addition of a "degraded" status for a server. Without
Hardware HA or self-healing mirroring, a host or data server is either
"up" or "down". With Hardware HA and self-healing mirroring, it can also
be "degraded": the Hardware HA host can be down to a single hardware
node, and the self-healing mirror can be down to a single working
storage device. In either case the remaining copy is still functional,
but you probably want to begin migrating the VMs or Swift partitions
elsewhere (unless your mean time to repair is really good).

_______________________________________________
Mailing list: https://launchpad.net/~openstack
Post to     : openstack@lists.launchpad.net
Unsubscribe : https://launchpad.net/~openstack
More help   : https://help.launchpad.net/ListHelp