On Wed, Mar 18, 2015 at 08:50:54AM -0600, Alex Williamson wrote:
> On Wed, 2015-03-18 at 15:36 +0100, Michael S. Tsirkin wrote:
> > On Wed, Mar 18, 2015 at 08:15:01AM -0600, Alex Williamson wrote:
> > > On Wed, 2015-03-18 at 15:05 +0100, Michael S. Tsirkin wrote:
> > > > On Wed, Mar 18, 2015 at 08:02:26AM -0600, Alex Williamson wrote:
> > > > > On Wed, 2015-03-18 at 14:23 +0100, Michael S. Tsirkin wrote:
> > > > > > typo in subject: vfio, not vifo.
> > > > > >
> > > > > > On Thu, Mar 12, 2015 at 06:23:59PM +0800, Chen Fan wrote:
> > > > > > > for piix4 chipset, we don't need to expose aer, so introduce
> > > > > > > PC_I440FX_COMPAT for all piix4 machines to disable aercap,
> > > > > > > and add HW_COMPAT_2_2 to disable aercap for all lower
> > > > > > > than 2.3.
> > > > > > >
> > > > > > > Signed-off-by: Chen Fan <chen.fan.f...@cn.fujitsu.com>
> > > > > >
> > > > > > Well vfio is never migrated ATM.
> > > > > > So why is compat code needed at all?
> > > > >
> > > > > It's not for migration, it's to maintain current behavior on existing
> > > > > platforms. If someone gets an uncorrected AER error on q35 machine
> > > > > type today, the VM stops. With this change, AER would be exposed to
> > > > > the guest and the guest could handle it. The compat change therefore
> > > > > maintains the stop VM behavior on existing q35 machine types.
> > > >
> > > > If stop VM behaviour is useful, expose it to users.
> > > > If not, then don't.
> > > > I don't see why does it have to be tied to machine types.
> > >
> > > Because q35-2.2 machine type will currently do a stop VM on uncorrected
> > > AER error. If we don't tie that to a machine option then q35-2.2 would
> > > suddenly start exposing the error to the guest. That's a fairly
> > > significant change in behavior for a static machine type.
> >
> > I don't think you can classify it as a behaviour change. VM stop is not
> > guest visible behaviour.
>
> In one case, an uncorrected AER occurs and the VM is stopped by QEMU.
> In the other case, the guest is notified and may attempt corrective
> action... or maybe the guest doesn't understand AER and the user is
> depending on the previous behavior. That is absolutely a behavior
> change.
>
> > Are you worrying about guests misbehaving when they see these errors?
> > Then you want this as user-controlled, supported option.
>
> Whether the option is user visible is tangential to whether the behavior
> of existing machine types should be maintained. Existing machine types
> can impose a different default than current machine types.
>
> > In other words: we only tie things to machine types when we
> > have to. This code gets almost no testing, and is a lot of
> > work to test. This one sounds like "just in case" is not a good
> > motivation.
>
> It seems like an obvious use case for using machine types to maintain
> compatibility with previous behavior, which is exactly why we have
> machine types. If we're not going to use it, why do we have it?

We have machine types because of the following issues:
- some silent changes confuse guests.
  For example guest installed with one machine type might
  not boot if you try to use it after changing something, or -
  in case of windows - throw up warnings.
- some changes break migration

Looks like none of these cases.
If AER is unsafe, turn it off by default for everyone.

--
MST