On 24/07/2013 15:15, Eduardo Habkost wrote:
> On Tue, Jul 23, 2013 at 09:43:06PM +0200, Paolo Bonzini wrote:
>> On 23/07/2013 19:41, Eduardo Habkost wrote:
>>> On Tue, Jul 23, 2013 at 06:23:08PM +0200, Paolo Bonzini wrote:
>>>> On 23/07/2013 17:40, Eduardo Habkost wrote:
>>>>> On Tue, Jul 23, 2013 at 05:09:02PM +0200, Paolo Bonzini wrote:
>>>>>> On 23/07/2013 16:13, Eduardo Habkost wrote:
>>>>>>> On Tue, Jul 23, 2013 at 11:18:03AM +0200, Paolo Bonzini wrote:
>>>>>>>> On 22/07/2013 21:25, Eduardo Habkost wrote:
>>>>>>>>> Bug description: QEMU currently gets all bits from GET_SUPPORTED_CPUID
>>>>>>>>> for CPUID leaf 0xA and passes them directly to the guest. This makes
>>>>>>>>> the guest ABI depend on host kernel and host CPU capabilities, and
>>>>>>>>> breaks live migration if we migrate between hosts with different
>>>>>>>>> capabilities (e.g. different number of PMU counters).
>>>>>>>>>
>>>>>>>>> This patch adds a "pmu-passthrough" property to X86CPU, and sets it to
>>>>>>>>> true only on "-cpu host", or on pc-*-1.5 and older machine-types.
>>>>>>>>
>>>>>>>> Can we just call the property "pmu"? It doesn't have to be passthrough.
>>>>>>>
>>>>>>> Yes, but the only options we have today are "no PMU" and "passthrough
>>>>>>> PMU". I wouldn't like to make "pmu=on" enable the passthrough behavior
>>>>>>> implicitly (I don't want things that break live-migration to be enabled
>>>>>>> without making it explicit that it is a host-dependent/passthrough
>>>>>>> mode).
>>>>>>
>>>>>> I think "passthrough PMU" should be considered a bug except of course
>>>>>> with "-cpu host".
>>>>>>
>>>>>> If "-cpu Nehalem,pmu=on" goes from passthrough to Nehalem-compatible in
>>>>>> a future QEMU release, that'll be a bugfix.
>>>>>
>>>>> Exactly. But then I don't understand your suggestion. We still need a
>>>>> property to enable passthrough behavior on old machine-types (not
>>>>> perfect, but a best-effort way to try to keep compatibility),
>>>>
>>>> Do we?
>>>>
>>>> We only need "pmu=on"---which right now is buggy on old machine types
>>>> because it will always passthrough.
>>>
>>> I am not sure I understand what you are arguing for.
>>>
>>> You agree that pmu=on needs to keep the buggy passthrough behavior on
>>> pc-1.5 and older, right?
>>
>> I agree it needs to remain enabled on 1.5. But if, for example, 1.8
>> makes pmu=on emulate a Nehalem-compatible PMU, I think it is fine if
>> pc-1.5 moves from a host-compatible PMU to a Nehalem-compatible PMU.
>
> That's where I disagree. Today users are (luckily) able to migrate
> safely between hosts with the same number of PMU counters. But if we
> make, e.g., "qemu-1.6 -machine pc-1.5 -cpu Westmere" present a smaller
> number of PMU counters than "qemu-1.5 -machine pc-1.5 -cpu Westmere" on
> the same host, we will break an existing setup where everything was
> working before, which is something we could have easily avoided.

But at the same time we will fix live migration from a Sandy Bridge host
to a Westmere. So it's a choice we have to make anyway.

> (Just to clarify what breaking this means in practice: changing the
> number of PMU counters under the guest on live-migration means the guest
> will crash when trying to use counters that suddenly went away, and it
> may crash a very long time after it was migrated.)

And at the same time we fix live migration of a Sandy Bridge to a
Westmere.

>> The reason is that pc-1.5 has never guaranteed any feature of the
>> emulated PMU.
>
> Right, current behavior is buggy and we never guaranteed anything, but
> IMO we shouldn't break on purpose something that is working today.

Even if it is to fix something else?

Paolo
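
For readers following the thread, here is a minimal sketch of what "different number of PMU counters" means at the CPUID level. It decodes the architectural performance monitoring leaf (0xA) using the field layout from the Intel SDM and compares what two hosts would expose under passthrough. The names `PmuInfo`, `decode_cpuid_0xa` and `guest_abi_changes` are hypothetical and are not QEMU code, and the register values are illustrative rather than taken from any particular CPU model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PmuInfo:
    """Subset of the fields reported by CPUID leaf 0xA (hypothetical helper type)."""
    version: int             # EAX[7:0]   architectural PMU version
    num_gp_counters: int     # EAX[15:8]  general-purpose counters per logical CPU
    gp_counter_width: int    # EAX[23:16] bit width of general-purpose counters
    num_fixed_counters: int  # EDX[4:0]   fixed-function counters


def decode_cpuid_0xa(eax: int, edx: int) -> PmuInfo:
    """Extract the PMU fields from raw EAX/EDX values of CPUID leaf 0xA."""
    return PmuInfo(
        version=eax & 0xFF,
        num_gp_counters=(eax >> 8) & 0xFF,
        gp_counter_width=(eax >> 16) & 0xFF,
        num_fixed_counters=edx & 0x1F,
    )


def guest_abi_changes(src: PmuInfo, dst: PmuInfo) -> bool:
    """With passthrough, the guest sees the host's leaf 0xA verbatim, so any
    difference between source and destination changes the guest ABI."""
    return src != dst


# Illustrative register values only (assumptions, not measured on real hosts).
host_a = decode_cpuid_0xa(eax=0x07300803, edx=0x00000603)  # 8 GP counters
host_b = decode_cpuid_0xa(eax=0x07300403, edx=0x00000603)  # 4 GP counters

print(host_a.num_gp_counters, host_b.num_gp_counters)
print("guest ABI changes on migration:", guest_abi_changes(host_a, host_b))
```

Under this simplified model, migrating between two hosts with identical leaf 0xA contents is the case that works today, while a guest that started on `host_a` would lose counters when moved to `host_b`.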