On 14.08.2013, at 22:56, Christoffer Dall wrote:

> On Wed, Aug 14, 2013 at 10:47:08PM +0200, Alexander Graf wrote:
>>
>> On 14.08.2013, at 22:33, Christoffer Dall wrote:
>>
>>> On Wed, Aug 14, 2013 at 09:28:10PM +0200, Alexander Graf wrote:
>>>>
>>>> On 14.08.2013, at 20:28, Christoffer Dall wrote:
>>>>
>>>>> On Wed, Aug 14, 2013 at 08:21:54PM +0200, Alexander Graf wrote:
>>>>>>
>>>>>> On 14.08.2013, at 20:18, Christoffer Dall wrote:
>>>>>>
>>>>>>> On Wed, Aug 14, 2013 at 07:44:25PM +0200, Alexander Graf wrote:
>>>>>>>>
>>>>>>>> On 14.08.2013, at 19:39, Christoffer Dall wrote:
>>>>>>>>
>>>>>>>>> On Wed, Aug 14, 2013 at 07:31:59PM +0200, Alexander Graf wrote:
>>>>>>>>>>
>>>>>>>>>> On 14.08.2013, at 19:26, Christoffer Dall wrote:
>>>>>>>>>>
>>>>>>>>>>> On Wed, Aug 14, 2013 at 11:30:46AM +0200, Alexander Graf wrote:
>>>>>>>>>>>>
>>>>>>>>>>>> On 14.08.2013, at 11:23, Peter Maydell wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> On 14 August 2013 10:11, Alexander Graf <ag...@suse.de> wrote:
>>>>>>>>>>>>>> You're right, the main difference is that KVM doesn't have any
>>>>>>>>>>>>>> idea what a "host" style CPU is. It only knows how to report to
>>>>>>>>>>>>>> QEMU what the current host CPU would be, so that anything from
>>>>>>>>>>>>>> VCPU_INIT onwards is 100% identical regardless of whether the
>>>>>>>>>>>>>> user said -cpu host or -cpu xxx.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm still puzzled how this will work with big.LITTLE btw.
>>>>>>>>>>>>>
>>>>>>>>>>>>> The rough idea is that for big.LITTLE the kernel must trap the
>>>>>>>>>>>>> ID registers at least (so that the vcpu seems consistent to the
>>>>>>>>>>>>> guest whether it's running on the big or the little core). For
>>>>>>>>>>>>> "-cpu host" the guest would see whatever is the most low-overhead
>>>>>>>>>>>>> for the kernel to provide (i.e. assuming the big and little CPUs
>>>>>>>>>>>>> are roughly similar, you could make -cpu host provide something
>>>>>>>>>>>>> that looks to the guest like the big CPU and not have to trap
>>>>>>>>>>>>> quite as much as you would for providing a vcpu that wasn't the
>>>>>>>>>>>>> same as either the big or little one).
>>>>>>>>>>>>
>>>>>>>>>>>> So -cpu host in this case wouldn't actually expose the host CPU
>>>>>>>>>>>> 1:1, but instead a Cortex-A15 even when it's run on an A7
>>>>>>>>>>>> big.LITTLE core. I see.
>>>>>>>>>>>>
>>>>>>>>>>> Yes, from the discussion we've had the whole picture just becomes
>>>>>>>>>>> too blurry when you start presenting multiple different CPUs to
>>>>>>>>>>> the guest, and there's really no need for that that I'm aware of.
>>>>>>>>>>>
>>>>>>>>>>> In fact the -cpu host case fits quite nicely with this state of
>>>>>>>>>>> mind; the kernel is free to decide, based on the specific hardware
>>>>>>>>>>> and config on which it's running, how to handle VMs on BL.
>>>>>>>>>>
>>>>>>>>>> So why not have a vm ioctl to fetch the "best match" vcpu type? I
>>>>>>>>>> don't like the idea of adding any awareness of a "host" type to
>>>>>>>>>> the normal vcpu creation process.
>>>>>>>>>>
>>>>>>>>> That's actually what I suggested initially. I'm not really a QEMU
>>>>>>>>> expert, but I think Peter already answered this question: he
>>>>>>>>> doesn't want to support hundreds of CPU models in QEMU just to be
>>>>>>>>> able to run KVM when it's not necessary.
>>>>>>>>>
>>>>>>>>> If his argument holds, namely that you can support -cpu host
>>>>>>>>> without having a model for that specific CPU in QEMU, then it is
>>>>>>>>> indeed a strong argument, and we have the problem with ARMv8
>>>>>>>>> already, and this goes a long way to solve that. No?
>>>>>>>>
>>>>>>>> That's up to QEMU to decide. With the "fetch and push" model we can
>>>>>>>> support both flavors from user space. It also makes the kernel side
>>>>>>>> more reproducible and obvious. There's simply no way to add hacks
>>>>>>>> like "if I'm a -cpu host type, do xxx" in KVM, because KVM never
>>>>>>>> knows that it is running -cpu host.
>>>>>>>>
>>>>>>> Do we have historical examples of this knowledge being abused inside
>>>>>>> the kernel for other archs? If not, can we come up with a technical
>>>>>>> scenario where it may happen on ARM?
>>>>>>
>>>>>> if (cpu == host_cpu) {
>>>>>>         vgic_version = get_host_vgic_version();
>>>>>> }
>>>>>>
>>>>>> would be an example :).
>>>>>
>>>>> Not really, this is driven from user space, but ok.
>>>>>
>>>>>> Everything -cpu host does has to be reproducible without -cpu host,
>>>>>> otherwise our compatibility layering is broken. So why not model the
>>>>>> API like that from the beginning?
>>>>>>
>>>>>>>
>>>>>>> Also, I'm not really sure such code should be controlled through the
>>>>>>> user space API; ideally we would catch bad coding behavior in the
>>>>>>> kernel during code review.
>>>>>>>
>>>>>>> The only reason I originally suggested the "fetch and push" model
>>>>>>> was that I thought user space would need to know the specific CPU
>>>>>>> model for things to work, and possibly for things like debugging and
>>>>>>> migration, but since I have been almost convinced otherwise, I don't
>>>>>>> see any real technical arguments for not adding -cpu host support on
>>>>>>> the kernel side.
>>>>>>>
>>>>>>> Note that this doesn't prevent us from adding an ioctl later that
>>>>>>> gives you the host CPU type in KVM terminology if we find it useful.
>>>>>>> But I think the reduced headache with ARMv8 right now is a good
>>>>>>> argument to proceed with Peter's RFC and kernel support for same.
>>>>>>
>>>>>> There's really almost no difference from the QEMU point of view if
>>>>>> Peter chooses to implement it the way he does today. He can ask the
>>>>>> kernel for the vcpu target and pass that exact number back into the
>>>>>> kernel.
>>>>>>
>>>>>>
>>>>> From the kernel point of view, though, we have to make some informed
>>>>> decision about which "best CPU target" value to return on any given
>>>>> new core
>>>>
>>>> We have to make that decision internally anyway, because we have to
>>>> choose some CPU target for the host one.
>>>>
>>>
>>> Are we sure that will always be the case? That's how it's structured
>>> now, yes, but maybe we can do something more intelligent (which is what
>>> I meant by "generic handling" below). It's a bit fuzzy for me to think
>>> about right now, but I just want to make sure we don't shoot ourselves
>>> in the foot with the choice of ABI.
>>
>> Exactly that cleverness is what I'd prefer to avoid, as it breaks
>> reproducibility in cross-chip environments.
>>
>>>
>>>>> , where TARGET_HOST may simply work through generic handling of ID
>>>>> registers, traps, etc. and provide better performance than, say, "I
>>>>> don't really know this host CPU, so I'm just going to tell you A15
>>>>> and trap everything"...
>>>>
>>>> Yes.
>>>>
>>>> target_vcpu_id = kvm_vm_ioctl(KVM_VM_GET_BEST_CPU_TARGET);
>>>>
>>>> /* Old kernels only support A15 hosts */
>>>> if (target_vcpu_id < 0)
>>>>         target_vcpu_id = VCPU_CORTEX_A15;
>>>>
>>>> kvm_vcpu_ioctl(vcpu_fd, KVM_INIT_VCPU, target_vcpu_id);
>>>>
>>>
>>> I get this part, but imagine the kernel not knowing the target_cpu id,
>>> but just passing through whatever the hardware gives you to the guest.
>>
>> Don't you have to handle core-specific registers anyway?
>>
>>> I'm not saying that's necessarily going to happen or that it would be a
>>> great thing, but do we want to prevent this from ever happening through
>>> our choice of ABI?
>>
>> I think so, yes. Can you run Linux on a core that hasn't been enabled?
>> Why should you be able to run KVM on a core that hasn't been enabled?
>> I'm not talking about QEMU here - that one should be happy to be
>> ignorant. But the kernel side needs to know about the core either way,
>> no?
>>
>>
> I guess so, maybe. So far we haven't seen a lot of cores, so we
> definitely want to know about the specific cores that we want to
> support.
>
> But, if there really are going to be hundreds, or thousands, or
> hundreds of thousands of virtualization-enabled ARM cores (ok, maybe
> not) then
We're talking about cores here. My take is that we're rather looking at
dozens at most.

> maybe we want to come up with a way to not add code in KVM for every
> new core we wish to support. But then again, if ARM achieves true world
> domination and inflicts us with that many core types, we can always add
> a new ABI.

If the need really emerges, we can always still add a TARGET_HOST target
type that the enumeration function returns. But I don't think we'll ever
hit that case.

> So at this point, I don't really care if we do it one way or another.

Great :).


Alex
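
[Editor's note: for illustration, below is a minimal, self-contained sketch
of the "fetch and push" flow discussed in the thread. The ioctl names
KVM_VM_GET_BEST_CPU_TARGET and KVM_INIT_VCPU (and the ioctl numbers and the
VCPU_CORTEX_A15 value) are hypothetical placeholders taken from the
pseudocode above, not a released kernel ABI; only /dev/kvm, KVM_CREATE_VM
and KVM_CREATE_VCPU are the standard KVM interfaces. The point it shows:
user space queries the kernel once for the best-match vcpu target, falls
back to Cortex-A15 on older kernels, and pushes the exact same value back
into vcpu init, so KVM itself never learns whether the user asked for
-cpu host. Error checking is largely omitted.]

/* Sketch only: the target-query and vcpu-init ioctls below are hypothetical
 * placeholders mirroring the thread's pseudocode. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

#ifndef KVM_VM_GET_BEST_CPU_TARGET
#define KVM_VM_GET_BEST_CPU_TARGET _IO(KVMIO, 0xb0)      /* placeholder */
#endif
#ifndef KVM_INIT_VCPU
#define KVM_INIT_VCPU              _IOW(KVMIO, 0xb1, int) /* placeholder */
#endif

#define VCPU_CORTEX_A15 0   /* placeholder target id for the fallback path */

int main(void)
{
    int kvm_fd  = open("/dev/kvm", O_RDWR);
    int vm_fd   = ioctl(kvm_fd, KVM_CREATE_VM, 0);
    int vcpu_fd = ioctl(vm_fd, KVM_CREATE_VCPU, 0);
    int target;

    /* Ask the kernel which vcpu target best matches the host core. */
    target = ioctl(vm_fd, KVM_VM_GET_BEST_CPU_TARGET, 0);
    if (target < 0)
        target = VCPU_CORTEX_A15;  /* old kernels only support A15 hosts */

    /* Push the same value straight back in; whether the user said
     * -cpu host or -cpu cortex-a15 is invisible to the kernel. */
    if (ioctl(vcpu_fd, KVM_INIT_VCPU, target) < 0)
        perror("vcpu init");

    return 0;
}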