On Thu, Mar 02, 2017 at 01:20:05PM +0100, Paolo Bonzini wrote:
> On 02/03/2017 12:39, James Hogan wrote:
> > It can't right now, though with relocation of the kernel now implemented
> > in MIPS Linux for KASLR, and hopes for a more generic EVA implementation
> > (which can require the kernel to be linked in a completely different
> > segment) it isn't completely infeasible.
> 
> What about the other way round, sticking a minimal T&E stub in kernel
> space and running the kernel in userspace?  Would it be feasible or
> would it be as complex as KVM itself?

You mean have a fallback in the guest kernel to keep the kernel running from
userspace addresses in kernel mode so it works in VZ guests and
non-virtualized?

Interesting idea. I think it would involve a lot of complexity. It could
forgo some of the emulation of privileged instructions that KVM T&E
does since it's running in kernel mode, but memory management would be
more complex, and invasive changes would be required to the kernel.

- Memory privilege protection is on the granularity of segments, so with
  the traditional segment layout all of USeg (0x00000000..0x7FFFFFFF) is
  accessible to user mode, so you'd still need to utilise ASIDs to
  separate the address spaces of actual user programs running in
  0x00000000..0x3FFFFFFF from the kernel code running in
  0x40000000..0x7FFFFFFF.

- USeg is always TLB mapped. That means any kernel code could trigger
  TLB exceptions, which breaks existing assumptions (e.g. normally from
  unmapped kernel segments you can disable interrupts and then
  manipulate the TLB, but that isn't safe if a TLB refill exception
  could happen at any time and clobber the TLB registers). If in the
  future we manage to work around these issues and map the kernel (for
  security/protection purposes), then it would be easier, but then we'll
  likely already have the capability to fully relocate into a different
  segment.
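
To make the first point concrete, here is a rough sketch of that split
of USeg. The enum, macro and function names are made up for
illustration; only the address ranges come from the text above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the ranges are the ones quoted above. */
#define USEG_END          0x7FFFFFFFu  /* last address of USeg */
#define GUEST_KERNEL_BASE 0x40000000u  /* kernel would run from here up */

enum region { REGION_USER, REGION_GUEST_KERNEL, REGION_OUTSIDE_USEG };

/* Which part of the split layout does a virtual address fall in? */
static enum region classify_va(uint32_t va)
{
	if (va > USEG_END)
		return REGION_OUTSIDE_USEG;
	return va >= GUEST_KERNEL_BASE ? REGION_GUEST_KERNEL : REGION_USER;
}

/*
 * Segment-granularity privilege check: user mode may touch all of USeg,
 * so this alone cannot tell REGION_USER from REGION_GUEST_KERNEL.
 */
static bool user_mode_can_access(uint32_t va)
{
	return va <= USEG_END;
}
```

Both 0x00400000 and 0x40000000 pass user_mode_can_access(), which is
why ASIDs would still be needed to keep the two address spaces apart.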

> > 1) QEMU, which I've implemented using the kvm_type machine callback.
> > This allows the KVM type to be specified with e.g.
> >   "-machine malta,accel=kvm,kvm-type=TE"
> > Otherwise it defaults to using KVM_VM_MIPS_DEFAULT.
> > 
> > When you try and load a kernel (which happens after kvm_init() has
> > already passed the kvm type into KVM_CREATE_VM) it will check that it
> > supports the current kernel type.
> >
> > 2) My kvm test application, which uses KVM_VM_MIPS_DEFAULT by default
> > and hackily maps itself into the guest physical address space to run C
> > code test cases.
> 
> So this one would work for both TE and VZ because the guest is not a
> Linux kernel.

Yes, the test code is position independent and careful to avoid direct
references to any symbols. The GPA mappings are set up the same, but the
virtual addresses (PC, stack pointer etc) are set up slightly
differently depending on whether the VZ capability is present.

> I don't know...  Instinctively I would think that it's easy to get
> KVM_VM_MIPS_DEFAULT wrong and place the VZ-and-fall-back-to-TE policy in
> userspace, but I can be convinced otherwise if the failure mode is good
> enough.

Yeh, I think I agree. It isn't really necessary to have that decision
making in the kernel, and to use a particular KVM type userspace needs
to be aware of it, so it can always figure out from capabilities
which one to use prior to KVM_CREATE_VM.

I suppose the exception is T&E. It shouldn't assume that just because VZ
is available that T&E isn't (even if that is the case right now). It
could always just try KVM_CREATE_VM with kvm type 0 and detect the error
I suppose, but capabilities are nicer.

Maybe I'll redefine KVM_CAP_MIPS_VZ a bit, such that the value returned
+ 1 is a bitmask of supported kvm types:
has T&E = !!( (v + 1) & BIT(KVM_VM_MIPS_TE) )
has VZ  = !!( (v + 1) & BIT(KVM_VM_MIPS_VZ) )

That way old kernels which return 0 are consistent, and other
implementations could be added if really necessary without confusing
userland (but fingers crossed it'll never ever be necessary).
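
A rough sketch of how userland might decode that, purely illustrative.
It assumes KVM_VM_MIPS_TE is 0 (as implied by "kvm type 0" above) and
KVM_VM_MIPS_VZ is 1, which is a guess; the helper names are made up,
and cap stands for whatever a capability check on KVM_CAP_MIPS_VZ
returned.

```c
#include <stdbool.h>

/* Assumed values for illustration only, not taken from any header. */
#define KVM_VM_MIPS_TE 0
#define KVM_VM_MIPS_VZ 1
#define BIT(n) (1u << (n))

/* Proposed encoding: capability value + 1 is a bitmask of kvm types. */
static unsigned int supported_types(int cap)
{
	return (unsigned int)cap + 1u;
}

static bool has_te(int cap)
{
	return (supported_types(cap) & BIT(KVM_VM_MIPS_TE)) != 0;
}

static bool has_vz(int cap)
{
	return (supported_types(cap) & BIT(KVM_VM_MIPS_VZ)) != 0;
}

/*
 * Hypothetical userspace policy: prefer VZ, fall back to T&E.
 * Returns the kvm type to pass to KVM_CREATE_VM, or -1 if neither
 * type is reported.
 */
static int pick_kvm_type(int cap)
{
	if (has_vz(cap))
		return KVM_VM_MIPS_VZ;
	if (has_te(cap))
		return KVM_VM_MIPS_TE;
	return -1;
}
```

With this encoding a value of 0 (old kernels) decodes as T&E only, 1 as
VZ only, and 2 as both.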

> For example, what happens if you use KVM_SET_USER_MEMORY_REGION
> for a kernel address in TE mode?

That deals with physical addresses and user/kernel memory is
distinguished by the virtual address, so the KVM mode (T&E vs VZ)
doesn't make a difference here.

Cheers
James
