2013-12-14 11:46+0200, Gleb Natapov:
> On Fri, Dec 13, 2013 at 05:07:54PM +0100, Radim Krčmář wrote:
> > 2013-12-12 21:36+0100, Paolo Bonzini:
> > > From: Gleb Natapov <g...@redhat.com>
> > > 
> > > A guest can cause a BUG_ON() leading to a host kernel crash.
> > > When the guest writes to the ICR to request an IPI, while in x2apic
> > > mode the following things happen, the destination is read from
> > > ICR2, which is a register that the guest can control.
> > > 
> > > kvm_irq_delivery_to_apic_fast uses the high 16 bits of ICR2 as the
> > > cluster id.  A BUG_ON is triggered, which is a protection against
> > > accessing map->logical_map with an out-of-bounds access and manages
> > > to avoid that anything really unsafe occurs.
> > > 
> > > The logic in the code is correct from real HW point of view. The problem
> > > is that KVM supports only one cluster with ID 0 in clustered mode, but
> > > the code that has the bug does not take this into account.
> > 
> > The more I read about x2apic, the more confused I am ...
> > 
> >  - How was the cluster x2apic enabled?
> > 
> >    Linux won't enable cluster x2apic without interrupt remapping and I
> >    had no idea we were allowed to do it.
> > 
> Malicious code can do it.
> 
> >  - A hardware test-suite found this?
> > 
> >    This bug can only be hit when the destination cpu is > 256, so the
> >    request itself is buggy -- we don't support that many in kvm and it
> >    would crash when initializing the vcpus if we did.
> >    => It looks like we should just ignore the ipi, because we have no
> >       vcpus in that cluster.
> > 
> That's the nature of malicious code: it does what your code does not expects
> it to do :)

I was wondering if there wasn't malicious linux on the other side too :)

> >  - Where does the 'only one supported cluster' come from?
> > 
> "only one supported cluster" comes from 8 bit cpuid limitation of KVM's x2apic
> implementation. With 8 bit cpuid you can only address cluster 0 in logical 
> mode.

One x2apic cluster has 16 cpus and we generate the x2apic LDR correctly,
so 8 bit cpuid can address first 16 clusters as well.

  u32 ldr = ((id >> 4) << 16) | (1 << (id & 0xf));

> >    I only see we use 'struct kvm_lapic *logical_map[16][16];', which
> >    supports 16 clusters of 16 apics = first 256 vcpus, so if we map
> >    everything to logical_map[0][0:15], we would not work correctly in
> >    the cluster x2apic, with > 16 vcpus.
> > 
> Such config cannot work today because of 8 bit cpuid limitation. When the 
> limitation
> will be removed KMV_X2APIC_CID_BITS will be set to actual number of bits we 
> want to support.

Even with KMV_X2APIC_CID_BITS = 4, which would allow us to support 8 bit
cpuid, we would still deliver interrupts destined for cpuid > 256 to
potentially plugged cpus.

> It will likely be smaller then 16 bit though since full 18 bit support will 
> require huge tables.

Yeah, we'll have to do dynamic allocation if we are ever going to need
the full potential of x2apic.
(2^20-16 cpus in cluster and 2^32-1 in flat mode)
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/

Reply via email to