On 06.03.2013, at 15:48, Alexander Graf wrote:
>
> On 06.03.2013, at 15:41, Gleb Natapov wrote:
>
>> On Wed, Mar 06, 2013 at 03:03:53PM +0100, Alexander Graf wrote:
>>>
>>> On 06.03.2013, at 14:56, Gleb Natapov wrote:
>>>
>>>> On Wed, Mar 06, 2013 at 02:22:15PM +0100, Alexander Graf wrote:
>>>>>
>>>>> On 06.03.2013, at 14:14, Gleb Natapov wrote:
>>>>>
>>>>>> On Wed, Mar 06, 2013 at 01:20:39PM +0100, Alexander Graf wrote:
>>>>>>>> The problem would only start if KVM_SET_IRQCHIP_TYPE (new name of
>>>>>>>> KVM_CREATE_IRQCHIP_ARGS) forced you to later call KVM_CREATE_DEVICE.
>>>>>>>
>>>>>>> Ah, I see. I don't see why it would. The fact that there is a "LAPIC"
>>>>>>> doesn't mean that the per-vcpu SET_INTERRUPT ioctl stops working. So if
>>>>>>> SET_IRQCHIP_TYPE(!none) breaks user-space interrupt controller
>>>>>>> emulation I would consider that a bug.
>>>>>>>
>>>>>> For x86 this is the case though. I do not see how it can't be. If
>>>>>> LAPIC is emulated in userspace SET_INTERRUPT is used to pass IRQ
>>>>>> vector that should be handled as a result of LAPIC emulation.
>>>>>
>>>>> So SET_INTERRUPT on a vcpu triggers a line on the LAPIC emulation in that
>>>>> vcpu? For us it directly controls the CPU interrupt pin.
>>>>>
>>>> No, SET_INTERRUPT on a vcpu tells vcpu to which vector in IDT it needs to
>>>> jump immediately. LAPIC is really part of a cpu and we cut it and put into
>>>> userspace, so interface between userspace LAPIC emulation is really low
>>>> level and has to be synchronous. X86 has two interrupt lines NMI and INTR
>>>> and we do not have interface to trigger the latter. KVM_IRQ_LINE works on
>>>> GSI lines which do not go into CPU directly. They go either via PIC (which
>>>> triggers INTR or APIC LINT0) or via IOAPIC which on real HW communicates
>>>> with APICs via bus, but in our emulation just calls APICs directly.
>>>
>>> Great :). It's similar for us. SET_INTERRUPT directly asserts the INTR line
>>> of the vcpu. There is nothing like an IDT on PPC, so external interrupts
>>> simply arrive at a specific vector. That vector can differ for critical or
>>> NMI interrupts IIRC, but I'm not sure we implement that right now. If so,
>>> it'd be a different line for SET_INTERRUPT.
>>>
>>> So in a way, it's the same. And SET_INTERRUPT should work regardless of
>>> whether a LAPIC is used or not really. At least it would for us :).
>>>
>> Is it possible for some devices to inject interrupt directly and other
>> to go through interrupt controller?
>
> It would be racy if both assert + deassert the same line, but I don't see why
> we should keep anyone from doing it. If user space wants to run such a
> configuration, it needs to ensure that only one of the 2 is actively used at
> any given time.
>
>>> KVM_IRQ_LINE is basically an IOAPIC interrupt line assert. That's fine.
>>> That ioctl should get an ioapic device handle to work on. Whether we call
>>> the IOAPIC PINs GSIs or something different is really just a naming
>>> question. I'd probably call it IRQ number :).
>> Yes and no. On sane archs we can call it IRQ number (lucky you!), but on
>> X86 there is a GSI that can be IRQ2 if it goes through IOAPIC and IRQ0
>> if it goes through PIC, so additional entity was invented: irq routing.
>> It maps between GSI and irqchip pins. Same GSI may go to more than one
>> irqchip. This is why for x86 having irqchip device handle as a parameter
>> to KVM_IRQ_LINE does not make sense. It makes sense to provide it to irq
>> router and this is how it works now except that "device handlers" are
>> hard coded.
>
> Then you would create a new "irq router" device that does the multiplexing
> and can also receive IRQs. You could then directly assert an IOAPIC/PIC line
> or a multiplexer line. Or am I misunderstanding something?
>
>>
>>> But it's the same idea. The "IOAPIC" would then talk to the in-kernel
>>> "LAPIC" style bits (or in case of the MPIC just integrate them inside of
>>> itself). That's why by the time we create an "IOAPIC", the "LAPIC"s in the
>>> system have to be populated.
>> The restriction that LAPIC has to be created before IOAPIC would be a
>> bug that needs to be fixed on X86. The reason is cpu hotplug. If you have
>> to support cpu hotplug you have to be able to create LAPICs after IOAPIC
>> and at this point you can create IOAPIC before any LAPICs as well. I
>> understand this may not be the case for all architectures right now, but
>> something to keep in mind.
>
> Paul, Scott, do you think we can move the "this CPU can receive interrupts
> from MPIC / XICS" part into an ENABLE_CAP that gets set dynamically? That
> ENABLE_CAP would allocate the structures in the vcpu and register the vcpu
> with the interrupt controller pool.
>
> The interrupt controller device creation would still iterate through all
> vcpus to find the ones that match so that we support the ENABLE_CAP at any
> point in time.

Actually, thinking about this a bit more: if we had explicit interrupt
connections, user space would take care of all this:

  -- machine init --

  for (i = 0; i < smp_cpus; i++) {
      cpus[i] = create_cpu();
  }
  mpic = create_device(DEVICE_MPIC);
  for (i = 0; i < smp_cpus; i++) {
      enable_cap(cpus[i], CAP_MPIC_LISTENER);
      mpic_hook_up_irqline(mpic, i, cpus[i]);
  }

  -- hotplug add --

  cpus[i] = create_cpu();
  enable_cap(cpus[i], CAP_MPIC_LISTENER);
  mpic_hook_up_irqline(mpic, i, cpus[i]);

Then we don't care about any ordering at all anymore from KVM's perspective.

Alternatively, the above code could of course live inside kvm as well.
create_vcpu() would then have to register itself with "the interrupt
controller" to allow for hotplug.

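To make the user-space variant a bit more concrete, here is a rough,
self-contained sketch in plain C. It only models the bookkeeping. None of
the names below (struct vcpu, struct mpic, enable_mpic_listener(),
mpic_hook_up_irqline(), demo()) are existing KVM interfaces; they are
invented for illustration, and the "capability" is just a flag on the vcpu.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 8

/* Hypothetical model of a vcpu; not a real KVM structure. */
struct vcpu {
    int id;
    int mpic_listener;  /* set once the (made-up) listener cap is enabled */
};

/* Hypothetical MPIC model: one explicit output line per vcpu. */
struct mpic {
    struct vcpu *irqline[MAX_CPUS];
};

static struct vcpu cpus[MAX_CPUS];
static int nr_cpus;

/* Allocate the next vcpu slot; no interrupt controller knowledge needed. */
static struct vcpu *create_cpu(void)
{
    struct vcpu *v;

    assert(nr_cpus < MAX_CPUS);
    v = &cpus[nr_cpus];
    v->id = nr_cpus++;
    v->mpic_listener = 0;
    return v;
}

/* Stand-in for enable_cap(vcpu, CAP_MPIC_LISTENER). */
static void enable_mpic_listener(struct vcpu *v)
{
    v->mpic_listener = 1;
}

/*
 * Connect MPIC output `line` to a vcpu. Returns 0 on success, -1 if the
 * line is out of range or the vcpu has not opted in as a listener.
 */
static int mpic_hook_up_irqline(struct mpic *m, int line, struct vcpu *v)
{
    if (line < 0 || line >= MAX_CPUS || !v->mpic_listener)
        return -1;
    m->irqline[line] = v;
    return 0;
}

/* Machine init plus one hotplug add; returns the number of connected lines. */
static int demo(int smp_cpus)
{
    struct mpic m = { { NULL } };
    struct vcpu *v;
    int i, connected = 0;

    nr_cpus = 0;

    /* -- machine init -- */
    for (i = 0; i < smp_cpus; i++)
        create_cpu();
    for (i = 0; i < smp_cpus; i++) {
        enable_mpic_listener(&cpus[i]);
        connected += (mpic_hook_up_irqline(&m, i, &cpus[i]) == 0);
    }

    /* -- hotplug add: same two steps, no ordering constraint -- */
    v = create_cpu();
    enable_mpic_listener(v);
    connected += (mpic_hook_up_irqline(&m, v->id, v) == 0);

    return connected;
}
```

The point of the sketch is that the controller never has to walk the vcpu
list: every connection is made explicitly by the caller, so the hotplug path
is the same two calls as the init path.
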
Alex