On 04.06.2011, at 12:47, Ingo Molnar wrote:
>
> * Alexander Graf <[email protected]> wrote:
>
>>
>> On 04.06.2011, at 12:35, Ingo Molnar wrote:
>>
>>>
>>> * Sasha Levin <[email protected]> wrote:
>>>
>>>> On Sat, 2011-06-04 at 12:17 +0200, Ingo Molnar wrote:
>>>>> * Sasha Levin <[email protected]> wrote:
>>>>>
>>>>>> On Sat, 2011-06-04 at 11:38 +0200, Ingo Molnar wrote:
>>>>>>> * Sasha Levin <[email protected]> wrote:
>>>>>>>
>>>>>>>> Coalescing MMIO allows us to avoid an exit every time we have an
>>>>>>>> MMIO write; instead, MMIO writes are coalesced into a ring which
>>>>>>>> can be flushed once an exit for a different reason is needed.
>>>>>>>> An MMIO exit is also triggered once the ring is full.
>>>>>>>>
>>>>>>>> Coalesce all MMIO regions registered in the MMIO mapper.
>>>>>>>> Add a coalescing handler under kvm_cpu.
>>>>>>>
>>>>>>> Does this have any effect on latency? I.e. does the guest side
>>>>>>> guarantee that the pending queue will be flushed after a group of
>>>>>>> updates have been done?
>>>>>>
>>>>>> There's nothing that detects groups of MMIO writes, but the ring
>>>>>> size is a bit less than PAGE_SIZE (half of it is overhead - the
>>>>>> rest is data) and we'll exit once the ring is full.
>>>>>
>>>>> But if the page is only filled partially and if mmio is not submitted
>>>>> by the guest indefinitely (say it runs a lot of user-space code) then
>>>>> the mmio remains pending in the partial-page buffer?
>>>>
>>>> We flush the ring on any exit from the guest, not just MMIO exit.
>>>> But yes, from what I understand from the code - if the buffer is only
>>>> partially full and we don't take an exit, its contents never reach
>>>> the host.
>>>>
>>>> ioeventfds and such are making exits less common, so yes - it's possible
>>>> we won't have an exit in a while.
>>>>
>>>>> If that's how it works then i *really* don't like this, this looks
>>>>> like a seriously mis-designed batching feature which might have
>>>>> improved a few server benchmarks but which will introduce random,
>>>>> hard to debug delays all around the place!
>>>
>>> The proper way to implement batching is not to do it blindly like
>>> here, but to do what we do in the TLB coalescing/gather code in the
>>> kernel:
>>>
>>> gather();
>>>
>>> ... submit individual TLB flushes ...
>>>
>>> flush();
>>>
>>> That's how it should be done here too: each virtio driver that issues
>>
>> The world doesn't consist of virtio drivers. It also doesn't
>> consist of only OSs and drivers that we control 100%.
>
> So? I only inquired about latencies, asking what the impact on them
> is. Regardless of the circumstances we do not want to introduce
> unbounded latencies.
>
> If there are no unbounded latencies then i'm happy.
Sure, I'm just saying that the mechanism was invented for unmodified guests :).
>
>>> a group of MMIOs should first start batching, then issue the
>>> individual MMIOs and then flush them.
>>>
>>> That can be simplified to leave out the gather() phase, i.e. just
>>> issue batched MMIOs and flush them before exiting the virtio
>>> (guest side) driver routines.
>>
>> This acceleration is done to speed up the host kernel<->userspace
>> side.
>
> Yes.
>
>> [...] It's completely independent from the guest. [...]
>
> Well, since user-space gets the MMIOs only once the guest exits it's
> not independent, is it?
If we don't know when a guest ends an MMIO stream, we can't optimize it.
Period. If we optimize random MMIO requests without caring when they
finish, the following would simply break:

    enable_interrupts();
    writel(KICK_ME_NOW, doorbell);
    while (1)
        ;

    void interrupt_handler(void)
    {
        break_out_of_loop();
    }

And since we don't control the guest, we can't guarantee this won't
happen. In fact, I'd actually expect this to be a pretty normal boot
loader pattern.
>
>> [...] If you want to have the guest communicate fast, create an
>> asynchronous ring and process that. And that's what virtio already
>> does today.
>>
>>> KVM_CAP_COALESCED_MMIO is an unsafe shortcut hack in its current
>>> form and it looks completely unsafe.
>>
>> I haven't tracked the history of it, but I always assumed it was
>> used for rep movs instructions where we already know the size of
>> the MMIO transactions.
>
> That's why i asked what the effect on latencies is. If there's no
> negative effect then i'm a happy camper.
Depends on the trade-off, really. You don't care about the latency of a
write that disables an IRQ_ENABLED register, for example. You do however
care about the one that enables it :).
Since I haven't implemented coalesced MMIO on PPC (yet - not sure it's
possible or makes sense), I can't really comment too much on it, so I'll
leave this to the guys who worked on it.
Alex
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to [email protected]
More majordomo info at http://vger.kernel.org/majordomo-info.html