On 04.06.2011, at 12:47, Ingo Molnar wrote:

> 
> * Alexander Graf <[email protected]> wrote:
> 
>> 
>> On 04.06.2011, at 12:35, Ingo Molnar wrote:
>> 
>>> 
>>> * Sasha Levin <[email protected]> wrote:
>>> 
>>>> On Sat, 2011-06-04 at 12:17 +0200, Ingo Molnar wrote:
>>>>> * Sasha Levin <[email protected]> wrote:
>>>>> 
>>>>>> On Sat, 2011-06-04 at 11:38 +0200, Ingo Molnar wrote:
>>>>>>> * Sasha Levin <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Coalescing MMIO allows us to avoid an exit on every MMIO
>>>>>>>> write; instead, MMIO writes are coalesced into a ring which
>>>>>>>> can be flushed once an exit is needed for a different reason.
>>>>>>>> An MMIO exit is also triggered once the ring is full.
>>>>>>>> 
>>>>>>>> Coalesce all MMIO regions registered in the MMIO mapper.
>>>>>>>> Add a coalescing handler under kvm_cpu.
>>>>>>> 
>>>>>>> Does this have any effect on latency? I.e. does the guest side 
>>>>>>> guarantee that the pending queue will be flushed after a group of 
>>>>>>> updates have been done?
>>>>>> 
>>>>>> There's nothing that detects groups of MMIO writes, but the ring
>>>>>> size is a bit less than PAGE_SIZE (half of it is overhead, the
>>>>>> rest is data) and we'll exit once the ring is full.
>>>>> 
>>>>> But if the page is only partially filled and the guest submits no 
>>>>> further mmio for an indefinite time (say it runs a lot of 
>>>>> user-space code), then the mmio remains pending in the 
>>>>> partial-page buffer?
>>>> 
>>>> We flush the ring on any exit from the guest, not just MMIO exit.
>>>> But yes, from what I understand of the code - if the buffer is only
>>>> partially full and we don't take an exit, its contents never reach
>>>> the host.
>>>> 
>>>> ioeventfds and such are making exits less common, so yes - it's possible
>>>> we won't have an exit for a while.
>>>> 
>>>>> If that's how it works then i *really* don't like this, this looks 
>>>>> like a seriously mis-designed batching feature which might have 
>>>>> improved a few server benchmarks but which will introduce random, 
>>>>> hard to debug delays all around the place!
>>> 
>>> The proper way to implement batching is not to do it blindly like 
>>> here, but to do what we do in the TLB coalescing/gather code in the 
>>> kernel:
>>> 
>>>     gather();
>>> 
>>>     ... submit individual TLB flushes ...
>>> 
>>>     flush();
>>> 
>>> That's how it should be done here too: each virtio driver that issues 
>> 
>> The world doesn't consist of virtio drivers. It also doesn't 
>> consist of only OSs and drivers that we control 100%.
> 
> So? I only inquired about latencies, asking what the impact on 
> latencies is. Regardless of the circumstances we do not want to 
> introduce unbound latencies.
> 
> If there are no unbound latencies then i'm happy.

Sure, I'm just saying that the mechanism was invented for unmodified guests :).

> 
>>> a group of MMIOs should first start batching, then issue the 
>>> individual MMIOs and then flush them.
>>> 
>>> That can be simplified to leave out the gather() phase, i.e. just 
>>> issue batched MMIOs and flush them before exiting the virtio 
>>> (guest side) driver routines.
>> 
>> This acceleration is done to speed up the host kernel<->userspace 
>> side.
> 
> Yes.
> 
>> [...] It's completely independent from the guest. [...]
> 
> Well, since user-space gets the MMIOs only once the guest exits it's 
> not independent, is it?

If we don't know when a guest ends an MMIO stream, we can't optimize it. 
Period. If we currently optimize random MMIO requests without caring when they 
finish, the following would simply break:

/* Guest rings the doorbell, then spins until the device
 * acknowledges with an interrupt: */
enable_interrupts();
writel(doorbell, KICK_ME_NOW);
while (1)
    ;

void interrupt_handler(void)
{
    break_out_of_loop();
}

And since we don't control the guest, we can't guarantee that this won't 
happen. In fact, I'd actually expect this to be a pretty normal boot loader 
pattern.

> 
>> [...] If you want to have the guest communicate fast, create an 
>> asynchronous ring and process that. And that's what virtio already 
>> does today.
>> 
>>> KVM_CAP_COALESCED_MMIO is an unsafe shortcut hack in its current 
>>> form and it looks completely unsafe.
>> 
>> I haven't tracked the history of it, but I always assumed it was 
>> used for repz mov instructions where we already know the size of 
>> mmio transactions.
> 
> That's why i asked what the effect on latencies is. If there's no 
> negative effect then i'm a happy camper.

Depends on the trade-off, really. You don't care about the latency of 
disabling an IRQ_ENABLED register, for example. You do, however, care about 
enabling it :).

Since I haven't implemented coalesced MMIO on PPC (yet - not sure it's 
possible or makes sense), I can't really comment too much on it, so I'll 
leave this to the guys who worked on it.


Alex
