Re: [Qemu-devel] [RFC] create a single workqueue for each vm to update vm irq routing table

Zhanghaoyu (A) Fri, 29 Nov 2013 18:48:35 -0800

>On Tue, Nov 26, 2013 at 06:14:27PM +0200, Gleb Natapov wrote:
>> On Tue, Nov 26, 2013 at 06:05:37PM +0200, Michael S. Tsirkin wrote:
>> > On Tue, Nov 26, 2013 at 02:56:10PM +0200, Gleb Natapov wrote:
>> > > On Tue, Nov 26, 2013 at 01:47:03PM +0100, Paolo Bonzini wrote:
>> > > > On 26/11/2013 13:40, Zhanghaoyu (A) wrote:
>> > > > > When the guest sets irq smp_affinity, a VMEXIT occurs and the vcpu
>> > > > > thread returns from the hypervisor to QEMU via ioctl. The vcpu
>> > > > > thread then asks the hypervisor to update the irq routing table.
>> > > > > kvm_set_irq_routing calls synchronize_rcu, so the vcpu thread is
>> > > > > blocked for a long time waiting for the RCU grace period. During
>> > > > > this period the vcpu cannot service the VM, so interrupts
>> > > > > delivered to this vcpu cannot be handled in time, and the apps
>> > > > > running on this vcpu cannot be serviced either.
>> > > > > This is unacceptable in some real-time scenarios, e.g. telecom.
>> > > > > 
>> > > > > So I want to create a single workqueue for each VM to perform the
>> > > > > RCU synchronization for the irq routing table asynchronously, and
>> > > > > let the vcpu thread return and VMENTRY to service the VM
>> > > > > immediately, with no need to block waiting for the RCU grace period.
>> > > > > I have implemented a rough patch and tested it in our telecom
>> > > > > environment; the problem above disappeared.
>> > > > 
>> > > > I don't think a workqueue is even needed.  You just need to use 
>> > > > call_rcu to free "old" after releasing kvm->irq_lock.
>> > > > 
>> > > > What do you think?
>> > > > 
>> > > It should be rate limited somehow. Since it is guest-triggerable,
>> > > a guest may cause the host to allocate a lot of memory this way.
>> > 
>> > The checks in __call_rcu() should handle this, I think.  These keep 
>> > a per-CPU counter, which can be adjusted via rcutree.blimit, which 
>> > defaults to taking evasive action if more than 10K callbacks are 
>> > waiting on a given CPU.
>> > 
>> > 
>> Documentation/RCU/checklist.txt has:
>> 
>>         An especially important property of the synchronize_rcu()
>>         primitive is that it automatically self-limits: if grace periods
>>         are delayed for whatever reason, then the synchronize_rcu()
>>         primitive will correspondingly delay updates.  In contrast,
>>         code using call_rcu() should explicitly limit update rate in
>>         cases where grace periods are delayed, as failing to do so can
>>         result in excessive realtime latencies or even OOM conditions.
>
>I just asked Paul what this means.


My understanding is as follows:
The synchronous grace-period API synchronize_rcu() blocks the calling thread 
until the grace period ends, so that thread cannot generate a large number of 
further RCU updates in the meantime. This is the "self-limiting" property 
described above in Documentation/RCU/checklist.txt, and it avoids memory 
exhaustion. The asynchronous API call_rcu() does not limit the update rate, so 
the caller needs to rate-limit explicitly.
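
To illustrate what explicit rate limiting could look like, here is a rough
userspace sketch. It is only a model and is not kernel code or part of any
patch: struct updater, MAX_PENDING, grace_period_elapsed(), update_routing()
and run_demo() are all hypothetical names, and the grace period is simulated
by a plain function call. The idea is to defer freeing the old table (as
call_rcu() would) while few frees are pending, and to fall back to waiting
(as synchronize_rcu() would) once a cap is reached.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cap on deferred frees; not a real kernel tunable. */
#define MAX_PENDING 4

struct routing_table {
    int version;
};

/* Hypothetical per-VM bookkeeping for old tables awaiting a grace period. */
struct updater {
    struct routing_table *pending[MAX_PENDING];
    int n_pending;    /* old tables not yet freed */
    int n_sync_waits; /* times the updater had to block */
    int n_freed;      /* old tables freed so far */
};

/* Stand-in for the end of a grace period: every deferred free runs. */
static void grace_period_elapsed(struct updater *u)
{
    for (int i = 0; i < u->n_pending; i++) {
        free(u->pending[i]);
        u->n_freed++;
    }
    u->n_pending = 0;
}

/*
 * Install a new table and defer freeing the old one. If the cap is
 * already reached, wait first, which throttles the updater in the
 * same way a synchronous wait would.
 */
static struct routing_table *update_routing(struct updater *u,
                                            struct routing_table *old,
                                            int version)
{
    struct routing_table *next = malloc(sizeof(*next));

    if (!next)
        return old; /* keep the old table on allocation failure */
    next->version = version;

    if (u->n_pending == MAX_PENDING) {
        u->n_sync_waits++;
        grace_period_elapsed(u);
    }
    u->pending[u->n_pending++] = old;
    assert(u->n_pending <= MAX_PENDING);
    return next;
}

/* Perform `updates` back-to-back updates; return how often we blocked. */
int run_demo(int updates)
{
    struct updater u = {0};
    struct routing_table *cur = malloc(sizeof(*cur));

    if (!cur)
        return -1;
    cur->version = 0;
    for (int v = 1; v <= updates; v++)
        cur = update_routing(&u, cur, v);

    grace_period_elapsed(&u); /* drain what is still pending */
    free(cur);
    return u.n_sync_waits;
}
```

With this model the memory held by old tables is bounded by MAX_PENDING no
matter how fast the guest triggers updates, and the updater only blocks when
it runs ahead of the (simulated) grace periods.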

Thanks,
Zhang Haoyu
>
>> --
>>                      Gleb.
