On Wed, Sep 12, 2012 at 10:45:22AM +0300, Avi Kivity wrote:
> On 09/12/2012 04:03 AM, Paul E. McKenney wrote:
> >> > > Paul, I'd like to check something with you here:
> >> > > this function can be triggered by userspace,
> >> > > any number of times; we allocate
> >> > > a 2K chunk of memory that is later freed by
> >> > > kfree_rcu.
> >> > >
> >> > > Is there a risk of DOS if RCU is delayed while
> >> > > lots of memory is queued up in this way?
> >> > > If yes is this a generic problem with kfree_rcu
> >> > > that should be addressed in core kernel?
> >> >
> >> > There is indeed a risk.
> >>
> >> In our case it's a 2K object. Is it a practical risk?
> >
> > How many kfree_rcu()s per second can a given user cause to happen?
>
> Not much more than a few hundred thousand per second per process (normal
> operation is zero).
>
I managed to do 21466 per second.
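
For scale, a rough back-of-envelope sketch of what that rate could queue
up if grace periods were delayed. It assumes the "2K" object is exactly
2048 bytes and that nothing is freed during the stall; queued_bytes() is
a hypothetical helper, not an existing kernel function.

```c
#include <stdint.h>

/*
 * Bytes left waiting for a grace period if frees arrive at
 * frees_per_sec, each object is object_bytes large, and RCU
 * makes no progress for stall_ms milliseconds.
 */
uint64_t queued_bytes(uint64_t frees_per_sec, uint64_t object_bytes,
                      uint64_t stall_ms)
{
    return frees_per_sec * object_bytes * stall_ms / 1000;
}
```

With 21466 frees per second and 2048-byte objects, a one-second stall
works out to 43962368 bytes, roughly 42 MiB per process.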
> >
> >> > The kfree_rcu() implementation cannot really
> >> > decide what to do here, especially given that it is callable with irqs
> >> > disabled.
> >> >
> >> > The usual approach is to keep a per-CPU counter and count it down from
> >> > some number for each kfree_rcu(). When it reaches zero, invoke
> >> > synchronize_rcu() as well as kfree_rcu(), and then reset it to the
> >> > "some number" mentioned above.
> >>
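
The countdown scheme described above could look roughly like the
userspace model below. This is only a sketch: model_kfree_rcu() and
model_synchronize_rcu() are stand-ins that just count calls, and
THROTTLE_BUDGET, NR_CPUS_MODEL and throttled_free() are hypothetical
names. In the kernel the blocking step would have to run from a context
that is allowed to sleep.

```c
#include <stddef.h>

#define NR_CPUS_MODEL   4     /* hypothetical CPU count for the model */
#define THROTTLE_BUDGET 1000  /* the "some number"; picked arbitrarily */

int budget[NR_CPUS_MODEL];    /* 0 means "not initialized yet" */
unsigned long deferred_frees; /* how many times the kfree_rcu() stand-in ran */
unsigned long grace_waits;    /* how many times the synchronize_rcu() stand-in ran */

/* Stand-ins for the real primitives; they only record that they were called. */
void model_kfree_rcu(void *p) { (void)p; deferred_frees++; }
void model_synchronize_rcu(void) { grace_waits++; }

/*
 * Queue one deferred free on behalf of 'cpu'. Every THROTTLE_BUDGET
 * calls on the same CPU, also wait for a grace period so the backlog
 * cannot grow without bound, then refill the budget.
 */
void throttled_free(int cpu, void *p)
{
    if (budget[cpu] == 0)
        budget[cpu] = THROTTLE_BUDGET;   /* lazy first-time setup */

    model_kfree_rcu(p);

    if (--budget[cpu] == 0) {
        model_synchronize_rcu();
        budget[cpu] = THROTTLE_BUDGET;
    }
}
```

The open question from the thread remains: nothing in this sketch says
how THROTTLE_BUDGET should be chosen for a given object size.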
> >> It is a bit of a concern for me that this will hurt worst-case latency
> >> for realtime guests. In our case, we return error and this will
> >> fall back on not allocating memory and using slow all-CPU scan.
> >> One possible scheme that relies on this is:
> >> - increment an atomic counter, per vcpu. If above threshold ->
> >> return with error
> >> - call_rcu (+ barrier vcpu destruct)
> >> - within callback decrement an atomic counter
> >
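
A minimal single-threaded userspace model of the three steps listed
above, assuming C11 atomics. All names (vcpu_model, map_model,
MAX_PENDING, queue_deferred_free(), run_grace_period()) are hypothetical.
The linked list stands in for the RCU callback queue, and
run_grace_period() stands in for the callbacks firing after a grace
period.

```c
#include <stdatomic.h>
#include <stdlib.h>

#define MAX_PENDING 8   /* hypothetical per-vcpu threshold */

struct vcpu_model {
    atomic_int pending;         /* objects queued but not yet freed */
};

struct map_model {
    struct vcpu_model *owner;   /* whose counter to drop in the callback */
    struct map_model *next;     /* stands in for the rcu_head linkage */
    char payload[2048];         /* the 2K object from the thread */
};

struct map_model *queue_head;   /* model of the pending callback list */

/*
 * Step 1 and 2: bump the counter; if the vcpu already has MAX_PENDING
 * objects queued, undo the bump and return -1 so the caller can fall
 * back to the slow path. Otherwise queue the object (the call_rcu() step).
 */
int queue_deferred_free(struct vcpu_model *v, struct map_model *m)
{
    if (atomic_fetch_add(&v->pending, 1) >= MAX_PENDING) {
        atomic_fetch_sub(&v->pending, 1);
        return -1;
    }
    m->owner = v;
    m->next = queue_head;
    queue_head = m;
    return 0;
}

/* Step 3: the "callback" frees each object and decrements its counter. */
void run_grace_period(void)
{
    while (queue_head) {
        struct map_model *m = queue_head;

        queue_head = m->next;
        atomic_fetch_sub(&m->owner->pending, 1);
        free(m);
    }
}
```

In the kernel, the vcpu could not be torn down while callbacks that
touch its counter are still pending, which is presumably what the
"barrier vcpu destruct" note refers to (rcu_barrier() would be one way
to wait for them).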
> > That certainly is a possibility, but...
> >
> >> > In theory, I could create an API that did this. In practice, I have no
> >> > idea how to choose the number -- much depends on the size of the object
> >> > being freed, for example.
> >>
> >> We could pass an object size, no problem :)
> >
> > ... before putting too much additional effort into possible solutions,
> > why not force the problem to occur and see what actually happens? We
> > would then be in a much better position to work out what should be done.
>
> Good idea. Michael, it should be easy to modify kvm-unit-tests to write
> to the APIC ID register in a loop.
>
I did. Memory consumption does not grow on an otherwise idle host.
--
Gleb.