> From: Martin Pieuchot <[email protected]>
> Date: Fri, 12 Aug 2016 20:44:04 +0200
> 
> On 08/11/16 06:43, David Gwynne wrote:
> > ive been tinkering with per cpu memory in the kernel
> > [...]
> 
> I'd like to have more people comment on this because we need an MP-safe
> way to handle counters in the network stack.

I had a first look.  Need to have a bit more time to digest this.

> > im still debating whether the API should do protection against
> > interrupts on the local cpu by handling spls for you. at the moment
> > it is up to the caller to manually splfoo() and splx(), but im half
> > convinced that cpumem_enter and _leave should do that on behalf of
> > the caller.
> 
> Let's see how it looks like when we have more usages of per cpu memory.

The per-CPU counters API is likely to need it though.  I'm less sure
about the per-CPU memory API itself.

> > this diff also includes two uses of the percpu code. one is to
> > provide per cpu caches of pool items, and the other is per cpu
> > counters for mbuf statistics.
> 
> I'd like to discuss pool cache later.

The pool cache sits on top of the per-CPU memory API, doesn't it?  So it
doesn't need to go in simultaneously.

> > ive added a wrapper around percpu memory for counters. basically
> > you ask it for N counters, where each counter is a uint64_t.
> > counters_enter will give you a per cpu reference to these N counters
> > which you can increment and decrement as you wish. internally the
> > api will version the per cpu counters so a reader can know when
> > theyre consistent, which is important on 32bit archs (where 64bit
> > ops arent necessarily atomic), or where you want several counters
> > to be consistent with each other (like packet and byte counters).
> > counters_read is provided to use the above machinery for a consistent
> > read.
> 
> Comment below.
> 
> > secondly, there is a boot strapping problem with per cpu data
> > structures, which is very apparent with the mbuf layer. the problem
> > is we dont know how many cpus are in the system until we're halfway
> > through attaching device drivers in the system. however, if we want
> > to use percpu data structures during attach we need to know how
> > many cpus we have. mbufs are allocated during attach, so we need
> > to know how many cpus we have before attach.
> 
> What about using ncpusfound?

On sparc64 that gets set very early, before pmap_bootstrap(),
because we need to allocate some per-CPU memory ;).

On other systems we typically calculate it on the fly as we're
attaching CPUs.  That may not be early enough.

> > +uint64_t *
> > +counters_enter(struct counters_ref *ref, struct cpumem *cm)
> > +{
> > +   ref->gen = cpumem_enter(cm);
> > +   (*ref->gen)++; /* make the generation number odd */
> > +   return (ref->gen + 1);
> > +}
> > +
> > +void
> > +counters_leave(struct counters_ref *ref, struct cpumem *cm)
> > +{
> > +   membar_producer();
> > +   (*ref->gen)++; /* make the generation number even again */
> > +   cpumem_leave(cm, ref->gen);
> > +}
> 
> So every counter++ will now look like a critical section?  Do we really
> need a memory barrier for every +1?  What can happen if a CPU is reading
> a value that is currently being updated?  Can't we live with that?

Not for 64-bit counters on 32-bit machines.  You could do something
clever: read the upper 32 bits, then read the lower 32 bits and
then read the upper 32 bits again, and start over if you get a
different value the second time around.  But you still need to
guarantee that whoever updates the counter writes the upper 32 bits
and lower 32 bits in a particular order, and that they become
visible to other CPUs in that order.  And for that you need memory
barriers.
