On Mon, Oct 2, 2017 at 11:13 PM, Taylor R Campbell <[email protected]> wrote: > Quick summary of the problem: > > percpu_alloc(9) may move existing per-CPU objects to different > locations in memory. This means you can't reliably store pointers > to the per-CPU objects anywhere. > > psref(9) currently uses queue.h LIST, with a per-CPU list head -- > but LIST stores back-pointers into the list head, which breaks when > percpu_alloc(9) moves the list head. > > Possible solutions. I'm leaning toward (6), to open-code the linked > list operations for this special purpose, with compile-tested patch > attached. This changes the text of psref.h, but shouldn't change the > ABI. Comments?
How about using SLIST instead of open-coding? The instructions of them are very similar, but the SLIST version is slightly simpler. This is a patch: http://www.netbsd.org/~ozaki-r/psref-SLIST.diff One worry of the implementation is that percpu_{getref,putref} need to be always called in psref_release while the open-coded one doesn't need it. However the result of performance evaluations shows that the overhead is not so big and moreover SLIST version overtakes open-coded one slightly. These are the results of IP forwarding performance evaluations with 64 byte frames: - original: 144.7 Mbps - open-code: 138.5 Mbps - SLIST: 140.1 Mbps ozaki-r > > > 1. Make psref(9) store a pointer to a separately kmem_alloc-allocated > list head. (This is what Nakahara-san considered but rejected.) > > struct psref_cpu { > - struct psref_head pcpu_head; > + struct psref_head *pcpu_head; > }; > > Problem: We need to initialize these before psref_acquire/release. > We could initialize it in psref_class_create, but for various > applications of psref, we may not know how many CPUs there are at > that time. > > To fix this properly we need to teach percpu(9) how to initialize > and finalize per-CPU objects as CPUs come online, which as a bonus > would make it work with dynamic CPU attachment and detachment. And > that's a lot of work. > > 2. Make percpu(9) always store pointers into separately kmem_alloc- > allocated memory, like userland TLS and pthread keys do. > > Problem: It's a bit of work to basically rewrite subr_percpu.c, and > may have performance implications. I started last night, spent an > hour rewriting it, and didn't finish before bedtime came. > > 3. Invent a new variant of LIST called PERCPU_LIST that doesn't > require back-pointers into the head. (This is what Nakahara-san > suggested before I vanished into a EuroBSDcon vortex.) > > Problem: This is almost the same as LIST -- supports essentially all > the same operations with the same performance characteristics -- > and has very limited utility. So unless we find other applications > that need exactly this too, I'm reluctant to expose it as a > general-purpose kernel API right now. > > It can't be a drop-in replacement for LIST, because we have > LIST_REMOVE(obj, field) but we need MHLIST_REMOVE(head, obj, field) > in order to allow the head to move. > > And we can't just add LIST_REMOVE_MAYBEHEAD(head, obj, field) or > something to LIST, because there's no way to distinguish with > standard LIST whether an object is at the head of the list or not, > without knowing where the head *was* in memory when the object (or > one of its predecessors) was inserted. > > 4. Teach the percpu(9) API to move objects with an operation other > than memcpy. This would look like > > percpu_t *percpu_alloc(size_t); > + percpu_t *percpu_alloc_movable(size_t, > + void *(*)(void *, const void *, size_t)); > > with percpu_alloc(sz) := percpu_alloc_movable(sz, &memcpy). Then > for psref(9), we could use LIST_MOVE to migrate the old list head > to the new list head and fix the one back-pointer that would have > been invalidated. > > Problem: This is again an operation of limited utility, since most > per-CPU objects (like counters) don't need special operations to > move in memory. And it would compete with extending the API for > dynamic CPU initialization/finalization operations, so I'm not sure > I want to preemptively step on the toes of that API design space at > the moment. > > 5. Violate the abstraction of LIST in psref(9), as I proposed as a > stop-gap measure in > <https://mail-index.netbsd.org/tech-kern/2017/09/19/msg022345.html>. > > Problem: It's gross, makes maintenance of the LIST implementation > more difficult, makes understanding the psref(9) code more > difficult, breaks QUEUEDEBUG, &c. > > 6. Just open-code it in subr_psref.c. If we ever find more > applications of this idiom, we can always upgrade it without > trouble to an explicit PERCPU_LIST abstraction. > > Problem: We don't get to share any common LIST code. This seems > like a small price for a single-use gizmo like this.
