Date: Tue, 5 Jan 2016 17:30:35 +0900 From: Kengo NAKAHARA <k-nakah...@iij.ad.jp>
I use rw_init() for gif_softc_list_lock instead of rw_obj_alloc(). However, I still use rw_obj_alloc() for struct gif_softc.gif_lock. If I use rw_init() for struct gif_softc.gif_lock, it would be the following code [...] krwlock_t gif_lock __aligned(COHERENCY_UNIT); /* lock for softc */ int dummy __aligned(COHERENCY_UNIT); /* padding for cache line */ [...] I feel it seems worse... There's no need for this inside a struct. The point of __cacheline_aligned in a global object is to prevent it from sharing a cache line with unrelated objects, which would cause needless contention. But inside a struct, the cache line is pretty much guaranteed to be filled with related objects. The only time you need to worry about it in a struct is when there are independent parts for which you want independent contention, e.g. in struct pcq (kern/subr_pcq.c). I agree pserialize(9) gives it better MP-scalable than rwlock. However, I think it may be a premature optimization. The reason I think so is the following lockstat result of my kernel (which is enable both NET_MPSAFE and LOCKDEBUG). Fair enough -- I just wanted to make sure there was some thought put into the intended access patterns so that we will be working toward an MP-scalable, not just an MP-safe, network stack. There are a couple other things I'm concerned about in the patch -- I notice you're doing softint_establish with a lock held. Maybe that's OK, but it makes me a little nervous. I don't have time right now but I would like to take a closer look in the next few days at some of the changes like that that your patch makes.