Hello, On Mon, May 17, 2021 at 06:59:36PM +0200, Martin Pieuchot wrote: > On 17/05/21(Mon) 16:24, Alexandr Nedvedicky wrote: > > Hrvoje, > > > > managed to trigger diagnostic panics with diff [1] sent by bluhm@ > > some time ago. The panic Hrvoje sees comes from ether_input() here: > > > > 414 > > 415 /* > > 416 * Third phase: bridge processing. > > 417 * > > 418 * Give the packet to a bridge interface, ie, bridge(4), > > 419 * switch(4), or tpmr(4), if it is configured. A bridge > > 420 * may take the packet and forward it to another port, or it > > 421 * may return it here to ether_input() to support local > > 422 * delivery to this port. > > 423 */ > > 424 > > 425 ac = (struct arpcom *)ifp; > > 426 > > 427 smr_read_enter(); > > 428 eb = SMR_PTR_GET(&ac->ac_brport); > > 429 if (eb != NULL) { > > 430 m = (*eb->eb_input)(ifp, m, dst, eb->eb_port); > > 431 if (m == NULL) { > > 432 smr_read_leave(); > > 433 return; > > 434 } > > 435 } > > 436 smr_read_leave(); > > 437 > > > > in current tree the ether_input() is protected by NET_LOCK(), which is > > grabbed > > by caller as a writer. bluhm's diff changes NET_LOCK() readlock, so > > ether_input() can run concurrently. Switching NET_LOCK() to r-lock has > > implications on smr read section above. The ting is the call to > > eb->eb_input() > > can sleep now. This is something what needs to be avoided within smr > > section. > > Is the new sleeping point introduced by the fact the PF_LOCK() is a > rwlock? Did you consider using a mutex, at least for the time being, > in order to not run in such issues?
there are two locks in pf_test(): PF_LOCK() PF_STATE_LOCK() we would have to convert both locks into mutexes. trading PF_STATE_LOCK() into mutex is step backward IMO. bluhm's parallel diff enables PF to run in parallel for packets. if packet matches existing state, it just grabs state lock as a reader. This brings huge benefit, because majority of packets just do the state look up. Only some packets need to go expensive way through rules and state creation, which involves PF_LOCK() and write lock on state rw. as readers and are done with PF. trading both locks for mutexes will do the trick too. However I'm not sure I see a benefit there (when comparing to current NET_LOCK() we have in tree). I think all safety belts we have kicked in. We learned smr read block is not an option for ether_input() when comes to bridges. perhaps alternative solution would be to introduce an RW-lock, which would protect 'ac->br_port'. Such R-lock would cover call ->eb_input(). However I'm not sure about it. I don't mind to trade pf_lock and pf_state_lock for mutexes, however I see such step is kind of 'NO-OP'. We do have sufficient measure currently, which is: keep NET_LOCK() as is. May be I'm not seeing your idea/plan behind changing pf's rw-locks to mutexes. If you feel there is a benefit to go that way, then let's do it, but I'd like to understand where we will be going/what to expect. thanks and regards sashan