On Tue, 5 Aug 2025 12:35:58 +0200 Kajetan Staszkiewicz <veg...@tuxpowered.net> wrote:
> On 2025-07-31 14:55, Vadim Goncharov wrote: > > On Thu, 31 Jul 2025 13:03:22 +0200 > > Kajetan Staszkiewicz <veg...@tuxpowered.net> wrote: > > > >> Hello group, > >> > >> I'm researching loop prevention in pfil. There are cases where packets > >> are reinjected into the network stack and would be handled by the same > >> hooks again, i.e. pf + dummynet where currently pf itself handles loop > >> prevention on its own. My current experiment's approach to making loop > >> prevention a general, non-pf-specific thing is to create a new mtag with > >> pointer to the last hook and update it in pfil.c/pfil_mbuf_common(). > >> That works good so far, but it means memory allocation when pfil hooks > >> are involved. I'm unsure what the impact on performance would be. > >> Another approach would be to extend struct mbuf, or probably rather > >> struct m_pkthdr, to contain the aforementioned pointer. But is changing > >> that struct something that can be easily done and approved and merged? > > > > First, you certainly don't need it in every mbuf - just first in chain with > > struct pkthdr (where mtags also start). > > True. > > > Second. > > The "last hook ptr" does not look like general solution for all cases and > > occupies 8 bytes. What about idea from network itself - TTL ? It occupies > > less bytes, the main problem is to decide where to decrement (e.g. each > > netgraph hook, etc.) > > The loop prevention I'm talking about is not as much about the packet > looping through the network stack, but rather packet looping through > pfil hooks. Consider those 2 scenarios: > > 1. Dummynet reinjection, this is how it works in the current pf: > a) A packet enters via ip6_input() > b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to > pf_test() > c) pf sends it to dummynet configured for a delay pipe > d) dummynet consumes the packet, pfil_mbuf_in()'s loop is interrupted > e) later dummynet re-injects the packet using netisr_dispatch() > f) the packet goes through ip6_input() and pfil_mbuf_in() again > g) pf_check6_in()/pf_test() perform their own logic to determine that > the packet has already went through pf_test() > h) the packet continues through pfil_mbuf_in() and finally goes through > ip6_(try)?forward and so on > > In this case we could benefit from marking the packet/mbuf that it has > already went through pf_check6_in() in pfil_mbuf_in()'s loop. When the > loop is run again, all pfil hooks before and including pf_check_in6() > can be skipped. ...unless some another pfil consumer decides that it *wants* to check that packet via it's some preliminary checks. Or there are more than one filter and somebody wants e.g. to start in pf and return from dummynet into ipfw... > 2. af-to Address Family translation, the algorithm below is for the > experimental pf code I've mentioned: > a) A packet enters via ip6_input() > b) pfil_mbuf_in() sends it to pf_check6_in() which then sends it to > pf_test() > c) pf_test() translates the packet from IPv6 to IPv4 > d) pf marks the packet as if it has went through pf_check_in() even > though it has really went through pf_check6_in() > e) the translated packet is sent through dummynet, as in the previous > scenario > f) dummynet reinjects the packet using netisr_dispatch() > g) the packet goes through pfil_mbuf_in() and pf_check_in() is skipped Again, this is specific for pf/dummynet, not stack in general. But general mbuf fields should not be tied to specific subsystems - tags are more appropriate here (unless you can fit into a bit or two like M_PROTO*). For general pkthdr fields, I'd better be thinking of more general usage (to include e.g. netgraph, yes) of TTL, refcounts... > > Third. > > What about redoing mtag allocator so that it reuses m_pktdat[] when M_EXT > > is set? This could optimize performance for many tags, not just yours. > > I'm not sure I understand this idea. Storing mtags directly in m_pktdat? Yes, though not quite: m_pktdat is member of union with m_ext, and tags could be placed there only if m_ext *is* valid (otherwise m_pktdat is not unused), that is, at m_pktdat[sizeof(m_ext) aligned to 8]. Also many functions need to be teached to handle this, e.g. to do reallocation of tags from that area if first mbuf collapses from m_ext to using m_pktdat. And allocation of tags should be measured to determine if such changes are really worth the performance. -- WBR, @nuclight