Hello Mike,
(I'm putting tech@ back)

> Or some other changes if expire has happened with the deferred removal in
> the thread.  What I saying is basically that the last fix I did for once
> rules was tested in the scenario you've described.

sorry I've messed up my description well enough to obfuscate the story.
So let me start again.  Below is Petr's test case running on BSD kernel from
Sept 2 (day before the I've changed handling of ONCE rules)

    OpenBSD 6.0-current (GENERIC) #2234: Fri Sep  2 06:27:57 MDT 2016
        [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC

    echo 'anchor "foo/*"' | pfctl -f -
    pfctl -sr
    # anchor "foo/*" all

    echo 'pass' | pfctl -a foo/bar -f -
    echo 'pass once' | pfctl -a foo/once-bar -f -
    pfctl -a "*" -sr
    # anchor "foo/*" all {
    #   anchor "bar" all {
    #     pass all flags S/SA
    #   }
    #   anchor "once-bar" all {
    #     pass all flags S/SA once
    #   }
    # }

    # let some network traffic to hit once rule, any packet will do
    # show rules in main anchor again:
    pfctl -a "*" -sr
    # ?! like no rules here?

    # let's anchors we have in kernel
    pfctl -sA -v
    #   foo
    #   foo/bar
    #   foo/once-bar
    # so there are some anchors, let's see ho foo/once-bar looks like
    pfctl -a foo/once-bar -sr
    # nothing, it's good. So it looks like the once rule, also removed
    # top level anchor foo/*. We think it's not quite right, the issue
    # has been pointed out by Petr. He also came with test case.

Now I'll repeat same test case with more recent kernel (from Oct 1st).

    OpenBSD 6.0-current (GENERIC.MP) #2512: Sat Oct  1 10:04:59 MDT 2016
        [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP

    echo 'anchor "foo/*"' | pfctl -f -
    pfctl -sr
    # anchor "foo/*" all

    echo 'pass' | pfctl -a foo/bar -f -
    echo 'pass once' | pfctl -a foo/once-bar -f -
    pfctl -a "*" -sr
    # anchor "foo/*" all {
    #   anchor "bar" all {
    #     pass all flags S/SA
    #   }
    #   anchor "once-bar" all {
    #     pass all flags S/SA once
    #   }
    # }

    # let some network traffic to hit once rule, any packet will do
    # show rules in main anchor again:
    pfctl -a "*" -sr
    # anchor "foo/*" all {
    #   anchor "bar" all {
    #     pass all flags S/SA
    #   }
    #   anchor "once-bar" all {
    #   }
    # }
    # this looks better, but still not perfect as 'once-bar' anchor
    # remains there and must be removed manually (pfctl -a foo/bar -Fr)

There are two problems in code from Sept 2 (and earlier), when it comes to ONCE
rules and anchors:

    (1) once rule always removes its parent anchor rule, regardless if
        the ruleset got empty or not.

    (2) the ruleset, which used to hold once rule is not cleaned up
        automatically

The (1) got fixed by change to once rules introduced on Sept 3rd. It's probably
worth to check what was wrong in the old PF (Sept 2nd and earlier).
At that time we used to call pf_purge_rule() instead of marking rule as
expired (code below comes from pf_test_rule()):

    3799        if (r->rule_flag & PFRULE_ONCE)
    3800                pf_purge_rule(ruleset, r, aruleset, a);

Petr's test case populates arguments to pf_purge_rule() as follows:
    ruleset:    foo/once-bar
    rule:       pass once all # the rule, we are removing
    aruleset:   is main ruleset in this particular case
    arule:      anchor "foo/*" (the only rule we have in main ruleset)
The code of old pf_purge_rule() (Sept 2nd or earlier), just blindly
removed the root-anchor rule from root ruleset in this case.  The old
pf_purge_rule() (pf_ioctl.c @ 1.276) looks as follows:

    310     pf_purge_rule(struct pf_ruleset *ruleset, struct pf_rule *rule,
    311         struct pf_ruleset *aruleset, struct pf_rule *arule)
    312     {
    313         u_int32_t                nr = 0;
    314
    315         KASSERT(ruleset != NULL && rule != NULL);
    316
    317         pf_rm_rule(ruleset->rules.active.ptr, rule);
    318         ruleset->rules.active.rcount--;
    319         TAILQ_FOREACH(rule, ruleset->rules.active.ptr, entries)
    320                 rule->nr = nr++;
    321         ruleset->rules.active.ticket++;
    322         pf_calc_skip_steps(ruleset->rules.active.ptr);
    323
    324         /* remove the parent anchor rule */
    325         if (nr == 0 && arule && aruleset) {
    326                 pf_rm_rule(aruleset->rules.active.ptr, arule);
    327                 aruleset->rules.active.rcount--;
    328                 TAILQ_FOREACH(rule, aruleset->rules.active.ptr, entries)
    329                         rule->nr = nr++;
    330                 aruleset->rules.active.ticket++;
    331                 pf_calc_skip_steps(aruleset->rules.active.ptr);
    332         }
    333      }

lines 317 - 323 remove the rule from foo/once-bar, which becomes empty. The nr
at line 321 is still zero, because TAILQ_FOREACH() at 319 does not run for
empty rulesets.  The if condition at 325 seems to be wrong. It treats nr
as a number of rules found in aruleset, hence it believes the true branch does
right thing, when it removes arule. However we should rather check whether
arule->anchor is empty. So it seems to me we want line 325 there to look
like this:

    if (arule && aruleset && TAILQ_EMPTY(arule->anchor.ruleset.active)) {

The issue above got partially resolved on Sept 3rd by taking completely
different approach. Instead of making packet to do all the heavy lifting
required to remove the rule, the packet marks the rule as expired and
leaves heavy lifting on purge thread:

3859  if (r->rule_flag & PFRULE_ONCE) {
3860     if ((a != NULL) && TAILQ_EMPTY(a->ruleset->rules.active.ptr     )) {
3861               a->rule_flag |= PFRULE_EXPIRED;
3862               a->exptime = time_second;
3863               SLIST_INSERT_HEAD(&pf_rule_gcl, a, gcle);
3864      }
3865
3866      r->rule_flag |= PFRULE_EXPIRED;
3867      r->exptime = time_second;
3868      SLIST_INSERT_HEAD(&pf_rule_gcl, r, gcle);

Code above has its own problems too. The lines 3860-3864 are completely futile
as the condition at 3860 is never true. The a->ruleset is never empty,
because it still holds the rule we are going to mark as expired at line 3866.
So my patch below just removes those lines (chunk, which changes
pf_test_rule()).

The change to pf_purge_rule() is straightforward. Since Spet 3rd each rule
knows a ruleset where it belongs to (->ruleset). So as soon as we remove
the rule from its ruleset, we just call pf_remove_if_empty_ruleset(), which
takes care about empty rulesets.  This was the missing piece in fix from Oct
3rd.

thanks and
regards
sasha

--------8<---------------8<---------------8<------------------8<--------
diff -r 5b09c9f9c654 src/sys/net/pf.c
--- a/src/sys/net/pf.c  Fri Oct 21 00:17:58 2016 +0200
+++ b/src/sys/net/pf.c  Fri Oct 21 12:53:55 2016 +0200
@@ -3857,12 +3857,6 @@ pf_test_rule(struct pf_pdesc *pd, struct
 #endif /* NPFSYNC > 0 */

        if (r->rule_flag & PFRULE_ONCE) {
-               if ((a != NULL) && TAILQ_EMPTY(a->ruleset->rules.active.ptr)) {
-                       a->rule_flag |= PFRULE_EXPIRED;
-                       a->exptime = time_second;
-                       SLIST_INSERT_HEAD(&pf_rule_gcl, a, gcle);
-               }
-
                r->rule_flag |= PFRULE_EXPIRED;
                r->exptime = time_second;
                SLIST_INSERT_HEAD(&pf_rule_gcl, r, gcle);
diff -r 5b09c9f9c654 src/sys/net/pf_ioctl.c
--- a/src/sys/net/pf_ioctl.c    Fri Oct 21 00:17:58 2016 +0200
+++ b/src/sys/net/pf_ioctl.c    Fri Oct 21 12:53:55 2016 +0200
@@ -315,6 +315,7 @@ pf_purge_rule(struct pf_rule *rule)
                rule->nr = nr++;
        ruleset->rules.active.ticket++;
        pf_calc_skip_steps(ruleset->rules.active.ptr);
+       pf_remove_if_empty_ruleset(ruleset);
 }

 u_int16_t



Reply via email to