On 21 October 2016 at 17:39, Alexandr Nedvedicky
<[email protected]> wrote:
> Hello Mike,
> (I'm putting tech@ back)
>
>> Or some other changes if expire has happened with the deferred removal in
>> the thread.  What I saying is basically that the last fix I did for once
>> rules was tested in the scenario you've described.
>
> sorry I've messed up my description well enough to obfuscate the story.
> So let me start again.  Below is Petr's test case running on BSD kernel from
> Sept 2 (day before the I've changed handling of ONCE rules)
>
>     OpenBSD 6.0-current (GENERIC) #2234: Fri Sep  2 06:27:57 MDT 2016
>         [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC
>
>     echo 'anchor "foo/*"' | pfctl -f -
>     pfctl -sr
>     # anchor "foo/*" all
>
>     echo 'pass' | pfctl -a foo/bar -f -
>     echo 'pass once' | pfctl -a foo/once-bar -f -
>     pfctl -a "*" -sr
>     # anchor "foo/*" all {
>     #   anchor "bar" all {
>     #     pass all flags S/SA
>     #   }
>     #   anchor "once-bar" all {
>     #     pass all flags S/SA once
>     #   }
>     # }
>
>     # let some network traffic to hit once rule, any packet will do
>     # show rules in main anchor again:
>     pfctl -a "*" -sr
>     # ?! like no rules here?
>
>     # let's anchors we have in kernel
>     pfctl -sA -v
>     #   foo
>     #   foo/bar
>     #   foo/once-bar
>     # so there are some anchors, let's see ho foo/once-bar looks like
>     pfctl -a foo/once-bar -sr
>     # nothing, it's good. So it looks like the once rule, also removed
>     # top level anchor foo/*. We think it's not quite right, the issue
>     # has been pointed out by Petr. He also came with test case.
>
> Now I'll repeat same test case with more recent kernel (from Oct 1st).
>
>     OpenBSD 6.0-current (GENERIC.MP) #2512: Sat Oct  1 10:04:59 MDT 2016
>         [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
>
>     echo 'anchor "foo/*"' | pfctl -f -
>     pfctl -sr
>     # anchor "foo/*" all
>
>     echo 'pass' | pfctl -a foo/bar -f -
>     echo 'pass once' | pfctl -a foo/once-bar -f -
>     pfctl -a "*" -sr
>     # anchor "foo/*" all {
>     #   anchor "bar" all {
>     #     pass all flags S/SA
>     #   }
>     #   anchor "once-bar" all {
>     #     pass all flags S/SA once
>     #   }
>     # }
>
>     # let some network traffic to hit once rule, any packet will do
>     # show rules in main anchor again:
>     pfctl -a "*" -sr
>     # anchor "foo/*" all {
>     #   anchor "bar" all {
>     #     pass all flags S/SA
>     #   }
>     #   anchor "once-bar" all {
>     #   }
>     # }
>     # this looks better, but still not perfect as 'once-bar' anchor
>     # remains there and must be removed manually (pfctl -a foo/bar -Fr)
>
> There are two problems in code from Sept 2 (and earlier), when it comes to 
> ONCE
> rules and anchors:
>
>     (1) once rule always removes its parent anchor rule, regardless if
>         the ruleset got empty or not.
>
>     (2) the ruleset, which used to hold once rule is not cleaned up
>         automatically
>
> The (1) got fixed by change to once rules introduced on Sept 3rd. It's 
> probably
> worth to check what was wrong in the old PF (Sept 2nd and earlier).
> At that time we used to call pf_purge_rule() instead of marking rule as
> expired (code below comes from pf_test_rule()):
>
>     3799        if (r->rule_flag & PFRULE_ONCE)
>     3800                pf_purge_rule(ruleset, r, aruleset, a);
>
> Petr's test case populates arguments to pf_purge_rule() as follows:
>     ruleset:    foo/once-bar
>     rule:       pass once all # the rule, we are removing
>     aruleset:   is main ruleset in this particular case
>     arule:      anchor "foo/*" (the only rule we have in main ruleset)
> The code of old pf_purge_rule() (Sept 2nd or earlier), just blindly
> removed the root-anchor rule from root ruleset in this case.  The old
> pf_purge_rule() (pf_ioctl.c @ 1.276) looks as follows:
>
>     310     pf_purge_rule(struct pf_ruleset *ruleset, struct pf_rule *rule,
>     311         struct pf_ruleset *aruleset, struct pf_rule *arule)
>     312     {
>     313         u_int32_t                nr = 0;
>     314
>     315         KASSERT(ruleset != NULL && rule != NULL);
>     316
>     317         pf_rm_rule(ruleset->rules.active.ptr, rule);
>     318         ruleset->rules.active.rcount--;
>     319         TAILQ_FOREACH(rule, ruleset->rules.active.ptr, entries)
>     320                 rule->nr = nr++;
>     321         ruleset->rules.active.ticket++;
>     322         pf_calc_skip_steps(ruleset->rules.active.ptr);
>     323
>     324         /* remove the parent anchor rule */
>     325         if (nr == 0 && arule && aruleset) {
>     326                 pf_rm_rule(aruleset->rules.active.ptr, arule);
>     327                 aruleset->rules.active.rcount--;
>     328                 TAILQ_FOREACH(rule, aruleset->rules.active.ptr, 
> entries)
>     329                         rule->nr = nr++;
>     330                 aruleset->rules.active.ticket++;
>     331                 pf_calc_skip_steps(aruleset->rules.active.ptr);
>     332         }
>     333      }
>
> lines 317 - 323 remove the rule from foo/once-bar, which becomes empty. The nr
> at line 321 is still zero, because TAILQ_FOREACH() at 319 does not run for
> empty rulesets.  The if condition at 325 seems to be wrong. It treats nr
> as a number of rules found in aruleset, hence it believes the true branch does
> right thing, when it removes arule. However we should rather check whether
> arule->anchor is empty. So it seems to me we want line 325 there to look
> like this:
>
>     if (arule && aruleset && TAILQ_EMPTY(arule->anchor.ruleset.active)) {
>
> The issue above got partially resolved on Sept 3rd by taking completely
> different approach. Instead of making packet to do all the heavy lifting
> required to remove the rule, the packet marks the rule as expired and
> leaves heavy lifting on purge thread:
>
> 3859  if (r->rule_flag & PFRULE_ONCE) {
> 3860     if ((a != NULL) && TAILQ_EMPTY(a->ruleset->rules.active.ptr     )) {
> 3861               a->rule_flag |= PFRULE_EXPIRED;
> 3862               a->exptime = time_second;
> 3863               SLIST_INSERT_HEAD(&pf_rule_gcl, a, gcle);
> 3864      }
> 3865
> 3866      r->rule_flag |= PFRULE_EXPIRED;
> 3867      r->exptime = time_second;
> 3868      SLIST_INSERT_HEAD(&pf_rule_gcl, r, gcle);
>
> Code above has its own problems too. The lines 3860-3864 are completely futile
> as the condition at 3860 is never true. The a->ruleset is never empty,
> because it still holds the rule we are going to mark as expired at line 3866.
> So my patch below just removes those lines (chunk, which changes
> pf_test_rule()).
>
> The change to pf_purge_rule() is straightforward. Since Spet 3rd each rule
> knows a ruleset where it belongs to (->ruleset). So as soon as we remove
> the rule from its ruleset, we just call pf_remove_if_empty_ruleset(), which
> takes care about empty rulesets.  This was the missing piece in fix from Oct
> 3rd.
>
> thanks and
> regards
> sasha
>

Hi,

Thanks for the detailed explanation. All I was saying is
that what you're describing as a correct behavior is indeed
the correct behavior :) and I'm OK with your change.  Sorry
for being too vague.

Cheers,
Mike

Reply via email to