Re: store pf rules in a tree

2022-08-12 Thread Alexandr Nedvedicky
Hello Stefan,

On Wed, Aug 10, 2022 at 02:38:16PM +, Stefan Butz wrote:
> Hi everyone,
> 
> this mail includes a patch to store pf rules in a red-black tree.
> Currently they are stored in a linked list.
> My system configured with 16000 rules takes about 10 minutes
> to print them out using `pfctl -sr`.
> This patch decreases the time to 4 seconds.
> I was not able to measure a time increase for rule insertion.
> Inserting a lot of labels is still very slow.
> This has to be attacked separatly.
> 
> 

below is a diff which should make DIOCGETRULE fast enough. It introduces
a transaction (struct pf_trans).  it comes from my 'work-in-progress' tree.
I'd like to introduce transactions to move all memory allocations done on
behalf of ioctl(2) in pf(4) outside of PF_LOCK() scope.  Seeing your diff
I've realized we can use pf_trans also to speed up DIOCGETRULE.


diff below is based on my work-in-progress tree/branch. There might be some
rough edges, but it compiles and seems to work. if people will like the idea
I'll polish the change and commit it. It will also help me to make upcoming
diffs smaller.

At this point I'd like to know if idea presented in diff below makes
kind of sense.

there is `struct pf_trans` which keeps context for operations
which involve multiple ioctl(2) calls (such as ADDRULE, GETRULE,...)

ruleset.ticket is renamed to ruleset.version because ticket is know
kept in newly introduced pf_trans.

the workflow for process which wants to read/modify ruleset. In case
of DIOCGETRULES/DIOCGETRULE operations the workflow looks as follows:

transaction (pf_trans) is created by DIOCGETRULES

transaction gets reference to anchor/ruleset

we also remember current version of ruleset in transaction

we also rmemeber a pointer to the first rule in ruleset/anchor

we return transaction ticket to pfctl(8) process

then pfctl(8) issues DIOCGETRULE, it passes transaction ticket to ioctl(2)

we use ticket to look up a transaction created by DIOCGETRULES

we check if version kept in transaction matches the version at
ruleset.  If there is a mismatch ruleset got modified and we bail out.

if rule stored in transaction is NULL we return ENOENT
to indicate ruleset is either empty or there are no more rules

if version matches then we may safely proceed, the rule pointer
must be valid because matching version number warrants nothing
was modified.

we copy rule to buffer

and we store a next rule in transaction.

all transactions created by process are currently destroyed in pfclose()

thanks for thoughts/feedback.

regards
sashan

8<---8<---8<--8<
diff --git a/sbin/pfctl/pfctl.c b/sbin/pfctl/pfctl.c
index 9f14b955ca7..cf9def71c02 100644
--- a/sbin/pfctl/pfctl.c
+++ b/sbin/pfctl/pfctl.c
@@ -833,7 +833,7 @@ pfctl_show_rules(int dev, char *path, int opts, enum 
pfctl_show format,
 char *anchorname, int depth, int wildcard, long shownr)
 {
struct pfioc_rule pr;
-   u_int32_t nr, mnr, header = 0;
+   u_int32_t header = 0;
int len = strlen(path), ret = 0;
char *npath, *p;
 
@@ -884,24 +884,9 @@ pfctl_show_rules(int dev, char *path, int opts, enum 
pfctl_show format,
goto error;
}
 
-   if (shownr < 0) {
-   mnr = pr.nr;
-   nr = 0;
-   } else if (shownr < pr.nr) {
-   nr = shownr;
-   mnr = shownr + 1;
-   } else {
-   warnx("rule %ld not found", shownr);
-   ret = -1;
-   goto error;
-   }
-   for (; nr < mnr; ++nr) {
-   pr.nr = nr;
-   if (ioctl(dev, DIOCGETRULE, &pr) == -1) {
-   warn("DIOCGETRULE");
-   ret = -1;
-   goto error;
-   }
+   while (ioctl(dev, DIOCGETRULE, &pr) != -1) {
+   if ((shownr != -1) && (shownr != pr.nr))
+   continue;
 
/* anchor is the same for all rules in it */
if (pr.rule.anchor_wildcard == 0)
@@ -956,6 +941,13 @@ pfctl_show_rules(int dev, char *path, int opts, enum 
pfctl_show format,
case PFCTL_SHOW_NOTHING:
break;
}
+   errno = 0;
+   }
+
+   if ((errno != 0) && (errno != ENOENT)) {
+   warn("DIOCGETRULE");
+   ret = -1;
+   goto error;
}
 
/*
diff --git a/sys/net/pf_ioctl.c b/sys/net/pf_ioctl.c
index b540afc6fe2..6ef89d162b3 100644
--- a/sys/net/pf_ioctl.c
+++ b/sys/net/pf_ioctl.c
@@ -116,6 +116,11 @@ voidpf_qid_unref(u_int16_t);
 int pf_states_clr(struct pfioc_state_kill *);
 int pf_states_get(struct pfioc_states *);
 
+struct pf_trans*pf_o

Re: store pf rules in a tree

2022-08-10 Thread Alexandr Nedvedicky
Hello,


On Wed, Aug 10, 2022 at 02:38:16PM +, Stefan Butz wrote:
> Hi everyone,
> 
> this mail includes a patch to store pf rules in a red-black tree.
> Currently they are stored in a linked list.
> My system configured with 16000 rules takes about 10 minutes
> to print them out using `pfctl -sr`.
> This patch decreases the time to 4 seconds.
> I was not able to measure a time increase for rule insertion.
> Inserting a lot of labels is still very slow.
> This has to be attacked separatly.

took me while to figure out why it can take sooo lng
to retrieve 16k rules from kernel. this is the answer

1480 case DIOCGETRULE: {
1496 if (pr->ticket != ruleset->rules.active.ticket) {
1497 error = EBUSY;
1498 PF_UNLOCK();
1499 NET_UNLOCK();
1500 goto fail;
1501 }
1502 rule = TAILQ_FIRST(ruleset->rules.active.ptr);
1503 while ((rule != NULL) && (rule->nr != pr->nr))
1504 rule = TAILQ_NEXT(rule, entries);
1505 if (rule == NULL) {

so yes, in order to retrieve the next rule in list we must
always start with the first (HEAD) rule and iterate to rule n+1.

so I understand a motivation why to solve it using  rb-tree.

I also (as todd) have some doubts about code in DIOCHANGERULE.
the current code is able to insert to desired location.
I don't understand how your new code does that:


1675 
1676 if (pcr->action == PF_CHANGE_ADD_HEAD)
1677 oldrule = RB_MIN(pf_ruletree,
1678 ruleset->rules.active.ptr);
1679 else if (pcr->action == PF_CHANGE_ADD_TAIL)
1680 oldrule = RB_MAX(pf_ruletree,
1681 ruleset->rules.active.ptr);
   
1682 else {
1683 oldrule = RB_MIN(pf_ruletree,
1684 ruleset->rules.active.ptr);
1685 while ((oldrule != NULL) && (oldrule->nr != 
pcr->nr))
1686 oldrule = RB_NEXT(pf_ruletree,
1687 ruleset->rules.active.ptr, oldrule);   
   
1688 if (oldrule == NULL) {
1689 if (newrule != NULL)
1690 pf_rm_rule(NULL, newrule); 
   
1691 error = EINVAL;
   
1692 PF_UNLOCK();
1693 NET_UNLOCK();
1694 goto fail; 
   
1695 }  
   
1696 } 
1697 
1698 if (pcr->action == PF_CHANGE_REMOVE) {
1699 pf_rm_rule(ruleset->rules.active.ptr, oldrule);
   
1700 ruleset->rules.active.rcount--;
1701 } else {
1702 RB_INSERT(pf_ruletree,
1703 ruleset->rules.active.ptr, newrule);
1704 ruleset->rules.active.rcount++;
1705 newrule->ruleset = ruleset;
1706 }
1707 

how does RB_INSERT() at line 1702 insert rule to
desired location. Consider the current rules look like:

[ 1, 2, 3, 4 ]

i'm going to insert rule x, which should follow existing rule nr 3

[ 1, 2, 3, x, 4]

current code just re-indexes/renumbers the rules after insertion

[ 1, 2, 3, 4, 5 ], where x gets assigned 4.

I don't see how does that happen with RB_INSERT().

I think what we really need to fix is the iteration to n-th retrieved
rule we currently have in DIOCGETRULE. I'm working on slightly large
change which updates pf(4) transactions. I'll try to cherry pick
some changes from my in-progress tree and factor out a smaller diff 
which will solve that DIOCGETRULE itch for you. please stay tuned.

thanks and
regards
sashan



Re: store pf rules in a tree

2022-08-10 Thread Todd C . Miller
On Wed, 10 Aug 2022 14:38:16 -, Stefan Butz wrote:

> this mail includes a patch to store pf rules in a red-black tree.
> Currently they are stored in a linked list.
> My system configured with 16000 rules takes about 10 minutes
> to print them out using `pfctl -sr`.
> This patch decreases the time to 4 seconds.
> I was not able to measure a time increase for rule insertion.
> Inserting a lot of labels is still very slow.
> This has to be attacked separatly.

You are using the rule number 'nr' as the key for items in the tree.
However, that field can change post-insertion.  Won't that invalidate
the tree order?

See pf_purge_rule() and the DIOCCHANGERULE case in pfioctl()
for examples of what I mean.

 - todd



store pf rules in a tree

2022-08-10 Thread Stefan Butz
Hi everyone,

this mail includes a patch to store pf rules in a red-black tree.
Currently they are stored in a linked list.
My system configured with 16000 rules takes about 10 minutes
to print them out using `pfctl -sr`.
This patch decreases the time to 4 seconds.
I was not able to measure a time increase for rule insertion.
Inserting a lot of labels is still very slow.
This has to be attacked separatly.


diff --git sbin/pfctl/parse.y sbin/pfctl/parse.y
index 8a92a7e895c..c6f8629c690 100644
--- sbin/pfctl/parse.y
+++ sbin/pfctl/parse.y
@@ -5548,11 +5548,13 @@ mv_rules(struct pf_ruleset *src, struct pf_ruleset *dst)
 {
struct pf_rule *r;
 
-   TAILQ_FOREACH(r, src->rules.active.ptr, entries)
+   RB_FOREACH(r, pf_ruletree, src->rules.active.ptr) {
dst->anchor->match++;
-   TAILQ_CONCAT(dst->rules.active.ptr, src->rules.active.ptr, entries);
+   RB_INSERT(pf_ruletree, dst->rules.active.ptr, r);
+   }
src->anchor->match = 0;
-   TAILQ_CONCAT(dst->rules.inactive.ptr, src->rules.inactive.ptr, entries);
+   RB_FOREACH(r, pf_ruletree, src->rules.inactive.ptr)
+   RB_INSERT(pf_ruletree, dst->rules.inactive.ptr, r);
 }
 
 void
diff --git sbin/pfctl/pfctl.c sbin/pfctl/pfctl.c
index 9f14b955ca7..49aafbed4e5 100644
--- sbin/pfctl/pfctl.c
+++ sbin/pfctl/pfctl.c
@@ -1170,7 +1170,7 @@ pfctl_add_rule(struct pfctl *pf, struct pf_rule *r)
err(1, "calloc");
bcopy(r, rule, sizeof(*rule));
 
-   TAILQ_INSERT_TAIL(rs->rules.active.ptr, rule, entries);
+   RB_INSERT(pf_ruletree, rs->rules.active.ptr, rule);
 }
 
 int
@@ -1409,7 +1409,7 @@ pfctl_check_qassignments(struct pf_ruleset *rs)
}
}
 
-   TAILQ_FOREACH(r, rs->rules.active.ptr, entries) {
+   RB_FOREACH(r, pf_ruletree, rs->rules.active.ptr) {
if (r->anchor)
errs += pfctl_check_qassignments(&r->anchor->ruleset);
if (pfctl_leafqueue_check(r->qname) ||
@@ -1436,7 +1436,7 @@ pfctl_load_ruleset(struct pfctl *pf, char *path, struct 
pf_ruleset *rs,
snprintf(&path[len], PATH_MAX - len, "%s", pf->anchor->path);
 
if (depth) {
-   if (TAILQ_FIRST(rs->rules.active.ptr) != NULL) {
+   if (!RB_EMPTY(rs->rules.active.ptr)) {
brace++;
if (pf->opts & PF_OPT_VERBOSE)
printf(" {\n");
@@ -1454,8 +1454,8 @@ pfctl_load_ruleset(struct pfctl *pf, char *path, struct 
pf_ruleset *rs,
if (pf->optimize)
pfctl_optimize_ruleset(pf, rs);
 
-   while ((r = TAILQ_FIRST(rs->rules.active.ptr)) != NULL) {
-   TAILQ_REMOVE(rs->rules.active.ptr, r, entries);
+   while ((r = RB_MIN(pf_ruletree, rs->rules.active.ptr)) != NULL) {
+   RB_REMOVE(pf_ruletree, rs->rules.active.ptr, r);
pfctl_expand_label_nr(r, rno);
rno++;
if ((error = pfctl_load_rule(pf, path, r, depth)))
diff --git sbin/pfctl/pfctl_optimize.c sbin/pfctl/pfctl_optimize.c
index dc93b882bef..d0ba5c2aca4 100644
--- sbin/pfctl/pfctl_optimize.c
+++ sbin/pfctl/pfctl_optimize.c
@@ -191,6 +191,7 @@ struct pf_rule_field {
 PF_RULE_FIELD(src_nodes,   DC),
 PF_RULE_FIELD(nr,  DC),
 PF_RULE_FIELD(entries, DC),
+PF_RULE_FIELD(usage,   DC),
 PF_RULE_FIELD(qid, DC),
 PF_RULE_FIELD(pqid,DC),
 PF_RULE_FIELD(anchor_relative, DC),
@@ -268,9 +269,9 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
struct superblock *block;
struct pf_opt_rule *por;
struct pf_rule *r;
-   struct pf_rulequeue *old_rules;
+   struct pf_ruletree *old_rules;
 
-   if (TAILQ_EMPTY(rs->rules.active.ptr))
+   if (RB_EMPTY(rs->rules.active.ptr))
return (0);
 
DEBUG("optimizing ruleset \"%s\"", rs->anchor->path);
@@ -286,8 +287,8 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
 * XXX expanding the pf_opt_rule format throughout pfctl might allow
 * us to avoid all this copying.
 */
-   while ((r = TAILQ_FIRST(rs->rules.inactive.ptr)) != NULL) {
-   TAILQ_REMOVE(rs->rules.inactive.ptr, r, entries);
+   while ((r = RB_MIN(pf_ruletree, rs->rules.inactive.ptr)) != NULL) {
+   RB_REMOVE(pf_ruletree, rs->rules.inactive.ptr, r);
if ((por = calloc(1, sizeof(*por))) == NULL)
err(1, "calloc");
memcpy(&por->por_rule, r, sizeof(*r));
@@ -319,7 +320,7 @@ pfctl_optimize_ruleset(struct pfctl *pf, struct pf_ruleset 
*rs)
if ((r = calloc(1, sizeof(*r))) == NULL)
err(1, "calloc");
memcpy(r, &por->por_rule, sizeof(*r));
-   TAILQ_INSERT_TAIL(rs->rules.active.ptr,