Hi,
This series of 29 small diffs slims pf.o by 2640 bytes and pf.c by
113 non-comment lines.
pf_translate(), in particular, is now much shorter and clearer[0].
I've tested it by running it on my home router (alix, i386, inet4) for a
week or so without issue, and by profiling it under stress. I saw no
significant performance impact; if anything, it is slightly more
efficient. See below for the profile details.
The patch's length is due to its conservatism; the cumulative patch
is much smaller.
At each step, I attempted a correctness argument and some of these follow
the calculational approach of Dijkstra and others. For background, see
e.g. EWD1300 and EWD1123 (for the interested, I've found A.J.M. van
Gasteren's 'On the shape of mathematical arguments' a stellar read, and
also recommend EWD1023). I've found it a useful exercise, particularly
as the effort needed to make the argument indicates the effort required
to review the diff, and several times I was led to take smaller bites.
best,
Richard.
PS. If you, reading this, like the patch, or even if you don't,
please drop me a line to let me know. Thanks!
[0] As pf_translate() is called only on state creation, it isn't
terribly performance sensitive, either.
Profiling -
Setup: netcat a 415MB file through the profiled router.
[A] -- 192.168.3.2 vr1 -- [router] -- vr0 192.168.1.2 -- [B]
(vr0 is part of a bridge.)
pf.conf contains
match on vr1 scrub (random-id, reassemble tcp)
match out on vr0 inet from !vr0:network nat-to $router
pass
before patch: real 1m18.118s
after patch: real 1m17.282s, 1m18.150s, 1m17.716s
before patch:
                1.04    4.92  1303400/1303400     pf_test [9]
[13]     5.9    1.04    4.92  1303400             pf_test_state [13]
                0.89    2.10  1301332/1301332     pf_tcp_track_full [19]
                0.36    0.69  1303400/1304361     pf_find_state [43]
                0.29    0.06  1301273/1951905     m_copyback [63]
                0.20    0.00  3257446/5872085     pf_addrcpy [74]
                0.08    0.08   650646/650647      pf_change_a [93]
                0.10    0.06   650646/650647      pf_change_ap [95]
after patch:
(note: the profiler spent less time in the idle loop before the test
began, so the time percentages 5.9 and 6.9 aren't comparable between
these runs.)
                0.96    4.88  1301679/1301679     pf_test [7]
[13]     6.9    0.96    4.88  1301679             pf_test_state [13]
                0.89    1.99  1301628/1301628     pf_tcp_track_full [19]
                0.39    0.59  1301679/1301680     pf_find_state [44]
                0.19    0.16  1301580/1301582     pf_change_a [74]
                0.28    0.05  1301576/1952366     m_copyback [64]
                0.13    0.06  1301580/1301582     pf_change_16 [87]
                0.15    0.00  2603358/5857533     pf_addrcpy [75]
Patch
To apply against HEAD (pf.c:1.935, pfvar.h:1.419):
First apply the checksum patch I posted to tech@
Re: [patch] cleaner checksum modification for pf,
Message-Id: 98bb5cd8-7b74-4a79-a333-70749e3bc...@gmail.com.
(I'm also happy to regenerate what follows directly against HEAD
if necessary.)
then,
# cd /usr/src/sys
# cat - | patch
- add pf_change_p(), which changes only the port, and use it instead of
pf_change_ap() when only the port changes.
No functional change: all replaced pf_change_ap() calls leave the
address unchanged.
Argument:
(note: afto == (pd->af != nk->af))
   All replaced pf_change_ap() executed under afto with arg naf := nk->af
=    { pf_change_ap() }
   All replaced executions of pf_change_a() executed
   with args af := pd->af, naf := nk->af, under afto, which implies af != naf
=    { afto and pf_change_a() guarded with 'if (af != naf) return;' }
   All replaced executions of pf_change_a() are no-ops
=    { pf_change_ap() }
   All replaced pf_change_ap() leave the address unchanged
[Context=15]
Index: net/pf.c
===================================================================
--- net.orig/pf.c
+++ net/pf.c
@@ -139,30 +139,31 @@ union pf_headers {
 struct pool		 pf_src_tree_pl, pf_rule_pl, pf_queue_pl;
 struct pool		 pf_state_pl, pf_state_key_pl, pf_state_item_pl;
 struct pool		 pf_rule_item_pl, pf_sn_item_pl;
 void			 pf_init_threshold(struct pf_threshold *, u_int32_t,
 			    u_int32_t);
 void			 pf_add_threshold(struct pf_threshold *);
 int			 pf_check_threshold(struct pf_threshold *);
 void			 pf_cksum_fixup(u_int16_t *, u_int16_t, u_int16_t,
 			    u_int8_t);
 void			 pf_cksum_fixup_a(u_int16_t *, const struct pf_addr *,
 			    const struct pf_addr *, sa_family_t, u_int8_t);
 void			 pf_change_32(struct pf_pdesc *, u_int32_t *,