Esa,
sorry for not mentioning my setup: it is OpenBSD 3.5 with ipfil 4.1.3 plus some patches. Thanks for your response.
Darren,
thanks for your response. I got a bit further on this problem. (The main difficulty is that I have no permanent access to this box. I installed it and someone else operates it, so my pace of debugging is quite slow.)
> hmmm, are you using "return-*" with "out" rules or just "in" rules ?
No, I use return-* only on "block in" rules. Actually I don't use any out rules besides "block out all"; I filter exclusively on incoming packets and let the state code pass packets out.
block return-rst in log quick proto tcp from any to any port=ident;
block return-icmp(net-unr) in quick from $int1net to $int2net;
block return-icmp(net-unr) in quick from $int2net to $int1net;
....
block out on $int1if, $int2if all head intout;
pass out quick proto icmp from ($int1ip, $int2ip) to any
icmp-type unreach group intout pps 20;

(I think I can remove the last pass out; I put it in while trying to get return-icmp to work.)
> If you could monitor this problem by doing:
> vmstat -m
> netstat -m
> ipfstat
I already did, and you are right: there is no indication of a leak, nothing suspicious. (Actually I ran ipfstat -R; ipfstat -sR; ipfstat -slR; ipfstat -fR; netstat -m; netstat -s; vmstat -m every 5 minutes.)
But i found the following in the logs:
cerberus /bsd: Data modified on freelist:
word 5 of object 0xd17a2b00 size 0x8c
previous type temp (0xd17bdc00 != 0xdeadbeef)

The problem seems to be that something in the kernel is modifying already freed memory. Since the type is "temp", I cannot say for sure whether it is ipfilter or another part of the OpenBSD kernel.
Does anybody know offhand which structure inside ipfil has size 140? Actually the offset of 20 bytes would indicate an mbuf, but mbufs are of size 128, not 140 bytes.
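To make the numbers in that message concrete, here is a small user-space sketch of the kind of check that produces it. This is not the OpenBSD kernel code; the function names are made up for illustration, and the 4-byte word size is an assumption based on the 32-bit addresses in the log. Freed memory is filled with 0xdeadbeef, and the first word that no longer holds that pattern is reported.

```c
#include <stddef.h>
#include <stdint.h>

#define FREE_POISON 0xdeadbeefU /* fill pattern named in the log message */

/* Fill a freed object with the poison pattern (size in bytes). */
void poison_fill(uint32_t *obj, size_t size_bytes)
{
    size_t i, nwords = size_bytes / sizeof(uint32_t);

    for (i = 0; i < nwords; i++)
        obj[i] = FREE_POISON;
}

/* Return the 0-based index of the first modified word, or -1 if intact. */
long first_modified_word(const uint32_t *obj, size_t size_bytes)
{
    size_t i, nwords = size_bytes / sizeof(uint32_t);

    for (i = 0; i < nwords; i++)
        if (obj[i] != FREE_POISON)
            return (long)i;
    return -1;
}

/* Byte offset of a word index, assuming 32-bit words. */
size_t word_to_byte_offset(size_t word_index)
{
    return word_index * sizeof(uint32_t);
}
```

With these assumptions, "word 5" corresponds to byte offset 20 and "size 0x8c" to 140 bytes (35 words), which is where the two numbers in my question come from.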
There were other suspicious log entries:
cerberus /bsd: arplookup: unable to enter address for X.X.X.X
but I think this should not crash the box, since there was a patch ftp://ftp.openbsd.org/pub/OpenBSD/patches/3.4/common/003_arp.patch which is contained in OpenBSD 3.5. Anyhow, I stopped this message by giving the internal interface an alias of X.X.X.Y. Let's see if this stops the crashes.
The intervals between lockups vary between 2 hours and more than a week. Therefore the fact that this box was up for a week without return-* does not necessarily mean that the lockups are due to return-* or even ipfilter. Furthermore, I would say it is likely that a certain packet (or sequence of packets) triggers the lockup. It does not look like a starvation issue.
The next step I can think of is to change ipfilter to use meaningful MALLOC_DEFINES (not M_TEMP) when allocating kernel memory. I thought of something like "ipffil", "ipfnat", "ipfstate", "ipfproxy" ... With this setup one could narrow down which part of the kernel is modifying the freed memory. But this will take some time ...
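Roughly what I have in mind, as a user-space sketch only: every allocation carries a type tag, freed objects are poisoned, and a later check can name the owner of a modified object instead of just "temp". All identifiers here (tagged_alloc, tagged_free, tagged_check, the IPF_M_* enum) are hypothetical and exist neither in ipfilter nor in the OpenBSD kernel; only the four tag strings are the ones proposed above.

```c
#include <stdint.h>
#include <stdlib.h>

#define FREE_POISON 0xdeadbeefU

/* Hypothetical per-subsystem allocation types. */
enum ipf_mtype { IPF_M_FIL, IPF_M_NAT, IPF_M_STATE, IPF_M_PROXY, IPF_M_COUNT };

static const char *const ipf_mtype_name[IPF_M_COUNT] = {
    "ipffil", "ipfnat", "ipfstate", "ipfproxy"
};

/* Header stored in front of every payload. */
struct tagged_hdr {
    enum ipf_mtype type; /* who allocated this object */
    size_t nwords;       /* payload size in 32-bit words */
};

/* Allocate size bytes (rounded up to whole words) tagged with type. */
void *tagged_alloc(size_t size, enum ipf_mtype type)
{
    size_t nwords = (size + sizeof(uint32_t) - 1) / sizeof(uint32_t);
    struct tagged_hdr *h = malloc(sizeof(*h) + nwords * sizeof(uint32_t));

    if (h == NULL)
        return NULL;
    h->type = type;
    h->nwords = nwords;
    return h + 1;
}

/* "Free" onto a simulated freelist: poison the payload, keep the tag. */
void tagged_free(void *p)
{
    struct tagged_hdr *h = (struct tagged_hdr *)p - 1;
    uint32_t *w = p;
    size_t i;

    for (i = 0; i < h->nwords; i++)
        w[i] = FREE_POISON;
}

/* Return the owner's tag name if the freed payload was modified, else NULL. */
const char *tagged_check(const void *p)
{
    const struct tagged_hdr *h = (const struct tagged_hdr *)p - 1;
    const uint32_t *w = p;
    size_t i;

    for (i = 0; i < h->nwords; i++)
        if (w[i] != FREE_POISON)
            return ipf_mtype_name[h->type];
    return NULL;
}

/* Really hand the memory back to the system. */
void tagged_release(void *p)
{
    free((struct tagged_hdr *)p - 1);
}
```

In the kernel the tag would of course be the malloc type argument rather than a header of my own, but the effect is the same: the "previous type" in the log message would then point at one ipfilter subsystem, or rule ipfilter out.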
-- Attila
Darren Reed wrote:
hmmm, are you using "return-*" with "out" rules or just "in" rules ?
I'm trying to replicate this problem, but so far, there's no evidence of leaking packets with just these rules:
# ipfstat -hio
empty list for ipfilter(out)
2599757 block return-icmp in proto icmp from any to any
270980 block return-rst in proto tcp from any to any port 9000 >< 9999

If you could monitor this problem by doing:
vmstat -m
netstat -m
ipfstat
...say every half hour, that would be good.
Cheers, Darren
