Esa,

sorry for not mentioning my setup. It is OpenBSD 3.5 with ipfil
4.1.3 plus some patches. Thanks for your response.

Darren,

thanks for your response. I got a bit further on this problem.
(The main problem is that I have no permanent access to this
box. I installed it, but someone else operates it, so my pace
of debugging is quite slow.)

> hmmm, are you using "return-*" with "out" rules or just "in" rules ?

No, I use return-* only on "block in" rules. Actually I don't use
any out rules besides "block out all"; I filter on incoming
packets exclusively and let the state code pass packets out.

block return-rst in log quick proto tcp from any to any port=ident;
block return-icmp(net-unr) in quick from $int1net to $int2net;
block return-icmp(net-unr) in quick from $int2net to $int1net;
....
block  out on $int1if, $int2if all head intout;
  pass out quick proto icmp from ($int1ip, $int2ip) to any
        icmp-type unreach group intout pps 20;

(I think I can remove the last pass out; I put it in while trying
to get return-icmp to work.)

> If you could monitor this problem by doing:
> vmstat -m
> netstat -m
> ipfstat

I already did, and you are right: no indication of a leak,
nothing suspicious. (Actually I ran ipfstat -R; ipfstat -sR;
ipfstat -slR; ipfstat -fR; netstat -m; netstat -s; vmstat -m
every 5 minutes.)


But I found the following in the logs:

cerberus /bsd: Data modified on freelist:
     word 5 of object 0xd17a2b00 size 0x8c
    previous type temp (0xd17bdc00 != 0xdeadbeef)

The problem seems to be that something in the kernel is modifying
already freed memory. Since the type is "temp" I cannot say for
sure whether it is ipfilter or another part of the OpenBSD kernel.

Does anybody know by heart which structure inside ipfil has size
140 (0x8c)? The offset of 20 bytes (word 5) would indicate an mbuf,
but mbufs are of size 128, not 140 bytes.
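
For reference, the arithmetic behind those numbers as a small C sketch.
It assumes 4-byte words (i386) and that the word index in the log
message counts from zero; both are assumptions, not something checked
against the kernel source. The function names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: 32-bit words, as on i386. */
#define WORD_BYTES 4u

/* Byte offset of a zero-based word index within an object. */
size_t word_to_offset(size_t word_index)
{
    return word_index * WORD_BYTES;
}

/* Print the decoded offset and size for one log entry. */
void print_decoded(size_t word_index, size_t object_size)
{
    size_t offset = word_to_offset(word_index);

    /* The modified word must lie inside the object. */
    assert(offset + WORD_BYTES <= object_size);
    printf("offset %zu of %zu bytes\n", offset, object_size);
}
```

Calling print_decoded(5, 0x8c) with the values from the log prints
"offset 20 of 140 bytes".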

There were other suspicious log entries:

cerberus /bsd: arplookup: unable to enter address for X.X.X.X

but I think this should not crash the box, since there was a patch
ftp://ftp.openbsd.org/pub/OpenBSD/patches/3.4/common/003_arp.patch
which is contained in OpenBSD 3.5. Anyhow, I stopped this message by
giving the internal interface an alias of X.X.X.Y. Let's see if this
stops the crashes.

The intervals between lockups vary between 2h and more than a week.
Therefore the fact that this box was up for a week without return-*
does not necessarily mean that the lockups are due to return-* or
even ipfilter. Furthermore I would say it is likely that a
certain packet (or sequence of packets) triggers the lockup. It does
not look like a starvation issue.

The next step I can think of is to change ipfilter to use
meaningful MALLOC_DEFINES (not M_TEMP) when allocating kernel memory.
I thought of something like "ipffil", "ipfnat", "ipfstate",
"ipfproxy" ... With this setup one could narrow down which part
of the kernel is modifying the freed memory. But this will take some
time ...
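
To illustrate the idea, here is a user-space sketch, not kernel code.
It mimics what the diagnostic check appears to do, judging from the log
message: freed memory is filled with 0xdeadbeef, the type the block was
last allocated with is remembered, and a later check reports the first
modified word together with that type. The names tagged_block,
tagged_free and tagged_check are hypothetical; the real change would
pass new malloc types instead of M_TEMP to the kernel allocator.

```c
#include <assert.h>
#include <stdint.h>
#include <stddef.h>
#include <stdio.h>

#define POISON      0xdeadbeefU
#define BLOCK_WORDS 35              /* 35 * 4 = 140 bytes, as in the log */

struct tagged_block {
    const char *prev_type;          /* type recorded when the block is freed */
    uint32_t    words[BLOCK_WORDS];
};

/* "Free" a block: remember its type and fill it with the poison pattern. */
void tagged_free(struct tagged_block *b, const char *type)
{
    size_t i;

    assert(b != NULL && type != NULL);
    b->prev_type = type;
    for (i = 0; i < BLOCK_WORDS; i++)
        b->words[i] = POISON;
}

/* Return the index of the first modified word, or -1 if the block is intact. */
int tagged_check(const struct tagged_block *b)
{
    size_t i;

    assert(b != NULL);
    for (i = 0; i < BLOCK_WORDS; i++) {
        if (b->words[i] != POISON) {
            printf("Data modified on freelist: word %zu, "
                   "previous type %s (0x%x != 0x%x)\n",
                   i, b->prev_type, (unsigned)b->words[i], POISON);
            return (int)i;
        }
    }
    return -1;
}
```

If a block is freed with tagged_free(&b, "ipfstate") and word 5 is then
overwritten with 0xd17bdc00, tagged_check(&b) returns 5 and the report
names "ipfstate" instead of "temp", which is the information missing
from the log above.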


-- Attila


Darren Reed wrote:
hmmm, are you using "return-*" with "out" rules or just "in" rules ?

I'm trying to replicate this problem, but so far, there's no evidence
of leaking packets with just these rules:

# ipfstat -hio
empty list for ipfilter(out)
2599757 block return-icmp in proto icmp from any to any
270980 block return-rst in proto tcp from any to any port 9000 >< 9999

If you could monitor this problem by doing:
vmstat -m
netstat -m
ipfstat

...say every half hour, that would be good.

Cheers,
Darren

