Jens Elkner wrote:
...
So the Sun case engineer explained that if ipf cannot insert an entry into
the state table, it just _continues_ evaluating the rules that follow. I couldn't believe my eyes!!! What crap!!!
Well, what would you have it do?

As said, at least notify the user about the problem. Since I'm not
an ipf specialist, I'm not sure what the right thing to do beyond that is. My gut says: drop the packet and stop processing further rules.

Another way could be to lower the TTL of all entries in steps of N hours,
clean up, and check whether the insert now succeeds. This of course implies
a new threshold to be set by the OS/user, and a user notification that this
measure has been taken (documentation about what that means would
probably be a nice thing, too) ...

It does trigger this to happen, but in a slightly more detailed way.
There's a high and a low watermark for the table, and cleanup starts with
the oldest entries and works its way forward to try to empty things out.
But if there are more new connection attempts than there are old
entries that can be removed, packets will get dropped. I'm not
sure whether the improvements to the cleaning were in S10U4, but they
are in Nevada and the latest S10 updates.
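For reference, the ceiling being hit here can be inspected and adjusted through ipf's tunable interface; a sketch (the tunable names are ipf's documented ones, the values are purely illustrative, and fr_statesize can typically only be changed while the filter is disabled):

```
# list all tunables, including fr_statemax and fr_statesize
ipf -T list

# query just the state-table ceiling
ipf -T fr_statemax

# raise the ceiling; fr_statesize is the state hash table size
# and should be scaled along with fr_statemax
ipf -T fr_statemax=65533
```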


It is not appropriate to have ipfilter automatically
grow the tables as you suggest through "automatic
tuning" because then your system becomes vulnerable
to a denial of service attack from remote attackers.

Isn't it already vulnerable to DoS when the state table is too small?

Yes & no. But a different denial of service attack.

If the table just kept growing, it could consume all kernel memory
and make the machine unresponsive or crash it. In this situation, if the table
fills up then certain packets will get dropped. If you're running a web
server or some other kind of service where the security of the TCP
connection itself isn't as important, maybe stateless filtering is a better
choice?
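To illustrate the stateless alternative: rules without "keep state" never create state-table entries, so the table cannot fill because of them. A sketch for ipf.conf (interface name and port are illustrative, and note that stateless TCP filtering must explicitly pass the return traffic):

```
# stateless: no state-table entries are created for this service
pass in  quick on bge0 proto tcp from any to any port = 80
pass out quick on bge0 proto tcp from any port = 80 to any
```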

Or if there are multiple keep-state rules, you can apply individual
limits to each. (see below)


Actually I've checked all of our production servers (most of them
already have fr_statemax=40129) and all show packet state lost > 100,
usually > 5000 even on very, very low-traffic machines and, no wonder, on
the svn machines as well. So this implies that something is wrong
with ipf or at least with its default settings...

Over what time period?


Well, I think I have a more or less good understanding of what's going
on on those machines. However, I'm neither a firewall specialist nor an
ipf developer, and I didn't expect a firewall to silently do things
which it shouldn't.

What it should or shouldn't do seems to be somewhat subjective in this case.

The primary purpose of the firewall is to provide security and part of that
security comes through being able to audit what's going on. In that case,
letting through packets that matched a "pass" rule but for which the state
could not be created is arguably incorrect.

Actually one can only rely on the things which are
documented, and you probably know there is a lot of room for improvement:
e.g. it is documented how one can obtain certain ipf statistics, but
nowhere what these stats are actually telling you or how to interpret
them.
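For anyone following along, the counters being discussed are the ones printed by ipfstat; a sketch of the relevant invocations (the exact field names are whatever your ipfstat build prints):

```
# global per-packet counters, including the "lost" state counters
ipfstat -s

# dump the current state table, one line per entry, with TTLs
ipfstat -t
```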

I'm aware of that and as we like to say, this is an open source
project, so if something really bothers you please feel free to
submit additions or corrections. We only have so much bandwidth.


...
Wrt. a required syslog message, he responded that a counter increment
(ipfstat: packet state*lost) costs only 2 cycles on SPARC, but a syslog message 2000 cycles and would cause ipf to "hang"/be unusable, and he closed
the case.
syslog message from where?
Ehmmm, I'm not a kernel developer and actually haven't cared about that yet.
What I know is that e.g. on Linux there is a klogd ...

Yes, but that is Linux.
There are different rules for OpenSolaris developers.


Generating messages from within kernel modules
is generally frowned upon.

OK. So hiding kernel problems is a better thing? What is so hard about
incrementing a well-known value in the kernel and letting a logger in
user space poll for changes every n time units? Also I'm not sure whether
a new thing needs to be invented for this: IIRC, ipmon is already able
to log ipf-related stuff (but there is no documentation on whether this
is a bad thing to use because of possible performance degradation???) ...

Whilst we can send messages to/via it, we are not allowed to
rely on them as being the only communication channel with the
systems administrator.

OK. But did I say that?

No, but there are different rules for you than there are for me/us
as a developer(s) of [Open]Solaris.

So all of our production machines (S10u7), even the low-traffic ones
with fr_statemax=40129, have a problem, and 'keep state' needs to be
considered harmful :(((

Just to make sure, you have "flags S" with all of your TCP "keep state" rules?
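For context, "flags S" restricts state creation to the initial SYN of each TCP connection, so state is built once per connection rather than for every packet that matches. A sketch for ipf.conf (interface and port are illustrative):

```
# create state only from the initial SYN of each TCP connection
pass in quick on bge0 proto tcp from any to any port = 22 flags S keep state
```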


Sure, ipf's behaviour of processing the rules list as if 'the rule is
ignored' is for my taste more than a minor security issue. But anyway,
should one raise the fr_statemax value to make it bigger and bigger
until one finds out that ipf actually has a problem? And what is also
not clear: are the 'lost' counters also snapshots (for what
interval/time), or an accumulation from when ipf got started/refreshed?
It's part of something much more than that.

What it allows is for you to create "keep state" rules
that define a maximum number of states allowed for them,
and when that maximum is reached, for other rules to
then be applied to packets. The problem of the global
maximum being reached is a degenerate case of that.

Not sure, whether I understood that correctly :(

The number of state-table entries created can be controlled per-rule:

pass in on bge0 proto tcp from any to any port = 22 flags S keep state (limit 20)

...will always allow up to 20 and no more than 20 entries to be created
because of that rule matching. Now if there is another rule, like this:

pass in on bge0 proto tcp all flags S keep state

...then the 21st ssh connection will "fail" to match the first rule (the
limit is reached) but will successfully match the second, and potentially
then cause a new state table entry to be created.

Similarly, the reverse is true also: if the rule without the limit fails to create
a state table entry because the table is too full then the second, with the
limit of 20 allows for a state table entry to be created so long as there are
no more than 19 such entries already in the table.


...
BTW: Why does 'ipfstat -t' show so many entries with negative TTLs? It
    appears that if the minimum? value of -59:-59 is reached, the TTL gets
    reset to 0:00 and restarts decrementing ... - strange
That's a known bug... fixed in the current opensolaris
source code tree and will be fixed in the next release.
A fix is also being considered for Solaris 10.

So is it just a kind of overflow, or is ipf holding state entries
for much longer than it needs to?

They get removed from the state table but not the list of state entries.
Doing "ipf -Fs -FS" will clean it all out :-D

Darren

_______________________________________________
networking-discuss mailing list
networking-discuss@opensolaris.org
