On Thu, Mar 14, 2002 at 02:32:51PM -0800, Americo Melara wrote:
> Hi, I'm working on my thesis and need some help. I am doing
> performance measurements to understand how much overhead iptables
> creates in the stack when processing a single packet, varying the
> number and type of rules and the payload size of each packet.
>
> Some of my results show that for a TCP connection sending a single
> packet, it takes less time to process 10 and 40 IP addresses than
> 10 and 40 TCP ports and MAC addresses. As a matter of fact, TCP port
> and MAC address processing show the same trend, but I'm hesitant
> about IP. I would like to either confirm or invalidate my results by
> understanding the algorithm. I am searching through the code and
> trying to outline the process, but I was wondering: is there any
> documentation that describes the algorithm(s) used for each rule?

[please keep your lines formatted under 75 characters each.
Long lines are awful]

Your observation is basically correct. I am not aware of any documentation
on the exact implementation (I wouldn't call that level of detail
"algorithm"), apart from the best documentation there is: the source.
See net/ipv4/netfilter/ip_tables.c, function ipt_do_table() for the
checking main loop, and ip_table_match() for the single-rule matching.

In short: rules are scanned sequentially, in the order they appear in a
chain, until a fully matching rule has a terminating target. For each
rule checked, source and destination interface and IP addresses are
_always_ checked (even if not specified / visible in the rule), and any
"-p" or "-m" match is checked only after interfaces and addresses have
matched. Thus, IP addresses alone are faster than IP addresses plus TCP
ports - the latter is already a "special case" the way things are
implemented.
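
To illustrate the two-step check described above, here is a minimal
user-space sketch. It is not the kernel code: all type and function
names (struct pkt, struct rule, basic_match, extra_match, scan_chain)
are made up for illustration, addresses are plain host-order integers,
and interface names are compared with strcmp instead of the kernel's
masked comparison.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified types; NOT the kernel's structures. */
enum verdict { V_CONTINUE, V_ACCEPT, V_DROP };

struct pkt {
    uint32_t saddr, daddr;       /* host byte order, for simplicity */
    char indev[16], outdev[16];
    int is_tcp;
    uint16_t dport;
};

struct rule {
    uint32_t src, smask;         /* src must be stored pre-masked */
    uint32_t dst, dmask;         /* mask 0 means "any address" */
    char iniface[16], outiface[16]; /* "" means "any interface" */
    int match_tcp_dport;         /* nonzero: extra TCP port match */
    uint16_t dport;
    enum verdict target;         /* V_CONTINUE = non-terminating */
};

static int iface_ok(const char *want, const char *have)
{
    return want[0] == '\0' || strcmp(want, have) == 0;
}

/* Step 1: runs for every rule, even one that names no address or
 * interface (it then compares against mask 0 / empty name). */
static int basic_match(const struct rule *r, const struct pkt *p)
{
    return (p->saddr & r->smask) == r->src
        && (p->daddr & r->dmask) == r->dst
        && iface_ok(r->iniface, p->indev)
        && iface_ok(r->outiface, p->outdev);
}

/* Step 2: only reached when step 1 matched, so a port rule always
 * costs step 1 plus this. */
static int extra_match(const struct rule *r, const struct pkt *p)
{
    if (!r->match_tcp_dport)
        return 1;
    return p->is_tcp && p->dport == r->dport;
}

/* Walk the chain in order; stop at the first fully matching rule
 * that has a terminating target, else fall back to the policy. */
static enum verdict scan_chain(const struct rule *rules, size_t n,
                               const struct pkt *p,
                               enum verdict policy)
{
    for (size_t i = 0; i < n; i++) {
        if (!basic_match(&rules[i], p))
            continue;
        if (!extra_match(&rules[i], p))
            continue;
        if (rules[i].target != V_CONTINUE)
            return rules[i].target;
    }
    return policy;
}
```

In this sketch an address-only rule costs one basic_match() call,
while a port rule costs basic_match() plus extra_match(), which is the
shape of the difference the measurements show.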

BTW, there's an IMHO not insignificant optimization potential in the
current always-match-both-interfaces logic. For each rule, the
information about the interfaces to check takes two full 32-byte cache
lines on a P-III. Lots of rules don't reference interfaces at all, but
the check loop needs to touch those cache lines nevertheless.

So, if your analysis includes a part where you make a small modification
with a good measurable impact, I'd propose you look at giving each rule
two bits saying "check source interface / destination interface", and
only touching those cache lines when the bits are set.
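
A rough user-space sketch of that idea follows. Everything in it is
hypothetical: the flag names (RULE_CHECK_IN, RULE_CHECK_OUT), the
structures and the helpers are invented for illustration and do not
exist in ip_tables.c. It only assumes that each interface is described
by a 16-byte name plus a 16-byte mask, i.e. 32 bytes per interface.

```c
#include <stdint.h>
#include <string.h>

#define IFACE_LEN 16             /* assumed name length, like IFNAMSIZ */

/* Hypothetical flag bits: set only if the rule names an interface. */
#define RULE_CHECK_IN  0x01u
#define RULE_CHECK_OUT 0x02u

/* Name plus mask: 32 bytes, one cache line on the CPU discussed. */
struct iface_spec {
    char name[IFACE_LEN];
    unsigned char mask[IFACE_LEN];
};

struct rule_ifaces {
    uint8_t flags;               /* small, frequently read part */
    struct iface_spec in, out;   /* only read when a flag is set */
};

/* Masked byte-wise comparison of a zero-padded device name. */
static int iface_match(const struct iface_spec *s,
                       const char dev[IFACE_LEN])
{
    unsigned char diff = 0;
    for (size_t i = 0; i < IFACE_LEN; i++)
        diff |= ((unsigned char)dev[i] ^ (unsigned char)s->name[i])
                & s->mask[i];
    return diff == 0;
}

/* Touch the interface data only when the matching flag bit is set. */
static int ifaces_match(const struct rule_ifaces *r,
                        const char indev[IFACE_LEN],
                        const char outdev[IFACE_LEN])
{
    if ((r->flags & RULE_CHECK_IN) && !iface_match(&r->in, indev))
        return 0;
    if ((r->flags & RULE_CHECK_OUT) && !iface_match(&r->out, outdev))
        return 0;
    return 1;
}

/* Helper: record an exact interface name and set the flag bit.
 * 'which' must be RULE_CHECK_IN or RULE_CHECK_OUT. */
static void rule_set_iface(struct rule_ifaces *r, uint8_t which,
                           const char *name)
{
    struct iface_spec *s = (which == RULE_CHECK_IN) ? &r->in : &r->out;
    size_t len = strlen(name) + 1;   /* include the NUL: exact match */
    if (len > IFACE_LEN)
        len = IFACE_LEN;
    memset(s, 0, sizeof(*s));
    memcpy(s->name, name, len);
    memset(s->mask, 0xFF, len);
    r->flags |= which;
}
```

A rule with flags == 0 returns from ifaces_match() after reading a
single byte, without reading either iface_spec. Whether that gives a
measurable gain in the real check loop is exactly what would have to
be benchmarked.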

best regards
  Patrick
