On Thu, Mar 14, 2002 at 02:32:51PM -0800, Americo Melara wrote:
> Hi, I'm working on my thesis and need some help. I am doing
> performance measurements to understand how much overhead iptables
> creates in the stack when processing a single packet, by varying the
> number and type of rules and the payload size of each packet.
>
> Some of my results show that for a TCP connection sending a single
> packet, it takes less time to process 10 and 40 IP addresses than 10
> and 40 TCP ports and MAC addresses. As a matter of fact, TCP port and
> MAC address processing show the same trend, but I'm hesitant about
> IP. I would like to either confirm or invalidate my results by
> understanding the algorithm. I am searching through the code and
> trying to outline the process, but I was wondering: is there any
> documentation that describes the algorithm(s) used for each rule?
[please keep your lines formatted under 75 characters each. Long lines
are awful]

Your observation is basically correct. I am not aware of any
documentation on the exact implementation (I wouldn't call that level
of detail "algorithm"), apart from the best documentation there is:
the source. See net/ipv4/netfilter/ip_tables.c, function
ipt_do_table() for the main checking loop, and ip_table_match() for
the single-rule matching.

In short: rules are scanned sequentially, in the order they appear in
a chain, until a full match meets a terminating target. For each rule
checked, source and destination interface and IP addresses are
_always_ checked (even if not specified / visible in the rule), and
any type of "-p" and "-m" match is checked only after the interfaces
and addresses have matched. Thus, IP addresses alone are faster than
IP addresses plus TCP ports; the latter is already a "special case"
the way things are implemented.

BTW, there's an IMHO not insignificant optimization potential given by
the current always-match-both-interfaces logic. For each rule, the
information about the interfaces to check takes two full 32-byte
cache lines on P-III. Lots of rules don't reference interfaces at all,
but nevertheless the check loop needs to touch those cache lines.

So, if your analysis includes a part where you make a small
modification giving a good measurable impact, I'd propose you see how
to give each rule two bits saying "check source interface / check
destination interface", and only touch those cache lines when the bits
are set.

best regards
  Patrick
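A minimal C sketch of the sequential scan described above and of the
proposed two-bit change. It is an illustration under stated
assumptions, not the code in ip_tables.c: struct toy_rule, struct
toy_packet, toy_scan() and the check_in / check_out bits are
hypothetical names, and the real rule layout, match order and target
handling are more involved.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_IFNAMSIZ 16    /* same length as IFNAMSIZ on Linux */
#define TOY_NO_MATCH (-1)  /* no rule in the chain matched */

/* Hypothetical packet summary, not a kernel structure. */
struct toy_packet {
    uint32_t src, dst;
    uint16_t dport;
    char indev[TOY_IFNAMSIZ];
    char outdev[TOY_IFNAMSIZ];
};

/* Hypothetical rule. The four name/mask arrays take 64 bytes,
 * i.e. two 32-byte cache lines, whether or not the rule names
 * an interface. An all-zero mask means "any interface". */
struct toy_rule {
    char iniface[TOY_IFNAMSIZ], iniface_mask[TOY_IFNAMSIZ];
    char outiface[TOY_IFNAMSIZ], outiface_mask[TOY_IFNAMSIZ];
    uint32_t src, smsk, dst, dmsk;  /* mask 0 means "any address" */
    unsigned check_in : 1;          /* proposed bit (assumption) */
    unsigned check_out : 1;         /* proposed bit (assumption) */
    int has_dport;                  /* stands in for a "-p"/"-m" match */
    uint16_t dport;
    int verdict;                    /* stands in for a terminating target */
};

/* Masked byte-wise compare of a device name against a rule. */
static int toy_iface_match(const char *dev, const char *name,
                           const char *mask)
{
    size_t i;

    for (i = 0; i < TOY_IFNAMSIZ; i++)
        if ((dev[i] ^ name[i]) & mask[i])
            return 0;
    return 1;
}

/* Scan rules in order and return the verdict of the first full match.
 * With use_bits == 0 both interfaces are compared for every rule that
 * passes the address check; with use_bits != 0 they are compared only
 * when the rule's bit is set. *iface_checks counts the compares. */
int toy_scan(const struct toy_rule *rules, size_t n,
             const struct toy_packet *pkt, int use_bits,
             unsigned long *iface_checks)
{
    size_t i;

    for (i = 0; i < n; i++) {
        const struct toy_rule *r = &rules[i];

        if ((pkt->src & r->smsk) != r->src ||
            (pkt->dst & r->dmsk) != r->dst)
            continue;
        if (!use_bits || r->check_in) {
            ++*iface_checks;
            if (!toy_iface_match(pkt->indev, r->iniface,
                                 r->iniface_mask))
                continue;
        }
        if (!use_bits || r->check_out) {
            ++*iface_checks;
            if (!toy_iface_match(pkt->outdev, r->outiface,
                                 r->outiface_mask))
                continue;
        }
        /* Extra matches are looked at only after the basics matched. */
        if (r->has_dport && pkt->dport != r->dport)
            continue;
        return r->verdict;
    }
    return TOY_NO_MATCH;
}
```

Calling toy_scan() once with use_bits set to 0 and once with it set
to 1 on the same chain shows how many interface compares the two bits
would save for rules that never name an interface. The counter only
models work done; it says nothing about real cache behaviour, which
would have to be measured.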