G'day,

I have a server connected to the net via eth0 through a Netgear ADSL router. Other machines connected to the Netgear router use the server for NFS/Samba/Apache/proxy/dnscache/etc. The server is also a public webserver.
The Netgear is configured to forward all incoming ADSL traffic to the server. I have shorewall configured on the server to protect it, including things like rate-limiting incoming email and ssh connections. Ideally I'd like to run shorewall on the Netgear... (Oh how I miss having a Linux router I could control...)

I've also found that the server's ADSL web traffic can be busy enough that my Battlefield 2142 games on another machine complain about network latency. So I thought I'd configure shorewall's internal TC stuff to limit the outgoing traffic and save a bit for the other machines. This worked great... until I stopped playing Battlefield and tried to use NFS.

At first I thought that local NFS traffic was incorrectly being classified and rate limited as ADSL traffic. However, further digging with "shorewall show tc|mangle|connections|classifiers", running NFS reads/writes, and using iptraf to measure network throughput showed that:

1) NFS was not being incorrectly classified as ADSL traffic. All NFS traffic was marked and classified to use the 1:14 class with a rate/ceil of 99mbit/100mbit.

2) The NFS slowdowns only affected NFS writes. NFS reads went fast and could saturate the 100M eth0 interface at about 80Mbit/sec. Writes were so slow the client would often appear to have hung, and interface traffic was around 500kbit/sec.

3) The issue was not CPU. The loadavg on the server was around 0.1 and CPU was 98% idle throughout the NFS write tests.

4) The problem was definitely caused by turning on TC. The problem only occurred with "TC_ENABLED=Internal" in shorewall. With "TC_ENABLED=No" the NFS write speeds were nice and snappy, hitting the same 80Mbit/sec interface traffic as reads.

So I simplified things down to a single default tcclass with rate/ceil of 99mbit/100mbit and no tcrules... and the problem was still present. It seems that just turning on TC was enough.

Then I noticed the "HTB: quantum of class 10001 is big. Consider r2q change."
message in dmesg and /var/log/syslog. Online searches showed that quantum should be less than 60000, so I modified "calculate_quantum()" in /usr/share/shorewall/compiler so that it limited quantum to 60000. This didn't help, and the warning message continued to show after I did this... "shorewall show tc" did show that class 1:14 had the limited quantum, but I suspect that my change only affected the "leaf" classes, so maybe this is still the problem.

In summary, I don't think this is a problem with shorewall... it looks like a problem between HTB TC and NFS. It's rather strange that it only affects writes, but I suspect this is because writes use more back-and-forth traffic to verify write success, which is different from the more continuous streaming of reads.

Any suggestions on where to go from here are welcome. I'm going to dig further to try to get the quantum warning to go away and see if that helps. Otherwise I'm going to come to the conclusion that NFS and TC are just incompatible.

--
Donovan Baarda http://minkirri.apana.org.au/~abo/
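
P.S. For anyone wanting to check the numbers, here is a rough Python sketch of the quantum arithmetic as I understand it. It is not taken from shorewall or the kernel source. It assumes HTB derives a class's quantum as rate (in bytes/sec) divided by r2q, that r2q defaults to 10, and that the number in the warning is the class handle printed in hex. The 60000 limit is the figure from my online searches; the function names are made up for illustration.

```python
# Sketch of the HTB quantum arithmetic, under the assumptions stated above.

R2Q_DEFAULT = 10        # assumed HTB default for r2q
QUANTUM_LIMIT = 60000   # upper bound quoted in this post, in bytes


def htb_quantum(rate_bits_per_sec, r2q=R2Q_DEFAULT):
    """Quantum in bytes for a class with the given rate (assumed formula)."""
    return rate_bits_per_sec // 8 // r2q


def min_r2q(rate_bits_per_sec, limit=QUANTUM_LIMIT):
    """Smallest r2q that keeps the quantum at or below the limit."""
    rate_bytes = rate_bits_per_sec // 8
    return -(-rate_bytes // limit)  # ceiling division


def decode_classid(hex_text):
    """Split a hex class handle such as '10001' into 'major:minor'."""
    handle = int(hex_text, 16)
    return f"{handle >> 16:x}:{handle & 0xFFFF:x}"


rate = 99_000_000  # 99mbit, taking tc's "mbit" as 10^6 bits/sec
print(htb_quantum(rate))        # quantum with the assumed default r2q
print(min_r2q(rate))            # r2q needed to stay under the limit
print(decode_classid("10001"))  # class named in the warning
```

If those assumptions hold, a 99mbit class gets a quantum far above 60000 with the default r2q, and "10001" decodes to class 1:1 rather than 1:14, which would fit my suspicion that my change only touched the leaf classes.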
