Okay, here we go. I think we are making some headway with the nfs. I followed this knowledge base article and now all the nfs mounts come up wonderfully:
http://tinyurl.com/2g2qmw

When the load average goes to 149 I get a "maxclients reached" error in my error log. I can browse the web page fine, fast and everything, until that limit is reached. So I am thinking it is an apache issue and it is not closing the connections.

Right now I have this many connections:

# netstat -anlp | grep ESTABLISHED | wc -l
28
# netstat -anlp | grep ESTABLISHED | wc -l
27
# netstat -anlp | grep ESTABLISHED | wc -l
24

But then this many processes in D state:

# ps aux | grep httpd | grep "D" | wc -l
112

This is where I am thinking something is messed up in my httpd.conf:

<IfModule prefork.c>
StartServers 8
MinSpareServers 5
MaxSpareServers 20
MaxClients 150
MaxRequestsPerChild 1000
</IfModule>

<IfModule worker.c>
ServerLimit 16
StartServers 2
MaxClients 150
MinSpareThreads 25
MaxSpareThreads 75
ThreadsPerChild 25
MaxRequestsPerChild 1000
</IfModule>

Running out of ideas,
Adam

----- Steve Alligood <[EMAIL PROTECTED]> wrote:
> You are getting into the area of nfs that I usually have to poke around
> and hope to get lucky.
>
> Try forcing to either nfs v2 or udp on nfs v3. (the nfsvers=2 or udp
> mount options)
>
> Also, turn off the atime checking (noatime option), as this will pound
> your nfs mount for every read.
>
> And you can try async, but I think that may only be for writes.
>
> -Steve
>
> adam fisher wrote:
> > So these are the mount statements for nfs
> >
> >
> > 10.11.1.91:/data/media /mnt/media nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> > 10.11.1.91:/data/halo /mnt/web nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> > 10.11.1.91:/data/util /mnt/util nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> > 10.11.1.91:/data/www /www nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> > 10.11.1.91:/data/online /mnt/online nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> > 10.11.1.91:/data/library /mnt/library nfs rsize=8192,wsize=8192,timeo=14,rw,hard,intr 0 0
> >
> >
> > When I restart the box only the last three are mounted. When I run
> > a mount -a all of them mount and everything runs. I can browse the
> > website just fine till the load average gets to be around 70 or so and
> > it eventually gets to 149 and then just stays there because the max
> > clients is set at 150.
> >
> > when I run a nfsstat -o net I get this.
> >
> > Server packet stats:
> > packets    udp        tcp        tcpconn
> > 0          0          0          0
> >
> > Client packet stats:
> > packets    udp        tcp        tcpconn
> > 0          0          0          0
> >
> > However, nfsstat does show activity.
> >
> > Server rpc stats:
> > calls      badcalls   badauth    badclnt    xdrcall
> > 0          0          0          0          0
> >
> > Client rpc stats:
> > calls      retrans    authrefrsh
> > 40915      0          0
> >
> > Client nfs v3:
> > null       getattr    setattr    lookup     access     readlink
> > 0       0% 37438  91% 0       0% 537     1% 1957    4% 7       0%
> > read       write      create     mkdir      symlink    mknod
> > 966     2% 0       0% 0       0% 0       0% 0       0% 0       0%
> > remove     rmdir      rename     link       readdir    readdirplus
> > 0       0% 0       0% 0       0% 0       0% 0       0% 0       0%
> > fsstat     fsinfo     pathconf   commit
> > 3       0% 6       0% 0       0% 0       0%
> >
> >
> > Any other idea? What am I missing?
> >
> > thanks,
> > Adam
> >
> >
> > ----- Steve Alligood <[EMAIL PROTECTED]> wrote:
> >> If the other boxes are working fine with nfs, it probably isn't the
> >> number of nfsd processes running (though you can change that in
> >> /etc/sysconfig/nfs with the RPCNFSDCOUNT setting, default is 8).
> >>
> >> Again, I would make sure it can actually cat the files from the
> >> fedora box during the higher load times, make sure the mount isn't
> >> stale, that the network is performing correctly (forced NIC and
> >> switchport rather than auto, check with netstat -in for interface
> >> errors), and even make sure to force the nfs mount rather than assume
> >> the defaults (BSD may default to a larger window, etc, etc).
> >>
> >> None of these are certain, but places worth checking.
> >>
> >> -Steve
> >>
> >> adam fisher wrote:
> >>> This is the mount statement for our BSD boxes and the fedora box.
> >>>
> >>> 10.11.1.91:/data/online /mnt/online nfs rw,port=2049,intr 0 0
> >>>
> >>> We then have a /online ->/mnt/online
> >>>
> >>> Fedora says the default is v2.
> >>>
> >>> I am not sure what the 0 0 are doing at the end of the mount but
> >>> they were on the freebsd boxes so I just left them.
> >>>
> >>> Is there a way to make sure that we are allowing enough connections
> >>> on the NFS server?
> >>>
> >>> let me know what you see.
> >>>
> >>> thanks,
> >>> Adam
> >>>
> >>>
> >>> ----- Steve Alligood <[EMAIL PROTECTED]> wrote:
> >>>> it may be HOW you are mounting it, and how fedora versus BSD defaults
> >>>> to mount it.
> >>>>
> >>>> nfs v2 will be really quick, but not as reliable for data writes (aka,
> >>>> udp)
> >>>>
> >>>> nfs v3 will be more reliable (tcp) but slower
> >>>>
> >>>> nfs v4 will be reliable (tcp) and secure (encrypted) but a lot slower
> >>>>
> >>>> Fedora may default to v4 while your BSD does v3 or v2.
> >>>>
> >>>>
> >>>> I have some mounts I use nfs v2 because I am not as worried about
> >>>> writes and I need the speed. I also change the read and write window
> >>>> sizes, and turn off atime checking:
> >>>>
> >>>> async,soft,noatime,intr,nfsvers=2,rsize=8192,wsize=8192
> >>>>
> >>>> Of course, the server must support the v2 nfs as well (obvious, but
> >>>> worth mentioning)
> >>>>
> >>>> -Steve
> >>>>
> >>>> adam fisher wrote:
> >>>>> I appreciate everybody's thoughts on this.
> >>>>>
> >>>>> I agree that the NFS looks to be the bottleneck, however we have 5
> >>>>> other load balanced web servers that are pulling the web data from our
> >>>>> NFS server. We mount the partition and then created sym links to
> >>>>> those mounts. The other 5 web boxes are up and running fine. It is
> >>>>> the sixth alone that is having this issue.
> >>>>>
> >>>>> The first 5 are BSD; this is a Fedora installation as we want to get
> >>>>> away from BSD.
> >>>>>
> >>>>> Any other ideas?
> >>>>>
> >>>>> thanks,
> >>>>> Adam
> >>>>>
> >>>>>
> >>>>> ----- Ryan Simpkins <[EMAIL PROTECTED]> wrote:
> >>>>>> On Wed, March 28, 2007 11:44, adam fisher wrote:
> >>>>>>> apache   17268  0.7  0.6  29552 12868 ?  D  04:27  0:04 /usr/sbin/httpd
> >>>>>>> apache   17456  1.1  0.6  29728 13168 ?  S  04:27  0:06 /usr/sbin/httpd
> >>>>>>> apache   17890  0.5  0.6  29928 12588 ?  D  04:28  0:02 /usr/sbin/httpd
> >>>>>>> apache   17893  0.0  0.5  29032 11548 ?  D  04:28  0:00 /usr/sbin/httpd
> >>>>>>> apache   17895  0.0  0.5  29184 11716 ?  D  04:28  0:00 /usr/sbin/httpd
> >>>>>>> apache   17896  0.0  0.5  28740 11256 ?  D  04:28  0:00 /usr/sbin/httpd
> >>>>>>> apache   17897  0.0  0.5  28912 11452 ?  D  04:28  0:00 /usr/sbin/httpd
> >>>>>>> apache   17904  0.3  0.5  29288 11876 ?  D  04:28  0:01 /usr/sbin/httpd
> >>>>>>> apache   17913  0.5  0.5  29316 11892 ?  D  04:29  0:02 /usr/sbin/httpd
> >>>>>>> apache   17923  0.1  0.5  29364 12052 ?  D  04:29  0:00 /usr/sbin/httpd
> >>>>>>
> >>>>>>> Device: rrqm/s wrqm/s  r/s  w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
> >>>>>>> sda       0.00  11.00 0.00 6.00   0.00 136.00    22.67     0.00  0.50  0.17  0.10
> >>>>>>> The web root is located on an NFS share. I restarted NFS on this box just to make
> >>>>>>> sure. When I restart httpd and the load average drops to around 10 or 11 I can
> >>>>>>> browse the webpage just fine. It is when it gets to around 150 that I can't.
> >>>>>>
> >>>>>> Bingo. Your web root is running over NFS. NFS is pure evil for this type of work.
> >>>>>> You may be able to improve performance playing around with the various NFS mount
> >>>>>> options.
> >>>>>>
> >>>>>> -Ryan
> >>>>>>
> >>>>>> /*
> >>>>>> PLUG: http://plug.org, #utah on irc.freenode.net
> >>>>>> Unsubscribe: http://plug.org/mailman/options/plug
> >>>>>> Don't fear the penguin.
> >>>>>> */
> >>>
> >
> >

/*
PLUG: http://plug.org, #utah on irc.freenode.net
Unsubscribe: http://plug.org/mailman/options/plug
Don't fear the penguin.
*/
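
A note on the D-state count at the top of this message: grep "D" matches a capital D anywhere on the line, not only in the STAT column, so that figure can be slightly off. Below is a minimal Python sketch that counts only rows whose STAT field starts with D. It assumes the standard 11-column ps aux layout. The function name count_blocked and the trailing "grep httpd" sample row are made up for illustration; the apache rows are copied from the ps listing quoted earlier in the thread. On Linux, processes in D state count toward the load average, which would be consistent with the load sitting just under the MaxClients value of 150 when nearly every httpd child is blocked on NFS.

```python
def count_blocked(ps_output, command="httpd"):
    """Count rows of `ps aux`-style text whose STAT starts with 'D'
    and whose COMMAND contains `command`."""
    blocked = 0
    for line in ps_output.splitlines():
        # ps aux columns: USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # skip blank or truncated rows
        stat, cmd = fields[7], fields[10]
        if stat.startswith("D") and command in cmd:
            blocked += 1
    return blocked


# First three rows come from the thread; the last row is a hypothetical
# example of the grep process itself showing up in the listing.
SAMPLE = """\
apache 17268 0.7 0.6 29552 12868 ? D 04:27 0:04 /usr/sbin/httpd
apache 17456 1.1 0.6 29728 13168 ? S 04:27 0:06 /usr/sbin/httpd
apache 17890 0.5 0.6 29928 12588 ? D 04:28 0:02 /usr/sbin/httpd
root 20001 0.0 0.0 3912 700 pts/0 S+ 04:30 0:00 grep httpd
"""

if __name__ == "__main__":
    print(count_blocked(SAMPLE))
```

To run it against a live box, one option is to pass in the text captured from ps aux (for example via subprocess.run(["ps", "aux"], capture_output=True, text=True).stdout) instead of SAMPLE.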
