Re: [leaf-user] increasing load average

B. Sat, 08 Jun 2002 15:35:31 -0700

Thanks Ray and thanks Brad, too!

At 09:28 07.06.2002 -0700, you wrote:
>Possible? Yeah sure, I suppose it is "possible". But you'd do better to 
>give us a more systematic profile of what the router is doing if you want 
>good opinions.


OK, the router acts as a firewall and runs dhcpd, tinydns and dnscache in 
front of a small net of five computers, only two of which are used as 
web frontends. squid is running (but not used yet), and so is sshd.
I hope the webcam is not the reason, because it gets at most about 20 
requests per minute. That is the only traffic, and yet the load average 
shows:
  # w
  13:56:40 up 0 Days (17h), load average: 0.81 0.79 0.79

>The "load average" numbers that various apps report are sort of odd 
>things. They don't represent a true system "load", at least not to my 
>thinking; instead, they represent the average number of processes that are 
>waiting for some resource to be allocated to them (that is, blocked 
>processes). Unless you are running some unusual userspace firewalling app 
>on your system, I wouldn't expect basic routing demands to increase system 
>load by this measure.
>
>So ... in your guess, substitute "blocking" for "looping" and you are 
>probably right. But what process? Hmmm ... here are some things to 
>consider (and perhaps tell us about):
>
>1. How does CPU utilization change with this change in load? If you have 
>"top" available, it calculates and reports the numbers I have in mind; if 
>not, you'll need to get the raw data from /proc/stat and do the arithmetic 
>yourself.
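
A minimal sketch of the /proc/stat arithmetic mentioned in point 1, for anyone without "top". It assumes the first line of /proc/stat is the aggregate "cpu" line and that its first four counters are user, nice, system and idle time; later kernels append more columns, which this sketch simply ignores. The function names are made up for illustration, and Python may well not be available on the router itself, so the same subtraction can be done by hand from two readings.

```python
import time


def parse_cpu_line(line):
    """Return [user, nice, system, idle] from the aggregate 'cpu' line."""
    fields = line.split()
    if len(fields) < 5 or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    # Only the first four counters are used; newer kernels add more columns.
    return [int(x) for x in fields[1:5]]


def cpu_busy_percent(before, after):
    """Percentage of non-idle time between two samples of the counters."""
    deltas = [b - a for a, b in zip(before, after)]
    total = sum(deltas)
    if total <= 0:
        return 0.0
    idle = deltas[3]
    return 100.0 * (total - idle) / total


def measure(interval=5.0, path="/proc/stat"):
    """Take two samples 'interval' seconds apart and return the busy percentage."""
    with open(path) as f:
        first = parse_cpu_line(f.readline())
    time.sleep(interval)
    with open(path) as f:
        second = parse_cpu_line(f.readline())
    return cpu_busy_percent(first, second)
```

Calling measure() a few times while the load average is low and again once it has climbed should show whether CPU utilization rises along with it.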

First, I have to correct myself: the machine is equipped with 48 MBytes RAM, not 64....

OK, here's the top-output:

# top
   3:45pm  up  1:02,  1 user,  load average: 1.82, 1.68, 1.39
36 processes: 35 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: 24.6% user, 25.7% system,  0.0% nice, 50.2% idle
Mem:   46828K av,  33608K used,  13220K free,  15720K shrd,  14228K buff
Swap:      0K av,      0K used,      0K free                  6032K cached

   PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
  2643 root      19   0   820  820   660 R       0 31.8  1.7  11:12 top
  1963 root       0   0  1148 1148   988 S       0  7.7  2.4   2:40 sshd
  1285 root       4   0   444  444   384 S       0  3.9  0.9   0:24 pppoe
  3725 root       0   0   324  324   268 S       0  2.8  0.6   0:00 sleep
  1862 root       0   0   904  904   744 S       0  1.6  1.9   0:31 ipmail
  1854 squid      1   0  4148 4148  1160 S       0  1.2  8.8   0:20 squid
  1851 root       0   0  1288 1288  1100 S       0  0.5  2.7   0:10 xntpd
  1771 root       0   0   356  356   296 S       0  0.2  0.7   0:05 svscan
     1 root       0   0   468  468   408 S       0  0.1  0.9   0:08 init
     2 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kflushd
     3 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kupdate
     4 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kswapd
     5 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 keventd
     6 root     -20 -20     0    0     0 SW<     0  0.0  0.0   0:00 mdrecoveryd
  1276 root       0   0   960  960   768 S       0  0.0  2.0   0:00 adsl-connect
  1284 root       0   0   792  792   664 S       0  0.0  1.6   0:03 pppd
  1504 root       0   0   552  552   460 S       0  0.0  1.1   0:01 syslogd
  1513 root       0   0   496  496   372 S       0  0.0  1.0   0:00 klogd
  1520 root       0   0   484  484   420 S       0  0.0  1.0   0:00 inetd
  1527 root       0   0   864  864   760 S       0  0.0  1.8   0:01 sshd
  1532 root       0   0   272  272   224 S       0  0.0  0.5   0:00 watchdog
  1537 root       0   0   572  572   484 S       0  0.0  1.2   0:00 cron
  1596 root       0   0   952  952   808 S       0  0.0  2.0   0:00 squid
  1601 root       0   0   728  728   584 S       0  0.0  1.5   0:00 dhcpd
  1833 root       0   0   308  308   260 S       0  0.0  0.6   0:00 supervise
  1834 root       0   0   308  308   260 S       0  0.0  0.6   0:00 supervise
  1835 tinydns    0   0   332  332   268 S       0  0.0  0.7   0:00 tinydns
  1836 dnslog     0   0   308  308   256 S       0  0.0  0.6   0:00 multilog
  1857 root       0   0   308  308   260 S       0  0.0  0.6   0:00 supervise
  1858 root       0   0   308  308   260 S       0  0.0  0.6   0:00 supervise
  1859 dnscache   0   0  1588 1588   340 S       0  0.0  3.3   0:04 dnscache
  1860 dnslog     0   0   364  364   312 S       0  0.0  0.7   0:01 multilog
  1863 root       0   0   460  460   392 S       0  0.0  0.9   0:00 getty
  1864 root       0   0   460  460   392 S       0  0.0  0.9   0:00 getty
  1871 squid      0   0   296  296   248 S       0  0.0  0.6   0:00 unlinkd
  1970 root       0   0  1184 1184   952 S       0  0.0  2.5   0:00 sh

During that, the w command gives me:

# w
  15:45:12 up 0 Days (1h), load average: 2.17 1.76 1.41
USER     TTY      PID      TIMEON   FROM
root     ttyp1    1970     58       nl-ws-boris.nordlabs.network

Later I had the following output, where pppoe is quite busy....
I also saw the CPU usage of ipmail increase towards 10%.

   6:05pm  up  3:22,  1 user,  load average: 2.69, 2.78, 2.50
40 processes: 33 sleeping, 7 running, 0 zombie, 0 stopped
CPU states: 28.5% user, 46.3% system,  0.0% nice, 25.8% idle
Mem:   46828K av,  33856K used,  12972K free,  16456K shrd,  14228K buff
Swap:      0K av,      0K used,      0K free                  6036K cached

   PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
  1285 root      20   0   444  444   384 R       0 38.6  0.9  29:20 pppoe
  6308 root      16   0   820  820   660 R       0 22.0  1.7  12:36 top
  6254 root       0   0  1148 1148   984 S       0  5.4  2.4   3:18 sshd
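
To see which process states lie behind a figure like the "7 running" above, here is a rough sketch that tallies states from /proc. It assumes the usual Linux layout, where /proc/<pid>/stat holds the pid, the command name in parentheses and then a one-letter state, and it relies on the fact that the Linux load average counts processes that are runnable (R) or in uninterruptible sleep (D). The function names are hypothetical, and Python is an assumption that may not hold on the router.

```python
import os
from collections import Counter


def state_from_stat(stat_text):
    """Extract the one-letter state from the contents of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or parentheses, so take the first field after the last ')'.
    rest = stat_text[stat_text.rindex(")") + 1:].split()
    return rest[0]


def count_states(proc_root="/proc"):
    """Return a Counter mapping state letter to number of processes."""
    counts = Counter()
    for name in os.listdir(proc_root):
        if not name.isdigit():
            continue  # not a process directory
        try:
            with open(os.path.join(proc_root, name, "stat")) as f:
                counts[state_from_stat(f.read())] += 1
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were looking
    return counts


def load_contributors(counts):
    """Number of processes in the states that feed the load average."""
    return counts.get("R", 0) + counts.get("D", 0)
```

Running load_contributors(count_states()) a few times in a row gives an instantaneous count to compare against the averaged figures from w.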


>2. What userspace apps are you normally running on the system?

I hope you can see that in the top output, too.... Otherwise: what do you 
mean by 'userspace apps'?


>3. Does "the dns is no longer reliable" refer to a DNS server on the 
>router, or do you just mean that the slowness of the line causes offsite 
>DNS requests to time out (or do you mean something different from either 
>of these)?

On the (Win2k) client I got time-outs resolving the hostnames!


>4. What apps does "top" report as high-usage?

see 1.


>5. What filesystem hardware does the router use? (Blocking can easily be 
>due to the need to access a slow hard disk, for example.) In particular, 
>might syslogd be blocking somehow (it is a natural one to think of as 
>creeping up over time)?

It's a RAMdisk.


>All of this isn't even up to the level of fishing yet; it is just some 
>preliminary thoughts about a problem on a vaguely described router and network.

OK, but I'm quite thankful even for that, for I have no more ideas....
I will - as far as my ability allows - do everything to shed some light on 
this.

Thanks again!

Boris


>At 05:07 PM 6/7/02 +0200, Boris Andratzek wrote:
>>Hej All!
>>
>>For several weeks I have had the following problem, which doesn't seem to be 
>>normal.
>>I use a dachstein 1.02.1 with glibc 2.1.3 on CD and the hardware is an IBM 
>>PC 330 (P-166) with 64 MBytes RAM and two Realtek 8139 NICs. The system 
>>is stable and doing everything I want, but:
>>After the (re)start the load average is as I am used to from some other 
>>machines:
>># w
>>  16:53:06 up 0 Days (0h), load average: 0.04 0.01 0.02
>>
>>After an unpredictable period of time the load increases as if one 
>>process had started looping or so:
>># w
>>  17:14:39 up 0 Days (0h), load average: 0.79 0.86 0.81
>>
>>and even later:
>># w
>>  17:31:12 up 0 Days (0h), load average: 2.02 1.11 0.81
>>
>>In this state the dns is no longer reliable and the bandwidth is lower 
>>than it could be.
>>
>>So, what do you think?? What can I try to do??
>>I have already swapped the hardware once for a second PC of the same type 
>>and am currently using another type of IBM PC (P-75). Everything stays the 
>>same. Is it possible that this high load comes from quite frequent 
>>requests to my apache behind the firewall (WebCam)?
>
>
>--
>"Never tell me the odds!" -- Han Solo
>Ray Olszewski
>Palo Alto, California, USA                              [EMAIL PROTECTED]
>
>
>------------------------------------------------------------------------
>leaf-user mailing list: [EMAIL PROTECTED]
>https://lists.sourceforge.net/lists/listinfo/leaf-user
>SR FAQ: http://leaf-project.org/pub/doc/docmanager/docid_1891.html

