Hello,

I have a couple of FreeBSD boxes that provide a captive portal
wifi authentication system. Without delving into the implementation
details, I'm running dhcpd, squid, and apache. We have in-house Perl CGI
scripts that handle session and IP management, dynamically creating and
destroying netgraph nodes (ng_nat), connecting them to ipfw (ng_ipfw),
and altering the contents of access tables.

Right now, I'm seeing peaks of about 300 authenticated users; I'm
expecting this to grow about 200% when everyone gets back from summer
break. I'm trying to look at system load statistics to reassure myself
we'll be fine in a month -- or to panic and start throwing more hardware
at things. 

What is the difference between the SIZE and RES fields of top? Better
yet, what does top(1) mean by "the total size of the process (text,
data, and stack)" and "the current amount of resident memory"? How does
this work with a threaded program like apache? If all the threads share
the same text and most (all?) of the same data pages, what's the best
way to figure out the fixed cost and the average per-thread cost?

Some sample top output on this host:

Mem: 131M Active, 3754M Inact, 425M Wired, 177M Cache, 214M Buf, 3422M Free
Swap: 16G Total, 24K Used, 16G Free
[...]
  PID USERNAME    THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
32361 root          1  96    0   106M 16604K select 2   0:02  0.00% httpd
50687 www           1  20    0   106M 17196K lockf  0   0:01  0.00% httpd

I'm having a hard time accounting for the 3754M of inactive memory
(which, as I understand it, represents physical pages that are in use
but not recently used, prime candidates for being swapped out if the
free page count gets low). Maybe a better understanding of RES versus
SIZE, along with their relation to threads, will explain what's going
on here.
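
As a rough sanity check on the Mem: line in the sample top output, the
categories can be totalled to see how much of physical memory the
inactive pages represent. This is only a sketch: the parsing assumes the
exact "Mem:" layout shown in this post, and it assumes that Buf overlaps
other categories and so should not be added to the total (an assumption,
not something the output itself states).

```python
import re

# The "Mem:" summary line exactly as it appears in the sample top output.
MEM_LINE = "Mem: 131M Active, 3754M Inact, 425M Wired, 177M Cache, 214M Buf, 3422M Free"

# Multipliers to convert each unit suffix to MiB.
UNITS = {"K": 1 / 1024, "M": 1, "G": 1024}


def parse_mem_line(line):
    """Return {category: size_in_MiB} parsed from a top 'Mem:' line."""
    fields = {}
    for size, unit, name in re.findall(r"(\d+)([KMG]) (\w+)", line):
        fields[name] = int(size) * UNITS[unit]
    return fields


mem = parse_mem_line(MEM_LINE)

# Assumption: Buf is left out of the sum because it is taken to overlap
# with the other categories rather than being a separate pool of pages.
total_mib = sum(size for name, size in mem.items() if name != "Buf")
inact_share = mem["Inact"] / total_mib

print(f"total accounted for: {total_mib}M")
print(f"inactive share:      {inact_share:.1%}")
```

If the total comes out close to the installed RAM, the categories are at
least self-consistent, and the open question is only what is sitting in
the inactive queue.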

One of my concerns is that a large chunk of memory is going to belong to
the kernel in my configuration. I found vmstat -m (selected output lines
follow):

|      libalias  5629  3251K       - 19760019  128
|         ifnet    13    25K       -       13  256,2048
|      dummynet    22     8K       -       26  256,512,1024
|  netgraph_msg     0     0K       -   101991  64,128,256,512,1024,4096
| netgraph_node    72    18K       -    56133  256
| netgraph_hook   284    36K       -    30204  128
|      netgraph   283    16K       -    30203  16,64,128
| netgraph_parse     0     0K       -    22650  16
| netgraph_sock     0     0K       -    48581  128
| netgraph_path     0     0K       -    71508  16,32

Does this mean that my netgraph nodes (and their libalias
instances) are really eating up less than 4MB of memory on the system?
The only other "big spender" appears to be devbuf at 35185K.
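
To double-check the "less than 4MB" reading, the listed vmstat -m lines
can be totalled with a short script. This is a sketch under assumptions:
the header row is not included in this post, so the column order (Type,
InUse, MemUse, HighUse, Requests, Size(s)) is assumed, and only the
lines quoted here are counted. Note also that vmstat -m reports
malloc(9) types only; memory held in UMA zones shows up in vmstat -z
instead.

```python
# Lines copied from the vmstat -m excerpt in this post, with the leading
# "|" quote markers removed.
VMSTAT_M = """\
     libalias  5629  3251K       - 19760019  128
        ifnet    13    25K       -       13  256,2048
     dummynet    22     8K       -       26  256,512,1024
 netgraph_msg     0     0K       -   101991  64,128,256,512,1024,4096
netgraph_node    72    18K       -    56133  256
netgraph_hook   284    36K       -    30204  128
     netgraph   283    16K       -    30203  16,64,128
netgraph_parse     0     0K       -    22650  16
netgraph_sock     0     0K       -    48581  128
netgraph_path     0     0K       -    71508  16,32
"""


def parse_vmstat_m(text):
    """Return {type: (in_use, mem_use_kib)}.

    Assumes the column order Type, InUse, MemUse, HighUse, Requests, Size(s).
    """
    rows = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 5:
            continue  # skip blank or malformed lines
        name, in_use, mem_use = parts[0], int(parts[1]), parts[2]
        rows[name] = (in_use, int(mem_use.rstrip("K")))
    return rows


rows = parse_vmstat_m(VMSTAT_M)

# Total over every listed type, and over just the netgraph/libalias types.
total_kib = sum(mem for _, mem in rows.values())
ng_kib = sum(
    mem for name, (_, mem) in rows.items()
    if name.startswith("netgraph") or name == "libalias"
)

print(f"all listed types:      {total_kib}K ({total_kib / 1024:.2f}M)")
print(f"netgraph and libalias: {ng_kib}K ({ng_kib / 1024:.2f}M)")
```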

I also found `netstat -m':

| 1026/1599/2625 mbufs in use (current/cache/total)
| 1023/1513/2536/25600 mbuf clusters in use (current/cache/total/max)
| 1/678 mbuf+clusters out of packet secondary zone in use (current/cache)
| 0/121/121/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
| 0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
| 0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
| 2302K/3909K/6212K bytes allocated to network (current/cache/total)
| 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
| 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
| 0/0/0 sfbufs in use (current/peak/max)
| 0 requests for sfbufs denied
| 0 requests for sfbufs delayed
| 60 requests for I/O initiated by sendfile
| 0 calls to protocol drain routines

Again, this looks like chump change against my top output. What category
does kernel memory get lumped into in top? 
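
For capacity planning, the netstat -m lines that report a maximum can be
turned into a utilization figure per pool. The sketch below parses only
the "current/cache/total/max" lines quoted in this post; the helper name
pool_utilization is made up for illustration, and comparing total
against max is just one assumed way of judging headroom.

```python
import re

# The netstat -m lines from this post that include a max value, with the
# leading "|" quote markers removed.
NETSTAT_M = """\
1023/1513/2536/25600 mbuf clusters in use (current/cache/total/max)
0/121/121/12800 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/6400 9k jumbo clusters in use (current/cache/total/max)
0/0/0/3200 16k jumbo clusters in use (current/cache/total/max)
"""

PATTERN = re.compile(
    r"^(\d+)/(\d+)/(\d+)/(\d+) (.+?) in use \(current/cache/total/max\)$"
)


def pool_utilization(text):
    """Return {pool_name: (total, limit, total / limit)} (hypothetical helper)."""
    pools = {}
    for line in text.splitlines():
        match = PATTERN.match(line.strip())
        if match:
            _current, _cache, total, limit = (int(g) for g in match.groups()[:4])
            pools[match.group(5)] = (total, limit, total / limit)
    return pools


pools = pool_utilization(NETSTAT_M)
for name, (total, limit, fraction) in pools.items():
    print(f"{name}: {total}/{limit} ({fraction:.1%})")
```

Running this at peak times, and again as the user count grows, would
show whether any pool is approaching its limit well before requests
start being denied.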

I'd appreciate any help you can offer in terms of profiling memory usage
and actually understanding what some of these figures mean.

-- 
Chris Cowart
Network Technical Lead
Network & Infrastructure Services, RSSP-IT
UC Berkeley
