On Tue, Nov 24, 2009 at 9:52 PM, Marcus Kool
<[email protected]> wrote:
> Linda started this thread with huge performance problems
> when Squid with a size of 12 GB forks 15 times.

I'm still not convinced that our memory usage is correct.  Here's
where the cache_mem 10GB process is at right now:

        Maximum Resident Size: 19376000 KB
        Total accounted:       9735764 KB
        Storage Mem size:       9296872 KB
        338229 StoreEntries
        338229 StoreEntries with MemObjects

So there is (9735764 - 9296872) 438,892 KB "overhead" (memory
accounted for that is not used for storing objects) and (19376000 -
9735764) 9,640,236 KB "unaccounted."

I will put the whole memory report at the bottom of this message
though it is very hard to read. :-)

> Linda emailed me that she is doing a test with
> vm.pmap.pg_ps_enabled set to 1 (the kernel will
> transparently transform 4K pages into superpages)
> which gives a big relief for TLB management
> and will most likely reduce performance problems with
> forks of very large processes.

I apologize I meant to send that to the whole list.

In any case, it did not help.  This is from top while it was doing
daily rotation a few moments ago:


  PID USERNAME  THR PRI NICE   SIZE    RES STATE   C   TIME   WCPU COMMAND
 1205 squid       1 118    0 18650M 18675M CPU6    1  57:22 97.85% squid
74418 squid       1 118    0 16259M    24K CPU2    2   0:05 97.66% squid

It doesn't look like the superpages enabled had a lot of effect:

$ sysctl vm.pmap
vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 6673
vm.pmap.pv_entry_allocs: 105465793
vm.pmap.pv_entry_frees: 100913786
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 610560
vm.pmap.pc_chunk_allocs: 637695
vm.pmap.pc_chunk_count: 27135
vm.pmap.pv_entry_count: 4552007   <--------
vm.pmap.pde.promotions: 12657 <------
vm.pmap.pde.p_failures: 222291
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 11968 <-------
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 7330186
vm.pmap.pg_ps_enabled: 1

Here are similar readings from the 16GB machine, where the squid
process has VSZ of 8599M:

vm.pmap.pmap_collect_active: 0
vm.pmap.pmap_collect_inactive: 0
vm.pmap.pv_entry_spare: 16299
vm.pmap.pv_entry_allocs: 2196230824
vm.pmap.pv_entry_frees: 2193907723
vm.pmap.pc_chunk_tryfail: 0
vm.pmap.pc_chunk_frees: 13548719
vm.pmap.pc_chunk_allocs: 13562644
vm.pmap.pc_chunk_count: 13925
vm.pmap.pv_entry_count: 2323101  <------------
vm.pmap.pde.promotions: 0
vm.pmap.pde.p_failures: 0
vm.pmap.pde.mappings: 0
vm.pmap.pde.demotions: 0
vm.pmap.shpgperproc: 200
vm.pmap.pv_entry_max: 4276871
vm.pmap.pg_ps_enabled: 0

Squid uses almost all of the memory in use on both systems, so the
rough value of:

No superpages: 2323101 / 8599 = 270 pages per MB
Yes superpages: 4552007 / 18650 = 244 pages per MB

That's not quite fair since it's the whole system's page table vs.
squid's address space but these machines are all about squid so it's
fair enough.

I added up all the system's VSZ and divided it into the updated number
of pages; the system average is just 4.554KB per page, so it seems
like there must be barely any super pages in use, which is borne out
by the (promotions - demotions) figure of (12657 - 11968) 689 active
superpages.  Assuming that means what I think it means.

> can we modify Squid to fork at a slower pace?

I think there is a config option (sleep_after_fork) for that already.
But it would not help as the processes are already forked one at a
time and the page table duplication happens before the exec.  Even if
it did help, sleep_after_fork would have to be several seconds, and
squid would still be nonresponsive for sleep_after_fork * 16.

It sounds like this problem is a bit more serious and complicated than
I thought.  I really hoped it was a config error on our part. :-(

Here is the full memory report as the process VSZ crossed over to
19,001 MB, which shows a total of 9,317,346 KB in use:

Current memory usage:
Pool     Obj Size       Allocated                                        In Use 
Idle                     Allocations Saved                       Hit Rate       
         (bytes)        (#)      (KB)    high (KB)       high (hrs)      impact 
(%total)        (#)     
(KB)     high (KB)       portion (%alloc)       (#)      (KB)    high (KB)      
(number)        
(%num)   (%vol) (%num)  (number)
2K Buffer                2048    96      192     248     0.52    0       91     
 182     248     95     
5        10      94      34958835        4.42    18.12   99.96   34972340
4K Buffer                4096    199     796     3904    11.98   0       194    
 776     3904   
97       5       20      1720    7225790         0.91    7.49    98.41   7342427
8K Buffer                8192    21      168     272     12.05   0       17     
 136     272     81     
4        32      176     1880965         0.24    3.90    98.40   1911548
16K Buffer               16384   3       48      64      13.39   0       0      
 0       64      0       3       48     
64       3600    0.00    0.01    98.41   3658
32K Buffer               32768   2       64      128     14.56   0       0      
 0       128     0       2      
64       128     369     0.00    0.00    98.14   376
64K Buffer               65536   2       128     192     12.81   0       0      
 0       192     0       2      
128      128     88      0.00    0.00    95.65   92
Store Client Buffer      4096    22      88      1052    12.05   0       11     
 44      1052    50     
11       44      812     30568572        3.87    31.69   99.74   30646811
acl                        64    11      1       1       15.35   0       11     
 1       1       100     0       0      
1        11      0.00    0.00    50.00   22
acl_ip_data                24    7       1       1       15.35   0       7      
 1       1       100     0       0       1      
7        0.00    0.00    50.00   14
acl_list                   24    18      1       1       15.35   0       18     
 1       1       100     0       0      
1        18      0.00    0.00    50.00   36
dlink_node                 24    337     8       8       0.52    0       0      
 0       8       0       337     8       8      
1088     0.00    0.00    76.35   1425
FwdServer                  24    41      1       6       12.05   0       9      
 1       6       22      32      1       5      
3875752  0.49    0.02    99.87   3880603
HttpReply                 168    338927  55606   55606   0.00    1       338922
55605    55606   100     5       1       23      10817354        1.37    0.46   
 96.17   11248509
HttpHeaderEntry            40    2602377         101656  101656  0.00    1      
 2602282        
101652   101656  100     95      4       101     200511069       25.36   2.03   
 98.06  
204487269
HttpHdrCc                  40    27921   1091    1091    0.01    0       27914  
 1091   
1091     100     7       1       3       7294400         0.92    0.07    99.17  
 7355123
HttpHdrRangeSpec           16    96      2       2       13.43   0       0      
 0       2       0       96      2       2      
25596    0.00    0.00    99.31   25775
HttpHdrRange               16    2       1       1       12.07   0       0      
 0       1       0       2       1       1      
21759    0.00    0.00    99.44   21881
HttpHdrContRange           24    5       1       1       13.46   0       0      
 0       1       0       5       1       1      
10713    0.00    0.00    97.21   11021
intlist                    16    3       1       1       15.35   0       3      
 1       1       100     0       0       1      
3        0.00    0.00    50.00   6
MemObject                 272    338920  90026   90026   0.00    1       338915
90025    90026   100     5       2       39      5022931         0.64    0.35   
 93.21   5388889
mem_node                 4112    2338883         9392078         9392078        
 0.00    96     
2338871  9392029         9392078         100     12      49      3908    
15349368        1.94    15.98  
81.20    18902877
relist                     48    8       1       1       15.35   0       8      
 1       1       100     0       0       1      
8        0.00    0.00    50.00   16
request_t                1360    341     453     893     12.05   0       326    
 433     893    
96       15      20      496     9456631         1.20    3.26    98.60   9590792
StoreEntry                 88    338920  29126   29126   0.00    0       338915
29126    29126   100     5       1       13      5025544         0.64    0.11   
 93.26   5388889
wordlist                   16    14      1       1       15.35   0       11     
 1       1       79      3       1       1      
88       0.00    0.00    86.27   102
ClientInfo                344    28096   9439    14657   12.40   0       15714  
 5279   
14657    56      12382   4160    4852    653528  0.08    0.06    93.63   697962
MD5 digest                 16    338920  5296    5296    0.00    0       338915 
5296    
5296     100     5       1       3       13112283        1.66    0.05    97.40  
 13462778
helper_request             40    353     14      14      0.52    0       1      
 1       14      0       352    
14       14      4805587         0.61    0.05    99.99   4805971
storeSwapTLV               24    5       1       1       15.32   0       0      
 0       1       0       5       1       1      
47627493         6.02    0.29    100.00  47627498
Short Strings              36    2939320         103336  103336  0.00    1      
 2939241        
103333   103336  100     79      3       75      219162796       27.72   2.00   
 98.15  
223283726
Medium Strings            128    15166   1896    1951    0.52    0       15126  
 1891   
1951     100     40      5       89      50029245        6.33    1.62    99.39  
 50334410
Long Strings              512    5927    2964    3002    0.52    0       5911   
 2956   
3002     100     16      8       143     9251916         1.17    1.20    99.22  
 9325056
cbdata acl_access (1)      56    13      1       1       15.35   0       13     
 1       1       100     0       0      
1        13      0.00    0.00    50.00   26
cbdata aclCheck_t (2)     312   
daf
 1       1       1       15.32   0       0       0       1       0       1      
 1       1       22575783        2.86    1.78    100.00  22575832
cbdata clientHttpRequest (3)     1096    21      23      481     0.52    0      
 15      17      481    
71       6       7       387     4735510         0.60    1.31    98.21   4821935
cbdata ConnStateData (4)          320    194     61      299     11.98   0      
 190     60      299    
98       4       2       137     2604058         0.33    0.21    97.41   2673373
cbdata ErrorState (5)     160    200     32      32      0.06    0       199    
 32      32      100    
1        1       1       16266   0.00    0.00    98.79   16466
cbdata FwdState (6)        96    31      3       24      12.05   0       9      
 1       24      29      22      3      
20       3871284         0.49    0.09    99.76   3880603
cbdata generic_cbdata (7)          32    30      1       4       15.32   0      
 0       0       4       0       30     
1        4       14750620        1.87    0.12    100.00  14750938
cbdata helper (8)         136    1       1       1       15.35   0       1      
 1       1       100     0       0       0      
0        0.00    0.00    0.00    1
cbdata helper_server (9)          152    16      3       3       15.35   0      
 16      3       3       100     0      
0        3       32      0.00    0.00    66.67   48
cbdata HttpStateData (12)         152    38      6       37      12.05   0      
 9       2       37      24     
29       5       32      3874204         0.49    0.15    99.85   3880194
cbdata ps_state (14)      200    1       1       1       15.32   0       0      
 0       1       0       1       1       1      
3880602  0.49    0.20    100.00  3880603
cbdata RemovalPolicy (15)         104    1       1       1       15.35   0      
 1       1       1       100     0      
0        0       0       0.00    0.00    0.00    1
cbdata store_client (18)          128    167     21      45      12.05   0      
 133     17      45     
80       34      5       29      8643422         1.09    0.28    99.89   8652735
event                      48    34      2       3       12.38   0       31     
 2       3       91      3       1       1      
8007262  1.01    0.10    99.96   8010458
cbdata http_port_list (19)        128    2       1       1       14.75   0      
 2       1       1       100     0      
0        0       0       0.00    0.00    0.00    2
cbdata body_size (20)      64    1       1       1       15.35   0       1      
 1       1       100     0       0       1      
1        0.00    0.00    50.00   2
cbdata ConnectStateData (21)       96    16      2       11      15.32   0      
 0       0       11      0      
16       2       11      3826157         0.48    0.09    99.96   3827561
close_handler              24    236     6       31      12.05   0       224    
 6       31      95      12     
1        13      14200692        1.80    0.09    99.57   14261946
ipcache_entry             128    978     123     137     13.32   0       951    
 119     137    
97       27      4       22      611237  0.08    0.02    98.86   618298
fqdncache_entry           160    20      4       4       15.35   0       20     
 4       4       100     0       0      
4        24      0.00    0.00    54.55   44
cbdata idns_query (22)   8680    2       17      941     15.32   0       0      
 0       941     0       2      
17       941     616884  0.08    1.36    99.78   618218
cbdata redirectStateData (23)      72    353     25      25      0.52    0      
 1       1       25     
0        352     25      25      4805595         0.61    0.09    99.99   4805971
cbdata Logfile (26)      1120    3       4       4       15.35   0       3      
 4       4       100     0       0       4      
3        0.00    0.00    50.00   6
pconn_data                 32    6       1       1       13.34   0       1      
 1       1       17      5       1       1      
61876    0.01    0.00    99.69   62069
pconn_fds                  32    6       1       1       13.34   0       1      
 1       1       17      5       1       1      
61889    0.01    0.00    99.71   62069
cbdata AddVaryState (29)          160    1       1       2       12.38   0      
 0       0       2       0       1       1      
2        305954  0.04    0.01    99.84   306435
cbdata LocateVaryState (30)       136    3       1       1       12.38   0      
 0       0       1       0       3      
1        1       571615  0.07    0.02    99.99   571661
VaryData                   32    6       1       1       13.62   0       1      
 1       1       17      5       1       1      
570046   0.07    0.00    99.72   571661
cbdata StoreUpdateState (31)     4152    2       9       41      12.34   0      
 0       0       41      0      
2        9       41      1044654         0.13    1.10    99.71   1047729
Total                    -       9317346         9794810         9794810        
 0.00    100    
9303255  9790117         9790185         100     14091   4693    5120    
776329091       98.19  
95.79    98.19   790614788
Cumulative allocated volume: 395.07 GB
Current overhead: 301304 bytes (0.003%)
Idle pool limit: 5.00 MB
memPoolAlloc calls: 790614788
memPoolFree calls: 781311532
String Pool      Impact         
         (%strings)      (%volume)
Short Strings            99      96
Medium Strings           1       2
Long Strings             0       3
Other Strings            0       0

Large buffers: 0 (0 KB)

Are we 100% certain this is normal / correct?  Cutting the process
size in half would double the process spawn rate. :)

Thanks!

Reply via email to