Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-26 Thread Nikolay Pavlov
On Monday, 25 December 2006 at 13:23:17 +0800, LI Xin wrote:
 Hi, Nikolay,
 
 Our local customer has applied the following and it seems to have
 'solved' their problem on squid:
 
echo kern.ipc.nmbclusters=0 >> /boot/loader.conf
 
 and then reboot.  They have been running with the 20061212 patch but I
 suspect that it's no longer necessary.  The feedback I have received is
 that they have run with this for two weeks without problem, serving
 video streams.
 
 Please let us know if this works.  Wish you a happy new year :-)

Thanks. 

Still no luck with my problem :)
It dies for me at about this point:

[EMAIL PROTECTED]:~# netstat -m
208965/585/209550 mbufs in use (current/cache/total)
208432/474/208906/0 mbuf clusters in use (current/cache/total/max)
208432/464 mbuf+clusters out of packet secondary zone in use
(current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
469105K/1094K/470199K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/3/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

With panic: kmem_malloc(4096): kmem_map too small: 536870912 total
allocated

So there is still a memory leak somewhere, or I have some rare and
hard-to-detect hardware bug. At least all the other squid servers in
this load balancer cluster work fine even under the same load as this
box, and there are no mbuf-related problems on them.
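As a sanity check on those numbers (using the 256-byte mbuf and 2048-byte mbuf_cluster item sizes shown in the vmstat -z output later in this thread), the network allocation reported by netstat -m can be reproduced from the "current" counts, and it accounts for most of the 512 MB kmem_map the panic complains about:

```python
MBUF_SIZE = 256       # bytes per mbuf (item size of the "mbuf" zone in vmstat -z)
CLUSTER_SIZE = 2048   # bytes per standard mbuf cluster ("mbuf_cluster" zone)

mbufs = 208965        # "current" mbufs from the netstat -m output above
clusters = 208432     # "current" mbuf clusters

net_kb = (mbufs * MBUF_SIZE + clusters * CLUSTER_SIZE) // 1024
print(net_kb)         # 469105, matching "469105K ... bytes allocated to network"

kmem_map = 536870912  # "kmem_map too small: 536870912 total allocated"
print(net_kb * 1024 * 100 // kmem_map)  # 89 -> network buffers hold ~89% of kmem_map
```

So whether or not the clusters are leaked, the cluster zone alone is close to exhausting the kernel map, which is consistent with the kmem_malloc panic above.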

In any case thank you and all folks for help :)
Happy holidays!

 
 Cheers,
 -- 
 Xin LI [EMAIL PROTECTED]http://www.delphij.net/
 FreeBSD - The Power to Serve!
 



-- 
==  
- Best regards, Nikolay Pavlov. ---
==  

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-24 Thread LI Xin
Hi, Nikolay,

Our local customer has applied the following and it seems to have
'solved' their problem on squid:

echo kern.ipc.nmbclusters=0  /boot/loader.conf

and then reboot.  They have been running with the 20061212 patch but I
suspect that it's no longer necessary.  The feedback I have received is
that they have run with this for two weeks without problem, serving
video streams.

Please let us know if this works.  Wish you a happy new year :-)

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread Nikolay Pavlov
No luck at all.
patch-zonelim-drain-20061212 behaves for me the same as the previous one:
no panics, but still zoneli.
All this is very odd, because the other two squid servers work
perfectly behind the same load balancer without any patches or
kernel panics. I think the case with this server
is really rare.


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread LI Xin
Nikolay Pavlov wrote:
 No luck at all.
 patch-zonelim-drain-20061212 behaves for me the same as the previous one:
 no panics, but still zoneli.
 All this is very odd, because the other two squid servers work
 perfectly behind the same load balancer without any patches or
 kernel panics. I think the case with this server
 is really rare.

Would you please give a vmstat -z output when the server is stuck in the
zonelim livelock?  Thanks!

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread Nikolay Pavlov
On Wednesday, 13 December 2006 at  3:02:40 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
  No luck at all.
  patch-zonelim-drain-20061212 behaves for me the same as the previous one:
  no panics, but still zoneli.
  All this is very odd, because the other two squid servers work
  perfectly behind the same load balancer without any patches or
  kernel panics. I think the case with this server
  is really rare.
 
 Would you please give a vmstat -z output when the server is stuck in the
 zonelim livelock?  Thanks!

130947/775/131722 mbufs in use (current/cache/total)
130859/213/131072/131072 mbuf clusters in use (current/cache/total/max)
130859/213 mbuf+clusters out of packet secondary zone in use
(current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
294454K/619K/295074K bytes allocated to network (current/cache/total)
0/493001/246499 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
83 calls to protocol drain routines

ITEM  SIZE  LIMIT  USED  FREE  REQUESTS

UMA Kegs:140,0,  84, 12,   84
UMA Zones:   120,0,  84,  6,   84
UMA Slabs:64,0, 941,  3, 2709
UMA RCntSlabs:   104,0,   65536, 28,65536
UMA Hash:128,0,   4, 26,6
16 Bucket:76,0,  27, 23,   35
32 Bucket:   140,0,  21,  7,   29
64 Bucket:   268,0,  30, 26,   67
128 Bucket:  524,0, 215,457,   827387
VM OBJECT:   132,0,   41860, 16,72475
MAP: 192,0,   7, 33,7
KMAP ENTRY:   68,57456, 121, 47,97518
MAP ENTRY:68,0, 702,362,94390
PV ENTRY: 24,  2228360,  124664,   2646,   984278
DP fakepg:72,0,   0,  0,0
mt_zone:  64,0, 182, 54,  182
16:   16,0,3754,306,   432518
32:   32,0,1794,353,   116429
64:   64,0,3205,   3580,   206314
128: 128,0,1616,   1384,   371069
256: 256,0, 368,427,22536
512: 512,0,1266, 30,   486094
1024:   1024,0,  48, 84,   434408
2048:   2048,0, 147, 61,43326
4096:   4096,0, 129, 22, 4942
Files:72,0,2672,   2734,   481720
MAC labels:   20,0,   59586,   1085,   622560
PROC:536,0,  71, 27, 1564
THREAD:  376,0,  98, 22,   98
KSEGRP:   88,0,  98, 62,   98
UPCALL:   44,0,   0,  0,0
VMSPACE: 296,0,  28, 24, 1518
mbuf_packet: 256,0,  131052, 20,  9510911
mbuf:256,0,  91,559, 12694062
mbuf_cluster:   2048,   131072,  131072,  0,   136594
mbuf_jumbo_pagesize: 4096,0,   0,  0,0
mbuf_jumbo_9k:  9216,0,   0,  0,0
mbuf_jumbo_16k: 16384,0,   0,  0,0
ACL UMA zone:388,0,   0,  0,0
g_bio:   132,0,   0,   1160,   522435
ata_request: 204,0,   0,  0,0
ata_composite:   196,0,   0,  0,0
VNODE:   348,0,   47312, 10,52436
VNODEPOLL:76,0,   0,  0,0
S VFS Cache:  68,0,   39770, 46,44847
L VFS Cache: 291,0,   0,  0,0
NAMEI:  1024,0,   0, 12,   205701
DIRHASH:1024,0,1595,229, 7023
NFSMOUNT:480,0,   1, 15,1
NFSNODE: 536,0,  16,  5,   16
PIPE:408,0,   6, 21,  664
KNOTE:68,0,   0,112,   76
socket:  356,   131076,3930,   1229,   186491
unpcb:   140,   131096,  12, 44,  170
ipq:  32, 4181,   0,  0,0
udpcb:   180,   131076,  12, 32,  188
inpcb:   180,   131076,3859,   1509,   186132
tcpcb:   464,   131072,3859,   1237,   186132
tcptw:48, 8268,   0,546,   140117
syncache:100,15366,   2,271,   165792
hostcache:76,15400,3717, 33, 6425
tcpreass: 20, 8281,   0,169, 1493
sackhole: 20,0,   4,334,   372356
ripcb:   

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-11 Thread LI Xin
Hi,

Would you please give the following patch a try?

http://people.freebsd.org/~delphij/misc/patch-zonelim-drain

Note: Please revert my previous patch against sys/kern/kern_mbuf.c.

This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2],
it schedules a drain of uma zones when they are low on memory.

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-11 Thread Nikolay Pavlov
On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
 Hi,
 
 Would you please give the following patch a try?
 
 http://people.freebsd.org/~delphij/misc/patch-zonelim-drain
 
 Note: Please revert my previous patch against sys/kern/kern_mbuf.c.
 
 This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2],
 it schedules a drain of uma zones when they are low on memory.


This time things worked out a bit better: there was no kernel panic, and
my server managed to get past the magic number of 65550 mbufs. But very
soon the server reached another limit - 131072 mbuf clusters
(this is my limit for kern.ipc.nmbclusters) -
and the server started to drop packets. After I
removed the load I found my server responding, but when I actually
accessed it I found that although the number of connections had
dropped considerably, the memory allocated to the network did not become
free. So I believe there is still an mbuf leak somewhere.

[EMAIL PROTECTED]:~# sockstat -4 | wc -l
  17

[EMAIL PROTECTED]:~# netstat -m
1082/131578/132660 mbufs in use (current/cache/total)
1080/129992/131072/131072 mbuf clusters in use (current/cache/total/max)
^^
1080/128712 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
2430K/292878K/295309K bytes allocated to network (current/cache/total)
^
0/1058420/529208 requests for mbufs denied
(mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
156 calls to protocol drain routines
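For what it's worth, the current/cache byte split reported above is self-consistent with the mbuf and cluster counts; nearly all of the 295309K total is sitting in the "cache" column. A quick check (using the 256-byte mbuf and 2048-byte cluster item sizes from the vmstat -z output below):

```python
MBUF_SIZE, CLUSTER_SIZE = 256, 2048   # item sizes of the mbuf and mbuf_cluster zones

# "current" and "cache" columns from the netstat -m output above
mbufs_cur, mbufs_cache = 1082, 131578
clust_cur, clust_cache = 1080, 129992

cur_kb = (mbufs_cur * MBUF_SIZE + clust_cur * CLUSTER_SIZE) // 1024
cache_kb = (mbufs_cache * MBUF_SIZE + clust_cache * CLUSTER_SIZE) // 1024
print(cur_kb, cache_kb)   # 2430 292878, matching "2430K/292878K ... (current/cache)"
```

In other words, the arithmetic reproduces the reported figures exactly, with only a tiny fraction of the allocated network memory still in live use.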

[EMAIL PROTECTED]:~# vmstat -z
ITEM  SIZE  LIMIT  USED  FREE  REQUESTS

UMA Kegs:140,0,  84, 12,   84
UMA Zones:   120,0,  84,  6,   84
UMA Slabs:64,0, 904, 40,31603
UMA RCntSlabs:   104,0,   65536, 28,65536
UMA Hash:128,0,   4, 26,6
16 Bucket:76,0,  23, 27,   32
32 Bucket:   140,0,  20,  8,   29
64 Bucket:   268,0,  23, 33,   46
128 Bucket:  524,0,1423,  5,  1590808
VM OBJECT:   132,0,   18094,147,43780
MAP: 192,0,   7, 33,7
KMAP ENTRY:   68,57456,  17,151,   151954
MAP ENTRY:68,0, 718,290,   182847
PV ENTRY: 24,  2228360,  124232,   4528,   861505
DP fakepg:72,0,   0,  0,0
mt_zone:  64,0, 182, 54,  182
16:   16,0,2828,   1435,   273492
32:   32,0,1769,491,   116114
64:   64,0,3199,   2701,   181154
128: 128,0,1813,   1517,   273347
256: 256,0, 365,415,13230
512: 512,0,1006, 10,35578
1024:   1024,0,  49, 83,72385
2048:   2048,0, 170, 38,40942
4096:   4096,0, 129, 22,89367
Files:72,0, 100,   5624,   227230
MAC labels:   20,0,   20726,  15778,   368518
PROC:536,0,  71, 27, 1441
THREAD:  376,0,  98, 22,   98
KSEGRP:   88,0,  98, 62,   98
UPCALL:   44,0,   0,  0,0
VMSPACE: 296,0,  28, 24, 1398
mbuf_packet: 256,0,1080, 128712,  6714669
mbuf:256,0,   2,   2866,  7912768
mbuf_cluster:   2048,   131072,  129792,   1280,   166272
mbuf_jumbo_pagesize: 4096,0,   0,  0,0
mbuf_jumbo_9k:  9216,0,   0,  0,0
mbuf_jumbo_16k: 16384,0,   0,  0,0
ACL UMA zone:388,0,   0,  0,0
g_bio:   132,0,   0,783,   194850
ata_request: 204,0,   0,  0,0
ata_composite:   196,0,   0,  0,0
VNODE:   348,0,   20116,  3,22965
VNODEPOLL:76,0,   0,  0,0
S VFS Cache:  68,0,   16387, 77,18383
L VFS Cache: 291,0,   0,  0,0
NAMEI:  1024,0,   0, 12,   166126
DIRHASH:1024,0,1688,128, 3307
NFSMOUNT:480,0,   0,  0,0
NFSNODE: 536,0,   0,  0,0
PIPE:408,0,   6, 21,  595
KNOTE:68,0,   0,112,   64
socket:   

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-11 Thread LI Xin
Hi, Nikolay,

Nikolay Pavlov wrote:
 On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
 Hi,

 Would you please give the following patch a try?

 http://people.freebsd.org/~delphij/misc/patch-zonelim-drain

 Note: Please revert my previous patch against sys/kern/kern_mbuf.c.

 This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2],
 it schedules a drain of uma zones when they are low on memory.
 
 
 This time things worked out a bit better: there was no kernel panic, and
 my server managed to get past the magic number of 65550 mbufs. But very
 soon the server reached another limit - 131072 mbuf clusters

Do you still get squid stuck in the zoneli state, with the server
becoming unresponsive?

 (this is my limit for kern.ipc.nmbclusters) -
 and the server started to drop packets. After I
 removed the load I found my server responding, but when I actually
 accessed it I found that although the number of connections had
 dropped considerably, the memory allocated to the network did not become
 free. So I believe there is still an mbuf leak somewhere.

This looks weird to me...  Would you please try to add some load to the
server and then remove it, to see whether the 'current' mbuf cluster
count keeps increasing or not?
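One way to watch that counter over time is to pull the "current" field out of the netstat -m output; a minimal sketch parsing the line format shown above (run against a saved sample here; in practice you would feed it live netstat -m output in a loop):

```python
def current_clusters(netstat_m_output: str) -> int:
    """Return the 'current' mbuf-cluster count from `netstat -m` output."""
    for line in netstat_m_output.splitlines():
        if "mbuf clusters in use" in line:
            # Line looks like: current/cache/total/max mbuf clusters in use (...)
            return int(line.split("/")[0])
    raise ValueError("no mbuf cluster line found")

# Sample line taken from the report above
sample = "1080/129992/131072/131072 mbuf clusters in use (current/cache/total/max)"
print(current_clusters(sample))   # 1080
```

Logging that value periodically while adding and removing load would show whether "current" ever comes back down.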

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-11 Thread Nikolay Pavlov
On Tuesday, 12 December 2006 at  2:59:37 +0800, LI Xin wrote:
 Hi, Nikolay,
 
 Nikolay Pavlov wrote:
  On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
  Hi,
 
  Would you please give the following patch a try?
 
  http://people.freebsd.org/~delphij/misc/patch-zonelim-drain
 
  Note: Please revert my previous patch against sys/kern/kern_mbuf.c.
 
  This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2],
  it schedules a drain of uma zones when they are low on memory.
  
  
  This time things worked out a bit better: there was no kernel panic, and
  my server managed to get past the magic number of 65550 mbufs. But very
  soon the server reached another limit - 131072 mbuf clusters
 
 Do you still get squid stuck in the zoneli state, with the server
 becoming unresponsive?

Yes. No panic, but squid still sits idle in zoneli and the server
becomes unresponsive via the network.

last pid:  1990;  load averages:  0.09,  0.30,  0.16  up 0+03:27:28  16:27:57
30 processes:  1 running, 29 sleeping

Mem: 497M Active, 600M Inact, 441M Wired, 12K Cache, 112M Buf, 2352M Free
Swap: 4070M Total, 4070M Free


  PID USERNAME  THR PRI NICE   SIZERES STATETIME   WCPU COMMAND
  684 squid   1 -160   480M   480M zoneli  11:41  0.00% squid
  694 root1  960  6636K  4800K select   0:06  0.00% snmpd
  691 squid   1  -80  1224K   632K piperd   0:01  0.00% unlinkd
  364 _pflogd 1 -580  1600K  1216K bpf  0:01  0.00% pflogd
 1840 root1  -80  7768K  7304K piperd   0:00  0.00% perl5.8.8
  722 root1  960  3464K  2796K select   0:00  0.00% sendmail
  563 root1  960  1352K  1048K select   0:00  0.00% syslogd
 1878 root1   5  -10  5072K  3056K ttyin0:00  0.00% tcsh
 1841 root1  200  4976K  3036K pause0:00  0.00% csh
 1874 quetzal 1  960  6220K  3252K select   0:00  0.00% sshd
 1877 root1  200  5056K  3028K pause0:00  0.00% tcsh
 1872 root1   40  6232K  3236K sbwait   0:00  0.00% sshd
  732 root1   80  1364K  1060K nanslp   0:00  0.00% cron
  783 root1   80  1692K  1396K wait 0:00  0.00% login
 1875 quetzal 1  200  4736K  2964K pause0:00  0.00% tcsh
  668 root1  960  1264K   804K select   0:00  0.00% usbd
  706 root1  960  3504K  2676K select   0:00  0.00% sshd
  726 smmsp   1  200  3364K  2724K pause0:00  0.00% sendmail

last pid:  1992;  load averages:  0.06,  0.27,  0.16  up 0+03:27:50  16:28:19
75 processes:  2 running, 52 sleeping, 21 waiting

Mem: 497M Active, 600M Inact, 441M Wired, 12K Cache, 112M Buf, 2351M Free
Swap: 4070M Total, 4070M Free


  PID USERNAME  THR PRI NICE   SIZERES STATETIME   WCPU COMMAND
   10 root1 171   52 0K 8K RUN188:49 93.85% idle
   22 root1 -68 -187 0K 8K WAIT 1:54  3.56% irq29: em1
  684 squid   1 -160   480M   480M zoneli  11:41  0.05% squid
   13 root1 -44 -163 0K 8K WAIT 3:08  0.00% swi1: net
   11 root1 -32 -151 0K 8K WAIT 0:27  0.00% swi4: clock sio
   42 root1  200 0K 8K syncer   0:09  0.00% syncer
  694 root1  960  6636K  4800K select   0:06  0.00% snmpd
   14 root1 -160 0K 8K -0:04  0.00% yarrow
   39 root1 171   52 0K 8K pgzero   0:03  0.00% pagezero
3 root1  -80 0K 8K -0:02  0.00% g_up
4 root1  -80 0K 8K -0:02  0.00% g_down
   37 root1 -160 0K 8K psleep   0:02  0.00% pagedaemon
  691 squid   1  -80  1224K   632K piperd   0:01  0.00% unlinkd
   43 root1 -160 0K 8K sdflus   0:01  0.00% softdepflush
   20 root1 -64 -183 0K 8K WAIT 0:01  0.00% irq48: amr0
2 root1  -80 0K 8K -0:01  0.00% g_event
   44 root1 -600 0K 8K -0:01  0.00% schedcpu
  364 _pflogd 1 -580  1600K  1216K bpf  0:01  0.00% pflogd

  PID  TT  STAT  TIME COMMAND
0  ??  WLs0:00.00 [swapper]
1  ??  ILs0:00.01 /sbin/init --
2  ??  DL 0:00.91 [g_event]
3  ??  DL 0:02.25 [g_up]
4  ??  DL 0:02.15 [g_down]
5  ??  DL 0:00.00 [kqueue taskq]
6  ??  DL 0:00.00 [acpi_task_0]
7  ??  DL 0:00.00 [acpi_task_1]
8  ??  DL 0:00.00 [acpi_task_2]
9  ??  DL 0:00.00 [thread taskq]
   10  ??  RL   188:42.12 [idle]
   11  ??  WL 0:26.95 [swi4: clock sio]
   12  ??  WL 0:00.00 [swi3: vm]
   13  ??  WL 3:08.22 [swi1: net]
   14  ??  DL 0:04.36 [yarrow]
   15  ??  WL 0:00.00 [swi5: +]
   16  ??  WL 0:00.00 [swi2: cambio]
   17  ??  WL 0:00.00 [swi6: task queue]
   18  ??  WL 0:00.02 [swi6: Giant taskq]
   19  ??  WL 0:00.00 [irq9: acpi0]
   20  ??  WL 0:01.20 [irq48: amr0]
   21  ??  WL 0:00.00 [irq28: em0]
   22  ??  WL 1:53.84 [irq29: 

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-11 Thread LI Xin
Hi, Nikolay,

Nikolay Pavlov wrote:
 On Tuesday, 12 December 2006 at  2:59:37 +0800, LI Xin wrote:
 Hi, Nikolay,

 Nikolay Pavlov wrote:
 On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
 Hi,

 Would you please give the following patch a try?

 http://people.freebsd.org/~delphij/misc/patch-zonelim-drain

 Note: Please revert my previous patch against sys/kern/kern_mbuf.c.

 This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2],
 it schedules a drain of uma zones when they are low on memory.

 This time things worked out a bit better: there was no kernel panic, and
 my server managed to get past the magic number of 65550 mbufs. But very
 soon the server reached another limit - 131072 mbuf clusters.
 Do you still get squid stuck in the zoneli state, with the server
 becoming unresponsive?
 
 Yes. No panic, but squid still sits idle in zoneli and the server
 becomes unresponsive via the network.

I am not quite sure but please let me know if the patch:

http://people.freebsd.org/~delphij/misc/patch-zonelim-drain-20061212

would help the situation.  You can also modify wakeup_one(keg); to
wakeup(keg); manually, at sys/vm/uma_core.c:2507.

Note that this would (potentially) be a pessimization for the zonelim
case, where all zonelim'ed threads would be woken rather than only one.
However, confirming whether this helps your situation would help
narrow down the issue so we can think about a better solution for it.

Thanks!

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-25 Thread Nikolay Pavlov
On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
  On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
  Hi,
 
  On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] 
  wrote:
  Hi.
  It seems I have a deadlock on 6.2-PRERELEASE.
  This is a squid server in accelerator mode.
  I can easily trigger it with a high rate of requests.
  Squid is locked in some zoneli state; I am not sure what it is.
  Also I can't kill the process even with SIGKILL.
  In addition, one of the sshd processes is locked too.
  Would you please update to the latest RELENG_6 and apply this patch:
 
  http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
 
  to see if things gets improved?
 
  Thanks in advance!
 
  Cheers,
  
  Well. This patch works quite ambiguously for me.
  Under heavy load this box becomes unresponsive via the network.
  The system is mostly idle. Squid is locked in zoneli.

Another panic. Guys, do I need some additional debug options, or is this
info enough? I am asking because this panic is easily reproducible for
me.

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug 
/var/crash/vmcore.4
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-marcel-freebsd.

Unread portion of the kernel message buffer:
lock order reversal: (sleepable after non-sleepable)
 1st 0xca21567c so_snd (so_snd) @ /usr/src/sys/netinet/tcp_output.c:253
 2nd 0xc070bd84 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
KDB: stack backtrace:
kdb_backtrace(,c071ccb0,c071c210,c06e5c4c,c0758f18,...) at 
kdb_backtrace+0x29
witness_checkorder(c070bd84,9,c06be56c,c02,c070d2c4,0,c06aab25,9f) at 
witness_checkorder+0x4cd
_sx_xlock(c070bd84,c06be56c,c02) at _sx_xlock+0x2c
_vm_map_lock_read(c070bd40,c06be56c,c02,184637d,c92796b0,...) at 
_vm_map_lock_read+0x37
vm_map_lookup(f48a29d0,0,1,f48a29d4,f48a29c4,f48a29c8,f48a29ab,f48a29ac) at 
vm_map_lookup+0x28
vm_fault(c070bd40,0,1,0,c927aa80,...) at vm_fault+0x65
trap_pfault(f48a2a98,0,c) at trap_pfault+0xee
trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
tcp_output(d21c5570) at tcp_output+0x9af
tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
ip_input(d0020d00) at ip_input+0x561
netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---


Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0xc
fault code  = supervisor read, page not present
instruction pointer = 0x20:0xc053ea34
stack pointer   = 0x28:0xf48a2ad8
frame pointer   = 0x28:0xf48a2ae4
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 13 (swi1: net)
trap number = 12
panic: page fault
KDB: stack backtrace:
kdb_backtrace(100,c927aa80,28,f48a2a98,c,...) at kdb_backtrace+0x29
panic(c069b8a1,c06c5f2c,0,f,c927d69b,...) at panic+0xa8
trap_fatal(f48a2a98,c,c927aa80,c070bd40,0,...) at trap_fatal+0x2a6
trap_pfault(f48a2a98,0,c) at trap_pfault+0x187
trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
tcp_output(d21c5570) at tcp_output+0x9af
tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
ip_input(d0020d00) at ip_input+0x561
netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
Uptime: 25m13s
Dumping 3967 MB (3 chunks)
  chunk 0: 1MB (159 pages) ... ok
  chunk 1: 3966MB (1015280 pages) 3950 3934 3918 3902 3886 3870 3854 3838 3822 
3806 3790 3774 3758 3742 3726 3710 3694 3678 

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-25 Thread LI Xin
Nikolay Pavlov wrote:
 On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
 On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
 Hi,

 On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] 
 wrote:
 Hi.
 It seems I have a deadlock on 6.2-PRERELEASE.
 This is a squid server in accelerator mode.
 I can easily trigger it with a high rate of requests.
 Squid is locked in some zoneli state; I am not sure what it is.
 Also I can't kill the process even with SIGKILL.
 In addition, one of the sshd processes is locked too.
 Would you please update to the latest RELENG_6 and apply this patch:

 http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround

 to see if things gets improved?

 Thanks in advance!

 Cheers,
 Well. This patch works quite ambiguously for me.
 Under heavy load this box becomes unresponsive via the network.
 The system is mostly idle. Squid is locked in zoneli.
 
 Another panic. Guys, do I need some additional debug options, or is this
 info enough? I am asking because this panic is easily reproducible for
 me.

I think this stuff is enough.  By the way, which scheduler do you use?

 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug 
 /var/crash/vmcore.4
 kgdb: kvm_nlist(_stopped_cpus):
 kgdb: kvm_nlist(_stoppcbs):
 [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
 Undefined symbol ps_pglobal_lookup]
 GNU gdb 6.1.1 [FreeBSD]
 Copyright 2004 Free Software Foundation, Inc.
 GDB is free software, covered by the GNU General Public License, and you are
 welcome to change it and/or distribute copies of it under certain conditions.
 Type show copying to see the conditions.
 There is absolutely no warranty for GDB.  Type show warranty for details.
 This GDB was configured as i386-marcel-freebsd.
 
 Unread portion of the kernel message buffer:
 lock order reversal: (sleepable after non-sleepable)
  1st 0xca21567c so_snd (so_snd) @ /usr/src/sys/netinet/tcp_output.c:253
  2nd 0xc070bd84 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
 KDB: stack backtrace:
 kdb_backtrace(,c071ccb0,c071c210,c06e5c4c,c0758f18,...) at 
 kdb_backtrace+0x29
 witness_checkorder(c070bd84,9,c06be56c,c02,c070d2c4,0,c06aab25,9f) at 
 witness_checkorder+0x4cd
 _sx_xlock(c070bd84,c06be56c,c02) at _sx_xlock+0x2c
 _vm_map_lock_read(c070bd40,c06be56c,c02,184637d,c92796b0,...) at 
 _vm_map_lock_read+0x37
 vm_map_lookup(f48a29d0,0,1,f48a29d4,f48a29c4,f48a29c8,f48a29ab,f48a29ac) at 
 vm_map_lookup+0x28
 vm_fault(c070bd40,0,1,0,c927aa80,...) at vm_fault+0x65
 trap_pfault(f48a2a98,0,c) at trap_pfault+0xee
 trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
 calltrap() at calltrap+0x5
 --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
 m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
 tcp_output(d21c5570) at tcp_output+0x9af
 tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
 ip_input(d0020d00) at ip_input+0x561
 netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
 swi_net(0) at swi_net+0xc2
 ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
 ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
 fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
 
 
 Fatal trap 12: page fault while in kernel mode
 fault virtual address   = 0xc
 fault code  = supervisor read, page not present
 instruction pointer = 0x20:0xc053ea34
 stack pointer   = 0x28:0xf48a2ad8
 frame pointer   = 0x28:0xf48a2ae4
 code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, def32 1, gran 1
 processor eflags= interrupt enabled, resume, IOPL = 0
 current process = 13 (swi1: net)
 trap number = 12
 panic: page fault
 KDB: stack backtrace:
 kdb_backtrace(100,c927aa80,28,f48a2a98,c,...) at kdb_backtrace+0x29
 panic(c069b8a1,c06c5f2c,0,f,c927d69b,...) at panic+0xa8
 trap_fatal(f48a2a98,c,c927aa80,c070bd40,0,...) at trap_fatal+0x2a6
 trap_pfault(f48a2a98,0,c) at trap_pfault+0x187
 trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
 calltrap() at calltrap+0x5
 --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
 m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
 tcp_output(d21c5570) at tcp_output+0x9af
 tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
 ip_input(d0020d00) at ip_input+0x561
 netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
 swi_net(0) at swi_net+0xc2
 ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
 ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
 fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
 fork_trampoline() at fork_trampoline+0x8
 --- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
 Uptime: 25m13s
 Dumping 3967 MB (3 chunks)
   

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-25 Thread Nikolay Pavlov
On Sunday, 26 November 2006 at  2:13:33 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
  On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
  Nikolay Pavlov wrote:
  On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
  Hi,
 
  On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] 
  wrote:
  Hi.
   It seems I have a deadlock on 6.2-PRERELEASE.
   This is a squid server in accelerator mode.
   I can easily trigger it with a high rate of requests.
   Squid is locked in some zoneli state; I am not sure what it is.
   Also I can't kill the process even with SIGKILL.
   In addition, one of the sshd processes is locked too.
  Would you please update to the latest RELENG_6 and apply this patch:
 
  http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
 
  to see if things gets improved?
 
  Thanks in advance!
 
  Cheers,
   Well. This patch works quite ambiguously for me.
   Under heavy load this box becomes unresponsive via the network.
   The system is mostly idle. Squid is locked in zoneli.
  
  Another panic. Guys, do I need some additional debug options, or is this
  info enough? I am asking because this panic is easily reproducible for
  me.
 
 I think this stuff is enough.  By the way, which scheduler do you use?

4BSD

 
  [EMAIL PROTECTED]:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug 
  /var/crash/vmcore.4
  kgdb: kvm_nlist(_stopped_cpus):
  kgdb: kvm_nlist(_stoppcbs):
  [GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: 
  Undefined symbol ps_pglobal_lookup]
  GNU gdb 6.1.1 [FreeBSD]
  Copyright 2004 Free Software Foundation, Inc.
  GDB is free software, covered by the GNU General Public License, and you are
  welcome to change it and/or distribute copies of it under certain 
  conditions.
  Type show copying to see the conditions.
  There is absolutely no warranty for GDB.  Type show warranty for details.
  This GDB was configured as i386-marcel-freebsd.
  
  Unread portion of the kernel message buffer:
  lock order reversal: (sleepable after non-sleepable)
   1st 0xca21567c so_snd (so_snd) @ /usr/src/sys/netinet/tcp_output.c:253
   2nd 0xc070bd84 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
  KDB: stack backtrace:
  kdb_backtrace(,c071ccb0,c071c210,c06e5c4c,c0758f18,...) at 
  kdb_backtrace+0x29
  witness_checkorder(c070bd84,9,c06be56c,c02,c070d2c4,0,c06aab25,9f) at 
  witness_checkorder+0x4cd
  _sx_xlock(c070bd84,c06be56c,c02) at _sx_xlock+0x2c
  _vm_map_lock_read(c070bd40,c06be56c,c02,184637d,c92796b0,...) at 
  _vm_map_lock_read+0x37
  vm_map_lookup(f48a29d0,0,1,f48a29d4,f48a29c4,f48a29c8,f48a29ab,f48a29ac) at 
  vm_map_lookup+0x28
  vm_fault(c070bd40,0,1,0,c927aa80,...) at vm_fault+0x65
  trap_pfault(f48a2a98,0,c) at trap_pfault+0xee
  trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
  calltrap() at calltrap+0x5
  --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
  m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
  tcp_output(d21c5570) at tcp_output+0x9af
  tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
  ip_input(d0020d00) at ip_input+0x561
  netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
  swi_net(0) at swi_net+0xc2
  ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
  ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
  fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
  fork_trampoline() at fork_trampoline+0x8
  --- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
  
  
  Fatal trap 12: page fault while in kernel mode
  fault virtual address   = 0xc
  fault code  = supervisor read, page not present
  instruction pointer = 0x20:0xc053ea34
  stack pointer   = 0x28:0xf48a2ad8
  frame pointer   = 0x28:0xf48a2ae4
  code segment= base 0x0, limit 0xf, type 0x1b
  = DPL 0, pres 1, def32 1, gran 1
  processor eflags= interrupt enabled, resume, IOPL = 0
  current process = 13 (swi1: net)
  trap number = 12
  panic: page fault
  KDB: stack backtrace:
  kdb_backtrace(100,c927aa80,28,f48a2a98,c,...) at kdb_backtrace+0x29
  panic(c069b8a1,c06c5f2c,0,f,c927d69b,...) at panic+0xa8
  trap_fatal(f48a2a98,c,c927aa80,c070bd40,0,...) at trap_fatal+0x2a6
  trap_pfault(f48a2a98,0,c) at trap_pfault+0x187
  trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
  calltrap() at calltrap+0x5
  --- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
  m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
  tcp_output(d21c5570) at tcp_output+0x9af
  tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
  ip_input(d0020d00) at ip_input+0x561
  netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
  swi_net(0) at swi_net+0xc2
  ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
  ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
  

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-24 Thread Nikolay Pavlov
On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
  On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
  Hi,
 
  On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] 
  wrote:
  Hi.
   It seems I have a deadlock on 6.2-PRERELEASE.
   This is a squid server in accelerator mode.
   I can easily trigger it with a high rate of requests.
   Squid is locked in some zoneli state; I am not sure what that is.
   Also I can't kill the process even with SIGKILL.
   In addition, one of the sshd processes is locked too.
  Would you please update to the latest RELENG_6 and apply this patch:
 
  http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
 
   to see if things improve?
 
  Thanks in advance!
 
  Cheers,
  
   Well, this patch works rather ambiguously for me.
   Under heavy load this box becomes unresponsive via the network.
   The system is mostly idle. Squid is locked in zoneli.
 
  Would you please give me the output of sysctl vm.zone on a patched
  system?  It does not matter whether it is under load.

Here it is:

[EMAIL PROTECTED]:~# sysctl vm.zone
vm.zone:
ITEM            SIZE     LIMIT     USED    FREE  REQUESTS

FFS2 dinode: 256,0,  84920,  15040,  1499541
FFS1 dinode: 128,0,  0,  0,0
FFS inode:   132,0,  84920,  15043,  1499541
Mountpoints: 664,0,  7,  5,7
SWAPMETA:276,   121576,  0,  0,0
pfosfp:   28,0,188,193,  188
pfospfen:108,0,345, 51,  345
pfiaddrpl:92,0,  0,  0,0
pfstatescrub: 28,0,  0,  0,0
pffrcent: 12,50141,  0,  0,0
pffrcache:48,10062,  0,  0,0
pffrag:   48,0,  0,156,   47
pffrent:  16,  203,  0,203,   99
pfrkentry2:  156,0,  0,  0,0
pfrkentry:   156,0,  5, 45,8
pfrktable:  1240,0,  4,  5,   11
pfpooladdrpl: 68,0,  2,110,2
pfaltqpl:128,0,  0,  0,0
pfstatepl:   260,15000,   4478,   1327,  1670408
pfrulepl:604,0,  9,  9,9
pfsrctrpl:   100,15015,  0,  0,0
rtentry: 132,0, 14, 44,   25
ripcb:   180,   131076,  0, 44,2
sackhole: 20,0, 95,243,  3451100
tcpreass: 20, 8281,  0,169, 2584
hostcache:76,15400,   6178,222,61324
syncache:100,15366, 43,152,  1682074
tcptw:48, 8268,270,198,  1499466
tcpcb:   464,   131072,816,   2632,  1723491
inpcb:   180,   131076,   1086,   2434,  1723491
udpcb:   180,   131076, 12, 32,  251
ipq:  32, 4181,  0,  0,0
unpcb:   140,   131096, 14, 42,  890
socket:  356,   131076,842,   2524,  1724635
KNOTE:68,0,  0,112,  128
PIPE:408,0,  7, 20, 2476
NFSNODE: 460,0,  3, 21,   14
NFSMOUNT:480,0,  1, 15,1
DIRHASH:1024,0,894,930,29255
NAMEI:  1024,0,  0, 12,  2924657
L VFS Cache: 291,0, 96,125,  202
S VFS Cache:  68,0,  80925,  13435,  1464860
VNODEPOLL:76,0,  0,  0,0
VNODE:   272,0,  84967,  15049,  1499617
ata_composit:196,0,  0,  0,0
ata_request: 204,0,  0,  0,0
g_bio:   132,0,  0,   3596,  1759992
ACL UMA zone:388,0,  0,  0,0
mbuf_jumbo_1:  16384,0,  0,  0,0
mbuf_jumbo_9:   9216,0,  0,  0,0
mbuf_jumbo_p:   4096,0,  0,  0,0
mbuf_cluster:   2048,   131072,   9788,  55914, 55681090
mbuf:256,65550,   9886,  55664, 94834002
mbuf_packet: 256,65550,   9946,  55604, 56552142
VMSPACE: 296,0, 30, 35, 5026
UPCALL:   44,0,  0,  0,0
KSEGRP:   88,0,112, 48,  112
THREAD:  376,0,112,  8,  112
PROC:536,0, 72, 40, 5070
MAC labels:   20,0,  88313,  13594,  6711573
Files:72,0,342,   3103,  4744891
4096:   4096,0,130, 37,  1227567
2048:   2048,0,149, 97,  7768863
1024:   1024,0, 50,110,  6638355
512: 512,0,   3281,   2063,   158715
256: 256,0,391,524,   103478
128: 128,0,   1911,   1749,  
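Reading the flattened dump above is easier with a throwaway parser (my own sketch, not an existing FreeBSD tool). It flags zones whose allocated items (current plus cached) are close to a non-zero LIMIT, which is the condition that parks an allocating thread in the zonelimit ("zoneli") sleep. Fed the relevant rows from above, it shows the plain mbuf zone sitting exactly at its 65550-item cap:

```python
def parse_vm_zone(text):
    """Parse `sysctl vm.zone` rows: 'NAME: SIZE, LIMIT, USED, FREE, REQUESTS'."""
    zones = {}
    for line in text.strip().splitlines():
        name, _, rest = line.partition(":")
        fields = [f.strip() for f in rest.split(",")]
        if len(fields) != 5 or not all(f.isdigit() for f in fields):
            continue  # skip headers and truncated rows
        size, limit, used, free, reqs = map(int, fields)
        zones[name.strip()] = dict(size=size, limit=limit, used=used,
                                   free=free, requests=reqs)
    return zones

def near_limit(zones, threshold=0.8):
    """Names of zones whose used+free item count is within
    `threshold` of a non-zero limit."""
    return sorted(n for n, z in zones.items()
                  if z["limit"] and (z["used"] + z["free"]) / z["limit"] >= threshold)

# Rows hand-copied from the output above.
sample = """\
mbuf_cluster:   2048,   131072,   9788,  55914, 55681090
mbuf:            256,    65550,   9886,  55664, 94834002
pfstatepl:       260,    15000,   4478,   1327,  1670408
rtentry:         132,        0,     14,     44,       25
"""
zones = parse_vm_zone(sample)
print(near_limit(zones))  # → ['mbuf']
```

Note that for the mbuf zone, 9886 used plus 55664 cached is exactly the 65550 limit, even though mbuf_cluster is only at about half of its own limit.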

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-23 Thread delphij

Hi,

On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:
 Hi.
 It seems I have a deadlock on 6.2-PRERELEASE.
 This is a squid server in accelerator mode.
 I can easily trigger it with a high rate of requests.
 Squid is locked in some zoneli state; I am not sure what that is.
 Also I can't kill the process even with SIGKILL.
 In addition, one of the sshd processes is locked too.

Would you please update to the latest RELENG_6 and apply this patch:

http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround

to see if things improve?

Thanks in advance!

Cheers,

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-23 Thread Nikolay Pavlov
On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
 
 Hi,
 
 On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:
  Hi.
   It seems I have a deadlock on 6.2-PRERELEASE.
   This is a squid server in accelerator mode.
   I can easily trigger it with a high rate of requests.
   Squid is locked in some zoneli state; I am not sure what that is.
   Also I can't kill the process even with SIGKILL.
   In addition, one of the sshd processes is locked too.
 
 Would you please update to the latest RELENG_6 and apply this patch:
 
 http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
 
  to see if things improve?
 
 Thanks in advance!
 
 Cheers,

Well, this patch works rather ambiguously for me.
Under heavy load this box becomes unresponsive via the network.
The system is mostly idle. Squid is locked in zoneli.

last pid:   840;  load averages:  0.26,  0.24,  0.17 up 0+00:11:50  10:19:46
34 processes:  1 running, 33 sleeping
CPU states:  0.4% user,  0.0% nice,  0.4% system,  1.5% interrupt, 97.8% idle
Mem: 225M Active, 144M Inact, 261M Wired, 12K Cache, 112M Buf, 3259M Free
Swap: 4070M Total, 4070M Free

  PID USERNAME  THR PRI NICE   SIZERES STATETIME   WCPU COMMAND
  682 squid   1 -160   207M   207M zoneli   2:18  6.59% squid
  709 root1  -80  7768K  7240K piperd   0:00  0.00% perl5.8.8
  691 root1  960  6632K  4796K select   0:00  0.00% snmpd
  829 root1  76  -20  2400K  1648K RUN  0:00  0.00% top
  790 quetzal 1  960  6220K  3252K select   0:00  0.00% sshd
  788 root1   40  6232K  3232K sbwait   0:00  0.00% sshd
  837 root1  200  5048K  3024K pause0:00  0.00% tcsh
  832 root1   40  6232K  3236K sbwait   0:00  0.00% sshd
  820 root1  200  4700K  2856K pause0:00  0.00% tcsh
  645 root1  960  2984K  1808K select   0:00  0.00% ntpd
  791 quetzal 1  200  4708K  2872K pause0:00  0.00% tcsh
  560 root1  960  1352K   996K select   0:00  0.00% syslogd
  362 _pflogd 1 -580  1600K  1144K bpf  0:00  0.00% pflogd
  835 quetzal 1  200  4728K  2960K pause0:00  0.00% tcsh
  688 squid   1  -80  1224K   632K piperd   0:00  0.00% unlinkd
  834 quetzal 1  960  6220K  3252K select   0:00  0.00% sshd
  840 root1  200  1540K   960K pause0:00  0.00% netstat
  719 root1  960  3464K  2796K select   0:00  0.00% sendmail
  729 root1   80  1364K  1060K nanslp   0:00  0.00% cron

[EMAIL PROTECTED]:~# netstat -h 1
            input        (Total)           output
   packets  errs  bytes    packets  errs  bytes  colls
      1.6K     0   1.3M       1.5K     0   1.6M      0
      1.8K     0   1.6M       1.7K     0   1.6M      0
      1.3K     0   1.0M       1.4K     0   1.4M      0
      1.5K     0   1.3M       1.5K     0   1.4M      0
      1.6K     0   1.4M       1.6K     0   1.5M      0
      1.7K     0   1.5M       1.6K     0   1.5M      0
      1.3K     0   830K       1.4K     0   1.5M      0
      1.1K     0   679K       1.3K     0   1.4M      0
       812     0   501K        912     0   971K      0
      1.2K     0   1.1M       1.2K     0   1.1M      0
       617     0   325K        742     0   806K      0
       634     0   312K        769     0   818K      0
      1.8K     0   1.7M       1.5K     0   1.1M      0
       11K     0    13M       7.5K     0   3.8M      0
       10K     0    12M       8.0K     0   5.2M      0
      9.7K     0   9.9M       8.2K     0   6.3M      0
       513  1.7K   666K        328     0   151K      0
            ^^
Here goes load...

      1.0K   543   782K        434     0   247K      0
         0  2.3K      0          0     0      0      0
         2   605   1.5K          2     0    132      0
            input        (Total)           output
   packets  errs  bytes    packets  errs  bytes  colls
         0   334      0          0     0      0      0
         0   286      0          0     0      0      0
         0   288      0          0     0      0      0
       819   204   689K        328     0   122K      0
         0  1.7K      0          0     0      0      0
       866  1.2K   719K        375     0   141K      0
       144  1.5K   175K        111     0    55K      0
         0  1.3K      0          0     0      0      0
       687   182   426K        304     0    73K      0
         0  3.2K      0          0     0      0      0
      1.0K     0   723K        405     0   126K      0
        17  1.8K    25K         11     0   2.2K      0
       598   990   409K        163     0    32K      0
       785  1.9K   635K        313     0    85K      0
  
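The signature in the capture above — intervals where the input errs column climbs into the thousands while zero packets are delivered — is what an allocation failure on the receive path tends to look like from netstat. A throwaway sketch (my own; the sample rows are hand-copied from the output above) that flags those stalled intervals:

```python
def parse_rate_row(line):
    """Parse one `netstat -h 1` row into 7 floats:
    (in_pkts, in_errs, in_bytes, out_pkts, out_errs, out_bytes, colls),
    expanding netstat's K/M suffixes."""
    def num(tok):
        mult = {"K": 1_000, "M": 1_000_000}
        return float(tok[:-1]) * mult[tok[-1]] if tok[-1] in mult else float(tok)
    return tuple(num(t) for t in line.split())

rows = [
    "9.7K 0 9.9M 8.2K 0 6.3M 0",   # healthy interval
    "513 1.7K 666K 328 0 151K 0",  # load arrives, input errors begin
    "0 2.3K 0 0 0 0 0",            # stall: only errors, nothing delivered
]
for r in rows:
    in_pkts, in_errs, *_ = parse_rate_row(r)
    stalled = in_errs > 0 and in_pkts == 0
    print(f"errs={in_errs:>6.0f} pkts={in_pkts:>6.0f} stalled={stalled}")
```

The last row is the wedged state: the NIC keeps counting errors while the stack delivers nothing, consistent with the mbuf zone being pinned at its limit.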

Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-11-23 Thread LI Xin
Nikolay Pavlov wrote:
 On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
 Hi,

 On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:
 Hi.
 It seems I have a deadlock on 6.2-PRERELEASE.
 This is a squid server in accelerator mode.
 I can easily trigger it with a high rate of requests.
 Squid is locked in some zoneli state; I am not sure what that is.
 Also I can't kill the process even with SIGKILL.
 In addition, one of the sshd processes is locked too.
 Would you please update to the latest RELENG_6 and apply this patch:

 http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround

 to see if things improve?

 Thanks in advance!

 Cheers,
 
 Well, this patch works rather ambiguously for me.
 Under heavy load this box becomes unresponsive via the network.
 The system is mostly idle. Squid is locked in zoneli.

Would you please give me the output of sysctl vm.zone on a patched
system?  It does not matter whether it is under load.


 last pid:   840;  load averages:  0.26,  0.24,  0.17 up 0+00:11:50  10:19:46
 34 processes:  1 running, 33 sleeping
 CPU states:  0.4% user,  0.0% nice,  0.4% system,  1.5% interrupt, 97.8% idle
 Mem: 225M Active, 144M Inact, 261M Wired, 12K Cache, 112M Buf, 3259M Free
 Swap: 4070M Total, 4070M Free
 
   PID USERNAME  THR PRI NICE   SIZERES STATETIME   WCPU COMMAND
   682 squid   1 -160   207M   207M zoneli   2:18  6.59% squid
   709 root1  -80  7768K  7240K piperd   0:00  0.00% perl5.8.8
   691 root1  960  6632K  4796K select   0:00  0.00% snmpd
   829 root1  76  -20  2400K  1648K RUN  0:00  0.00% top
   790 quetzal 1  960  6220K  3252K select   0:00  0.00% sshd
   788 root1   40  6232K  3232K sbwait   0:00  0.00% sshd
   837 root1  200  5048K  3024K pause0:00  0.00% tcsh
   832 root1   40  6232K  3236K sbwait   0:00  0.00% sshd
   820 root1  200  4700K  2856K pause0:00  0.00% tcsh
   645 root1  960  2984K  1808K select   0:00  0.00% ntpd
   791 quetzal 1  200  4708K  2872K pause0:00  0.00% tcsh
   560 root1  960  1352K   996K select   0:00  0.00% syslogd
   362 _pflogd 1 -580  1600K  1144K bpf  0:00  0.00% pflogd
   835 quetzal 1  200  4728K  2960K pause0:00  0.00% tcsh
   688 squid   1  -80  1224K   632K piperd   0:00  0.00% unlinkd
   834 quetzal 1  960  6220K  3252K select   0:00  0.00% sshd
   840 root1  200  1540K   960K pause0:00  0.00% netstat
   719 root1  960  3464K  2796K select   0:00  0.00% sendmail
   729 root1   80  1364K  1060K nanslp   0:00  0.00% cron
 
 [EMAIL PROTECTED]:~# netstat -h 1
             input        (Total)           output
    packets  errs  bytes    packets  errs  bytes  colls
       1.6K     0   1.3M       1.5K     0   1.6M      0
       1.8K     0   1.6M       1.7K     0   1.6M      0
       1.3K     0   1.0M       1.4K     0   1.4M      0
       1.5K     0   1.3M       1.5K     0   1.4M      0
       1.6K     0   1.4M       1.6K     0   1.5M      0
       1.7K     0   1.5M       1.6K     0   1.5M      0
       1.3K     0   830K       1.4K     0   1.5M      0
       1.1K     0   679K       1.3K     0   1.4M      0
        812     0   501K        912     0   971K      0
       1.2K     0   1.1M       1.2K     0   1.1M      0
        617     0   325K        742     0   806K      0
        634     0   312K        769     0   818K      0
       1.8K     0   1.7M       1.5K     0   1.1M      0
        11K     0    13M       7.5K     0   3.8M      0
        10K     0    12M       8.0K     0   5.2M      0
       9.7K     0   9.9M       8.2K     0   6.3M      0
        513  1.7K   666K        328     0   151K      0
             ^^
 Here goes load...

       1.0K   543   782K        434     0   247K      0
          0  2.3K      0          0     0      0      0
          2   605   1.5K          2     0    132      0
             input        (Total)           output
    packets  errs  bytes    packets  errs  bytes  colls
          0   334      0          0     0      0      0
          0   286      0          0     0      0      0
          0   288      0          0     0      0      0
        819   204   689K        328     0   122K      0
          0  1.7K      0          0     0      0      0
        866  1.2K   719K        375     0   141K      0
        144  1.5K   175K        111     0    55K      0
          0  1.3K      0          0     0      0      0
        687   182   426K        304     0    73K      0
          0  3.2K      0          0     0      0      0
       1.0K     0   723K        405     0

deadlock in zoneli state on 6.2-PRERELEASE

2006-11-22 Thread Nikolay Pavlov
Hi.
It seems I have a deadlock on 6.2-PRERELEASE.
This is a squid server in accelerator mode.
I can easily trigger it with a high rate of requests.
Squid is locked in some zoneli state; I am not sure what that is.
Also I can't kill the process even with SIGKILL.
In addition, one of the sshd processes is locked too.

Is there any additional information that I could provide?

last pid:  1197;  load averages:  0.00,  0.00,  0.00
up 0+01:54:58  14:46:40
31 processes:  1 running, 29 sleeping, 1 zombie
CPU states:  0.0% user,  0.0% nice,  0.0% system,  0.0% interrupt,  100% idle
Mem: 704M Active, 629M Inact, 447M Wired, 12K Cache, 112M Buf, 2109M Free
Swap: 4070M Total, 4070M Free

  PID USERNAME  THR PRI NICE   SIZERES STATETIME   WCPU COMMAND
  671 squid   1 -160   688M   688M zoneli   6:32  0.00% squid
^^^
  680 root1  960  6628K  4760K select   0:02  0.00% snmpd
 1170 root1  960  2332K  1588K RUN  0:00  0.00% top
  698 root1  -80  7768K  7288K piperd   0:00  0.00% perl5.8.8
  634 root1  960  2984K  1808K select   0:00  0.00% ntpd
  362 _pflogd 1 -580  1600K  1144K bpf  0:00  0.00% pflogd
 1097 quetzal 1  960  6220K  3220K select   0:00  0.00% sshd
  709 root1  960  3464K  2796K select   0:00  0.00% sendmail
 1100 root1  200  5036K  3064K pause0:00  0.00% tcsh
  551 root1  960  1352K   996K select   0:00  0.00% syslogd
 1085 root1   40  6232K  3204K sbwait   0:00  0.00% sshd
 1095 root1   40  6232K  3204K sbwait   0:00  0.00% sshd
 1088 quetzal 1   60  4724K  2952K ttywai   0:00  0.00% tcsh
  719 root1   80  1364K  1060K nanslp   0:00  0.00% cron
 1098 quetzal 1  200  4704K  2932K pause0:00  0.00% tcsh
 1087 quetzal 1 -160  6220K  3220K zoneli   0:00  0.00% sshd
^^
  654 root1  960  1264K   804K select   0:00  0.00% usbd
  692 root1  960  3504K  2656K select   0:00  0.00% sshd
  713 smmsp   1  200  3364K  2728K pause0:00  0.00% sendmail
  358 root1   40  1536K  1092K sbwait   0:00  0.00% pflogd
  769 root1   50  1320K   896K ttyin0:00  0.00% getty
  773 root1   50  1320K   896K ttyin0:00  0.00% getty
  772 root1   50  1320K   896K ttyin0:00  0.00% getty
  771 root1   50  1320K   896K ttyin0:00  0.00% getty
  770 root1   50  1320K   896K ttyin0:00  0.00% getty
  775 root1   50  1320K   896K ttyin0:00  0.00% getty
  774 root1   50  1320K   896K ttyin0:00  0.00% getty
  776 root1   50  1320K   896K ttyin0:00  0.00% getty
  497 root1 1140   528K   388K select   0:00  0.00% devd
  128 root1  200  1228K   680K pause0:00  0.00% adjkerntz
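For what it's worth, my understanding of the 6.x UMA code (a gloss of mine, not something stated in this thread) is that the zoneli state is a thread asleep on the "zonelimit" wait channel because a UMA zone has hit its item limit; top simply truncates the channel name to six characters. A back-of-envelope sum over limits taken from the vm.zone output posted elsewhere in the thread shows how much kernel memory the network zones alone could pin if driven to those limits (my arithmetic):

```python
# Item sizes (bytes) and item limits copied from the `sysctl vm.zone`
# output posted earlier in this thread.
zones = {
    "mbuf":         (256,  65550),
    "mbuf_cluster": (2048, 131072),
    "socket":       (356,  131076),
    "tcpcb":        (464,  131072),
}

total = 0
for name, (size, limit) in sorted(zones.items()):
    worst = size * limit          # bytes if the zone is filled to its limit
    total += worst
    print(f"{name:13s} up to {worst / 2**20:6.1f} MiB")
print(f"{'total':13s} up to {total / 2**20:6.1f} MiB at the limits")
```

That comes to roughly 375 MiB for these four zones alone, which may go some way toward explaining why a 512 MiB kmem_map (the figure reported by the kmem_malloc panic in the follow-up message) gets uncomfortably tight under this load.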


Also there is some interesting fstat info:

[EMAIL PROTECTED]:~# fstat -p 671 -v | head -n 40
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
USER     CMD          PID   FD MOUNT      INUM MODE          SZ|DV R/W
squid    squid        671 root /             2 drwxr-xr-x      512 r
squid    squid        671   wd /usr    1908230 drwxr-x---      512 r
squid    squid        671 text /usr    1887228 -r-xr-xr-x   638296 r
squid    squid        671    0 -             -      error        - -
squid    squid        671    1 -             -      error        - -
squid    squid        671    2 -             -      error        - -
squid    squid        671    3 -             -      error        - -
squid    squid        671    4 /var      47121 -rw-r--r--  2935342 rw
squid    squid        671    5* internet dgram udp c96205a0
squid    squid        671    6 /var      47131 -rw-r--r-- 48909168 w
squid    squid        671    7* pipe c9551198 - c9551250  3 rw
squid    squid        671    8 /cache        7 -rw-r--r-- 91506636 w
squid    squid        671    9* internet stream tcp d2f17ae0
squid    squid        671   10* pipe c9551a48 - c9551990  0 rw
squid    squid        671   11* internet stream tcp c971e3a0
squid    squid        671   12* internet dgram udp c962
squid    squid        671   13 -             -      error        - -
squid    squid        671   14* internet stream tcp
squid    squid        671   15* internet stream tcp d6b211d0
squid    squid        671   16* internet stream tcp cf29c740
squid    squid        671   17* internet stream tcp d0c9cae0
squid    squid        671   18* internet stream tcp c9ebc570
squid    squid        671   19* internet stream tcp d49c9000
squid    squid        671   20* internet stream tcp d262eae0
squid    squid        671   21 /cache  4031491 -rw-r--r--  2037934 r
squid    squid        671   22* internet stream tcp ca1941d0
squid    squid