Re: deadlock in zoneli state on 6.2-PRERELEASE
On Monday, 25 December 2006 at 13:23:17 +0800, LI Xin wrote:
> Hi, Nikolay,
>
> Our local customer has applied the following, and it seems to have
> 'solved' their problem on squid:
>
>     echo kern.ipc.nmbclusters=0 >> /boot/loader.conf
>
> and then reboot. They have been running with the 20061212 patch, but I
> suspect that it is no longer necessary. The feedback I have received is
> that they have run with this for two weeks without problems, serving
> video streams. Please let us know if this works. Wish you a happy new
> year :-) Thanks.
>
> Cheers,
> --
> Xin LI [EMAIL PROTECTED] http://www.delphij.net/
> FreeBSD - The Power to Serve!

Still no luck with my problem :) It dies for me somewhere around:

[EMAIL PROTECTED]:~# netstat -m
208965/585/209550 mbufs in use (current/cache/total)
208432/474/208906/0 mbuf clusters in use (current/cache/total/max)
208432/464 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
469105K/1094K/470199K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/3/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
0 calls to protocol drain routines

with the panic:

    kmem_malloc(4096): kmem_map too small: 536870912 total allocated

So there is still a memory leak somewhere, or I have some rare and
hard-to-detect hardware bug. At least all the other squid servers in this
load-balancer cluster work fine, even under the same load as this box,
and there are no mbuf-related problems on them. In any case, thank you
and all the folks for the help :) Happy holidays!

--
Best regards,
Nikolay Pavlov.
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: deadlock in zoneli state on 6.2-PRERELEASE
Hi, Nikolay,

Our local customer has applied the following, and it seems to have
'solved' their problem on squid:

    echo kern.ipc.nmbclusters=0 >> /boot/loader.conf

and then reboot. They have been running with the 20061212 patch, but I
suspect that it is no longer necessary. The feedback I have received is
that they have run with this for two weeks without problems, serving
video streams. Please let us know if this works. Wish you a happy new
year :-)

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!

signature.asc Description: OpenPGP digital signature
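[Editor's sketch of the workaround above, not part of the original message. It runs against a scratch copy so it is safe to execute anywhere; on the real server the target file is /boot/loader.conf and a reboot is needed before the tunable takes effect.]

```shell
# Append the tunable to a scratch copy of loader.conf; on the live system
# the target would be /boot/loader.conf, followed by a reboot.
CONF=$(mktemp /tmp/loader.conf.XXXXXX)
echo 'kern.ipc.nmbclusters=0' >> "$CONF"
grep 'nmbclusters' "$CONF"
# After the reboot, the effective value can be checked with:
#   sysctl kern.ipc.nmbclusters
```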
Re: deadlock in zoneli state on 6.2-PRERELEASE
No luck at all. patch-zonelim-drain-20061212 behaves for me just like the
previous one: no panics, but still zoneli. All this is very odd, because
the other two squid servers work perfectly behind the same load balancer
without any patches or kernel panics. I think the case with this server
is really rare.
Re: deadlock in zoneli state on 6.2-PRERELEASE
Nikolay Pavlov wrote:
> No luck at all. patch-zonelim-drain-20061212 behaves for me just like
> the previous one: no panics, but still zoneli. All this is very odd,
> because the other two squid servers work perfectly behind the same
> load balancer without any patches or kernel panics. I think the case
> with this server is really rare.

Would you please give a vmstat -z output when the server is stuck in the
zonelim livelock? Thanks!

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
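[Editor's note: a rough capture sketch for the next time the box wedges; the filename and the exact command set are my own choice, not from the thread.]

```shell
# Snapshot the zone-allocator and mbuf statistics requested in the thread
# into a single timestamped file for later analysis.
TS=$(date +%Y%m%d-%H%M%S)
OUT="/var/tmp/zoneli-$TS.txt"
{
    echo "== vmstat -z =="
    vmstat -z
    echo "== netstat -m =="
    netstat -m
} > "$OUT" 2>&1
echo "saved to $OUT"
```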
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Wednesday, 13 December 2006 at 3:02:40 +0800, LI Xin wrote:
> Nikolay Pavlov wrote:
>> No luck at all. patch-zonelim-drain-20061212 behaves for me just like
>> the previous one: no panics, but still zoneli. All this is very odd,
>> because the other two squid servers work perfectly behind the same
>> load balancer without any patches or kernel panics. I think the case
>> with this server is really rare.
>
> Would you please give a vmstat -z output when the server is stuck in
> the zonelim livelock? Thanks!

130947/775/131722 mbufs in use (current/cache/total)
130859/213/131072/131072 mbuf clusters in use (current/cache/total/max)
130859/213 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
294454K/619K/295074K bytes allocated to network (current/cache/total)
0/493001/246499 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
83 calls to protocol drain routines

ITEM, SIZE, LIMIT, USED, FREE, REQUESTS
UMA Kegs: 140, 0, 84, 12, 84
UMA Zones: 120, 0, 84, 6, 84
UMA Slabs: 64, 0, 941, 3, 2709
UMA RCntSlabs: 104, 0, 65536, 28, 65536
UMA Hash: 128, 0, 4, 26, 6
16 Bucket: 76, 0, 27, 23, 35
32 Bucket: 140, 0, 21, 7, 29
64 Bucket: 268, 0, 30, 26, 67
128 Bucket: 524, 0, 215, 457, 827387
VM OBJECT: 132, 0, 41860, 16, 72475
MAP: 192, 0, 7, 33, 7
KMAP ENTRY: 68, 57456, 121, 47, 97518
MAP ENTRY: 68, 0, 702, 362, 94390
PV ENTRY: 24, 2228360, 124664, 2646, 984278
DP fakepg: 72, 0, 0, 0, 0
mt_zone: 64, 0, 182, 54, 182
16: 16, 0, 3754, 306, 432518
32: 32, 0, 1794, 353, 116429
64: 64, 0, 3205, 3580, 206314
128: 128, 0, 1616, 1384, 371069
256: 256, 0, 368, 427, 22536
512: 512, 0, 1266, 30, 486094
1024: 1024, 0, 48, 84, 434408
2048: 2048, 0, 147, 61, 43326
4096: 4096, 0, 129, 22, 4942
Files: 72, 0, 2672, 2734, 481720
MAC labels: 20, 0, 59586, 1085, 622560
PROC: 536, 0, 71, 27, 1564
THREAD: 376, 0, 98, 22, 98
KSEGRP: 88, 0, 98, 62, 98
UPCALL: 44, 0, 0, 0, 0
VMSPACE: 296, 0, 28, 24, 1518
mbuf_packet: 256, 0, 131052, 20, 9510911
mbuf: 256, 0, 91, 559, 12694062
mbuf_cluster: 2048, 131072, 131072, 0, 136594
mbuf_jumbo_pagesize: 4096, 0, 0, 0, 0
mbuf_jumbo_9k: 9216, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 0, 0, 0, 0
ACL UMA zone: 388, 0, 0, 0, 0
g_bio: 132, 0, 0, 1160, 522435
ata_request: 204, 0, 0, 0, 0
ata_composite: 196, 0, 0, 0, 0
VNODE: 348, 0, 47312, 10, 52436
VNODEPOLL: 76, 0, 0, 0, 0
S VFS Cache: 68, 0, 39770, 46, 44847
L VFS Cache: 291, 0, 0, 0, 0
NAMEI: 1024, 0, 0, 12, 205701
DIRHASH: 1024, 0, 1595, 229, 7023
NFSMOUNT: 480, 0, 1, 15, 1
NFSNODE: 536, 0, 16, 5, 16
PIPE: 408, 0, 6, 21, 664
KNOTE: 68, 0, 0, 112, 76
socket: 356, 131076, 3930, 1229, 186491
unpcb: 140, 131096, 12, 44, 170
ipq: 32, 4181, 0, 0, 0
udpcb: 180, 131076, 12, 32, 188
inpcb: 180, 131076, 3859, 1509, 186132
tcpcb: 464, 131072, 3859, 1237, 186132
tcptw: 48, 8268, 0, 546, 140117
syncache: 100, 15366, 2, 271, 165792
hostcache: 76, 15400, 3717, 33, 6425
tcpreass: 20, 8281, 0, 169, 1493
sackhole: 20, 0, 4, 334, 372356
ripcb:
Re: deadlock in zoneli state on 6.2-PRERELEASE
Hi,

Would you please give the following patch a try?

http://people.freebsd.org/~delphij/misc/patch-zonelim-drain

Note: Please revert my previous patch against sys/kern/kern_mbuf.c. This
patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2]; it
schedules a drain of UMA zones when they are low on memory.

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
> Hi,
>
> Would you please give the following patch a try?
>
> http://people.freebsd.org/~delphij/misc/patch-zonelim-drain
>
> Note: Please revert my previous patch against sys/kern/kern_mbuf.c.
> This patch should be applied against sys/vm/ [RELENG_6 or RELENG_6_2];
> it schedules a drain of UMA zones when they are low on memory.

This time things worked out a bit better: there was no kernel panic, and
my server managed to get past the magic number of 65550 mbufs. But very
soon the server reached another limit - 131072 mbuf clusters (this is my
limit for kern.ipc.nmbclusters) - and the server started to drop
packets. After I removed the overload I found the server responding, but
when I actually accessed it I found that although the number of
connections had dropped considerably, the memory allocated to the
network had not been freed. So I believe that there is still an mbuf
leak somewhere.

[EMAIL PROTECTED]:~# sockstat -4 | wc -l
      17
[EMAIL PROTECTED]:~# netstat -m
1082/131578/132660 mbufs in use (current/cache/total)
1080/129992/131072/131072 mbuf clusters in use (current/cache/total/max)
                   ^^
1080/128712 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
2430K/292878K/295309K bytes allocated to network (current/cache/total)
^
0/1058420/529208 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
156 calls to protocol drain routines
[EMAIL PROTECTED]:~# vmstat -z
ITEM, SIZE, LIMIT, USED, FREE, REQUESTS
UMA Kegs: 140, 0, 84, 12, 84
UMA Zones: 120, 0, 84, 6, 84
UMA Slabs: 64, 0, 904, 40, 31603
UMA RCntSlabs: 104, 0, 65536, 28, 65536
UMA Hash: 128, 0, 4, 26, 6
16 Bucket: 76, 0, 23, 27, 32
32 Bucket: 140, 0, 20, 8, 29
64 Bucket: 268, 0, 23, 33, 46
128 Bucket: 524, 0, 1423, 5, 1590808
VM OBJECT: 132, 0, 18094, 147, 43780
MAP: 192, 0, 7, 33, 7
KMAP ENTRY: 68, 57456, 17, 151, 151954
MAP ENTRY: 68, 0, 718, 290, 182847
PV ENTRY: 24, 2228360, 124232, 4528, 861505
DP fakepg: 72, 0, 0, 0, 0
mt_zone: 64, 0, 182, 54, 182
16: 16, 0, 2828, 1435, 273492
32: 32, 0, 1769, 491, 116114
64: 64, 0, 3199, 2701, 181154
128: 128, 0, 1813, 1517, 273347
256: 256, 0, 365, 415, 13230
512: 512, 0, 1006, 10, 35578
1024: 1024, 0, 49, 83, 72385
2048: 2048, 0, 170, 38, 40942
4096: 4096, 0, 129, 22, 89367
Files: 72, 0, 100, 5624, 227230
MAC labels: 20, 0, 20726, 15778, 368518
PROC: 536, 0, 71, 27, 1441
THREAD: 376, 0, 98, 22, 98
KSEGRP: 88, 0, 98, 62, 98
UPCALL: 44, 0, 0, 0, 0
VMSPACE: 296, 0, 28, 24, 1398
mbuf_packet: 256, 0, 1080, 128712, 6714669
mbuf: 256, 0, 2, 2866, 7912768
mbuf_cluster: 2048, 131072, 129792, 1280, 166272
mbuf_jumbo_pagesize: 4096, 0, 0, 0, 0
mbuf_jumbo_9k: 9216, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 0, 0, 0, 0
ACL UMA zone: 388, 0, 0, 0, 0
g_bio: 132, 0, 0, 783, 194850
ata_request: 204, 0, 0, 0, 0
ata_composite: 196, 0, 0, 0, 0
VNODE: 348, 0, 20116, 3, 22965
VNODEPOLL: 76, 0, 0, 0, 0
S VFS Cache: 68, 0, 16387, 77, 18383
L VFS Cache: 291, 0, 0, 0, 0
NAMEI: 1024, 0, 0, 12, 166126
DIRHASH: 1024, 0, 1688, 128, 3307
NFSMOUNT: 480, 0, 0, 0, 0
NFSNODE: 536, 0, 0, 0, 0
PIPE: 408, 0, 6, 21, 595
KNOTE: 68, 0, 0, 112, 64
socket:
Re: deadlock in zoneli state on 6.2-PRERELEASE
Hi, Nikolay,

Nikolay Pavlov wrote:
> On Monday, 11 December 2006 at 15:59:03 +0800, LI Xin wrote:
>> Hi,
>>
>> Would you please give the following patch a try?
>>
>> http://people.freebsd.org/~delphij/misc/patch-zonelim-drain
>>
>> Note: Please revert my previous patch against sys/kern/kern_mbuf.c.
>> This patch should be applied against sys/vm/ [RELENG_6 or
>> RELENG_6_2]; it schedules a drain of UMA zones when they are low on
>> memory.
>
> This time things worked out a bit better: there was no kernel panic,
> and my server managed to get past the magic number of 65550 mbufs. But
> very soon the server reached another limit - 131072 mbuf clusters

Do you still get squid stuck in the zoneli state, with the server
becoming unresponsive?

> (this is my limit for kern.ipc.nmbclusters) - and the server started
> to drop packets. After I removed the overload I found the server
> responding, but when I actually accessed it I found that although the
> number of connections had dropped considerably, the memory allocated
> to the network had not been freed. So I believe that there is still an
> mbuf leak somewhere.

This looks weird to me... Would you please try adding some load to the
server and then removing it, to see whether the 'current' mbuf cluster
count keeps increasing?

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
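[Editor's sketch of one way to watch that counter over a load cycle; the awk field positions assume the netstat -m output format shown earlier in the thread.]

```shell
# Sample the "current" mbuf-cluster count a few times; if it keeps climbing
# after load is removed, that points at a leak. Extend the iteration count
# and interval for a real test run.
for i in 1 2 3; do
    date '+%H:%M:%S'
    netstat -m 2>/dev/null |
        awk '/mbuf clusters in use/ { split($1, a, "/"); print "current clusters: " a[1] }'
    sleep 2
done
```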
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Tuesday, 12 December 2006 at 2:59:37 +0800, LI Xin wrote:
> Hi, Nikolay,
>
> Nikolay Pavlov wrote:
>> This time things worked out a bit better: there was no kernel panic,
>> and my server managed to get past the magic number of 65550 mbufs.
>> But very soon the server reached another limit - 131072 mbuf clusters
>
> Do you still get squid stuck in the zoneli state, with the server
> becoming unresponsive?

Yes. No panic, but squid still idles in zoneli and the server becomes
unresponsive via the network.

last pid: 1990; load averages: 0.09, 0.30, 0.16  up 0+03:27:28  16:27:57
30 processes: 1 running, 29 sleeping
Mem: 497M Active, 600M Inact, 441M Wired, 12K Cache, 112M Buf, 2352M Free
Swap: 4070M Total, 4070M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
684 squid 1 -160 480M 480M zoneli 11:41 0.00% squid
694 root 1 960 6636K 4800K select 0:06 0.00% snmpd
691 squid 1 -80 1224K 632K piperd 0:01 0.00% unlinkd
364 _pflogd 1 -580 1600K 1216K bpf 0:01 0.00% pflogd
1840 root 1 -80 7768K 7304K piperd 0:00 0.00% perl5.8.8
722 root 1 960 3464K 2796K select 0:00 0.00% sendmail
563 root 1 960 1352K 1048K select 0:00 0.00% syslogd
1878 root 1 5 -10 5072K 3056K ttyin 0:00 0.00% tcsh
1841 root 1 200 4976K 3036K pause 0:00 0.00% csh
1874 quetzal 1 960 6220K 3252K select 0:00 0.00% sshd
1877 root 1 200 5056K 3028K pause 0:00 0.00% tcsh
1872 root 1 40 6232K 3236K sbwait 0:00 0.00% sshd
732 root 1 80 1364K 1060K nanslp 0:00 0.00% cron
783 root 1 80 1692K 1396K wait 0:00 0.00% login
1875 quetzal 1 200 4736K 2964K pause 0:00 0.00% tcsh
668 root 1 960 1264K 804K select 0:00 0.00% usbd
706 root 1 960 3504K 2676K select 0:00 0.00% sshd
726 smmsp 1 200 3364K 2724K pause 0:00 0.00% sendmail

last pid: 1992; load averages: 0.06, 0.27, 0.16  up 0+03:27:50  16:28:19
75 processes: 2 running, 52 sleeping, 21 waiting
Mem: 497M Active, 600M Inact, 441M Wired, 12K Cache, 112M Buf, 2351M Free
Swap: 4070M Total, 4070M Free

PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND
10 root 1 171 52 0K 8K RUN 188:49 93.85% idle
22 root 1 -68 -187 0K 8K WAIT 1:54 3.56% irq29: em1
684 squid 1 -160 480M 480M zoneli 11:41 0.05% squid
13 root 1 -44 -163 0K 8K WAIT 3:08 0.00% swi1: net
11 root 1 -32 -151 0K 8K WAIT 0:27 0.00% swi4: clock sio
42 root 1 200 0K 8K syncer 0:09 0.00% syncer
694 root 1 960 6636K 4800K select 0:06 0.00% snmpd
14 root 1 -160 0K 8K - 0:04 0.00% yarrow
39 root 1 171 52 0K 8K pgzero 0:03 0.00% pagezero
3 root 1 -80 0K 8K - 0:02 0.00% g_up
4 root 1 -80 0K 8K - 0:02 0.00% g_down
37 root 1 -160 0K 8K psleep 0:02 0.00% pagedaemon
691 squid 1 -80 1224K 632K piperd 0:01 0.00% unlinkd
43 root 1 -160 0K 8K sdflus 0:01 0.00% softdepflush
20 root 1 -64 -183 0K 8K WAIT 0:01 0.00% irq48: amr0
2 root 1 -80 0K 8K - 0:01 0.00% g_event
44 root 1 -600 0K 8K - 0:01 0.00% schedcpu
364 _pflogd 1 -580 1600K 1216K bpf 0:01 0.00% pflogd

PID TT STAT TIME COMMAND
0 ?? WLs 0:00.00 [swapper]
1 ?? ILs 0:00.01 /sbin/init --
2 ?? DL 0:00.91 [g_event]
3 ?? DL 0:02.25 [g_up]
4 ?? DL 0:02.15 [g_down]
5 ?? DL 0:00.00 [kqueue taskq]
6 ?? DL 0:00.00 [acpi_task_0]
7 ?? DL 0:00.00 [acpi_task_1]
8 ?? DL 0:00.00 [acpi_task_2]
9 ?? DL 0:00.00 [thread taskq]
10 ?? RL 188:42.12 [idle]
11 ?? WL 0:26.95 [swi4: clock sio]
12 ?? WL 0:00.00 [swi3: vm]
13 ?? WL 3:08.22 [swi1: net]
14 ?? DL 0:04.36 [yarrow]
15 ?? WL 0:00.00 [swi5: +]
16 ?? WL 0:00.00 [swi2: cambio]
17 ?? WL 0:00.00 [swi6: task queue]
18 ?? WL 0:00.02 [swi6: Giant taskq]
19 ?? WL 0:00.00 [irq9: acpi0]
20 ?? WL 0:01.20 [irq48: amr0]
21 ?? WL 0:00.00 [irq28: em0]
22 ?? WL 1:53.84 [irq29:
Re: deadlock in zoneli state on 6.2-PRERELEASE
Hi, Nikolay,

Nikolay Pavlov wrote:
> On Tuesday, 12 December 2006 at 2:59:37 +0800, LI Xin wrote:
>> Do you still get squid stuck in the zoneli state, with the server
>> becoming unresponsive?
>
> Yes. No panic, but squid still idles in zoneli and the server becomes
> unresponsive via the network.

I am not quite sure, but please let me know whether this patch helps the
situation:

http://people.freebsd.org/~delphij/misc/patch-zonelim-drain-20061212

You can also manually change wakeup_one(keg); to wakeup(keg); at
sys/vm/uma_core.c:2507. Note that this would potentially be a
pessimization for the zonelim case, where all the zonelim'ed threads
would be woken rather than only one. However, confirming whether this
helps your situation would help narrow down the issue so we can think
about a better solution for it. Thanks!

Cheers,
--
Xin LI [EMAIL PROTECTED] http://www.delphij.net/
FreeBSD - The Power to Serve!
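[Editor's sketch of applying the suggested one-line experiment mechanically; the file path is from the message above, the sed invocation and output filename are my own. Inspect the diff before building, since in principle wakeup_one(keg) could appear at more than one site.]

```shell
# Widen the zone-limit wakeup from wakeup_one() to wakeup() in the UMA
# allocator source, writing the result to a new file for inspection.
SRC=/usr/src/sys/vm/uma_core.c
if [ -r "$SRC" ]; then
    sed -e 's/wakeup_one(keg);/wakeup(keg);/' "$SRC" > uma_core.c.patched
    diff -u "$SRC" uma_core.c.patched || true   # diff exits 1 when files differ
fi
```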
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
> Nikolay Pavlov wrote:
>> On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:
>>> Hi,
>>>
>>> On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:
>>>> Hi. It seems I have a deadlock on 6.2-PRERELEASE. This is a squid
>>>> server in accelerator mode. I can easily trigger it with a high
>>>> rate of requests. Squid is locked in some zoneli state; I am not
>>>> sure what it is. Also, I can't kill the process even with SIGKILL.
>>>> In addition, one of the sshd processes is locked too.
>
> Would you please update to the latest RELENG_6 and apply this patch:
>
> http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
>
> to see if things get improved? Thanks in advance!
>
> Cheers,

Well. This patch behaves quite ambiguously for me. Under heavy load the
box becomes unresponsive via the network. The system is mostly idle;
squid is locked in zoneli.

Another panic. Guys, do I need some additional debug options, or is this
info enough? I am asking because this panic is easily reproducible for
me.

[EMAIL PROTECTED]:/usr/obj/usr/src/sys/ACCEL# kgdb kernel.debug /var/crash/vmcore.4
kgdb: kvm_nlist(_stopped_cpus):
kgdb: kvm_nlist(_stoppcbs):
[GDB will not be able to debug user-mode threads: /usr/lib/libthread_db.so: Undefined symbol ps_pglobal_lookup]
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you
are welcome to change it and/or distribute copies of it under certain
conditions. Type "show copying" to see the conditions. There is
absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd".

Unread portion of the kernel message buffer:

lock order reversal: (sleepable after non-sleepable)
 1st 0xca21567c so_snd (so_snd) @ /usr/src/sys/netinet/tcp_output.c:253
 2nd 0xc070bd84 user map (user map) @ /usr/src/sys/vm/vm_map.c:3074
KDB: stack backtrace:
kdb_backtrace(,c071ccb0,c071c210,c06e5c4c,c0758f18,...) at kdb_backtrace+0x29
witness_checkorder(c070bd84,9,c06be56c,c02,c070d2c4,0,c06aab25,9f) at witness_checkorder+0x4cd
_sx_xlock(c070bd84,c06be56c,c02) at _sx_xlock+0x2c
_vm_map_lock_read(c070bd40,c06be56c,c02,184637d,c92796b0,...) at _vm_map_lock_read+0x37
vm_map_lookup(f48a29d0,0,1,f48a29d4,f48a29c4,f48a29c8,f48a29ab,f48a29ac) at vm_map_lookup+0x28
vm_fault(c070bd40,0,1,0,c927aa80,...) at vm_fault+0x65
trap_pfault(f48a2a98,0,c) at trap_pfault+0xee
trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
tcp_output(d21c5570) at tcp_output+0x9af
tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
ip_input(d0020d00) at ip_input+0x561
netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---

Fatal trap 12: page fault while in kernel mode
fault virtual address = 0xc
fault code            = supervisor read, page not present
instruction pointer   = 0x20:0xc053ea34
stack pointer         = 0x28:0xf48a2ad8
frame pointer         = 0x28:0xf48a2ae4
code segment          = base 0x0, limit 0xf, type 0x1b
                      = DPL 0, pres 1, def32 1, gran 1
processor eflags      = interrupt enabled, resume, IOPL = 0
current process       = 13 (swi1: net)
trap number           = 12
panic: page fault
KDB: stack backtrace:
kdb_backtrace(100,c927aa80,28,f48a2a98,c,...) at kdb_backtrace+0x29
panic(c069b8a1,c06c5f2c,0,f,c927d69b,...) at panic+0xa8
trap_fatal(f48a2a98,c,c927aa80,c070bd40,0,...) at trap_fatal+0x2a6
trap_pfault(f48a2a98,0,c) at trap_pfault+0x187
trap(8,c06b0028,f48a0028,1,0,...) at trap+0x325
calltrap() at calltrap+0x5
--- trap 0xc, eip = 0xc053ea34, esp = 0xf48a2ad8, ebp = 0xf48a2ae4 ---
m_copydata(0,,1,d0020d74,c1040468,...) at m_copydata+0x28
tcp_output(d21c5570) at tcp_output+0x9af
tcp_input(d0020d00,14,e9,93935ce,0,...) at tcp_input+0x24a2
ip_input(d0020d00) at ip_input+0x561
netisr_processqueue(c075a6d8) at netisr_processqueue+0x6e
swi_net(0) at swi_net+0xc2
ithread_execute_handlers(c9279648,c92c3400) at ithread_execute_handlers+0xce
ithread_loop(c92436a0,f48a2d38,c070db20,0,c06a818a,...) at ithread_loop+0x4e
fork_exit(c04f76d4,c92436a0,f48a2d38) at fork_exit+0x61
fork_trampoline() at fork_trampoline+0x8
--- trap 0x1, eip = 0, esp = 0xf48a2d6c, ebp = 0 ---
Uptime: 25m13s
Dumping 3967 MB (3 chunks)
 chunk 0: 1MB (159 pages) ... ok
 chunk 1: 3966MB (1015280 pages) 3950 3934 3918 3902 3886 3870 3854 3838 3822 3806 3790 3774 3758 3742 3726 3710 3694 3678
Re: deadlock in zoneli state on 6.2-PRERELEASE
Nikolay Pavlov wrote:
> On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:
>> Would you please update to the latest RELENG_6 and apply this patch:
>> http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround
>> to see if things get improved? Thanks in advance!
>
> Well. This patch behaves quite ambiguously for me. Under heavy load
> the box becomes unresponsive via the network. The system is mostly
> idle; squid is locked in zoneli.
>
> Another panic. Guys, do I need some additional debug options, or is
> this info enough? I am asking because this panic is easily
> reproducible for me.

I think this is enough. By the way, which scheduler do you use?

> [remainder of the quoted kgdb output, lock order reversal, and panic
> backtrace trimmed; it is identical to the previous message]
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Sunday, 26 November 2006 at 2:13:33 +0800, LI Xin wrote:
> Nikolay Pavlov wrote:
>> Another panic. Guys, do I need some additional debug options, or is
>> this info enough? I am asking because this panic is easily
>> reproducible for me.
>
> I think this is enough. By the way, which scheduler do you use?

4BSD

> [remainder of the quoted kgdb output and panic backtrace trimmed; it
> is identical to the earlier message in this thread]
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Friday, 24 November 2006 at 11:11:48 +0800, LI Xin wrote:

Nikolay Pavlov wrote:

On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:

Hi,

On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:

Hi. It seems I have a deadlock on 6.2-PRERELEASE. This is a squid server in accelerator mode. I can easily trigger it with a high rate of requests. Squid is locked in some "zoneli" state; I am not sure what it is. Also, I can't kill the process, even with SIGKILL. In addition, one of the sshd processes is locked too.

Would you please update to the latest RELENG_6 and apply this patch: http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround to see if things get improved? Thanks in advance! Cheers,

Well, this patch works rather ambiguously for me. Under heavy load this box becomes unresponsive via the network. The system is mostly idle. Squid is locked in zoneli.

Would you please give me the output of sysctl vm.zone on a patched system? It does not matter whether the system is loaded at the time.
Here it is:

[EMAIL PROTECTED]:~# sysctl vm.zone
vm.zone:
ITEMSIZE, LIMIT, USED, FREE, REQUESTS
FFS2 dinode: 256, 0, 84920, 15040, 1499541
FFS1 dinode: 128, 0, 0, 0, 0
FFS inode: 132, 0, 84920, 15043, 1499541
Mountpoints: 664, 0, 7, 5, 7
SWAPMETA: 276, 121576, 0, 0, 0
pfosfp: 28, 0, 188, 193, 188
pfospfen: 108, 0, 345, 51, 345
pfiaddrpl: 92, 0, 0, 0, 0
pfstatescrub: 28, 0, 0, 0, 0
pffrcent: 12, 50141, 0, 0, 0
pffrcache: 48, 10062, 0, 0, 0
pffrag: 48, 0, 0, 156, 47
pffrent: 16, 203, 0, 203, 99
pfrkentry2: 156, 0, 0, 0, 0
pfrkentry: 156, 0, 5, 45, 8
pfrktable: 1240, 0, 4, 5, 11
pfpooladdrpl: 68, 0, 2, 110, 2
pfaltqpl: 128, 0, 0, 0, 0
pfstatepl: 260, 15000, 4478, 1327, 1670408
pfrulepl: 604, 0, 9, 9, 9
pfsrctrpl: 100, 15015, 0, 0, 0
rtentry: 132, 0, 14, 44, 25
ripcb: 180, 131076, 0, 44, 2
sackhole: 20, 0, 95, 243, 3451100
tcpreass: 20, 8281, 0, 169, 2584
hostcache: 76, 15400, 6178, 222, 61324
syncache: 100, 15366, 43, 152, 1682074
tcptw: 48, 8268, 270, 198, 1499466
tcpcb: 464, 131072, 816, 2632, 1723491
inpcb: 180, 131076, 1086, 2434, 1723491
udpcb: 180, 131076, 12, 32, 251
ipq: 32, 4181, 0, 0, 0
unpcb: 140, 131096, 14, 42, 890
socket: 356, 131076, 842, 2524, 1724635
KNOTE: 68, 0, 0, 112, 128
PIPE: 408, 0, 7, 20, 2476
NFSNODE: 460, 0, 3, 21, 14
NFSMOUNT: 480, 0, 1, 15, 1
DIRHASH: 1024, 0, 894, 930, 29255
NAMEI: 1024, 0, 0, 12, 2924657
L VFS Cache: 291, 0, 96, 125, 202
S VFS Cache: 68, 0, 80925, 13435, 1464860
VNODEPOLL: 76, 0, 0, 0, 0
VNODE: 272, 0, 84967, 15049, 1499617
ata_composit: 196, 0, 0, 0, 0
ata_request: 204, 0, 0, 0, 0
g_bio: 132, 0, 0, 3596, 1759992
ACL UMA zone: 388, 0, 0, 0, 0
mbuf_jumbo_1: 16384, 0, 0, 0, 0
mbuf_jumbo_9: 9216, 0, 0, 0, 0
mbuf_jumbo_p: 4096, 0, 0, 0, 0
mbuf_cluster: 2048, 131072, 9788, 55914, 55681090
mbuf: 256, 65550, 9886, 55664, 94834002
mbuf_packet: 256, 65550, 9946, 55604, 56552142
VMSPACE: 296, 0, 30, 35, 5026
UPCALL: 44, 0, 0, 0, 0
KSEGRP: 88, 0, 112, 48, 112
THREAD: 376, 0, 112, 8, 112
PROC: 536, 0, 72, 40, 5070
MAC labels: 20, 0, 88313, 13594, 6711573
Files: 72, 0, 342, 3103, 4744891
4096: 4096, 0, 130, 37, 1227567
2048: 2048, 0, 149, 97, 7768863
1024: 1024, 0, 50, 110, 6638355
512: 512, 0, 3281, 2063, 158715
256: 256, 0, 391, 524, 103478
128: 128, 0, 1911, 1749,
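As an aside, output like the above can be scanned mechanically for zones that are running close to their limits. The snippet below is a sketch, not part of the original thread; it assumes the `NAME: ITEMSIZE, LIMIT, USED, FREE, REQUESTS` layout shown above (a LIMIT of 0 means the zone is unbounded) and reads the sysctl output on stdin:

```shell
# Flag any UMA zone whose USED count exceeds 90% of a nonzero LIMIT.
# Hypothetical usage: sysctl vm.zone | sh check_zones.sh
awk -F'[:,]' '
    NF >= 6 {
        name = $1
        gsub(/^[ \t]+|[ \t]+$/, "", name)   # trim whitespace around the zone name
        limit = $3 + 0                      # LIMIT column; 0 means unlimited
        used  = $4 + 0                      # USED column
        if (limit > 0 && used > 0.9 * limit)
            printf "%s: %d of %d in use\n", name, used, limit
    }'
```

In the snapshot above, taken on a mostly idle box, no zone is near its cap; the interesting numbers are the limits themselves (mbuf at 65550, mbuf_cluster at 131072), since those are the ceilings a zonelimit sleep would be waiting on.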
Re: deadlock in zoneli state on 6.2-PRERELEASE
Hi,

On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:

Hi. It seems I have a deadlock on 6.2-PRERELEASE. This is a squid server in accelerator mode. I can easily trigger it with a high rate of requests. Squid is locked in some "zoneli" state; I am not sure what it is. Also, I can't kill the process, even with SIGKILL. In addition, one of the sshd processes is locked too.

Would you please update to the latest RELENG_6 and apply this patch: http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround to see if things get improved? Thanks in advance!

Cheers,
Re: deadlock in zoneli state on 6.2-PRERELEASE
On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:

Hi,

On Wed, 22 Nov 2006 21:55:49 +0200, Nikolay Pavlov [EMAIL PROTECTED] wrote:

Hi. It seems I have a deadlock on 6.2-PRERELEASE. This is a squid server in accelerator mode. I can easily trigger it with a high rate of requests. Squid is locked in some "zoneli" state; I am not sure what it is. Also, I can't kill the process, even with SIGKILL. In addition, one of the sshd processes is locked too.

Would you please update to the latest RELENG_6 and apply this patch: http://people.freebsd.org/~delphij/misc/patch-zonelimit-workaround to see if things get improved? Thanks in advance! Cheers,

Well, this patch works rather ambiguously for me. Under heavy load this box becomes unresponsive via the network. The system is mostly idle. Squid is locked in zoneli.

last pid: 840;  load averages: 0.26, 0.24, 0.17    up 0+00:11:50  10:19:46
34 processes: 1 running, 33 sleeping
CPU states: 0.4% user, 0.0% nice, 0.4% system, 1.5% interrupt, 97.8% idle
Mem: 225M Active, 144M Inact, 261M Wired, 12K Cache, 112M Buf, 3259M Free
Swap: 4070M Total, 4070M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  682 squid       1 -16    0   207M   207M zoneli   2:18  6.59% squid
  709 root        1  -8    0  7768K  7240K piperd   0:00  0.00% perl5.8.8
  691 root        1  96    0  6632K  4796K select   0:00  0.00% snmpd
  829 root        1  76  -20  2400K  1648K RUN      0:00  0.00% top
  790 quetzal     1  96    0  6220K  3252K select   0:00  0.00% sshd
  788 root        1   4    0  6232K  3232K sbwait   0:00  0.00% sshd
  837 root        1  20    0  5048K  3024K pause    0:00  0.00% tcsh
  832 root        1   4    0  6232K  3236K sbwait   0:00  0.00% sshd
  820 root        1  20    0  4700K  2856K pause    0:00  0.00% tcsh
  645 root        1  96    0  2984K  1808K select   0:00  0.00% ntpd
  791 quetzal     1  20    0  4708K  2872K pause    0:00  0.00% tcsh
  560 root        1  96    0  1352K   996K select   0:00  0.00% syslogd
  362 _pflogd     1 -58    0  1600K  1144K bpf      0:00  0.00% pflogd
  835 quetzal     1  20    0  4728K  2960K pause    0:00  0.00% tcsh
  688 squid       1  -8    0  1224K   632K piperd   0:00  0.00% unlinkd
  834 quetzal     1  96    0  6220K  3252K select   0:00  0.00% sshd
  840 root        1  20    0  1540K   960K pause    0:00  0.00% netstat
  719 root        1  96    0  3464K  2796K select   0:00  0.00% sendmail
  729 root        1   8    0  1364K  1060K nanslp   0:00  0.00% cron

[EMAIL PROTECTED]:~# netstat -h 1
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
      1.6K     0       1.3M       1.5K     0       1.6M     0
      1.8K     0       1.6M       1.7K     0       1.6M     0
      1.3K     0       1.0M       1.4K     0       1.4M     0
      1.5K     0       1.3M       1.5K     0       1.4M     0
      1.6K     0       1.4M       1.6K     0       1.5M     0
      1.7K     0       1.5M       1.6K     0       1.5M     0
      1.3K     0       830K       1.4K     0       1.5M     0
      1.1K     0       679K       1.3K     0       1.4M     0
       812     0       501K        912     0       971K     0
      1.2K     0       1.1M       1.2K     0       1.1M     0
       617     0       325K        742     0       806K     0
       634     0       312K        769     0       818K     0
      1.8K     0       1.7M       1.5K     0       1.1M     0
       11K     0        13M       7.5K     0       3.8M     0
       10K     0        12M       8.0K     0       5.2M     0
      9.7K     0       9.9M       8.2K     0       6.3M     0
       513  1.7K       666K        328     0       151K     0
       ^^ Here goes load...
      1.0K   543       782K        434     0       247K     0
         0  2.3K          0          0     0          0     0
         2   605       1.5K          2     0        132     0
            input        (Total)           output
   packets  errs      bytes    packets  errs      bytes colls
         0   334          0          0     0          0     0
         0   286          0          0     0          0     0
         0   288          0          0     0          0     0
       819   204       689K        328     0       122K     0
         0  1.7K          0          0     0          0     0
       866  1.2K       719K        375     0       141K     0
       144  1.5K       175K        111     0        55K     0
         0  1.3K          0          0     0          0     0
       687   182       426K        304     0        73K     0
         0  3.2K          0          0     0          0     0
      1.0K     0       723K        405     0       126K     0
        17  1.8K        25K         11     0       2.2K     0
       598   990       409K        163     0        32K     0
       785  1.9K       635K        313     0        85K     0
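When a box wedges like this, it can help to pull out just the processes stuck on a given wait channel from a batch-mode process listing. The filter below is a sketch, not from the thread; it assumes top's default column order shown above (PID USERNAME THR PRI NICE SIZE RES STATE TIME WCPU COMMAND), so STATE is the 8th field and COMMAND the 11th:

```shell
# List processes sleeping on a given wait channel, from `top -b`-style
# output on stdin. Column positions (8 = STATE, 11 = COMMAND) are an
# assumption about the default layout, not guaranteed for every top.
# Hypothetical usage: top -b | sh find_stuck.sh
awk -v chan="zoneli" '
    $8 == chan { printf "pid %s (%s) is sleeping in %s\n", $1, $11, chan }'
```

Against the listing above this would report the squid process as the one stuck in zoneli.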
Re: deadlock in zoneli state on 6.2-PRERELEASE
Nikolay Pavlov wrote:

On Thursday, 23 November 2006 at 20:24:15 +0800, [EMAIL PROTECTED] wrote:

[...]

Well, this patch works rather ambiguously for me. Under heavy load this box becomes unresponsive via the network. The system is mostly idle. Squid is locked in zoneli.

Would you please give me the output of sysctl vm.zone on a patched system? It does not matter whether the system is loaded at the time.

[...]
deadlock in zoneli state on 6.2-PRERELEASE
Hi. It seems I have a deadlock on 6.2-PRERELEASE. This is a squid server in accelerator mode. I can easily trigger it with a high rate of requests. Squid is locked in some "zoneli" state; I am not sure what it is. Also, I can't kill the process, even with SIGKILL. In addition, one of the sshd processes is locked too. Is there any additional information that I could provide?

last pid: 1197;  load averages: 0.00, 0.00, 0.00    up 0+01:54:58  14:46:40
31 processes: 1 running, 29 sleeping, 1 zombie
CPU states: 0.0% user, 0.0% nice, 0.0% system, 0.0% interrupt, 100% idle
Mem: 704M Active, 629M Inact, 447M Wired, 12K Cache, 112M Buf, 2109M Free
Swap: 4070M Total, 4070M Free

  PID USERNAME  THR PRI NICE   SIZE    RES STATE    TIME   WCPU COMMAND
  671 squid       1 -16    0   688M   688M zoneli   6:32  0.00% squid   ^^^
  680 root        1  96    0  6628K  4760K select   0:02  0.00% snmpd
 1170 root        1  96    0  2332K  1588K RUN      0:00  0.00% top
  698 root        1  -8    0  7768K  7288K piperd   0:00  0.00% perl5.8.8
  634 root        1  96    0  2984K  1808K select   0:00  0.00% ntpd
  362 _pflogd     1 -58    0  1600K  1144K bpf      0:00  0.00% pflogd
 1097 quetzal     1  96    0  6220K  3220K select   0:00  0.00% sshd
  709 root        1  96    0  3464K  2796K select   0:00  0.00% sendmail
 1100 root        1  20    0  5036K  3064K pause    0:00  0.00% tcsh
  551 root        1  96    0  1352K   996K select   0:00  0.00% syslogd
 1085 root        1   4    0  6232K  3204K sbwait   0:00  0.00% sshd
 1095 root        1   4    0  6232K  3204K sbwait   0:00  0.00% sshd
 1088 quetzal     1   6    0  4724K  2952K ttywai   0:00  0.00% tcsh
  719 root        1   8    0  1364K  1060K nanslp   0:00  0.00% cron
 1098 quetzal     1  20    0  4704K  2932K pause    0:00  0.00% tcsh
 1087 quetzal     1 -16    0  6220K  3220K zoneli   0:00  0.00% sshd   ^^
  654 root        1  96    0  1264K   804K select   0:00  0.00% usbd
  692 root        1  96    0  3504K  2656K select   0:00  0.00% sshd
  713 smmsp       1  20    0  3364K  2728K pause    0:00  0.00% sendmail
  358 root        1   4    0  1536K  1092K sbwait   0:00  0.00% pflogd
  769 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  773 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  772 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  771 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  770 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  775 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  774 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  776 root        1   5    0  1320K   896K ttyin    0:00  0.00% getty
  497 root        1 114    0   528K   388K select   0:00  0.00% devd
  128 root        1  20    0  1228K   680K pause    0:00  0.00% adjkerntz

Also there is some interesting fstat info:

[EMAIL PROTECTED]:~# fstat -p 671 -v | head -n 40
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
can't read vnode at 0x0 for pid 671
USER  CMD    PID   FD MOUNT     INUM MODE          SZ|DV R/W
squid squid  671 root /            2 drwxr-xr-x      512 r
squid squid  671   wd /usr   1908230 drwxr-x---      512 r
squid squid  671 text /usr   1887228 -r-xr-xr-x   638296 r
squid squid  671    0 -            - error            -
squid squid  671    1 -            - error            -
squid squid  671    2 -            - error            -
squid squid  671    3 -            - error            -
squid squid  671    4 /var     47121 -rw-r--r--  2935342 rw
squid squid  671    5* internet dgram udp c96205a0
squid squid  671    6 /var     47131 -rw-r--r-- 48909168 w
squid squid  671    7* pipe c9551198 - c9551250 3 rw
squid squid  671    8 /cache       7 -rw-r--r-- 91506636 w
squid squid  671    9* internet stream tcp d2f17ae0
squid squid  671   10* pipe c9551a48 - c9551990 0 rw
squid squid  671   11* internet stream tcp c971e3a0
squid squid  671   12* internet dgram udp c962
squid squid  671   13 -            - error            -
squid squid  671   14* internet stream tcp
squid squid  671   15* internet stream tcp d6b211d0
squid squid  671   16* internet stream tcp cf29c740
squid squid  671   17* internet stream tcp d0c9cae0
squid squid  671   18* internet stream tcp c9ebc570
squid squid  671   19* internet stream tcp d49c9000
squid squid  671   20* internet stream tcp d262eae0
squid squid  671   21 /cache 4031491 -rw-r--r--  2037934 r
squid squid  671   22* internet stream tcp ca1941d0
squid squid