Re: 9-STABLE - NFS - NetAPP:
On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:

Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option would do that. Am more interested in fixing the problem (if possible) than just masking it, but ...

Based on the man page for mount_nfs, wouldn't that have the opposite effect:

intr    Make the mount interruptible, which implies that file system calls that are delayed due to an unresponsive server will fail with EINTR when a termination signal is posted for the process.

I may be mis-reading, but from the above it sounds like a -9 *should* terminate the process if intr is enabled, while with it disabled, it would ignore it … ?

___ freebsd-stable@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/freebsd-stable To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org
Re: 9-STABLE - NFS - NetAPP:
On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote:

Marc Fournier wrote:

On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:

Btw Marc, if you just want this problem to go away, I suspect getting rid of the intr mount option would do that. Am more interested in fixing the problem (if possible) than just masking it, but ...

Based on the man page for mount_nfs, wouldn't that have the opposite effect:

intr    Make the mount interruptible, which implies that file system calls that are delayed due to an unresponsive server will fail with EINTR when a termination signal is posted for the process.

I may be mis-reading, but from the above it sounds like a -9 *should* terminate the process if intr is enabled, while with it disabled, it would ignore it … ?

Yes, you have misread it (or English is a wonderfully ambiguous thing, if you prefer ;-). For hard mounts (which is what you get if you specify neither soft nor intr), the RPCs behave like other I/O subsystems, which means they do non-interruptible sleeps (D stat in ps) waiting for server replies and continue to try and complete the RPC forever. You can't kill off the process/thread with any signal. If umount -f of the filesystem works, that terminates the thread(s). Unfortunately, umount -f is quite broken again. I have an idea on how to resolve this, but I haven't coded it yet. (The problem is that the process doing umount -f gets stuck before it does the VFS_UNMOUNT(), so the NFS client doesn't see it.)

Given how infrequently this problem generally manifests itself, is there an overall benefit from a debugging standpoint to my leaving intr on and reporting when it happens, including procstat output, and then upgrading to the latest kernel … ?

It's an annoyance, but it isn't like it happens daily, so I don't mind going through the process *towards* having it fixed if there is an overall benefit …
Re: 9-STABLE - NFS - NetAPP:
On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote:

He does get the odd error reported by nfs_getpages() and I don't think we've isolated why yet. The error is 13 (EACCES), but jhb@ thought it might be because of the bug he fixed where the krpc reported EACCES for the EINTR case. I don't think we've heard back from Marc w.r.t. whether he has gotten any more of these errors logged since applying jhb@'s patch and whether or not the errno has changed to EINTR?

As mentioned previously, it doesn't happen all that often … this latest one was after 21 days of uptime (or so) … I just upgraded the kernel on that machine to take into consideration changes to hfs *since* the last upgrade, so it might be another 20-30 days before it happens again *if* that last patch didn't fix it …

I have several servers that do have fully operational remote consoles though … to save time if/when it happens next, what all do I need to run?

ps auxlH
procstat -kk pid (for which process? … all part of that group, or just one of the apparently hung processes?)
sysctl debug.kdb.break_to_debugger=1 (shell)
Ctrl+Alt+Esc (from console)

Now, is there a way of forcing it to dump core so that I can run the various commands from a shell *after* it has rebooted? It is not particularly easy to redirect console output to a file (or is it?), so anything that scrolls off the screen is pretty much lost … I'm using a DRAC card in most cases, no serial consoles or anything like that that I can run within a script session … a 'ps' listing is 500 lines long, just to give an idea ...
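One way to approach the "for which process?" question is to collect kernel stacks for every process that is stuck in an uninterruptible sleep, rather than picking one by hand. The following is only a rough sketch: it assumes the snapshot was taken with `ps -axo pid,state,comm` (that column selection is an assumption, nobody in the thread asked for it), the sample data is made up, and it only builds the `procstat -kk` command lines without running them.

```python
def stuck_pids(ps_text):
    """Return PIDs whose state starts with 'D' (uninterruptible sleep).

    Expects the output of `ps -axo pid,state,comm` (assumed format):
    one header line followed by one process per line.
    """
    pids = []
    for line in ps_text.splitlines()[1:]:      # skip the header line
        fields = line.split(None, 2)
        if len(fields) < 3:
            continue                           # blank or malformed line
        pid, state, _command = fields
        if state.startswith("D"):
            pids.append(int(pid))
    return pids


def procstat_commands(pids):
    """Build one `procstat -kk <pid>` command line per stuck process."""
    return ["procstat -kk %d" % pid for pid in pids]


# Hypothetical snapshot: three hung du processes and one healthy shell.
SAMPLE = """\
  PID STAT COMMAND
 4101 D    du
 4102 D+   du
 4103 D    du
 4200 Ss   sh
"""

if __name__ == "__main__":
    for cmd in procstat_commands(stuck_pids(SAMPLE)):
        print(cmd)
```

Saving the resulting output to a file on local disk (not on the NFS mount) before touching the console would keep it from scrolling away.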
Re: 9-STABLE - NFS - NetAPP:
On 2013-02-13, at 15:16 , Konstantin Belousov kostik...@gmail.com wrote:

On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote:

I got it resent from him. I've attached it to this post, just in case you are interested in taking a look at it.

I do not find the voffset wchans surprising. All of them seem to occur in the multithreaded process. The usual reason for the voffset blocking is the use of the same file (as in struct file *) to perform operations from several threads in parallel. One thread locked the file offset by using read() or write(), and is sleeping waiting for the vnode lock. All other threads performing read or write on the same file, e.g. by using the same file descriptor, are blocked on the file offset before even trying to lock the vnode.

What I find interesting in the output you mailed is pid 93636. Note that several of its threads are in the 'T' state. It means stopped, while other threads obviously do file i/o due to the vofflock state. I wonder if some stopped thread owns the nfs vnode lock. It could be some omission in the handling of PBDRY/TDF_BDRY, or another bug. It is absolutely impossible to say anything definitive without proper diagnostics. At least the procstat -kk is needed.

I had sent out the output of procstat -kk at the time … for next time, would you need procstat against all of the 'duplicate processes' that aren't killable? For instance, in this case, there were three du commands running doing the same thing, none of which were killable … so procstat -kk for all three of those?
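The file-offset serialisation described above applies to read() and write() calls that share one open file. As an application-side illustration only (it does nothing about a vnode lock held by some other thread, which is the real question in this thread), here is a small sketch in which each thread passes an explicit offset to `os.pread()`. My understanding is that positional reads neither use nor move the shared file offset, but treat that as an assumption; the chunk size and the scratch file are made up for the example.

```python
import os
import tempfile
import threading

CHUNK = 4096  # bytes read by each worker (arbitrary for this example)


def read_chunks(path, nchunks):
    """Read nchunks consecutive chunks in parallel through one descriptor.

    Each worker passes its own offset to os.pread(), so no thread relies
    on (or moves) the descriptor's shared file offset.
    """
    fd = os.open(path, os.O_RDONLY)
    results = [None] * nchunks

    def worker(i):
        results[i] = os.pread(fd, CHUNK, i * CHUNK)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(nchunks)]
    try:
        for t in threads:
            t.start()
        for t in threads:
            t.join()
    finally:
        os.close(fd)
    return b"".join(results)


def demo():
    """Write a scratch file and read it back with four parallel workers."""
    data = os.urandom(4 * CHUNK)
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(data)
    try:
        return read_chunks(tmp.name, 4) == data
    finally:
        os.unlink(tmp.name)


if __name__ == "__main__":
    print("parallel positional reads matched:", demo())
```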
Server lock up: kern.maxswzone relate ...
I'm running a couple of brand new servers ... 32G of RAM, very little load on it right now, and this morning it locked up with that 'kern.maxswzone' error on the console ...

The server is running a reasonably current 7.2-STABLE:

FreeBSD pluto.hub.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Sun May 31 14:48:04 ADT

And top right now, with everything running, shows no swapping, 19G of Free memory, 9G of Inact memory ... no reason to do any serious amount of swapping.

last pid: 32159;  load averages:  0.12,  0.21,  0.47    up 0+10:57:56  11:53:39
573 processes: 1 running, 571 sleeping, 1 zombie
CPU: 2.0% user, 0.0% nice, 1.2% system, 0.0% interrupt, 96.8% idle
Mem: 1331M Active, 9446M Inact, 659M Wired, 35M Cache, 399M Buf, 19G Free
Swap: 32G Total, 32G Free

In fact, my other server (same config) has been up 9 days (they were put online 9 days ago), and top shows it doing a little bit of swapping, but, again, huge amounts of Inact memory:

last pid: 26307;  load averages:  0.36,  0.35,  0.36    up 9+17:03:48  11:57:54
680 processes: 2 running, 657 sleeping, 21 zombie
CPU: 0.7% user, 0.0% nice, 0.4% system, 0.0% interrupt, 98.9% idle
Mem: 2915M Active, 25G Inact, 778M Wired, 13M Cache, 399M Buf, 1771M Free
Swap: 32G Total, 1044K Used, 32G Free

So these servers right now are definitely not feeling any pain ...

And, based on experiences with another server, I have my /boot/loader.conf set to:

kern.maxswzone=67108864

So, the question is ... what am I missing? Is there some magical formula for calculating maxswzone that 7.2 is missing? Some nagios plug-in I should be using to monitor ... what?

Help?

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
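For reference, the loader.conf value above is a plain byte count. The sketch below only converts it to MiB; it makes no claim about how much swap that budget can actually describe, since that depends on kernel internals not covered in this thread. The helper name is made up for the example.

```python
MAXSWZONE_BYTES = 67108864  # value from /boot/loader.conf quoted above


def bytes_to_mib(nbytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return nbytes / (1024 * 1024)


if __name__ == "__main__":
    print("kern.maxswzone = %d bytes = %.0f MiB"
          % (MAXSWZONE_BYTES, bytes_to_mib(MAXSWZONE_BYTES)))
```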
More data on 7.2-RELEASE hangs
Don't know if this helps with anything, but it just hung after 2 days again ... nothing on the console ... the top process running at the time shows the following ... anything there look concerning?

last pid:  5196;  load averages:  9.25, 15.97, 10.07    up 2+07:58:36  04:02:28
1874 processes: 317 running, 1537 sleeping, 20 zombie
CPU: 6.2% user, 0.0% nice, 6.7% system, 0.3% interrupt, 86.8% idle
Mem: 4552M Active, 162M Inact, 684M Wired, 46M Cache, 399M Buf, 8240K Free
Swap: 8192M Total, 1308M Used, 6884M Free, 15% Inuse, 1360K In, 63M Out

  PID USERNAME THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
28752 root       5  96    0   427M   408M select 1   1:55  0.00% named
 9720 nobody    19  97    0   402M   186M RUN    1   0:00  0.69% nsd
54395 root      16  20    0  1308M   163M kserel 0   0:00  0.00% java
 8500 nobody    10 102    0   193M 86492K ucond  1   0:07  0.00% nsd
 3302 102        1  96    0   158M 66100K select 1   0:37  0.00% postgres
 7853 1304       1  96    0   154M 54408K select 1   0:39  0.00% postgres
10670 88        28  20    0   335M 42488K kserel 0   0:00  0.44% mysqld
 4976 root       5   4    0 95444K 41740K kqread 1   1:09  0.00% named
14003 www       44  96    0   443M 41632K ucond  1   0:00  0.00% java
 8528 nobody    15  96    0   188M 37904K ucond  1   0:00  0.00% nsd
 5157 88       109  96    0 97620K 33704K RUN    0   0:00  0.00% mysqld
 1759 www        1   4    0   167M 32276K select 1   0:01  0.00% httpd
99407 www        1   4    0   165M 31712K sbwait 0   0:02  0.00% httpd
 4006 www        1   4    0   124M 31424K sbwait 1   0:01  0.29% httpd
 1299 www        1   4    0   164M 31376K sbwait 1   0:02  0.00% httpd
 1758 www        1   4    0   164M 31176K sbwait 0   0:02  0.00% httpd
99402 www        1  96    0   163M 29892K CPU1   1   0:03  0.00% httpd
 4036 www        1  20    0   122M 28680K lockf  1   0:00  0.00% httpd
 1757 www        1   4    0   158M 27856K sbwait 1   0:02  0.00% httpd
 3899 www        1  96    0   160M 27688K RUN    0   0:00  0.00% httpd
 4007 www        1  20    0   125M 27588K lockf  0   0:01  2.10% httpd
 4525 www        1  96    0   158M 26624K RUN    1   0:00  0.00% httpd
 4607 www        1  96    0   158M 26096K RUN    0   0:00  0.00% httpd
13635 88        34  96    0 92340K 25604K CPU0   0   0:00  0.05% mysqld
 4024 www        1  96    0   156M 24880K RUN    1   0:00  0.10% httpd
 3585 102        1   4    0   163M 24748K sbwait 1   2:56  0.00% postgres
 3951 www        1  96    0   155M 24548K RUN    1   0:00  0.10% httpd
 4022 www        1  96    0   155M 24320K RUN    0   0:00  0.00% httpd
 3960 www        1  96    0   155M 24316K RUN    1   0:00  0.00% httpd
 3388 102        1   4    0   161M 24228K sbwait 0   1:07  0.00% postgres
 4023 www        1  96    0   155M 23988K RUN    1   0:00  0.00% httpd
99468 www        1  96    0   104M 23660K RUN    1   0:03  0.00% httpd
99423 www        1   4    0   154M 23456K sbwait 0   0:03  0.00% httpd
 3959 www        1  -4    0   103M 23144K devfs  0   0:00  0.00% httpd
 5004 www        1   4    0   154M 23032K sbwait 1   0:00  0.00% httpd
62771 www        1 -16    0   143M 22824K vnread 1   0:01  0.00% httpd
 4612 www        1  96    0   153M 21936K RUN    1   0:00  0.15% httpd
 4609 www        1  96    0   153M 21936K RUN    0   0:00  0.05% httpd
 5180 www        1  96    0   145M 21660K RUN    0   0:12  0.00% httpd
 5007 www        1   4    0   115M 21360K sbwait 0   0:00  0.29% httpd
57327 www        1  -8    0   145M 20996K biord  0   0:04  0.20% httpd
29064 www        1  -8    0   143M 20812K biord  1   0:04  0.00% httpd
99381 www        1  96    0   151M 19364K RUN    1   0:04  0.00% httpd
 4682 root       1   4    0 62388K 17828K kqread 1   0:00  0.00% perl
 9447 88         8  20    0 61388K 17508K kserel 0   0:00  0.05% mysqld
13457 bind       5  96    0 45724K 17424K RUN    0   0:14  0.00% named
87535 www        1   4    0   149M 17396K sbwait 1   0:09  0.00% httpd
 4611 www        1   4    0   146M 17008K sbwait 1   0:00  0.00% httpd
 3386 102        1  -4    0   163M 16544K semwai 0   0:51  0.00% postgres
91929 www        1   4    0   113M 16196K sbwait 0   0:04  0.00% httpd
 4757 www        1  96    0   145M 16144K RUN    0   0:00  0.00% httpd
10269 88         5  20    0 57504K 16000K kserel 0   0:00  0.00% mysqld
 3946 www        1   4    0   126M 15552K sbwait 1   0:01 15.00% httpd
 3619 www        1   4    0   113M 15172K sbwait 1   0:00  0.00% httpd
 3385 102        1  96    0   163M 14932K RUN    1   0:50  0.00% postgres
28755 102        1   4    0   159M 14760K sbwait 0  31:36  0.35% postgres

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
Re: More data on 7.2-RELEASE hangs
On Wed, 13 May 2009, John Baldwin wrote:

On Wednesday 13 May 2009 3:09:33 am Marc G. Fournier wrote:

Don't know if this helps with anything, but it just hung after 2 days again ... nothing on the console ... the top process running at the time shows the following ... anything there look concerning?

Is this a 2 CPU system? If so, both CPUs are actually running something, so it is not a deadlock per se.

Yes:

CPU: Intel(R) Xeon(TM) CPU 3.40GHz (3400.14-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0xf43  Stepping = 3
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x649d<SSE3,RSVD2,MON,DS_CPL,EST,CNXT-ID,CX16,xTPR>
  AMD Features=0x2800<SYSCALL,LM>
  Logical CPUs per core: 2
usable memory = 6368911360 (6073 MB)
avail memory = 6141906944 (5857 MB)
ACPI APIC Table: <HP 0083>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID: 0
 cpu1 (AP): APIC ID: 6
ioapic1: Changing APIC ID to 9

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
Re: More data on 7.2-RELEASE hangs
On Wed, 13 May 2009, Mike Tancsa wrote:

What does your kernel config look like ?

Included below ... the only thought I had, that I haven't tried yet, was changing from SCHED_4BSD -> SCHED_ULE ...

machine         amd64
cpu             HAMMER
ident           kernel

options         SMP
options         SCHED_4BSD              # 4BSD scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         COMPAT_43               # Needed by COMPAT_LINUX32
options         COMPAT_IA32             # Compatible with i386 binaries
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         COMPAT_FREEBSD6         # Compatible with FreeBSD6
options         COMPAT_LINUX32          # Compatible with i386 linux binaries
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI
options         KTRACE                  # ktrace(1) support
options         SYSVSHM
options         SHMMAXPGS=199608
options         SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)
options         SYSVSEM
options         SEMMNI=4096
options         SEMMNS=8192
options         SYSVMSG                 # SYSV-style message queues
options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev
options         ADAPTIVE_GIANT          # Giant mutex is adaptive.
options         LINPROCFS               # Cannot be a module yet.

# Bus support.
device          acpi
device          pci

# Serial (COM) ports
device          sio                     # 8250, 16[45]50 based serial ports

device          scbus                   # SCSI bus (required for SCSI)
device          da                      # Direct Access (disks)
device          pass                    # Passthrough device (direct SCSI access)
device          ses                     # SCSI Environmental Services (and SAF-TE)
device          ciss                    # Compaq Smart RAID 5*
device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse
device          vga                     # VGA video card driver
device          splash                  # Splash screen and screen saver support
device          sc
device          agp                     # support several AGP chipsets
device          miibus                  # MII bus support
device          bge                     # Broadcom BCM570xx Gigabit Ethernet
device          loop                    # Network loopback
device          random                  # Entropy device
device          ether                   # Ethernet support
device          pty                     # Pseudo-ttys (telnet etc)
device          bpf                     # Berkeley packet filter

options         ALT_BREAK_TO_DEBUGGER
options         KDB
options         DDB

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
Re: More data on 7.2-RELEASE hangs
On Wed, 13 May 2009, John Baldwin wrote:

On Wednesday 13 May 2009 3:09:33 am Marc G. Fournier wrote:

Don't know if this helps with anything, but it just hung after 2 days again ... nothing on the console ... the top process running at the time shows the following ... anything there look concerning?

Is this a 2 CPU system? If so, both CPUs are actually running something, so it is not a deadlock per se.

99402 www        1  96    0   163M 29892K CPU1   1   0:03  0.00% httpd
13635 88        34  96    0 92340K 25604K CPU0   0   0:00  0.05% mysqld

Here is what vmstat shows ~10 minutes before (or as) it hung solid last time. I didn't think to save the one that ran just before this one (the script runs every 5 minutes), but for the 'r b w' columns 'b' was around 10ish, while 'w' was 0 ... within a 5 minute period of time, 'w' literally skyrockets:

procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
107 266 122 16155620 23084 3255 22 1 2 3358 1605 0 0 377 17835 5231 19 7 73
6 285 382 16446348 22532 111705 21155 1391 10049 51966 2187328 143 0 36344 499098 423971 3 2 95
0 73 386 16440468 23072 7052 1155 85 44 1292 73 372 0 1030 18631 8334 18 12 70
0 77 388 16440468 23088 126 1050 0 621 27 169 0 521 4186 4125 2 3 94
0 66 389 16440468 23104 4 713 0 1344 58 227 0 352 2217 3504 0 5 95

-- John Baldwin

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
Re: More data on 7.2-RELEASE hangs
On Wed, 13 May 2009, John Baldwin wrote:

Well, you had a whole lot of page faults and other VM activity, plus 500k syscalls. The 'w' is a count of swapped processes, so basically your box is swapping a whole lot it seems. I think your box is just overloaded.

I knew I was going to regret posting that :( What I posted was what vmstat 5 shows after the issue *starts*, not what it normally looks like ... right now, after 10 hours of uptime, and all the same processes running, it looks like:

io# vmstat 5 (10 hours uptime now)
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
0 1 0 10477M 301M 3503 13 1 2 3620 286 0 0 331 45491 4566 26 8 66
0 1 0 10430M 305M 278 7 0 0 550 0 18 0 186 19243 2917 4 3 93
1 1 0 10474M 295M 511 0 0 0 359 0 91 0 253 11632 3516 7 3 90
0 1 0 10447M 310M 819 3 0 0 1473 0 14 0 143 29575 2486 8 3 89
0 1 0 10558M 295M 5008 18 13 5 4128 0 121 0 345 24212 4215 16 7 77

Right now, IO is running ~775 processes ... at the time of the vmstat I provided earlier, it was up to 1400 processes ... since there is only 5 minutes between script runs, something is causing it to go from zero swap -> high swap within a very short period of time, but since things get badly locked up when it happens, I can't isolate where ...

I've got the following two ps outputs at the time of the high paging:

/bin/ps -aucxHl -O jid > ps-long.out
/bin/ps -aux -O jid > ps-short.out

Is there anything in there that I could look at as far as what is putting things over the edge?

As to the 'overloaded server', here is another server, with more running on it, but the exact same configuration:

neptune# vmstat 5 (3 days, 18 hours uptime now)
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
0 0 0 12521M 303M 3969 15 5 3 2271 1603 0 0 444 6491 5165 37 19 44
0 0 0 12464M 309M 3009 1 0 15 2833 0 104 0 296 9378 3689 7 5 88
23 0 0 12476M 297M 3845 3 0 0 2627 0 31 0 279 10545 2986 14 5 81
0 1 0 12530M 266M 5259 0 1 0 2551 0 145 0 432 18070 4133 45 8 47
1 0 0 12587M 237M 7049 0 1 0 4484 0 171 0 357 15953 4715 29 7 64

So, normally these servers purr ... and are highly responsive ...

In fact, here is an older 32bit server, with less RAM, running about 50% more processes than neptune:

mercury# vmstat 5
procs memory page disks faults cpu
r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
3 14 1 6817M 114M 641 7 3 1 1036 386 0 0 1109 464 157 5 5 90
0 8 0 6817M 224M 596 33 0 5 5667 3850 86 0 1303 5768 3885 6 7 87
1 10 0 6824M 220M 4332 32 2 0 3228 0 17 0 755 9689 3057 8 7 85
0 9 0 6798M 219M 430 0 0 0 712 0 12 0 1274 4276 3877 2 2 95
0 11 0 6830M 205M 1026 4 1 3 481 0 84 0 1503 5586 4370 6 4 89

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
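Since the 'w' column (swapped-out processes) is what separates the healthy samples from the bad one, a monitor script could watch just that column. A minimal sketch follows, assuming vmstat output laid out as in this thread (two header lines, then rows starting with `r b w`); the threshold of 100 is an arbitrary value chosen for illustration, and the sample deliberately mixes one healthy row from io with two rows from the earlier hang so the jump is visible.

```python
def swapped_counts(vmstat_text):
    """Extract the 'w' column (third field) from vmstat data rows.

    Header lines are skipped because their first three fields are not
    all numeric. Only the first three fields are read, so garbling in
    later columns does not matter.
    """
    counts = []
    for line in vmstat_text.splitlines():
        fields = line.split()
        if len(fields) < 3 or not all(f.isdigit() for f in fields[:3]):
            continue
        counts.append(int(fields[2]))
    return counts


def first_jump(counts, threshold):
    """Index of the first sample whose 'w' rose by more than threshold.

    Returns None when no consecutive pair of samples differs that much.
    """
    for i in range(1, len(counts)):
        if counts[i] - counts[i - 1] > threshold:
            return i
    return None


# One healthy row followed by two rows from the hang, copied from the thread.
SAMPLE = """\
 procs memory page disks faults cpu
 r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
 0 1 0 10477M 301M 3503 13 1 2 3620 286 0 0 331 45491 4566 26 8 66
 107 266 122 16155620 23084 3255 22 1 2 3358 1605 0 0 377 17835 5231 19 7 73
 6 285 382 16446348 22532 111705 21155 1391 10049 51966 2187328 143 0 36344 499098 423971 3 2 95
"""

if __name__ == "__main__":
    w = swapped_counts(SAMPLE)
    print("w column:", w, "first jump at sample", first_jump(w, 100))
```

Run against successive `vmstat.out` files, a check like this could trigger an extra `ps` snapshot at the moment the count starts climbing, instead of waiting for the next 5-minute run.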
Re: More data on 7.2-RELEASE hangs
On Wed, 13 May 2009, Steven Hartland wrote:

We've seen things similar to this when an uncommon process does a query which locks a table for a large amount of time on mysql.

So many reasons why I hate MySQL :(

One thing that we are trying right now is actually along these lines ... we've been working with MySQL 5.1 + NDBD for clustering ... after the last hang, we disabled both the NDBD startup, and mysql, to see if that is the cause, so nice to have some validation on this one ...

In our example this turned out to be an admin query in vbulletin. When it happened it turned a machine which was purring along quite nicely into a totally unresponsive machine in a matter of a few seconds as apache spawned more processes that also then instantly stalled...

Let me check that the next time around ... compare the specific # of http processes between monitor runs and see if there is a 'sudden jump' ...

We'll see how the next 'test period' works out, with that MySQL stuff offline ... the other thing I've been working on is moving jails off of that server, one at a time, to see if I can narrow down which one is causing the spike ... I will focus on the mysql backend ones going forward, to eliminate those ...

Thx ...

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
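The "compare the number of http processes between monitor runs" idea can be sketched roughly as below. It assumes each ps snapshot has one header line and puts the bare command name in the last column (as `ps -c` does); with `-H` every thread gets its own line, so the counts would be threads rather than processes. The growth factor and minimum are arbitrary illustration values, and the function names are made up.

```python
from collections import Counter


def command_counts(ps_text):
    """Count lines per command name, taking the last column as the command."""
    counts = Counter()
    for line in ps_text.splitlines()[1:]:      # skip the header line
        fields = line.split()
        if fields:
            counts[fields[-1]] += 1
    return counts


def sudden_jumps(before, after, factor=2.0, minimum=10):
    """Commands whose count at least doubled and grew by >= minimum.

    `before` and `after` map command name -> count for two consecutive
    monitor runs. Returns {command: (old_count, new_count)}.
    """
    jumps = {}
    for name, new in after.items():
        old = before.get(name, 0)
        if new - old >= minimum and new >= factor * max(old, 1):
            jumps[name] = (old, new)
    return jumps


if __name__ == "__main__":
    # Hypothetical counts from two runs five minutes apart.
    earlier = Counter(httpd=40, mysqld=3, postgres=6)
    later = Counter(httpd=95, mysqld=3, postgres=7)
    print(sudden_jumps(earlier, later))
```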
Debugging server hangs in 7.2-RELEASE
I am so completely running out of ideas on how to debug this, maybe someone else has some ideas?

The problem appears to be that very suddenly, the disk busy (according to vmstat) skyrockets to 100 (from 0) and then the 'runnable but swapped' column slowly rises ...

One person suggested that they saw something similar when msi/msi-x was enabled ... after searching the source code, I found that msi was used in the bge driver, but I couldn't find msix used anywhere else on that machine, so I disabled msi ... it's still exhibiting the issue ...

I get no errors on the serial console to indicate any problems, and until a relatively recent upgrade of the kernel (I can't give an exact date), this server was one of my most solid ...

I figure there is a single process that is starting up on the machine that is causing this, but no matter what I try, it is eluding me.

I have KDB enabled in the kernel, and the serial console set up so that I can break to it ... but when this problem happens, doing 'cr ~ ^b' through the serial console doesn't do anything, or it just prints the message about breaking to the debugger and then hangs there ...

My next option is to start time travelling backwards to see if I can find a 'stable kernel' again, but if it is just one process causing this, then going back to older kernels isn't necessarily going to accomplish anything ...

Is there something else I can do here to debug this? It's hard to believe we are such an advanced OS, but debugging issues like this is so elusive :(

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
On Tue, 28 Apr 2009, Gavin Atkinson wrote:

On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:

Hi Marc and List, I had similar issues with FreeBSD 7.2-PRERELEASE. The server (zfs, nfs) seems to hang at intervals of about 8 hours. The kernel is still there but no connections can be made to nfs/ssh, and login on the local console doesn't seem to work due to incredible slowness. Breaking to the debugger takes a moment but works. (Compiling the kernel with WITNESS didn't help.) The server had been solid before with a 7-STABLE kernel from around 19 October 2008. I have now added these lines to /boot/loader.conf

hw.pci.enable_msi=0
hw.pci.enable_msix=0

to disable Message Signaled Interrupts, which are used by the 3ware twa driver and igb network driver on our server.

If you are willing to test further on your server, it may be helpful if you could determine which of those two lines in loader.conf fixes the problem for you. It would also be useful to provide a dmesg from the machine when both msi and msix are enabled. FWIW, looking at the vmstat -i output it appears that only the igb driver is using MSI/MSIX, unless you have a reason to suspect otherwise?

How do you tell that, about igb? Looking at the server I have the igb device on, it doesn't seem to say anything about that ...

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                         162          0
irq30: twa0                    402647215        187
cpu0: timer                   4284778818       1999
irq256: igb0                  1282945461        598
irq257: igb0                   215507100        100
irq258: igb0                   417702261        194
irq259: igb0                   314601966        146
irq260: igb0                   568062067        265
irq261: igb0                           3          0
cpu5: timer                       428475       1999
cpu6: timer                   4284731466       1999
cpu7: timer                   4284724508       1999
cpu1: timer                   4284893874       1999
cpu3: timer                   4284899807       1999
cpu2: timer                   4284892325       1999
cpu4: timer                   4284897264       1999
Total                        37480028742      17493

On the server(s) that I am experiencing the hangs on, vmstat -i shows:

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq3: sio1                             8          0
irq25: bge0                      4614816        213
irq72: ciss0                     1835763         85
cpu0: timer                     43113685       1997
cpu1: timer                     43116889       1997
Total                           92681163       4293

Are any of these similarly using MSI/MSIX?

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
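One plausible reading of how the igb conclusion was drawn from `vmstat -i` is the IRQ numbering: the igb0 vectors start at irq256, while twa0, bge0 and ciss0 sit on low-numbered lines. The sketch below classifies rows on that basis. The cutoff of 256 is an assumption about how x86 FreeBSD of this era numbers message-signaled interrupts, not something stated in the thread, so treat the result as a hint to confirm against dmesg rather than as proof.

```python
import re

# Matches rows such as "irq256: igb0   1282945461   598".
IRQ_LINE = re.compile(r"^irq(\d+):\s+(\S+)\s+(\d+)\s+(\d+)$")

# Assumption: MSI/MSI-X vectors are numbered from 256 upward on x86.
FIRST_MSI_IRQ = 256


def classify(vmstat_i_text):
    """Return (device, irq, kind) for each irqN row of `vmstat -i` output.

    kind is "msi/msi-x" when the IRQ number is at or above the assumed
    cutoff, otherwise "legacy". Timer and total rows are ignored.
    """
    rows = []
    for line in vmstat_i_text.splitlines():
        match = IRQ_LINE.match(line.strip())
        if not match:
            continue
        irq, device = int(match.group(1)), match.group(2)
        kind = "msi/msi-x" if irq >= FIRST_MSI_IRQ else "legacy"
        rows.append((device, irq, kind))
    return rows


# Rows taken from the two listings above.
SAMPLE = """\
irq1: atkbd0 162 0
irq30: twa0 402647215 187
irq256: igb0 1282945461 598
irq25: bge0 4614816 213
irq72: ciss0 1835763 85
cpu0: timer 43113685 1997
"""

if __name__ == "__main__":
    for device, irq, kind in classify(SAMPLE):
        print("%-8s irq%-4d %s" % (device, irq, kind))
```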
RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
'k, based on grep'ing the source files, it turns out that the if_bge device driver uses msi, while, as you point out, igb uses msix ... I have disabled msi on the two servers with bge devices, and msix on the one with igb ... all three have given the same sort of problem after varying periods of time ... let's see if I can get to 30 days uptime with this ...

On Tue, 28 Apr 2009, Gavin Atkinson wrote:

On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:

Hi Marc and List, I had similar issues with FreeBSD 7.2-PRERELEASE. The server (zfs, nfs) seems to hang at intervals of about 8 hours. The kernel is still there but no connections can be made to nfs/ssh, and login on the local console doesn't seem to work due to incredible slowness. Breaking to the debugger takes a moment but works. (Compiling the kernel with WITNESS didn't help.) The server had been solid before with a 7-STABLE kernel from around 19 October 2008. I have now added these lines to /boot/loader.conf

hw.pci.enable_msi=0
hw.pci.enable_msix=0

to disable Message Signaled Interrupts, which are used by the 3ware twa driver and igb network driver on our server.

If you are willing to test further on your server, it may be helpful if you could determine which of those two lines in loader.conf fixes the problem for you. It would also be useful to provide a dmesg from the machine when both msi and msix are enabled. FWIW, looking at the vmstat -i output it appears that only the igb driver is using MSI/MSIX, unless you have a reason to suspect otherwise?

Gavin

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org Yahoo . yscrappy Skype: hub.org ICQ . 7615664
server hangs, break to DDB hangs ...
I have two HP Proliant servers that, until recently, have run very stable ... within the past 2 months, the servers hang after anywhere from 10hrs through 19 days (one just hung up this aft) ... vmstat, about the time it hangs, shows: # cat 16/vmstat.out procs memory pagedisks faults cpu r b w avmfre flt re pi pofr sr da0 pa0 in sy cs us sy id 109 156 1 17035752 62152 803 19 5 3 1907 1785 0 0 437 294 853 50 28 22 2 332 5 17109460 23056 147346 4319 2061 3139 44030 6539423 1029 0 4027 398263 38616 40 58 2 0 32 8 17110588 23052 626 4216 35 203 344 745 572 0 597 16414 5741 4 10 86 0 35 14 17110592 23084 446 5102 2 410 210 1596 540 0 516 31616 4461 4 10 85 0 25 20 17110588 23032 196 7734 2 28022 1179 445 0 434 34992 3543 5 7 88 with, by the time I was able to reboot it, the final vmstat was showing: # cat 46/vmstat.out procs memory pagedisks faults cpu r b w avmfre flt re pi pofr sr da0 pa0 in sy cs us sy id 1 492 1595 24292424 99564 809 20 5 4 1909 1896 0 0 437 737 863 50 28 22 1 399 1596 24285028 90708 6195 152 393 76 3185 1061 414 0 683 54948 32062 8 9 82 2 231 1595 24276684 85164 4709 94 219 152 3729 642 554 0 420 39442 20612 7 12 80 1 174 1595 24259144 71288 8204 143 314 158 3379 1314 605 0 547 36228 21219 11 18 71 2 199 1593 24242500 72116 4637 52 251 195 3957 1609 496 0 383 32305 20225 6 12 82 When I try and break to DDB, all I get on the screen is: === KDB: enter: Break sequence on conec === And then it hangs there ... 
I have ps listings that go back for just over an hour before I rebooted (the script runs every 5 minutes, or is supposed to): # ls -lt */ps* -rw-r--r-- 1 root wheel 509908 May 5 16:47 46/ps.out -rw-r--r-- 1 root wheel 450704 May 5 16:35 35/ps.out -rw-r--r-- 1 root wheel 424047 May 5 16:32 26/ps.out -rw-r--r-- 1 root wheel 329105 May 5 16:21 21/ps.out -rw-r--r-- 1 root wheel 278189 May 5 16:17 16/ps.out -rw-r--r-- 1 root wheel 246726 May 5 15:55 55/ps.out -rw-r--r-- 1 root wheel 231937 May 5 15:50 50/ps.out -rw-r--r-- 1 root wheel 240260 May 5 15:45 45/ps.out -rw-r--r-- 1 root wheel 234731 May 5 15:40 40/ps.out -rw-r--r-- 1 root wheel 233719 May 5 15:30 30/ps.out -rw-r--r-- 1 root wheel 222749 May 5 15:25 25/ps.out -rw-r--r-- 1 root wheel 231617 May 5 15:20 20/ps.out Looking at swap usage over that period, its obvious that something is sucking back the RAM reasonably fast: neptune# cat 46/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 13789464 298775282% neptune# cat 35/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 12482312 429490474% neptune# cat 26/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 12351920 442529674% neptune# cat 21/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 7807240 896997647% neptune# cat 16/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 5752832 1102438434% neptune# cat 55/swap.out Device 512-blocks UsedAvail Capacity /dev/da0s1b 16777216 4398928 1237828826% But I'm not sure what to look at in the ps output to determine what is going awry here ... I'm running 7.1-STABLE FreeBSD 7.1-STABLE #14: Sat Mar 28 00:05:19 ADT 2009 On the server that just hung, so will upgrade to the latest 7.2-RELEASE next, but ... if someone can give me pointers at what else I should be checking for, or something in the ps listings that I should be looking for? 
My monitor script is currently doing:

/usr/sbin/jls > jaillist.out
/bin/ps -aucxHl -O jid > ps.out
/usr/sbin/pstat -s > swap.out
/usr/bin/vmstat 1 5 > vmstat.out
/usr/bin/awk '{print $15}' /proc/*/status | /usr/bin/sort | /usr/bin/uniq -c > vps_dist.out

Any pointers appreciated ...

Thx

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org MSN . scra...@hub.org
Yahoo . yscrappy Skype: hub.org ICQ . 7615664
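A possible shape for the whole collector, as a sketch only: it assumes each command's output goes into the like-named .out file, and that snapshots live in one directory per minute-of-hour (inferred from names like 16/ and 46/ in the listings). `SNAP_ROOT` and `collect_snapshot` are hypothetical names, not taken from the original monitor.sh.

```shell
# Sketch of a snapshot collector meant to be run from cron every 5 minutes.
# SNAP_ROOT is a hypothetical location for the per-minute directories.
SNAP_ROOT=${SNAP_ROOT:-/var/tmp/monitor}

collect_snapshot() {
    dir="$SNAP_ROOT/$(date +%M)"
    mkdir -p "$dir" || return 1
    # Each command may fail on its own; the "|| true" keeps one missing
    # tool from stopping the rest of the snapshot from being written.
    /usr/sbin/jls          > "$dir/jaillist.out" 2>&1 || true
    /bin/ps -aucxHl -O jid > "$dir/ps.out"       2>&1 || true
    /usr/sbin/pstat -s     > "$dir/swap.out"     2>&1 || true
    /usr/bin/vmstat 1 5    > "$dir/vmstat.out"   2>&1 || true
    /usr/bin/awk '{print $15}' /proc/*/status 2>/dev/null \
        | /usr/bin/sort | /usr/bin/uniq -c > "$dir/vps_dist.out" 2>&1 || true
    # Report where the snapshot was written.
    echo "$dir"
}
```

Writing the files even when a command fails means a gap in the data shows up as an error message in the file rather than as a missing file, which may help when the box is already struggling.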
7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...
Hi ...

Over the past little while, two of my servers have suddenly started to hang ... servers that, up until this started, have been reasonably rock solid ... they are generally within a day of each other for source code, and the hardware on both is pretty much identical (HP Proliant DL360 Servers) ...

I have serial console configured on both so that I can do CR ~ ^b to get to DDB ... except, when it hangs, all I get is:

KDB: enter: Break sequence on console

And it hangs there, no prompt.

I set up a simple script (see attached) to run every 5 minutes that gathers various pieces of info that I think are pertinent, but most likely don't cover everything ...

Whenever this happens, on either machine, vmstat shows data *like* (notice the high procs - w values?):

 procs memory page disks faults cpu
 r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
 165 106 2 12699168 33840 3080 38 2 2 3082 1623 0 0 337 36961 4731 18 7 75
 64 75 4 12761744 23084 46809 623 65 43 19307 116 334 0 1189 83674 11708 70 20 10
 1 68 25 12773980 23068 11036 3003 9 36 4055 116 282 0 1336 78346 14869 56 16 28
 0 71 25 12774236 23084 186 769 1 518 80 249 0 609 9298 5894 5 5 91
 5 90 31 12747296 23352 626 2546 5 104 1147 368 281 0 1536 40945 19980 6 5 90

Where procs - w just seems to keep rising ... note that the output for vmstat *5 minutes before* shows:

 procs memory page disks faults cpu
 r b w avm fre flt re pi po fr sr da0 pa0 in sy cs us sy id
 35 121 0 12414692 90552 3080 32 2 1 3090 1403 0 0 337 37022 4730 18 7 75
 31 93 0 12314408 62024 36550 414 46 6 34285 27 563 0 916 94851 8813 67 33 0
 43 179 0 12270932 23080 24035 101 41 12 13887 36 375 0 766 61969 6945 69 23 7
 92 44 0 12265524 119804 2122 2028 1 32 13051 1096092 205 0 558 19460 4561 19 50 32
 38 34 0 12330068 89140 30758 103 39 119 37037 2837365 165 0 773 92041 7111 47 53 0

I have one QEMU VPS running on this box, with kqemu running the latest kernel module ... but the other machine experiencing the same issue is only running FreeBSD jails ...

Both servers are running SCHED_4BSD, if that matters any ... ?

I'm at a loss as to what to look at / for next ... pointers would be greatly appreciated ... I have the various output files that the script generates available if anyone thinks they would be useful ...

thank you ...

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
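To check the "procs - w keeps rising" observation across saved snapshots, a small sketch like the following could be used. It assumes `w` is the third field of each vmstat data row, as in the output above; the helper names `vmstat_w_column` and `vmstat_w_rising` are hypothetical.

```shell
# Sketch: pull the procs "w" column (third field) out of saved vmstat output.
vmstat_w_column() {
    # Data rows start with a number; the two header rows do not.
    awk '$1 ~ /^[0-9]+$/ { print $3 }' "$1"
}

# Succeeds (exit 0) when the last sample's "w" is higher than the first's.
vmstat_w_rising() {
    vmstat_w_column "$1" |
        awk 'NR == 1 { first = $1 + 0 }
             { last = $1 + 0 }
             END { exit !(NR > 1 && last > first) }'
}
```

For example, `vmstat_w_rising 16/vmstat.out && echo "w is climbing"` would flag a snapshot in which swapped-out processes grew during the five samples.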
Re: ALT_BREAK_TO... + ILO ... missing something in config ...
On Sat, 28 Mar 2009, Danny Braniss wrote:

> unless the serial port is setup as console, check if /boot/device.hints has:
>     hint.sio.0.flags=0x10
> escaping to the debugger is not caught.
>
> btw, Jeremy Chadwick had a nice explanation, but I lost the URL.

That was the missing piece, thank you ... I can now break down into DDB through the VSP ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
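A quick way to check for that hint before rebooting might look like the sketch below. It assumes the console bit is 0x10 in `hint.sio.0.flags`, as in the reply above, and that the value may or may not be quoted in the file; `sio0_console_flag` is a hypothetical helper name.

```shell
# Sketch: succeed if a device.hints file marks sio0 as a console (bit 0x10).
sio0_console_flag() {
    hints=${1:-/boot/device.hints}
    # Take the last assignment and keep only the numeric part of the value.
    val=$(sed -n 's/^hint\.sio\.0\.flags="\{0,1\}\([0-9a-fA-Fx]*\).*$/\1/p' "$hints" | tail -n 1)
    [ -n "$val" ] || return 1
    # Test the 0x10 bit; other bits in the flags value are ignored.
    [ $(( $val & 0x10 )) -ne 0 ]
}
```

Run as `sio0_console_flag && echo "sio0 is flagged as console"`; with no argument it reads /boot/device.hints.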
ALT_BREAK_TO... + ILO ... missing something in config ...
Due to an issue I'm having with 7.x, and trying to track it down, I spent tonight getting my server set up to allow me to break into the debugger when it hangs, and hopefully dump core ... But, although I *think* I've got it all, I'm obviously missing something, as it isn't breaking ...

First ... I'm running a proliant server, and when I connect via SSH to ILO on that machine, and type 'vsp', I get a shell as I expect, I can type, etc ... when I reboot the machine, I get the opening splash screen with the 7(?) options (normal boot, single user mode, etc, etc) ... but I get nothing between that and the login prompt ... first sign of a problem, maybe?

Next, the easy question ... what is the key stroke to issue when one has ALT_BREAK_TO_DEBUGGER set in the kernel? I thought it was CR ~ ^b ... is that correct? I'm using putty to connect via ssh, if that makes a difference ... I've also tried using the browser interface into ilo / vsp, same lack of a result ...

Beyond adding the sio device driver to my kernel, I've also got:

options ALT_BREAK_TO_DEBUGGER
options KDB
options DDB

Missing a kernel option maybe?

I have the following in /boot/loader.conf:

comconsole_speed=9600
console=vidconsole,comconsole # A comma separated list of console(s)
boot_multicons=-D # -D: Use multiple consoles
boot_serial=-h # -h: Use serial console

So ... either I don't have it enabled like I think, or I'm doing the wrong key stroke ... or ...

Thx

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
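As a sanity check on the kernel side only, the three options named in the message above can be looked for in a kernel configuration file with something like this sketch. The helper name `missing_kdb_options` and the idea of checking the config file (rather than the running kernel) are assumptions for illustration; it says nothing about the serial console setup itself.

```shell
# Sketch: list which of the debugger-related kernel options named above
# are absent from a kernel configuration file.
missing_kdb_options() {
    conf=$1
    for opt in KDB DDB ALT_BREAK_TO_DEBUGGER; do
        # Match "options <NAME>" at the start of a line; commented-out
        # lines (starting with '#') do not count.
        grep -Eq "^[[:space:]]*options[[:space:]]+${opt}([[:space:]]|#|\$)" "$conf" \
            || echo "$opt"
    done
}
```

An empty result from `missing_kdb_options /usr/src/sys/i386/conf/MYKERNEL` (path purely illustrative) would mean all three options are present in that file.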
vmstat memory: avm vs fre
I'm getting a really odd condition on one of my servers (and I suspect it's happening on one of my other servers as well) ... after a period of time (3 days), the server hangs solid ...

Running vmstat in an xterm, the one thing I'm noticing is that when it hangs, my avm == 12455M and fre == 22M ... when I start the system, it looks like: avm == 246M vs fre == 197M ...

I'm suspecting that the lock up is that fre hit 0 at some point, but I'm at a loss as to why, or where to look, for this ... top in another xterm when it hangs shows it appears to have more than enough VM:

last pid: 87005; load averages: 8.57, 7.29, 4.46 up 0+17:25:13 20:45:00
1140 processes: 317 running, 774 sleeping, 10 zombie, 39 lock
CPU: 23.3% user, 0.0% nice, 11.1% system, 0.4% interrupt, 65.1% idle
Mem: 4610M Active, 440M Inact, 489M Wired, 13M Cache, 214M Buf, 9624K Free
Swap: 8192M Total, 1055M Used, 7137M Free, 12% Inuse, 564K In, 272K Out
kvm_open: cannot open /proc/90106/mem

  PID JID USERNAME THR PRI NICE   SIZE    RES STATE  C  TIME  WCPU COMMAND
30625   0 root       1  96    0   588M   166M RUN    0 14:54 0.10% /usr/local/bin/qemu-system-x86_64 -m 512M -net nic,macadd
86866  20 1200       1  96    0 60888K  1140K RUN    0  0:00 0.15% postgres: autovacuum worker process (postgres)
86844   1 root       1  96    0 15080K  1028K RUN    1  0:00 0.05% sshd: [accepted] (sshd)
45533  20 root       1  96    0 15044K   456K RUN    1  0:00 0.05% /usr/sbin/sshd
86895   0 root       1  96    0 15092K   428K RUN    0  0:00 0.05% /usr/sbin/sshd
15131  15 root       1  96    0 19692K   376K RUN    1  0:00 0.15% /usr/sbin/sshd
95911   4 www        1   4    0   106M     0K accept 0  0:01 0.00% /usr/local/sbin/httpd (httpd)

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Problem with Bridging ... and bge devices under FreeBSD 7.x?
I'm trying to run a QEMU VM on top of a FreeBSD 7.x server ... I've tried the exact same setup on my desktop, using 192.168.1.x and an fxp device, and it all works perfectly, but as soon as I do this on another machine on a public IP, I'm not getting any routing, I can't even ping it from the same machine ...

My first thought was that there was an issue with IP aliases already on the bge device, but I tried doing the following:

ifconfig bridge0 destroy
ifconfig tap0 destroy
ifconfig fxp0 -alias 192.168.1.101
ifconfig fxp0 alias 192.168.1.101 netmask 255.255.255.255
ifconfig bridge0 create
ifconfig tap0 create
ifconfig bridge0 addm fxp0 addm tap0 up

on my desktop here and then starting up the qemu image, and all worked as expected, so having an alias on the interface, before or after, doesn't make a difference ... at least with the fxp device ...

Using VNC to connect to the VM, I can look at the interface, and it says it is connected ... and the IP/Gateway are all set right for the network I'm on, netmask is set to 255.255.255.0, same as on the 'private network' ...

Please note that when I say it works on my private network / desktop, I'm using it to connect to my work computer, across the Internet, via Windows RDP, and it works flawlessly ...

Looking at /var/log/messages, you can see the bridge being set up:

Oct 27 18:53:21 io kernel: bridge0: Ethernet address: ce:44:c7:1b:47:40

as well as the tap device:

Oct 27 18:53:25 io kernel: tap0: Ethernet address: 00:bd:96:ae:67:00
Oct 27 18:53:41 io kernel: tap0: promiscuous mode enabled

and the ethernet going promiscuous:

Oct 26 20:53:56 ganymede kernel: fxp0: promiscuous mode enabled

So, all I have left is that everything is being set up okay, but there is something I'm missing here ... something with bridge-bge, maybe?

I've even tried to compare the output of 'ifconfig -a' as far as the bridge0 and tap0 devices are concerned, and other than the mac address, they look identical also ...

So, pointers to what I may be missing here? a sysctl value that I need to set for this interface?

Thanks ...

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
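To make the fxp-versus-bge comparison easier to repeat, the create/addm steps from the message above can be parameterised on the physical interface. This is only a sketch: `bridge_setup_cmds` is a hypothetical helper that prints the commands rather than running them, and the default names bridge0 and tap0 are taken from the message.

```shell
# Sketch: print the bridge/tap setup commands for a given physical
# interface, so the same sequence can be reviewed for bge0 as for fxp0.
bridge_setup_cmds() {
    phys=$1                 # physical interface, e.g. fxp0 or bge0
    bridge=${2:-bridge0}    # bridge name used in the message above
    tap=${3:-tap0}          # tap device that qemu attaches to
    cat <<EOF
ifconfig $bridge create
ifconfig $tap create
ifconfig $bridge addm $phys addm $tap up
EOF
}
```

`bridge_setup_cmds bge0` prints the three commands for inspection; piping the output to `sh` as root would apply them.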
Re: Problem with Bridging ... and bge devices under FreeBSD 7.x?
--On Tuesday, October 28, 2008 22:08:18 -0400 Michael Proto [EMAIL PROTECTED] wrote:

> On Tue, Oct 28, 2008 at 7:56 PM, Marc G. Fournier [EMAIL PROTECTED] wrote:
>> I'm trying to run a QEMU VM on top of a FreeBSD 7.x server ... I've tried the exact same setup on my desktop, using 192.168.1.x and an fxp device, and it all works perfectly, but as soon as I do this on another machine on a public IP, I'm not getting any routing, I can't even ping it from the same machine ...
>>
>> My first thought was that there was an issue with IP aliases already on the bge device, but tried doing the following:
>>
>> ifconfig bridge0 destroy
>> ifconfig tap0 destroy
>> ifconfig fxp0 -alias 192.168.1.101
>> ifconfig fxp0 alias 192.168.1.101 netmask 255.255.255.255
>> ifconfig bridge0 create
>> ifconfig tap0 create
>> ifconfig bridge0 addm fxp0 addm tap0 up
>>
>> on my desktop here and then starting up the qemu image, and all worked as expected, so having an alias on the interface, before or after, doesn't make a difference ... at least with the fxp device ...
>>
>> Using VNC to connect to the VM, I can look at the interface, and it says it is connected ... and the IP/Gateway are all set right for the network I'm on, netmask is set to 255.255.255.0, same as on the 'private network' ...
>>
>> Please note that when I say it works on my private network / desktop, I'm using it to connect to my work computer, across the Internet, via Windows RDP, and it works flawlessly ...
>>
>> Looking at /var/log/messages, you can see the bridge being setup:
>>
>> Oct 27 18:53:21 io kernel: bridge0: Ethernet address: ce:44:c7:1b:47:40
>>
>> as well as the tap device:
>>
>> Oct 27 18:53:25 io kernel: tap0: Ethernet address: 00:bd:96:ae:67:00
>> Oct 27 18:53:41 io kernel: tap0: promiscuous mode enabled
>>
>> and the ethernet going promiscuous:
>>
>> Oct 26 20:53:56 ganymede kernel: fxp0: promiscuous mode enabled
>>
>> So, all I have left is that everything is being setup okay, but there is something I'm missing here ... something with bridge-bge, maybe?
>>
>> I've even tries to compare the output of 'ifconfig -a' as far as the bridge0 and tap0 devices are concerned, and other then the mac address, they look identical also ...
>>
>> So, pointers to what I may be missing here? a sysctl value that I need to set for this interface?
>
> I'm having a little trouble understanding the setup you have. In your test case, is the IP of your VM 192.168.1.101? If so, then I don't think you want that IP aliased on the physical interface of your bridge. The VM NIC will answer for packets destined on your local segment, which the bridge would forward to the physical interface. If you assign the VM's IP to that physical interface, then your host would think that traffic is destined for itself and not pass it to the bridge.
>
> If I'm misunderstanding and the 192.168.1.101 alias (or whatever the equiv in your production setup) isn't being used by your VM then I would start looking at the ARP traffic crossing both the tap0, lo0, and physical interfaces.
>
> What does an 'ifconfig -a' look like on both systems? netstat -rn? Any packet filtering?

I always fear I'm going to send more info than I should, and generate chaos and confusion :)

On my test box, the VM is set to 192.168.1.100 ... the alias I added to fxp0 was to simulate what I have on the public server, where there is a bge0 device with n aliases attached to it ... in no case is the IP assigned to the VM actually aliased onto any interface on the network itself.

Now, to try and answer your other questions ...

netstat -nr on the 192 server shows the IP to be at:

netstat -nr | grep 168.1.100
192.168.1.100 52:54:00:12:34:56 UHLW 1 1 fxp0 1128

which is very odd, as that MAC address is not found via ifconfig -a:

ifconfig -a | grep 52

while arp -a also shows the 52:54 MAC, although MACs for the ifconfig -a are, in fact:

ifconfig -a | grep ether
ether 00:02:b3:ee:da:3e
ether 5e:d1:e6:8b:55:50
ether 00:bd:25:18:6d:00

On the server, I'm getting nothing in arp or netstat for the IP in question:

io# arp -a | grep 204.213
io# netstat -nr | grep 204.213
io#

I've even tried doing a ping *from* the VM (logged in with VNC) to see if it will broadcast itself out, and nothing ...

I'm starting QEMU on both servers with the same options as well:

qemu -m 512M -net nic -net tap winxp.img

just to confirm that I'm not doing anything different for attaching to the network ...

So, right now, all I can see as being different is bge vs fxp interfaces ... both machines are running 7.x ...

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Re: Problem with Bridging ... and bge devices under FreeBSD 7.x?
I only have one VM running on one server ...

--On Tuesday, October 28, 2008 21:14:28 -0700 Bakul Shah [EMAIL PROTECTED] wrote:

> On Wed, 29 Oct 2008 00:35:35 -0300 Marc G. Fournier [EMAIL PROTECTED] wrote:
>> netstat -nr on the 192 server shows the IP to be at:
>>
>> netstat -nr | grep 168.1.100
>> 192.168.1.100 52:54:00:12:34:56 UHLW 1 1 fxp0 1128
>>
>> which is very odd, as that MAC address is not found via ifconfig -a:
>>
>> ifconfig -a | grep 52
>>
>> while arp -a also shows the 52:54 MAC, although MACs for the ifconfig -a are, in fact:
>>
>> ifconfig -a | grep ether
>> ether 00:02:b3:ee:da:3e
>> ether 5e:d1:e6:8b:55:50
>> ether 00:bd:25:18:6d:00
>
> The setup you get with a tap device talking to qemu is this:
>
> [host]-tap0qemu---ed0-[VM]
>
> Each end has its own mac address. The VM's NIC (ed0 or rl0 or whatever) gets addresses like 52:54:00:12:34:56. The host will have an arp entry for it once the VM sends an arp packet. But tap0 will have an address assigned by the tap driver, something like 00:bd:xx:xx:xx.
>
> If you have two VMs running at the same time on two different machines and they both have identical MAC addresses, that could be part of your problem.
>
> But your network topology is still not clear. What would help is something like this:
>
> You have:
> machine A (runs VM A1).
> machine B (runs VM B1).
> machine C (runs windows).
>
> Can you ping from A to C?
> Can you ping from B to C?
> Can you ping from A to A1?
> Can you ping from B to B1?
> Can you ping from A1 to C?
> Can you ping from B1 to C?
> Can you ping from C to A1?
> Can you ping from C to B1?
>
> All of the above should work. Next you can try tcpdump on tap devices to see what is going on. If you are still stumped provide ifconfig -a output on A, B, C, A1 and B1. On windows machine you can do ipconfig/all to get at this information (IIRC).

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
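The reachability checklist quoted above lends itself to a small loop run on each machine in turn. The sketch below is an assumption-laden illustration: `check_reachable` and the overridable `PING` variable are made-up names, and the addresses passed in are whatever A1, B1 and C happen to be on the network in question.

```shell
# Sketch: ping each host once and report ok/FAIL per host.
# PING can be overridden (e.g. for a dry run); it defaults to ping.
PING=${PING:-ping}

check_reachable() {
    for host in "$@"; do
        # -c 1 sends a single echo request.
        if $PING -c 1 "$host" >/dev/null 2>&1; then
            echo "$host ok"
        else
            echo "$host FAIL"
        fi
    done
}
```

On machine A this might be run as `check_reachable <address-of-A1> <address-of-C>`. Where a check fails, `tcpdump -n -e -i tap0 arp` on the host should show whether ARP requests from the VM reach the tap device at all.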
Re: php5 and postgresql 8.2/8.3
Setting ServerName fixed it for me ... thanks for the tip ...

--On Monday, April 21, 2008 12:53:24 +0200 Claus Guttesen [EMAIL PROTECTED] wrote:

>> this problem is very old for me. it goes, at least from http://www.freebsd.org/cgi/query-pr.cgi?pr=97272
>>
>> I found a workaround: you simply should set
>>     ServerName foobar.emxample
>> in httpd.conf
>>
>> i don't know why missing ServerName causes coredump of apache in case of php+php_pgsql, but this works for me
>
> Thank you for your tip. I will try that on a test-server. Maybe some reverse dns-lookup-issue which blocks correct unloading which then leads to a core-dump?
>
> --
> regards
> Claus
>
> When lenity and cruelty play for a kingdom, the gentlest gamester is the soonest winner.
> Shakespeare

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Re: Azureus + 7-STABLE == Slow download + No Upload
--On Monday, March 31, 2008 10:31:12 +0200 Joakim Fogelberg [EMAIL PROTECTED] wrote:

> I believe I had the same problem with 7.0-prerelease + Azureus + jdk15. If I remember correct, I could only download from other Azureus clients. I had no time to even try to find out why. I simply installed deluge instead.

Yowch, this is like the difference between night-n-day ... thanks for the pointer ...

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Azureus + 7-STABLE == Slow download + No Upload
Is anyone running Azureus on 7-STABLE and getting decent performance from it? I just upgraded to 7-STABLE, installed /usr/ports/java/jdk15 (instead of diablo) so that it uses libthr (checked with ldd), and now I'm barely able to get one download going, let alone multiple, and almost nothing uploaded ...

I've added:

-Djava.net.preferIPv4Stack=true

to /usr/local/bin/azureus, but, from reading the jdk15 makefile, IPv6 is only enabled if/when you do WITH_IPV6, and I don't have that in my make.conf file, therefore this shouldn't affect anything ...

I have nothing in my /etc/libmap.conf file ...

So, is there a problem, or am I missing something?

Marc G. Fournier Hub.Org Hosting Solutions S.A. (http://www.hub.org)
Re: Nagios + 6.3-RELEASE == Hung Process
--On Wednesday, January 02, 2008 22:54:33 + Tom Judge [EMAIL PROTECTED] wrote:

> Not sure if this is related at all but out of the 3 nagios deployments we have here I have only ever seen it on one (It currently has 2 nagios threads spinning CPU time atm).
>
> The differences on that server are:
>
> * It is amd64 compared to i386

I never tried on i386, but in my case it was an amd64 system as well ... not sure if that is relevant or not ... has anyone seen this problem *with* i386?

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: Nagios + 6.3-RELEASE == Hung Process
--On Thursday, January 03, 2008 11:05:16 +1030 Jarrod Sayers [EMAIL PROTECTED] wrote:

> That's actually good to know, as you're now (unless I am mistaken) the first user to contact me about this problem on non-i386 systems. One user, plus myself, have also seen the issue under Nagios 3.x, both on i386 systems though. I also have a net-mgmt/ndoutils port in the works (less the database support for now) which also has the same issue so using broker modules doesn't seem to affect the outcome.
>
> My gut feeling is that it's not an architecture issue but more an interoperability issue between the Nagios threading code and the libpthread() threading library.

As noted in my original report, this isn't a nagios issue per se ... my first experience with this issue was with Azureus/java ... so it's a 'threading issue in general' ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Nagios + 6.3-RELEASE == Hung Process
G'day ...

Yesterday, I set up nagios to do some system monitoring ... installed the latest version from ports into a jail, so that I could easily move it around between machines as I upgrade, without losing data ... after about 30 minutes running, I get a second nagios process running (fork?) that takes up as much CPU time as is available, and just hangs there until I kill -9 it ...

Figuring that it might be a problem with the jail (trying to access something that isn't available to the process in a jail), I moved it to the physical server level ... but, again, after ~30 minutes, it's doing the same thing:

# ps aux | grep nagios
nagios 32065 73.2 0.1 10948 3516 ?? R 11:15AM 7:40.77 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg
nagios 82120 0.0 0.1 10948 3580 ?? Ss 10:47AM 0:01.18 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg

So, definitely not jail related ...

I've tried to do a 'truss -p 32065', it just hangs. And:

ktrace -f /tmp/output -p 32065

... produces nothing:

# kdump -f /tmp/output
32065 nagios PSIG SIGKILL SIG_DFL

Once I kill -9 the process, a bunch of 'check_ping' processes start up and then things go back to normal ...

My last kernel / world build on that box is: Mon Nov 12 06:43:30 AST 2007

After searching the 'Net a bit, came across this thread:

http://www.nagiosexchange.org/nagios-users.34.0.html?tx_maillisttofaq_pi1%5Bmode%5D=1tx_maillisttofaq_pi1%5BshowUid%5D=7694

That recommends modifying libmap.conf with:

[/usr/local/bin/nagios]
libpthread.so.2 libthr.so.2
libpthread.so libthr.so

This seems to fix the problem on the physical server, and I am currently testing it in the jail itself to make sure it fixes it there too ...

Should this be something that is more prominently documented somewhere? Maybe in the port itself? azureus has similar problems that are fixed with entries in libmap.conf, so it's not just a nagios issue ...

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
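Since the same libmap.conf workaround is mentioned for more than one program, here is a sketch that generates the stanza for a given binary and checks whether a file already has one. The library version numbers (`.so.2`) are the ones quoted in the message above and will differ on other releases; `libmap_stanza` and `has_libmap_stanza` are hypothetical helper names.

```shell
# Sketch: emit a libmap.conf stanza that maps libpthread to libthr
# for a single executable, using the library names quoted above.
libmap_stanza() {
    prog=$1    # full path of the executable, e.g. /usr/local/bin/nagios
    printf '[%s]\n' "$prog"
    printf 'libpthread.so.2\tlibthr.so.2\n'
    printf 'libpthread.so\tlibthr.so\n'
}

# Succeed if the given libmap.conf already has a section for the program.
has_libmap_stanza() {
    # -F: fixed string, -x: whole-line match, -q: quiet.
    grep -Fqx "[$1]" "${2:-/etc/libmap.conf}"
}
```

A cautious way to apply it would be `has_libmap_stanza /usr/local/bin/nagios || libmap_stanza /usr/local/bin/nagios >> /etc/libmap.conf`, so the stanza is not appended twice.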
gdbserver on latest -STABLE ...
Is this related to the commit that just went through to enable on arch that support it?

===> gnu/usr.bin/gdb/gdbserver (clean)
cd: can't cd to /usr/src/gnu/usr.bin/gdb/gdbserver
*** Error code 2

Stop in /usr/src/gnu/usr.bin/gdb.

Or did I catch something 'in between'?

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: gdbserver on latest -STABLE ...
Great, thank you ... bookmarked ... so, should one not report something like this if that page shows it as a failure?

--On Monday, December 03, 2007 21:57:50 -0500 Mike Tancsa [EMAIL PROTECTED] wrote:

> At 09:28 PM 12/3/2007, Marc G. Fournier wrote:
>> Is this related to the commit that just went through to enable on arch that support it?
>
> One way to check is to take a look at the status page for the tinderboxes
> http://tinderbox.des.no/
> which are constantly building world. If it's a general problem, it will show up there through a few builds.
>
> ---Mike

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: gdbserver on latest -STABLE ...
Teach me to sort threaded :( Thanks ...

--On Monday, December 03, 2007 22:34:55 -0500 Mike Tancsa [EMAIL PROTECTED] wrote:

> At 10:27 PM 12/3/2007, Marc G. Fournier wrote:
>> Great, thank you ... bookmarked ... so, should one not report something like this if that page shows it as a failure?
>
> It's automatically reported to the mailing list. eg.
> http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038791.html
> and
> http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038792.html
> and
> http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038793.html
>
> ---Mike
>
>> --On Monday, December 03, 2007 21:57:50 -0500 Mike Tancsa [EMAIL PROTECTED] wrote:
>>> At 09:28 PM 12/3/2007, Marc G. Fournier wrote:
>>>> Is this related to the commit that just went through to enable on arch that support it?
>>>
>>> One way to check is to take a look at the status page for the tinderboxes
>>> http://tinderbox.des.no/
>>> which are constantly building world. If it's a general problem, it will show up there through a few builds.
>>>
>>> ---Mike

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: 6.3 PRERELEASE
--On Friday, November 09, 2007 10:20:47 -0800 Jon Holstrom [EMAIL PROTECTED] wrote:

> I had 6.2 stable all setup had gnome 2.18 all humming along 100% java eclipse, tomcat, bah bah bah! updated src rebuilt only to find 6.2 is gone 6.3 prerelease!

What is wrong with 6.3-PRERELEASE? I had 6-STABLE all setup had kde 3.5.x hum0%, java, azureus, bah bah bah! ... upgraded to 6.3-PRERELEASE and still have 6-STABLE all setup had kde 3.5.x hum0%, java, azureus, bah bah bah! ... nothing has changed from what I can tell, just newer kernel *shrug*

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: Call for testing: patch that helps Wine on 6.x
> \ midimap
> ELF 7f346000-7f376000 Deferred libcups.so.2
> ELF 7f376000-7f3f8000 Deferred libgnutls.so.13
> ELF 7f3f8000-7f447000 Deferred libgcrypt.so.13
> ELF 7f447000-7f46 Deferred libcrypt.so.4
> ELF 7f46-7f469000 Deferred libintl.so.8
> ELF 7f469000-7f557000 Deferred libiconv.so.3
> Threads:
> process tid prio (all id:s are in hex)
> 000a
> 000c0
> 000b0
> 0008 (D) C:\Program Files\Macromedia\Dreamweaver 8\Dreamweaver.exe
> 000d0
> 00090 ==
> daemon%
>
> Any idea how to resolve this issue? Will the patch on http://bugs.winehq.org/show_bug.cgi?id=4139 help with this issue?
>
> thanks in advance,
> Ganbold
>
> --
> If it's worth doing, do it for money.

Marc G. Fournier Hub.Org Networking Services (http://www.hub.org)
Re: Call for testing: patch that helps Wine on 6.x
--On Monday, August 06, 2007 16:05:40 -0400 John Baldwin [EMAIL PROTECTED] wrote: On Friday 03 August 2007 10:56:48 pm Marc G. Fournier wrote: --On Tuesday, July 31, 2007 14:47:50 -0700 Kris Moore [EMAIL PROTECTED] wrote: I'm not sure all the tests run properly since I didn't run through them yet. I'll try it out tomorrow morning though. All I tried was FireFox for Windows and installed StarCraft. Both worked just fine here. (I did a spawn of Starcraft since the safedisc support isn't working as far as I know) 'k, I just installed the latest patches from http://wiki.freebsd.org/Wine, and everything builds fine, and I'm getting a lot further with the tests, but it's failing at the rebar test ... I've posted to [EMAIL PROTECTED] with my results on this, as it seems to be the Wine side, not FreeBSD ... John, I've been running both the signal and pfault patches on my 6.x desktops since Tijl posted them, and haven't noticed any issues resulting from them ... Does cvsup work? A similar patch broke cvsup on HEAD. I've cvsup'd several times since first applying the patch, and haven't noticed any issues ... Also, any chance of getting the thr_kill2() patch Tijl did in? I've been running both on my desktop, and haven't noticed any issues resulting from either (other than improvements to wine, of course) ... Getting both those patches in place should allow us to focus on wine itself without having to worry about the OS side of things ... - Marc G. Fournier
Re: Call for testing: patch that helps Wine on 6.x
--On Tuesday, July 31, 2007 14:47:50 -0700 Kris Moore [EMAIL PROTECTED] wrote: I'm not sure all the tests run properly since I didn't run through them yet. I'll try it out tomorrow morning though. All I tried was FireFox for Windows and installed StarCraft. Both worked just fine here. (I did a spawn of Starcraft since the safedisc support isn't working as far as I know) 'k, I just installed the latest patches from http://wiki.freebsd.org/Wine, and everything builds fine, and I'm getting a lot further with the tests, but it's failing at the rebar test ... I've posted to [EMAIL PROTECTED] with my results on this, as it seems to be the Wine side, not FreeBSD ... John, I've been running both the signal and pfault patches on my 6.x desktops since Tijl posted them, and haven't noticed any issues resulting from them ... - Marc G. Fournier
Re: Call for testing: patch that helps Wine on 6.x
--On Tuesday, July 31, 2007 12:19:23 -0700 Kris Moore [EMAIL PROTECTED] wrote: I just gave FireFox 2.0.0.6 a shot using FBSD 6-Stable and all the various patches on the Wiki page. It loaded and ran just fine on my end. as user root? or a regular user? - Marc G. Fournier
Re: Call for testing: patch that helps Wine on 6.x
--On Tuesday, July 31, 2007 14:21:28 -0700 Kris Moore [EMAIL PROTECTED] wrote: :) I learned my lesson, I ran it as regular user this time. 'k, now I'm curious ... you have all the kernel patches in place, and you can now run 'make tests' as a regular user without any problems? I just updated my kernel, so am going to work tonight on plugging in the OS patches and building a new wine here (just got back from camping, still catching up on things) ... - Marc G. Fournier
Static in left speaker (HDA Codec: Realtek ALC883)
Just got a new desktop, and so far everything has been a dream, except the sound ... did a 'snd_driver_load=YES' to load everything, and the sound system is detected as: pcm0: HDA Codec: Realtek ALC883 pcm0: HDA Driver Revision: 20070710_0047 If I 'kldunload snd_hda', the static goes away ... reload it, the static comes back again ... Found a reference to http://people.freebsd.org/~ariff/, and the lowlatency stuff, so tried that, same effect ... Tried different speakers, just in case, no change ... I do get sound out, can watch movies and such, but the sound seems to only come out the right speaker, and, well, the static is fairly annoying ... Not sure what else I can do to debug, mind you ... help? thanks ... - Marc G. Fournier
Re: SATA 300 Drive Being Run At 150
--On Saturday, July 21, 2007 11:55:39 -0500 Dan Nelson [EMAIL PROTECTED] wrote: In the last episode (Jul 21), Tim Daneliuk said: I asked this question a while back, but needed to do more digging to make sure I had latest sources etc. I have an Intel motherboard that shows this for a SATA controller: atapci1: Intel ICH7 SATA300 controller port 0x20c8-0x20cf,0x20ec-0x20ef,0x20c0-0x20c7,0x20e8-0x20eb,0x20a0-0x20af mem 0x90204000-0x902043ff irq 19 at device 31.2 on pci0 But the hard drive - a SATA 300 device - shows up like this: ad4: 238475MB WDC WD2500JS-00NCB1 10.02E02 at ata2-master SATA150 ^^^ Using dd, I have confirmed that the drive is running nowhere near SATA-III speeds, at least on reads: 968470075 bytes transferred in 7.132891 secs (135775249 bytes/sec) What was your dd commandline? If you've got more than 1GB of RAM and tested by reading a file and not the raw device itself, you just tested FreeBSD buffer cache. According to http://www.wdc.com/en/products/productspecs.asp?driveid=135 , that drive's maximum sustained speed is only 93.5 MB/sec, so it doesn't really matter if your interface is running at SATA150 or SATA300 unless you plan on reading exclusively from its 8MB buffer :) 'k, I just bought a new desktop, SATA/300MB/s interface, and this drive: http://www.wdc.com/en/products/products.asp?DriveID=254 Web site states 3Gb/s ... I'm seeing the same SATA150: atapci0: JMicron JMB361 SATA300 controller port 0xbf00-0xbf07,0xbe00-0xbe03,0xbd00-0xbd07,0xbc00-0xbc03,0xbb00-0xbb0f mem 0xfdbfe000-0xfdbf irq 16 at device 0.0 on pci2 atapci1: Intel ICH8 SATA300 controller port 0xfa00-0xfa07,0xf900-0xf903,0xf800-0xf807,0xf700-0xf703,0xf600-0xf60f,0xf500-0xf50f irq 19 at device 31.2 on pci0 atapci2: Intel ICH8 SATA300 controller port 0xf300-0xf307,0xf200-0xf203,0xf100-0xf107,0xf000-0xf003,0xef00-0xef0f,0xee00-0xee0f irq 19 at device 31.5 on pci0 ad8: 152627MB WDC WD1600AAJS-08PSA0 05.06H05 at ata4-master SATA150 Latest 6.x STABLE ... - Marc G. Fournier
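A minimal sketch of the arithmetic behind this exchange, using only the figures quoted in the thread (the dd result and the 93.5 MB/sec sustained rate from the drive's spec page). It assumes the usual 8b/10b line coding on SATA links, so a 1.5 Gb/s link carries roughly 150 MB/s of payload and a 3 Gb/s link roughly 300 MB/s. The function names are made up for illustration.

```python
def sata_payload_mb_per_s(line_rate_gbit):
    """Approximate payload bandwidth in MB/s, assuming 8b/10b coding
    (10 bits on the wire for every 8 bits of data)."""
    bits_per_second = line_rate_gbit * 1e9
    data_bits_per_second = bits_per_second * 8 / 10
    return data_bits_per_second / 8 / 1e6


def dd_rate_mb_per_s(bytes_transferred, seconds):
    """Convert a dd summary line into MB/s (1 MB = 10**6 bytes)."""
    return bytes_transferred / seconds / 1e6


if __name__ == "__main__":
    measured = dd_rate_mb_per_s(968470075, 7.132891)  # figures quoted above
    drive_sustained = 93.5                            # MB/s, from the spec page
    print("dd read rate:     %.1f MB/s" % measured)
    print("SATA150 ceiling:  %.1f MB/s" % sata_payload_mb_per_s(1.5))
    print("SATA300 ceiling:  %.1f MB/s" % sata_payload_mb_per_s(3.0))
    if measured > drive_sustained:
        print("faster than the platters can sustain, so caching is likely involved")
```

The dd figure sits above the drive's sustained rate but below the SATA150 ceiling, which is consistent with Dan's point that the negotiated link speed is unlikely to be the limiting factor for this drive.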
rwhod / ntpdate don't work ... amd64/-STABLE ...
Just looking over one of our AMD64 servers, and rwhod / syslog / ntpdate won't work on that server, although it's running the same date/version ... I checked securelevel, and they are both running the same ... rwhod generates no errors when I try to run it, and truss doesn't show anything since it does a fork: stat(/etc/nsswitch.conf,{mode=-rw-r--r-- ,inode=24857,size=113,blksize=4096}) = 0 (0x0) open(/etc/group,O_RDONLY,0666) = 3 (0x3) fstat(3,{mode=-rw-r--r-- ,inode=24775,size=441,blksize=4096}) = 0 (0x0) lseek(3,0x0,SEEK_CUR) = 0 (0x0) lseek(3,0x0,SEEK_SET) = 0 (0x0) read(3,# $FreeBSD: src/etc/group,v 1.32...,4096) = 441 (0x1b9) close(3) = 0 (0x0) sigaction(SIGHUP,{ SIG_IGN 0x0 ss_t },{ SIG_DFL SA_RESTART ss_t }) = 0 (0x0) fork() = 90418 (0x16132) exit(0x0) process exit, rval = 0 So, I'm not 100% certain what I'm looking for ... The network looks good, I can connect to the jails running on it, and, syslog runs in the jails themselves, just not the physical server ... If I try syslogd from the command line, it generates an error: # /usr/sbin/syslogd -s syslogd: child pid 90996 exited with return code 1 I'm not out of disk space on any of the file systems ... So, not sure what else I should be looking for here ... Help? - Marc G. Fournier
Re: rwhod / ntpdate don't work ... amd64/-STABLE ...
Figured it out ... not sure *how* it happened, but on my last upgrade, I must have somehow screwed up my mergemaster, and actually wiped out /etc/services ... just ran mergemaster on a whim, and the file was totally recreated, and all services now start up as expected ... --On Monday, July 16, 2007 21:48:15 -0300 Marc G. Fournier [EMAIL PROTECTED] wrote: Just looking over one of our AMD64 servers, and rwhod / syslog / ntpdate won't work on that server, although it's running the same date/version ... I checked securelevel, and they are both running the same ... rwhod generates no errors when I try to run it, and truss doesn't show anything since it does a fork: stat(/etc/nsswitch.conf,{mode=-rw-r--r-- ,inode=24857,size=113,blksize=4096}) = 0 (0x0) open(/etc/group,O_RDONLY,0666) = 3 (0x3) fstat(3,{mode=-rw-r--r-- ,inode=24775,size=441,blksize=4096}) = 0 (0x0) lseek(3,0x0,SEEK_CUR) = 0 (0x0) lseek(3,0x0,SEEK_SET) = 0 (0x0) read(3,# $FreeBSD: src/etc/group,v 1.32...,4096) = 441 (0x1b9) close(3) = 0 (0x0) sigaction(SIGHUP,{ SIG_IGN 0x0 ss_t },{ SIG_DFL SA_RESTART ss_t }) = 0 (0x0) fork() = 90418 (0x16132) exit(0x0) process exit, rval = 0 So, I'm not 100% certain what I'm looking for ... The network looks good, I can connect to the jails running on it, and, syslog runs in the jails themselves, just not the physical server ... If I try syslogd from the command line, it generates an error: # /usr/sbin/syslogd -s syslogd: child pid 90996 exited with return code 1 I'm not out of disk space on any of the file systems ... So, not sure what else I should be looking for here ... Help? - Marc G. Fournier
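Since the root cause here turned out to be an emptied /etc/services, a quick sanity check along the following lines might have shortened the hunt. This is only a sketch: the list of required entries is an assumption (rwhod, syslogd and ntpdate are believed to look up "who", "syslog" and "ntp" over UDP), and the helper names are hypothetical.

```python
# Assumed lookups; adjust to whatever daemons are failing on the host.
REQUIRED = [("syslog", "udp"), ("ntp", "udp"), ("who", "udp")]


def parse_services(text):
    """Return a set of (name, protocol) pairs, aliases included,
    from text in /etc/services format."""
    entries = set()
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # not a "name port/proto [aliases]" line
        proto = fields[1].split("/", 1)[1]
        for name in [fields[0]] + fields[2:]:
            entries.add((name, proto))
    return entries


def missing_services(text, required=REQUIRED):
    """List the required (name, protocol) pairs absent from the text."""
    have = parse_services(text)
    return [entry for entry in required if entry not in have]


if __name__ == "__main__":
    try:
        with open("/etc/services") as handle:
            contents = handle.read()
    except OSError:
        contents = ""  # a missing file is treated like an empty one
    print("missing entries:", missing_services(contents))
```

An empty result means every assumed entry is present; a wiped file reports all three as missing.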
Re: Unix domain socket leak in 6-STABLE
--On Wednesday, June 13, 2007 20:15:56 +0200 Ulrich Spoerlein [EMAIL PROTECTED] wrote: was your leak a kernel leak or a user leak (if it actually makes a difference). I don't know ... it was caused by an application, but nothing was freed up after the application was stop'd ... - Marc G. Fournier
Re: Unix domain socket leak in 6-STABLE
--On Thursday, June 14, 2007 14:03:27 -0300 Alexandre Biancalana [EMAIL PROTECTED] wrote: On 6/14/07, Marc G. Fournier [EMAIL PROTECTED] wrote: I don't know ... it was caused by an application, but nothing was freed up after the application was stop'd ... In my case the sockets are closed only if I stop the samba processes. When I just changed the connection mode from Unix Socket to TCP on nss_ldap.conf, the connections remain opened. I think this could be a problem with nss_ldap (in the way the connections are handled ?) because samba is accessing OpenLDAP directly via TCP, the access via Unix Sockets is only done by Samba through nss_ldap. I'm trying to simulate this error on another machine. I will write some scripts/program that connect to OpenLDAP socket directly and via nss_ldap and post the results. Any more hints ? Hrmm .. how about nss in general? the one VPS that I killed off was using nss-mysql for passwd/group and shadow ... it's definitely not something that is normally done here, and about the only thing I can think of that is 'unusual' about that specific VPS, in my case ... - Marc G. Fournier
Re: Unix domain socket leak in 6-STABLE
--On Wednesday, June 13, 2007 09:17:36 -0700 Jeremy Chadwick [EMAIL PROTECTED] wrote: I've seen this kind of problem with domain sockets (at least on Linux with a multi-use tool called busybox) where on error conditions the code never bothered to close the existing socket it opened, thus resulting in leaks/resource exhaustion over time. The code later got fixed, but a pretty nasty bug especially when the program is used in a lot of embedded products... In regards to FreeBSD, I remember reading some mails from Robert Watson last month in regards to UNIX domain socket code changes: http://monkey.org/freebsd/archive/freebsd-stable/200705/msg00200.html 'k, just to ring in here ... I can definitely attest to there being a leak here, as it was me that was originally burned by it ... in my case, I eventually was able to isolate which VPS/jail was causing it and haven't run it since, but was never able to determine exactly what was causing it, since there wasn't really anything unusual running in that jail :( But ... based on the discussions that were had at the time, it was my understanding that if all applications were shut down on the server (to the bare minimum), eventually the kernel GC should clean up all residual sockets ... when I did this (shut down all applications but the very bare minimum) and waited for 10+ minutes, socket usage never drop'd below about 4k sockets in use, or something like that ... Unlike Ulrich, I wasn't running LDAP at the time, so that wasn't the cause for me ... I could easily enough restart that jail if there was some more useful information I could get from it, but the thread kinda dwindled off over time, and rebooting a server every 3 days was getting a wee bit annoying to my clients :) But, if someone has something they'd like me to do to provide more info, I'm willing to do it (short of anything that requires DDB / console access ... that server is remote) ... - Marc G. Fournier
Re: fast rate of major FreeBSD releases to STABLE
--On Saturday, May 19, 2007 01:22:40 -0700 KAYVEN RIESE [EMAIL PROTECTED] wrote: On Thu, 17 May 2007, [EMAIL PROTECTED] (Mark Linimon) wrote: On Thu, May 17, 2007 at 01:35:10PM -0500, Craig Boston wrote: The alternative would have been to commit what we had and _then_ found out all the bugs in the upgrade process (note: you won't be able to just blindly use portupgrade -af; you will need to read the UPDATING file for the proper procedure. This is the unusual case of being such a sweeping change that the port management tools are not completely up to the task.) okay could this freeze be an explanation for the fact that my x is totally hosed? i know any random joe can't necessarily answer that.. but assuming it is true.. Not sure how ... since the freeze started, I haven't seen any commits to the X system go through, or anything else for that matter (sorry, except for one port that I can't recall its name) ... I know in my case, I'm looking forward to the freeze being lifted since there was a recent release of new versions of PHP ... :) how long is this freeze going to last then? Not 100% certain, but Kris just posted a note about the X stuff being committed, so I'm guessing RSN ... - Marc G. Fournier
Re: fast rate of major FreeBSD releases to STABLE
--On Saturday, May 19, 2007 22:01:26 +0100 Chris [EMAIL PROTECTED] wrote: With the ports freeze I wonder in situations when a full freeze is needed it is better to do so on a separate testing branch so it allows security commits etc. to carry on as normal and then remerge again after testing is complete. Or is this simply not possible to do? IMHO, not impossible, but creates a lot more work than the disruption of a couple of weeks without commits would justify ... you have to bear in mind, once the freeze is lifted, all of the ports that had been modified on the 'branch' would then need to be re-modified on the regular branch, putting a lot of work onto the shoulders of the maintainers themselves ... - Marc G. Fournier
Re: UNIX domain sockets MFC's
--On Tuesday, May 15, 2007 19:09:22 +0200 Oliver Fromme [EMAIL PROTECTED] wrote: If there isn't, then start the jails one after another (not all at once) and keep checking. Maybe it's just one specific jail (or a few of them) which triggers the problem. With that procedure it should be possible to find it (or them). 'k, there is definitely a leak in here somewhere, since if I shut down all processes on the machine, the garbage collector should clean up the sockets, which isn't happening ... ... that said, after that last round with the 1200 find processes running, I shut down the VPS that they were running in, and my socket usage has stayed around 2800, so something in that VPS looks to be 'the cause' ... ... my next step is going to be to restart that specific VPS and see if they start to climb again, but, again, even after shutting down all the processes, those sockets are not being released, so there is a problem somewhere that that one VPS is triggering ... - Marc G. Fournier
Re: Socket leak (Was: Re: What triggers No Buffer Space) ?Available?
It didn't, it kept climbing ... --On Tuesday, May 15, 2007 21:39:35 +0200 Ulrich Spoerlein [EMAIL PROTECTED] wrote: I'm slowly catching up on FreeBSD related mails and found this mail ... Marc G. Fournier wrote: kern.ipc.numopensockets: 7400 kern.ipc.maxsockets: 12328 ps looks like: stuff deleted 2368 p2 Is+ Sat01PM 0:00.03 /bin/tcsh root 2112 0.0 0.1 5220 2360 p3 Ss+ Sat01PM 0:00.04 /bin/tcsh root 91221 0.0 0.1 5140 2440 p4 Ss+ 11:49PM 0:00.12 -tcsh (tcsh) I don't think those processes should consume 7400 sockets. Indeed, this really looks like a leak in the kernel. Robert has sent me a suggestion to try that I'm in the process of putting together right now, involving backing out some work on uipc_usrreq.c ... How did the backing out work for you? Ulrich Spoerlein -- The trouble with the dictionary is you have to know how the word is spelled before you can look it up to see how it is spelled. -- Will Cuppy - Marc G. Fournier
Re: UNIX domain sockets MFC's
--On Monday, May 14, 2007 11:29:12 +0100 Robert Watson [EMAIL PROTECTED] wrote: On Sat, 12 May 2007, Marc G. Fournier wrote: The fix for this has now been merged as 1.155.2.22. As there have been no new reports of UNIX domain socket problems in the last couple of days, it sounds like the MFC of the last batch of fixes and cleanups has not led to problems. I've just upgraded my kernel to the latest, to include the MFC'd code above ... Yes -- I was very specific in my e-mail regarding the MFC's that they were not believed to address the problem you are reporting. I think we have a leak in the way some edge case is handled with regard to UNIX domain socket shutdown. What would be really nice to know is if that persists in 7-CURRENT, in which we've redone the way the socket life cycle works. However, I don't know if you are able to tolerate booting a 7-CURRENT kernel in your environment...? On that server, that could be very difficult ... if this was happening on any of my HP servers, I would in a minute ... Did we determine whether backing out to before the unpcb socket reference count change made any difference for you? The problem appeared to persist after backing it out ... I'm curious about something ... way back, when I was using unionfs, I had a major problem with vnode leakage ... as I mentioned before, this server is the only one I have that uses geom/gmirror on its drives, the rest all use hardware RAID ... is there *any* possibility that I'm seeing some sort of interaction issue? It really bothers me that the only server that I'm seeing this on is the one that I'm using software RAID on ... Would it be useful to add some DEBUG statements to the socket code, to trace open/close/flush/etc? Maybe to see where flush's are being started, but never completed? That sort of thing? - Marc G. Fournier
Re: UNIX domain sockets MFC's
--On Monday, May 14, 2007 16:03:13 +0200 Oliver Fromme [EMAIL PROTECTED] wrote: FWIW, I have two servers running RELENG_6 (2 months old) using gmirror and with a few jails (not many, though ... they're used for Apache web servers and PostgreSQL). I'm not seeing any socket leakage. $ sysctl kern.ipc | grep sockets kern.ipc.numopensockets: 118 kern.ipc.maxsockets: 12328 $ uptime 3:55PM up 82 days, 20:39, 3 users, load averages: 0.04, 0.05, 0.02 $ gmirror status Name Status Components mirror/gm0 COMPLETE ad0 ad1 If you have more hints how to reproduce the problem, I might give it a try if it's not too much trouble. That's the fun part ... I can't seem to re-create it anywhere except that one server :( And it doesn't seem to matter how many jail(s) I have on it ... I just dump'd 25 jails off of it and onto another server, and it's still rising ... - Marc G. Fournier
Re: Known memory leak in 6-STABLE from April 1st?
--On Monday, May 14, 2007 19:07:24 +0200 Ulrich Spoerlein [EMAIL PROTECTED] wrote: Hi all, I observed something funny with our new cyrus/postfix/amavis installations running on 6.2-STABLE checked out on April 1st (no, I'm not joking). They are running symon to grab performance data and I saw the memory total becoming less and less. Now I know that adding up free+active+inactive != total ram BUT *all* other FreeBSD machines we are running show a more or less constant sum. I uploaded two pictures showing the trend here (They are i386 machines with 4GB RAM, FreeBSD reports 3.3GB as usable): http://coyote.dnsalias.net/ms1-day.png http://coyote.dnsalias.net/ms1-week.png Now after doing some heavy IMAP testing (cyrus reconstruct of big maildirs) the system froze to a complete halt. Stupid me already rebooted the machine, tomorrow I'll try to break into DDB when it happens again. I also started recording top(1) memory output and sysctl vm.zone output. The main question is: Were there any known memory leaks at the start of April? Any patches I should blindly try before spending several days on debugging this? Hrmmm ... long shot here, but what does: sysctl kern.ipc.numopensockets show over that period of time ... just wondering if we are somehow related on problems here, just different symptoms ... - Marc G. Fournier
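For watching kern.ipc.numopensockets "over that period of time", something like the sketch below could turn periodic readings into a growth rate and a rough time-to-exhaustion estimate. The readings in the example are invented for illustration and are not from any machine in this thread; only the limit of 12328 matches the kern.ipc.maxsockets value quoted elsewhere in it. The helper names are hypothetical.

```python
def growth_per_hour(samples):
    """Average change in socket count per hour between the first and
    last sample. Each sample is (unix_time_seconds, open_socket_count)."""
    (t_first, n_first), (t_last, n_last) = samples[0], samples[-1]
    if t_last <= t_first:
        raise ValueError("samples must span a positive time interval")
    return (n_last - n_first) / ((t_last - t_first) / 3600.0)


def hours_until_limit(samples, limit):
    """Hours until the count reaches `limit` at the current average rate,
    or None if the count is not growing."""
    rate = growth_per_hour(samples)
    if rate <= 0:
        return None
    return (limit - samples[-1][1]) / rate


if __name__ == "__main__":
    # Hypothetical hourly readings, purely for illustration.
    readings = [(0, 4000), (3600, 4100), (7200, 4200)]
    print("growth per hour:", growth_per_hour(readings))
    print("hours until limit:", hours_until_limit(readings, 12328))
```

A steadily positive rate on an otherwise idle box would point the same way as the socket numbers reported in the UNIX domain socket threads.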
Does a pipe take a socket ... ?
For those that remember the other day, I had that swzone issue, where I ran out of swap space? I just about hit it again today, swap was up to 99% used ... I was able to get a ps listing in, and there were a whack of find processes running ... Now, I think I know which VPS they were running in, so that isn't a problem ... and I suspect that the find was just part of a longer pipe ... I'm just curious if those pipes would happen to use up any of those sockets that are 'evaporating', or is this totally unrelated to sockets? Thanks ... - Marc G. Fournier
Re: UNIX domain sockets MFC's
On Friday, May 11, 2007 12:49:32 +0100 Robert Watson [EMAIL PROTECTED] wrote:

> On Tue, 8 May 2007, Robert Watson wrote:
>
>> Right now I am tracking two known issues with UNIX domain sockets in RELENG_6:
>>
>> - Reported NULL pointer dereference in unp_connect(), which occurs due to the dropping of locks around sonewconn(). This is fixed in HEAD, and I am preparing an MFC of this patch.
>
> The fix for this has now been merged as 1.155.2.22. As there have been no new reports of UNIX domain socket problems in the last couple of days, it sounds like the MFC of the last batch of fixes and cleanups has not led to problems.

I've just upgraded my kernel to the latest, to include the MFC'd code above ...

Just before rebooting, as I've done the past couple of times, I shut down everything on the server, so that there were minimal processes running ... based on the last one, and this one, it looks like the number of active open sockets is ~4000 ... last time, I was up to 11k sockets open, and it dropped to ~7000 once all jails were shut down, but, as reported to Robert/John, there was a java process in a soclose state, so I wasn't 100% certain there ...

This time through, I started at about 8800 sockets open, and shut down all processes, including all java processes ... using ps auxlw, I checked for any processes in a soclose state, and there were none ... I waited a full 10 minutes to let things 'settle', and after 7 of those, it had dropped down to:

    mars# uptime ; sysctl kern.ipc | grep sock
    2:18PM up 1 day, 13:26, 5 users, load averages: 0.00, 0.47, 2.57
    kern.ipc.maxsockbuf: 262144
    kern.ipc.sockbuf_waste_factor: 8
    kern.ipc.numopensockets: 4835
    kern.ipc.maxsockets: 12328

And it stuck there for the remaining 3 minutes before I rebooted ... roughly 4000 sockets were released (8800 down to 4835), which is what leads me to believe that there are about 4000 active sockets on this server when everything is running ...

--
Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]   MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.org   ICQ . 7615664
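As a rough illustration of how the two sysctl values quoted above relate, here is a minimal sketch that turns an open-socket count and the configured maximum into a percentage. The function name socket_usage is made up for this example; the numbers are the ones from the message.

```shell
#!/bin/sh
# socket_usage OPEN MAX
# Print how full the socket table is, as a whole percent (integer division).
# The function name is hypothetical; it is not part of any FreeBSD tool.
socket_usage() {
    echo "$(( $1 * 100 / $2 ))%"
}

# Values reported after everything was shut down: 4835 open out of 12328.
socket_usage 4835 12328
```

On the host itself the arguments could presumably be filled in with `sysctl -n kern.ipc.numopensockets` and `sysctl -n kern.ipc.maxsockets`.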
Re: UNIX domain sockets MFC's
On Friday, May 11, 2007 12:49:32 +0100 Robert Watson [EMAIL PROTECTED] wrote:

> On Tue, 8 May 2007, Robert Watson wrote:
>
>> Right now I am tracking two known issues with UNIX domain sockets in RELENG_6:
>>
>> - Reported NULL pointer dereference in unp_connect(), which occurs due to the dropping of locks around sonewconn(). This is fixed in HEAD, and I am preparing an MFC of this patch.
>
> The fix for this has now been merged as 1.155.2.22. As there have been no new reports of UNIX domain socket problems in the last couple of days, it sounds like the MFC of the last batch of fixes and cleanups has not led to problems.

I will work on upgrading that system right now to the latest -STABLE and let you know ... did you happen to receive my email concerning that java process in a soclose state?

--
Marc G. Fournier
Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?
On Tuesday, May 08, 2007 15:14:29 +0200 Oliver Fromme [EMAIL PROTECTED] wrote:

> What kind of jails are those? What applications are running inside them? It's quite possible that the processes on one machine use 120 sockets per jail, while on a different machine they use only half that many per jail, on average. Of course, I can't tell for sure without knowing what is running in those jails.

They all run pretty much the same thing, on all the machines ... by default, standard syslog, sshd, cron, cyrus imapd, postfix and apache ... some run aolserver on top of that, or jdk/tomcat, or zope ... but they aren't specific to the server itself, as they get moved around ...

>> kern.ipc.numopensockets: 7400
>> kern.ipc.maxsockets: 12328
>>
>> ps looks like:
>>
>>     [stuff deleted]
>>     2368 p2 Is+ Sat01PM 0:00.03 /bin/tcsh
>>     root 2112 0.0 0.1 5220 2360 p3 Ss+ Sat01PM 0:00.04 /bin/tcsh
>>     root 91221 0.0 0.1 5140 2440 p4 Ss+ 11:49PM 0:00.12 -tcsh (tcsh)
>
> I don't think those processes should consume 7400 sockets. Indeed, this really looks like a leak in the kernel.

Robert has sent me a suggestion to try, which I'm in the process of putting together right now, involving backing out some work on uipc_usrreq.c ...

> Maybe sockstat -u and/or fstat | grep -w local (both of those commands should basically list the same kind of information). My guess is that the output will be rather short, i.e. much shorter than 7355 lines. If that's true, it is another indication that the problem is caused by a kernel leak.

At the time I rebooted, with no processes, but 7400 sockets:

    # wc -l sockstat.out.txt
    12 sockstat.out.txt
    # grep local fstat.out.txt | wc -l
    7

--
Marc G. Fournier
Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?
    :00.03 /bin/tcsh
    root 2112 0.0 0.1 5220 2360 p3 Ss+ Sat01PM 0:00.04 /bin/tcsh
    root 91221 0.0 0.1 5140 2440 p4 Ss+ 11:49PM 0:00.12 -tcsh (tcsh)

And netstat -n -funix shows 7355 lines similar to:

    d05f1000 stream 0 0 0 d05f1090 0 0
    d05f1090 stream 0 0 0 d05f1000 0 0
    cf1be000 stream 0 0 0 cf1bdea0 0 0
    cf1bdea0 stream 0 0 0 cf1be000 0 0
    cec42bd0 stream 0 0 0 cf2ac480 0 0
    cf2ac480 stream 0 0 0 cec42bd0 0 0

with the final few associated with running processes:

    c95ad000 stream 0 0 c95aa000 0 0 0 /var/run/devd.pipe
    c95aca20 dgram  0 0 0 c95ace10 0 0
    c95accf0 dgram  0 0 c95c7110 0 0 0 /var/named/var/run/log
    c95acd80 dgram  0 0 c95c7330 0 0 0 /var/run/log
    c95ace10 dgram  0 0 c95c7440 0 c95aca20 0 /var/run/logpriv
    c95acea0 dgram  0 0 c95c7550 0 0 0 /var/run/log

So, over 7000 sockets with pretty much all processes shut down ... Shouldn't the garbage collector be kicking in somewhere here?

I'm willing to shut everything down like this again the next time it happens (in 2-3 days) if someone has some other command whose output they'd like me to provide.

And, I have the following outputs as of the above, where everything is shut down and it's running on minimal processes:

    # ls -lt
    total 532
    -rw-r--r-- 1 root wheel  11142 May 8 00:20 fstat.out
    -rw-r--r-- 1 root wheel    742 May 8 00:20 netstat_m.out
    -rw-r--r-- 1 root wheel 486047 May 8 00:20 netstat_na.out
    -rw-r--r-- 1 root wheel    735 May 8 00:20 sockstat.out
    -rw-r--r-- 1 root wheel   6266 May 8 00:20 vmstat_m.out
    -rw-r--r-- 1 root wheel   5376 May 8 00:20 vmstat_z.out
    -rw-r--r-- 1 root wheel   4910 May 8 00:20 ps.out

--
Marc G. Fournier
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 13:11:35 -0700 Matthew Dillon [EMAIL PROTECTED] wrote:

> We'll have a better idea as to what is going on when you get the message again. You might even want to do a once-a-10-minutes cron job to append pstat -s, vmstat -m, and vmstat -z to a file.

'k, I have the following running out of cron every 10 minutes ... anything else that might be useful? This combines the information Robert got me to send him, as well as adding pstat -s and ps aux ...

    #!/bin/sh
    # Snapshot process, socket and memory statistics into a timestamped directory.
    DATE=`date +%Y%m%d%H%M`
    DIR=/vm/watch/${DATE}
    mkdir ${DIR}
    ps aux      > ${DIR}/ps.out
    sockstat    > ${DIR}/sockstat.out
    netstat -na > ${DIR}/netstat_na.out
    fstat       > ${DIR}/fstat.out
    vmstat -z   > ${DIR}/vmstat_z.out
    vmstat -m   > ${DIR}/vmstat_m.out
    netstat -m  > ${DIR}/netstat_m.out
    pstat -s    > ${DIR}/pstat_s.out

> -Matt
> Matthew Dillon [EMAIL PROTECTED]

--
Marc G. Fournier
swap zone exhausted, increase kern.maxswzone
What exactly does that one mean? I've searched Google, and all I'm finding is a pointer to swap_pager.c, but nothing else ...

What does that one mean? What would cause that sort of error?

--
Marc G. Fournier
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 12:06:55 -0400 Kris Kennaway [EMAIL PROTECTED] wrote:

> On Sat, May 05, 2007 at 12:38:39PM -0300, Marc G. Fournier wrote:
>
>> What exactly does that one mean? I've searched Google, and all I'm finding is a pointer to swap_pager.c, but nothing else ...
>>
>> What does that one mean? What would cause that sort of error?
>
> You need to increase the kern.maxswzone tunable to enable more space for active swap.

Apparently that doesn't exist on 6-STABLE, although it's generating the error?

    # sysctl kern.maxswzone
    sysctl: unknown oid 'kern.maxswzone'

--
Marc G. Fournier
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 20:35:46 +0400 pluknet [EMAIL PROTECTED] wrote:

> Hello,
>
> On 05/05/07, Marc G. Fournier [EMAIL PROTECTED] wrote:
>
>>     # sysctl kern.maxswzone
>>     sysctl: unknown oid 'kern.maxswzone'
>
> It is a /boot/loader.conf variable, not in sysctl MIB.

Hrmmm ... then how do I know what to increase it to, if I don't know what it is currently set to? :(

I thought all the /boot/loader.conf variables were viewable read-only via sysctl ... ? Kinda like nmbclusters:

    # sysctl -a | grep nmbcl
    kern.ipc.nmbclusters: 25600

I can't set it via sysctl, it has to be in /boot/loader.conf ... but I can at least view its value ...

--
Marc G. Fournier
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 19:12:09 +0200 Martin Hudec [EMAIL PROTECTED] wrote:

> Marc G. Fournier wrote:
>
>> Apparently that doesn't exist on 6-STABLE, although it's generating the error?
>>
>>     # sysctl kern.maxswzone
>>     sysctl: unknown oid 'kern.maxswzone'
>
> As in /usr/src/sys/conf/NOTES:
>
>     ...
>     # 2. In /boot/loader.conf, set the tunables kern.maxswzone,
>     #    kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz,
>     #    kern.dflssiz, kern.maxssiz and kern.sgrowsiz.
>     ...
>
> As in man loader:
>
>     kern.maxswzone
>         Limits the amount of KVM to be used to hold swap meta
>         information, which directly governs the maximum amount of
>         swap the system can support. This value is specified in
>         bytes of KVA space and defaults to around 70MBytes. Care
>         should be taken to not reduce this value such that the
>         actual amount of configured swap exceeds 1/2 the
>         kernel-supported swap. The default 70MB allows the kernel
>         to support a maximum of (approximately) 14GB of configured
>         swap. Only mess around with this parameter if you need to
>         greatly extend the KVM reservation for other resources such
>         as the buffer cache or NMBCLUSTERS. Modifies
>         VM_SWZONE_SIZE_MAX.
>
> Also check the -hackers mailing list for the following and the replies:
>
> http://lists.freebsd.org/pipermail/freebsd-hackers/2007-January/019217.html

Sweet, that definitely helps ... from John's response in that email, it sounds like this may be related to the socket issue that I've already reported, since it all seems to revolve around the KVA ... I wonder if the socket issue is 'pushing into' the swap stuff (i.e. this is a result of the problem, not the cause) ...

But, based on the 'default 70MB == 14G of configured swap' above ... I only have 8G of swap on that machine, which really makes it sound like this is an overflow from the other problem :(

--
Marc G. Fournier
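The '70MB == 14G' comparison above can be worked through as simple proportion. This is only a sketch: it assumes the swap zone requirement scales linearly with configured swap, which the man page excerpt does not actually promise, and it ignores the "1/2 the kernel-supported swap" caution quoted there. The function name swzone_mb_for is made up for the example.

```shell
#!/bin/sh
# swzone_mb_for SWAP_GB
# Scale the man page figure (about 70MB of KVA covers about 14GB of swap)
# linearly to SWAP_GB of configured swap. Linear scaling is an assumption.
swzone_mb_for() {
    awk -v swap_gb="$1" 'BEGIN { printf "%d\n", swap_gb * 70 / 14 }'
}

# 8GB of configured swap, as on the machine discussed above.
swzone_mb_for 8
```

Under that assumption 8GB of swap would need around 40MB, comfortably inside the 70MB default, which is consistent with the suspicion that the swap zone message is a symptom of something else.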
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 10:49:29 -0700 Matthew Dillon [EMAIL PROTECTED] wrote:

> The swblock structures only apply to actively swapped out data. Mark, how much data is actually swapped out (pstat -s) at the time the problem is reported? If you can dump UMA memory statistics that would be beneficial as well.
>
> I just find it hard to imagine that any system would actually be using that much swap, but hey! :-)

That's why I think that the socket issue and this one are related ... with everything started up (93 jails), my swap usage right now is:

    mars# pstat -s
    Device          1K-blocks     Used    Avail Capacity
    /dev/da0s1b       8388608       20  8388588     0%

It's only been up 2.5 hours so far, but still, everything is started up ...

--
Marc G. Fournier
Re: swap zone exhausted, increase kern.maxswzone
On Saturday, May 05, 2007 13:11:35 -0700 Matthew Dillon [EMAIL PROTECTED] wrote:

> We'll have a better idea as to what is going on when you get the message again. You might even want to do a once-a-10-minutes cron job to append pstat -s, vmstat -m, and vmstat -z to a file.

'k, that I can do :) Thanks ...

--
Marc G. Fournier
Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?
On Friday, May 04, 2007 12:05:11 +0100 Robert Watson [EMAIL PROTECTED] wrote:

> I think we should be careful to avoid prematurely drawing conclusions about the source of the problem. First question: have you confirmed that the resource limit on sockets is definitely what is causing the error you're seeing? I.e., does the number of sockets hit the maximum sockets?

'k, so, based on your other email this morning, about sockstat | stream, I'm now keeping an eye on:

    # uptime ; netstat -nA | grep -c stream ; sockstat -u | grep -c stream ; sysctl kern.ipc.numopensockets ; sysctl kern.ipc.maxsockets
    8:59AM up 1 day, 9:57, 7 users, load averages: 1.63, 4.92, 5.12
    6877
    2323
    kern.ipc.numopensockets: 8463
    kern.ipc.maxsockets: 12328

I'm at least 24 hours out from the error(s) starting to happen ...

> Second point: there are two kinds of resource leaks that seem likely candidates for a socket resource exhaustion problem. First, kernel bugs, in which the kernel maintains objects despite there being no application references, and second, application reference leaks, in which applications keep references to kernel objects despite no longer needing them. Our immediate goal is to determine which of these is the case: is it a kernel bug, or an application bug? Using tools like netstat and sockstat, we can try and determine if all kernel sockets are properly referenced. Experience suggests that it is an application bug, but we shouldn't rule out a kernel bug; the good news is that the tools to use in the debugging process are identical at this stage.

'k, in preparation for it starting, so that I can reboot as quickly as possible, but get max information ... do I just want to save the output of 'sockstat -u' and 'netstat -nA', or is there something else that will be useful?

--
Marc G. Fournier
Re: What triggers No Buffer Space Available?
On Thursday, May 03, 2007 11:17:56 -0400 Chuck Swiger [EMAIL PROTECTED] wrote:

> The ones you're showing are from Postfix. It would be interesting to sort them by frequency and see what the majority of the use is from. If you sort the data by the conn field, do the ones without an address all hit the same thing? If you grep for that in the first field, I found a lot that are talking to /var/run/logpriv (ie, a socketpair() to syslogd, presumably).

Okay, assuming that I'm doing this right, here's what I have.

Last night, before I went to bed:

    mars# netstat -A | grep stream | wc -l ; sockstat -u | wc -l
    2705
    2981

Today, 5 minutes ago:

    # netstat -A | grep stream | wc -l ; sockstat -u | wc -l
    4397
    2961

Looking at the Conn field from netstat -A:

    mars# awk '{print $6}' /tmp/output | sort | uniq -c | sort -nr | head -5
    2125 0
       1 d14dbe10
       1 d14dbbd0
       1 d14dbb40
       1 d14dba20

So, 2125 sockets not connected to anything?

--
Marc G. Fournier
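The Conn-field check above can be narrowed to stream sockets only. A minimal sketch, assuming the saved netstat -A output uses the column order shown in the thread (Address, Type, Recv-Q, Send-Q, Inode, Conn, Refs, Nextref, Addr) and that an unconnected socket shows a literal 0 in the Conn column; the function name count_unconnected is made up.

```shell
#!/bin/sh
# count_unconnected
# Read saved `netstat -A` unix-domain lines on stdin and count stream
# sockets whose Conn column (field 6) is 0, i.e. sockets with no peer.
# The column position is an assumption taken from the output in the thread.
count_unconnected() {
    awk '$2 == "stream" && $6 == "0" { n++ } END { print n + 0 }'
}
```

Usage would be along the lines of `count_unconnected < /tmp/output`, where /tmp/output is the saved netstat output mentioned above.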
Re: What triggers No Buffer Space Available?
On Thursday, May 03, 2007 19:28:56 +0100 Robert Watson [EMAIL PROTECTED] wrote:

> I generally recommend using a combination of netstat and sockstat. Sockets represent, loosely, IPC endpoints. There are actually two layers associated with each socket -- the IPC object (socket) and the protocol control block (PCB). Both are resource limited to prevent run-away processes from swamping the system, so exhaustion of either can lead to ENOBUFS.
>
> The behaviors of netstat and sockstat are quite different, even though the output is similar: netstat walks the protocol-layer connection lists and prints information about them. sockstat walks the process file descriptor table and prints information on reachable sockets. As sockets can exist without PCBs, and PCBs can exist without sockets, you need to look at both to get a full picture. This can occur if a process exits, closes the socket, and the connection remains in, for example, the TIME_WAIT state.
>
> There are some other differences -- the same socket can appear more than once in sockstat output, as more than one process can reference the same socket. Sockets can also exist without any referencing process (if the application closes, but there is still data draining on an open socket).
>
> I would suggest starting with sockstat, as that will allow you to link socket use to applications, and provide a fairly neat summary. When using netstat, use netstat -na, which will list all sockets and avoid name lookups.

'k, all I'm looking at right now is the Unix domain sockets, and the gap between the netstat and sockstat counts has been growing since I first started counting both ...

This was shortly after reboot:

    mars# netstat -A | grep stream | wc -l ; sockstat -u | wc -l
    2705
    2981

From your explanation above, I'm guessing that the higher sockstat #s are where you were talking about one socket being used by multiple processes?

But, right now:

    mars# netstat -nA | grep stream | wc -l ; sockstat -u | wc -l
    5025
    2905

sockstat -u #s are *down*, but netstat -nA is almost double ... Again, based on what you state above:

> Sockets can also exist without any referencing process (if the application closes, but there is still data draining on an open socket).

Now, looking at another 6-STABLE server, but one that has been running for 2 months now, I'm seeing numbers more consistent with what mars looks like shortly after all the jails start up:

    venus# netstat -nA | grep stream | wc -l ; sockstat -u | wc -l
    2126
    2209

So, if those sockets on mars are 'still draining on an open socket', is there some way of finding out where? If I'm understanding what you've said above, these 'draining sockets' don't have any processes associated with them anymore? So, it's not like I can just kill off a process, correct?

--
Marc G. Fournier
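To make the three samples above easier to compare, here is a small sketch that prints the difference between the netstat stream count and the sockstat -u count for each one. The labels and the function name socket_gap are invented for the example; the numbers are the ones quoted in the message. Since sockstat can list one socket several times, the gap is only a rough indicator, not an exact count of unreferenced sockets.

```shell
#!/bin/sh
# socket_gap
# Read "label netstat_count sockstat_count" lines on stdin and print
# "label gap", where gap = netstat_count - sockstat_count.
socket_gap() {
    awk '{ printf "%s %d\n", $1, $2 - $3 }'
}

printf '%s\n' \
    'mars-after-reboot 2705 2981' \
    'mars-now 5025 2905' \
    'venus 2126 2209' | socket_gap
```

Only the 'mars-now' sample shows netstat well ahead of sockstat; the other two have sockstat slightly higher.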
Socket leak (Was: Re: What triggers No Buffer Space) Available?
I'm trying to probe this as well as I can, but network stacks and sockets have never been my strong suit ...

Robert had mentioned in one of his emails:

> Sockets can also exist without any referencing process (if the application closes, but there is still data draining on an open socket).

Now, that makes sense to me, I can understand that ... but, how would that look as far as netstat -nA shows? Or, would it? For example, I have:

    mars# netstat -nA | grep c9655a20
    c9655a20 stream 0 0 0 c95d63f0 0 0
    c95d63f0 stream 0 0 0 c9655a20 0 0
    mars# netstat -nA | grep c95d63f0
    c9655a20 stream 0 0 0 c95d63f0 0 0
    c95d63f0 stream 0 0 0 c9655a20 0 0

They are attached to each other, but there appears to be no 'referencing process' ... it is now 10pm at night ... I saved a 'snapshot' of netstat -nA output at 6:45pm, over 3 hours ago, and it has the same entries as above:

    c9655a20 stream 0 0 0 c95d63f0 0 0
    c95d63f0 stream 0 0 0 c9655a20 0 0

Again, if I'm reading this right, there is no 'referencing process' ... first, of course, am I reading this right? Second ... if I am reading this right, and if I am understanding what Robert was saying about 'draining' (a lot of ifs, I know) ... isn't it odd for it to take 3 hours to drain?

Again, if I'm reading / understanding things right, without the 'referencing process', it won't show up in sockstat -u, which is why my netstat -nA numbers keep growing, but sockstat -u numbers don't ... which also means that there is no way to figure out what process / program is leaving 'dangling sockets'? :(

--
Marc G. Fournier
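The snapshot comparison described above can be done mechanically rather than by grepping for one address at a time. A minimal sketch, assuming two saved netstat -nA outputs and that field 1 is the socket address and field 2 the type, as in the lines shown in the thread; the function name persistent_sockets and the temporary file names are made up.

```shell
#!/bin/sh
# persistent_sockets OLD NEW
# Print the addresses (field 1) of stream sockets that appear in both
# saved `netstat -nA` snapshots, i.e. sockets that survived the interval.
persistent_sockets() {
    old_sorted=$(mktemp /tmp/sockets_old.XXXXXX) || return 1
    new_sorted=$(mktemp /tmp/sockets_new.XXXXXX) || return 1
    awk '$2 == "stream" { print $1 }' "$1" | sort -u > "$old_sorted"
    awk '$2 == "stream" { print $1 }' "$2" | sort -u > "$new_sorted"
    # comm -12 keeps only the lines common to both sorted lists.
    comm -12 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Counting the result with `wc -l` over snapshots taken hours apart would give a rough idea of how many sockets are lingering, though a long-lived legitimate connection would show up here too.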
Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?
On Thursday, May 03, 2007 18:26:30 -0700 Matthew Dillon [EMAIL PROTECTED] wrote:

> One thing you can do is drop into single user mode... kill all the processes on the system, and see if the sockets are recovered. That will give you a good idea as to whether it is a real leak or whether some process is directly or indirectly (by not draining a unix domain socket on which other sockets are being transferred) holding onto the socket.

*groan* why couldn't this be happening on a server that I have better remote access to? :(

But, based on your explanation(s) above ... if I kill off all of the jail(s) on the machine, so that there are minimal processes running, shouldn't I see a significant drop in the number of sockets in use as well? Or is there something special about single user mode vs just killing off all 'extra processes'?

--
Marc G. Fournier
Re: What triggers No Buffer Space Available?
On Wednesday, May 02, 2007 11:00:17 +0800 Adrian Chadd [EMAIL PROTECTED] wrote:

> It doesn't panic when it happens, no?

Nope ... I can log in via ssh (sometimes it takes a try or two, but I can always log in) and then do a 'reboot', and all is well again for another 72 hours or so ...

> I'd check the number of sockets you've currently got open at that point.

i.e.:

    # netstat | egrep 'tcp4|udp4' | awk '{print $1}' | uniq -c
     171 tcp4
     103 udp4

or is there a better command I should be using?

> Some applications might be holding open a whole load of sockets and their buffers stay allocated until they're closed. If they don't handle/don't get told about the error then they'll just hold open the mbufs.

Is there any way of determining which apps are holding open which sockets? i.e. an lsof for open files?

--
Marc G. Fournier
Re: What triggers No Buffer Space Available?
On Wednesday, May 02, 2007 11:17:02 -0700 John-Mark Gurney [EMAIL PROTECTED] wrote:

> netstat -A will list the socket address, fstat will list the fd, and what socket it connected to that fd..

Oh wow ... according to this, I have:

    mars# wc -l /tmp/output
    11238 /tmp/output

(minus some header lines) sockets running right now ... okay, next question ... under 'Active UNIX domain sockets', I see a lot that have no Addr:

    Active UNIX domain sockets
    Address  Type   Recv-Q Send-Q Inode Conn     Refs Nextref Addr
    d06b7480 stream      0      0     0 c969b240    0       0 private/proxymap
    c969b240 stream      0      0     0 d06b7480    0       0
    ce6fc870 stream      0      0     0 cf744870    0       0 private/rewrite
    cf744870 stream      0      0     0 ce6fc870    0       0
    ce4b2630 stream      0      0     0 d0cee900    0       0 private/proxymap
    d0cee900 stream      0      0     0 ce4b2630    0       0
    d0437240 stream      0      0     0 cf716000    0       0 private/proxymap
    cf716000 stream      0      0     0 d0437240    0       0
    c94f4990 stream      0      0     0 cee6ed80    0       0 private/rewrite
    cee6ed80 stream      0      0     0 c94f4990    0       0
    d0cefcf0 stream      0      0     0 cb281a20    0       0 private/rewrite
    cb281a20 stream      0      0     0 d0cefcf0    0       0
    ce0d5240 stream      0      0     0 cb251480    0       0 private/anvil

Now, the 'Conn' field from the previous line matches the 'Address' field of the line with the blank Addr ... so there are two sockets for each Addr? In vs out?

To give a reference point ... mars above has 91 jail'd environments running on it, it's been up 2 days, 9 hrs now, and has 11k sockets in use ...

Hrmmm ... just checked jupiter, and she has 32 jails with 1080 sockets ... venus has 62 jails with 2819 sockets ... and pluto has 35 jails with 1818 sockets ... mars is running on average at least 2x the number of sockets per jail compared to the other servers ...

Is this normal?

    mars# grep d067f900 /tmp/output
    d067f900 stream 0 0 0 cafd4c60 0 0
    cafd4c60 stream 0 0 0 d067f900 0 0

There is no 'Addr' related to either of them? I can scroll down pages and pages of those types of entries that don't have any Addr field associated with them ...

> --
> John-Mark Gurney    Voice: +1 415 225 5579
>
> All that I will do, has been done, All that I have, has not.

--
Marc G. Fournier
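The per-jail comparison across the four servers can be spelled out with a little arithmetic. A minimal sketch using the figures quoted above; the function name sockets_per_jail is made up, and the mars figure is the raw line count of /tmp/output (which includes some header lines), so it is slightly high.

```shell
#!/bin/sh
# sockets_per_jail
# Read "host sockets jails" lines on stdin and print the average number
# of sockets per jail for each host, truncated to a whole number.
sockets_per_jail() {
    awk '{ printf "%s %d\n", $1, $2 / $3 }'
}

printf '%s\n' \
    'mars 11238 91' \
    'jupiter 1080 32' \
    'venus 2819 62' \
    'pluto 1818 35' | sockets_per_jail
```

This works out to roughly 123 sockets per jail on mars against 33 to 51 on the other three hosts, so the '2x' estimate in the message is, if anything, on the low side.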
Re: What triggers No Buffer Space Available?
'k, I just rebooted the server (messages started again), and netstat -A is showing 3600 sockets open ... based on the jupiter/pluto/venus numbers, this is what I'd expect to see (~1000 sockets per 30 jails) ... so, over the course of the next 2 days, I expect that will grow to the 11k+ that I saw when I rebooted, with most of those apparently not attached to an 'Addr' ...

--
Marc G. Fournier
What triggers No Buffer Space Available?
I'm still being hit by this one ... more frequently right now, as I had to move a bit more stuff *onto* that server ...

I'm trying to figure out what I can monitor for a 'leak' somewhere, but the only thing I'm able to find is the whole nmbclusters stuff:

mars# netstat -m | grep "mbuf clusters"
130/542/672/25600 mbuf clusters in use (current/cache/total/max)

The above is after 26hrs uptime ...

Is there something else that will trigger/generate the above error message?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
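For monitoring, the cluster line above can be turned into a utilisation figure. The following is a small Python sketch, assuming the line keeps the `current/cache/total/max` layout shown in the message; the function name is hypothetical.

```python
import re


def cluster_usage(netstat_m_output):
    """Return (current, cache, total, maximum) from the 'mbuf clusters in use' line."""
    match = re.search(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use", netstat_m_output)
    if match is None:
        raise ValueError("no 'mbuf clusters in use' line found")
    return tuple(int(group) for group in match.groups())


# Line copied from the message above.
line = "130/542/672/25600 mbuf clusters in use (current/cache/total/max)"
current, cache, total, maximum = cluster_usage(line)
print(f"{total} of {maximum} clusters allocated ({100 * total / maximum:.1f}%)")
```

With 672 of 25600 clusters allocated, the pool is under 3% used after 26 hours, which suggests that nmbclusters is not the resource that is running out.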
Re: How to report bugs (Re: 6.2-STABLE deadlock?)
--On Tuesday, April 24, 2007 23:53:16 -0400 Kris Kennaway [EMAIL PROTECTED] wrote:

> On Wed, Apr 25, 2007 at 10:53:08AM +0800, LI Xin wrote:
> > Hi, Oleg,
> >
> > Oleg Derevenetz wrote:
> > > ??? LI Xin [EMAIL PROTECTED]:
> > > [...]
> > > > I'm not very sure if this is specific to one disk controller. Actually I got some occasional reports about similar hangs on amd64 6.2-RELEASE (slightly patched version) that most of processes stuck in the 'ufs' state, under very light load, the box was equipped with amr(4) RAID. I was not able to reproduce the problem at my lab, though, it's still unknown that how to trigger the livelock :-( Still need some investigate on their production system.
> > >
> > > I reported a similar issue for FreeBSD 6.2 in the audit-trail for kern/104406:
> > >
> > > http://www.freebsd.org/cgi/query-pr.cgi?pr=104406&cat=
> > >
> > > and there should be a thread related to this. Briefly, I suspect that this is related to nullfs filesystems on my server, and when I cvsuped to FreeBSD 6.2-STABLE with Daichi's unionfs-related patches and replaced nullfs-mounted fs with unionfs-mounted (that was done 10.03.07) the problem is gone (seems to be so, at least).
> >
> > Hmm... Seems to be different issues. The problem I have received was a pgsql server (no nullfs/unionfs involved), and the hang always happens when it is not being heavily loaded (usually in the morning, for instance, and there is no special configuration, like scheduled tasks which can generate disk load, etc., only the entropy harvesting), so this is quite confusing.
>
> Yes, a large part of the confusion is the unfortunate tendency of people to do the following:
>
> user1 my system hangs/panics/etc
> user2 my system hangs/panics/etc too; it must be the same problem!
>
> What we really need is for every FreeBSD user who encounters a hang/panic/etc to avoid jumping to conclusions -- no matter how many superficial similarities there may seem to you -- and instead go through the relevant steps described here:
>
> http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html
>
> Until you (or a developer) have analyzed the resulting information, you cannot definitively determine whether or not your problem is the same as a given random other problem, and you may just confuse the issue by making claims of similarity when you are really reporting a completely separate problem.

What about those that don't have the benefit of being able to access the console? :(

I've recently started buying servers that have builtin, full remote console (ie. the HP servers), but, for instance, I have one box that I have to consistently reboot every 3 days due to a 'No Buffer Space Available' ...

A thought: how hard would it be to add some method of forcing a system crash, that would dump core, from the command line? Something that, by default, would be disabled, but for remote debugging purposes, one could enable in the kernel and do a 'sysctl kernel.force_core_crash=1' to have it do it? I imagine that having a core to analyze would allow providing more information than nothing at all, no?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: How to report bugs (Re: 6.2-STABLE deadlock?)
--On Friday, April 27, 2007 22:57:29 +0200 Nicolas Rachinsky [EMAIL PROTECTED] wrote:

> * Marc G. Fournier [EMAIL PROTECTED] [2007-04-27 16:03 -0300]:
> > A thought: how hard would it be to add some method of forcing a system crash, that would dump core, from the command line? Something that, by default, would
>
> Doesn't 'kill -6 1' work anymore?

I'd never heard of that one ... will it dump core if I do that?

Please note, in my case, with the Buffer Space issue ... I can login and cleanly reboot the server, so doing something like the above to get a core dump is definitely doable, I'd just never seen a reference to a 'kill -6 1' before for doing that ...

Side question to this though ... I remember a while back using a 'client-server' mechanism that allowed me to dump core to a separate server ... it was so long ago that my memory is faint, but there was a reason why I couldn't dump to the local server ... not sure whatever happened to that code, but, if one can do that for dumping core, shouldn't there be some method possible to connect to DDB over the Ethernet without having to have a serial console in place?

For the core dump case, the ethernet obviously stayed up while it dump'd, couldn't some sort of 'ddb.conf' file be set up that would allow it to ifconfig an IP within that shell so that you could connect to it remotely? say with a 'from-ip' directive?

Just a thought ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Jail Resource Limits for 6.x ...
Is anyone looking into merging in the patch available at:

http://www.ualberta.ca/~cdjones/cdjones_jail_soc2006.patch

That provides both memory and cpu limits on a jail? It appears to be against REL_6 from last year's SOC ...

Is anyone using it in production anywhere?

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: 74 hours till next No Buffer Space Available reboot ...
--On Sunday, April 08, 2007 23:04:42 -0400 Dave [EMAIL PROTECTED] wrote:

> Hello,
> This is what i get for catching this late. Can you describe your situation? I've got a server, router actually running 6.1-p6 i believe, and lately it's been doing this stop. I can't be any more specific than that, because that's all i know. The box just goes unresponsive, i can get a login prompt on the console, but it's unresponsive. I have to reboot it. This has occurred twice now and i'm starting to get concerned. I've ruled out ram, i recently replaced it's ram for an unrelated reason so i don't think that's it. If your situation is similar can you let me know what you tried?

This is a different situation, I think ... first, I'm running 6.2-STABLE, as of about last week, so a much newer kernel than you are running ... and in my case, at least, I can still login to the machine using ssh and force a reboot remotely ... it doesn't seem to be a 'solid hang' ... if I were to hazard a guess as to what it feels like ... it feels like the network interface buffer has filled up, but isn't being released properly ... almost like a memory leak, but on the network ... if I leave it long enough, it will eventually require a tech to power cycle it, but if I catch it early enough, I can still get in to do a reboot ...

But ... that said ... when you say 'get a login prompt on the console, but it's unresponsive' ... do you mean that you can actually type in a userid, and possibly passwd, but after that it just hangs?

> Thanks.
> Dave.
>
> - Original Message -
> From: Marc G. Fournier [EMAIL PROTECTED]
> To: [EMAIL PROTECTED]
> Cc: Chris [EMAIL PROTECTED]; Thiago Esteves de Oliveira [EMAIL PROTECTED]
> Sent: Sunday, April 08, 2007 10:28 PM
> Subject: 74 hours till next No Buffer Space Available reboot ...
>
> [...]

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
74 hours till next No Buffer Space Available reboot ...
In my case, I can almost set my watch to it (if I had a watch) ... every 3 days, 2 hours, it seems that I have to reboot this machine, as that is when the 'No Buffer Space Available' error starts to be generated ...

There are two others (CC'd in this) that have experienced the same ... Chris / Thiago ... in your cases, are you finding that it happens as regularly with your servers? Thiago, I believe you ended up reverting to an older kernel to clear up the situation?

I've included my 'netstat -m' report ... from it, it doesn't look to me like it's an mbuf issue, or am I missing something? Is there something else that, in 74 hours, I can provide before I do the reboot?

Chris, you mentioned reducing recvspace/sendspace to correct the issue? Has that fixed it for you, or just prolonged until it happens again? How did you set this? I've checked both the man pages for ifconfig and fxp, and don't see anything ... ah, just found it doing a 'sysctl -a' ... can you post your settings from /etc/sysctl.conf? or did you set it somewhere else? I'd like to try that and see if maybe that changes my '74 hours uptime', either good or bad ...

# netstat -m
161/949/1110 mbufs in use (current/cache/total)
133/639/772/25600 mbuf clusters in use (current/cache/total/max)
133/396 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
306K/1515K/1821K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/45/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
325 requests for I/O initiated by sendfile
731 calls to protocol drain routines

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
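The question of whether the report above points at mbufs can be checked mechanically: if the mbuf pools were being exhausted, the "requests ... denied" and "requests ... delayed" lines would be non-zero. A rough Python sketch follows, using the counter lines copied from the report; the function name is hypothetical, and the line formats are assumed to match this 6.2-era output.

```python
import re

# Counter lines copied from the netstat -m report above.
REPORT = """\
161/949/1110 mbufs in use (current/cache/total)
133/639/772/25600 mbuf clusters in use (current/cache/total/max)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0 requests for sfbufs denied
0 requests for sfbufs delayed
"""


def denied_counters(report):
    """Map each 'requests ... denied/delayed' line to the sum of its counters."""
    counters = {}
    for line in report.splitlines():
        match = re.match(r"([\d/]+) (requests for .* (?:denied|delayed))", line.strip())
        if match:
            counters[match.group(2)] = sum(int(n) for n in match.group(1).split("/"))
    return counters


counters = denied_counters(REPORT)
exhausted = [name for name, count in counters.items() if count > 0]
print("mbuf exhaustion seen" if exhausted else "no denied or delayed requests; look beyond mbufs")
```

Every counter in this report is zero, which fits the reading that this is not an mbuf issue and that some other limit is returning ENOBUFS.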
Re: No buffer space available
--On Saturday, April 07, 2007 20:12:00 +0100 Chris [EMAIL PROTECTED] wrote:

> Also to add I now have a 2nd box using 6.2 STABLE few days old code, had to use it because of broadcom 5755 nic card, I plan to use large tcp window sizes so will be interesting to see if this also suffers from the problem.

I've got 8 servers on the same network, 3 are almost identical, but one of them (the one with the problem) is using software RAID vs hardware ... but, if you are seeing it without using software RAID, then that is obviously not the culprit :(

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: No buffer space available
--On Friday, April 06, 2007 06:17:04 +0100 Chris [EMAIL PROTECTED] wrote:

> I am seeing the no buffer space error on a machine running 6.2 STABLE feb 24 code, the machine isn't using gmirror. I had to reduce recvspace and sendspace to lower values than I want to get round the problem.
>
> 67/1163/1230 mbufs in use (current/cache/total)
> 65/275/340/65536 mbuf clusters in use (current/cache/total/max)
> 65/255 mbuf+clusters out of packet secondary zone in use (current/cache)
> 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
> 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
> 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
> 146K/840K/987K bytes allocated to network (current/cache/total)
> 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> 0/56/8704 sfbufs in use (current/peak/max)
> 0 requests for sfbufs denied
> 0 requests for sfbufs delayed
> 20233 requests for I/O initiated by sendfile
> 7740 calls to protocol drain routines

What ethernet driver are you using? In my case, it's an fxp device ... trying to see if there is *some* sort of common denominator here :(

I just upgraded to the latest kernel last night, to see if maybe a recent commit had a side-effect of fixing it, but won't know anything for another 48 hours or so ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: No buffer space available
kern.shutdown.kproc_shutdown_wait: 60
kern.sugid_coredump: 0
kern.coredump: 1
kern.nodump_coredump: 0
kern.corefile: %N.core
kern.fscale: 2048
kern.timecounter.stepwarnings: 0
kern.timecounter.nbinuptime: 64529049
kern.timecounter.nnanouptime: 84
kern.timecounter.nmicrouptime: 297518
kern.timecounter.nbintime: 15065914
kern.timecounter.nnanotime: 5937055
kern.timecounter.nmicrotime: 9127597
kern.timecounter.ngetbinuptime: 78777910
kern.timecounter.ngetnanouptime: 248698
kern.timecounter.ngetmicrouptime: 9289785
kern.timecounter.ngetbintime: 0
kern.timecounter.ngetnanotime: 0
kern.timecounter.ngetmicrotime: 6
kern.timecounter.nsetclock: 3
kern.timecounter.hardware: i8254
kern.timecounter.choice: TSC(-100) i8254(0) dummy(-100)
kern.timecounter.tick: 1
kern.timecounter.smp_tsc: 0
kern.threads.thr_scope: 0
kern.threads.thr_concurrency: 0
kern.threads.debug: 0
kern.threads.max_threads_per_proc: 1500
kern.threads.max_groups_per_proc: 1500
kern.threads.max_threads_hits: 0
kern.threads.virtual_cpu: 2
kern.sched.name: 4BSD
kern.sched.quantum: 10
kern.sched.ipiwakeup.enabled: 1
kern.sched.ipiwakeup.requested: 3687784
kern.sched.ipiwakeup.delivered: 3690316
kern.sched.ipiwakeup.usemask: 1
kern.sched.ipiwakeup.useloop: 0
kern.sched.ipiwakeup.onecpu: 0
kern.sched.ipiwakeup.htt2: 0
kern.sched.followon: 0
kern.sched.pfollowons: 0
kern.sched.kgfollowons: 0
kern.sched.preemption: 1
kern.sched.runq_fuzz: 1
kern.ccpu: 1948
kern.devstat.numdevs: 12
kern.devstat.generation: 538
kern.devstat.version: 6
kern.kobj_methodcount: 73
kern.log_wakeups_per_second: 5
kern.log_console_output: 1
kern.always_console_output: 0
kern.msgbuf: removed lines upon lines of text here
kern.msgbuf_clear: 0
kern.smp.maxcpus: 16
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 2
kern.smp.forward_signal_enabled: 1
kern.smp.forward_roundrobin_enabled: 1
kern.nselcoll: 11052
kern.drainwait: 300
kern.tty_nin: 22760
kern.tty_nout: 15228375
kern.console: consolectl,/consolectl,
kern.consmute: 0
kern.consmsgbuf_size: 8192
kern.constty_wakeups_per_second: 5
kern.filedelay: 30
kern.dirdelay: 29
kern.metadelay: 28
kern.minvnodes: 25000
kern.chroot_allow_open_directories: 1
kern.random.yarrow.gengateinterval: 10
kern.random.yarrow.bins: 10
kern.random.yarrow.fastthresh: 192
kern.random.yarrow.slowthresh: 256
kern.random.yarrow.slowoverthresh: 2
kern.random.sys.seeded: 1
kern.random.sys.harvest.ethernet: 1
kern.random.sys.harvest.point_to_point: 1
kern.random.sys.harvest.interrupt: 1
kern.random.sys.harvest.swi: 0

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: No buffer space available
Thiago ...

What version of kernel did you end up going back to?

--On Wednesday, April 04, 2007 10:15:48 -0300 Marc G. Fournier [EMAIL PROTECTED] wrote:

> I'm seeing the same effect (haven't tried older kernel, mind you) almost like clockwork, every 72 hours after reboot ... at least now I don't feel so crazy, knowing it isn't just me ...
>
> --On Sunday, April 01, 2007 17:07:08 -0300 Thiago Esteves de Oliveira [EMAIL PROTECTED] wrote:
>
> > I've tried to increase the kern.ipc.nmbclusters value but it worked only when I changed the kernel to an older one.
> >
> > netstat -m (Now it's working with the same values.)
> > -
> > 515/850/1365 mbufs in use (current/cache/total)
> > 512/390/902/65024 mbuf clusters in use (current/cache/total/max)
> > 512/243 mbuf+clusters out of packet secondary zone in use (current/cache)
> > 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
> > 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
> > 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
> > 1152K/992K/2145K bytes allocated to network (current/cache/total)
> > 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
> > 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
> > 0/0/0 sfbufs in use (current/peak/max)
> > 0 requests for sfbufs denied
> > 0 requests for sfbufs delayed
> > 2759 requests for I/O initiated by sendfile
> > 2982 calls to protocol drain routines
> >
> > Ethernet adapters
> > -
> > em0: Intel(R) PRO/1000 Network Connection Version - 6.0.5 port 0xec80-0xecbf mem 0xfebe-0xfebf irq 10 at device 4.0 on pci7
> > em0: Ethernet address: 00:04:23:c3:06:78
> > em0: [FAST]
> > skc0: 3Com 3C940 Gigabit Ethernet port 0xe800-0xe8ff mem 0xfebd8000-0xfebdbfff irq 15 at device 6.0 on pci7
> > skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
> > sk0: Marvell Semiconductor, Inc. Yukon on skc0
> > sk0: Ethernet address: 00:0a:5e:65:ad:c3
> > miibus0: MII bus on sk0
> > e1000phy0: Marvell 88E1000 Gigabit PHY on miibus0
> > e1000phy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX, auto
> >
> > P.S.: I am using the FreeBSD/amd64.
> >
> > Brian A. Seklecki wrote:
> > > Show us netstat -m on the broken kernel? Show us your dmesg(8) for em(4).
> > >
> > > TIA,
> > > ~BAS
> > >
> > > On Fri, 2007-03-30 at 11:13 -0300, Thiago Esteves de Oliveira wrote:
> > > > Hello,
> > > >
> > > > I've had a problem with one of my FreeBSD servers, the machine has stopped its network services and then sent these messages:
> > > >
> > > > -Mar 27 13:00:03 anubis dhcpd: send_packet: No buffer space available
> > > > -Mar 27 13:00:26 anubis routed[431]: Send bcast sendto(em0, 146.164.92.255.520): No buffer space available
> > > >
> > > > The messages were repeated a lot of times before a temporary solution. I've changed the kernel (FreeBSD 6.2) to an older one (FreeBSD 6.1) and since then it's been working well. What happened?
> > > >
> > > > P.S.: I can give more information if necessary.

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: No buffer space available
Thiago ...

I'm just curious here, but are you by any chance using geom at all? The only machine I have that seems to be affected like this (where netstat -m doesn't seem to indicate a problem with mbufs) is using gmirror ... the rest all use hardware RAID controllers ...

It's a long shot, but so far, it's the only one I seem to be able to draw :(

--On Sunday, April 01, 2007 17:07:08 -0300 Thiago Esteves de Oliveira [EMAIL PROTECTED] wrote:

> I've tried to increase the kern.ipc.nmbclusters value but it worked only when I changed the kernel to an older one.
>
> [...]

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
More on: No buffer space available
:                        44,      0,    119,    349,      30104,     0
VMSPACE:                296,      0,   1034,    812,    2510845,     0
mbuf_packet:            256,      0,    142,    382,   33140922,     0
mbuf:                   256,      0,     44,    542,   65939839,     0
mbuf_cluster:          2048,  25600,    524,    164,     345305,     0
mbuf_jumbo_pagesize:   4096,      0,      0,      0,          0,     0
mbuf_jumbo_9k:         9216,      0,      0,      0,          0,     0
mbuf_jumbo_16k:       16384,      0,      0,      0,          0,     0
ACL UMA zone:           388,      0,      0,      0,          0,     0
g_bio:                  132,      0,      0,   4205,   87153652,     0
VNODE:                  272,      0,  71264,  22158, 1241560352,     0
VNODEPOLL:               76,      0,      0,    100,          3,     0
S VFS Cache:             68,      0,  73121,  29135, 1248334482,     0
L VFS Cache:            291,      0,    124,   1085,     682683,     0
NAMEI:                 1024,      0,      0,    304, 1434961352,     0
DIRHASH:               1024,      0,   1810,    258,   18000204,     0
PIPE:                   408,      0,   1981,    602,    1091976,     0
KNOTE:                   68,      0,     32,    360,    3972127,     0
socket:                 356,  12331,  12271,     60,    8439626,  1141
unpcb:                  144,  12339,  11561,    373,    5337418,     0
ipq:                     32,    904,      0,    226,          2,     0
udpcb:                  180,  12342,     74,    146,    2173707,     0
inpcb:                  180,  12342,    678,   1478,     927361,     0
tcpcb:                  464,  12328,    619,    717,     927361,     0
tcptw:                   48,   2496,     59,   1501,     256613,     0
syncache:               100,  15366,      0,    195,     676224,     0
hostcache:               76,  15400,    512,    688,      34850,     0
tcpreass:                20,   1690,      0,    507,      53830,     0
sackhole:                20,      0,      0,    507,      20912,     0
ripcb:                  180,  12342,      0,     88,       1127,     0
rtentry:                132,      0,    203,    319,       6656,     0
g_stripe_zone:       131072,    100,      0,      0,          0,     0
SWAPMETA:               276, 121576,    957,    429,      17641,     0
Mountpoints:            664,      0,    197,     19,        200,     0
FFS inode:              132,      0,  70901,  17491, 1239034732,     0
FFS1 dinode:            128,      0,      0,      0,          0,     0
FFS2 dinode:            256,      0,  70901,   7924, 1239034732,     0

If the '4 hour later' version is of any use, please ask, I did save a copy before rebooting ...

Does this provide anything? Is there something else I should do/try?

Thanks ...

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
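One way to read a zone listing like the one above is to flag any zone that has recorded allocation failures or is close to its limit. The sketch below is a minimal Python version. The header row is not part of the excerpt, so it assumes the columns are SIZE, LIMIT, USED, FREE, REQUESTS, FAILURES (the usual vmstat -z order on 6.x) and that a LIMIT of 0 means unlimited. The 90% threshold and the function names are arbitrary choices for illustration; the sample rows are copied from the listing.

```python
# A few rows copied from the zone listing above, whitespace normalised.
ROWS = """\
mbuf_cluster: 2048, 25600, 524, 164, 345305, 0
socket: 356, 12331, 12271, 60, 8439626, 1141
unpcb: 144, 12339, 11561, 373, 5337418, 0
inpcb: 180, 12342, 678, 1478, 927361, 0
tcpcb: 464, 12328, 619, 717, 927361, 0
"""


def parse_zones(text):
    """Yield (name, limit, used, failures) for each zone row."""
    for line in text.splitlines():
        # Zone names may contain spaces, so split on the last colon.
        name, _, rest = line.rpartition(":")
        values = [value.strip() for value in rest.split(",")]
        if not name or len(values) != 6:
            continue  # skip headers, blank lines and truncated rows
        size, limit, used, free, requests, failures = (int(v) for v in values)
        yield name.strip(), limit, used, failures


def zones_under_pressure(text, threshold=0.9):
    """Return zones with failures, or using at least `threshold` of a nonzero limit."""
    flagged = []
    for name, limit, used, failures in parse_zones(text):
        near_limit = limit > 0 and used / limit >= threshold
        if failures > 0 or near_limit:
            flagged.append((name, used, limit, failures))
    return flagged


for name, used, limit, failures in zones_under_pressure(ROWS):
    print(f"{name}: {used}/{limit} in use, {failures} failed allocations")
```

On these rows only `socket` (12271 of 12331, with 1141 failures) and `unpcb` (11561 of 12339) are flagged, while `mbuf_cluster` sits at 524 of 25600. Under the stated column assumption, that points at the number of open sockets, mostly UNIX domain ones, rather than at mbufs.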
Re: socketpair: No buffer space available
thanks ... just rebooted it yesterday again, so it has another 48 hours before it starts up again, so will save that output before next reboot ...

--On Tuesday, March 27, 2007 21:03:55 +0100 Robert Watson [EMAIL PROTECTED] wrote:

> On Fri, 23 Mar 2007, Marc G. Fournier wrote:
>
> > I've checked nmbclusters between the two machines, and both are at 25600, but not sure what sysctl to look at for how much is actually used out of that 25600 ...
>
> netstat -mb
>
> nmbclusters directly affects the number of clusters available in the network stack; it also indirectly affects the scaling of other settings, such as resource limits on the number of sockets. vmstat -z is also generally useful. There are a few paths to ENOBUFS in the socket allocation code--one path is if you are over-committed on socket buffer resources with respect to the resource limits of the user. Check the output of limits and the socket buffer size limit.
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Re: socketpair: No buffer space available
--On Monday, March 26, 2007 00:08:07 +0100 Bruce M. Simpson [EMAIL PROTECTED] wrote:

> Marc G. Fournier wrote:
> > Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space available
> >
> > If I have a login session on the machine, I can easily do a reboot of the machine, and it seems to come up clean every time (ie. no fsck's need to be run) ...
> >
> > Does anyone have any ideas of what I can look at?
>
> How odd. The re-exec feature is not documented in the man page. It appears that it can be turned off with the -r switch according to sshd.c. Can you give that a try and see if that offers symptomatic relief? It would be somewhat less secure as sshd will fork rather than fork..exec.

That was actually just one example ... I get more of:

sendmail[82066]: l2NEA1Ht082066: SYSERR(root): makeconnection: cannot create socket: No buffer space available

than I do the sshd errors ... in another 15 hours or so, they will all start up again, like clockwork :(

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
socketpair: No buffer space available
Almost like clockwork, every 3 days, I have one server that starts to generate errors similar to below ... it isn't a 'continuous thing' at the start, but gradually grows worse ... it just started happening again today, after 3 days, 2hrs of uptime ...

Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space available

As unrelated as this might sound, out of three servers that are virtually identical, this is the only one using gmirror for its drives vs a hardware raid controller, two of the three running kernels from about the same time ...

# ssh jupiter uname -a
FreeBSD jupiter.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #1: Fri Mar 16 13:13:02 ADT 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel i386

vs

# ssh mars uname -a
FreeBSD mars.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #5: Tue Mar 13 02:29:37 ADT 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel i386

jupiter is running more on it than mars right now ...

So, I either have something mis-configured on mars that is done right on jupiter, or there is a bug that is being tickled on mars that isn't being tickled on jupiter ...

If I have a login session on the machine, I can easily do a reboot of the machine, and it seems to come up clean every time (ie. no fsck's need to be run) ...

Does anyone have any ideas of what I can look at?

I've checked nmbclusters between the two machines, and both are at 25600, but not sure what sysctl to look at for how much is actually used out of that 25600 ...

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
testing ...
I've sent several messages to this list the past couple of days, but none of them seem to go through ... I'm not expecting this one to either, just trying to see if there is anything in my logs to indicate a problem :(

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Re: ping's seem to hang ... 'zoneli' state?
vmstat 1 shows:

# vmstat 1
 procs      memory           page                  disks     faults        cpu
 r    b w     avm    fre   flt re pi po   fr sr   da0 da1    in   sy   cs  us sy id
 0 1577 0 7813112 657312  1553  9  1  1 1325 68     0   0   569 4774 2359   7 10 83
 0 1578 0 7815368 656600   199 59  0  0   64  0     0   2   226  679  616   0  6 93
 0 1578 0 7815368 656564  1120  0  0  0  220  0     0   0   208  638  608   1  8 91
 0 1578 0 7815368 656564     5  0  0  0  314  0     0   9   343  890  974   1  8 91
 0 1578 0 7815368 656564   804  0  0  0    0  0     2   0   233  469  633   1  9 90

Normally I'd look for any mysql processes using a lot of CPU, but I'm not finding anything using a lot of CPU (this system is the only one we have using gmirror, if that helps any) ...

--On Tuesday, March 06, 2007 11:10:53 -0400 Marc G. Fournier [EMAIL PROTECTED] wrote:

> Does this show anything? I can't kill the processes, even with kill -9 ... this happens consistently just after 3 days uptime on a kernel built Fri Feb 23 07:47:20 AST 2007, and the interface is an fxp0 device ...
>
> # ps auxl | grep ping
> root 68994 0.0 0.0 1556 808 ?? D  7:58AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 70555 0.0 0.0 1556 808 ?? D  8:05AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 73863 0.0 0.0 1452 520 ?? D  8:33AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 75857 0.0 0.0 1452 520 ?? D  8:49AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 76676 0.0 0.0 1556 808 ?? D  8:53AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 77048 0.0 0.0 1556 808 ?? D  8:54AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 80071 0.0 0.0 1452 520 ?? D  9:15AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 80198 0.0 0.0 1452 520 ?? D  9:15AM 0:00.01 ping -c 1 -t 5 j 0 1 0 -16 0 zoneli
> root 81210 0.0 0.0 1556 808 ?? D  9:22AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 81212 0.0 0.0 1556 808 ?? D  9:22AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 81841 0.0 0.0 1556 808 ?? D  9:25AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 88041 0.0 0.0 1452 524 ?? D 10:11AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 88418 0.0 0.0 1452 524 ?? D 10:13AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 89800 0.0 0.0 1452 524 ?? D 10:24AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 90774 0.0 0.0 1452 556 ?? D 10:58AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 91118 0.0 0.0 1452 556 ?? D 10:58AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli
> root 91635 0.0 0.0 1452 556 ?? D 11:04AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
ping's seem to hang ... 'zoneli' state?
Does this show anything? I can't kill the processes, even with kill -9 ... this happens consistently just after 3 days uptime on a kernel built Fri Feb 23 07:47:20 AST 2007, and the interface is an fxp0 device ...

# ps auxl | grep ping
root 68994 0.0 0.0 1556 808 ?? D  7:58AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 70555 0.0 0.0 1556 808 ?? D  8:05AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 73863 0.0 0.0 1452 520 ?? D  8:33AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 75857 0.0 0.0 1452 520 ?? D  8:49AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 76676 0.0 0.0 1556 808 ?? D  8:53AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 77048 0.0 0.0 1556 808 ?? D  8:54AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 80071 0.0 0.0 1452 520 ?? D  9:15AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 80198 0.0 0.0 1452 520 ?? D  9:15AM 0:00.01 ping -c 1 -t 5 j 0 1 0 -16 0 zoneli
root 81210 0.0 0.0 1556 808 ?? D  9:22AM 0:00.02 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 81212 0.0 0.0 1556 808 ?? D  9:22AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 81841 0.0 0.0 1556 808 ?? D  9:25AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 88041 0.0 0.0 1452 524 ?? D 10:11AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 88418 0.0 0.0 1452 524 ?? D 10:13AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 89800 0.0 0.0 1452 524 ?? D 10:24AM 0:00.01 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 90774 0.0 0.0 1452 556 ?? D 10:58AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 91118 0.0 0.0 1452 556 ?? D 10:58AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli
root 91635 0.0 0.0 1452 556 ?? D 11:04AM 0:00.00 ping -c 1 -t 30 0 1 0 -16 0 zoneli

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
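Rather than reading the ps listing by eye, the number of processes stuck on a given wait channel can be counted. A small sketch, assuming ps is asked for explicit columns so the wait channel is always the third field; the function name `count_wchan` is invented for this example.

```shell
#!/bin/sh
# Hypothetical filter: count processes whose wait channel matches a given name.
# Expects "pid state wchan command..." rows on stdin, for example from
#   ps -axo pid,state,wchan,command
count_wchan() {
    awk -v want="$1" '$3 == want { n++ } END { print n + 0 }'
}

# Example call on the affected host:
# ps -axo pid,state,wchan,command | count_wchan zoneli
```

Logging that count once a minute would show when the first ping gets stuck, which may help line the hang up with other events on the box.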
Re: ping's seem to hang ... 'zoneli' state?
--On Tuesday, March 06, 2007 18:59:04 +0300 Anton Yuzhaninov [EMAIL PROTECTED] wrote:

> Hello Marc,
>
> You wrote on Tuesday, March 6, 2007, 6:10:45 PM:
>
> MGF> Does this show anything? I can't kill the processes, even with kill -9 ...
> MGF> this happens consistently just after 3 days uptime on a kernel built Fri Feb
> MGF> 23 07:47:20 AST 2007, and the interface is an fxp0 device ...
> MGF> # ps auxl | grep ping
> MGF> root 68994 0.0 0.0 1556 808 ?? D 7:58AM 0:00.02 ping -c 1 -t
> MGF> 30 0 1 0 -16 0 zoneli
>
> This is a known problem:
> http://www.freebsd.org/releases/6.2R/errata.html
>
> There are some different cases when zonelimit livelock is possible. Send vmstat -z output (when processes lock in zonelimit state).

Great, thanks ... just read the errata on zonelimit, and it seems to imply that it was fixed on the 12th of February, but (of course) it doesn't indicate which files ... I just did a new cvsup since my last one, and all that has changed is:

Updating collection src-all/cvs
 Edit src/lib/libarchive/archive_read_extract.c
 Edit src/share/man/man4/tap.4
 Edit src/share/man/man4/tun.4
 Edit src/sys/amd64/conf/SMP
 Edit src/sys/dev/wi/if_wi_pccard.c
 Edit src/sys/kern/sys_generic.c
 Edit src/sys/net/if_tap.c
 Edit src/sys/net/if_tun.c
 Edit src/sys/netgraph/ng_ksocket.c
 Edit src/sys/netinet/ip_mroute.c
 Edit src/sys/netinet/tcp.h
Finished successfully

Can someone comment on whether I just missed the commit on my last cvsup, or if I'm hitting the same problem but in a different way?

Thanks ...

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Re: ping's seem to hang ... 'zoneli' state?
--On Wednesday, March 07, 2007 00:44:25 +0800 LI Xin [EMAIL PROTECTED] wrote:

> Marc G. Fournier wrote:
> [...]
>> Can someone comment on whether I just missed the commit on my last cvsup, or if I'm hitting the same problem but in a different way?
>
> I think so. Try patching your system with:
>
> src/sys/kern/kern_mbuf.c,v 1.9.2.9
> src/sys/sys/mbuf.h,v 1.170.2.7
> src/sys/vm/uma.h,v 1.22.2.8
> src/sys/vm/uma_core.c,v 1.119.2.19
> src/sys/vm/uma_core.c,v 1.119.2.18
>
> and perhaps also:
>
> src/sys/kern/uipc_socket.c,v 1.293.

Here's what I have right now:

__FBSDID("$FreeBSD: src/sys/kern/kern_mbuf.c,v 1.9.2.9 2007/02/11 03:31:18 mohans Exp $");
 * $FreeBSD: src/sys/sys/mbuf.h,v 1.170.2.7 2007/02/11 03:31:19 mohans Exp $
 * $FreeBSD: src/sys/vm/uma.h,v 1.22.2.8 2007/02/11 03:31:19 mohans Exp $
__FBSDID("$FreeBSD: src/sys/vm/uma_core.c,v 1.119.2.19 2007/02/11 03:31:19 mohans Exp $");
__FBSDID("$FreeBSD: src/sys/vm/uma_core.c,v 1.119.2.19 2007/02/11 03:31:19 mohans Exp $");
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket.c,v 1.242.2.8 2007/02/03 04:01:22 bms Exp $");

The only one that looks off is uipc_socket.c ... do I need to copy that from HEAD? Are there any compatibility issues with doing that?

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
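Comparing the checked-out revisions against a list like the one above is easier if the revision number is pulled out of each `$FreeBSD$` tag mechanically. A rough sketch, assuming the tag layout shown in this message; the function name `fbsd_rev` is hypothetical.

```shell
#!/bin/sh
# Hypothetical filter: print the RCS revision from any $FreeBSD$ ident line
# read on stdin; lines without such a tag are ignored.
fbsd_rev() {
    sed -n 's/.*\$FreeBSD: [^ ]*,v \([0-9.]*\) .*/\1/p'
}

# Example call against a source tree:
# grep -h 'FreeBSD:' /usr/src/sys/kern/kern_mbuf.c | fbsd_rev
```

Fed the kern_mbuf.c line quoted above, this would print 1.9.2.9, which can then be compared with the revision named in the reply.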
Re: Some days, it doesn't pay to upgrade ...
Based on the suggestion by someone on this list, I set up a screen session with top running, to watch things ... again, after 3 days, the server goes 'out of process' ... this time, of course, I could get in to look around and kill off processes ... from what I can tell, a process that all it does is:

ping -c 1 host

with a 300 sec timeout that runs once a minute started to 'run over top of' each other out of cron ... the host that it is pinging is on the same switch and has been running fine for 20 days now, and it wasn't until I did the last upgrade on the server causing the problems that these problems started ...

Coincidence? :)

I'm going to fix the script so that it doesn't try to run over itself ... anyone know of a problem with the fxp driver in 6-STABLE that might cause the ping to hang?

--On Thursday, March 01, 2007 09:51:13 +1100 Antony Mawer [EMAIL PROTECTED] wrote:

> On 27/02/2007 11:59 PM, Marc G. Fournier wrote:
>> After 155 days of problem free uptime, I upgraded my 6-STABLE system the other day to the latest cvsup ... 3 days later, the whole thing hung solid with:
>>
>> Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
>> Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see tuning(7) and login.conf(5).
>> Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see tuning(7) and login.conf(5).
>>
>> Stupid question: why isn't there some mechanism that prevents new processes from starting up, instead of locking up the whole server? I'm not asking for the evilness of Linux, where it arbitrarily kills off existing processes, but if maxproc is hit, why continue to try and start up new ones?
>
> What do you define as 'hung solid'? You are unable to get in via SSH? Or at a console via iLO/etc? I've seen this on some of our 6.0-RELEASE machines (along with maxpipekva exhausted errors), and you can't SSH in from that point... because sshd forks to handle the connection, and all available process slots are used up.
>
> I've thought about writing a background daemon to monitor the logs for signs of this (or even to just try and create a short-lived child process by fork()ing every 5 minutes or so), and dump information to disk then reboot the system when this occurs... it's a work-around for something that shouldn't happen, but it does anyway... once I'm able to identify _what_ is causing the build-up of processes, then I might be able to do something about killing them...!!!
>
> It's quite deceptive from an end-user point of view, because things like Apache that are already running keep running, so all they see are strange bits and pieces that don't work... and as always, it's one of those things that only happens on some clients' machines, but never on any of our test machines...
>
> --Antony
>
> PS. I haven't disappeared off the face of the earth.. though close.. my fiance and I have been busy planning the wedding, and wound up buying a house at the same time..!! Will catch up shortly once I get a chance to come up for air!!

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
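One way to keep a once-a-minute cron job from running over itself, as described at the top of this message, is to take a lock before starting and skip the run if the lock is still held. A minimal sketch using an atomic mkdir as the lock; the function name `run_once`, the lock path, and the exit code 75 are all choices made for this illustration, not anything from the thread.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command only if no earlier run still holds the lock.
# mkdir either creates the directory or fails, so two runs cannot both win.
run_once() {
    lockdir=$1
    shift
    if ! mkdir "$lockdir" 2>/dev/null; then
        echo "previous run still active, skipping" >&2
        return 75
    fi
    "$@"
    status=$?
    rmdir "$lockdir"
    return $status
}

# Example cron command (host name is a placeholder):
# run_once /var/run/pingcheck.lock ping -c 1 host
```

If the ping wedges in an unkillable state, the lock simply stays held and later runs are skipped, so at most one stuck process accumulates instead of one per minute. The lock directory then has to be removed by hand after a reboot. On FreeBSD, lockf(1) with `-t 0` offers a similar effect without the stale-lock cleanup.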
Re: Some days, it doesn't pay to upgrade ...
I don't know how critical this is, but I just thought about it ... this is my only system running gmirror ... everything seems fine according to gmirror status, but maybe something is wrong there that I'm not seeing:

Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider mirror/vm destroyed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm destroyed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider mirror/md2 destroyed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2 destroyed.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md2 removed from md0.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 removed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider mirror/md1 destroyed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1 destroyed.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md1 removed from md0.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 destroyed.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1 created (id=2282154470).
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da1 detected.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da2 detected.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da2 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da1 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider mirror/md1 launched.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2 created (id=3089402334).
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da3 detected.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da4 detected.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da4 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da3 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider mirror/md2 launched.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm created (id=2175292049).
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider da5 detected.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 created (id=1094782536).
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md1 attached to md0.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md2 attached to md0.
Mar 3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Force device vm start due to timeout.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider da5 activated.
Mar 3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider mirror/vm launched.

mirror/md1 COMPLETE da1 da2
mirror/md2 COMPLETE da3 da4
mirror/vm  DEGRADED da5

I'm not using da5 right now, it's just in there ... went with a RAID1+0 vs RAID5 configuration ...

--On Thursday, March 01, 2007 09:51:13 +1100 Antony Mawer [EMAIL PROTECTED] wrote:

> On 27/02/2007 11:59 PM, Marc G. Fournier wrote:
>> After 155 days of problem free uptime, I upgraded my 6-STABLE system the other day to the latest cvsup ... 3 days later, the whole thing hung solid with:
>>
>> Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
>> Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see tuning(7) and login.conf(5).
>> Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see tuning(7) and login.conf(5).
>>
>> Stupid question: why isn't there some mechanism that prevents new processes from starting up, instead of locking up the whole server? I'm not asking for the evilness of Linux, where it arbitrarily kills off existing processes, but if maxproc is hit, why continue to try and start up new ones?
>
> What do you define as 'hung solid'? You are unable to get in via SSH? Or at a console via iLO/etc? I've seen this on some of our 6.0-RELEASE machines (along with maxpipekva exhausted errors), and you can't SSH in from that point... because sshd forks to handle the connection, and all available process slots are used up.
>
> I've thought about writing a background daemon to monitor the logs for signs of this (or even to just try and create a short-lived child process by fork()ing every 5 minutes or so), and dump information to disk then reboot the system when this occurs... it's a work-around for something that shouldn't happen, but it does anyway... once I'm able to identify _what_ is causing the build-up of processes, then I might be able to do something about killing them...!!!
>
> It's quite deceptive from an end-user point of view, because things like Apache that are already running keep running, so all they see are strange bits and pieces that don't work... and as always, it's one of those things that only happens on some clients' machines, but never on any of our test machines...
>
> --Antony
>
> PS. I haven't disappeared off the face of the earth.. though close.. my fiance and I have been busy planning the wedding, and wound up buying
Re: Some days, it doesn't pay to upgrade ...
--On Tuesday, February 27, 2007 20:18:50 -0800 Tom Samplonius [EMAIL PROTECTED] wrote:

> Marc G. Fournier [EMAIL PROTECTED] wrote:
>> Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
>> Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see tuning(7) and login.conf(5).
>> Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see tuning(7) and login.conf(5).
>>
>> Stupid question: why isn't there some mechanism that prevents new processes from starting up, instead of locking up the whole server? I'm not asking for ...
>
> Isn't that what is happening? When maxproc is hit, new processes can't be created. It is harmless, except for the uid that exceeded its process limit.
>
> I think the hang is some side-effect. Either because init can't fork a process, therefore there is nothing to login to. Did you try pinging the system from remote to really see whether it was a solid hang? Or did you just pound on the keyboard?

ping continues to work ... it's a remote server, without a serial console, so doing much more on that particular server is a bit more difficult :( all our newer stuff (which, of course, is running great) has remote consoles set up on it ...

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
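Since new processes simply cannot be created once the limit is reached, it may help to get a warning while there is still room to log in. A rough sketch that compares a process count with a limit; the function name `proc_headroom` and the 80% threshold are assumptions for illustration, and the example call reads the system-wide `kern.maxproc` sysctl (the per-user limit is `kern.maxprocperuid`).

```shell
#!/bin/sh
# Hypothetical check: warn when the process count nears a limit.
# Arguments: current count, limit, threshold percentage.
proc_headroom() {
    count=$1
    limit=$2
    pct=$3
    used=$(( count * 100 / limit ))
    if [ "$used" -ge "$pct" ]; then
        echo "WARNING: $count of $limit process slots in use (${used}%)"
        return 1
    fi
    echo "ok: $count of $limit process slots in use (${used}%)"
}

# Example call on a FreeBSD host; ps prints one header line, hence the -1:
# proc_headroom "$(( $(ps ax | wc -l) - 1 ))" "$(sysctl -n kern.maxproc)" 80
```

Run from cron with mail output enabled, the WARNING line would arrive before sshd loses the ability to fork.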
Some days, it doesn't pay to upgrade ...
After 155 days of problem free uptime, I upgraded my 6-STABLE system the other day to the latest cvsup ... 3 days later, the whole thing hung solid with:

Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see tuning(7) and login.conf(5).
Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see tuning(7) and login.conf(5).

Stupid question: why isn't there some mechanism that prevents new processes from starting up, instead of locking up the whole server? I'm not asking for the evilness of Linux, where it arbitrarily kills off existing processes, but if maxproc is hit, why continue to try and start up new ones?

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
Re: Fatal trap 12: page fault while in kernel mode
Working on upgrading and applying patch right now ... thanks ...

--On Sunday, January 07, 2007 14:03:41 +0000 Robert Watson [EMAIL PROTECTED] wrote:

> On Sat, 6 Jan 2007, Marc G. Fournier wrote:
>
>> Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if there is information that I can provide out of it ...
>>
>> Fatal trap 12: page fault while in kernel mode
>> cpuid = 0; apic id = 00
>> fault virtual address = 0x18c
>> fault code = supervisor read, page not present
>> instruction pointer = 0x8:0x801f9053
>> stack pointer = 0x10:0xb5c78b30
>> frame pointer = 0x10:0xb5c78b60
>> code segment = base 0x0, limit 0xf, type 0x1b
>> = DPL 0, pres 1, long 1, def32 0, gran 1
>> processor eflags = resume, IOPL = 0
>> current process = 5 (thread taskq)
>> trap number = 12
>> panic: page fault
>> cpuid = 0
>> Uptime: 8d22h25m40s
>>
>> (kgdb) where
>> #0 doadump () at pcpu.h:172
>> #1 0x80203955 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
>> #2 0x80204065 in panic (fmt=0xff019b667720 X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ) at /usr/src/sys/kern/kern_shutdown.c:565
>> #3 0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at /usr/src/sys/amd64/amd64/trap.c:660
>> #4 0x80328cd8 in trap (frame= {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730, tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx = -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0, tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12, tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085, tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at /usr/src/sys/amd64/amd64/trap.c:238
>> #5 0x80313c6b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
>> #6 0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0, tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102 out of bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
>> #7 0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at /usr/src/sys/kern/uipc_usrreq.c:1714
>> #8 0x8022c314 in taskqueue_run (queue=0xff844800) at /usr/src/sys/kern/subr_taskqueue.c:257
>> #9 0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at /usr/src/sys/kern/subr_taskqueue.c:376
>> #10 0x801e7b76 in fork_exit (callout=0x8022d060 taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at /usr/src/sys/kern/kern_fork.c:821
>> #11 0x80313fce in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:394
>
> This is a NULL pointer dereference in the UNIX domain socket code. John Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT, with an MFC planned in the near future. The fix won't make 6.2-RELEASE, but assuming it tests out well over the next few weeks, we will cut an errata patch/announcement for it. I believe you can pull down his 6-STABLE version at:
>
> http://people.FreeBSD.org/~jhb/patches/unp_gc.patch
>
> This same patch is currently in testing on mx1.FreeBSD.org.
>
> (John CC'd)
>
> Robert N M Watson
> Computer Laboratory
> University of Cambridge

Marc G. Fournier
Hub.Org Networking Services (http://www.hub.org)
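The NULL-dereference reading follows from the fault address itself: 0x18c is 396 in decimal, which matches `tf_addr = 396` in the trap frame, and an address that small is what a field access through a NULL structure pointer would produce. A small sketch of that check; the function name `near_null` is invented here, and the 4096-byte page size is an assumption that fits amd64.

```shell
#!/bin/sh
# Hypothetical helper: report whether a fault address lies in the first page,
# which usually means a field was read through a NULL structure pointer.
# The 4096-byte page size is an assumption that fits amd64.
near_null() {
    addr=$(( $1 ))
    if [ "$addr" -ge 0 ] && [ "$addr" -lt 4096 ]; then
        echo "$1 = $addr: offset from a NULL base"
    else
        echo "$1 = $addr: not in the first page"
    fi
}

near_null 0x18c
```

For the address in this panic the sketch prints `0x18c = 396: offset from a NULL base`.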
Re: Fatal trap 12: page fault while in kernel mode
I'm up and running on the patch now as well ...

--On Sunday, January 07, 2007 17:02:40 -0800 Kevin Oberman [EMAIL PROTECTED] wrote:

>> Date: Sun, 7 Jan 2007 14:03:41 + (GMT)
>> From: Robert Watson [EMAIL PROTECTED]
>> Sender: [EMAIL PROTECTED]
>>
>> On Sat, 6 Jan 2007, Marc G. Fournier wrote:
>>
>>> Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if there is information that I can provide out of it ...
>>>
>>> Fatal trap 12: page fault while in kernel mode
>>> cpuid = 0; apic id = 00
>>> fault virtual address   = 0x18c
>>> fault code              = supervisor read, page not present
>>> instruction pointer     = 0x8:0x801f9053
>>> stack pointer           = 0x10:0xb5c78b30
>>> frame pointer           = 0x10:0xb5c78b60
>>> code segment            = base 0x0, limit 0xf, type 0x1b
>>>                         = DPL 0, pres 1, long 1, def32 0, gran 1
>>> processor eflags        = resume, IOPL = 0
>>> current process         = 5 (thread taskq)
>>> trap number             = 12
>>> panic: page fault
>>> cpuid = 0
>>> Uptime: 8d22h25m40s
>>>
>>> (kgdb) where
>>> #0  doadump () at pcpu.h:172
>>> #1  0x80203955 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
>>> #2  0x80204065 in panic (fmt=0xff019b667720 X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ) at /usr/src/sys/kern/kern_shutdown.c:565
>>> #3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at /usr/src/sys/amd64/amd64/trap.c:660
>>> #4  0x80328cd8 in trap (frame= {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730, tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx = -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0, tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12, tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085, tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at /usr/src/sys/amd64/amd64/trap.c:238
>>> #5  0x80313c6b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
>>> #6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0, tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102 out of bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
>>> #7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at /usr/src/sys/kern/uipc_usrreq.c:1714
>>> #8  0x8022c314 in taskqueue_run (queue=0xff844800) at /usr/src/sys/kern/subr_taskqueue.c:257
>>> #9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at /usr/src/sys/kern/subr_taskqueue.c:376
>>> #10 0x801e7b76 in fork_exit (callout=0x8022d060 taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at /usr/src/sys/kern/kern_fork.c:821
>>> #11 0x80313fce in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:394
>>
>> This is a NULL pointer dereference in the UNIX domain socket code. John Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT, with an MFC planned in the near future. The fix won't make 6.2-RELEASE, but assuming it tests out well over the next few weeks, we will cut an errata patch/announcement for it. I believe you can pull down his 6-STABLE version at:
>>
>> http://people.FreeBSD.org/~jhb/patches/unp_gc.patch
>>
>> This same patch is currently in testing on mx1.FreeBSD.org. (John CC'd)
>>
>> Robert N M Watson
>> Computer Laboratory
>> University of Cambridge
>
> I have installed this on my system, but the panics have always been very erratic, so it may be a while before I am sure whether this fixes it. At the moment the system has been up for 7 days, although I have had multiple crashes in a single day.
> --
> R. Kevin Oberman, Network Engineer
> Energy Sciences Network (ESnet)
> Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
> E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634
> Key fingerprint: 059B 2DDF 031C 9BA3 14A4 EADA 927D EBB3 987B 3751

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]                              MSN . [EMAIL PROTECTED]
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
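The numbers in the panic message and the kgdb trap frame above can be cross-checked with a little arithmetic. The sketch below is only an illustration: the helper name `to_u64` is made up here, and the values are copied from the quoted trace. It assumes kgdb printed the trap-frame registers as signed 64-bit decimals and that the archive has dropped leading hex digits from some addresses.

```python
# Values copied from the quoted panic message and kgdb trap frame.
MASK64 = (1 << 64) - 1


def to_u64(value: int) -> int:
    """Reinterpret a signed 64-bit decimal (as kgdb prints it) as unsigned."""
    return value & MASK64


fault_virtual_address = 0x18C      # "fault virtual address" in the panic message
tf_addr = 396                      # tf_addr in frame #4
tf_rip = -2145415085               # tf_rip in frame #4
tf_rsi = -1092609476832            # tf_rsi in frame #4
tid_frame6 = 18446742981100074784  # tid argument shown in frame #6

# The faulting address is the same value in both places: 0x18c == 396.
print(fault_virtual_address == tf_addr)

# The signed tf_rip recovers the full instruction pointer; its low 32 bits
# match the 0x801f9053 shown in the panic message.
print(hex(to_u64(tf_rip)))

# tf_rsi, read as unsigned, is the same number kgdb shows as tid in frame #6.
print(hex(to_u64(tf_rsi)))
print(to_u64(tf_rsi) == tid_frame6)
```

A fault address only 396 bytes above zero is consistent with the diagnosis quoted above: a read through a NULL pointer at a small structure offset, rather than a wild pointer into unrelated memory.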
Fatal trap 12: page fault while in kernel mode
Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if there is information that I can provide out of it ...

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18c
fault code              = supervisor read, page not present
instruction pointer     = 0x8:0x801f9053
stack pointer           = 0x10:0xb5c78b30
frame pointer           = 0x10:0xb5c78b60
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = resume, IOPL = 0
current process         = 5 (thread taskq)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 8d22h25m40s

(kgdb) where
#0  doadump () at pcpu.h:172
#1  0x80203955 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:409
#2  0x80204065 in panic (fmt=0xff019b667720 X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ) at /usr/src/sys/kern/kern_shutdown.c:565
#3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at /usr/src/sys/amd64/amd64/trap.c:660
#4  0x80328cd8 in trap (frame= {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730, tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx = -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0, tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12, tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085, tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at /usr/src/sys/amd64/amd64/trap.c:238
#5  0x80313c6b in calltrap () at /usr/src/sys/amd64/amd64/exception.S:168
#6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0, tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102 out of bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
#7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at /usr/src/sys/kern/uipc_usrreq.c:1714
#8  0x8022c314 in taskqueue_run (queue=0xff844800) at /usr/src/sys/kern/subr_taskqueue.c:257
#9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at /usr/src/sys/kern/subr_taskqueue.c:376
#10 0x801e7b76 in fork_exit (callout=0x8022d060 taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at /usr/src/sys/kern/kern_fork.c:821
#11 0x80313fce in fork_trampoline () at /usr/src/sys/amd64/amd64/exception.S:394
#12 0x in ?? ()
#13 0x in ?? ()
#14 0x0001 in ?? ()
#15 0x in ?? ()
#16 0x in ?? ()
#17 0x in ?? ()
#18 0x in ?? ()
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x in ?? ()
#39 0x in ?? ()
#40 0x in ?? ()
#41 0x in ?? ()
#42 0x in ?? ()
#43 0x in ?? ()
#44 0x006bc000 in ?? ()
#45 0x805054c0 in turnstile_chains ()
#46 0x0001 in ?? ()
#47 0xff019b669358 in ?? ()
#48 0xff008d5bc720 in ?? ()
#49 0xb5c78aa0 in ?? ()
#50 0xb5c78a78 in ?? ()
#51 0xff019b667720 in ?? ()
#52 0x8021a69f in sched_switch (td=0x805030d0, newtd=0x8022d060, flags=0) at /usr/src/sys/kern/sched_4bsd.c:973
Previous frame inner to this frame (corrupt stack?)

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]                              MSN . [EMAIL PROTECTED]
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664
Re: Sleepy thread - Kernel Panic
Yours makes the third report of this that I know of ... one of us is running 6.2-RC, one 6.1-RELEASE ... what version are you running? I get the same 'hang' also ...

Have you enabled DDB in your kernel? Also, have you enabled the dumpdev settings in /etc/rc.conf?

--On Thursday, December 28, 2006 17:27:38 +0545 Tek Bahadur Limbu [EMAIL PROTECTED] wrote:

> Dear All,
>
> I need some help with the problem below. The following error occurs on my FreeBSD 6.1 (Dell 420) server:
>
> Sleeping thread (tid 540242, pid 32378) owns a non-sleepable lock
> panic: sleeping thread
> Cannot dump. No dump device defined.
> Automatic reboot in 15 seconds - press a key on the console to abort.
> Rebooting
>
> However, it does not reboot and simply hangs.
>
> I have tried commenting out the options PROCFS line, which seemed to work for 2 days. However, on the 3rd day the same problem surfaced again.
>
> I think it is probably a hardware problem. Does anybody have any ideas regarding this problem?
>
> --
> With best regards and good wishes,
>
> Yours sincerely,
>
> Tek Bahadur Limbu
> (TAG/TDG Group)
> Jwl Systems Department
> Worldlink Communications Pvt. Ltd.
> Jawalakhel, Nepal

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]                              MSN . [EMAIL PROTECTED]
Yahoo . yscrappy               Skype: hub.org        ICQ . 7615664