from:"Marc G. Fournier"

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Marc G. Fournier


On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:

 
> Btw Marc, if you just want this problem to go away, I suspect getting rid
> of the intr mount option would do that.

Am more interested in fixing the problem (if possible) than just masking it,
but ...

Based on the man page for mount_nfs, wouldn't that have the opposite effect:

     intr    Make the mount interruptible, which implies that file
             system calls that are delayed due to an unresponsive
             server will fail with EINTR when a termination signal is
             posted for the process.

I may be mis-reading, but from the above it sounds like a -9 *should* terminate
the process if intr is enabled, while with it disabled, it would ignore it … ?


___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to freebsd-stable-unsubscr...@freebsd.org

Re: 9-STABLE - NFS - NetAPP:

2013-02-14 Thread Marc G. Fournier


On 2013-02-14, at 16:24 , Rick Macklem rmack...@uoguelph.ca wrote:

> Marc Fournier wrote:
>> On 2013-02-14, at 08:41 , Rick Macklem rmack...@uoguelph.ca wrote:
>>
>>> Btw Marc, if you just want this problem to go away, I suspect
>>> getting rid of the intr mount option would do that.
>>
>> Am more interested in fixing the problem (if possible) than just
>> masking it, but ...
>>
>> Based on the man page for mount_nfs, wouldn't that have the opposite
>> effect:
>>
>>     intr    Make the mount interruptible, which implies that file
>>             system calls that are delayed due to an unresponsive
>>             server will fail with EINTR when a termination signal is
>>             posted for the process.
>>
>> I may be mis-reading, but from the above it sounds like a -9 *should*
>> terminate the process if intr is enabled, while with it disabled, it
>> would ignore it … ?
>
> Yes, you have misread it (or English is a wonderfully ambiguous thing,
> if you prefer ;-).
>
> For hard mounts (which is what you get if you don't specify either soft
> or intr), the RPCs behave like other I/O subsystems, which means they
> do non-interruptible sleeps (D stat in ps) waiting for server replies
> and continue to try and complete the RPC forever. You can't kill off
> the process/thread with any signal.
>
> If umount -f of the filesystem works, that terminates the thread(s).
> Unfortunately, umount -f is quite broken again. I have an idea on
> how to resolve this, but I haven't coded it yet. (The problem is that
> the process doing umount -f gets stuck before it does the VFS_UNMOUNT(),
> so the NFS client doesn't see it.)

Given how infrequently this problem manifests itself, is there an overall
benefit, from a debugging standpoint, to my leaving intr on and reporting
when it happens (including procstat output), and then upgrading to the
latest kernel … ?

It's an annoyance, but it isn't like it happens daily, so I don't mind going
through the process *towards* having it fixed if there is an overall benefit …



Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc G. Fournier


On 2013-02-13, at 14:50 , Rick Macklem rmack...@uoguelph.ca wrote:

> He does get the odd error reported by nfs_getpages() and I don't
> think we've isolated why yet. The error is 13 (EACCES), but jhb@
> thought it might be because of the bug he fixed where the krpc
> reported EACCES for the EINTR case. I don't think we've heard
> back from Marc w.r.t. whether he has gotten any more of these
> errors logged since applying jhb@'s patch and whether or not
> the errno has changed to EINTR?

As mentioned previously, it doesn't happen all that often … this latest one was
after 21 days of uptime (or so) … I just upgraded the kernel on that machine to
take into consideration changes to nfs *since* the last upgrade, so it might be
another 20-30 days before it happens again *if* that last patch didn't fix it …

I have several servers that do have fully operational remote consoles though …
to save time if/when it happens next, what all do I need to run?

    ps auxlH
    procstat -kk pid (for which process? … all part of that group, or just one
        of the apparently hung processes?)
    sysctl debug.kdb.break_to_debugger=1 (shell)
    ctl-alt-esc (from console)

Now, is there a way of forcing it to dump core so that I can run the
various commands from a shell *after* it's rebooted?  Not particularly easy to
redirect console output to a file (or is it?), so anything that scrolls off the
screen is pretty much lost … I'm using a DRAC card in most cases, no serial
consoles or anything like that that I can run within a script session … a 'ps'
listing is 500 lines long, just to give an idea ...
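
One way to avoid losing output that scrolls off the console is to write it to local disk from a script instead. The following is only a sketch under assumptions that are not from this thread: the function names stuck_pids and collect_hang_info and the default output directory are made up, and "apparently hung" is taken to mean any process whose state starts with D (uninterruptible sleep).

```sh
#!/bin/sh
# Hypothetical helper: read "pid state" pairs on stdin and print the PID of
# every process whose state starts with "D" (uninterruptible sleep).
stuck_pids() {
    awk '$2 ~ /^D/ { print $1 }'
}

# Hypothetical helper: save the full ps listing plus procstat -kk for each
# stuck process. The directory should be on local disk, not on the NFS
# mount that may be hung.
collect_hang_info() {
    outdir=${1:-/var/tmp/nfs-hang}
    mkdir -p "$outdir" || return 1
    ps auxlH > "$outdir/ps.out"
    for pid in $(ps -ax -o pid= -o state= | stuck_pids); do
        procstat -kk "$pid" > "$outdir/procstat.$pid.out" 2>&1
    done
}
```

Running collect_hang_info from a root shell (or from cron) leaves the files behind for later inspection. For a full crash dump, a dump device has to be configured before the hang, for example with dumpdev="AUTO" in /etc/rc.conf; after a panic, savecore(8) normally writes the dump under /var/crash on the next boot.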



Re: 9-STABLE - NFS - NetAPP:

2013-02-13 Thread Marc G. Fournier


On 2013-02-13, at 15:16 , Konstantin Belousov kostik...@gmail.com wrote:

> On Wed, Feb 13, 2013 at 05:50:13PM -0500, Rick Macklem wrote:
>> I got it resent from him. I've attached it to this post, just in case you
>> are interested in taking a look at it.
>
> I do not see the voffset wchains surprising. All of them seems to occur
> in the multithreading process.  The usual reason for the voffset blocking
> is the use of the same file (as in struct file *) to perform operations
> from several threads in parallel.  One thread locked the file offset by
> using read() or write(), and sleeping waiting for the vnode locked.
> All other threads performing read or write on the same file, e.g. by
> using the same file descriptor, are locked on the file offset before
> even trying to lock the vnode.
>
> What I see interesting in the output you mailed, is the pid 93636. Note
> that several its threads are in the 'T' state. It means stopped, while
> other threads obviously do file i/o due to vofflock state. I wonder if
> some stopped thread owns nfs vnode lock. It could be some omission in the
> handling of PBDRY/TDF_BDRY, or other bug.
>
> It is absolutely impossible to say anything definitive without proper
> diagnostic.  At least the procstat -kk is needed.

I had sent out the output of procstat -kk at the time … for next time, would
you need procstat against all of the 'duplicate processes' that aren't
killable?  For instance, in this case, there were three du commands running
doing the same thing, none of which were killable … so procstat -kk for all
three of those?
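
If output for every such process is wanted, the PIDs can be fed through a small loop rather than typed by hand. This is a sketch only; the helper name procstat_cmds is made up, and it merely prints the command lines so they can be checked before being run.

```sh
#!/bin/sh
# Hypothetical helper: read one PID per line on stdin and print the
# matching "procstat -kk PID" command line. Anything that is not a plain
# number is skipped.
procstat_cmds() {
    while read -r pid; do
        case $pid in
            ''|*[!0-9]*) continue ;;
        esac
        printf 'procstat -kk %s\n' "$pid"
    done
}
```

For the three du processes this could be used as `pgrep -x du | procstat_cmds | sh > du-procstat.out 2>&1`, assuming pgrep is available on the machine.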




Server lock up: kern.maxswzone relate ...

2009-06-10 Thread Marc G. Fournier



I'm running a couple of brand new servers ... 32G of RAM, very little load 
on it right now, and this morning it locked up with that 'kern.maxswzone' 
error on the console ...


The server is running a reasonably current 7.2-STABLE:

FreeBSD pluto.hub.org 7.2-STABLE FreeBSD 7.2-STABLE #0: Sun May 31 14:48:04 ADT


And top right now, with everything running, shows no swapping, 19G of
Free memory, 9G of Inact memory ... no reason to do any serious amount of
swapping.


last pid: 32159;  load averages:  0.12,  0.21,  0.47    up 0+10:57:56  11:53:39
573 processes: 1 running, 571 sleeping, 1 zombie
CPU:  2.0% user,  0.0% nice,  1.2% system,  0.0% interrupt, 96.8% idle
Mem: 1331M Active, 9446M Inact, 659M Wired, 35M Cache, 399M Buf, 19G Free
Swap: 32G Total, 32G Free

In fact, my other server (same config) has been up 9 days (they were put
online 9 days ago), and top shows it doing a little bit of swapping, but,
again, huge amounts of Inact memory:


last pid: 26307;  load averages:  0.36,  0.35,  0.36    up 9+17:03:48  11:57:54

680 processes: 2 running, 657 sleeping, 21 zombie
CPU:  0.7% user,  0.0% nice,  0.4% system,  0.0% interrupt, 98.9% idle
Mem: 2915M Active, 25G Inact, 778M Wired, 13M Cache, 399M Buf, 1771M Free
Swap: 32G Total, 1044K Used, 32G Free

So these servers right now are definitely not feeling any pain ...

And, based on experiences with another server, I have my /boot/loader.conf 
set to:


kern.maxswzone=67108864

So, the question is ... what am I missing?  Is there some magical formula
for calculating maxswzone that 7.2 is missing?  Some nagios plug-in I
should be using to monitor ... what?


Help?


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . scra...@hub.org  MSN . scra...@hub.org
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier



Don't know if this helps with anything, but it just hung after 2 days again
... nothing on the console ... top process running at the time shows the
following ... anything there look concerning?


last pid:  5196;  load averages:  9.25, 15.97, 10.07    up 2+07:58:36  04:02:28
1874 processes: 317 running, 1537 sleeping, 20 zombie
CPU:  6.2% user,  0.0% nice,  6.7% system,  0.3% interrupt, 86.8% idle
Mem: 4552M Active, 162M Inact, 684M Wired, 46M Cache, 399M Buf, 8240K Free
Swap: 8192M Total, 1308M Used, 6884M Free, 15% Inuse, 1360K In, 63M Out

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
28752 root        5  96    0   427M   408M select 1   1:55  0.00% named
 9720 nobody     19  97    0   402M   186M RUN    1   0:00  0.69% nsd
54395 root       16  20    0  1308M   163M kserel 0   0:00  0.00% java
 8500 nobody     10 102    0   193M 86492K ucond  1   0:07  0.00% nsd
 3302 102         1  96    0   158M 66100K select 1   0:37  0.00% postgres
 7853 1304        1  96    0   154M 54408K select 1   0:39  0.00% postgres
10670 88         28  20    0   335M 42488K kserel 0   0:00  0.44% mysqld
 4976 root        5   4    0 95444K 41740K kqread 1   1:09  0.00% named
14003 www        44  96    0   443M 41632K ucond  1   0:00  0.00% java
 8528 nobody     15  96    0   188M 37904K ucond  1   0:00  0.00% nsd
 5157 88        109  96    0 97620K 33704K RUN    0   0:00  0.00% mysqld
 1759 www         1   4    0   167M 32276K select 1   0:01  0.00% httpd
99407 www         1   4    0   165M 31712K sbwait 0   0:02  0.00% httpd
 4006 www         1   4    0   124M 31424K sbwait 1   0:01  0.29% httpd
 1299 www         1   4    0   164M 31376K sbwait 1   0:02  0.00% httpd
 1758 www         1   4    0   164M 31176K sbwait 0   0:02  0.00% httpd
99402 www         1  96    0   163M 29892K CPU1   1   0:03  0.00% httpd
 4036 www         1  20    0   122M 28680K lockf  1   0:00  0.00% httpd
 1757 www         1   4    0   158M 27856K sbwait 1   0:02  0.00% httpd
 3899 www         1  96    0   160M 27688K RUN    0   0:00  0.00% httpd
 4007 www         1  20    0   125M 27588K lockf  0   0:01  2.10% httpd
 4525 www         1  96    0   158M 26624K RUN    1   0:00  0.00% httpd
 4607 www         1  96    0   158M 26096K RUN    0   0:00  0.00% httpd
13635 88         34  96    0 92340K 25604K CPU0   0   0:00  0.05% mysqld
 4024 www         1  96    0   156M 24880K RUN    1   0:00  0.10% httpd
 3585 102         1   4    0   163M 24748K sbwait 1   2:56  0.00% postgres
 3951 www         1  96    0   155M 24548K RUN    1   0:00  0.10% httpd
 4022 www         1  96    0   155M 24320K RUN    0   0:00  0.00% httpd
 3960 www         1  96    0   155M 24316K RUN    1   0:00  0.00% httpd
 3388 102         1   4    0   161M 24228K sbwait 0   1:07  0.00% postgres
 4023 www         1  96    0   155M 23988K RUN    1   0:00  0.00% httpd
99468 www         1  96    0   104M 23660K RUN    1   0:03  0.00% httpd
99423 www         1   4    0   154M 23456K sbwait 0   0:03  0.00% httpd
 3959 www         1  -4    0   103M 23144K devfs  0   0:00  0.00% httpd
 5004 www         1   4    0   154M 23032K sbwait 1   0:00  0.00% httpd
62771 www         1 -16    0   143M 22824K vnread 1   0:01  0.00% httpd
 4612 www         1  96    0   153M 21936K RUN    1   0:00  0.15% httpd
 4609 www         1  96    0   153M 21936K RUN    0   0:00  0.05% httpd
 5180 www         1  96    0   145M 21660K RUN    0   0:12  0.00% httpd
 5007 www         1   4    0   115M 21360K sbwait 0   0:00  0.29% httpd
57327 www         1  -8    0   145M 20996K biord  0   0:04  0.20% httpd
29064 www         1  -8    0   143M 20812K biord  1   0:04  0.00% httpd
99381 www         1  96    0   151M 19364K RUN    1   0:04  0.00% httpd
 4682 root        1   4    0 62388K 17828K kqread 1   0:00  0.00% perl
 9447 88          8  20    0 61388K 17508K kserel 0   0:00  0.05% mysqld
13457 bind        5  96    0 45724K 17424K RUN    0   0:14  0.00% named
87535 www         1   4    0   149M 17396K sbwait 1   0:09  0.00% httpd
 4611 www         1   4    0   146M 17008K sbwait 1   0:00  0.00% httpd
 3386 102         1  -4    0   163M 16544K semwai 0   0:51  0.00% postgres
91929 www         1   4    0   113M 16196K sbwait 0   0:04  0.00% httpd
 4757 www         1  96    0   145M 16144K RUN    0   0:00  0.00% httpd
10269 88          5  20    0 57504K 16000K kserel 0   0:00  0.00% mysqld
 3946 www         1   4    0   126M 15552K sbwait 1   0:01 15.00% httpd
 3619 www         1   4    0   113M 15172K sbwait 1   0:00  0.00% httpd
 3385 102         1  96    0   163M 14932K RUN    1   0:50  0.00% postgres
28755 102         1   4    0   159M 14760K sbwait 0  31:36  0.35% postgres



Re: More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier


On Wed, 13 May 2009, John Baldwin wrote:


> On Wednesday 13 May 2009 3:09:33 am Marc G. Fournier wrote:
>> Don't know if this helps with anything, but it just hung after 2days again
>> ... nothing on the console ... top process running at the time shows the
>> following ... anything there look concerning?
>
> Is this a 2 CPU system?  If so, both CPUs are actually running something, so
> it is not a deadlock per se.


Yes:

CPU: Intel(R) Xeon(TM) CPU 3.40GHz (3400.14-MHz K8-class CPU)
  Origin = GenuineIntel  Id = 0xf43  Stepping = 3

  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x649d<SSE3,RSVD2,MON,DS_CPL,EST,CNXT-ID,CX16,xTPR>
  AMD Features=0x2800<SYSCALL,LM>
  Logical CPUs per core: 2
usable memory = 6368911360 (6073 MB)
avail memory  = 6141906944 (5857 MB)
ACPI APIC Table: <HP 0083>
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6
ioapic1: Changing APIC ID to 9



Re: More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier


On Wed, 13 May 2009, Mike Tancsa wrote:



> What does your kernel config look like ?


Included below ... only thought I had, that I haven't tried yet, was
changing from SCHED_4BSD -> SCHED_ULE ...




machine         amd64
cpu             HAMMER
ident           kernel

options         SMP

options         SCHED_4BSD              # 4BSD scheduler
options         PREEMPTION              # Enable kernel thread preemption
options         INET                    # InterNETworking
options         FFS                     # Berkeley Fast Filesystem
options         SOFTUPDATES
options         UFS_ACL                 # Support for access control lists
options         UFS_DIRHASH             # Improve performance on big directories
options         PROCFS                  # Process filesystem (requires PSEUDOFS)
options         PSEUDOFS                # Pseudo-filesystem framework
options         COMPAT_43               # Needed by COMPAT_LINUX32
options         COMPAT_IA32             # Compatible with i386 binaries
options         COMPAT_FREEBSD4         # Compatible with FreeBSD4
options         COMPAT_FREEBSD6         # Compatible with FreeBSD6
options         COMPAT_LINUX32          # Compatible with i386 linux binaries
options         SCSI_DELAY=5000         # Delay (in ms) before probing SCSI

options         KTRACE                  # ktrace(1) support

options         SYSVSHM
options         SHMMAXPGS=199608
options         SHMMAX=(SHMMAXPGS*PAGE_SIZE+1)

options         SYSVSEM
options         SEMMNI=4096
options         SEMMNS=8192

options         SYSVMSG                 # SYSV-style message queues

options         _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options         KBD_INSTALL_CDEV        # install a CDEV entry in /dev

options         ADAPTIVE_GIANT          # Giant mutex is adaptive.

options         LINPROCFS               # Cannot be a module yet.

# Bus support.
device          acpi
device          pci

# Serial (COM) ports
device          sio                     # 8250, 16[45]50 based serial ports

device          scbus                   # SCSI bus (required for SCSI)
device          da                      # Direct Access (disks)
device          pass                    # Passthrough device (direct SCSI access)
device          ses                     # SCSI Environmental Services (and SAF-TE)

device          ciss                    # Compaq Smart RAID 5*

device          atkbdc                  # AT keyboard controller
device          atkbd                   # AT keyboard
device          psm                     # PS/2 mouse

device          vga                     # VGA video card driver

device          splash                  # Splash screen and screen saver support

device          sc

device          agp                     # support several AGP chipsets

device          miibus                  # MII bus support
device          bge                     # Broadcom BCM570xx Gigabit Ethernet

device          loop                    # Network loopback
device          random                  # Entropy device
device          ether                   # Ethernet support
device          pty                     # Pseudo-ttys (telnet etc)

device          bpf                     # Berkeley packet filter

options         ALT_BREAK_TO_DEBUGGER
options         KDB
options         DDB



Re: More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier


On Wed, 13 May 2009, John Baldwin wrote:


> On Wednesday 13 May 2009 3:09:33 am Marc G. Fournier wrote:
>> Don't know if this helps with anything, but it just hung after 2days again
>> ... nothing on the console ... top process running at the time shows the
>> following ... anything there look concerning?
>
> Is this a 2 CPU system?  If so, both CPUs are actually running something, so
> it is not a deadlock per se.


99402 www         1  96    0   163M 29892K CPU1   1   0:03  0.00% httpd
13635 88         34  96    0 92340K 25604K CPU0   0   0:00  0.05% mysqld


Here is what vmstat shows ~10 minutes before (or as) it hung solid last 
time.  I didn't think to save the one that ran just before this one (the 
script runs every 5 minutes), but for the 'r b w' columns 'b' was around 
10ish, while 'w' was 0 ... within a 5 minute period of time, 'w' 
literally skyrockets:


 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
107 266 122 16155620   23084  3255  22   1   2  3358 1605   0   0  377 17835 5231 19  7 73
 6 285 382 16446348   22532 111705 21155 1391 10049 51966 2187328 143   0 36344 499098 423971  3  2 95
 0 73 386 16440468   23072  7052 1155  85  44  1292  73 372   0 1030 18631 8334 18 12 70
 0 77 388 16440468   23088   126 1050   0   621  27 169   0  521 4186 4125  2  3 94
 0 66 389 16440468   23104 4 713   0  1344  58 227   0  352 2217 3504  0  5 95



--
John Baldwin

Re: More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier


On Wed, 13 May 2009, John Baldwin wrote:


> Well, you had a whole lot of page faults and other VM activity, plus 500k
> syscalls.  The 'w' is a count of swapped processes, so basically your box is
> swapping a whole lot it seems.  I think your box is just overloaded.


I knew I was going to regret posting that :(

What I posted was what vmstat 5 shows after the issue *starts*, not what 
it normally looks like ... right now, after 10 hours of uptime, and all 
the same processes running, it looks like:


io# vmstat 5 (10 hours uptime now)
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
 0 1 0  10477M   301M  3503  13   1   2  3620 286   0   0  331 45491 4566 26  8 66
 0 1 0  10430M   305M   278   7   0   0   550   0  18   0  186 19243 2917 4  3 93
 1 1 0  10474M   295M   511   0   0   0   359   0  91   0  253 11632 3516 7  3 90
 0 1 0  10447M   310M   819   3   0   0  1473   0  14   0  143 29575 2486 8  3 89
 0 1 0  10558M   295M  5008  18  13   5  4128   0 121   0  345 24212 4215 16  7 77

Right now, IO is running ~775 processes ... at the time of the vmstat I
provided earlier, it was up to 1400 processes ... since there are only 5
minutes between script runs, something is causing it to go from zero swap
-> high swap within a very short period of time, but since things get
badly locked up when it happens, I can't isolate where ...
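
To narrow down when the jump happens, the vmstat samples can be filtered for a high 'w' column instead of being compared by eye. This is a rough sketch only: the function name swapped_samples and the default threshold of 50 are made up for illustration, and it assumes 'w' is the third field, as in the listings in this thread.

```sh
#!/bin/sh
# Hypothetical helper: print every vmstat sample whose 'w' column (third
# field) is at or above a threshold. Header lines are skipped because
# their first field is not a number.
swapped_samples() {
    threshold=${1:-50}
    awk -v t="$threshold" '$1 ~ /^[0-9]+$/ && $3 + 0 >= t + 0 { print }'
}
```

It could be run over a saved file, e.g. `swapped_samples 100 < vmstat.out`, or fed from a vmstat with a shorter interval than the 5 minute script so that the first bad sample lands closer to the start of the problem.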


I've got the following two ps outputs at the time of the high paging:

/bin/ps -aucxHl -O jid > ps-long.out
/bin/ps -aux -O jid > ps-short.out

Is there anything in there that I could look at as far as what is putting 
things over the edge?




As to the 'overloaded server', here is another server, with more running 
on it, but exact same configuration:


neptune# vmstat 5 (3 days, 18 hours uptime now)
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
 0 0 0  12521M   303M  3969  15   5   3  2271 1603   0   0  444 6491 5165 37 19 44
 0 0 0  12464M   309M  3009   1   0  15  2833   0 104   0  296 9378 3689  7  5 88
23 0 0  12476M   297M  3845   3   0   0  2627   0  31   0  279 10545 2986 14  5 81
 0 1 0  12530M   266M  5259   0   1   0  2551   0 145   0  432 18070 4133 45  8 47
 1 0 0  12587M   237M  7049   0   1   0  4484   0 171   0  357 15953 4715 29  7 64

So, normally these servers purr ... and are highly responsive ...

In fact, here is an older 32bit server, with less RAM, running about 50% more
processes than neptune:


mercury# vmstat 5
 procs  memory  pagedisks faults cpu
 r b w avmfre  flt  re  pi  po  fr  sr da0 pa0   in   sy  cs us sy id
 3 14 1   6817M   114M  641   7   3   1 1036 386   0   0 1109  464 157  5  5 90
 0 8 0   6817M   224M  596  33   0   5 5667 3850  86   0 1303 5768 3885  6 7 87
 1 10 0   6824M   220M 4332  32   2   0 3228   0  17   0  755 9689 3057  8 7 85
 0 9 0   6798M   219M  430   0   0   0 712   0  12   0 1274 4276 3877  2  2 95
 0 11 0   6830M   205M 1026   4   1   3 481   0  84   0 1503 5586 4370  6 4 89





Re: More data on 7.2-RELEASE hangs

2009-05-13 Thread Marc G. Fournier


On Wed, 13 May 2009, Steven Hartland wrote:

> We've seen things similar to this when an uncommon process does
> a query which locks a table for a large amount of time on mysql.


So many reasons why I hate MySQL :(

One thing that we are trying right now is actually along these lines ... 
we've been working with MySQL 5.1 + NDBD for clustering ... after the last 
hang, we disabled both the NDBD startup, and mysql, to see if that is the 
cause, so nice to have some validation on this one ...



> In our example this turned out to be an admin query in vbulletin. When
> it happened it turned a machine which was purring along quite nicely
> into a totally unresponsive machine in a matter of a few seconds as
> apache spawned more process that also then instantly stalled...


Let me check that the next time around ... compare the specific # of http 
processes between monitor runs and see if there is a 'sudden jump' ...
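
A quick way to make that comparison from two saved ps listings is sketched below. This is only an illustration: the function names count_procs and proc_jump are made up, and it counts every line that mentions the command name as a whole word, which is cruder than a real per-process match.

```sh
#!/bin/sh
# Hypothetical helper: count the lines in a saved ps listing that mention
# a command name as a whole word. "|| :" keeps the exit status at zero
# when the count is 0.
count_procs() {
    grep -c -w -- "$1" "$2" || :
}

# Hypothetical helper: print "<older count> <newer count> <difference>"
# for two snapshots of the same machine.
proc_jump() {
    old=$(count_procs "$1" "$2")
    new=$(count_procs "$1" "$3")
    echo "$old $new $((new - old))"
}
```

Called as `proc_jump httpd older/ps.out newer/ps.out` (example paths), a large third number between two consecutive monitor runs would point at the kind of sudden jump described above.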


We'll see how the next 'test period' works out, with that MySQL stuff 
offline ... the other thing I've been working on is moving jails off of 
that server, one at a time, to see if I can narrow down which one is 
causing the spike ... I will focus on the mysql backend ones going 
forward, to eliminate those ...


Thx ...

 

Debugging server hangs in 7.2-RELEASE

2009-05-10 Thread Marc G. Fournier



I am so completely running out of ideas on how to debug this, maybe 
someone else has some ideas?


The problem appears to be that very suddenly, the disk busy (according to 
vmstat) skyrockets to 100 (from 0) and then the 'runnable but swapped' 
column slowly rises ...


One person suggested that they saw something similar when msi/msi-x was
enabled ... after searching the source code, I found that msi was used in
the bge driver, but I couldn't find msix used anywhere else on that
machine, so I disabled msi ... it's still exhibiting the issue ...


I get no errors on the serial console to indicate any problems, and until
a relatively recent upgrade of the kernel (I can't give an exact date),
this server was one of my most solid ...


I figure there is a single process that is starting up on the machine that 
is causing this, but no matter what I try, it is eluding me.


I have KDB enabled in the kernel, and the serial console setup so that I 
can break to it ... but when this problem happens, doing 'cr ~ ^b' through 
the serial console doesn't do anything, or, it just prints the message 
about breaking to the debugger and then hangs there ...


My next option is to start time travelling backwards to see if I can find 
a 'stable kernel' again, but if it is just one process causing this, then 
going back to older kernels isn't necessarily going to accomplish anything 
...


Is there something else I can do here to debug this?  It's hard to believe
we are such an advanced OS, but debugging issues like this is so elusive :(






RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...

2009-05-09 Thread Marc G. Fournier


On Tue, 28 Apr 2009, Gavin Atkinson wrote:


> On Fri, 2009-04-24 at 20:39 +0200, Martin Schmidt wrote:
>> Hi Marc and List,
>>
>> i had similar issues with FreeBSD 7.2-PRERELEASE. Server (zfs,nfs)
>> seems to hang in intervals of about 8 hours.
>> kernel is still there but no connections can be made to nfs/ssh and
>> login on local console doesn't seem to work due to incredible slowness.
>> breaking to the debugger takes a moment but works.
>> (compiling kernel with WITNESS didn't help)
>>
>> the server had been solid before with 7 stable kernel from around 19
>> October 2008.
>>
>> I now added these lines to /boot/loader.conf
>>
>> hw.pci.enable_msi=0
>> hw.pci.enable_msix=0
>>
>> to disable Message Signaled Interrupts. Which are used by the 3ware
>> twa driver and igb network driver on our server.
>
> If you are willing to test further on your server, it may be helpful if
> you could determine which of those two lines in loader.conf fixes the
> problem for you.  It would also be useful to provide a dmesg from the
> machine when both msi and msix are enabled.
>
> FWIW, looking at the vmstat -i output it appears that only the igb
> driver is using MSI/MSIX, unless you have a reason to suspect
> otherwise?


How do you tell that, about igb?  looking at the server I have the igb 
device on, it doesn't seem to say anything about that ...


# vmstat -i
interrupt                          total       rate
irq1: atkbd0                         162          0
irq30: twa0                    402647215        187
cpu0: timer                   4284778818       1999
irq256: igb0                  1282945461        598
irq257: igb0                   215507100        100
irq258: igb0                   417702261        194
irq259: igb0                   314601966        146
irq260: igb0                   568062067        265
irq261: igb0                           3          0
cpu5: timer                       428475       1999
cpu6: timer                   4284731466       1999
cpu7: timer                   4284724508       1999
cpu1: timer                   4284893874       1999
cpu3: timer                   4284899807       1999
cpu2: timer                   4284892325       1999
cpu4: timer                   4284897264       1999
Total                        37480028742      17493

The server(s) that I am experiencing the hangs on, vmstat -i shows:

# vmstat -i
interrupt                          total       rate
irq1: atkbd0                           2          0
irq3: sio1                             8          0
irq25: bge0                      4614816        213
irq72: ciss0                     1835763         85
cpu0: timer                     43113685       1997
cpu1: timer                     43116889       1997
Total                           92681163       4293

Are any of these similarly using MSI/MSIX?
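
One rough way to tell from vmstat -i alone: on FreeBSD/x86 of this vintage, MSI and MSI-X vectors are allocated IRQ numbers starting at 256, above the range used by pin-based interrupts. The sketch below relies on that numbering, which is an assumption about the machine in question and is not confirmed in this thread; the function name msi_sources is made up.

```sh
#!/bin/sh
# Hypothetical helper: read "vmstat -i" output on stdin and print the
# devices that have at least one interrupt source numbered irq256 or
# higher (assumed here to be MSI/MSI-X vectors).
msi_sources() {
    awk '/^irq[0-9]+:/ {
        irq = $1
        sub(/^irq/, "", irq)   # strip the "irq" prefix
        sub(/:$/, "", irq)     # strip the trailing colon
        if (irq + 0 >= 256) print $2
    }' | sort -u
}
```

Used as `vmstat -i | msi_sources`, this prints igb0 for the first listing above and nothing for the second, where bge0 sits on irq25.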



RE: 7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...

2009-05-09 Thread Marc G. Fournier




'k, based on grep'ng the source files, turns out that the if_bge device 
driver uses msi, while, as you point out, the igb uses msix ... I have 
disabled msi on the two servers with bge devices, and msix on the one with 
igb ... all three have given the same sort of problem after varying 
periods of time ... let's see if I can get to 30 days uptime with this ...



server hangs, break to DDB hangs ...

2009-05-05 Thread Marc G. Fournier



I have two HP Proliant servers that, until recently, have run very stable 
... within the past 2 months, the servers hang after anywhere from 10hrs 
through 19 days (one just hung up this aft) ...


vmstat, about the time it hangs, shows:

# cat 16/vmstat.out
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
109 156 1 17035752   62152   803  19   5   3  1907 1785   0   0  437  294 853 50 28 22
 2 332 5 17109460   23056 147346 4319 2061 3139 44030 6539423 1029   0 4027 398263 38616 40 58  2
 0 32 8 17110588   23052   626 4216  35 203   344 745 572   0  597 16414 5741  4 10 86
 0 35 14 17110592   23084   446 5102   2 410   210 1596 540   0  516 31616 4461  4 10 85
 0 25 20 17110588   23032   196 7734   2 28022 1179 445   0  434 34992 3543  5  7 88


with, by the time I was able to reboot it, the final vmstat was showing:

# cat 46/vmstat.out
 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
 1 492 1595 24292424   99564   809  20   5   4  1909 1896   0   0  437 737  863 50 28 22
 1 399 1596 24285028   90708  6195 152 393  76  3185 1061 414   0  683 54948 32062  8  9 82
 2 231 1595 24276684   85164  4709  94 219 152  3729 642 554   0  420 39442 20612  7 12 80
 1 174 1595 24259144   71288  8204 143 314 158  3379 1314 605   0  547 36228 21219 11 18 71
 2 199 1593 24242500   72116  4637  52 251 195  3957 1609 496   0  383 32305 20225  6 12 82


When I try and break to DDB, all I get on the screen is:

===
KDB: enter: Break sequence on conec
===

And then it hangs there ...

I have ps listings that go back for just over an hour before I rebooted 
(the script runs every 5 minutes, or is supposed to):


# ls -lt */ps*
-rw-r--r--  1 root  wheel  509908 May  5 16:47 46/ps.out
-rw-r--r--  1 root  wheel  450704 May  5 16:35 35/ps.out
-rw-r--r--  1 root  wheel  424047 May  5 16:32 26/ps.out
-rw-r--r--  1 root  wheel  329105 May  5 16:21 21/ps.out
-rw-r--r--  1 root  wheel  278189 May  5 16:17 16/ps.out
-rw-r--r--  1 root  wheel  246726 May  5 15:55 55/ps.out
-rw-r--r--  1 root  wheel  231937 May  5 15:50 50/ps.out
-rw-r--r--  1 root  wheel  240260 May  5 15:45 45/ps.out
-rw-r--r--  1 root  wheel  234731 May  5 15:40 40/ps.out
-rw-r--r--  1 root  wheel  233719 May  5 15:30 30/ps.out
-rw-r--r--  1 root  wheel  222749 May  5 15:25 25/ps.out
-rw-r--r--  1 root  wheel  231617 May  5 15:20 20/ps.out


Looking at swap usage over that period, it's obvious that something is 
sucking back the RAM reasonably fast:


neptune# cat 46/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216 13789464  2987752    82%
neptune# cat 35/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216 12482312  4294904    74%
neptune# cat 26/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216 12351920  4425296    74%
neptune# cat 21/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216  7807240  8969976    47%
neptune# cat 16/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216  5752832 11024384    34%
neptune# cat 55/swap.out
Device          512-blocks     Used    Avail Capacity
/dev/da0s1b       16777216  4398928 12378288    26%

But I'm not sure what to look at in the ps output to determine what is 
going awry here ...
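
A small sketch of how the growth between two of these snapshots could be
quantified. The function names swap_used and swap_delta are hypothetical, and
it assumes each swap.out holds clean `pstat -s` output for a single swap
device (a header line, then one device line with Used in the third column):

```sh
#!/bin/sh
# swap_used FILE: print the "Used" column (512-byte blocks) from a saved
# `pstat -s` snapshot. Assumes one swap device, so the data is on line 2.
swap_used() {
    awk 'NR == 2 { print $3 }' "$1"
}

# swap_delta OLD NEW: print how much swap was consumed between two snapshots,
# in 512-byte blocks and in MiB (2048 blocks per MiB, rounded down).
swap_delta() {
    old=$(swap_used "$1")
    new=$(swap_used "$2")
    blocks=$((new - old))
    echo "$blocks blocks ($((blocks / 2048)) MiB)"
}
```

For example, `swap_delta 55/swap.out 46/swap.out` compares the 15:55 and 16:47
snapshots listed above.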


I'm running

 7.1-STABLE FreeBSD 7.1-STABLE #14: Sat Mar 28 00:05:19 ADT 2009

On the server that just hung, so will upgrade to the latest 7.2-RELEASE 
next, but ... if someone can give me pointers at what else I should be 
checking for, or something in the ps listings that I should be looking 
for?  My monitor script is currently doing:


/usr/sbin/jls > jaillist.out
/bin/ps -aucxHl -O jid > ps.out
/usr/sbin/pstat -s > swap.out
/usr/bin/vmstat 1 5 > vmstat.out
/usr/bin/awk '{print $15}' /proc/*/status | /usr/bin/sort | /usr/bin/uniq -c > vps_dist.out


Any pointers appreciated ...

Thx



7.1-STABLE Sun Mar 29 01:06:46 ADT 2009 Locks up ...

2009-04-14 Thread Marc G. Fournier



Hi ...

  Over the past little while, two of my servers have suddenly started to hang
... servers that up until this started, have been reasonably rock solid ...
they are generally within a day of each other for source code, and the hardware
on both are pretty much identical (HP Proliant DL360 Servers) ...

  I have serial console configured on both so that I can do CR ~ ^b to get to
DDB ... except, when it hangs, all I get is:

KDB: enter: Break sequence on console

  And it hangs there, no prompt.

  I set up a simple script (see attached) to run every 5 minutes that gathers
various pieces of info that I think are pertinent, but most likely doesn't cover
everything ...

  Whenever this happens, on either machine, vmstat shows data *like* this (notice
the high procs - w values?):

 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
165 106 2 12699168   33840  3080  38   2   2  3082 1623   0   0  337 36961 4731 18  7 75
64 75 4 12761744   23084 46809 623  65  43 19307 116 334   0 1189 83674 11708 70 20 10
 1 68 25 12773980   23068 11036 3003   9  36  4055 116 282   0 1336 78346 14869 56 16 28
 0 71 25 12774236   23084   186 769   1   518  80 249   0  609 9298 5894  5 5 91
 5 90 31 12747296   23352   626 2546   5 104  1147 368 281   0 1536 40945 19980 6  5 90

  Where procs - w just seems to keep rising ... note that the output for
vmstat *5 minutes before* shows:

 procs      memory      page                    disks     faults      cpu
 r b w     avm    fre   flt  re  pi  po    fr  sr da0 pa0   in   sy   cs us sy id
35 121 0 12414692   90552  3080  32   2   1  3090 1403   0   0  337 37022 4730 18  7 75
31 93 0 12314408   62024 36550 414  46   6 34285  27 563   0  916 94851 8813 67 33  0
43 179 0 12270932   23080 24035 101  41  12 13887  36 375   0  766 61969 6945 69 23  7
92 44 0 12265524  119804  2122 2028   1  32 13051 1096092 205   0  558 19460 4561 19 50 32
38 34 0 12330068   89140 30758 103  39 119 37037 2837365 165   0  773 92041 7111 47 53  0
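
To compare the procs - w column across the saved snapshots without reading
each table by eye, something like the following could be used. This is a
sketch only: max_w is a hypothetical helper name, and it assumes the saved
vmstat.out files have the usual 19 columns per data row, unwrapped.

```sh
#!/bin/sh
# max_w FILE: print the largest value seen in the third ("w") column of a
# saved vmstat output file. Only rows with at least 19 fields and a numeric
# first field are counted, which skips both header lines.
max_w() {
    awk 'NF >= 19 && $1 ~ /^[0-9]+$/ { if ($3 + 0 > max) max = $3 + 0 }
         END { print max + 0 }' "$1"
}
```

Running `max_w */vmstat.out` one file at a time, in timestamp order, would
show when w starts to climb.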


  I have one QEMU VPS running on this box, with kqemu running the latest kernel
module ... but the other machine experiencing the same issue is only running
FreeBSD jails ...

  Both servers are running SCHED_4BSD, if that matters any ... ?

  I'm at a loss as to what to look at / for next ... pointers would be greatly
appreciated ...

  I have the various output files that the script generates available if anyone 
thinks they would be useful ...

thank you ...




monitor.sh
Description: Binary data



Re: ALT_BREAK_TO... + ILO ... missing something in config ...

2009-03-28 Thread Marc G. Fournier


On Sat, 28 Mar 2009, Danny Braniss wrote:


unless the serial port is setup as console, check if /boot/device.hints
has:
hint.sio.0.flags=0x10
escaping to the debugger is not caught.
btw, Jeremy Chadwick had a nice explanation, but I lost the URL.


That was the missing piece, thank you ... I can now break down into DDB 
through the VSP ...




ALT_BREAK_TO... + ILO ... missing something in config ...

2009-03-27 Thread Marc G. Fournier



Due to an issue I'm having with 7.x, and trying to track it down, I spent 
tonight getting my server set up to allow me to break into the debugger 
when it hangs, and hopefully dump core ...


But, although I *think* I've got it all, I'm obviously missing something, 
as it isn't breaking ...


First ... I'm running a proliant server, and when I connect via SSH to ILO 
on that machine, and type 'vsp', I get a shell as I expect, I can type, 
etc ... when I reboot the machine, I get the opening splash screen with 
the 7(?) options (normal boot, single user mode, etc, etc) ... but I get 
nothing between that and the login prompt ... first sign of a problem, 
maybe?


Next, the easy question ... what is the key stroke to issue when 
ALT_BREAK_TO_DEBUGGER is set in the kernel? I thought it was CR ~ ^b ... 
is that correct?  I'm using putty to connect via ssh, if that makes a 
difference ... I've also tried using the browser interface into ilo / vsp, 
with the same lack of a result ...


Beyond adding sio device driver to my kernel, I've also got:

options ALT_BREAK_TO_DEBUGGER
options KDB
options DDB

Missing a kernel option maybe?

I have the following in /boot/loader.conf:

comconsole_speed=9600
console=vidconsole,comconsole # A comma separated list of console(s)
boot_multicons=-D # -D: Use multiple consoles
boot_serial=-h # -h: Use serial console

So ... either I don't have it enabled like I think, or I'm doing the wrong 
key stroke ... or ...


Thx





vmstat memory: avm vs fre

2009-03-17 Thread Marc G. Fournier



I'm getting a really odd condition on one of my servers (and I suspect it's 
happening on one of my other servers as well) ... after a period of time 
(3 days), the server hangs solid ...


Running vmstat in an xterm, the one thing I'm noticing is that when it 
hangs, my avm == 12455M and fre == 22M ... when I start the system, it 
looks like: avm == 246M vs fre == 197M ...


I'm suspecting that the lock up is that fre hit 0 at some point, but I'm 
at a loss as to why, or where to look, for this ...


top in another xterm when it hangs shows it appears to have more than 
enough VM:


last pid: 87005;  load averages:  8.57,  7.29,  4.46    up 0+17:25:13  20:45:00
1140 processes: 317 running, 774 sleeping, 10 zombie, 39 lock
CPU: 23.3% user,  0.0% nice, 11.1% system,  0.4% interrupt, 65.1% idle
Mem: 4610M Active, 440M Inact, 489M Wired, 13M Cache, 214M Buf, 9624K Free
Swap: 8192M Total, 1055M Used, 7137M Free, 12% Inuse, 564K In, 272K Out
kvm_open: cannot open /proc/90106/mem
  PID JID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
30625   0 root        1  96    0   588M   166M RUN    0  14:54  0.10% /usr/local/bin/qemu-system-x86_64 -m 512M -net nic,macadd
86866  20 1200        1  96    0 60888K  1140K RUN    0   0:00  0.15% postgres: autovacuum worker process(postgres)
86844   1 root        1  96    0 15080K  1028K RUN    1   0:00  0.05% sshd: [accepted] (sshd)
45533  20 root        1  96    0 15044K   456K RUN    1   0:00  0.05% /usr/sbin/sshd
86895   0 root        1  96    0 15092K   428K RUN    0   0:00  0.05% /usr/sbin/sshd
15131  15 root        1  96    0 19692K   376K RUN    1   0:00  0.15% /usr/sbin/sshd
95911   4 www         1   4    0   106M     0K accept 0   0:01  0.00% /usr/local/sbin/httpd (httpd)





Problem with Bridging ... and bge devices under FreeBSD 7.x?

2008-10-28 Thread Marc G. Fournier



I'm trying to run a QEMU VM on top of a FreeBSD 7.x server ... I've tried the 
exact same setup on my desktop, using 192.168.1.x and an fxp device, and it all 
works perfectly, but as soon as I do this on another machine on a public IP, 
I'm not getting any routing, I can't even ping it from the same machine ...

My first thought was  that there was an issue with IP aliases already on the 
bge device, but tried doing the following:

ifconfig bridge0 destroy
ifconfig tap0 destroy
ifconfig fxp0 -alias 192.168.1.101
ifconfig fxp0 alias 192.168.1.101 netmask 255.255.255.255
ifconfig bridge0 create
ifconfig tap0 create
ifconfig bridge0 addm fxp0 addm tap0 up

on my desktop here and then starting up the qemu image, and all worked as 
expected, so having an alias on the interface, before or after, doesn't make a 
difference ... at least with the fxp device ...

Using VNC to connect to the VM, I can look at the interface, and it says it is 
connected ... and the IP/Gateway are all set right for the network I'm on, 
netmask is set to 255.255.255.0, same as on the 'private network' ...

Please note that when I say it works on my private network / desktop, I'm 
using it to connect to my work computer, across the Internet, via Windows RDP, 
and it works flawlessly ...

Looking at /var/log/messages, you can see the bridge being setup:


Oct 27 18:53:21 io kernel: bridge0: Ethernet address: ce:44:c7:1b:47:40

as well as the tap device:

Oct 27 18:53:25 io kernel: tap0: Ethernet address: 00:bd:96:ae:67:00
Oct 27 18:53:41 io kernel: tap0: promiscuous mode enabled

and the ethernet going promiscuous:

Oct 26 20:53:56 ganymede kernel: fxp0: promiscuous mode enabled

So, all I have left is that everything is being set up okay, but there is 
something I'm missing here ... something with bridge-bge, maybe?  I've even 
tried to compare the output of 'ifconfig -a' as far as the bridge0 and tap0 
devices are concerned, and other than the MAC address, they look identical also 
...

So, pointers to what I may be missing here?  a sysctl value that I need to set 
for this interface?

Thanks ...





Re: Problem with Bridging ... and bge devices under FreeBSD 7.x?

2008-10-28 Thread Marc G. Fournier




- --On Tuesday, October 28, 2008 22:08:18 -0400 Michael Proto 
[EMAIL PROTECTED] wrote:




 On Tue, Oct 28, 2008 at 7:56 PM, Marc G. Fournier [EMAIL PROTECTED] wrote:



 I'm trying to run a QEMU VM on top of a FreeBSD 7.x server ... I've tried the
 exact same setup on my desktop, using 192.168.1.x and an fxp device, and it
 all
 works perfectly, but as soon as I do this on another machine on a public IP,
 I'm not getting any routing, I can't even ping it from the same machine ...

 My first thought was  that there was an issue with IP aliases already on the
 bge device, but tried doing the following:

 ifconfig bridge0 destroy
 ifconfig tap0 destroy
 ifconfig fxp0 -alias 192.168.1.101
 ifconfig fxp0 alias 192.168.1.101 netmask 255.255.255.255
 ifconfig bridge0 create
 ifconfig tap0 create
 ifconfig bridge0 addm fxp0 addm tap0 up

 on my desktop here and then starting up the qemu image, and all worked as
 expected, so having an alias on the interface, before or after, doesn't make a
 difference ... at least with the fxp device ...

 Using VNC to connect to the VM, I can look at the interface, and it says it is
 connected ... and the IP/Gateway are all set right for the network I'm on,
 netmask is set to 255.255.255.0, same as on the 'private network' ...

 Please note that when I say it works on my private network / desktop, I'm
 using it to connect to my work computer, across the Internet, via Windows RDP,
 and it works flawlessly ...

 Looking at /var/log/messages, you can see the bridge being setup:


 Oct 27 18:53:21 io kernel: bridge0: Ethernet address: ce:44:c7:1b:47:40

 as well as the tap device:

 Oct 27 18:53:25 io kernel: tap0: Ethernet address: 00:bd:96:ae:67:00
 Oct 27 18:53:41 io kernel: tap0: promiscuous mode enabled

 and the ethernet going promiscuous:

 Oct 26 20:53:56 ganymede kernel: fxp0: promiscuous mode enabled

 So, all I have left is that everything is being setup okay, but there is
 something I'm missing here ... something with bridge-bge, maybe?  I've even
 tries to compare the output of 'ifconfig -a' as far as the bridge0 and tap0
 devices are concerned, and other then the mac address, they look identical
 also
 ...

 So, pointers to what I may be missing here?  a sysctl value that I need to set
 for this interface?




 I'm having a little trouble understanding the setup you have. In your test
 case, is the IP of your VM 192.168.1.101? If so, then I don't think you want
 that IP aliased on the physical interface of your bridge. The VM NIC will
 answer for packets destined on your local segment, which the bridge would
 forward to the physical interface. If you assign the VM's IP to that physical
 interface, then your host would think that traffic is destined for itself and
 not pass it to the bridge.

 If I'm misunderstanding and the 192.168.1.101 alias (or whatever the equiv in
 your production setup) isn't being used by your VM then I would start looking
 at the ARP traffic crossing both the tap0, lo0, and physical interfaces.

 What does an 'ifconfig -a' look like on both systems? netstat -rn? Any packet
 filtering?

I always fear I'm going to send more info than I should, and generate chaos and 
confusion :)

On my test box, the VM is set to 192.168.1.100 ... the alias I added to fxp0 
was to simulate what I have on the public server, where there is a bge0 
device with n aliases attached to it ... in no case is the IP assigned to the 
VM actually aliased onto any interface on the network itself

Now, to try and answer your other questions ...

netstat -nr on the 192 server shows the IP to be at:

 netstat -nr | grep 168.1.100
192.168.1.100  52:54:00:12:34:56  UHLW        1        1   fxp0   1128

which is very odd, as that MAC address is not found via ifconfig -a:

 ifconfig -a | grep 52


while arp -a also shows the 52:54 MAC, although MACs for the ifconfig -a are, 
in fact:

 ifconfig -a | grep ether
ether 00:02:b3:ee:da:3e
ether 5e:d1:e6:8b:55:50
ether 00:bd:25:18:6d:00

On the server, I'm getting nothing in arp or netstat for the IP in question:

io# arp -a | grep 204.213
io# netstat -nr | grep 204.213
io#

I've even tried doing a ping *from* the VM (logged in with VNC) to see if it 
will broadcast itself out, and nothing ...

I'm starting QEMU on both servers with the same options as well:

qemu -m 512M -net nic -net tap winxp.img

just to confirm that I'm not doing anything different for attaching to the 
network ...

So, right now, all I can see as being different is bge vs fxp interfaces ... 
both machines are running 7.x ...


Re: Problem with Bridging ... and bge devices under FreeBSD 7.x?

2008-10-28 Thread Marc G. Fournier



I only have one VM running on one server ...

- --On Tuesday, October 28, 2008 21:14:28 -0700 Bakul Shah [EMAIL PROTECTED] 
wrote:

 On Wed, 29 Oct 2008 00:35:35 -0300 Marc G. Fournier [EMAIL PROTECTED]
 wrote:
 netstat -nr on the 192 server shows the IP to be at:

  netstat -nr | grep 168.1.100
 192.168.1.100  52:54:00:12:34:56  UHLW11   fxp0   1128

 which is very odd, as that MAC address is not found via ifconfig -a:

  ifconfig -a | grep 52
 

 while arp -a also shows the 52:54 MAC, although MACs for the ifconfig -a are,

 in fact:

  ifconfig -a | grep ether
 ether 00:02:b3:ee:da:3e
 ether 5e:d1:e6:8b:55:50
 ether 00:bd:25:18:6d:00

 The setup you get with a tap device talking to qemu is this:

 [host]-tap0qemu---ed0-[VM]

 Each end has its own mac address. The VM's NIC (ed0 or rl0
 or whatever) gets addresses like 52:54:00:12:34:56.  The host
 will have an arp entry for it once the VM sends an arp
 packet.  But tap0 will have an address assigned by the tap
 driver, something like 00:bd:xx:xx:xx.

 If you have two VMs running at the same time on two different
 machines and they both have identical MAC addresses, that
 could be part of your problem.
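
 If duplicate MACs ever do become a concern, one possible way to give each
 VM a stable, distinct address is sketched below. vm_mac is a hypothetical
 helper, not something used in this thread; it hashes the VM name with
 cksum and keeps the 52:54:00 prefix seen in the default address above.

```sh
#!/bin/sh
# vm_mac NAME: derive a repeatable MAC address in the 52:54:00 prefix from a
# VM name. The low 24 bits of the cksum(1) CRC of the name fill the last
# three octets, so the same name always yields the same address.
vm_mac() {
    printf '%s' "$1" | cksum | awk '{
        n = $1
        printf "52:54:00:%02x:%02x:%02x\n",
            int(n / 65536) % 256, int(n / 256) % 256, n % 256
    }'
}
```

 The result could then be passed to qemu, for example
 `qemu -m 512M -net nic,macaddr=$(vm_mac winxp) -net tap winxp.img`. Two
 different names can still collide in principle, since only 24 bits are
 used.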

 But your network topology is still not clear.  What would help
 is something like this:

 You have:
 machine A (runs VM A1).
 machine B (runs VM B1).
 machine C (runs windows).

 Can you ping from A to C?
 Can you ping from B to C?
 Can you ping from A to A1?
 Can you ping from B to B1?
 Can you ping from A1 to C?
 Can you ping from B1 to C?
 Can you ping from C to A1?
 Can you ping from C to B1?

 All of the above should work.  Next you can try tcpdump on
 tap devices to see what is going on.  If you are still
 stumped provide ifconfig -a output on A, B, C, A1 and B1.  On
 windows machine you can do ipconfig/all to get at this
 information (IIRC).




Re: php5 and postgresql 8.2/8.3

2008-08-20 Thread Marc G. Fournier



Setting ServerName fixed it for me ... thanks for the tip ...

- --On Monday, April 21, 2008 12:53:24 +0200 Claus Guttesen [EMAIL PROTECTED] 
wrote:

 this problem is very old for me. it goes, at least from
  http://www.freebsd.org/cgi/query-pr.cgi?pr=97272

  I found a workaround: you simply should set

  ServerName foobar.emxample

  in httpd.conf

  i don't know why missing ServerName causes coredump of apache in case of
 php+php_pgsql, but this works for me

 Thank you for your tip. I will try that on a test-server. Maybe some
 reverse dns-lookup-issue which blocks correct unloading which then
 leads to a core-dump?

 --
 regards
 Claus

 When lenity and cruelty play for a kingdom,
 the gentlest gamester is the soonest winner.

 Shakespeare

Re: Azureus + 7-STABLE == Slow download + No Upload

2008-03-31 Thread Marc G. Fournier




- --On Monday, March 31, 2008 10:31:12 +0200 Joakim Fogelberg 
[EMAIL PROTECTED] wrote:


 I believe I had the same problem with 7.0-prerelease + Azureus +
 jdk15. If I remember correct, I could only download from other Azureus
 clients. I had no time to even try to find out why. I simply installed
 deluge instead.

Yowch, this is like the difference between night-n-day ... thanks for the 
pointer ...


Azureus + 7-STABLE == Slow download + No Upload

2008-03-30 Thread Marc G. Fournier



Is anyone running Azureus on 7-STABLE and getting decent performance from it?

I just upgraded to 7-STABLE, installed /usr/ports/java/jdk15 (instead of 
diablo) so that it uses libthr (checked with ldd), and now I'm barely able to 
get one downloaded, let alone multiple, and almost nothing uploaded ...

I've added:

-Djava.net.preferIPv4Stack=true

to /usr/local/bin/azureus, but, from reading the jdk15 makefile, IPv6 is only 
enabled if/when you do WITH_IPV6, and I don't have that in my make.conf file, 
therefore this shouldn't affect anything ...

I have nothing in my /etc/libmap.conf file ...

So, is there a problem, or am I missing something?



Re: Nagios + 6.3-RELEASE == Hung Process

2008-01-02 Thread Marc G. Fournier




- --On Wednesday, January 02, 2008 22:54:33 + Tom Judge [EMAIL PROTECTED] 
wrote:

 Not sure if this is related at all but out of the 3 nagios deployments we
 have here I have only ever seen it on one (It currently has 2 nagios threads
 spinning CPU time atm).

 The differences on that server are:

   * It is amd64 compared to i386

I never tried on i386, but in my case it was an amd64 system as well ... not 
sure if that is relevant or not ... has anyone seen this problem *with* i386?


Re: Nagios + 6.3-RELEASE == Hung Process

2008-01-02 Thread Marc G. Fournier




- --On Thursday, January 03, 2008 11:05:16 +1030 Jarrod Sayers 
[EMAIL PROTECTED] wrote:

 That's actually good to know, as you're now (unless I am mistaken) the first
 user to contact me about this problem on non-i386 systems.  One user, plus
 myself, have also seen the issue under Nagios 3.x, both on i386 systems
 though.

 I also have a net-mgmt/ndoutils port in the works (less the database support
 for now) which also has the same issue so using broker modules doesn't seem
 to affect the outcome.

 My gut feeling is that it's not an architecture issue but more an
 interoperability issue between the Nagios threading code and the libpthread()
 threading library.

As noted in my original report, this isn't a nagios issue per se ... my first 
experience with this issue was with Azureus/java ... so its a 'threading issue 
in general' ...


Nagios + 6.3-RELEASE == Hung Process

2008-01-01 Thread Marc G. Fournier



G'day ...

  Yesterday, I set up nagios to do some system monitoring ... installed the 
latest version from ports into a jail, so that I could easily move it around 
between machines as I upgrade, without losing data ... after about 30 minutes 
running, I get a second nagios process running (fork?) that takes up as much CPU 
time as is available, and just hangs there until I kill -9 it ...

Figuring that it might be a problem with the jail (trying to access something 
that isn't available to the process in a jail), I moved it to the physical 
server level ... but, again, after ~30 minutes, it's doing the same thing:

# ps aux | grep nagios
nagios  32065 73.2  0.1 10948  3516  ??  R    11:15AM   7:40.77 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg
nagios  82120  0.0  0.1 10948  3580  ??  Ss   10:47AM   0:01.18 /usr/local/bin/nagios -d /usr/local/etc/nagios/nagios.cfg

So, definitely not jail related ...

I've tried to do a 'truss -p 32065', it just hangs.

And: ktrace -f /tmp/output -p 32065 ... produces nothing:

# kdump -f /tmp/output
 32065 nagios   PSIG  SIGKILL SIG_DFL

Once I kill -9 the process, a bunch of 'check_ping' processes start up and then 
things go back to normal ...

My last kernel / world build on that box is: Mon Nov 12 06:43:30 AST 2007

After searching the 'Net a bit, came across this thread:

http://www.nagiosexchange.org/nagios-users.34.0.html?tx_maillisttofaq_pi1%5Bmode%5D=1tx_maillisttofaq_pi1%5BshowUid%5D=7694

That recommends modifying libmap.conf with:

[/usr/local/bin/nagios]
libpthread.so.2 libthr.so.2
libpthread.so libthr.so

This seems to fix the problem on the physical server, and am currently testing 
it in the jail itself to make sure it fixes it there too ...

Should this be something that is more prominently documented somewhere?  Maybe 
in the port itself?  azureus has similar problems that are fixed with entries 
in libmap.conf, so it's not just a nagios issue ...




gdbserver on latest -STABLE ...

2007-12-03 Thread Marc G. Fournier



Is this related to the commit that just went through to enable on arch that 
support it?

=== gnu/usr.bin/gdb/gdbserver (clean)
cd: can't cd to /usr/src/gnu/usr.bin/gdb/gdbserver
*** Error code 2

Stop in /usr/src/gnu/usr.bin/gdb.

Or did I catch something 'in between'?



Re: gdbserver on latest -STABLE ...

2007-12-03 Thread Marc G. Fournier



Great, thank you ... bookmarked ... so, should one not report something like 
this if that page shows it as a failure?

- --On Monday, December 03, 2007 21:57:50 -0500 Mike Tancsa [EMAIL PROTECTED] 
wrote:

 At 09:28 PM 12/3/2007, Marc G. Fournier wrote:


 Is this related to the commit that just went through to enable on arch that
 support it?

 One way to check is to take a look at the status page for the tinderboxes

 http://tinderbox.des.no/

 which are constantly building world.   If its a general problem, it will show
 up there through a few builds.

  ---Mike



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFHVMka4QvfyHIvDvMRAjrQAKDZS6OEiOYoHFXOUYX5DtCluP1VQACeN67Y
RvKiX4T6ugGTiSnPFRFmazo=
=cdFC
-END PGP SIGNATURE-

Re: gdbserver on latest -STABLE ...

2007-12-03 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Teach me to sort threaded :(

Thanks ...

- --On Monday, December 03, 2007 22:34:55 -0500 Mike Tancsa [EMAIL PROTECTED] 
wrote:

 At 10:27 PM 12/3/2007, Marc G. Fournier wrote:
 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 Great, thank you ... bookmarked ... so, should one not report something like
 this if that page shows it as a failure?

 Its automatically reported to the mailing list. eg.

 http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038791.html
 and
 http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038792.html
 and
 http://lists.freebsd.org/pipermail/freebsd-stable/2007-December/038793.html

  ---Mike



 - --On Monday, December 03, 2007 21:57:50 -0500 Mike Tancsa [EMAIL PROTECTED]
 wrote:

  At 09:28 PM 12/3/2007, Marc G. Fournier wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
 
  Is this related to the commit that just went through to enable on arch that
  support it?
 
  One way to check is to take a look at the status page for the tinderboxes
 
  http://tinderbox.des.no/
 
  which are constantly building world.   If its a general problem, it will show
  up there through a few builds.
 
   ---Mike




- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFHVNte4QvfyHIvDvMRArj0AJ9zlx1yazaOc9UyhNIgtO3+WA0TzQCfTQ07
QKv9N3YSpODOr2ulo0VDMuA=
=mtVN
-END PGP SIGNATURE-

Re: 6.3 PRERELEASE

2007-11-10 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Friday, November 09, 2007 10:20:47 -0800 Jon Holstrom [EMAIL PROTECTED]
wrote:

 I had 6.2 stable all setup 
 had gnome 2.18 all humming along 100%
 java  eclipse, tomcat, bah bah bah!

 updated src  rebuilt only to
 find 6.2 is gone  6.3 prerelease!

What is wrong with 6.3-PRERELEASE?  I had 6-STABLE all set up and had kde 3.5.x
humming along 100%, java, azureus, bah bah bah! ... upgraded to 6.3-PRERELEASE
and still have 6-STABLE all set up, with kde 3.5.x humming along 100%, java,
azureus, bah bah bah! ... nothing has changed from what I can tell, just a
newer kernel *shrug*

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFHNh7W4QvfyHIvDvMRAqnuAJ9RN4JsubP808xI7bwZz3iKWl2voQCgucu/
7YKW6UTEDp1zpGIBwMpLvSA=
=suC5
-END PGP SIGNATURE-

Re: Call for testing: patch that helps Wine on 6.x

2007-08-15 Thread Marc G. Fournier

   \   midimap
 ELF 7f346000-7f376000   Deferred        libcups.so.2
 ELF 7f376000-7f3f8000   Deferred        libgnutls.so.13
 ELF 7f3f8000-7f447000   Deferred        libgcrypt.so.13
 ELF 7f447000-7f46   Deferred        libcrypt.so.4
 ELF 7f46-7f469000   Deferred        libintl.so.8
 ELF 7f469000-7f557000   Deferred        libiconv.so.3
 Threads:
 process  tid  prio (all id:s are in hex)
 000a
 000c0
 000b0
 0008 (D) C:\Program Files\Macromedia\Dreamweaver 8\Dreamweaver.exe
 000d0
 00090 ==
 daemon%

 Any idea how to resolve this issue?
 Will the patch on http://bugs.winehq.org/show_bug.cgi?id=4139 help to this
 issue?

 thanks in advance,

 Ganbold


 --
 If it's worth doing, do it for money.



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v2.0.4 (FreeBSD)

iD8DBQFGw0vC4QvfyHIvDvMRApJIAKCgEXQblbilfCI5AQTpQyHWfz5AfQCfU3vU
/3BivBPQlh1TDb2RAGMifVE=
=GUCw
-END PGP SIGNATURE-

Re: Call for testing: patch that helps Wine on 6.x

2007-08-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Monday, August 06, 2007 16:05:40 -0400 John Baldwin [EMAIL PROTECTED] 
wrote:

 On Friday 03 August 2007 10:56:48 pm Marc G. Fournier wrote:

 --On Tuesday, July 31, 2007 14:47:50 -0700 Kris Moore [EMAIL PROTECTED]
 wrote:

  I'm not sure all the tests run properly since I didn't run through them
  yet. I'll try it out tomorrow morning though. All I tried was FireFox
  for Windows and installed StarCraft. Both worked just fine here. (I did
  a spawn of Starcraft since the safedisc support isn't working as far as
  I know)

 'k, I just installed the latest patches from http://wiki.freebsd.org/Wine, and
 everything builds fine, and I'm getting a lot further with the tests, but it's
 failing at the rebar test ... I've posted to [EMAIL PROTECTED] with my
 results on this, as it seems to be the Wine side, not FreeBSD ...

 John, I've been running both the signal and pfault patches on my 6.x desktops
 since Tijl posted them, and haven't noticed any issues resulting from them ...

 Does cvsup work?  A similar patch broke cvsup on HEAD.

I've cvsup'd several times since first applying the patch, and haven't noticed 
any issues ...

Also, any chance of getting the thr_kill2() patch Tijl did in?  I've been
running both on my desktop, and haven't noticed any issues resulting from
either (other than improvements to wine, of course) ...

Getting both those patches in place should allow us to focus on wine itself 
without having to worry about the OS side of things ...



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGt9L14QvfyHIvDvMRAik+AJ90kETJRNEw5WXF+XXXvZlUQoxAvACeJFg5
grbZ9Nb/q233PSoAeZ4Iz2w=
=eXFR
-END PGP SIGNATURE-

Re: Call for testing: patch that helps Wine on 6.x

2007-08-03 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, July 31, 2007 14:47:50 -0700 Kris Moore [EMAIL PROTECTED] 
wrote:

 I'm not sure all the tests run properly since I didn't run through them
 yet. I'll try it out tomorrow morning though. All I tried was FireFox
 for Windows and installed StarCraft. Both worked just fine here. (I did
 a spawn of Starcraft since the safedisc support isn't working as far as
 I know)

'k, I just installed the latest patches from http://wiki.freebsd.org/Wine, and
everything builds fine, and I'm getting a lot further with the tests, but it's
failing at the rebar test ... I've posted to [EMAIL PROTECTED] with my
results on this, as it seems to be the Wine side, not FreeBSD ...

John, I've been running both the signal and pfault patches on my 6.x desktops 
since Tijl posted them, and haven't noticed any issues resulting from them ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGs+rw4QvfyHIvDvMRAiW8AKCpVIKvIZqWPA0yMLfxet/wl33FBQCghy1L
AidVDAaM729qO7Mjms61UIY=
=Z53o
-END PGP SIGNATURE-

Re: Call for testing: patch that helps Wine on 6.x

2007-07-31 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, July 31, 2007 12:19:23 -0700 Kris Moore [EMAIL PROTECTED] 
wrote:


 I just gave FireFox 2.0.0.6 a shot using FBSD 6-Stable and all the
 various patches on the Wiki page. It loaded and ran just fine on my end.


as user root?  or a regular user?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGr6Rf4QvfyHIvDvMRAqLmAJ4o7HAxPo+a4JTcP8D1x1xdC0usrgCgoWWT
2p/oZnz+2MQrXZ3UqGPBYXQ=
=1EmJ
-END PGP SIGNATURE-

Re: Call for testing: patch that helps Wine on 6.x

2007-07-31 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, July 31, 2007 14:21:28 -0700 Kris Moore [EMAIL PROTECTED] 
wrote:


 :) I learned my lesson, I ran it as regular user this time.

'k, now I'm curious ... you have all the kernel patches in place, and you can 
now run 'make tests' as a regular user without any problems?  I just updated my 
kernel, so am going to work tonight on plugging in the OS patches and building 
a new wine here (just got back from camping, still catching up on things) ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGr6zS4QvfyHIvDvMRAjKGAJ41uUlIeSGwJojFNG9p1fYQt2Z92ACeOzgQ
+IJ3IJZe7dcEN9VBHn7Fvbw=
=MJ4T
-END PGP SIGNATURE-

Static in left speaker (HDA Codec: Realtek ALC883)

2007-07-24 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Just got a new desktop, and so far everything has been a dream, except the 
sound ... did a 'snd_driver_load=YES' to load everything, and the sound 
system is detected as:

pcm0: HDA Codec: Realtek ALC883
pcm0: HDA Driver Revision: 20070710_0047

If I 'kldunload snd_hda', the static goes away ... reload it, the static comes 
back again ...

Found a reference to http://people.freebsd.org/~ariff/, and the lowlatency 
stuff, so tried that, same effect ...

Tried different speakers, just in case, no change ...

I do get sound out, can watch movies and such, but the sound seems to only come 
out the right speaker, and, well, the static is fairly annoying ...

Not sure what else I can do to debug, mind you ... help?

thanks ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGpsc/4QvfyHIvDvMRAiZlAKDjTFeq5Cu/JZoERFU1CrCrL9aYJgCfT+rI
JJ7WvHrpXxyl+zaHPRHUPhw=
=gb3O
-END PGP SIGNATURE-

Re: SATA 300 Drive Being Run At 150

2007-07-21 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, July 21, 2007 11:55:39 -0500 Dan Nelson 
[EMAIL PROTECTED] wrote:

 In the last episode (Jul 21), Tim Daneliuk said:
 I asked this question a while back, but needed to do more digging to make
 sure I had latest sources etc.

 I have an Intel motherboard that shows this for a SATA controller:

 atapci1: Intel ICH7 SATA300 controller port
 0x20c8-0x20cf,0x20ec-0x20ef,0x20c0-0x20c7,0x20e8-0x20eb,0x20a0-0x20af mem
 0x90204000-0x902043ff irq 19 at device 31.2 on pci0

 But the hard drive - a SATA 300 device - shows up like this:

 ad4: 238475MB WDC WD2500JS-00NCB1 10.02E02 at ata2-master SATA150
 ^^^
 Using dd, I have confirmed that the drive is running nowhere near
 SATA-III speeds, at least on reads:

 968470075 bytes transferred in 7.132891 secs (135775249 bytes/sec)

 What was your dd commandline?  If you've got more than 1GB of RAM and
 tested by reading a file and not the raw device itself, you just tested
 FreeBSD buffer cache. According to
 http://www.wdc.com/en/products/productspecs.asp?driveid=135 , that
 drive's maximum sustained speed is only 93.5 MB/sec, so it doesn't
 really matter if your interface is running at SATA150 or SATA300 unless
 you plan on reading exclusively from its 8MB buffer :)
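
As a quick sanity check on the numbers quoted above, here is a minimal Python
sketch of the arithmetic. The 93.5 MB/sec figure is the sustained rate cited
from the WDC spec page; the 150 and 300 MB/s link ceilings are the usual
nominal figures and are an assumption here, not a measurement. The function
names are made up for illustration.

```python
# Nominal interface ceilings in MB/s (assumed round figures, not measured).
LINK_MB_PER_SEC = {"SATA150": 150.0, "SATA300": 300.0}


def throughput_mb_per_sec(nbytes, seconds):
    """Convert a dd-style 'N bytes transferred in S secs' into MB/s (10**6 bytes)."""
    return nbytes / seconds / 1e6


def likely_bottleneck(measured, drive_sustained, link):
    """Classify a measured read rate against the drive and link ceilings."""
    if measured > drive_sustained:
        # Faster than the platters can deliver: probably a cache, not the disk.
        return "cache"
    if drive_sustained >= LINK_MB_PER_SEC[link]:
        # The drive could outrun the link, so the link is the limit.
        return "link"
    return "drive"


if __name__ == "__main__":
    rate = throughput_mb_per_sec(968470075, 7.132891)
    print(f"{rate:.1f} MB/s ->", likely_bottleneck(rate, 93.5, "SATA150"))
```

With a sustained rate below the SATA150 ceiling, the negotiated link speed
only matters for reads served from the drive's own buffer, which is the point
being made above.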

'k, I just bought a new desktop, SATA/300MB/s interface, and this drive:

http://www.wdc.com/en/products/products.asp?DriveID=254

The web site states 3Gb/s ... I'm seeing the same SATA150:

atapci0: JMicron JMB361 SATA300 controller port 0xbf00-0xbf07,0xbe00-0xbe03,0xbd00-0xbd07,0xbc00-0xbc03,0xbb00-0xbb0f mem 0xfdbfe000-0xfdbf irq 16 at device 0.0 on pci2
atapci1: Intel ICH8 SATA300 controller port 0xfa00-0xfa07,0xf900-0xf903,0xf800-0xf807,0xf700-0xf703,0xf600-0xf60f,0xf500-0xf50f irq 19 at device 31.2 on pci0
atapci2: Intel ICH8 SATA300 controller port 0xf300-0xf307,0xf200-0xf203,0xf100-0xf107,0xf000-0xf003,0xef00-0xef0f,0xee00-0xee0f irq 19 at device 31.5 on pci0
ad8: 152627MB WDC WD1600AAJS-08PSA0 05.06H05 at ata4-master SATA150

Latest 6.x STABLE ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGosqU4QvfyHIvDvMRAhENAKDhq0K+IDbZvD9Lcm51aLTwzjhz9ACgnFZz
b3iDMLhANYWByT3a7Vu3utQ=
=ZnlY
-END PGP SIGNATURE-

rwhod / ntpdate don't work ... amd64/-STABLE ...

2007-07-16 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Just looking over one of our AMD64 servers, and rwhod / syslog / ntpdate won't
work on that server, although it's running the same date/version ... I checked
securelevel, and they are both running the same ...

rwhod generates no errors when I try to run it, and truss doesn't show anything 
since it does a fork:


stat(/etc/nsswitch.conf,{mode=-rw-r--r-- ,inode=24857,size=113,blksize=4096}) = 0 (0x0)
open(/etc/group,O_RDONLY,0666) = 3 (0x3)
fstat(3,{mode=-rw-r--r-- ,inode=24775,size=441,blksize=4096}) = 0 (0x0)
lseek(3,0x0,SEEK_CUR)= 0 (0x0)
lseek(3,0x0,SEEK_SET)= 0 (0x0)
read(3,# $FreeBSD: src/etc/group,v 1.32...,4096) = 441 (0x1b9)
close(3) = 0 (0x0)
sigaction(SIGHUP,{ SIG_IGN 0x0 ss_t },{ SIG_DFL SA_RESTART ss_t }) = 0 (0x0)
fork()   = 90418 (0x16132)
exit(0x0)
process exit, rval = 0
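
The trace ends at fork() because the traced parent exits straight away and
the real work happens in the child. A minimal Python sketch of that
daemon-style pattern follows; it is an illustration only, not rwhod's actual
code, and the function name is hypothetical. (truss has a -f option to follow
descendants, which is the usual way to see what the child does.)

```python
import os


def spawn_daemon_style(child_work):
    """Fork; the child runs child_work() and exits with its return value.

    The parent returns the child's pid immediately, much as a daemon's
    launcher process exits right after fork().
    """
    pid = os.fork()
    if pid == 0:
        # Child: any exception is reported as exit code 1.
        try:
            code = child_work()
        except Exception:
            code = 1
        # os._exit avoids running the parent's cleanup handlers in the child.
        os._exit(code)
    return pid


if __name__ == "__main__":
    pid = spawn_daemon_style(lambda: 1)  # child "fails" with code 1
    _, status = os.waitpid(pid, 0)
    print("child pid", pid, "exited with return code", os.WEXITSTATUS(status))
```

A trace of only the parent sees a clean exit(0) even when the child fails,
so the interesting system calls are all on the child's side.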

So, I'm not 100% certain what I'm looking for ...

The network looks good, I can connect to the jails running on it, and syslog
runs in the jails themselves, just not on the physical server ...

If I try syslogd from the command line, it generates an error:

# /usr/sbin/syslogd -s
syslogd: child pid 90996 exited with return code 1

I'm not out of disk space on any of the file systems ...

So, not sure what else I should be looking for here ...

Help?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGnBHP4QvfyHIvDvMRAuYnAJ4qU1T486MWB1HDYb1yU+8LwD6gJgCdHS/z
Lah1f/mbLzBQrROzv09J44E=
=GuLe
-END PGP SIGNATURE-

Re: rwhod / ntpdate don't work ... amd64/-STABLE ...

2007-07-16 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Figured it out ... not sure *how* it happened, but on my last upgrade, I must
have somehow screwed up my mergemaster run, and actually wiped out /etc/services ...
just ran mergemaster on a whim, and the file was totally recreated, and all
services now start up as expected ...
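
For anyone hitting the same symptom, a quick check is whether the service
names these daemons look up still resolve. The sketch below uses Python's
socket.getservbyname; the (name, protocol) pairs are my assumption about what
rwhod, syslogd and ntpdate ask for, so adjust as needed.

```python
import socket

# (service, protocol) pairs assumed to be looked up by rwhod, syslogd, ntpdate.
EXPECTED = [("who", "udp"), ("syslog", "udp"), ("ntp", "udp")]


def missing_services(pairs, lookup=socket.getservbyname):
    """Return the (name, proto) pairs that fail to resolve to a port."""
    missing = []
    for name, proto in pairs:
        try:
            lookup(name, proto)
        except OSError:
            # getservbyname raises OSError when the entry is not found.
            missing.append((name, proto))
    return missing


if __name__ == "__main__":
    for name, proto in missing_services(EXPECTED):
        print(f"cannot resolve {name}/{proto}: check /etc/services")
```

An empty or missing /etc/services would make every lookup fail, which fits
several unrelated daemons refusing to start at once.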


- --On Monday, July 16, 2007 21:48:15 -0300 Marc G. Fournier [EMAIL PROTECTED]
wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 Just looking over one of our AMD64 servers, and rwhod / syslog / ntpdate
 won't work on that server, although it's running the same date/version ... I
 checked securelevel, and they are both running the same ...

 rwhod generates no errors when I try to run it, and truss doesn't show
 anything since it does a fork:


 stat(/etc/nsswitch.conf,{mode=-rw-r--r-- ,inode=24857,size=113,blksize=4096}) = 0 (0x0)
 open(/etc/group,O_RDONLY,0666) = 3 (0x3)
 fstat(3,{mode=-rw-r--r-- ,inode=24775,size=441,blksize=4096}) = 0 (0x0)
 lseek(3,0x0,SEEK_CUR)= 0 (0x0)
 lseek(3,0x0,SEEK_SET)= 0 (0x0)
 read(3,# $FreeBSD: src/etc/group,v 1.32...,4096) = 441 (0x1b9)
 close(3) = 0 (0x0)
 sigaction(SIGHUP,{ SIG_IGN 0x0 ss_t },{ SIG_DFL SA_RESTART ss_t }) = 0 (0x0)
 fork()   = 90418 (0x16132)
 exit(0x0)
 process exit, rval = 0

 So, I'm not 100% certain what I'm looking for ...

 The network looks good, I can connect to the jails running on it, and
 syslog runs in the jails themselves, just not on the physical server ...

 If I try syslogd from the command line, it generates an error:

# /usr/sbin/syslogd -s
 syslogd: child pid 90996 exited with return code 1

 I'm not out of disk space on any of the file systems ...

 So, not sure what else I should be looking for here ...

 Help?





- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGnCHM4QvfyHIvDvMRAtB6AJ9+aFEXYmrFRuvtMeDe10rOtTkbBwCeIAwO
SXlG8lCyNxx9mr94d3fWk6A=
=JpRk
-END PGP SIGNATURE-

Re: Unix domain socket leak in 6-STABLE

2007-06-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Wednesday, June 13, 2007 20:15:56 +0200 Ulrich Spoerlein 
[EMAIL PROTECTED] wrote:


 was your leak a kernel leak or a user leak (if it actually makes a
 difference).

I don't know ... it was caused by an application, but nothing was freed up 
after the application was stop'd ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGcWes4QvfyHIvDvMRAnaVAJ4pfQ69GvcfXObQ37yMlHG61Foz4wCcClFp
p2TKa/KvLdgkKv9XCbA5hok=
=d3WG
-END PGP SIGNATURE-

Re: Unix domain socket leak in 6-STABLE

2007-06-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Thursday, June 14, 2007 14:03:27 -0300 Alexandre Biancalana 
[EMAIL PROTECTED] wrote:

 On 6/14/07, Marc G. Fournier [EMAIL PROTECTED] wrote:

 I don't know ... it was caused by an application, but nothing was freed up
 after the application was stop'd ...


 In my case the sockets are closed only if I stop the samba processes. When I
 just changed the connection mode from Unix Socket to TCP on nss_ldap.conf,
 the connections remain opened. I think this could be a problem with nss_ldap
 (in the way of the connections are handled ?) because samba is accessing
 OpenLDAP directly via TCP, the access via Unix Sockets is only done by Samba
 through nss_ldap.

 I trying to simulate this error on another machine. I will write some
 scripts/program that connect to OpenLDAP socket directly and via nss_ldap
 and post the results.

 Any more hints ?

Hrmm ... how about nss in general?  The one VPS that I killed off was using
nss-mysql for passwd/group and shadow ... it's definitely not something that is
normally done here, and about the only thing I can think of that is 'unusual'
about that specific VPS, in my case ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD4DBQFGcZL54QvfyHIvDvMRAgbBAJ4zbygUUNdl6kKEp+sAPW0vLgJsvwCWP768
Ulzq5eM+ygPOM+A243NTsg==
=EuC7
-END PGP SIGNATURE-

Re: Unix domain socket leak in 6-STABLE

2007-06-13 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


- --On Wednesday, June 13, 2007 09:17:36 -0700 Jeremy Chadwick 
[EMAIL PROTECTED] wrote:


 I've seen this kind of problem with domain sockets (at least on Linux
 with a multi-use tool called busybox) where on error conditions the
 code never bothered to close the existing socket it opened, thus
 resulting in leaks/resource exhaustion over time.  The code later got
 fixed, but a pretty nasty bug especially when the program is used in
 a lot of embedded products...

 In regards to FreeBSD, I remember reading some mails from Robert Watson
 last month in regards to UNIX domain socket code changes:

 http://monkey.org/freebsd/archive/freebsd-stable/200705/msg00200.html

'k, just to ring in here ... I can definitely attest to there being a leak 
here, as it was me that was originally burned by it ... in my case, I 
eventually was able to isolate which VPS/jail was causing it and haven't run it 
since, but was never able to determine exactly what was causing it, since there 
wasn't really anything unusual running in that jail :(

But ... based on the discussions that were had at the time, it was my
understanding that if all applications were shut down on the server (to the
bare minimum), eventually the kernel GC should clean up all residual sockets
... when I did this (shut down all applications but the very bare minimum) and
waited for 10+ minutes, socket usage never dropped below about 4k sockets in
use, or something like that ...

Unlike Ulrich, I wasn't running LDAP at the time, so that wasn't the cause for 
me ...

I could easily enough restart that jail if there was some more useful
information I could get from it, but the thread kinda dwindled off over time,
and rebooting a server every 3 days was getting a wee bit annoying to my clients
:)

But, if someone has something they'd like me to do to provide more info, I'm 
willing to do it (short of anything that requires DDB / console access ... that 
server is remote) ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGcC0y4QvfyHIvDvMRApuZAJ9xKfa2/LqkcMkFEr4vrtnLt3ObcQCg43hs
7QX1hYskbQh/L8XJn1r1/Ts=
=xKdx
-END PGP SIGNATURE-

Re: fast rate of major FreeBSD releases to STABLE

2007-05-19 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 19, 2007 01:22:40 -0700 KAYVEN RIESE [EMAIL PROTECTED] 
wrote:



 On Thu, 17 May 2007, [EMAIL PROTECTED] (Mark Linimon) wrote:

 On Thu, May 17, 2007 at 01:35:10PM -0500, Craig Boston wrote:

 The alternative would have been to commit what we had and _then_ found
 out all the bugs in the upgrade process (note: you won't be able to just
 blindly use portupgrade -af; you will need to read the UPDATING file for
 the proper procedure.  This is the unusual case of being such a sweeping
 change that the port management tools are not completely up to the task.)

 okay could this freeze an explanation for the fact that my x is totally
 hosed?  i know any random joe can't necessarily answer that.. but assuming
 it is true..

Not sure how ... since the freeze started, I haven't seen any commits to the X 
system go through, or anything else for that matter (sorry, except for one port 
that I can't recall its name) ... I know in my case, I'm looking forward to the 
freeze being lifted since there was a recent release of new versions of PHP ... 
:)

 how long is this freeze going to last then?

Not 100% certain, but Kris just posted a note about the X stuff being
committed, so I'm guessing RSN ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGTxzN4QvfyHIvDvMRAkIqAKDAV3YQkNPIS8+XXtM13dpA7CQybgCbBhUK
rxDqsrCVzL9DFQ+lLpCrSRs=
=ur1s
-END PGP SIGNATURE-

Re: fast rate of major FreeBSD releases to STABLE

2007-05-19 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 19, 2007 22:01:26 +0100 Chris [EMAIL PROTECTED] wrote:


 With the ports freeze I wonder in situations when a full freeze is
 needed it is better to do so on a separate testing branch so it allows
 security commits etc. to carry on as normal and then remerge again
 after testing is complete.  Or is this simply not possible to do?

IMHO, not impossible, but creates a lot more work than the disruption of a
couple of weeks without commits would justify ... you have to bear in mind,
once the freeze is lifted, all of the ports that had been modified on the
'branch' would then need to be re-modified on the regular branch, putting a lot
of work onto the shoulders of the maintainers themselves ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGT2ph4QvfyHIvDvMRAjCsAJ9bMyqm63cIFsP+my+FbRjcSNSNQQCgnmbt
UYmaKyCFgIc3ABhM82cTqYg=
=m+LC
-END PGP SIGNATURE-

Re: UNIX domain sockets MFC's

2007-05-15 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, May 15, 2007 19:09:22 +0200 Oliver Fromme 
[EMAIL PROTECTED] wrote:

 If there isn't, then start the jails one after another
 (not all at once) and keep checking.  Maybe it's just
 one specific jails (or a few of them) which trigger the
 problem.  With that procedure it should be possible to
 find it (or them).
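
The one-jail-at-a-time procedure suggested above could be scripted roughly as
follows. This is only a sketch: start_jail, count_sockets and settle are
hypothetical caller-supplied hooks (how a jail is started, how the socket
count is read, and how long to wait are all site-specific), and the threshold
is an arbitrary number to tune.

```python
def find_leaky_jails(jails, start_jail, count_sockets, settle, threshold):
    """Start jails one at a time and report those after which the
    open-socket count grows by more than `threshold`.

    Returns a list of (jail, growth) tuples for the suspects.
    """
    suspects = []
    baseline = count_sockets()
    for jail in jails:
        start_jail(jail)
        settle()  # give the jail time to reach its normal workload
        current = count_sockets()
        growth = current - baseline
        if growth > threshold:
            suspects.append((jail, growth))
        baseline = current  # measure the next jail against the new level
    return suspects
```

In practice count_sockets would read something like kern.ipc.numopensockets
and settle would sleep for a few minutes; a jail that leaks slowly may need
several samples rather than the single one taken here.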

'k, there is definitely a leak in here somewhere, since if I shut down all 
processes on the machine, the garbage collector should clean up the sockets, 
which isn't happening ...

... that said, after that last round with the 1200 find processes running, I 
shutdown the VPS that they were running in, and my socket usage has stayed 
around 2800, so something in that VPS looks to be 'the cause' ...

... my next step is going to be to restart that specific VPS and see if they 
start to climb again, but, again, even after shutting down all the processes, 
those sockets are not being released, so there is a problem somewhere that that 
one VPS is triggering ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSiX04QvfyHIvDvMRAoGpAJ0b05pHtfk514NafmDKcYcLYhFziQCfYxP+
mu5RXX5f516GiZHL4GFkeM8=
=HLVo
-END PGP SIGNATURE-

Re: Socket leak (Was: Re: What triggers No Buffer Space) ?Available?

2007-05-15 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


It didn't help ... the socket count kept climbing ...

- --On Tuesday, May 15, 2007 21:39:35 +0200 Ulrich Spoerlein 
[EMAIL PROTECTED] wrote:

 I'm slowly catching up on FreeBSD related mails and found this mail ...

 Marc G. Fournier wrote:
kern.ipc.numopensockets: 7400
kern.ipc.maxsockets: 12328
   
ps looks like:
   

 stuff deleted

                                2368  p2  Is+  Sat01PM   0:00.03 /bin/tcsh
  root    2112  0.0  0.1  5220  2360  p3  Ss+  Sat01PM   0:00.04 /bin/tcsh
  root   91221  0.0  0.1  5140  2440  p4  Ss+  11:49PM   0:00.12 -tcsh (tcsh)
 
  I don't think those processes should consume 7400 sockets.
  Indeed, this really looks like a leak in the kernel.
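
To keep an eye on the two sysctls quoted above, something like the following
Python sketch could be used. It shells out to `sysctl -n`; the function names
and the injectable `run` parameter are illustrative choices, not part of any
existing tool.

```python
import subprocess


def read_sysctl_int(name, run=subprocess.check_output):
    """Read an integer sysctl by running `sysctl -n NAME`."""
    return int(run(["sysctl", "-n", name]).decode().strip())


def socket_usage(run=subprocess.check_output):
    """Return (open sockets, socket limit, fraction of the limit in use)."""
    opened = read_sysctl_int("kern.ipc.numopensockets", run)
    limit = read_sysctl_int("kern.ipc.maxsockets", run)
    return opened, limit, opened / limit
```

With the figures quoted above (7400 open out of 12328), the fraction in use
is roughly 0.60. Logging socket_usage() every few minutes shows whether the
count keeps climbing or levels off after processes are shut down.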

 Robert has sent me a suggestion to try that I'm in the process of putting
 together right now, involving backing out some work on uipc_usrreg.c ...

 How did the backing out work for you?

 Ulrich Spoerlein
 --
 The trouble with the dictionary is you have to know how the word is
 spelled before you can look it up to see how it is spelled.
 -- Will Cuppy



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSjDm4QvfyHIvDvMRAv+4AKCUc0ijgXs4igHymP94NGM5XAmvXQCfUi2X
m/jpnf+voCioDKmJjedIRbw=
=dyqI
-END PGP SIGNATURE-

Re: UNIX domain sockets MFC's

2007-05-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Monday, May 14, 2007 11:29:12 +0100 Robert Watson [EMAIL PROTECTED] 
wrote:


 On Sat, 12 May 2007, Marc G. Fournier wrote:

 The fix for this has now been merged as 1.155.2.22.  As there have been no
 new reports of UNIX domain socket problems in the last couple of days, it
 sounds like the MFC of the last batch of fixes and cleanups has not led to
 problems.

 I've just upgraded my kernel to the latest, to include the MFC'd code above
 ...

 Yes -- I was very specific in my e-mail regarding the MFC's that they were
 not believed to address the problem you are reporting.  I think we have a
 leak in the way some edge case is handled with regard to UNIX domain socket
 shutdown. What would be really nice to know is if that persists in 7-CURRENT,
 in which we've redone the way the socket life cycle works.  However, I don't
 know if you are able to tolerate booting a 7-CURRENT kernel in your
 environment...?

On that server, that could be very difficult ... if this was happening on any 
of my HP servers, I would in a minute ...

 Did we determine whether backing out to before the unpcb socket reference
 count change made any difference for you?

The problem appeared to persist after backing it out ...

I'm curious about something ... way back, when I was using unionfs, I had a 
major problem with vnode leakage ... as I mentioned before, this server is the 
only one I have that uses geom/gmirror on its drives, the rest all use hardware 
RAID ... is there *any* possibility that I'm seeing some sort of interaction 
issue?  It really bothers me that the only server that I'm seeing this on is 
the one that I'm using software RAID on ...

Would it be useful to add some DEBUG statements to the socket code, to trace 
open/close/flush/etc?  Maybe to see where flush's are being started, but never 
completed?  That sort of thing?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSFn14QvfyHIvDvMRAow2AKC67Y0QuiiF+ZJA5Tpbd3WUvcmdTwCaAgZS
OY4em31JQzIIbs1CUcmpHNo=
=1Mqr
-END PGP SIGNATURE-


Re: UNIX domain sockets MFC's

2007-05-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Monday, May 14, 2007 16:03:13 +0200 Oliver Fromme
[EMAIL PROTECTED] 
wrote:

 FWIW, I have two servers running RELENG_6 (2 months old)
 using gmirror and with a few jails (not many, though ...
 they're used for Apache web servers and PostgreSQL).
 I'm not seeing any socket leakage.

 $ sysctl kern.ipc | grep sockets
 kern.ipc.numopensockets: 118
 kern.ipc.maxsockets: 12328
 $ uptime
  3:55PM  up 82 days, 20:39, 3 users, load averages: 0.04, 0.05, 0.02
 $ gmirror status
   NameStatus  Components
 mirror/gm0  COMPLETE  ad0
   ad1

 If you have more hints how to reproduce the problem, I
 might give it a try if it's not too much trouble.

That's the fun part ... I can't seem to re-create it anywhere except that one 
server :(  And it doesn't seem to matter how many jail(s) I have on it ... I 
just dump'd 25 jails off of it and onto another server, and it's still rising 
...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSIv14QvfyHIvDvMRAvbLAKDI62gdfiP8Q++eEtsQkL7Qi19KxQCgj3Qw
AmUDtwd92A6n2mLs3REVTkI=
=Av2b
-END PGP SIGNATURE-


Re: Known memory leak in 6-STABLE from April 1st?

2007-05-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Monday, May 14, 2007 19:07:24 +0200 Ulrich Spoerlein 
[EMAIL PROTECTED] wrote:

 Hi all,

 I observed something funny with our new cyrus/postfix/amavis
 installations running on 6.2-STABLE checked out on April 1st (no, I'm
 not joking).

 They are running symon to grab performance data and I saw the memory
 total becoming less and less. Now I know that adding up
 free+active+inactive != total ram BUT *all* other FreeBSD machines we
 are running show a more or less constant sum.

 I uploaded two pictures showing the trend here (They are i386 machines
 with 4GB RAM, FreeBSD reports 3.3GB as usable):

 http://coyote.dnsalias.net/ms1-day.png
 http://coyote.dnsalias.net/ms1-week.png

 Now after doing some heavy IMAP testing (cyrus reconstruct of big
 maildirs) the system froze to a complete halt. Stupid me already
 rebooted the machine, tomorrow I'll try to break into DDB when it
 happens again. I also started recording top(1) memory output and
 sysctl vm.zone output.

 The main question is: Were there any known memory leaks at the start
 of April? Any patches I should blindly try before spending several
 days on debugging this?

Hrmmm ... long shot here, but what does:

sysctl kern.ipc.numopensockets

show over that period of time ... just wondering if we are somehow related on 
problems here, just different symptoms ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSJyC4QvfyHIvDvMRAmDJAJwMe9ihH/5ITea58y1Qivilfju2KACgidMf
Aq68KICMse94bckc2UL/7Sw=
=TUSW
-END PGP SIGNATURE-


Does a pipe take a socket ... ?

2007-05-14 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


For those that remember the other day, I had that swzone issue, where I ran out 
of swap space?  I just about hit it again today, swap was up to 99% used ... I 
was able to get a ps listing in, and there were a whack of find processes 
running ...

Now, I think I know which VPS they were running in, so that isn't a problem ... 
and I suspect that the find was just part of a longer pipe ... I'm just curious 
if those pipes would happen to use up any of those sockets that are 
'evaporating', or is this totally unrelated to sockets?

Thanks ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGSKR54QvfyHIvDvMRAg/iAKCXXw2eBMr6reJlKNqcG2IvlSvXvgCgi0R+
3cPjCNRy9r+N1MSYETwKPv4=
=ha/b
-END PGP SIGNATURE-


Re: UNIX domain sockets MFC's

2007-05-12 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Friday, May 11, 2007 12:49:32 +0100 Robert Watson [EMAIL PROTECTED] 
wrote:

 On Tue, 8 May 2007, Robert Watson wrote:

 Right now I am tracking two known issues with UNIX domain sockets in
 RELENG_6:

 - Reported NULL pointer dereference in unp_connect(), which occurs due to the
  dropping of locks around sonewconn().  This is fixed in HEAD, and I am
  preparing an MFC of this patch.

 The fix for this has now been merged as 1.155.2.22.  As there have been no
 new reports of UNIX domain socket problems in the last couple of days, it
 sounds like the MFC of the last batch of fixes and cleanups has not led to
 problems.

I've just upgraded my kernel to the latest, to include the MFC'd code above ...

Just before rebooting, as I've done the past couple of times, I shut down 
everything on the server, so that there were minimal processes running ... 
based on the last one, and this one, it looks like the number of Active open 
sockets is ~4000 ... last time, I was up to 11k sockets open, and it drop'd to 
~7000 once all jails were shut down, but, as reported to Robert/John, there was 
a java process in a soclose state, so I wasn't 100% certain there ...

This time through, I started at about 8800 sockets open, and shut down all 
processes, including all java processes ... using ps auxlw, I checked for any 
processes in a soclose state, and there were none ... I waited a full 10 
minutes to let things 'settle', and after 7 of those, it had drop'd down to:

mars# uptime ; sysctl kern.ipc | grep sock
 2:18PM  up 1 day, 13:26, 5 users, load averages: 0.00, 0.47, 2.57
kern.ipc.maxsockbuf: 262144
kern.ipc.sockbuf_waste_factor: 8
kern.ipc.numopensockets: 4835
kern.ipc.maxsockets: 12328

And stuck there for the remaining 3 minutes before I rebooted ... which is what 
leads me to believe that there are about 4000 active sockets on this server 
when everything is running ...
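
As a rough aid for watching how close the count is to the limit, the two sysctl values above can be turned into a single usage figure. This is only a sketch: socket_usage is a made-up helper name, and it reads `sysctl kern.ipc` style output on stdin rather than calling sysctl itself.

```shell
# socket_usage (hypothetical helper): reads "name: value" lines in the
# format printed by `sysctl kern.ipc` and reports open sockets vs. the limit.
socket_usage() {
    awk -F': ' '
        $1 == "kern.ipc.numopensockets" { nopen = $2 }
        $1 == "kern.ipc.maxsockets"     { nmax = $2 }
        END {
            # only print when the limit was actually seen in the input
            if (nmax > 0)
                printf "%d/%d (%.1f%%)\n", nopen, nmax, 100 * nopen / nmax
        }
    '
}
```

Used as `sysctl kern.ipc | socket_usage`; with the values shown above it would report 4835/12328 (39.2%).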


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGRfpC4QvfyHIvDvMRAuzoAKDbb5Fndwtw8paTsmLdXIP+FrOBHQCeIVKf
Uhlv8ZRAjVar/fRHD3E6waM=
=yglM
-END PGP SIGNATURE-


Re: UNIX domain sockets MFC's

2007-05-11 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Friday, May 11, 2007 12:49:32 +0100 Robert Watson [EMAIL PROTECTED] 
wrote:

 On Tue, 8 May 2007, Robert Watson wrote:

 Right now I am tracking two known issues with UNIX domain sockets in
 RELENG_6:

 - Reported NULL pointer dereference in unp_connect(), which occurs due to the
  dropping of locks around sonewconn().  This is fixed in HEAD, and I am
  preparing an MFC of this patch.

 The fix for this has now been merged as 1.155.2.22.  As there have been no
 new reports of UNIX domain socket problems in the last couple of days, it
 sounds like the MFC of the last batch of fixes and cleanups has not led to
 problems.

I will work on upgrading that system right now to the latest -STABLE and let 
you know ... did you happen to receive my email concerning that java process in 
a soclose state?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGRISl4QvfyHIvDvMRAhNVAJ94AKDAhNQIk3Kkq3PRbiru0a+T2QCfWglT
kwaljA9wg70RKzqcyOwDz3U=
=FuMA
-END PGP SIGNATURE-


Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?

2007-05-08 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, May 08, 2007 15:14:29 +0200 Oliver Fromme 
[EMAIL PROTECTED] wrote:

 What kind of jails are those?  What applications are
 running inside them?  It's quite possible that the
 processes on one machine use 120 sockets per jail,
 while on a different machine they use only half that
 many per jail, on average.  Of course, I can't tell
 for sure without knowing what is running in those
 jails.

They all run pretty much the same thing, on all the machines ... by default, 
standard syslog, sshd, cron, cyrus imapd, postfix and apache ... some run 
aolserver over top of that, or jdk/tomcat, or zope ... but they aren't specific 
to the server itself, as they get moved around ...

   kern.ipc.numopensockets: 7400
   kern.ipc.maxsockets: 12328
  
   ps looks like:
  

stuff deleted

                               2368  p2  Is+  Sat01PM   0:00.03 /bin/tcsh
 root    2112  0.0  0.1  5220  2360  p3  Ss+  Sat01PM   0:00.04 /bin/tcsh
 root   91221  0.0  0.1  5140  2440  p4  Ss+  11:49PM   0:00.12 -tcsh (tcsh)

 I don't think those processes should consume 7400 sockets.
 Indeed, this really looks like a leak in the kernel.

Robert has sent me a suggestion to try that I'm in the process of putting 
together right now, involving backing out some work on uipc_usrreg.c ...


 Maybe sockstat -u and/or fstat | grep -w local (both
 of those commands should basically list the same kind of
 information).  My guess is that the output will be rather
 short, i.e. much shorter than 7355 lines.  If that's true,
 it is another indication that the problem is caused by
 a kernel leak.

at the time I rebooted, with no processes, but 7400 sockets:

 wc -l sockstat.out.txt
  12 sockstat.out.txt
 grep local fstat.out.txt | wc -l
   7
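
The gap between the kernel's counter and what the userland tools can see can be put into one number. A minimal sketch, assuming the saved sockstat output has a single header line and one socket per remaining line (a socket shared by several processes is listed more than once, so this overstates what is visible); unaccounted_sockets is an illustrative name, not an existing tool.

```shell
# unaccounted_sockets (hypothetical helper)
#   $1 = value of kern.ipc.numopensockets
#   $2 = file holding saved sockstat output (first line is the header)
unaccounted_sockets() {
    # count every line after the header; prints 0 for a header-only file
    visible=$(awk 'NR > 1 { n++ } END { print n + 0 }' "$2")
    echo $(( $1 - visible ))
}
```

For example, `unaccounted_sockets 7400 sockstat.out.txt` on the 12-line file above would print 7389.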

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGQLrf4QvfyHIvDvMRAqlWAJ9Dg2J55e6YVAzkfC9mGascFfr+JQCeJpWo
uXAZtN0WbyKdM4a12WJjszs=
=BA7G
-END PGP SIGNATURE-


Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?

2007-05-07 Thread Marc G. Fournier

:00.03 /bin/tcsh
root2112  0.0  0.1  5220  2360  p3  Ss+  Sat01PM   0:00.04 /bin/tcsh
root   91221  0.0  0.1  5140  2440  p4  Ss+  11:49PM   0:00.12 -tcsh (tcsh)

And netstat -n -funix shows 7355 lines similar to:

d05f1000 stream      0      0        0 d05f1090        0        0
d05f1090 stream      0      0        0 d05f1000        0        0
cf1be000 stream      0      0        0 cf1bdea0        0        0
cf1bdea0 stream      0      0        0 cf1be000        0        0
cec42bd0 stream      0      0        0 cf2ac480        0        0
cf2ac480 stream      0      0        0 cec42bd0        0        0

with the final few associated with running processes:

c95ad000 stream      0      0 c95aa000        0        0        0 /var/run/devd.pipe
c95aca20 dgram       0      0        0 c95ace10        0        0
c95accf0 dgram       0      0 c95c7110        0        0        0 /var/named/var/run/log
c95acd80 dgram       0      0 c95c7330        0        0        0 /var/run/log
c95ace10 dgram       0      0 c95c7440        0 c95aca20        0 /var/run/logpriv
c95acea0 dgram       0      0 c95c7550        0        0        0 /var/run/log

So, over 7000 sockets with pretty much all processes shut down ...

Shouldn't the garbage collector be cutting in somewhere here?

I'm willing to shut everything down like this again the next time it happens (in 
2-3 days) if someone has some other command / output they'd like for me to 
provide the output of?

And, I have the following outputs as of the above, where everything is shut down 
and it's running on minimal processes:

# ls -lt
total 532
-rw-r--r--  1 root  wheel   11142 May  8 00:20 fstat.out
-rw-r--r--  1 root  wheel     742 May  8 00:20 netstat_m.out
-rw-r--r--  1 root  wheel  486047 May  8 00:20 netstat_na.out
-rw-r--r--  1 root  wheel     735 May  8 00:20 sockstat.out
-rw-r--r--  1 root  wheel    6266 May  8 00:20 vmstat_m.out
-rw-r--r--  1 root  wheel    5376 May  8 00:20 vmstat_z.out
-rw-r--r--  1 root  wheel    4910 May  8 00:20 ps.out
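
The thousands of nameless stream entries in the netstat listing above come in pairs whose Conn fields point at each other. One way to count such pairs in a saved listing, as a sketch only: it assumes the usual eight whitespace-separated columns that netstat prints for an unnamed UNIX domain socket (Address, Type, Recv-Q, Send-Q, Inode, Conn, Refs, Nextref), and count_unnamed_pairs is a hypothetical helper.

```shell
# count_unnamed_pairs (hypothetical helper): reads netstat UNIX domain
# socket lines on stdin and prints how many unnamed stream sockets are
# connected to each other in pairs.
count_unnamed_pairs() {
    awk '
        # remember the Conn field (column 6) of every unnamed stream socket;
        # named sockets carry a ninth column (the path) and are skipped
        $2 == "stream" && NF == 8 { conn[$1] = $6 "" }
        END {
            for (a in conn) {
                b = conn[a]
                # both ends must point at each other; a < b counts a pair once
                if ((b in conn) && conn[b] == a && a < b)
                    pairs++
            }
            print pairs + 0
        }
    '
}
```

Something like `count_unnamed_pairs < netstat_na.out` would then give the number of such pairs in the saved output.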


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGP+8z4QvfyHIvDvMRAlI+AJ9D0LIRCsFvQShS5TjN/QHw9VyTeQCggYMS
Uc0aJpCLwdZxsH3jVllUZi4=
=e97x
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 13:11:35 -0700 Matthew Dillon 
[EMAIL PROTECTED] wrote:

 We'll have a better idea as to what is going on when you get the message
 again.  You might even want to do a once-a-10-minutes cron job to
 append pstat -s, vmstat -m, and vmstat -z to a file.

'k, I have the following running out of cron every 10 minutes ... anything else 
that might be useful?  This combines the information Robert got me to send him, 
as well as adding pstat -s and ps aux ...

#!/bin/sh
# One directory per run, named by timestamp (YYYYMMDDHHMM)
DATE=`date +%Y%m%d%H%M`
DIR=/vm/watch/${DATE}
mkdir ${DIR}
# Save the output of each command into its own file
ps aux      > ${DIR}/ps.out
sockstat    > ${DIR}/sockstat.out
netstat -na > ${DIR}/netstat_na.out
fstat       > ${DIR}/fstat.out
vmstat -z   > ${DIR}/vmstat_z.out
vmstat -m   > ${DIR}/vmstat_m.out
netstat -m  > ${DIR}/netstat_m.out
pstat -s    > ${DIR}/pstat_s.out
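
Once a few snapshots have accumulated, a per-snapshot count shows whether the numbers are climbing. A sketch under the assumption that each snapshot lives in its own timestamp-named directory, as in the script above, and that sockstat.out starts with one header line; snapshot_trend is a made-up name.

```shell
# snapshot_trend (hypothetical helper)
#   $1 = base directory with one timestamped subdirectory per snapshot
# Prints "<timestamp> <sockstat entries>" for each snapshot, oldest first
# (the shell expands the glob in sorted order).
snapshot_trend() {
    for d in "$1"/*/; do
        f="${d}sockstat.out"
        # skip directories without a sockstat.out (and an unmatched glob)
        [ -f "$f" ] || continue
        n=$(awk 'NR > 1 { c++ } END { print c + 0 }' "$f")
        echo "$(basename "$d") $n"
    done
}
```

With the layout above this would be run as `snapshot_trend /vm/watch`.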



   -Matt
   Matthew Dillon
   [EMAIL PROTECTED]



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPqqz4QvfyHIvDvMRAsHgAKDpv7/SIKEAYIx7NVc8tdeUaAL4YwCg7Rnr
OKYu+cZK2EUjXUpn62zSOIQ=
=rVxB
-END PGP SIGNATURE-


swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


What exactly does that one mean?  I've searched Google, and all I'm finding is 
a pointer to swap_pager.c, but nothing else ...

What does that one mean?  What would cause that sort of error?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPKT/4QvfyHIvDvMRAiJBAJwPv6Su4TQGToWznFRK2wlNeU+L6wCgpCrF
U4mSIwGJGWZ/YTXZ8aBmWv4=
=MUcQ
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 12:06:55 -0400 Kris Kennaway [EMAIL PROTECTED] 
wrote:

 On Sat, May 05, 2007 at 12:38:39PM -0300, Marc G. Fournier wrote:

 What exactly does that one mean?  I've searched Google, and all I'm finding
 is  a pointer to swap_pager.c, but nothing else ...

 What does that one mean?  What would cause that sort of error?

 You need to increase the kern.maxswzone tunable to enable more space
 for active swap.

Apparently that doesn't exist on 6-STABLE, although it's generating the error?

# sysctl kern.maxswzone
sysctl: unknown oid 'kern.maxswzone'

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPK214QvfyHIvDvMRArdfAJ9cqw7x1+dYINa776Ptes4iyjaHEwCeMI8X
ZGUy+Xp2rbWMIc7SnId2TJg=
=vMg0
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 20:35:46 +0400 pluknet [EMAIL PROTECTED] wrote:

 Hello,

 On 05/05/07, Marc G. Fournier [EMAIL PROTECTED] wrote:
 # sysctl kern.maxswzone
 sysctl: unknown oid 'kern.maxswzone'

 It is a /boot/loader.conf variable, not in sysctl MIB.

Hrmmm ... then how do I know what to increase it to, if I don't know what it is 
currently set to? :(  I thought all the /boot/loader.conf variables were 
viewable read only via sysctl ... ?  kinda like nmbclusters:

# sysctl -a | grep nmbcl
kern.ipc.nmbclusters: 25600

I can't set it via sysctl, it has to be in /boot/loader.conf ... but I can at 
least view its value ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPLQM4QvfyHIvDvMRAkSbAKDojBtpy7zbpRZvC9K16Q5BVL4pWQCg51T5
UgGcvEgqOetC2u9uIsjPqfE=
=USzs
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 19:12:09 +0200 Martin Hudec [EMAIL PROTECTED] 
wrote:

 Marc G. Fournier wrote:
 Apparently that doesn't exist on 6-STABLE, although its generating the error?

 # sysctl kern.maxswzone
 sysctl: unknown oid 'kern.maxswzone'

 As in /usr/src/sys/conf/NOTES:
 ...
# 2.  In /boot/loader.conf, set the tunables kern.maxswzone,
# kern.maxbcache, kern.maxtsiz, kern.dfldsiz, kern.maxdsiz,
# kern.dflssiz, kern.maxssiz and kern.sgrowsiz.
 ...

 As in man loader:

 kern.maxswzone
 Limits the amount of KVM to be used to hold swap meta
 information, which directly governs the maximum amount of
 swap the system can support.  This value is specified in
 bytes of KVA space and defaults to around 70MBytes.  Care
 should be taken to not reduce this value such that the
 actual amount of configured swap exceeds 1/2 the kernel-
 supported swap.  The default 70MB allows the kernel to sup-
 port a maximum of (approximately) 14GB of configured swap.
 Only mess around with this parameter if you need to greatly
 extend the KVM reservation for other resources such as the
 buffer cache or NMBCLUSTERS.  Modifies VM_SWZONE_SIZE_MAX.

 Also check -hackers maillist for following and the replies:
 http://lists.freebsd.org/pipermail/freebsd-hackers/2007-January/019217.html

Sweet, that definitely helps ... from John's response in that email, it sounds 
like this may be related to the socket issue that I've already reported, since 
it all seems to revolve around the KVA ... I wonder if the socket issue is 
'pushing into' the swap stuff (ie. this is a result of the problem, not the 
cause) ...

But, based on the 'default 70MB == 14G of configured swap' above .. I only have 
8G of swap on that machine, which really makes it sound like this is an 
overflow from the other problem :(
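
The figures quoted from the loader man page above can be worked through directly. This sketch assumes the supported swap scales linearly with kern.maxswzone (70MB for roughly 14GB), which the man page implies but does not state outright; the function name is invented for illustration.

```shell
# swzone_limits (hypothetical helper)
#   $1 = kern.maxswzone in MB
# Prints the approximate swap the kernel can support and half of that,
# both in MB, using the 70MB -> 14GB ratio from the man page.
swzone_limits() {
    supported=$(( $1 * 14 * 1024 / 70 ))
    echo "supported=${supported}MB half=$(( supported / 2 ))MB"
}
```

For the default, `swzone_limits 70` gives supported=14336MB half=7168MB, which can be set against the 8G of swap configured on this machine.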

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPL9J4QvfyHIvDvMRAuKvAKCljSizyOpaY9Ep6OfpFh++9e5HqQCgmXMb
Z+26yS6pgkqF6qsACcnATiM=
=zi67
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 10:49:29 -0700 Matthew Dillon 
[EMAIL PROTECTED] wrote:

 The swblock structures only apply to actively swapped out data.  Mark,
 how much data is actually swapped out (pstat -s) at the time the
 problem is reported?

 If you can dump UMA memory statistics that would be beneficial as well.
 I just find it hard to imagine that any system would actually be using
 that much swap, but hey! :-)

That's why I think that the socket issue and this one are co-related ... with 
everything started up (93 jails), my swap usage right now is:

mars# pstat -s
Device          1K-blocks     Used    Avail Capacity
/dev/da0s1b       8388608       20  8388588     0%

It's only been up 2.5 hours so far, but still, everything is started up ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPMh54QvfyHIvDvMRAiYuAJ92hIiO+Sx+7aYeHCqNhpz8uwqL3ACgk+/y
t71wYXIg6SCgB92NaVPc9A0=
=+asv
-END PGP SIGNATURE-


Re: swap zone exhausted, increase kern.maxswzone

2007-05-05 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Saturday, May 05, 2007 13:11:35 -0700 Matthew Dillon 
[EMAIL PROTECTED] wrote:


 We'll have a better idea as to what is going on when you get the message
 again.  You might even want to do a once-a-10-minutes cron job to
 append pstat -s, vmstat -m, and vmstat -z to a file.

'k, that I can do :)

Thanks ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGPOxA4QvfyHIvDvMRAm8KAJ48oDaEeLYhJ6Ce6m5YH6h2N5gEVACeLAyp
/D8O7DSiGxXYavMpzRN4ft0=
=8QHO
-END PGP SIGNATURE-


Re: Socket leak (Was: Re: What triggers No Buffer Space) Available?

2007-05-04 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Friday, May 04, 2007 12:05:11 +0100 Robert Watson [EMAIL PROTECTED] 
wrote:

 I think we should be careful to avoid prematurely drawing conclusions about
 the source of the problem.  First question: have you confirmed that the
 resource limit on sockets is definitely what is causing the error you're
 seeing?  I.e., does the number of sockets hit the maximum sockets?

'k, so, based on your other email this morning, about sockstat | stream, I'm 
now keeping an eye on:

# uptime ; netstat -nA | grep -c stream ; sockstat -u | grep -c stream ; sysctl 
kern.ipc.numopensockets ; sysctl kern.ipc.maxsockets
 8:59AM  up 1 day,  9:57, 7 users, load averages: 1.63, 4.92, 5.12
6877
2323
kern.ipc.numopensockets: 8463
kern.ipc.maxsockets: 12328

I'm at least 24 hours out from the error(s) starting to happen ...

 Second point: there are two kinds of resource leaks that seem likely
 candidates for a socket resource exhaustion problem. First, kernel bugs, in
 which the kernel maintains objects despite there being no application
 references, and second, application reference leaks, in which applications
 keep references to kernel objects despite no longer needing them.  Our
 immediate goal is to determine which of these is the case: is it a kernel
 bug, or an application bug?  Using tools like netstat and sockstat, we can
 try and determine if all kernel sockets are properly referenced.  Experience
 suggests that it is an application bug, but we shouldn't rule out a kernel
 bug; the good news is that the tools to use in the debugging process are
 identical at this stage.

'k, in preparation for it starting, so that I can reboot as quickly as 
possible, but get max information ... do I just want to save the output of 
'sockstat -u' and 'netstat -nA', or is there something else that will be useful?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGOz294QvfyHIvDvMRAsy6AKCme99kb27uIHrgLC53fVCZrqKkSgCgheFR
2DYk1DPdmAGzoJhqAXpt+Sc=
=G1NF
-END PGP SIGNATURE-


Re: What triggers No Buffer Space Available?

2007-05-03 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Thursday, May 03, 2007 11:17:56 -0400 Chuck Swiger [EMAIL PROTECTED]
wrote:


 The ones you're showing are from Postfix.  It would be interesting to sort
 them by frequency and see what the majority of the use is from.

 If you sort the data by the conn field, do the ones without an address all
 hit the same thing?  If you grep for that in the first field, I found a lot
 that are talking to /var/run/logpriv (ie, a socketpair() to syslogd,
 presumably).

Okay, assuming that I'm doing this right, here's what I have:

Last night, before I went to bed:

mars# netstat -A | grep stream | wc -l ; sockstat -u | wc -l
2705
2981

Today, 5 minutes ago:

# netstat -A | grep stream | wc -l ; sockstat -u | wc -l
4397
2961

Looking at the Conn field from netstat -A:

mars# awk '{print $6}' /tmp/output | sort | uniq -c | sort -nr | head -5
2125 0
   1 d14dbe10
   1 d14dbbd0
   1 d14dbb40
   1 d14dba20

So, 2125 sockets not connected to anything?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGOhP+4QvfyHIvDvMRAhdvAKCZo5JRwFea0E8wb+iFblJ1aHM57gCdEb2T
KMJhc7OT5kyQNMslL7Rm+LE=
=+0kp
-END PGP SIGNATURE-


Re: What triggers No Buffer Space Available?

2007-05-03 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Thursday, May 03, 2007 19:28:56 +0100 Robert Watson [EMAIL PROTECTED] 
wrote:

 I generally recommend using a combination of netstat and sockstat.  Sockets
 represent, loosely, IPC endpoints.  There are actually two layers
 associated with each socket -- the IPC object (socket) and the protocol
 control block (PCB).  Both are resource limited to prevent run-away processes
 from swamping the system, so exhaustion of either can lead to ENOBUFS.

 The behaviors of netstat and sockstat are quite different, even though the
 output is similar: netstat walks the protocol-layer connection lists and
 prints information about them.  sockstat walks the process file descriptor
 table and prints information on reachable sockets.  As sockets can exist
 without PCBs, and PCBs can exist without sockets, you need to look at both to
 get a full picture.  This can occur if a process exits, closes the socket, and
 the connection remains in, for example, the TIME_WAIT state.

 There are some other differences -- the same socket can appear more than once
 in sockstat output, as more than one process can reference the same socket.
 Sockets can also exist without any referencing process (if the application
 closes, but there is still data draining on an open socket).

 I would suggest starting with sockstat, as that will allow you to link socket
 use to applications, and provide a fairly neat summary.  When using netstat,
 use netstat -na, which will list all sockets and avoid name lookups.

'k, all I'm looking at right now is the Unix Domain Sockets, and the output of 
netstat - sockstat is growing since I first started counting both ..

This was shortly after reboot:

mars# netstat -A | grep stream | wc -l ; sockstat -u | wc -l
2705
2981

From your explanation above, I'm guessing that the higher sockstat #s are where 
you were talking about one socket being used by multiple processes?  But, right 
now:

mars# netstat -nA | grep stream | wc -l ; sockstat -u | wc -l
5025
2905

sockstat -u #s are *down*, but netstat -na is almost double ...

Again, based on what you state above: "Sockets can also exist without any 
referencing process (if the application closes, but there is still data 
draining on an open socket)."

Now, looking at another 6-STABLE server, but one that has been running for 2 
months now, I'm seeing numbers more consistent with what mars looks like 
shortly after all the jails start up:

venus# netstat -nA | grep stream | wc -l ; sockstat -u | wc -l
2126
2209

So, if those sockets on mars are 'still draining on an open socket', is there 
some way of finding out where?  If I'm understanding what you've said above, 
these 'draining sockets' don't have any processes associated with them anymore? 
So, its not like I can just kill off a process, correct?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGOlh34QvfyHIvDvMRApSUAJ9jPszXBw83hXPRLbczimNWFtn6WwCgpijT
nDWi/kW4Gt8/J2a4U3n2prk=
=IQCW
-END PGP SIGNATURE-


Socket leak (Was: Re: What triggers No Buffer Space) Available?

2007-05-03 Thread Marc G. Fournier


I'm trying to probe this as well as I can, but network stacks and sockets have 
never been my strong suit ...

Robert had mentioned in one of his emails that "Sockets can also exist 
without any referencing process (if the application closes, but there is still 
data draining on an open socket)."

Now, that makes sense to me, I can understand that ... but, how would that look 
as far as netstat -nA shows?  Or, would it?  For example, I have:

mars# netstat -nA | grep c9655a20
c9655a20 stream      0      0     0 c95d63f0    0       0
c95d63f0 stream      0      0     0 c9655a20    0       0
mars# netstat -nA | grep c95d63f0
c9655a20 stream      0      0     0 c95d63f0    0       0
c95d63f0 stream      0      0     0 c9655a20    0       0

They are attached to each other, but there appears to be no 'referencing 
process' ... it is now 10pm at night ... I saved a 'snapshot' of netstat -nA 
output at 6:45pm, over 3 hours ago, and it has the same entries as above:

c9655a20 stream      0      0     0 c95d63f0    0       0
c95d63f0 stream      0      0     0 c9655a20    0       0

again, if I'm reading this right, there is no 'referencing process' ... first, 
of course, am I reading this right?

second ... if I am reading this right, and, if I am understanding what Robert 
was saying about 'draining' (a lot of ifs, I know) ... isn't it odd for it to 
take 3 hours to drain?
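
One way to check whether the same sockets really persist between two saved
netstat -nA snapshots is to intersect their address columns.  The following is
a minimal Python sketch, not something from the thread: the function names and
the sample snapshot text are made up for illustration, and it assumes the
socket address is the first whitespace-separated field on each 'stream' line.
Kernel addresses can be reused, so a match is only a hint that it is the same
socket.

```python
def socket_addresses(netstat_text):
    """Return the set of addresses (first column) found on 'stream' lines."""
    addresses = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "stream":
            addresses.add(fields[0])
    return addresses


def persistent_sockets(earlier_text, later_text):
    """Addresses present in both snapshots, i.e. candidates for stuck sockets."""
    return socket_addresses(earlier_text) & socket_addresses(later_text)


if __name__ == "__main__":
    # Hypothetical snapshots; real ones would be files saved from
    # 'netstat -nA' at two different times.
    snapshot_early = (
        "c9655a20 stream 0 0 0 c95d63f0 0 0\n"
        "c95d63f0 stream 0 0 0 c9655a20 0 0\n"
        "c1111111 stream 0 0 0 c2222222 0 0\n"
    )
    snapshot_late = (
        "c9655a20 stream 0 0 0 c95d63f0 0 0\n"
        "c95d63f0 stream 0 0 0 c9655a20 0 0\n"
        "c3333333 stream 0 0 0 c4444444 0 0\n"
    )
    for address in sorted(persistent_sockets(snapshot_early, snapshot_late)):
        print(address)
```

Running this over snapshots taken hours apart would narrow the list down to
the long-lived entries worth investigating.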

Again, if I'm reading / understanding things right, without the 'referencing 
process', it won't show up in sockstat -u, which is why my netstat -nA numbers 
keep growing, but sockstat -u numbers don't ... which also means that there is 
no way to figure out what process / program is leaving 'dangling sockets'? :(


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: Socket leak (Was: Re: What triggers No Buffer Space Available?)

2007-05-03 Thread Marc G. Fournier



--On Thursday, May 03, 2007 18:26:30 -0700 Matthew Dillon [EMAIL PROTECTED] wrote:


 One thing you can do is drop into single user mode... kill all the
 processes on the system, and see if the sockets are recovered.  That
 will give you a good idea as to whether it is a real leak or whether
 some process is directly or indirectly (by not draining a unix domain
 socket on which other sockets are being transferred) holding onto the
 socket.

*groan*  why couldn't this be happening on a server that I have better remote 
access to? :(

But, based on your explanation(s) above ... if I kill off all of the jail(s) on 
the machine, so that there are minimal processes running, shouldn't I see a 
significant drop in the number of sockets in use as well?  or is there 
something special about single user mode vs just killing off all 'extra 
processes'?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: What triggers No Buffer Space Available?

2007-05-02 Thread Marc G. Fournier



--On Wednesday, May 02, 2007 11:00:17 +0800 Adrian Chadd [EMAIL PROTECTED] wrote:


 It doesn't panic when it happens, no?

Nope ... I can login via ssh (sometimes it takes a try or two, but I can always 
login) and then do a 'reboot', and all is well again for another 72 hours or so 
...

 I'd check the number of sockets you've currently got open at that
 point.

ie:

# netstat | egrep 'tcp4|udp4' | awk '{print $1}' | uniq -c
 171 tcp4
 103 udp4

or is there a better command I should be using?

 Some applications might be holding open a whole load of sockets
 and their buffers stay allocated until they're closed. If they don't
 handle/don't get told about the error then they'll just hold open the
 mbufs.

Is there any way of determining which apps are holding open which sockets?  ie. 
lsof for open files?
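
The approach suggested elsewhere in this thread (netstat -A for the socket
addresses, fstat for the descriptors that reference them) can be scripted.
What follows is a rough Python sketch under stated assumptions, not a tested
tool: the function names and sample lines are hypothetical, and because the
exact fstat column layout is not shown in the thread, the script simply
collects every hex-looking token in the fstat output.  That makes it
conservative: an unrelated hex-looking number in fstat could hide a socket
that is in fact unreferenced.

```python
import re

# Any run of 8 to 16 lowercase hex digits is treated as a possible address.
HEX_ADDR = re.compile(r"\b[0-9a-f]{8,16}\b")


def netstat_socket_addresses(netstat_text):
    """First-column addresses of 'stream' lines in saved netstat -A output."""
    addresses = set()
    for line in netstat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == "stream":
            addresses.add(fields[0])
    return addresses


def referenced_addresses(fstat_text):
    """Every hex-looking token in saved fstat output."""
    return set(HEX_ADDR.findall(fstat_text))


def unreferenced_sockets(netstat_text, fstat_text):
    """Sockets listed by netstat that no fstat line mentions."""
    return netstat_socket_addresses(netstat_text) - referenced_addresses(fstat_text)


if __name__ == "__main__":
    # Hypothetical sample data for illustration only.
    netstat_sample = (
        "d06b7480 stream 0 0 0 c969b240 0 0 private/proxymap\n"
        "c969b240 stream 0 0 0 d06b7480 0 0\n"
        "c9655a20 stream 0 0 0 c95d63f0 0 0\n"
    )
    fstat_sample = "postfix proxymap 812 6* local stream d06b7480 <-> c969b240\n"
    for address in sorted(unreferenced_sockets(netstat_sample, fstat_sample)):
        print("no process references", address)
```

Sockets reported here are the ones with no owning descriptor, which is where
a leak would show up if there is one.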

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: What triggers No Buffer Space Available?

2007-05-02 Thread Marc G. Fournier


--On Wednesday, May 02, 2007 11:17:02 -0700 John-Mark Gurney [EMAIL PROTECTED] wrote:


 netstat -A will list the socket address, fstat will list the fd, and what
 socket is connected to that fd.

Oh wow ... according to this, I have:

mars# wc -l /tmp/output
   11238 /tmp/output

(minus some header lines) sockets running right now ...

okay, next question ... under 'Active UNIX domain sockets', I see a lot that have 
no Addr:

Active UNIX domain sockets
Address  Type   Recv-Q Send-Q Inode     Conn Refs Nextref Addr
d06b7480 stream      0      0     0 c969b240    0       0 private/proxymap
c969b240 stream      0      0     0 d06b7480    0       0
ce6fc870 stream      0      0     0 cf744870    0       0 private/rewrite
cf744870 stream      0      0     0 ce6fc870    0       0
ce4b2630 stream      0      0     0 d0cee900    0       0 private/proxymap
d0cee900 stream      0      0     0 ce4b2630    0       0
d0437240 stream      0      0     0 cf716000    0       0 private/proxymap
cf716000 stream      0      0     0 d0437240    0       0
c94f4990 stream      0      0     0 cee6ed80    0       0 private/rewrite
cee6ed80 stream      0      0     0 c94f4990    0       0
d0cefcf0 stream      0      0     0 cb281a20    0       0 private/rewrite
cb281a20 stream      0      0     0 d0cefcf0    0       0
ce0d5240 stream      0      0     0 cb251480    0       0 private/anvil

Now, the 'Conn' field from the previous line matches the 'Address' line of the 
'blank Addr' ... so there are two sockets for each Addr?  in vs out?
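
The pairing described here, where each entry's Conn value is the Address of
another entry, can be made explicit with a short script.  This is only an
illustrative Python sketch: the function names are invented, and it assumes
the usual whitespace-separated column order Address, Type, Recv-Q, Send-Q,
Inode, Conn, Refs, Nextref, with an optional Addr at the end.  A connected
UNIX domain stream has one socket per endpoint, and typically only the
endpoint on the bound (server) side carries the path name, so the pairs are
the two ends of one connection rather than "in" and "out".

```python
def parse_unix_sockets(netstat_text):
    """Map socket address -> (conn, path) for 'stream' lines."""
    sockets = {}
    for line in netstat_text.splitlines():
        fields = line.split()
        # Need at least the eight fixed columns; Addr (path) is optional.
        if len(fields) < 8 or fields[1] != "stream":
            continue
        path = fields[8] if len(fields) > 8 else ""
        sockets[fields[0]] = (fields[5], path)
    return sockets


def connected_pairs(sockets):
    """Return (address, peer, path) once for each mutually connected pair."""
    pairs = []
    seen = set()
    for address, (conn, path) in sockets.items():
        peer = sockets.get(conn)
        if address in seen or peer is None or peer[0] != address:
            continue
        seen.update((address, conn))
        # Use whichever endpoint has a bound path name, if either does.
        pairs.append((address, conn, path or peer[1]))
    return pairs


if __name__ == "__main__":
    # Sample rows in the assumed column layout.
    sample = (
        "d06b7480 stream 0 0 0 c969b240 0 0 private/proxymap\n"
        "c969b240 stream 0 0 0 d06b7480 0 0\n"
        "c9655a20 stream 0 0 0 c95d63f0 0 0\n"
        "c95d63f0 stream 0 0 0 c9655a20 0 0\n"
    )
    for address, peer, path in connected_pairs(parse_unix_sockets(sample)):
        print(address, "<->", peer, path or "(no path on either end)")
```

Pairs that report no path on either end are the ones that cannot be tied to a
service by name alone.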

To give a reference point ... mars above has 91 jailed environments running on 
it, it's been up 2 days, 9 hrs now, and has 11k sockets in use ...

Hrmmm ... just checked jupiter, and she has 32 jails with 1080 sockets ... venus 
has 62 jails with 2819 sockets ... and pluto has 35 jails with 1818 sockets ...

mars is running on average 2x the number of sockets per jail compared to the 
other servers ...
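
The per-jail comparison can be worked through with the figures quoted in this
message.  A small Python sketch, where the only assumption is that the 11238
lines in /tmp/output stand in for the "11k sockets" on mars (that count
includes a few header lines).  On these numbers mars comes out at roughly 2.4
to 3.7 times the sockets per jail of the other three machines, so "2x" is, if
anything, on the low side.

```python
# (jails, sockets) per server, taken from the figures in the message.
SERVERS = {
    "mars": (91, 11238),
    "jupiter": (32, 1080),
    "venus": (62, 2819),
    "pluto": (35, 1818),
}


def sockets_per_jail(jails, sockets):
    """Average number of sockets per jail."""
    return sockets / jails


def ratio_to(reference, other):
    """How many times more sockets per jail 'reference' has than 'other'."""
    return sockets_per_jail(*SERVERS[reference]) / sockets_per_jail(*SERVERS[other])


if __name__ == "__main__":
    for name, (jails, sockets) in SERVERS.items():
        print(f"{name}: {sockets_per_jail(jails, sockets):.1f} sockets per jail")
    for other in ("jupiter", "venus", "pluto"):
        print(f"mars vs {other}: {ratio_to('mars', other):.2f}x")
```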

Is this normal?

mars# grep d067f900 /tmp/output
d067f900 stream      0      0     0 cafd4c60    0       0
cafd4c60 stream      0      0     0 d067f900    0       0

There is no 'Addr' related to either of them?  I can scroll down pages and 
pages of those types of entries, that don't have any Addr field associated with 
them ...







- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: What triggers No Buffer Space Available?

2007-05-02 Thread Marc G. Fournier


'k, I just rebooted the server (messages started again), and netstat -A is 
showing 3600 sockets open ... based on jupiter/pluto/venus numbers, this is 
what I'd expect to see (~1000 sockets per 30 jails) ... so, over the course of 
the next 2 days, I expect that it will grow to the 11k+ that I saw when I 
rebooted, with most of those apparently not attached to an 'Addr' ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

What triggers No Buffer Space Available?

2007-05-01 Thread Marc G. Fournier


I'm still being hit by this one ... more frequently right now as I had to move 
a bit more stuff *onto* that server ... I'm trying to figure out what I can 
monitor for a 'leak' somewhere, but the only thing I'm able to find is the 
whole nmbclusters stuff:

mars# netstat -m | grep "mbuf clusters"
130/542/672/25600 mbuf clusters in use (current/cache/total/max)

the above is after 26hrs uptime ...

Is there something else that will trigger/generate the above error message?
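
To put the cluster figures above in proportion, the total can be compared with
the configured maximum.  A minimal Python sketch, assuming only the
"current/cache/total/max mbuf clusters in use" line format shown above; the
function name is made up for illustration.  With 672 of 25600 clusters
allocated, usage is under 3% of the limit, which fits the suspicion that
something other than nmbclusters is behind the error.

```python
import re

# Matches e.g. "130/542/672/25600 mbuf clusters in use (current/cache/total/max)"
CLUSTER_LINE = re.compile(r"^\s*(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")


def cluster_usage(netstat_m_text):
    """Return cluster counts and total/max fraction, or None if not found."""
    for line in netstat_m_text.splitlines():
        match = CLUSTER_LINE.match(line)
        if match:
            current, cache, total, maximum = (int(n) for n in match.groups())
            return {
                "current": current,
                "cache": cache,
                "total": total,
                "max": maximum,
                "fraction_of_max": total / maximum,
            }
    return None


if __name__ == "__main__":
    line = "130/542/672/25600 mbuf clusters in use (current/cache/total/max)"
    usage = cluster_usage(line)
    print(f"{usage['total']} of {usage['max']} clusters "
          f"({usage['fraction_of_max']:.1%} of the limit)")
```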


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: How to report bugs (Re: 6.2-STABLE deadlock?)

2007-04-27 Thread Marc G. Fournier



--On Tuesday, April 24, 2007 23:53:16 -0400 Kris Kennaway [EMAIL PROTECTED] wrote:

 On Wed, Apr 25, 2007 at 10:53:08AM +0800, LI Xin wrote:
 Hi, Oleg,

 Oleg Derevenetz wrote:
  ??? LI Xin [EMAIL PROTECTED]:
 [...]
  I'm not very sure if this is specific to one disk controller.  Actually
  I got some occasional reports about similar hangs on amd64 6.2-RELEASE
  (slightly patched version) that most of processes stuck in the 'ufs'
  state, under very light load, the box was equipped with amr(4) RAID.
 
  I was not able to reproduce the problem at my lab, though, it's still
  unknown that how to trigger the livelock :-(  Still need some
  investigate on their production system.
 
  I reported a similar issue for FreeBSD 6.2 in audit-trail for kern/104406:
 
  http://www.freebsd.org/cgi/query-pr.cgi?pr=104406cat=
 
  and there should be a thread related to this. Briefly, I suspect that
  this is  related to nullfs filesystems on my server and when I cvsuped to
  FreeBSD 6.2- STABLE with Daichi's unionfs-related patches and replaced
  nullfs-mounted fs  with unionfs-mounted (that was done 10.03.07) problem
  is gone (seems to be so,  at least).

 Hmm...  Seems to be different issues.  The problem I have received was a
 pgsql server (no nullfs/unionfs involved), and the hang always happen
 when it is not being heavily loaded (usually in the morning, for
 instance, and there is no special configuration, like scheduled tasks
 which can generate disk load, etc., only the entropy harvesting), so
 this is quite confusing.

 Yes, a large part of the confusion is the unfortunate tendency of
 people to do the following:

 user1 my system hangs/panics/etc
 user2 my system hangs/panics/etc too; it must be the same problem!

 What we really need is for every FreeBSD user who encounters a
 hang/panic/etc to avoid jumping to conclusions -- no matter how many
 superficial similarities there may seem to you -- and instead go
 through the relevant steps described here:


 http://www.freebsd.org/doc/en_US.ISO8859-1/books/developers-handbook/kerneldebug.html

 Until you (or a developer) have analyzed the resulting information,
 you cannot definitively determine whether or not your problem is the
 same as a given random other problem, and you may just confuse the
 issue by making claims of similarity when you are really reporting a
 completely separate problem.

What about those that don't have the benefit of being able to access the 
console? :(  I've recently started buying servers that have builtin, full 
remote console (ie. the HP servers), but, for instance, I have one box that I 
have to consistently reboot every 3 days due to a 'No Buffer Space Available' 
...

A thought: how hard would it be to add some method of forcing a system crash, 
that would dump core, from the command line?  Something that, by default, would 
be disabled, but for remote debugging purposes, one could enable in the kernel 
and do a 'sysctl kernel.force_core_crash=1' to have it do it?  I imagine that 
having a core to analyze would allow providing more information than nothing at 
all, no?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: How to report bugs (Re: 6.2-STABLE deadlock?)

2007-04-27 Thread Marc G. Fournier



--On Friday, April 27, 2007 22:57:29 +0200 Nicolas Rachinsky [EMAIL PROTECTED] wrote:

 * Marc G. Fournier [EMAIL PROTECTED] [2007-04-27 16:03 -0300]:
 A thought: how hard would it be to add some method of forcing a system
 crash,  that would dump core, from the command line?  Something that, by
 default, would

 Doesn't 'kill -6 1' work anymore?

I'd never heard of that one ... will it dump core if I do that?

Please note, in my case, with the Buffer Space issue ... I can login and 
cleanly reboot the server, so doing something like the above to get a core dump 
is definitely doable, I'd just never seen a reference to a 'kill -6 1' before 
for doing that ...

Side question to this though ... I remember awhile back using a 'client-server' 
mechanism that allowed me to dump core to a separate server ... it was so long 
ago that my memory is faint, but there was a reason why I couldn't dump to the 
local server ... not sure whatever happened to that code, but, if one can do 
that for dumping core, shouldn't there be some method possible to connect to 
DDB over the Ethernet without having to have a serial console in place?  For 
the core dump case, the ethernet obviously stayed up while it dump'd, couldn't 
some sort of 'ddb.conf' file be setup that would allow it to ifconfig an IP 
within that shell so that you could connect to it remotely?  say with an 
'from-ip' directive?

Just a thought ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Jail Resource Limits for 6.x ...

2007-04-14 Thread Marc G. Fournier


Is anyone looking into merging in the patch available at:

  http://www.ualberta.ca/~cdjones/cdjones_jail_soc2006.patch

That provides both memory and cpu limits on a jail?  It appears to be against 
REL_6 from last year's SOC ...

Is anyone using it in production anywhere?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: 74 hours till next No Buffer Space Available reboot ...

2007-04-13 Thread Marc G. Fournier



--On Sunday, April 08, 2007 23:04:42 -0400 Dave [EMAIL PROTECTED] wrote:

 Hello,
 This is what i get for catching this late. Can you describe your
 situation? I've got a server, router actually running 6.1-p6 i believe, and
 lately it's been doing this stop. I can't be any more specific than that,
 because that's all i know. The box just goes unresponsive, i can get a login
 prompt on the console, but it's unresponsive. I have to reboot it. This has
 occurred twice now and i'm starting to get concerned. I've ruled out ram, i
 recently replaced it's ram for an unrelated reason so i don't think that's
 it. If your situation is similar can you let me know what you tried?

This is a different situation, I think ... first, I'm running 6.2-STABLE, as of 
about last week, so a much newer kernel than you are running ... and in my 
case, at least, I can still login to the machine using ssh and force a reboot 
remotely ... it doesn't seem to be a 'solid hang' ... if I were to hazard a 
guess as to what it feels like ... it feels like the network interface 
buffer has filled up, but isn't being released properly ... almost like a 
memory leak, but on the network ... if I leave it long enough, it will 
eventually require a tech to power cycle it, but if I catch it early enough, I 
can still get in to do a reboot ...

But ... that said ... when you say 'get a login prompt on the console, but 
it's unresponsive' ... do you mean that you can actually type in a userid, and 
possibly passwd, but after that it just hangs?


 Thanks.
 Dave.

 - Original Message - From: Marc G. Fournier [EMAIL PROTECTED]
 To: [EMAIL PROTECTED]
 Cc: Chris [EMAIL PROTECTED]; Thiago Esteves de Oliveira
 [EMAIL PROTECTED]
 Sent: Sunday, April 08, 2007 10:28 PM
 Subject: 74 hours till next No Buffer Space Available reboot ...



 In my case, I can almost set my watch to it (if I had a watch) ... every 3
 days, 2 hours, it seems that I have to reboot this machine, as that is
 when the
 'No Buffer Space Available' error starts to be generated ...

 There are two others (CC'd in this) that have experienced the same ...

 Chris / Thiago ... in your cases, are you finding that it happens as
 regularly
 with your servers?  Thiago, I believe you ended up reverting to an older
 kernel
 to clear up the situation?

 I've included my 'netstat -m' report ... from it, it doesn't look to me
 like
 its an mbuf issue, or am I missing something?  Is there something else
 that, in
 74 hours, I can provide before I do the reboot?

 Chris, you mentioned reducing recvspace/sendspace to correct the issue?
 Has
 that fixed it for you, or just prolonged until it happens again?  How did
 you
 set this?  I've checked both the man pages for ifconfig and fxp, and don't
 see
 anything ... ah, just found it doing a 'sysctl -a' ... can you post your
 settings from /etc/sysctl.conf?  or did you set it somewhere else?  I'd
 like to
 try that and see if maybe that changes my '74 hours uptime', either good
 or bad
 ...



 # netstat -m
 161/949/1110 mbufs in use (current/cache/total)
 133/639/772/25600 mbuf clusters in use (current/cache/total/max)
 133/396 mbuf+clusters out of packet secondary zone in use (current/cache)
 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
 306K/1515K/1821K bytes allocated to network (current/cache/total)
 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
 0/45/6656 sfbufs in use (current/peak/max)
 0 requests for sfbufs denied
 0 requests for sfbufs delayed
 325 requests for I/O initiated by sendfile
 731 calls to protocol drain routines






- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

74 hours till next No Buffer Space Available reboot ...

2007-04-08 Thread Marc G. Fournier


In my case, I can almost set my watch to it (if I had a watch) ... every 3 
days, 2 hours, it seems that I have to reboot this machine, as that is when the 
'No Buffer Space Available' error starts to be generated ...

There are two others (CC'd in this) that have experienced the same ...

Chris / Thiago ... in your cases, are you finding that it happens as regularly 
with your servers?  Thiago, I believe you ended up reverting to an older kernel 
to clear up the situation?

I've included my 'netstat -m' report ... from it, it doesn't look to me like 
it's an mbuf issue, or am I missing something?  Is there something else that, in 
74 hours, I can provide before I do the reboot?

Chris, you mentioned reducing recvspace/sendspace to correct the issue?  Has 
that fixed it for you, or just prolonged until it happens again?  How did you 
set this?  I've checked both the man pages for ifconfig and fxp, and don't see 
anything ... ah, just found it doing a 'sysctl -a' ... can you post your 
settings from /etc/sysctl.conf?  or did you set it somewhere else?  I'd like to 
try that and see if maybe that changes my '74 hours uptime', either good or bad 
...



# netstat -m
161/949/1110 mbufs in use (current/cache/total)
133/639/772/25600 mbuf clusters in use (current/cache/total/max)
133/396 mbuf+clusters out of packet secondary zone in use (current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
306K/1515K/1821K bytes allocated to network (current/cache/total)
0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/45/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
325 requests for I/O initiated by sendfile
731 calls to protocol drain routines
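
A quick way to back up the reading that this is not an mbuf shortage is to
pull out the "denied" counters from the report above.  This is a hedged Python
sketch: the function names are invented, and it assumes only the "N requests
for X denied" line shapes visible in this netstat -m output.

```python
import re

# Matches e.g. "0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)"
DENIED_LINE = re.compile(r"^\s*([\d/]+) requests for (.+?) denied")


def denied_requests(netstat_m_text):
    """Map each resource name to its list of denied-request counters."""
    result = {}
    for line in netstat_m_text.splitlines():
        match = DENIED_LINE.match(line)
        if match:
            result[match.group(2)] = [int(n) for n in match.group(1).split("/")]
    return result


def any_denied(netstat_m_text):
    """True if any denied counter in the report is non-zero."""
    return any(sum(counts) > 0 for counts in denied_requests(netstat_m_text).values())


if __name__ == "__main__":
    report = (
        "0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)\n"
        "0/0/0 requests for jumbo clusters denied (4k/9k/16k)\n"
        "0 requests for sfbufs denied\n"
        "0 requests for sfbufs delayed\n"
    )
    print(denied_requests(report))
    print("allocation failures seen:", any_denied(report))
```

All-zero counters mean the allocator never refused a request, so the error
would have to originate somewhere other than mbuf or cluster exhaustion.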


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: No buffer space available

2007-04-07 Thread Marc G. Fournier



--On Saturday, April 07, 2007 20:12:00 +0100 Chris [EMAIL PROTECTED] wrote:

 Also to add I now have a 2nd box using 6.2 STABLE few days old code,
 had to use it because of broadcom 5755 nic card, I plan to use large
 tcp window sizes so will be interesting to see if this also suffers
 from the problem.

I've got 8 servers on the same network, 3 are almost identical, but one of them 
(the one with the problem) is using software RAID vs hardware ... but, if you 
are seeing it without using software RAID, then that is obviously not the 
culprit :(

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: No buffer space available

2007-04-06 Thread Marc G. Fournier



--On Friday, April 06, 2007 06:17:04 +0100 Chris [EMAIL PROTECTED] wrote:

 I am seeing the no buffer space error on a machine running 6.2 STABLE
 feb 24 code, the machine isn't using gmirror.  I had to reduce
 recvspace and sendspace to lower values than I want to get round the
 problem.

 67/1163/1230 mbufs in use (current/cache/total)
 65/275/340/65536 mbuf clusters in use (current/cache/total/max)
 65/255 mbuf+clusters out of packet secondary zone in use (current/cache)
 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
 146K/840K/987K bytes allocated to network (current/cache/total)
 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
 0/56/8704 sfbufs in use (current/peak/max)
 0 requests for sfbufs denied
 0 requests for sfbufs delayed
 20233 requests for I/O initiated by sendfile
 7740 calls to protocol drain routines

What ethernet driver are you using?  In my case, it's an fxp device ... trying 
to see if there is *some* sort of common denominator here :(

I just upgraded to the latest kernel last night, to see if maybe a recent 
commit had a side-effect of fixing it, but won't know anything for another 48 
hours or so ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: No buffer space available

2007-04-05 Thread Marc G. Fournier

kern.shutdown.kproc_shutdown_wait: 60
kern.sugid_coredump: 0
kern.coredump: 1
kern.nodump_coredump: 0
kern.corefile: %N.core
kern.fscale: 2048
kern.timecounter.stepwarnings: 0
kern.timecounter.nbinuptime: 64529049
kern.timecounter.nnanouptime: 84
kern.timecounter.nmicrouptime: 297518
kern.timecounter.nbintime: 15065914
kern.timecounter.nnanotime: 5937055
kern.timecounter.nmicrotime: 9127597
kern.timecounter.ngetbinuptime: 78777910
kern.timecounter.ngetnanouptime: 248698
kern.timecounter.ngetmicrouptime: 9289785
kern.timecounter.ngetbintime: 0
kern.timecounter.ngetnanotime: 0
kern.timecounter.ngetmicrotime: 6
kern.timecounter.nsetclock: 3
kern.timecounter.hardware: i8254
kern.timecounter.choice: TSC(-100) i8254(0) dummy(-100)
kern.timecounter.tick: 1
kern.timecounter.smp_tsc: 0
kern.threads.thr_scope: 0
kern.threads.thr_concurrency: 0
kern.threads.debug: 0
kern.threads.max_threads_per_proc: 1500
kern.threads.max_groups_per_proc: 1500
kern.threads.max_threads_hits: 0
kern.threads.virtual_cpu: 2
kern.sched.name: 4BSD
kern.sched.quantum: 10
kern.sched.ipiwakeup.enabled: 1
kern.sched.ipiwakeup.requested: 3687784
kern.sched.ipiwakeup.delivered: 3690316
kern.sched.ipiwakeup.usemask: 1
kern.sched.ipiwakeup.useloop: 0
kern.sched.ipiwakeup.onecpu: 0
kern.sched.ipiwakeup.htt2: 0
kern.sched.followon: 0
kern.sched.pfollowons: 0
kern.sched.kgfollowons: 0
kern.sched.preemption: 1
kern.sched.runq_fuzz: 1
kern.ccpu: 1948
kern.devstat.numdevs: 12
kern.devstat.generation: 538
kern.devstat.version: 6
kern.kobj_methodcount: 73
kern.log_wakeups_per_second: 5
kern.log_console_output: 1
kern.always_console_output: 0
kern.msgbuf: removed lines upon lines of text here
kern.msgbuf_clear: 0
kern.smp.maxcpus: 16
kern.smp.active: 1
kern.smp.disabled: 0
kern.smp.cpus: 2
kern.smp.forward_signal_enabled: 1
kern.smp.forward_roundrobin_enabled: 1
kern.nselcoll: 11052
kern.drainwait: 300
kern.tty_nin: 22760
kern.tty_nout: 15228375
kern.console: consolectl,/consolectl,
kern.consmute: 0
kern.consmsgbuf_size: 8192
kern.constty_wakeups_per_second: 5
kern.filedelay: 30
kern.dirdelay: 29
kern.metadelay: 28
kern.minvnodes: 25000
kern.chroot_allow_open_directories: 1
kern.random.yarrow.gengateinterval: 10
kern.random.yarrow.bins: 10
kern.random.yarrow.fastthresh: 192
kern.random.yarrow.slowthresh: 256
kern.random.yarrow.slowoverthresh: 2
kern.random.sys.seeded: 1
kern.random.sys.harvest.ethernet: 1
kern.random.sys.harvest.point_to_point: 1
kern.random.sys.harvest.interrupt: 1
kern.random.sys.harvest.swi: 0


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGFUag4QvfyHIvDvMRAr2PAKDn4sSN6dyQulC0W2Q1lr25RfSBPQCgwMgD
wzztdb381CaTTOVtRSXhZzw=
=pUWJ
-END PGP SIGNATURE-

___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]

Re: No buffer space available

2007-04-04 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Thiago ...

  What version of kernel did you end up going back to?

- --On Wednesday, April 04, 2007 10:15:48 -0300 Marc G. Fournier 
[EMAIL PROTECTED] wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 I'm seeing the same effect (haven't tried older kernel, mind you) almost like
 clockwork, every 72 hours after reboot ... at least now I don't feel so
 crazy,  knowing it isn't just me ...

 - --On Sunday, April 01, 2007 17:07:08 -0300 Thiago Esteves de Oliveira
 [EMAIL PROTECTED] wrote:

 I've tried to increase the kern.ipc.nmbclusters value but it worked only when
 I changed the kernel to an older one.

 netstat -m (Now it's working with the same values.)
 -
 515/850/1365 mbufs in use (current/cache/total)
 512/390/902/65024 mbuf clusters in use (current/cache/total/max)
 512/243 mbuf+clusters out of packet secondary zone in use (current/cache)
 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
 1152K/992K/2145K bytes allocated to network (current/cache/total)
 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
 0/0/0 sfbufs in use (current/peak/max)
 0 requests for sfbufs denied
 0 requests for sfbufs delayed
 2759 requests for I/O initiated by sendfile
 2982 calls to protocol drain routines

 Ethernet adapters
 -
 em0: Intel(R) PRO/1000 Network Connection Version - 6.0.5 port
 0xec80-0xecbf mem 0xfebe-0xfebf irq 10 at device 4.0 on pci7
 em0: Ethernet address: 00:04:23:c3:06:78
 em0: [FAST]
 skc0: 3Com 3C940 Gigabit Ethernet port 0xe800-0xe8ff mem
 0xfebd8000-0xfebdbfff  irq 15 at device 6.0 on pci7
 skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
 sk0: Marvell Semiconductor, Inc. Yukon on skc0
 sk0: Ethernet address: 00:0a:5e:65:ad:c3
 miibus0: MII bus on sk0
 e1000phy0: Marvell 88E1000 Gigabit PHY on miibus0
 e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX,
 auto

 P.S.: I am using the FreeBSD/amd64.

 Brian A. Seklecki wrote:
 Show us netstat -m on the broken kernel?  Show us your dmesg(8) for
 em(4).

 TIA,
 ~BAS

 On Fri, 2007-03-30 at 11:13 -0300, Thiago Esteves de Oliveira wrote:
 Hello,

 I've had a problem with one of my FreeBSD servers, the machine has stopped
 its network services and then sent these messages:

 -Mar 27 13:00:03 anubis dhcpd: send_packet: No buffer space available
 -Mar 27 13:00:26 anubis routed[431]: Send bcast sendto(em0,
 146.164.92.255.520): No buffer space available

 The messages were repeated a lot of times before a temporary solution. I've
 changed the kernel(FreeBSD 6.2) to an older one(FreeBSD 6.1) and since then
 it's been working well. What happened?

 P.S.: I can give more information if necessary.
 ___
 freebsd-questions@freebsd.org mailing list
 http://lists.freebsd.org/mailman/listinfo/freebsd-questions
 To unsubscribe, send any mail to
 [EMAIL PROTECTED]





 - 
 Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
 Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
 Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.5 (FreeBSD)

 iD8DBQFGE6UE4QvfyHIvDvMRAlutAJ0WzVTYq99hmx1km2mdXE7pdUC8IgCgt4O1
 eG6kXgqHveumXjkL0t+Q8Q8=
 =sieE
 -END PGP SIGNATURE-




- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGE/ZC4QvfyHIvDvMRAsWoAJwJpD8nCtG0iv5U6LY8ISyyDKxgegCg1eti
SezStun7CLDA9pgfrp8GloM=
=UwSU
-END PGP SIGNATURE-


Re: No buffer space available

2007-04-04 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Thiago ...

  I'm just curious here, but are you by any chance using geom at all?  The only 
machine I have that seems to be affected like this (where netstat -m doesn't 
seem to indicate a problem with mbufs) is using gmirror ... the rest all use 
hardware RAID controllers ...

  It's a long shot, but so far, it's the only connection I seem to be able to draw :(


- --On Sunday, April 01, 2007 17:07:08 -0300 Thiago Esteves de Oliveira 
[EMAIL PROTECTED] wrote:

 I've tried to increase the kern.ipc.nmbclusters value but it worked only when
 I changed the kernel to an older one.

 netstat -m (Now it's working with the same values.)
 -
 515/850/1365 mbufs in use (current/cache/total)
 512/390/902/65024 mbuf clusters in use (current/cache/total/max)
 512/243 mbuf+clusters out of packet secondary zone in use (current/cache)
 0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
 0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
 0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
 1152K/992K/2145K bytes allocated to network (current/cache/total)
 0/0/0 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
 0/0/0 requests for jumbo clusters denied (4k/9k/16k)
 0/0/0 sfbufs in use (current/peak/max)
 0 requests for sfbufs denied
 0 requests for sfbufs delayed
 2759 requests for I/O initiated by sendfile
 2982 calls to protocol drain routines

 Ethernet adapters
 -
 em0: Intel(R) PRO/1000 Network Connection Version - 6.0.5 port
 0xec80-0xecbf mem 0xfebe-0xfebf irq 10 at device 4.0 on pci7
 em0: Ethernet address: 00:04:23:c3:06:78
 em0: [FAST]
 skc0: 3Com 3C940 Gigabit Ethernet port 0xe800-0xe8ff mem
 0xfebd8000-0xfebdbfff  irq 15 at device 6.0 on pci7
 skc0: 3Com Gigabit NIC (3C2000) rev. (0x1)
 sk0: Marvell Semiconductor, Inc. Yukon on skc0
 sk0: Ethernet address: 00:0a:5e:65:ad:c3
 miibus0: MII bus on sk0
 e1000phy0: Marvell 88E1000 Gigabit PHY on miibus0
 e1000phy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX-FDX,
 auto

 P.S.: I am using the FreeBSD/amd64.

 Brian A. Seklecki wrote:
 Show us netstat -m on the broken kernel?  Show us your dmesg(8) for
 em(4).

 TIA,
 ~BAS

 On Fri, 2007-03-30 at 11:13 -0300, Thiago Esteves de Oliveira wrote:
 Hello,

 I've had a problem with one of my FreeBSD servers, the machine has stopped
 its network services and then sent these messages:

 -Mar 27 13:00:03 anubis dhcpd: send_packet: No buffer space available
 -Mar 27 13:00:26 anubis routed[431]: Send bcast sendto(em0,
 146.164.92.255.520): No buffer space available

 The messages were repeated a lot of times before a temporary solution. I've
 changed the kernel(FreeBSD 6.2) to an older one(FreeBSD 6.1) and since then
 it's been working well. What happened?

 P.S.: I can give more information if necessary.



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGFE/S4QvfyHIvDvMRAvf2AJ94uFbAqplqtvTHeontpNT1FvzE7ACcDqYM
5EVfYDsLw++60NYugCOOwho=
=+Wd7
-END PGP SIGNATURE-


More on: No buffer space available

2007-03-29 Thread Marc G. Fournier

:44,0,  119,  349,30104,0
VMSPACE: 296, 0, 1034, 812, 2510845, 0
mbuf_packet: 256, 0, 142, 382, 33140922, 0
mbuf: 256, 0, 44, 542, 65939839, 0
mbuf_cluster: 2048, 25600, 524, 164, 345305, 0
mbuf_jumbo_pagesize: 4096, 0, 0, 0, 0, 0
mbuf_jumbo_9k: 9216, 0, 0, 0, 0, 0
mbuf_jumbo_16k: 16384, 0, 0, 0, 0, 0
ACL UMA zone: 388, 0, 0, 0, 0, 0
g_bio: 132, 0, 0, 4205, 87153652, 0
VNODE: 272, 0, 71264, 22158, 1241560352, 0
VNODEPOLL: 76, 0, 0, 100, 3, 0
S VFS Cache: 68, 0, 73121, 29135, 1248334482, 0
L VFS Cache: 291, 0, 124, 1085, 682683, 0
NAMEI: 1024, 0, 0, 304, 1434961352, 0
DIRHASH: 1024, 0, 1810, 258, 18000204, 0
PIPE: 408, 0, 1981, 602, 1091976, 0
KNOTE: 68, 0, 32, 360, 3972127, 0
socket: 356, 12331, 12271, 60, 8439626, 1141
unpcb: 144, 12339, 11561, 373, 5337418, 0
ipq: 32, 904, 0, 226, 2, 0
udpcb: 180, 12342, 74, 146, 2173707, 0
inpcb: 180, 12342, 678, 1478, 927361, 0
tcpcb: 464, 12328, 619, 717, 927361, 0
tcptw: 48, 2496, 59, 1501, 256613, 0
syncache: 100, 15366, 0, 195, 676224, 0
hostcache: 76, 15400, 512, 688, 34850, 0
tcpreass: 20, 1690, 0, 507, 53830, 0
sackhole: 20, 0, 0, 507, 20912, 0
ripcb: 180, 12342, 0, 88, 1127, 0
rtentry: 132, 0, 203, 319, 6656, 0
g_stripe_zone: 131072, 100, 0, 0, 0, 0
SWAPMETA: 276, 121576, 957, 429, 17641, 0
Mountpoints: 664, 0, 197, 19, 200, 0
FFS inode: 132, 0, 70901, 17491, 1239034732, 0
FFS1 dinode: 128, 0, 0, 0, 0, 0
FFS2 dinode: 256, 0, 70901, 7924, 1239034732, 0
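
A dump like the one above is easier to read with a small filter that picks out
the zones worth a second look. The sketch below is only that, a sketch: it
assumes the six numeric columns are size, limit, used, free, requests and
failures (the usual vmstat -z column order on FreeBSD 6.x; the header is cut
off in this post), and that a limit of 0 means "no limit". The 90% threshold
and the names parse_vmstat_z and near_limit are illustrative choices, not
anything from the thread.

```python
from typing import List, NamedTuple


class Zone(NamedTuple):
    name: str
    size: int
    limit: int      # assumed: 0 means the zone has no configured limit
    used: int
    free: int
    requests: int
    failures: int


def parse_vmstat_z(text: str) -> List[Zone]:
    """Parse 'NAME: size, limit, used, free, requests, failures' lines."""
    zones = []
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = [f.strip() for f in rest.split(",")]
        # Skip headers, wrapped fragments, and anything else that is not
        # exactly six non-negative integers.
        if len(fields) != 6 or not all(f.isdigit() for f in fields):
            continue
        zones.append(Zone(name.strip(), *(int(f) for f in fields)))
    return zones


def near_limit(zones: List[Zone], threshold: float = 0.9) -> List[Zone]:
    """Zones with allocation failures, or with used/limit at or above threshold."""
    return [
        z for z in zones
        if z.failures > 0 or (z.limit > 0 and z.used / z.limit >= threshold)
    ]


# Four lines copied from the output above.
SAMPLE = """\
mbuf_cluster: 2048, 25600, 524, 164, 345305, 0
socket: 356, 12331, 12271, 60, 8439626, 1141
unpcb: 144, 12339, 11561, 373, 5337418, 0
tcpcb: 464, 12328, 619, 717, 927361, 0
"""
```

On the sample lines this flags socket (12271 of 12331 in use, 1141 failures)
and unpcb (11561 of 12339), while mbuf_cluster (524 of 25600) passes.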

If the '4 hour later' version is of any use, please ask, I did save a copy 
before rebooting ...

Does this provide anything?  Is there something else I should do/try?

Thanks ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGDHfM4QvfyHIvDvMRAhZlAJ4sR9Xe3fuC5egjtt9o9dX8Ek+opACcCu3H
euSZyKGB9/HVcuwilQicfMM=
=bQo7
-END PGP SIGNATURE-


Re: socketpair: No buffer space available

2007-03-27 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


thanks ... just rebooted it yesterday again, so it has another 48 hours before 
it starts up again, so will save that output before next reboot ...

- --On Tuesday, March 27, 2007 21:03:55 +0100 Robert Watson
[EMAIL PROTECTED] 
wrote:


 On Fri, 23 Mar 2007, Marc G. Fournier wrote:

 I've checked nmbclusters between the two machines, and both are at 25600,
 but not sure what sysctl to look at for how much is actually used out of
 that 25600 ...

 netstat -mb

 nmbclusters directly affects the number of clusters available in the network
 stack; it also indirectly affects the scaling of other settings, such as
 resource limits on the number of sockets.  vmstat -z is also generally useful.

 There are a few paths to ENOBUFS in the socket allocation code--one path is
 if you are over-committed on socket buffer resources with respect to the
 resource limits of the user.  Check the output of limits and the socket
 buffer size limit.

 Robert N M Watson
 Computer Laboratory
 University of Cambridge



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGCYJ24QvfyHIvDvMRAlPjAJ9zbGNDlGxTO/TFuoAQAw2zUsmj/wCgmPlG
9yyzoZWGu3B55xoAZ0iLjhg=
=8QWr
-END PGP SIGNATURE-


Re: socketpair: No buffer space available

2007-03-25 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Monday, March 26, 2007 00:08:07 +0100 Bruce M. Simpson
[EMAIL PROTECTED] 
wrote:

 Marc G. Fournier wrote:
 Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space
 available


 If I have a login session on the machine, I can easily do a reboot of the
 machine, and it seems to come up clean every time (ie. no fsck's need to be
 run) ...
 Does anyone have any ideas of what I can look at?

 How odd. The re-exec feature is not documented in the man page. It appears
 that it can be turned off with the -r switch according to sshd.c. Can you
 give that a try and see if that offers symptomatic relief? It would be
 somewhat less secure as sshd will fork rather than fork..exec.

That was actually just one example ... I get more of:

sendmail[82066]: l2NEA1Ht082066: SYSERR(root): makeconnection: cannot create socket: No buffer space available

than I do the sshd errors ... in another 15 hours or so, they will all start up 
again, like clockwork :(


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGBxZ84QvfyHIvDvMRAoNTAKDBkGZL7aCOXEW22QibCCpnJJJnEgCfafMa
ex0pM7sKPgCjVdURJ9nwfH0=
=egaO
-END PGP SIGNATURE-


socketpair: No buffer space available

2007-03-24 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



Almost like clockwork, every 3 days, I have one server that starts to generate
errors similar to below ... it isn't a 'continuous thing' at the start, but
gradually grows worse ... it just started happening again today, after 3 days, 
2hrs of uptime ...

Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space
available

As unrelated as this might sound, out of three servers that are virtually
identical, this is the only one using gmirror for its drives vs a hardware raid
controller, two of the three running kernels from about the same time ...

# ssh jupiter uname -a
FreeBSD jupiter.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #1: Fri Mar 16 13:13:02
ADT 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel  i386

vs

# ssh mars uname -a
FreeBSD mars.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #5: Tue Mar 13 02:29:37 ADT
2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel  i386

jupiter is running more on it than mars right now ...

So, I either have something mis-configured on mars that is done right on
jupiter, or there is a bug that is being tickled on mars that isn't being
tickled on jupiter ...

If I have a login session on the machine, I can easily do a reboot of the
machine, and it seems to come up clean every time (ie. no fsck's need to be
run) ...

Does anyone have any ideas of what I can look at?

I've checked nmbclusters between the two machines, and both are at 25600, but 
not sure what sysctl to look at for how much is actually used out of that 25600 
...
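
On the question of how much of the nmbclusters limit is actually in use:
netstat -m reports cluster usage on a line of the form current/cache/total/max.
A minimal sketch that pulls that line apart, assuming the FreeBSD 6.x wording
"mbuf clusters in use"; the function name cluster_usage is made up for this
example, and the sample line is one quoted from a different machine elsewhere
in this thread, not from mars.

```python
import re

# Matches e.g. "512/390/902/65024 mbuf clusters in use (current/cache/total/max)"
CLUSTERS = re.compile(r"(\d+)/(\d+)/(\d+)/(\d+) mbuf clusters in use")


def cluster_usage(netstat_m_output):
    """Return cluster counts and fractions of the limit, or None if not found."""
    match = CLUSTERS.search(netstat_m_output)
    if match is None:
        return None
    current, cache, total, maximum = (int(g) for g in match.groups())
    return {
        "current": current,
        "cache": cache,
        "total": total,
        "max": maximum,
        # Clusters handed out right now, relative to the limit.
        "current_fraction": current / maximum if maximum else 0.0,
        # Handed out plus cached, relative to the limit.
        "total_fraction": total / maximum if maximum else 0.0,
    }


SAMPLE = "512/390/902/65024 mbuf clusters in use (current/cache/total/max)\n"
```

Feeding it the full output of netstat -m works the same way, since the jumbo
cluster and mbuf+clusters lines do not match the pattern.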


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGA+sG4QvfyHIvDvMRAoRuAJ9LXJ5RUZNXEQhEwkDFiMudThyASgCeNJXu
9Y7KZ6fSlk07/WmHGywTvJ4=
=n3XS
-END PGP SIGNATURE-


socketpair: No buffer space available

2007-03-24 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Almost like clockwork, every 3 days, I have one server that starts to generate 
errors similar to below ... it isn't a 'continuous thing' at the start, but 
gradually grows worse ...

Mar 20 07:59:26 mars sshd[717]: error: reexec socketpair: No buffer space 
available

As unrelated as this might sound, out of three servers that are virtually 
identical, this is the only one using gmirror for its drives vs a hardware raid 
controller, two of the three running kernels from about the same time ...

# ssh jupiter uname -a
FreeBSD jupiter.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #1: Fri Mar 16 13:13:02 
ADT 2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel  i386

vs

# ssh mars uname -a
FreeBSD mars.hub.org 6.2-STABLE FreeBSD 6.2-STABLE #5: Tue Mar 13 02:29:37 ADT 
2007 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/kernel  i386

jupiter is running more on it than mars right now ...

So, I either have something mis-configured on mars that is done right on 
jupiter, or there is a bug that is being tickled on mars that isn't being 
tickled on jupiter ...

If I have a login session on the machine, I can easily do a reboot of the 
machine, and it seems to come up clean every time (ie. no fsck's need to be 
run) ...

Does anyone have any ideas of what I can look at?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGAV294QvfyHIvDvMRAogOAKCCbTIYS59dQFmV9/gfRth8nUZMpgCggZ9r
8zBIHioOQjlNBgovjv+eDA4=
=lIyS
-END PGP SIGNATURE-


testing ...

2007-03-24 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


I've sent several messages to this list the past couple of days, but none of 
them seem to go through ... I'm not expecting this one to either, just trying 
to see if there is anything in my logs to indicate a problem :(

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFGBWw84QvfyHIvDvMRAhWdAJ9SlIaBU36w/eGudttQrYPwAVVtggCgj7E0
GOJ5alQp4hS4OHTW6rm1vMc=
=gdZ+
-END PGP SIGNATURE-


Re: ping's seem to hang ... 'zoneli' state?

2007-03-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


vmstat 1 shows:

# vmstat 1
 procs      memory      page                    disks     faults      cpu
 r    b w     avm    fre  flt  re  pi  po   fr  sr da0 da1   in   sy   cs us sy id
 0 1577 0 7813112 657312 1553   9   1   1 1325  68   0   0  569 4774 2359  7 10 83
 0 1578 0 7815368 656600  199  59   0   0   64   0   0   2  226  679  616  0  6 93
 0 1578 0 7815368 656564 1120   0   0   0  220   0   0   0  208  638  608  1  8 91
 0 1578 0 7815368 656564    5   0   0   0  314   0   0   9  343  890  974  1  8 91
 0 1578 0 7815368 656564  804   0   0   0    0   0   2   0  233  469  633  1  9 90

Normally I'd look for any mysql processes using a lot of CPU, but I'm not 
finding anything using a lot of CPU (this system is the only one we have using 
gmirror, if that helps any) ...



- --On Tuesday, March 06, 2007 11:10:53 -0400 Marc G. Fournier 
[EMAIL PROTECTED] wrote:

 -BEGIN PGP SIGNED MESSAGE-
 Hash: SHA1


 Does this show anything?  I can't kill the processes, even with kill -9 ...
 this happens consistently just after 3 days uptime on a kernel built Fri Feb
 23 07:47:20 AST 2007, and the interface is an fxp0 device ...

# ps auxl | grep ping
 root 68994  0.0  0.0  1556   808  ??  D     7:58AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 70555  0.0  0.0  1556   808  ??  D     8:05AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 73863  0.0  0.0  1452   520  ??  D     8:33AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 75857  0.0  0.0  1452   520  ??  D     8:49AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 76676  0.0  0.0  1556   808  ??  D     8:53AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 77048  0.0  0.0  1556   808  ??  D     8:54AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 80071  0.0  0.0  1452   520  ??  D     9:15AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 80198  0.0  0.0  1452   520  ??  D     9:15AM   0:00.01 ping -c 1 -t 5  j 0 1   0 -16  0 zoneli
 root 81210  0.0  0.0  1556   808  ??  D     9:22AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 81212  0.0  0.0  1556   808  ??  D     9:22AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 81841  0.0  0.0  1556   808  ??  D     9:25AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 88041  0.0  0.0  1452   524  ??  D    10:11AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 88418  0.0  0.0  1452   524  ??  D    10:13AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 89800  0.0  0.0  1452   524  ??  D    10:24AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 90774  0.0  0.0  1452   556  ??  D    10:58AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 91118  0.0  0.0  1452   556  ??  D    10:58AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli
 root 91635  0.0  0.0  1452   556  ??  D    11:04AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli


 - 
 Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
 Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
 Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
 -BEGIN PGP SIGNATURE-
 Version: GnuPG v1.4.5 (FreeBSD)

 iD8DBQFF7YR94QvfyHIvDvMRAqo7AKCsPVLSXhtMD4pFd/ho2hoX3CL5cgCfcQmy
 HkV4+EgX4ue/gxVZzyuXE+U=
 =8YX2
 -END PGP SIGNATURE-



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFF7YZ54QvfyHIvDvMRAp2SAKCnpJJLgxI1SnkfE83L+xH05/981QCfYQBQ
NUqCavoRoH8lo6ZPdXLyBFg=
=Ww+Z
-END PGP SIGNATURE-


ping's seem to hang ... 'zoneli' state?

2007-03-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Does this show anything?  I can't kill the processes, even with kill -9 ... 
this happens consistently just after 3 days uptime on a kernel built Fri Feb 
23 07:47:20 AST 2007, and the interface is an fxp0 device ...

# ps auxl | grep ping
root 68994  0.0  0.0  1556   808  ??  D     7:58AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 70555  0.0  0.0  1556   808  ??  D     8:05AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 73863  0.0  0.0  1452   520  ??  D     8:33AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 75857  0.0  0.0  1452   520  ??  D     8:49AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 76676  0.0  0.0  1556   808  ??  D     8:53AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 77048  0.0  0.0  1556   808  ??  D     8:54AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 80071  0.0  0.0  1452   520  ??  D     9:15AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 80198  0.0  0.0  1452   520  ??  D     9:15AM   0:00.01 ping -c 1 -t 5 j 0 1   0 -16  0 zoneli
root 81210  0.0  0.0  1556   808  ??  D     9:22AM   0:00.02 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 81212  0.0  0.0  1556   808  ??  D     9:22AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 81841  0.0  0.0  1556   808  ??  D     9:25AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 88041  0.0  0.0  1452   524  ??  D    10:11AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 88418  0.0  0.0  1452   524  ??  D    10:13AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 89800  0.0  0.0  1452   524  ??  D    10:24AM   0:00.01 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 90774  0.0  0.0  1452   556  ??  D    10:58AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 91118  0.0  0.0  1452   556  ??  D    10:58AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli
root 91635  0.0  0.0  1452   556  ??  D    11:04AM   0:00.00 ping -c 1 -t 30  0 1   0 -16  0 zoneli
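
To see at a glance how many processes are piling up on the same wait channel,
the ps listing can be tallied by state and wait channel. The sketch below
assumes input shaped like the output of `ps -axo pid,state,mwchan,command`
(one header line, then one process per line); that column selection, the
function name stuck_by_wchan and the host name in the sample are assumptions
made for illustration, and the sample rows are not real output.

```python
from collections import Counter


def stuck_by_wchan(ps_output):
    """Count processes in uninterruptible wait (state starting with 'D') per wait channel."""
    counts = Counter()
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:              # skip the header line
        parts = line.split(None, 3)     # pid, state, wchan, rest of command
        if len(parts) < 3:
            continue
        _pid, state, wchan = parts[:3]
        if state.startswith("D"):
            counts[wchan] += 1
    return counts


# Illustrative input only; host.example.org is a placeholder.
SAMPLE = """\
  PID STAT MWCHAN COMMAND
68994 D    zoneli ping -c 1 -t 30 host.example.org
70555 D    zoneli ping -c 1 -t 30 host.example.org
  717 Ss   select /usr/sbin/sshd
"""
```

A count under zoneli that keeps growing between runs would match the pile-up
shown in the listing above.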


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFF7YR94QvfyHIvDvMRAqo7AKCsPVLSXhtMD4pFd/ho2hoX3CL5cgCfcQmy
HkV4+EgX4ue/gxVZzyuXE+U=
=8YX2
-END PGP SIGNATURE-


Re: ping's seem to hang ... 'zoneli' state?

2007-03-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Tuesday, March 06, 2007 18:59:04 +0300 Anton Yuzhaninov
[EMAIL PROTECTED] 
wrote:

 Hello Marc,

 You wrote on Tuesday, March 6, 2007, 6:10:45 PM:

 MGF Does this show anything?  I can't kill the processes, even with kill -9 ...
 MGF this happens consistently just after 3 days uptime on a kernel built Fri Feb
 MGF 23 07:47:20 AST 2007, and the interface is an fxp0 device ...

 MGF # ps auxl | grep ping
 MGF root 68994  0.0  0.0  1556   808  ??  D 7:58AM   0:00.02 ping -c 1 -t
 MGF 30  0 1   0 -16  0 zoneli

 This is a known problem:
 http://www.freebsd.org/releases/6.2R/errata.html

 There are some different cases when zonelimit livelock is possible.
 Send vmstat -z output (when processes lock in zonelimit state).

Great, thanks ... just read the errata on zonelimit, and it seems to imply that 
it was fixed on the 12th of February, but (of course) it doesn't indicate which 
files ... I just did a new cvsup since my last one, and all that has changed is:

Updating collection src-all/cvs
 Edit src/lib/libarchive/archive_read_extract.c
 Edit src/share/man/man4/tap.4
 Edit src/share/man/man4/tun.4
 Edit src/sys/amd64/conf/SMP
 Edit src/sys/dev/wi/if_wi_pccard.c
 Edit src/sys/kern/sys_generic.c
 Edit src/sys/net/if_tap.c
 Edit src/sys/net/if_tun.c
 Edit src/sys/netgraph/ng_ksocket.c
 Edit src/sys/netinet/ip_mroute.c
 Edit src/sys/netinet/tcp.h
Finished successfully

Can someone comment on whether I just missed the commit on my last cvsup, or if 
I'm hitting the same problem but in a different way?

Thanks ...

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFF7ZYU4QvfyHIvDvMRAhr6AKDQqpDNoCvq1UJYLbS4ayjcfZ2tSgCfawZX
WNM499ARzFVvxW6ubUJtYDo=
=ImQ+
-END PGP SIGNATURE-


Re: ping's seem to hang ... 'zoneli' state?

2007-03-06 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1



- --On Wednesday, March 07, 2007 00:44:25 +0800 ??/LI Xin
[EMAIL PROTECTED] 
wrote:

 Marc G. Fournier wrote:
 [...]
 Can someone comment on whether I just missed the commit on my last cvsup, or
 if  I'm hitting the same problem but in a different way?

 I think so.  Try patching your system with:

 src/sys/kern/kern_mbuf.c,v 1.9.2.9
 src/sys/sys/mbuf.h,v 1.170.2.7
 src/sys/vm/uma.h,v 1.22.2.8
 src/sys/vm/uma_core.c,v 1.119.2.19
 src/sys/vm/uma_core.c,v 1.119.2.18

 and perhaps also:

 src/sys/kern/uipc_socket.c,v 1.293.

Here's what I have right now:

__FBSDID("$FreeBSD: src/sys/kern/kern_mbuf.c,v 1.9.2.9 2007/02/11 03:31:18 mohans Exp $");
 * $FreeBSD: src/sys/sys/mbuf.h,v 1.170.2.7 2007/02/11 03:31:19 mohans Exp $
 * $FreeBSD: src/sys/vm/uma.h,v 1.22.2.8 2007/02/11 03:31:19 mohans Exp $
__FBSDID("$FreeBSD: src/sys/vm/uma_core.c,v 1.119.2.19 2007/02/11 03:31:19 mohans Exp $");
__FBSDID("$FreeBSD: src/sys/vm/uma_core.c,v 1.119.2.19 2007/02/11 03:31:19 mohans Exp $");
__FBSDID("$FreeBSD: src/sys/kern/uipc_socket.c,v 1.242.2.8 2007/02/03 04:01:22 bms Exp $");

The only one that looks off is uipc_socket.c ... do I need to copy that from 
HEAD?  Are there any compatibility issues with doing that?


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
-BEGIN PGP SIGNATURE-
Version: GnuPG v1.4.5 (FreeBSD)

iD8DBQFF7aKV4QvfyHIvDvMRAloYAKCsA0x+THahW+MZjW/8MjDZwsJDrgCcD4Qw
WbA/0nXgvv4xwEDtBxirLlo=
=Ro3T
-END PGP SIGNATURE-


Re: Some days, it doesn't pay to upgrade ...

2007-03-02 Thread Marc G. Fournier

-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1


Based on a suggestion from someone on this list, I set up a screen session with 
top running, to watch things ... again, after 3 days, the server goes 'out of 
process' ... this time, of course, I could get in to look around and kill off 
processes ...

From what I can tell, a cron job that runs once a minute, and all it does is:

ping -c 1 host

with a 300 sec timeout, started to 'run over top of' itself ... the host that 
it is pinging is on the same switch and has been running fine for 20 days now, 
and it wasn't until I did the last upgrade on the server causing the problems 
that these problems started ...

Coincidence? :)

I'm going to fix the script so that it doesn't try to run over itself ... 
anyone know of a problem with the fxp driver in 6-STABLE that might cause the 
ping to hang?
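
One common way to keep a once-a-minute cron job from running over itself is a
non-blocking advisory lock: each run tries to take the lock and simply exits if
an earlier run still holds it. The sketch below is one possible shape for that
and is not the script from this thread; the lock path, the function name
run_exclusive and the host in the usage comment are all hypothetical.

```python
import fcntl
import os
import subprocess
import sys
import tempfile

# Hypothetical lock location; any path writable by the cron user would do.
DEFAULT_LOCK = os.path.join(tempfile.gettempdir(), "ping-check.lock")


def run_exclusive(argv, lock_path=DEFAULT_LOCK):
    """Run argv unless a previous invocation still holds the lock.

    Returns the command's exit status, or None if this run was skipped.
    """
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        try:
            # LOCK_NB makes flock fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            print("previous run still active, skipping", file=sys.stderr)
            return None
        return subprocess.run(argv).returncode
    finally:
        os.close(fd)  # closing the descriptor also releases the lock


# Example call from the cron script (host name is a placeholder):
# run_exclusive(["ping", "-c", "1", "-t", "30", "host.example.org"])
```

If the ping itself wedges, the wrapper keeps holding the lock, so later cron
runs skip instead of stacking up until maxproc is hit. It does nothing about
the hang itself.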

- --On Thursday, March 01, 2007 09:51:13 +1100 Antony Mawer 
[EMAIL PROTECTED] wrote:

 On 27/02/2007 11:59 PM, Marc G. Fournier wrote:
 After 155 days of problem free uptime, I upgraded my 6-STABLE system the
 other  day to the latest cvsup ... 3 days later, the whole thing hung solid
 with:


 Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
 Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see
 tuning(7) and login.conf(5).
 Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see
 tuning(7) and login.conf(5).

 Stupid question: why isn't there some mechanism that prevents new processes
 from starting up, instead of locking up the whole server?  I'm not asking
 for  the evilness of Linux, where it arbitrarily kills off existing
 processes, but  if maxproc is hit, why continue to try and start up new ones?

 What do you define as 'hung solid'? You are unable to get in via SSH? Or at a
 console via iLO/etc?

 I've seen this on some of our 6.0-RELEASE machines (along with maxpipekva
 exhausted errors), and you can't SSH in from that point... because sshd forks
 to handle the connection, and all available process slots are used up.

 I've thought about writing a background daemon to monitor the logs for signs
 of this (or even to just try and create a short-lived child process by
 fork()ing every 5 minutes or so), and dump information to disk then reboot
 the system when this occurs... it's a work-around for something that
 shouldn't happen, but it does anyway... once I'm able to identify _what_ is
 causing the build-up of processes, then I might be able to do something about
 killing them...!!!


 It's quite deceptive from an end-user point of view, because things like
 Apache that are already running keep running, so all they see are strange
 bits and pieces that don't work... and as always, it's one of those things
 that only happens on some clients' machines, but never on any of our test
 machines...

 --Antony


 PS. I haven't disappeared off the face of the earth.. though close.. my
 fiance and I have been busy planning the wedding, and wound up buying a house
 at the same time..!! Will catch up shortly once I get a chance to come up for
 air!!



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: Some days, it doesn't pay to upgrade ...

2007-03-02 Thread Marc G. Fournier



I don't know how critical this is, but I just thought about it ... this is my 
only system running gmirror ... everything seems fine according to gmirror 
status, but maybe something is wrong there I'm not seeing:

Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider mirror/vm destroyed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm destroyed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider mirror/md2 destroyed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2 destroyed.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md2 removed from md0.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 removed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider mirror/md1 destroyed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1 destroyed.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md1 removed from md0.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 destroyed.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1 created (id=2282154470).
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da1 detected.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da2 detected.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da2 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider da1 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md1: provider mirror/md1 launched.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2 created (id=3089402334).
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da3 detected.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da4 detected.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da4 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider da3 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device md2: provider mirror/md2 launched.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm created (id=2175292049).
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider da5 detected.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 created (id=1094782536).
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md1 attached to md0.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Disk mirror/md2 attached to md0.
Mar  3 01:25:52 mars kernel: GEOM_STRIPE: Device md0 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Force device vm start due to timeout.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider da5 activated.
Mar  3 01:25:52 mars kernel: GEOM_MIRROR: Device vm: provider mirror/vm launched.


mirror/md1  COMPLETE  da1
                      da2
mirror/md2  COMPLETE  da3
                      da4
 mirror/vm  DEGRADED  da5

I'm not using da5 right now, it's just in there ... went with a RAID1+0 vs RAID5 
configuration ...




--On Thursday, March 01, 2007 09:51:13 +1100 Antony Mawer 
[EMAIL PROTECTED] wrote:

 On 27/02/2007 11:59 PM, Marc G. Fournier wrote:
 After 155 days of problem free uptime, I upgraded my 6-STABLE system the
 other  day to the latest cvsup ... 3 days later, the whole thing hung solid
 with:


 Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
 Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see
 tuning(7) and login.conf(5).
 Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see
 tuning(7) and login.conf(5).

 Stupid question: why isn't there some mechanism that prevents new processes
 from starting up, instead of locking up the whole server?  I'm not asking
 for  the evilness of Linux, where it arbitrarily kills off existing
 processes, but  if maxproc is hit, why continue to try and start up new ones?

 What do you define as 'hung solid'? You are unable to get in via SSH? Or at a
 console via iLO/etc?

 I've seen this on some of our 6.0-RELEASE machines (along with maxpipekva
 exhausted errors), and you can't SSH in from that point... because sshd forks
 to handle the connection, and all available process slots are used up.

 I've thought about writing a background daemon to monitor the logs for signs
 of this (or even to just try and create a short-lived child process by
 fork()ing every 5 minutes or so), and dump information to disk then reboot
 the system when this occurs... it's a work-around for something that
 shouldn't happen, but it does anyway... once I'm able to identify _what_ is
 causing the build-up of processes, then I might be able to do something about
 killing them...!!!


 It's quite deceptive from an end-user point of view, because things like
 Apache that are already running keep running, so all they see are strange
 bits and pieces that don't work... and as always, it's one of those things
 that only happens on some clients' machines, but never on any of our test
 machines...

 --Antony


 PS. I haven't disappeared off the face of the earth.. though close.. my
 fiance and I have been busy planning the wedding, and wound up buying

Re: Some days, it doesn't pay to upgrade ...

2007-02-28 Thread Marc G. Fournier




--On Tuesday, February 27, 2007 20:18:50 -0800 Tom Samplonius 
[EMAIL PROTECTED] wrote:


 - Marc G. Fournier [EMAIL PROTECTED] wrote:
 Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
 Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see
 tuning(7) and login.conf(5).
 Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see
 tuning(7) and login.conf(5).

 Stupid question: why isn't there some mechanism that prevents new processes
 from starting up, instead of locking up the whole server?  I'm not asking for
 ...

   Isn't that what is happening?  When maxproc is hit, new processes can't be
 created.  It is harmless, except for the uid that exceeded its process limit.

   I think the hang is some side-effect, perhaps because init can't fork a
 process, so there is nothing to log in to.  Did you try pinging the system
 remotely to see whether it was really a solid hang?  Or did you just
 pound on the keyboard?

ping continues to work ... it's a remote server, without a serial console, so 
doing much more on that particular server is a bit more difficult :(  all our 
newer stuff (which, of course, is running great) has remote consoles set up on 
it ...


Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Some days, it doesn't pay to upgrade ...

2007-02-27 Thread Marc G. Fournier



After 155 days of problem free uptime, I upgraded my 6-STABLE system the other 
day to the latest cvsup ... 3 days later, the whole thing hung solid with:


Feb 27 04:32:49 mars uptimec: The server requested that we do a new login
Feb 27 04:33:00 mars kernel: maxproc limit exceeded by uid 0, please see 
tuning(7) and login.conf(5).
Feb 27 04:33:10 mars kernel: maxproc limit exceeded by uid 60, please see 
tuning(7) and login.conf(5).

Stupid question: why isn't there some mechanism that prevents new processes 
from starting up, instead of locking up the whole server?  I'm not asking for 
the evilness of Linux, where it arbitrarily kills off existing processes, but 
if maxproc is hit, why continue to try and start up new ones?

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Marc G. Fournier



Working on upgrading and applying patch right now ... thanks ...

--On Sunday, January 07, 2007 14:03:41 + Robert Watson 
[EMAIL PROTECTED] wrote:


 On Sat, 6 Jan 2007, Marc G. Fournier wrote:

 Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17
 01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if
 there is information that I can provide out of it ...

 Fatal trap 12: page fault while in kernel mode
 cpuid = 0; apic id = 00
 fault virtual address   = 0x18c
 fault code  = supervisor read, page not present
 instruction pointer = 0x8:0x801f9053
 stack pointer   = 0x10:0xb5c78b30
 frame pointer   = 0x10:0xb5c78b60
 code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
 processor eflags= resume, IOPL = 0
 current process = 5 (thread taskq)
 trap number = 12
 panic: page fault
 cpuid = 0
 Uptime: 8d22h25m40s

 (kgdb) where
 # 0  doadump () at pcpu.h:172
 # 1  0x80203955 in boot (howto=260) at
 /usr/src/sys/kern/kern_shutdown.c:409
 # 2  0x80204065 in panic (fmt=0xff019b667720
 X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ) at
 /usr/src/sys/kern/kern_shutdown.c:565
 # 3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
 /usr/src/sys/amd64/amd64/trap.c:660
 # 4  0x80328cd8 in trap (frame=
  {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730,
 tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx =
 -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0,
 tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12,
 tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085,
 tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at
 /usr/src/sys/amd64/amd64/trap.c:238
 # 5  0x80313c6b in calltrap () at
 /usr/src/sys/amd64/amd64/exception.S:168
 # 6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
 tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102 out of
 bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
 # 7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
 /usr/src/sys/kern/uipc_usrreq.c:1714
 # 8  0x8022c314 in taskqueue_run (queue=0xff844800) at
 /usr/src/sys/kern/subr_taskqueue.c:257
 # 9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
 /usr/src/sys/kern/subr_taskqueue.c:376
 # 10 0x801e7b76 in fork_exit (callout=0x8022d060
 taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at
 /usr/src/sys/kern/kern_fork.c:821
 # 11 0x80313fce in fork_trampoline () at
 /usr/src/sys/amd64/amd64/exception.S:394

 This is a NULL pointer dereference in the UNIX domain socket code.  John
 Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT,
 with an MFC planned in the near future.  The fix won't make 6.2-RELEASE, but
 assuming it tests out well over the next few weeks, we will cut an errata
 patch/announcement for it.  I believe you can pull down his 6-STABLE version
 at:

http://people.FreeBSD.org/~jhb/patches/unp_gc.patch

 This same patch is currently in testing on mx1.FreeBSD.org.

 (John CC'd)

 Robert N M Watson
 Computer Laboratory
 University of Cambridge



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: Fatal trap 12: page fault while in kernel mode

2007-01-07 Thread Marc G. Fournier



I'm up and running on the patch now as well ...

--On Sunday, January 07, 2007 17:02:40 -0800 Kevin Oberman 
[EMAIL PROTECTED] wrote:

 Date: Sun, 7 Jan 2007 14:03:41 + (GMT)
 From: Robert Watson [EMAIL PROTECTED]
 Sender: [EMAIL PROTECTED]

 On Sat, 6 Jan 2007, Marc G. Fournier wrote:

  Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17
  01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core 
  if  there is information that I can provide out of it ...
 
  Fatal trap 12: page fault while in kernel mode
  cpuid = 0; apic id = 00
  fault virtual address   = 0x18c
  fault code  = supervisor read, page not present
  instruction pointer = 0x8:0x801f9053
  stack pointer   = 0x10:0xb5c78b30
  frame pointer   = 0x10:0xb5c78b60
  code segment= base 0x0, limit 0xf, type 0x1b
 = DPL 0, pres 1, long 1, def32 0, gran 1
  processor eflags= resume, IOPL = 0
  current process = 5 (thread taskq)
  trap number = 12
  panic: page fault
  cpuid = 0
  Uptime: 8d22h25m40s
 
  (kgdb) where
  # 0  doadump () at pcpu.h:172
  # 1  0x80203955 in boot (howto=260) at
  /usr/src/sys/kern/kern_shutdown.c:409
  # 2  0x80204065 in panic (fmt=0xff019b667720
  X\223f\233\001???\020?c\233\001???) at
  /usr/src/sys/kern/kern_shutdown.c:565
  # 3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at
  /usr/src/sys/amd64/amd64/trap.c:660
  # 4  0x80328cd8 in trap (frame=
   {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx =
   3221225730, tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1,
  tf_rbx = -1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536,
  tf_r11 = 0, tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1,
  tf_trapno = 12, tf_addr = 396, tf_flags = -2145197496, tf_err = 0,
  tf_rip = -2145415085, tf_cs = 8, tf_rflags = 65538, tf_rsp =
  -1245213888, tf_ss = 16}) at
  /usr/src/sys/amd64/amd64/trap.c:238
  # 5  0x80313c6b in calltrap () at
  /usr/src/sys/amd64/amd64/exception.S:168
  # 6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0,
  tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102
  out of bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
  # 7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at
  /usr/src/sys/kern/uipc_usrreq.c:1714
  # 8  0x8022c314 in taskqueue_run (queue=0xff844800) at
  /usr/src/sys/kern/subr_taskqueue.c:257
  # 9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at
  /usr/src/sys/kern/subr_taskqueue.c:376
  # 10 0x801e7b76 in fork_exit (callout=0x8022d060
  taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at
  /usr/src/sys/kern/kern_fork.c:821
  # 11 0x80313fce in fork_trampoline () at
  /usr/src/sys/amd64/amd64/exception.S:394

 This is a NULL pointer dereference in the UNIX domain socket code.  John
 Baldwin recently committed a fix for a bug with these symptoms to 7-CURRENT,
 with an MFC planned in the near future.  The fix won't make 6.2-RELEASE,
 but assuming it tests out well over the next few weeks, we will cut an
 errata patch/announcement for it.  I believe you can pull down his
 6-STABLE version at:

http://people.FreeBSD.org/~jhb/patches/unp_gc.patch

 This same patch is currently in testing on mx1.FreeBSD.org.

 (John CC'd)

 Robert N M Watson
 Computer Laboratory
 University of Cambridge

 I have installed this on my system, but the panics have always been very
 erratic, so it may be a while before I am sure whether this fixes it. At
 the moment the system has been up for 7 days, although I have had
 multiple crashes in a single day.
 --
 R. Kevin Oberman, Network Engineer
 Energy Sciences Network (ESnet)
 Ernest O. Lawrence Berkeley National Laboratory (Berkeley Lab)
 E-mail: [EMAIL PROTECTED] Phone: +1 510 486-8634
 Key fingerprint:059B 2DDF 031C 9BA3 14A4  EADA 927D EBB3 987B 3751



- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Fatal trap 12: page fault while in kernel mode

2007-01-05 Thread Marc G. Fournier



Just had the following happen on a FreeBSD 6.2-PRERELEASE #7: Sun Dec 17 
01:28:52 AST 2006 system ... amd64, HP Proliant, 6G of RAM ... have core if 
there is information that I can provide out of it ...


Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0x18c
fault code  = supervisor read, page not present
instruction pointer = 0x8:0x801f9053
stack pointer   = 0x10:0xb5c78b30
frame pointer   = 0x10:0xb5c78b60
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags= resume, IOPL = 0
current process = 5 (thread taskq)
trap number = 12
panic: page fault
cpuid = 0
Uptime: 8d22h25m40s

(kgdb) where
#0  doadump () at pcpu.h:172
#1  0x80203955 in boot (howto=260) at 
/usr/src/sys/kern/kern_shutdown.c:409
#2  0x80204065 in panic (fmt=0xff019b667720 
X\223f\233\001ÿÿÿ\020µc\233\001ÿÿÿ) at 
/usr/src/sys/kern/kern_shutdown.c:565
#3  0x803287a6 in trap_fatal (frame=0xc, eva=18446742981100074784) at 
/usr/src/sys/amd64/amd64/trap.c:660
#4  0x80328cd8 in trap (frame=
  {tf_rdi = 112, tf_rsi = -1092609476832, tf_rdx = 6, tf_rcx = 3221225730, 
tf_r8 = -1245213424, tf_r9 = -1092609476832, tf_rax = 1, tf_rbx = 
-1096874331952, tf_rbp = -1245213856, tf_r10 = -2142258536, tf_r11 = 0, 
tf_r12 = 4, tf_r13 = -1092609476832, tf_r14 = 4, tf_r15 = 1, tf_trapno = 12, 
tf_addr = 396, tf_flags = -2145197496, tf_err = 0, tf_rip = -2145415085, 
tf_cs = 8, tf_rflags = 65538, tf_rsp = -1245213888, tf_ss = 16}) at 
/usr/src/sys/amd64/amd64/trap.c:238
#5  0x80313c6b in calltrap () at 
/usr/src/sys/amd64/amd64/exception.S:168
#6  0x801f9053 in _mtx_lock_sleep (m=0xff009d31f0d0, 
tid=18446742981100074784, opts=6, file=0xc102 Address 0xc102 out of 
bounds, line=-1245213424) at /usr/src/sys/kern/kern_mutex.c:546
#7  0x8025b1ac in unp_gc (arg=0x70, pending=-1687783648) at 
/usr/src/sys/kern/uipc_usrreq.c:1714
#8  0x8022c314 in taskqueue_run (queue=0xff844800) at 
/usr/src/sys/kern/subr_taskqueue.c:257
#9  0x8022d0e7 in taskqueue_thread_loop (arg=0x70) at 
/usr/src/sys/kern/subr_taskqueue.c:376
#10 0x801e7b76 in fork_exit (callout=0x8022d060 
taskqueue_thread_loop, arg=0x805030d0, frame=0xb5c78c50) at 
/usr/src/sys/kern/kern_fork.c:821
#11 0x80313fce in fork_trampoline () at 
/usr/src/sys/amd64/amd64/exception.S:394
#12 0x in ?? ()
#13 0x in ?? ()
#14 0x0001 in ?? ()
#15 0x in ?? ()
#16 0x in ?? ()
#17 0x in ?? ()
#18 0x in ?? ()
#19 0x in ?? ()
#20 0x in ?? ()
#21 0x in ?? ()
#22 0x in ?? ()
#23 0x in ?? ()
#24 0x in ?? ()
#25 0x in ?? ()
#26 0x in ?? ()
#27 0x in ?? ()
#28 0x in ?? ()
#29 0x in ?? ()
#30 0x in ?? ()
#31 0x in ?? ()
#32 0x in ?? ()
#33 0x in ?? ()
#34 0x in ?? ()
#35 0x in ?? ()
#36 0x in ?? ()
#37 0x in ?? ()
#38 0x in ?? ()
#39 0x in ?? ()
#40 0x in ?? ()
#41 0x in ?? ()
#42 0x in ?? ()
#43 0x in ?? ()
#44 0x006bc000 in ?? ()
#45 0x805054c0 in turnstile_chains ()
#46 0x0001 in ?? ()
#47 0xff019b669358 in ?? ()
#48 0xff008d5bc720 in ?? ()
#49 0xb5c78aa0 in ?? ()
#50 0xb5c78a78 in ?? ()
#51 0xff019b667720 in ?? ()
#52 0x8021a69f in sched_switch (td=0x805030d0, 
newtd=0x8022d060, flags=0) at /usr/src/sys/kern/sched_4bsd.c:973
Previous frame inner to this frame (corrupt stack?)
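
As a quick sanity check on the trap output above, the fault address is far below the first page boundary, which is what a load through a NULL structure pointer plus a small field offset tends to look like. A minimal sketch in Python; the 4096-byte page size and the `looks_like_null_offset` helper are assumptions for illustration:

```python
PAGE_SIZE = 4096  # assumed page size, not taken from the trap output


def looks_like_null_offset(fault_addr, page_size=PAGE_SIZE):
    """True when the address falls inside the (normally unmapped) first page."""
    return 0 <= fault_addr < page_size


fault_addr = int("0x18c", 16)  # "fault virtual address" from the trap
tf_addr = 396                  # tf_addr from the trap frame in the backtrace

print(fault_addr == tf_addr)               # the two values agree
print(looks_like_null_offset(fault_addr))
```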

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664

Re: Sleepy thread - Kernel Panic

2006-12-29 Thread Marc G. Fournier



Yours makes the third report of this that I know of ... one of us is running 
6.2-RC, one 6.1-RELEASE ... what version are you running?  I get the same 
'hang' also ...

Have you enabled DDB in your kernel?  Also, have you enabled the dumpdev 
settings in /etc/rc.conf?   

--On Thursday, December 28, 2006 17:27:38 +0545 Tek Bahadur Limbu 
[EMAIL PROTECTED] wrote:



 Dear All,

 I need some help on the problem below.

 The following error occurs in my FreeBSD 6.1 (Dell 420) server:


 Sleeping thread (tid 540242, pid 32378) owns a non-sleepable lock
 panic: sleeping thread

 Cannot dump. No dump device defined.

 Automatic reboot in 15 seconds - press a key on the console to abort.
 Rebooting


 However, it does not reboot and simply hangs.

 I have tried commenting out the PROCFS option, which seemed to work for 2
 days. However, on the 3rd day the same problem surfaced again.

 I think it is probably a hardware problem. Does anybody have
 some ideas regarding this problem?


  --


 With best regards and good wishes,

 Yours sincerely,

 Tek Bahadur Limbu

 (TAG/TDG Group)
 Jwl Systems Department

 Worldlink Communications Pvt. Ltd.

 Jawalakhel, Nepal




- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
