Re: kqueue LOR

2006-12-12 Thread Suleiman Souhlal

Kostik Belousov wrote:

On Sun, Nov 26, 2006 at 09:30:39AM +0100, Václav Haisman wrote:


Hi,
the attached lor.txt contains a LOR I got yesterday. It is FreeBSD 6.1
with a relatively recent kernel, from last week or so.

--
VH




+lock order reversal:
+ 1st 0xc537f300 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
+ 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @ 
/usr/src/sys/ufs/ufs/ufs_vnops.c:138
+KDB: stack backtrace:
+kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at 
kdb_backtrace+0x2f
+witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at 
witness_checkorder+0x5fe
+_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at _mtx_lock_flags+0x32
+ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at ufs_itimes+0x6c
+ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at 
ufs_getattr+0x20
+VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at 
VOP_GETATTR_APV+0x3a
+filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
+knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
+VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at 
VOP_WRITE_APV+0x148
+vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at vn_write+0x201
+dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at dofilewrite+0x84
+kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
+write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
+syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
+Xint0x80_syscall() at Xint0x80_syscall+0x1f
+--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp = 0xbfbfea1c, ebp 
= 0xbfbfea48 ---



Thank you for the report. The LOR is caused by my commit into
sys/ufs/ufs/ufs_vnops.c, rev. 1.280.
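
For readers unfamiliar with this kind of report, the idea behind a lock order reversal check can be sketched in a few lines of userland C. This is only a toy model and not how witness(4) is implemented: the tracker structure, the function names and the small integer lock ids are all assumptions made for illustration. It remembers which lock was taken while another was held, and flags an acquisition that contradicts an order seen earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LOCKS 8	/* hypothetical limit for this sketch */

/* seen[a][b] is true once lock b has been acquired while a was held. */
struct order_tracker {
	bool seen[MAX_LOCKS][MAX_LOCKS];
	bool held[MAX_LOCKS];
};

static void
tracker_init(struct order_tracker *t)
{
	memset(t, 0, sizeof(*t));
}

/*
 * Record the acquisition of lock `id`.  Returns true if this
 * acquisition reverses an order that was recorded earlier.
 */
static bool
tracker_acquire(struct order_tracker *t, int id)
{
	bool reversal = false;

	assert(id >= 0 && id < MAX_LOCKS);
	for (int h = 0; h < MAX_LOCKS; h++) {
		if (!t->held[h] || h == id)
			continue;
		if (t->seen[id][h])
			reversal = true;	/* id before h was seen earlier */
		t->seen[h][id] = true;		/* remember h before id */
	}
	t->held[id] = true;
	return (reversal);
}

static void
tracker_release(struct order_tracker *t, int id)
{
	assert(id >= 0 && id < MAX_LOCKS);
	t->held[id] = false;
}
```

If one code path takes lock 0 and then lock 1, and a later path takes lock 1 and then lock 0, the second tracker_acquire() call on the later path returns true. A report like the one above arises in the same way when the kqueue lock and the mount mutex are taken in an order that contradicts an earlier one.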


Is the mount lock really required, if all we're doing is a single read of a 
single word (mnt_kern_flags) (v_mount should be read-only for the whole 
lifetime of the vnode, I believe)? After all, reads of a single word are atomic 
on all our supported architectures.
The only situation I see where there MIGHT be problems are forced unmounts, but 
I think there are bigger issues with those.
Sorry for noticing this email only now.

-- Suleiman
___
freebsd-stable@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to [EMAIL PROTECTED]


if_tun not working on [EMAIL PROTECTED]

2006-12-12 Thread Divacky Roman
hi

can anyone confirm that they have a working tunnel over the if_tun
device on 6.2 and amd64?

I cannot get it to work (using vtund). The configuration
is correct and it sets up, but no packets go through.

any ideas? thnx

roman



mountd(8) tries to remount / as R/W

2006-12-12 Thread Ed Schouten
Hello,

I'm currently working on a project at school where we're creating a
minimal fileserver that boots from flash and serves data from a harddisk
through NFS. Our /etc/exports is very small at the moment. It only
contains:

/disk

The problem is that FreeBSD tries to remount the flash as read-write.
When I perform some hard resets, the system boots with the message right
after starting mountd:

WARNING: R/W mount of / denied.  Filesystem is not clean - run fsck

Is it possible to teach mountd to leave the root filesystem alone?

Yours,
-- 
 Ed Schouten [EMAIL PROTECTED]
 WWW: http://g-rave.nl/




Re: if_tun not working on [EMAIL PROTECTED]

2006-12-12 Thread Vlad Galu

On 12/12/06, Divacky Roman [EMAIL PROTECTED] wrote:

hi

can anyone confirm that he has working tunnel over if_tun
device on 6.2 and amd64?


 Yes, I have OpenVPN using tun(4) running smoothly on a machine
running RELENG_6 on amd64.


I cannot get it work (using vtund). The configuration
is correct, it sets up but no packets go through.

any ideas? thnx

roman





--
If it's there, and you can see it, it's real.
If it's not there, and you can see it, it's virtual.
If it's there, and you can't see it, it's transparent.
If it's not there, and you can't see it, you erased it.


Re: mountd(8) tries to remount / as R/W

2006-12-12 Thread Rink Springer
Hi Ed,

On Tue, Dec 12, 2006 at 10:36:45AM +0100, Ed Schouten wrote:
 Is it possible to teach mountd to leave the root filesystem alone?

This might be due to a change I have made in mountd(8) to prevent it
from destroying filesystem flags; could you file a PR for me?

I assume this is a recent 6-STABLE, correct?

Regards,

-- 
Rink P.W. Springer- http://rink.nu
It's you isn't it? THE BASTARD OPERATOR FROM HELL!
In the flesh, on the phone and in your account...   - BOFH #3




Re: mountd(8) tries to remount / as R/W

2006-12-12 Thread Ed Schouten
* Rink Springer [EMAIL PROTECTED] wrote:
 On Tue, Dec 12, 2006 at 10:36:45AM +0100, Ed Schouten wrote:
  Is it possible to teach mountd to leave the root filesystem alone?
 
 This might be due to a change I have made in mountd(8) to prevent it
 from destroying filesystem flags; could you file a PR for me?

Done. http://www.freebsd.org/cgi/query-pr.cgi?pr=106636

 I assume this is a recent 6-STABLE, correct?

Yes, 6.2-PRERELEASE - two weeks old at most.

-- 
 Ed Schouten [EMAIL PROTECTED]
 WWW: http://g-rave.nl/




Re: kqueue LOR

2006-12-12 Thread Kostik Belousov
On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
 Kostik Belousov wrote:
 On Sun, Nov 26, 2006 at 09:30:39AM +0100, Václav Haisman wrote:
 
 Hi,
 the attached lor.txt contains LOR I got this yesterday. It is FreeBSD 6.1
 with relatively recent kernel, from last week or so.
 
 --
 VH
 
 
 +lock order reversal:
 + 1st 0xc537f300 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
 + 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @ 
 /usr/src/sys/ufs/ufs/ufs_vnops.c:138
 +KDB: stack backtrace:
 +kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at 
 kdb_backtrace+0x2f
 +witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at 
 witness_checkorder+0x5fe
 +_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at 
 _mtx_lock_flags+0x32
 +ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at 
 ufs_itimes+0x6c
 +ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at 
 ufs_getattr+0x20
 +VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at 
 VOP_GETATTR_APV+0x3a
 +filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
 +knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
 +VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at 
 VOP_WRITE_APV+0x148
 +vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at vn_write+0x201
 +dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at 
 dofilewrite+0x84
 +kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
 +write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
 +syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
 +Xint0x80_syscall() at Xint0x80_syscall+0x1f
 +--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp = 
 0xbfbfea1c, ebp = 0xbfbfea48 ---
 
 
 Thank you for the report. The LOR is caused by my commit into
 sys/ufs/ufs/ufs_vnops.c, rev. 1.280.
 
 Is the mount lock really required, if all we're doing is a single read of a 
 single word (mnt_kern_flags) (v_mount should be read-only for the whole 
 lifetime of the vnode, I believe)? After all, reads of a single word are 
 atomic on all our supported architectures.
 The only situation I see where there MIGHT be problems are forced unmounts, 
 but I think there are bigger issues with those.
 Sorry for noticing this email only now.

The problem is real with snapshotting. Ignoring the
MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading a stale value of
mnt_kern_flag) while setting IN_MODIFIED caused a deadlock at ufs vnode
inactivation time. This was the big trouble with nfsd and snapshots. As
such, I think that the precise value of mnt_kern_flag is critical there,
and the mount interlock is needed.

Practically speaking, I agree with the claim that the read of m_k_f is
surrounded by enough locked operations to make sure that
the value read is not stale. But there is no such guarantee on
future/non-i386 arches, is there?

As a side note, mount interlock scope could be reduced there.

Index: ufs/ufs/ufs_vnops.c
===
RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
retrieving revision 1.283
diff -u -r1.283 ufs_vnops.c
--- ufs/ufs/ufs_vnops.c 6 Nov 2006 13:42:09 -   1.283
+++ ufs/ufs/ufs_vnops.c 12 Dec 2006 10:18:04 -
@@ -133,19 +134,19 @@
 {
 	struct inode *ip;
 	struct timespec ts;
-	int mnt_locked;
 
 	ip = VTOI(vp);
-	mnt_locked = 0;
 	if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
 		VI_LOCK(vp);
 		goto out;
 	}
 	MNT_ILOCK(vp->v_mount);	/* For reading of mnt_kern_flags. */
-	mnt_locked = 1;
 	VI_LOCK(vp);
-	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
-		goto out_unl;
+	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
+		MNT_IUNLOCK(vp->v_mount);
+		VI_UNLOCK(vp);
+		return;
+	}
 
 	if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
 		ip->i_flag |= IN_LAZYMOD;
@@ -155,6 +156,7 @@
 		ip->i_flag |= IN_MODIFIED;
 	else if (ip->i_flag & IN_ACCESS)
 		ip->i_flag |= IN_LAZYACCESS;
+	MNT_IUNLOCK(vp->v_mount);
 	vfs_timestamp(&ts);
 	if (ip->i_flag & IN_ACCESS) {
 		DIP_SET(ip, i_atime, ts.tv_sec);
@@ -172,10 +174,7 @@
 
  out:
 	ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
- out_unl:
 	VI_UNLOCK(vp);
-	if (mnt_locked)
-		MNT_IUNLOCK(vp->v_mount);
 }
 
 /*




Re: RELENG_6 panic under heavy load

2006-12-12 Thread Dmitriy Kirhlarov
On Thu, Dec 07, 2006 at 11:18:52AM +0800, David Xu wrote:
 On Thursday 16 November 2006 19:15, Gleb Smirnoff wrote:
  On Thu, Nov 16, 2006 at 01:24:36PM +0300, Gleb Smirnoff wrote:
  T   I wonder why UMA was suspected to be the problem. Dima gave
  T me access to the core. Here are more details from the trace:
 
  It looks like a race between two threads in one process. Look here:
 
 Can you try the patch ?
 http://people.freebsd.org/~davidxu/patch/ksegrp_preempt.patch 

I've tested it. This patch also works, but with slightly different
behaviour. With the patch from jhb@ I got a load average (LA) of 7-8; with
this patch I have LA 5-6, the same as on the unpatched system. But it seems
to me that the system is less interactive compared to the jhb@ patch.

WBR
Dmitriy


Re: RELENG_6 panic under heavy load

2006-12-12 Thread David Xu
On Tuesday 12 December 2006 20:22, Dmitriy Kirhlarov wrote:
 On Thu, Dec 07, 2006 at 11:18:52AM +0800, David Xu wrote:
  On Thursday 16 November 2006 19:15, Gleb Smirnoff wrote:
   On Thu, Nov 16, 2006 at 01:24:36PM +0300, Gleb Smirnoff wrote:
   T   I wonder why UMA was suspected to be the problem. Dima gave
   T me access to the core. Here are more details from the trace:
  
   It looks like a race between two threads in one process. Look here:
 
  Can you try the patch ?
  http://people.freebsd.org/~davidxu/patch/ksegrp_preempt.patch

 I've tested it. This patch works also, but with a little bit different
 behaviour. With patch from jhb@ I got LA 7-8, with this patch I have
 LA 5-6, same as on unpatched system. But it seems to me, that system
 is less interactive, compared to jhb@ patch.

 WBR
 Dmitriy

The jhb patch is incomplete: it implies that every place where a thread
makes a state transition and wakes another thread up should be patched.
There is other code in kern_sig.c left unpatched, and I don't know about
other places. The code in maybe_preempt_in_ksegrp should instead be synced
with maybe_preempt, which should fix all the problems. The LA you have seen
being lower than with jhb's patch might be in the nature of KSEGRP, but I am
not sure; if your program forces all threads to be system-scope, it might
fix the problem.

David Xu


Re: kqueue LOR

2006-12-12 Thread Bruce Evans

On Tue, 12 Dec 2006, Kostik Belousov wrote:


On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:



Is the mount lock really required, if all we're doing is a single read of a
single word (mnt_kern_flags) (v_mount should be read-only for the whole
lifetime of the vnode, I believe)? After all, reads of a single word are
atomic on all our supported architectures.
The only situation I see where there MIGHT be problems are forced unmounts,
but I think there are bigger issues with those.
Sorry for noticing this email only now.


The problem is real with snapshotting. Ignoring
MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
inactivation time. This was the big trouble with nfsd and snapshots. As
such, I think that precise value of mnt_kern_flag is critical there,
and mount interlock is needed.


Locking for just read is almost always bogus, but here (as in most
cases) there is also a write based on the contents of the flag, and
the lock is held across the write.


Practically speaking, I agree with claim that reading of m_k_f is
surrounded by enough locked operations that would make sure that
the read value is not stale. But there is no such guarantee on
future/non-i386 arches, isn't it ?


I think not-very-staleness is implied by acquire/release semantics
which are part of the API for most atomic operations.  This behaviour
doesn't seem to be documented for mutexes, but I don't see how mutexes
could work without it (they have to synchronize all memory accesses,
not just the memory accessed by the lock).


As a side note, mount interlock scope could be reduced there.

Index: ufs/ufs/ufs_vnops.c
===
RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
retrieving revision 1.283
diff -u -r1.283 ufs_vnops.c
--- ufs/ufs/ufs_vnops.c 6 Nov 2006 13:42:09 -   1.283
+++ ufs/ufs/ufs_vnops.c 12 Dec 2006 10:18:04 -
@@ -133,19 +134,19 @@
 {
 	struct inode *ip;
 	struct timespec ts;
-	int mnt_locked;
 
 	ip = VTOI(vp);
-	mnt_locked = 0;
 	if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
 		VI_LOCK(vp);
 		goto out;
 	}
 	MNT_ILOCK(vp->v_mount);	/* For reading of mnt_kern_flags. */
-	mnt_locked = 1;
 	VI_LOCK(vp);
-	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
-		goto out_unl;
+	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
+		MNT_IUNLOCK(vp->v_mount);
+		VI_UNLOCK(vp);
+		return;
+	}


The version that depends on not-very-staleness would test the flags
without acquiring the lock(s) and return immediately in the usual case
where none of the flags are set.  It would have to acquire the locks
and repeat the test to make changes (and the test is already repeated
one flag at a time).  I think this would be correct enough, but still
inefficient and/or even messier.  The current organization is usually:

acquire vnode interlock in caller
release vnode interlock in caller to avoid messes here (inefficient)
call
acquire mount interlock
acquire vnode interlock
test the flags; goto cleanup code if none set (usual case)
do the work
release vnode interlock
release mount interlock
return
acquire vnode interlock (if needed)
release vnode interlock (if needed)

and it might become:

acquire vnode interlock in caller
call
test the flags; return if none set (usual case)
release vnode interlock // check that callers are aware of this
acquire mount interlock
acquire vnode interlock
do the work
// Assume no LOR problem for release, as below.
// Otherwise need another release+acquire of vnode interlock.
release mount interlock
return
release vnode interlock
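
The "test without the lock, then lock and repeat the test" organization can be sketched in userland C as follows. This is only an illustration under stated assumptions: it uses a pthread mutex and C11 atomics in place of the kernel interlocks, a single lock instead of the mount and vnode pair, and the struct, flag values and function name are all hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits, standing in for IN_ACCESS and friends. */
#define F_ACCESS	0x1u
#define F_CHANGE	0x2u
#define F_UPDATE	0x4u
#define F_MODIFIED	0x8u
#define F_PENDING	(F_ACCESS | F_CHANGE | F_UPDATE)

struct node {
	pthread_mutex_t	lock;	/* stands in for the interlocks */
	atomic_uint	flags;
};

/* Returns true if any work was done. */
static bool
node_update(struct node *n)
{
	unsigned int f;

	assert(n != NULL);

	/* Cheap unlocked test: usually nothing is pending. */
	f = atomic_load_explicit(&n->flags, memory_order_relaxed);
	if ((f & F_PENDING) == 0)
		return (false);

	pthread_mutex_lock(&n->lock);
	/* The unlocked value may be stale, so repeat the test. */
	f = atomic_load_explicit(&n->flags, memory_order_relaxed);
	if ((f & F_PENDING) == 0) {
		pthread_mutex_unlock(&n->lock);
		return (false);
	}
	/* Do the work: clear the pending bits and mark as modified. */
	f = (f & ~F_PENDING) | F_MODIFIED;
	atomic_store_explicit(&n->flags, f, memory_order_relaxed);
	pthread_mutex_unlock(&n->lock);
	return (true);
}
```

The unlocked read only decides whether to bother taking the lock; every decision that leads to a write is made again with the lock held, which is the property the discussion above depends on.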



 	if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
 		ip->i_flag |= IN_LAZYMOD;
@@ -155,6 +156,7 @@
 		ip->i_flag |= IN_MODIFIED;
 	else if (ip->i_flag & IN_ACCESS)
 		ip->i_flag |= IN_LAZYACCESS;
+	MNT_IUNLOCK(vp->v_mount);
 	vfs_timestamp(&ts);
 	if (ip->i_flag & IN_ACCESS) {
 		DIP_SET(ip, i_atime, ts.tv_sec);


Is there no LOR problem for release?

As I understand it, MNT_ILOCK() is only protecting IN_ACCESS being
converted to IN_MODIFIED, so after this conversion is done the lock
is not needed.  Is this correct?


@@ -172,10 +174,7 @@

  out:
 	ip->i_flag &= ~(IN_ACCESS | IN_CHANGE | IN_UPDATE);
- out_unl:
 	VI_UNLOCK(vp);
-	if (mnt_locked)
-		MNT_IUNLOCK(vp->v_mount);
 }

/*



BTW, vfs.lookup_shared defaults to 0 and decides shared access for all
operations including read, so I wonder if there are [m]any bugs
preventing shared accesses 

Re: RELENG_6 panic under heavy load

2006-12-12 Thread Dmitriy Kirhlarov
On Tue, Dec 12, 2006 at 08:49:21PM +0800, David Xu wrote:

  I've tested it. This patch works also, but with a little bit different
  behaviour. With patch from jhb@ I got LA 7-8, with this patch I have
  LA 5-6, same as on unpatched system. But it seems to me, that system
  is less interactive, compared to jhb@ patch.
 
 jhb patch is incomplete, it implies that every place a thread is doing state
 transition and waking another thread up should be patched, there is
 other code in kern_sig.c unpatched, though I don't know other places,
 but the code maybe_preempt_in_ksegrp should be synced with
 maybe_preempt, it should fix all problems. the LA you have seen is lower
 than jhb might be a nature of KSEGRP, but I am not sure, if you program
 forces all threads to be system-scope, it might fix the problem. 

I think kern/105464 can be closed after your patch is committed.

WBR.
Dmitriy


Re: kqueue LOR

2006-12-12 Thread Kostik Belousov
On Tue, Dec 12, 2006 at 11:49:42PM +1100, Bruce Evans wrote:
 On Tue, 12 Dec 2006, Kostik Belousov wrote:
 
 On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
 
 Is the mount lock really required, if all we're doing is a single read of 
 a
 single word (mnt_kern_flags) (v_mount should be read-only for the whole
 lifetime of the vnode, I believe)? After all, reads of a single word are
 atomic on all our supported architectures.
 The only situation I see where there MIGHT be problems are forced 
 unmounts,
 but I think there are bigger issues with those.
 Sorry for noticing this email only now.
 
 The problem is real with snapshotting. Ignoring
 MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
 mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
 inactivation time. This was the big trouble with nfsd and snapshots. As
 such, I think that precise value of mnt_kern_flag is critical there,
 and mount interlock is needed.
 
 Locking for just read is almost always bogus, but here (as in most
 cases) there is also a write based on the contents of the flag, and
 the lock is held across the write.
 
 Practically speaking, I agree with claim that reading of m_k_f is
 surrounded by enough locked operations that would make sure that
 the read value is not stale. But there is no such guarantee on
 future/non-i386 arches, isn't it ?
 
 I think not-very-staleness is implied by acquire/release semantics
 which are part of the API for most atomic operations.  This behaviour
 doesn't seem to be documented for mutexes, but I don't see how mutexes
 could work without it (they have to synchronize all memory accesses,
 not just the memory accessed by the lock).
 
 As a side note, mount interlock scope could be reduced there.
 
 Index: ufs/ufs/ufs_vnops.c
 ===
 RCS file: /usr/local/arch/ncvs/src/sys/ufs/ufs/ufs_vnops.c,v
 retrieving revision 1.283
 diff -u -r1.283 ufs_vnops.c
 --- ufs/ufs/ufs_vnops.c  6 Nov 2006 13:42:09 -   1.283
 +++ ufs/ufs/ufs_vnops.c  12 Dec 2006 10:18:04 -
 @@ -133,19 +134,19 @@
  {
  	struct inode *ip;
  	struct timespec ts;
 -	int mnt_locked;
  
  	ip = VTOI(vp);
 -	mnt_locked = 0;
  	if ((vp->v_mount->mnt_flag & MNT_RDONLY) != 0) {
  		VI_LOCK(vp);
  		goto out;
  	}
  	MNT_ILOCK(vp->v_mount);	/* For reading of mnt_kern_flags. */
 -	mnt_locked = 1;
  	VI_LOCK(vp);
 -	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0)
 -		goto out_unl;
 +	if ((ip->i_flag & (IN_ACCESS | IN_CHANGE | IN_UPDATE)) == 0) {
 +		MNT_IUNLOCK(vp->v_mount);
 +		VI_UNLOCK(vp);
 +		return;
 +	}
 
 The version that depends on not-very-staleness would test the flags
 without acquiring the lock(s) and return immediately in the usual case
 where none of the flags are set.  It would have to acquire the locks
 and repeat the test to make changes (and the test is already repeated
 one flag at a time).  I think this would be correct enough, but still
 inefficient and/or even messier.  The current organization is usually:
 
 acquire vnode interlock in caller
 release vnode interlock in caller to avoid messes here (inefficient)
 call
 acquire mount interlock
 acquire vnode interlock
 test the flags; goto cleanup code if none set (usual case)
 do the work
 release vnode interlock
 release mount interlock
 return
 acquire vnode interlock (if needed)
 release vnode interlock (if needed)
 
 and it might become:
 
 acquire vnode interlock in caller
 call
 test the flags; return if none set (usual case)
 release vnode interlock   // check that callers are aware of 
 this
 acquire mount interlock
 acquire vnode interlock
 do the work
   // Assume no LOR problem for release, as below.
   // Otherwise need another release+acquire of vnode interlock.
 release mount interlock
 return
 release vnode interlock
 
 
  	if ((vp->v_type == VBLK || vp->v_type == VCHR) && !DOINGSOFTDEP(vp))
  		ip->i_flag |= IN_LAZYMOD;
 @@ -155,6 +156,7 @@
  		ip->i_flag |= IN_MODIFIED;
  	else if (ip->i_flag & IN_ACCESS)
  		ip->i_flag |= IN_LAZYACCESS;
 +	MNT_IUNLOCK(vp->v_mount);
  	vfs_timestamp(&ts);
  	if (ip->i_flag & IN_ACCESS) {
  		DIP_SET(ip, i_atime, ts.tv_sec);
 
 Is there no LOR problem for release?
AFAIK, no. Release is guaranteed to succeed without blocking.

 
 As I understand it, MNT_ILOCK() is only protecting IN_ACCESS being
 converted to IN_MODIFIED, so after this conversion is done the lock
 is not needed.  Is this correct?
The code shall set the IN_MODIFIED flag only if MNTK_SUSPEND and
MNTK_SUSPENDED are clear.
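
As a rough illustration of that rule, the decision can be written as a pure function. This is a simplified sketch and not the code from ufs_vnops.c: the bit values and the function names are hypothetical, and the real condition has more parts than the diff context shows. It follows only the two branches visible in the patch.

```c
#include <assert.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define SK_IN_ACCESS		0x01u
#define SK_IN_MODIFIED		0x02u
#define SK_IN_LAZYACCESS	0x04u
#define SK_MNTK_SUSPEND		0x10u
#define SK_MNTK_SUSPENDED	0x20u

/*
 * Return the new inode flags.  The caller is assumed to hold whatever
 * lock keeps mnt_kern_flag stable while the result is stored.
 */
static unsigned int
sketch_mark_inode(unsigned int i_flag, unsigned int mnt_kern_flag)
{
	if ((mnt_kern_flag & (SK_MNTK_SUSPEND | SK_MNTK_SUSPENDED)) == 0)
		i_flag |= SK_IN_MODIFIED;	/* no suspension in progress */
	else if (i_flag & SK_IN_ACCESS)
		i_flag |= SK_IN_LAZYACCESS;	/* defer the access time update */
	return (i_flag);
}

/* Small self-check showing the intended behaviour. */
static void
sketch_selfcheck(void)
{
	assert(sketch_mark_inode(SK_IN_ACCESS, 0) & SK_IN_MODIFIED);
	assert(!(sketch_mark_inode(SK_IN_ACCESS, SK_MNTK_SUSPEND) &
	    SK_IN_MODIFIED));
}
```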

The vfs_write_suspend() that set MNTK_SUSPEND, does VFS_SYNC() after setting
the flag. Since VFS_SYNC() 

Re: kqueue LOR

2006-12-12 Thread Attilio Rao

2006/12/12, Kostik Belousov [EMAIL PROTECTED]:

On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
 Kostik Belousov wrote:
 On Sun, Nov 26, 2006 at 09:30:39AM +0100, Václav Haisman wrote:
 
 Hi,
 the attached lor.txt contains LOR I got this yesterday. It is FreeBSD 6.1
 with relatively recent kernel, from last week or so.
 
 --
 VH
 
 
 +lock order reversal:
 + 1st 0xc537f300 kqueue (kqueue) @ /usr/src/sys/kern/kern_event.c:1547
 + 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @
 /usr/src/sys/ufs/ufs/ufs_vnops.c:138
 +KDB: stack backtrace:
 +kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at
 kdb_backtrace+0x2f
 +witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at
 witness_checkorder+0x5fe
 +_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at
 _mtx_lock_flags+0x32
 +ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at
 ufs_itimes+0x6c
 +ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at
 ufs_getattr+0x20
 +VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at
 VOP_GETATTR_APV+0x3a
 +filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
 +knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
 +VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at
 VOP_WRITE_APV+0x148
 +vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at vn_write+0x201
 +dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at
 dofilewrite+0x84
 +kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
 +write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
 +syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
 +Xint0x80_syscall() at Xint0x80_syscall+0x1f
 +--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp =
 0xbfbfea1c, ebp = 0xbfbfea48 ---
 
 
 Thank you for the report. The LOR is caused by my commit into
 sys/ufs/ufs/ufs_vnops.c, rev. 1.280.

 Is the mount lock really required, if all we're doing is a single read of a
 single word (mnt_kern_flags) (v_mount should be read-only for the whole
 lifetime of the vnode, I believe)? After all, reads of a single word are
 atomic on all our supported architectures.
 The only situation I see where there MIGHT be problems are forced unmounts,
 but I think there are bigger issues with those.
 Sorry for noticing this email only now.

The problem is real with snapshotting. Ignoring
MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
inactivation time. This was the big trouble with nfsd and snapshots. As
such, I think that precise value of mnt_kern_flag is critical there,
and mount interlock is needed.


This can be avoided by using a memory barrier when setting the flags.
Even if the use of memory barriers is not encouraged, some critical code
really should use them in place of mutex semantics (if it is worth it).
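
A minimal userland sketch of the barrier idea, using C11 atomics rather than the kernel's atomic(9) primitives, could look like the following. The struct, the field names and the flag value are hypothetical. The writer publishes the flag with release semantics and the reader loads it with acquire semantics, so a reader that sees the flag also sees the data written before it.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define K_SUSPEND	0x1u	/* hypothetical stand-in for MNTK_SUSPEND */

struct mnt {
	int		suspend_cookie;	/* ordinary data, written before the flag */
	atomic_uint	kern_flag;
};

/* Writer: store the data, then set the flag with release semantics. */
static void
mnt_begin_suspend(struct mnt *mp, int cookie)
{
	assert(mp != NULL);
	mp->suspend_cookie = cookie;
	atomic_fetch_or_explicit(&mp->kern_flag, K_SUSPEND,
	    memory_order_release);
}

/*
 * Reader: load the flag with acquire semantics.  If the flag is set,
 * the cookie written before it is guaranteed to be visible.
 */
static bool
mnt_is_suspending(struct mnt *mp, int *cookie)
{
	unsigned int f;

	assert(mp != NULL && cookie != NULL);
	f = atomic_load_explicit(&mp->kern_flag, memory_order_acquire);
	if ((f & K_SUSPEND) == 0)
		return (false);
	*cookie = mp->suspend_cookie;
	return (true);
}
```

Note that this orders memory accesses but does not stop the flag from changing immediately after the read, so it does not by itself replace a lock that is held across a write which depends on the flag.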

Attilio

--
Peace can only be achieved by understanding - A. Einstein


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


Hello Doug

On 11 Dec 2006, at 21:46, Doug Barton wrote:


Greg Eden wrote:

Hello

I recently updated two production servers from 5.3 to 6.1 via cvsup and
buildworld. Since the upgrade I've seen an increase in the number of
input packet errors reported on the bge cards in both boxes. One is an
HP DL360g3, the other is an HP DL380g3. Both have a pair of 2.8GHz Xeons
with an SMP kernel.


It would be quite useful at this point if you could update a box or
two to the RELENG_6_2 code base so that we can see if this problem is
solved in the latest release candidate. If it is, your problem is
solved, and if it's not, you're a lot closer to the point where we can
usefully assist you.


I've just updated a box (HP DL360g4) to RELENG_6_2 as requested.  
Problem still appears to be present. sftping over 750MB of log files  
produced the characteristic errors.


netstat -i
Name    Mtu Network       Address              Ipkts Ierrs    Opkts Oerrs  Coll
bge0   1500 Link#1        00:12:79:3b:**:**   313091    24   217911     0     0
bge0   1500 192.168.100   192.168.***.***     312894     -   217905     -     -
bge1*  1500 Link#2        00:12:79:3b:**:**        0     0        0     0     0
lo0   16384 Link#3                                 0     0        0     0     0
lo0   16384 your-net      localhost                0     -        0     -     -


Here's the dmesg output for the box.

Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
        The Regents of the University of California. All rights reserved.

FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 6.2-RC1 #0: Tue Dec 12 14:07:47 GMT 2006
:/usr/obj/usr/src/sys/MASAQ
ACPI APIC Table: HP 0083
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf41  Stepping = 1
   
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
  AMD Features=0x2010<NX,LM>
  Logical CPUs per core: 2
real memory  = 1073688576 (1023 MB)
avail memory = 1045893120 (997 MB)
ioapic1: Changing APIC ID to 9
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
ioapic3 Version 2.0 irqs 72-95 on motherboard
kbd1 at kbdmux0
acpi0: HP P52 on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-safe frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x908-0x90b on acpi0
cpu0: ACPI CPU on acpi0
pcib0: ACPI Host-PCI bridge on acpi0
pci0: ACPI PCI bus on pcib0
pcib1: ACPI PCI-PCI bridge at device 2.0 on pci0
pci13: ACPI PCI bus on pcib1
pcib2: ACPI PCI-PCI bridge at device 4.0 on pci0
pci6: ACPI PCI bus on pcib2
pcib3: ACPI PCI-PCI bridge at device 0.0 on pci6
pci7: ACPI PCI bus on pcib3
pcib4: ACPI PCI-PCI bridge at device 0.2 on pci6
pci10: ACPI PCI bus on pcib4
pcib5: ACPI PCI-PCI bridge at device 6.0 on pci0
pci3: ACPI PCI bus on pcib5
pcib6: ACPI PCI-PCI bridge at device 28.0 on pci0
pci2: ACPI PCI bus on pcib6
ciss0: HP Smart Array 6i port 0x4000-0x40ff mem 0xfdff-0xfdff1fff,0xfdf8-0xfdfb irq 24 at device 1.0 on pci2
ciss0: [GIANT-LOCKED]
bge0: Broadcom BCM5704 B0, ASIC rev. 0x2100 mem 0xfdf7-0xfdf7 irq 25 at device 2.0 on pci2
miibus0: MII bus on bge0
brgphy0: BCM5704 10/100/1000baseTX PHY on miibus0
brgphy0:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge0: Ethernet address: 00:12:79:3b:83:52
bge1: Broadcom BCM5704 B0, ASIC rev. 0x2100 mem 0xfdf6-0xfdf6 irq 26 at device 2.1 on pci2
miibus1: MII bus on bge1
brgphy1: BCM5704 10/100/1000baseTX PHY on miibus1
brgphy1:  10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, 1000baseTX, 1000baseTX-FDX, auto
bge1: Ethernet address: 00:12:79:3b:83:51
uhci0: UHCI (generic) USB controller port 0x2000-0x201f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: UHCI (generic) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: UHCI (generic) USB controller port 0x2020-0x203f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: UHCI (generic) USB controller on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
pci0: base peripheral at device 29.4 (no driver attached)
pci0: base peripheral, interrupt controller at device 29.5 (no driver attached)
ehci0: Intel 6300ESB USB 2.0 controller mem 0xfbee-0xfbee03ff irq 23 at device 29.7 on pci0
ehci0: [GIANT-LOCKED]
usb2: EHCI version 1.0
usb2: companion controllers, 2 ports each: usb0 usb1
usb2: Intel 6300ESB USB 2.0 controller on ehci0
usb2: USB 

2G+ KVA on amd64

2006-12-12 Thread Igor Sysoev

Can FreeBSD/amd64 support more than 2G of KVA?
Is it safe to set KPDPI to (NPDPEPG-4) in /sys/amd64/include/pmap.h
to get 4G KVA ?


Igor Sysoev
http://sysoev.ru/en/


6.2-RC1 em(4) issue - freezes during CD boot

2006-12-12 Thread Chris Buechler
I'm attempting to install 6.2-RC1 on a box that ran 5.3, 5.4, 6.0, and 
6.1 for the past few years, but the 6.2-RC1 CD won't boot.  After it 
detects the em0 NIC, before it gets into sysinstall, it freezes solid.  
I've let it sit there for over an hour. 

I know there has been some recent work on em(4), but it doesn't appear 
all the issues are resolved.  I'm not sure what I should try, what 
information I could provide that might be useful, etc.  I'd be more than 
happy to help troubleshoot this, with some guidance on what I could do 
to be helpful. 


Cheers,
-Chris



Re: 6.2-RC1 em(4) issue - freezes during CD boot

2006-12-12 Thread Jack Vogel

On 12/12/06, Chris Buechler [EMAIL PROTECTED] wrote:

I'm attempting to install 6.2-RC1 on a box that ran 5.3, 5.4, 6.0, and
6.1 for the past few years, but the 6.2-RC1 CD won't boot.  After it
detects the em0 NIC, before it gets into sysinstall, it freezes solid.
I've let it sit there for over an hour.

I know there has been some recent work on em(4), but it doesn't appear
all the issues are resolved.  I'm not sure what I should try, what
information I could provide that might be useful, etc.  I'd be more than
happy to help troubleshoot this, with some guidance on what I could do
to be helpful.


Uh, can't do anything if you don't give any information; saying it's a 'box'
and it has an 'em' is useless. We have installed RC1 on a number of
systems without problem.

When you give this kind of report, think about it as if you had to debug
it yourself (which you do :). What is the system? What does pciconf show
(it will tell us what kind of em it is)? How about 6.2-BETA, did it build?
Can you pull the NIC out, and does it then install? Etc.

Cheers,

Jack


FreeBSD 6.2

2006-12-12 Thread Jack Raats
Hi everyone

http://www.freebsd.org/releases/6.2R/schedule.html has not been updated for a 
long time.

Is FreeBSD 6.2-RELEASE going to be a Christmas present?

Jack


FreeBSD 6.2-RELEASE a Xmas present?

2006-12-12 Thread Jack Raats
Hi everyone

http://www.freebsd.org/releases/6.2R/schedule.html has not been updated for a 
long time.

Is FreeBSD 6.2-RELEASE going to be a Christmas present?

Jack


Re: 6.2-RC1 em(4) issue - freezes during CD boot

2006-12-12 Thread Jack Vogel

On 12/12/06, Jack Vogel [EMAIL PROTECTED] wrote:

On 12/12/06, Chris Buechler [EMAIL PROTECTED] wrote:
 I'm attempting to install 6.2-RC1 on a box that ran 5.3, 5.4, 6.0, and
 6.1 for the past few years, but the 6.2-RC1 CD won't boot.  After it
 detects the em0 NIC, before it gets into sysinstall, it freezes solid.
 I've let it sit there for over an hour.

 I know there has been some recent work on em(4), but it doesn't appear
 all the issues are resolved.  I'm not sure what I should try, what
 information I could provide that might be useful, etc.  I'd be more than
 happy to help troubleshoot this, with some guidance on what I could do
 to be helpful.

Uh, can't do anything if you don't give any information; saying it's a 'box'
and it has an 'em' is useless. We have installed RC1 on a number of
systems without problem.

When you give this kind of report, think about it as if you had to debug
it yourself (which you do :). What is the system? What does pciconf show
(it will tell us what kind of em it is)? How about 6.2-BETA, did it build?
Can you pull the NIC out, and does it then install? Etc.


Another suggestion, boot with ACPI off or even SAFE mode and see
if that helps.

Jack


asr / raidutil not available for 6.x either ... ? :(

2006-12-12 Thread Marc G. Fournier


Installed it from ports, and it installed the compat4x stuff, but although it 
runs with no args, as soon as I try to access the controller:


# raidutil -d 1 -L physical
osdIOrequest : File /dev/rdptr17 Could Not Be Opened
Engine connect failed: COMPATILITY number


so I take it that like the storcon stuff, one is out of luck running FreeBSD 
6.x? :(


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: FreeBSD 6.2

2006-12-12 Thread Wilko Bulte
On Tue, Dec 12, 2006 at 06:41:34PM +0100, Jack Raats wrote..
 Hi everyone
 
 http://www.freebsd.org/releases/6.2R/schedule.html has not been updated for a 
 long time.
 
 Is FreeBSD 6.2-RELEASE going to be a Christmas present?

Only if Santa thinks you have been nice to your family the last year :)

-- 
Wilko Bulte [EMAIL PROTECTED]


Re: FreeBSD 6.2

2006-12-12 Thread Jack Raats



Hi everyone

http://www.freebsd.org/releases/6.2R/schedule.html has not been updated 
for a long time.


Is FreeBSD 6.2-RELEASE going to be a Christmas present?


Only if Santa thinks you have been nice to your family the last year :)

--
Wilko Bulte [EMAIL PROTECTED]


Santa knows, so that's the reason I'm getting FreeBSD 6.3-RELEASE :-) 




Re: kqueue LOR

2006-12-12 Thread Suleiman Souhlal

Attilio Rao wrote:

2006/12/12, Kostik Belousov [EMAIL PROTECTED]:


On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
 Kostik Belousov wrote:
 On Sun, Nov 26, 2006 at 09:30:39AM +0100, V??clav Haisman wrote:
 
 Hi,
 the attached lor.txt contains LOR I got this yesterday. It is 
FreeBSD 6.1

 with relatively recent kernel, from last week or so.
 
 --
 VH
 
 
 +lock order reversal:
 + 1st 0xc537f300 kqueue (kqueue) @ 
/usr/src/sys/kern/kern_event.c:1547

 + 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @
 /usr/src/sys/ufs/ufs/ufs_vnops.c:138
 +KDB: stack backtrace:
 +kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at
 kdb_backtrace+0x2f
 +witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at
 witness_checkorder+0x5fe
 +_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at
 _mtx_lock_flags+0x32
 +ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at
 ufs_itimes+0x6c
 +ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at
 ufs_getattr+0x20
 +VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at
 VOP_GETATTR_APV+0x3a
 +filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
 +knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
 +VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at
 VOP_WRITE_APV+0x148
 +vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at 
vn_write+0x201

 +dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at
 dofilewrite+0x84
 +kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
 +write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
 +syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
 +Xint0x80_syscall() at Xint0x80_syscall+0x1f
 +--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp =
 0xbfbfea1c, ebp = 0xbfbfea48 ---
 
 
 Thank you for the report. The LOR is caused by my commit into
 sys/ufs/ufs/ufs_vnops.c, rev. 1.280.

 Is the mount lock really required, if all we're doing is a single 
read of a

 single word (mnt_kern_flags) (v_mount should be read-only for the whole
 lifetime of the vnode, I believe)? After all, reads of a single word 
are

 atomic on all our supported architectures.
 The only situation I see where there MIGHT be problems are forced 
unmounts,

 but I think there are bigger issues with those.
 Sorry for noticing this email only now.

The problem is real with snapshotting. Ignoring
MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
inactivation time. This was the big trouble with nfsd and snapshots. As
such, I think that the precise value of mnt_kern_flag is critical there,
and mount interlock is needed.



This can be avoided using a memory barrier when setting flags.
Even if memory barrier usage is not encouraged, some critical code
should really use them in place of mutex semantics (if it is worth it).


Why is memory barrier usage not encouraged? As you said, they can be used to 
reduce the number of atomic (LOCKed) operations, in some cases.

FWIW, Linux has rmb() (load mem barrier), wmb() (store mem barrier), mb() (load/store mem 
barrier), smp_rmb(), smp_wmb(), smp_mb() (mem barriers only needed on SMP), and barrier() 
(GCC barrier: __asm __volatile("" ::: "memory")) macros that I've personally found 
very useful.
Admittedly, they are harder to use than atomic operations, but it might still 
be worth having something similar.
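
For illustration, here is a minimal userland sketch of how such barrier macros are typically used to publish data through a flag. It is not Linux or FreeBSD kernel code: the macro names compiler_barrier(), store_barrier() and load_barrier() and the mailbox type are hypothetical, and C11 fences from <stdatomic.h> are swapped in as rough stand-ins for barrier(), wmb() and rmb().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-ins built on C11 fences (not the Linux macros). */
#define compiler_barrier() atomic_signal_fence(memory_order_seq_cst)
#define store_barrier()    atomic_thread_fence(memory_order_release)
#define load_barrier()     atomic_thread_fence(memory_order_acquire)

struct mailbox {
    int payload;        /* plain data, ordered only by the fences */
    atomic_bool ready;  /* flag accessed with relaxed atomics */
};

/* Writer: store the payload, then the flag, in that order. */
static void mailbox_publish(struct mailbox *m, int value)
{
    m->payload = value;
    store_barrier();    /* payload store is ordered before the flag store */
    atomic_store_explicit(&m->ready, true, memory_order_relaxed);
}

/* Reader: check the flag, then read the payload, in that order. */
static bool mailbox_consume(struct mailbox *m, int *out)
{
    if (!atomic_load_explicit(&m->ready, memory_order_relaxed))
        return false;
    load_barrier();     /* flag load is ordered before the payload load */
    *out = m->payload;
    return true;
}
```

The barriers have to be paired: dropping either one allows the reader to see the flag set but a stale payload, which is part of why they are harder to use than a lock.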

-- Suleiman


Re: asr / raidutil not available for 6.x either ... ? :(

2006-12-12 Thread Jim Pingle
Marc G. Fournier wrote:
 
 Installed it from ports, and it installed the compat4x stuff, but although it 
 runs with no args, as soon as I try to access the controller:
 
 
 # raidutil -d 1 -L physical
 osdIOrequest : File /dev/rdptr17 Could Not Be Opened
 Engine connect failed: COMPATILITY number
 
 
 so I take it that like the storcon stuff, one is out of luck running FreeBSD 
 6.x? :(

I have a bunch of these and using raidutil works great; however, you need to
rebuild your kernel with:

options ASR_COMPAT

That is in addition to having compat4x_enable=YES in rc.conf

Jim


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread Nikolay Pavlov
No luck at all.
patch-zonelim-drain-20061212 works for me the same as the previous one:
no panics, but still zoneli.
All this is very odd, because the other two squid servers work 
perfectly behind the same load balancer without any patches or 
kernel panics. I think that the case with this server 
is really rare.


Re: kqueue LOR

2006-12-12 Thread Attilio Rao

2006/12/12, Suleiman Souhlal [EMAIL PROTECTED]:

Attilio Rao wrote:
 2006/12/12, Kostik Belousov [EMAIL PROTECTED]:

 On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
  Kostik Belousov wrote:
  On Sun, Nov 26, 2006 at 09:30:39AM +0100, V??clav Haisman wrote:
  
  Hi,
  the attached lor.txt contains LOR I got this yesterday. It is
 FreeBSD 6.1
  with relatively recent kernel, from last week or so.
  
  --
  VH
  
  
  +lock order reversal:
  + 1st 0xc537f300 kqueue (kqueue) @
 /usr/src/sys/kern/kern_event.c:1547
  + 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @
  /usr/src/sys/ufs/ufs/ufs_vnops.c:138
  +KDB: stack backtrace:
  +kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at
  kdb_backtrace+0x2f
  +witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at
  witness_checkorder+0x5fe
  +_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at
  _mtx_lock_flags+0x32
  +ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at
  ufs_itimes+0x6c
  +ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at
  ufs_getattr+0x20
  +VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at
  VOP_GETATTR_APV+0x3a
  +filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
  +knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
  +VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at
  VOP_WRITE_APV+0x148
  +vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at
 vn_write+0x201
  +dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at
  dofilewrite+0x84
  +kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
  +write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
  +syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
  +Xint0x80_syscall() at Xint0x80_syscall+0x1f
  +--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp =
  0xbfbfea1c, ebp = 0xbfbfea48 ---
  
  
  Thank you for the report. The LOR is caused by my commit into
  sys/ufs/ufs/ufs_vnops.c, rev. 1.280.
 
  Is the mount lock really required, if all we're doing is a single
 read of a
  single word (mnt_kern_flags) (v_mount should be read-only for the whole
  lifetime of the vnode, I believe)? After all, reads of a single word
 are
  atomic on all our supported architectures.
  The only situation I see where there MIGHT be problems are forced
 unmounts,
  but I think there are bigger issues with those.
  Sorry for noticing this email only now.

 The problem is real with snapshotting. Ignoring
 MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
 mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
 inactivation time. This was the big trouble with nfsd and snapshots. As
 such, I think that the precise value of mnt_kern_flag is critical there,
 and mount interlock is needed.


 This can be avoided using a memory barrier when setting flags.
 Even if memory barrier usage is not encouraged, some critical code
 should really use them in place of mutex semantics (if it is worth it).

Why is memory barrier usage not encouraged? As you said, they can be used to 
reduce the number of atomic (LOCKed) operations, in some cases.


Because they can lead to errors: it is not so straightforward to
understand when a memory barrier is needed rather than an atomic
instruction, and so on
(even if this does not hold for ia32, for example, on other
architectures memory barriers could be more expensive than the atomic
instruction, without counting a possible error).


FWIW, Linux has rmb() (load mem barrier), wmb() (store mem barrier), mb() (load/store mem 
barrier), smp_rmb(), smp_wmb(), smp_mb() (mem barriers only needed on SMP), and barrier() 
(GCC barrier: __asm __volatile("" ::: "memory")) macros that I've personally found 
very useful.


I think that our memory barriers reflect the way we use them in the
kernel, as the base for building synchronizing primitives. From this
point of view our atomic operations (meant in the widest possible
sense, man 9 atomic) are more suitable than having something like
Linux's smp_*().
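
The contrast can be sketched in userland C: instead of a free-standing fence, the ordering is attached to the atomic operation itself, which is the style of the acquire/release variants described in atomic(9) (for example atomic_set_rel_int() and atomic_load_acq_int()). The sketch below uses C11 <stdatomic.h> as a stand-in, not the kernel API. The helper names and the FLAG_SUSPEND bit are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define FLAG_SUSPEND 0x1u   /* hypothetical flag bit, for illustration only */

/* Set bits with release semantics: earlier stores by this thread are
 * visible to any thread that later observes the bits with acquire. */
static void flags_set_release(atomic_uint *flags, unsigned bits)
{
    atomic_fetch_or_explicit(flags, bits, memory_order_release);
}

/* Test bits with acquire semantics: later loads by this thread are
 * ordered after the load of the flag word. */
static bool flags_test_acquire(atomic_uint *flags, unsigned bits)
{
    return (atomic_load_explicit(flags, memory_order_acquire) & bits) != 0;
}
```

Because the barrier travels with the operation on a specific variable, it is harder to place it on the wrong side of an access than with a separate smp_*() call.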

Attilio


--
Peace can only be achieved by understanding - A. Einstein


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread LI Xin
Nikolay Pavlov wrote:
 No luck at all.
 patch-zonelim-drain-20061212 works for me the same as the previous one:
 no panics, but still zoneli.
 All this is very odd, because the other two squid servers work 
 perfectly behind the same load balancer without any patches or 
 kernel panics. I think that the case with this server 
 is really rare.

Would you please provide vmstat -z output from when the server is stuck in the
zonelim livelock?  Thanks!

Cheers,
-- 
Xin LI [EMAIL PROTECTED]  http://www.delphij.net/
FreeBSD - The Power to Serve!





Re: kqueue LOR

2006-12-12 Thread John Baldwin
On Tuesday 12 December 2006 13:43, Suleiman Souhlal wrote:
 Attilio Rao wrote:
  2006/12/12, Kostik Belousov [EMAIL PROTECTED]:
  
  On Tue, Dec 12, 2006 at 12:44:54AM -0800, Suleiman Souhlal wrote:
   Kostik Belousov wrote:
   On Sun, Nov 26, 2006 at 09:30:39AM +0100, V??clav Haisman wrote:
   
   Hi,
   the attached lor.txt contains LOR I got this yesterday. It is 
  FreeBSD 6.1
   with relatively recent kernel, from last week or so.
   
   --
   VH
   
   
   +lock order reversal:
   + 1st 0xc537f300 kqueue (kqueue) @ 
  /usr/src/sys/kern/kern_event.c:1547
   + 2nd 0xc45c22dc struct mount mtx (struct mount mtx) @
   /usr/src/sys/ufs/ufs/ufs_vnops.c:138
   +KDB: stack backtrace:
   +kdb_backtrace(c07f9879,c45c22dc,c07fd31c,c07fd31c,c080c7b2,...) at
   kdb_backtrace+0x2f
   +witness_checkorder(c45c22dc,9,c080c7b2,8a,c07fc6bd,...) at
   witness_checkorder+0x5fe
   +_mtx_lock_flags(c45c22dc,0,c080c7b2,8a,e790ba20,...) at
   _mtx_lock_flags+0x32
   +ufs_itimes(c47a0dd0,c47a0e90,e790ba78,c060e1cc,c47a0dd0,...) at
   ufs_itimes+0x6c
   +ufs_getattr(e790ba54,e790baec,c0622af6,c0896f40,e790ba54,...) at
   ufs_getattr+0x20
   +VOP_GETATTR_APV(c0896f40,e790ba54,c08a5760,c47a0dd0,e790ba74,...) at
   VOP_GETATTR_APV+0x3a
   +filt_vfsread(c4cf261c,6,c07f445e,60b,0,...) at filt_vfsread+0x75
   +knote(c4f57114,6,1,1f30c2af,1f30c2af,...) at knote+0x75
   +VOP_WRITE_APV(c0896f40,e790bbec,c47a0dd0,227,e790bcb4,...) at
   VOP_WRITE_APV+0x148
   +vn_write(c45d5120,e790bcb4,c5802a00,0,c4b73a80,...) at 
  vn_write+0x201
   +dofilewrite(c4b73a80,1b,c45d5120,e790bcb4,,...) at
   dofilewrite+0x84
   +kern_writev(c4b73a80,1b,e790bcb4,8220c71,0,...) at kern_writev+0x65
   +write(c4b73a80,e790bd04,c,c07d899c,3,...) at write+0x4f
   +syscall(3b,3b,bfbf003b,0,bfbfeae4,...) at syscall+0x295
   +Xint0x80_syscall() at Xint0x80_syscall+0x1f
   +--- syscall (4, FreeBSD ELF32, write), eip = 0x2831d727, esp =
   0xbfbfea1c, ebp = 0xbfbfea48 ---
   
   
   Thank you for the report. The LOR is caused by my commit into
   sys/ufs/ufs/ufs_vnops.c, rev. 1.280.
  
   Is the mount lock really required, if all we're doing is a single 
  read of a
   single word (mnt_kern_flags) (v_mount should be read-only for the whole
   lifetime of the vnode, I believe)? After all, reads of a single word 
  are
   atomic on all our supported architectures.
   The only situation I see where there MIGHT be problems are forced 
  unmounts,
   but I think there are bigger issues with those.
   Sorry for noticing this email only now.
 
  The problem is real with snapshotting. Ignoring
  MNTK_SUSPEND/MNTK_SUSPENDED flags (in particular, reading stale value of
  mnt_kern_flag) while setting IN_MODIFIED caused deadlock at ufs vnode
  inactivation time. This was the big trouble with nfsd and snapshots. As
  such, I think that the precise value of mnt_kern_flag is critical there,
  and mount interlock is needed.
  
  
  This can be avoided using a memory barrier when setting flags.
  Even if memory barrier usage is not encouraged, some critical code
  should really use them in place of mutex semantics (if it is worth it).
 
 Why is memory barrier usage not encouraged? As you said, they can be used to 
reduce the number of atomic (LOCKed) operations, in some cases.
 
 FWIW, Linux has rmb() (load mem barrier), wmb() (store mem barrier), mb() 
(load/store mem barrier), smp_rmb(), smp_wmb(), smp_mb() (mem barriers only 
needed on SMP), and barrier() (GCC barrier: __asm __volatile("" ::: "memory")) 
macros that I've personally found very useful.
 Admittedly, they are harder to use than atomic operations, but it might 
still be worth having something similar.

Memory barriers just specify ordering; they don't ensure a cache flush so that 
another CPU reads up-to-date values.  You can use memory barriers in 
conjunction with atomic operations on a variable to ensure that you can 
safely read other variables (which is what locks do).  For example, in this 
case IIUC, you have a race because there is shared state between two 
fields, one in the mount structure, and one in the ufs i-node.  Memory 
barriers alone won't prevent you from operating on those flags 
inconsistently.  That is, you have two memory locations in play here, and 
atomic ops only work on a single one.  There isn't an atomic op to read 
from memory location A, check flag B, and if it's true write C to memory 
location D.  Where would you put the membar in this case to ensure that the 
action always results in consistent behavior?
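
The two-location problem can be sketched in userland C with a pthread mutex standing in for the mount interlock. This is a simplified illustration, not the ufs_itimes() code: the structure names, flag values and helper functions are all hypothetical. The point is only that the check of one location and the write to another happen as one unit under the lock.

```c
#include <pthread.h>
#include <stdbool.h>

#define MNT_SUSPENDED 0x1u  /* hypothetical stand-in for MNTK_SUSPENDED */
#define INO_MODIFIED  0x1u  /* hypothetical stand-in for IN_MODIFIED */

struct fake_mount {
    pthread_mutex_t lock;   /* plays the role of the mount interlock */
    unsigned kern_flag;
};

struct fake_inode {
    unsigned flag;
};

/* Mark the inode modified unless the mount is suspended.  The check and
 * the write touch two different memory locations, so both are done while
 * holding the lock; a suspend cannot slip in between them. */
static bool mark_modified(struct fake_mount *mp, struct fake_inode *ip)
{
    bool marked = false;

    pthread_mutex_lock(&mp->lock);
    if ((mp->kern_flag & MNT_SUSPENDED) == 0) {
        ip->flag |= INO_MODIFIED;
        marked = true;
    }
    pthread_mutex_unlock(&mp->lock);
    return marked;
}

/* Suspend the mount under the same lock. */
static void suspend_mount(struct fake_mount *mp)
{
    pthread_mutex_lock(&mp->lock);
    mp->kern_flag |= MNT_SUSPENDED;
    pthread_mutex_unlock(&mp->lock);
}
```

With only a barrier between the flag check and the inode write, suspend_mount() could run in the gap, leaving the inode marked modified on a suspended mount.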

-- 
John Baldwin


Re: deadlock in zoneli state on 6.2-PRERELEASE

2006-12-12 Thread Nikolay Pavlov
On Wednesday, 13 December 2006 at  3:02:40 +0800, LI Xin wrote:
 Nikolay Pavlov wrote:
  No luck at all.
  patch-zonelim-drain-20061212 works for me the same as the previous one:
  no panics, but still zoneli.
  All this is very odd, because the other two squid servers work 
  perfectly behind the same load balancer without any patches or 
  kernel panics. I think that the case with this server 
  is really rare.
 
 Would you please provide vmstat -z output from when the server is stuck in the
 zonelim livelock?  Thanks!

130947/775/131722 mbufs in use (current/cache/total)
130859/213/131072/131072 mbuf clusters in use (current/cache/total/max)
130859/213 mbuf+clusters out of packet secondary zone in use
(current/cache)
0/0/0/0 4k (page size) jumbo clusters in use (current/cache/total/max)
0/0/0/0 9k jumbo clusters in use (current/cache/total/max)
0/0/0/0 16k jumbo clusters in use (current/cache/total/max)
294454K/619K/295074K bytes allocated to network (current/cache/total)
0/493001/246499 requests for mbufs denied (mbufs/clusters/mbuf+clusters)
0/0/0 requests for jumbo clusters denied (4k/9k/16k)
0/4/6656 sfbufs in use (current/peak/max)
0 requests for sfbufs denied
0 requests for sfbufs delayed
0 requests for I/O initiated by sendfile
83 calls to protocol drain routines

ITEM                     SIZE     LIMIT      USED      FREE  REQUESTS

UMA Kegs:140,0,  84, 12,   84
UMA Zones:   120,0,  84,  6,   84
UMA Slabs:64,0, 941,  3, 2709
UMA RCntSlabs:   104,0,   65536, 28,65536
UMA Hash:128,0,   4, 26,6
16 Bucket:76,0,  27, 23,   35
32 Bucket:   140,0,  21,  7,   29
64 Bucket:   268,0,  30, 26,   67
128 Bucket:  524,0, 215,457,   827387
VM OBJECT:   132,0,   41860, 16,72475
MAP: 192,0,   7, 33,7
KMAP ENTRY:   68,57456, 121, 47,97518
MAP ENTRY:68,0, 702,362,94390
PV ENTRY: 24,  2228360,  124664,   2646,   984278
DP fakepg:72,0,   0,  0,0
mt_zone:  64,0, 182, 54,  182
16:   16,0,3754,306,   432518
32:   32,0,1794,353,   116429
64:   64,0,3205,   3580,   206314
128: 128,0,1616,   1384,   371069
256: 256,0, 368,427,22536
512: 512,0,1266, 30,   486094
1024:   1024,0,  48, 84,   434408
2048:   2048,0, 147, 61,43326
4096:   4096,0, 129, 22, 4942
Files:72,0,2672,   2734,   481720
MAC labels:   20,0,   59586,   1085,   622560
PROC:536,0,  71, 27, 1564
THREAD:  376,0,  98, 22,   98
KSEGRP:   88,0,  98, 62,   98
UPCALL:   44,0,   0,  0,0
VMSPACE: 296,0,  28, 24, 1518
mbuf_packet: 256,0,  131052, 20,  9510911
mbuf:256,0,  91,559, 12694062
mbuf_cluster:   2048,   131072,  131072,  0,   136594
mbuf_jumbo_pagesize: 4096,0,   0,  0,0
mbuf_jumbo_9k:  9216,0,   0,  0,0
mbuf_jumbo_16k: 16384,0,   0,  0,0
ACL UMA zone:388,0,   0,  0,0
g_bio:   132,0,   0,   1160,   522435
ata_request: 204,0,   0,  0,0
ata_composite:   196,0,   0,  0,0
VNODE:   348,0,   47312, 10,52436
VNODEPOLL:76,0,   0,  0,0
S VFS Cache:  68,0,   39770, 46,44847
L VFS Cache: 291,0,   0,  0,0
NAMEI:  1024,0,   0, 12,   205701
DIRHASH:1024,0,1595,229, 7023
NFSMOUNT:480,0,   1, 15,1
NFSNODE: 536,0,  16,  5,   16
PIPE:408,0,   6, 21,  664
KNOTE:68,0,   0,112,   76
socket:  356,   131076,3930,   1229,   186491
unpcb:   140,   131096,  12, 44,  170
ipq:  32, 4181,   0,  0,0
udpcb:   180,   131076,  12, 32,  188
inpcb:   180,   131076,3859,   1509,   186132
tcpcb:   464,   131072,3859,   1237,   186132
tcptw:48, 8268,   0,546,   140117
syncache:100,15366,   2,271,   165792
hostcache:76,15400,3717, 33, 6425
tcpreass: 20, 8281,   0,169, 1493
sackhole: 20,0,   4,334,   372356
ripcb
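
In the table above, mbuf_cluster shows 131072 used against a limit of 131072. A small C sketch for picking out such rows from output in this "name: size, limit, used, free, requests" shape follows. The struct and function names are hypothetical, and treating a LIMIT of 0 as "no limit configured" is an assumption.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* One row of the zone table: ITEM: SIZE, LIMIT, USED, FREE, REQUESTS. */
struct zone_row {
    char name[64];
    unsigned long size, limit, used, free_items, requests;
};

/* Parse one "name: size, limit, used, free, requests" line.
 * Returns false if the line does not have that shape. */
static bool parse_zone_row(const char *line, struct zone_row *row)
{
    const char *colon = strchr(line, ':');
    if (colon == NULL)
        return false;

    size_t len = (size_t)(colon - line);
    if (len == 0 || len >= sizeof(row->name))
        return false;
    memcpy(row->name, line, len);
    row->name[len] = '\0';

    /* Spaces in the format match any amount of whitespace, including none. */
    return sscanf(colon + 1, " %lu , %lu , %lu , %lu , %lu",
                  &row->size, &row->limit, &row->used,
                  &row->free_items, &row->requests) == 5;
}

/* Assumption: LIMIT == 0 means no limit is configured for the zone. */
static bool zone_at_limit(const struct zone_row *row)
{
    return row->limit != 0 && row->used >= row->limit;
}
```

Fed the rows above, zone_at_limit() is true for mbuf_cluster and false for zones such as socket or tcpcb, which are well under their limits.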

Re: panic: sleeping thread

2006-12-12 Thread Marc G. Fournier



- --On Monday, December 11, 2006 17:40:22 -0500 John Baldwin [EMAIL 
PROTECTED] 
wrote:

 Maybe use ssh -e none?  You don't need to break into ddb though, when it
 panics it will print out more useful info on its own.

Ah, like:

Sleeping thread (tid 101409, pid 78573) owns a non-sleepable lock
sched_switch() at sched_switch+0x11f
mi_switch() at mi_switch+0x14c
sleepq_wait() at sleepq_wait+0x5b
cv_wait() at cv_wait+0xed
_sx_xlock() at _sx_xlock+0x51
vm_map_lookup() at vm_map_lookup+0x3c
vm_fault() at vm_fault+0xba
trap_pfault() at trap_pfault+0x127
trap() at trap+0x1bd
calltrap() at calltrap+0x5
- --- trap 0xc, rip = 0x801f8c91, rsp = 0xb908a930, rbp =
0xff
ffb908a970 ---
_mtx_trylock() at _mtx_trylock+0x1
unlock_and_deallocate() at unlock_and_deallocate+0x10e
vm_fault() at vm_fault+0x1ca0
trap_pfault() at trap_pfault+0x127
trap() at trap+0x3e6
calltrap() at calltrap+0x5
- --- trap 0xc, rip = 0x8028d9bf7, rsp = 0x7fffe900, rbp = 0x7fffe900 
---
panic: sleeping thread
cpuid = 1

- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Andrew Pantyukhin

On 12/11/06, Greg Eden [EMAIL PROTECTED] wrote:

I recently updated two production servers from 5.3 to 6.1 via cvsup
and buildworld. Since the upgrade I've seen an increase in the number
of Input packet errors reported on the bge cards in both boxes.
One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
2.8GHz Xeons with a SMP kernel.


Just to be sure, is polling disabled?


Re: panic: sleeping thread

2006-12-12 Thread John Baldwin
On Tuesday 12 December 2006 15:47, Marc G. Fournier wrote:
 
 --On Monday, December 11, 2006 17:40:22 -0500 John Baldwin [EMAIL PROTECTED]
 wrote:
 
  Maybe use ssh -e none?  You don't need to break into ddb though, when it
  panics it will print out more useful info on its own.
 
 Ah, like:
 
 Sleeping thread (tid 101409, pid 78573) owns a non-sleepable lock
 sched_switch() at sched_switch+0x11f
 mi_switch() at mi_switch+0x14c
 sleepq_wait() at sleepq_wait+0x5b
 cv_wait() at cv_wait+0xed
 _sx_xlock() at _sx_xlock+0x51
 vm_map_lookup() at vm_map_lookup+0x3c
 vm_fault() at vm_fault+0xba
 trap_pfault() at trap_pfault+0x127
 trap() at trap+0x1bd
 calltrap() at calltrap+0x5
 --- trap 0xc, rip = 0x801f8c91, rsp = 0xb908a930, rbp =
 0xff
 ffb908a970 ---
 _mtx_trylock() at _mtx_trylock+0x1
 unlock_and_deallocate() at unlock_and_deallocate+0x10e
 vm_fault() at vm_fault+0x1ca0
 trap_pfault() at trap_pfault+0x127
 trap() at trap+0x3e6
 calltrap() at calltrap+0x5
 --- trap 0xc, rip = 0x8028d9bf7, rsp = 0x7fffe900, rbp = 
0x7fffe900 ---
 panic: sleeping thread
 cpuid = 1

Yeah.  The LOR is bogus though, it's a secondary effect.  The real problem is 
the fault in _mtx_trylock(), and that's probably due to a bug in the previous 
frame in unlock_and_deallocate().  If you can get a core dump that would be 
most helpful.

-- 
John Baldwin


Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Jeremy Chadwick
On Tue, Dec 12, 2006 at 03:07:28PM +, Greg Eden wrote:
 I recently updated two production servers from 5.3 to 6.1 via  
 cvsup and
 buildworld.

Greg,

This may or may not be any help (read: possible red herring).
But from looking at your below dmesg, I don't see any signs
of SMP being used:

  Since the upgrade I've seen an increase in the number of
 Input packet errors reported on the bge cards in on both boxes.  
 One is a
 HP DL360g3, the other is a HP DL380g3. Both have a pair of 2.8GHz  
 Xeons with a SMP kernel.
  ^

 CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
   Origin = GenuineIntel  Id = 0xf41  Stepping = 1

 Features=0xbfebfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE 
 ,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE
   Features2=0x641dSSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,b14
   AMD Features=0x2010NX,LM
   Logical CPUs per core: 2
 real memory  = 1073688576 (1023 MB)
 avail memory = 1045893120 (997 MB)
--- SMP details are missing from here ---
 ioapic1: Changing APIC ID to 9
 ioapic0 Version 2.0 irqs 0-23 on motherboard
 ioapic1 Version 2.0 irqs 24-47 on motherboard
 ioapic2 Version 2.0 irqs 48-71 on motherboard
 ioapic3 Version 2.0 irqs 72-95 on motherboard
 kbd1 at kbdmux0
 ...

Normally, SMP kernels display something like this:

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20f32  Stepping = 2
  
Features=0x178bfbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT
  Features2=0x1SSE3
  AMD Features=0xe2500800SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow
  AMD Features2=0x3LAHF,CMP
  Cores per package: 2
real memory  = 2147418112 (2047 MB)
avail memory = 2096336896 (1999 MB)
ACPI APIC Table: Nvidia AWRDACPI
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 2
ioapic0 Version 1.1 irqs 0-23 on motherboard

Or, for comparison, a 4.11 box:

CPU: Intel Pentium III (933.03-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  
Features=0x383fbffFPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE
real memory  = 536805376 (524224K bytes)
avail memory = 518811648 (506652K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
Preloaded elf kernel kernel at 0xc0358000.

Additionally, your Email says two 2.8GHz Xeons, but it looks as
if you have one physical 3.0GHz Xeon that has dual cores.

-- 
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networkinghttp://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP: 4BD6C0CB |



Re: panic: sleeping thread

2006-12-12 Thread Marc G. Fournier



- --On Tuesday, December 12, 2006 16:12:50 -0500 John Baldwin [EMAIL 
PROTECTED] 
wrote:

 On Tuesday 12 December 2006 15:47, Marc G. Fournier wrote:

 --On Monday, December 11, 2006 17:40:22 -0500 John Baldwin [EMAIL 
 PROTECTED]
 wrote:

  Maybe use ssh -e none?  You don't need to break into ddb though, when it
  panics it will print out more useful info on its own.

 Ah, like:

 Sleeping thread (tid 101409, pid 78573) owns a non-sleepable lock
 sched_switch() at sched_switch+0x11f
 mi_switch() at mi_switch+0x14c
 sleepq_wait() at sleepq_wait+0x5b
 cv_wait() at cv_wait+0xed
 _sx_xlock() at _sx_xlock+0x51
 vm_map_lookup() at vm_map_lookup+0x3c
 vm_fault() at vm_fault+0xba
 trap_pfault() at trap_pfault+0x127
 trap() at trap+0x1bd
 calltrap() at calltrap+0x5
 --- trap 0xc, rip = 0x801f8c91, rsp = 0xb908a930, rbp =
 0xff
 ffb908a970 ---
 _mtx_trylock() at _mtx_trylock+0x1
 unlock_and_deallocate() at unlock_and_deallocate+0x10e
 vm_fault() at vm_fault+0x1ca0
 trap_pfault() at trap_pfault+0x127
 trap() at trap+0x3e6
 calltrap() at calltrap+0x5
 --- trap 0xc, rip = 0x8028d9bf7, rsp = 0x7fffe900, rbp =
 0x7fffe900 ---
 panic: sleeping thread
 cpuid = 1

 Yeah.  The LOR is bogus though, it's a secondary effect.  The real problem is
 the fault in _mtx_trylock(), and that's probably due to a bug in the previous
 frame in unlock_and_deallocate().  If you can get a core dump that would be
 most helpful.

That would take being able to get into DDB from an SSH session ;)

I'll try the -e none the next time it crashes, unless someone else has another
idea for doing it?  Actually, I'll try the -e none after it's up again, instead
of waiting ...


- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664


BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 17:40:48 -0400 Marc G. Fournier [EMAIL PROTECTED] wrote:

 I'll try the -e none the next time it crashes, unless someone else has
 another  idea for doing it?  Actually, I'll try the -e none after its up
 again, instead  of waiting ...

Nope, now it just echoes back the ~B as if I was typing regular characters ... I
hate to do it, but am turning off ssh and going to telnet until I can get a
core dump ...



Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


On 12 Dec 2006, at 20:48, Andrew Pantyukhin wrote:


On 12/11/06, Greg Eden [EMAIL PROTECTED] wrote:

I recently updated two production servers from 5.3 to 6.1 via cvsup
and buildworld. Since the upgrade I've seen an increase in the number
of Input packet errors reported on the bge cards in both boxes.
One is a HP DL360g3, the other is a HP DL380g3. Both have a pair of
2.8GHz Xeons with a SMP kernel.


Just to be sure, is polling disabled?


yes. I don't use it on any of the five machines producing the problem.

best.
greg.


Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 17:49:33 -0400 Marc G. Fournier [EMAIL PROTECTED] wrote:

 --On Tuesday, December 12, 2006 17:40:48 -0400 Marc G. Fournier
 [EMAIL PROTECTED] wrote:

 I'll try the -e none the next time it crashes, unless someone else has
 another  idea for doing it?  Actually, I'll try the -e none after its up
 again, instead  of waiting ...

 Nope, now it just echos back the ~B as if I was typing regular characters ...
 i  hate to do it, but am turning off ssh and going to telnet until I can get
 a  core dump ...

'k, I must have screwed something up altogether then ... I can't break into DDB 
using 'send break' through a telnet session ...

I have:

   options KDB
   options DDB
   options KDB_UNATTENDED
   options BREAK_TO_DEBUGGER

In my kernel ... is there something else I'm missing?  It's the same settings I
have on my other box that I *can* do this with ...

The only difference is that this is an HP Proliant box, and I'm using REMCONS 
through iLO to connect to the console ...



Re: bge Ierr rate increase from 5.3R - 6.1R

2006-12-12 Thread Greg Eden


On 12 Dec 2006, at 21:39, Jeremy Chadwick wrote:


On Tue, Dec 12, 2006 at 03:07:28PM +, Greg Eden wrote:

I recently updated two production servers from 5.3 to 6.1 via
cvsup and
buildworld.


Greg,

This may or may not be any help (read: possible red herring).
But from looking at your below dmesg, I don't see any signs
of SMP being used:


good - it's not an SMP box :)

sorry for any confusion. the box I *was* able to upgrade to
RELENG_6_2, and reported in the last email with the dmesg output, is
not SMP. all five HP DL3xx boxes (two of which *are* SMP) show
exactly the same behaviour irrespective of being SMP or UP. none of
them have polling enabled.


best.
greg.


Since the upgrade I've seen an increase in the number of
Input packet errors reported on the bge cards in both boxes.
One is a
HP DL360g3, the other is a HP DL380g3. Both have a pair of 2.8GHz
Xeons with a SMP kernel.

  ^


CPU: Intel(R) Xeon(TM) CPU 3.00GHz (3000.12-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0xf41  Stepping = 1

  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
  AMD Features=0x2010<NX,LM>
  Logical CPUs per core: 2
real memory  = 1073688576 (1023 MB)
avail memory = 1045893120 (997 MB)

--- SMP details are missing from here ---

ioapic1: Changing APIC ID to 9
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
ioapic3 Version 2.0 irqs 72-95 on motherboard
kbd1 at kbdmux0
...


Normally, SMP kernels display something like this:

CPU: AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ (2010.31-MHz 686-class CPU)
  Origin = AuthenticAMD  Id = 0x20f32  Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
  AMD Features2=0x3<LAHF,CMP>
  Cores per package: 2
real memory  = 2147418112 (2047 MB)
avail memory = 2096336896 (1999 MB)
ACPI APIC Table: Nvidia AWRDACPI
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  1
ioapic0: Changing APIC ID to 2
ioapic0 Version 1.1 irqs 0-23 on motherboard

Or, for comparison, a 4.11 box:

CPU: Intel Pentium III (933.03-MHz 686-class CPU)
  Origin = GenuineIntel  Id = 0x68a  Stepping = 10
  Features=0x383fbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>

real memory  = 536805376 (524224K bytes)
avail memory = 518811648 (506652K bytes)
Programming 24 pins in IOAPIC #0
IOAPIC #0 intpin 2 - irq 0
FreeBSD/SMP: Multiprocessor motherboard: 2 CPUs
 cpu0 (BSP): apic id:  0, version: 0x00040011, at 0xfee0
 cpu1 (AP):  apic id:  1, version: 0x00040011, at 0xfee0
 io0 (APIC): apic id:  2, version: 0x00178011, at 0xfec0
Preloaded elf kernel kernel at 0xc0358000.

Additionally, your Email says two 2.8GHz Xeons, but it looks as
if you have one physical 3.0GHz Xeon that has dual cores.

--
| Jeremy Chadwick jdc at parodius.com |
| Parodius Networkinghttp://www.parodius.com/ |
| UNIX Systems Administrator   Mountain View, CA, USA |
| Making life hard for others since 1977.   PGP: 4BD6C0CB |






Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Max Laier
On Tuesday 12 December 2006 22:59, Marc G. Fournier wrote:
 --On Tuesday, December 12, 2006 17:49:33 -0400 Marc G. Fournier

 [EMAIL PROTECTED] wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
 
 
  - --On Tuesday, December 12, 2006 17:40:48 -0400 Marc G. Fournier
 
  [EMAIL PROTECTED] wrote:
  I'll try the -e none the next time it crashes, unless someone else
  has another  idea for doing it?  Actually, I'll try the -e none
  after its up again, instead  of waiting ...
 
  Nope, now it just echos back the ~B as if I was typing regular
  characters ... i  hate to do it, but am turning off ssh and going to
  telnet until I can get a  core dump ...

 'k, I must have screwed something up altogether then ... I can't break
 into DDB using 'send break' through a telnet session ...

 I have:

    options KDB
    options DDB
    options KDB_UNATTENDED
    options BREAK_TO_DEBUGGER

 In my kernel ... is there something else I'm missing?  Its the same
 settings I have on my other box that I *can* do this with ...

 The only difference is that this is an HP Proliant box, and I'm using
 REMCONS through iLO to connect to the console ...

sysctl debug.kdb.enter=1

 

-- 
/\  Best regards,  | [EMAIL PROTECTED]
\ /  Max Laier  | ICQ #67774661
 X   http://pf4freebsd.love2party.net/  | [EMAIL PROTECTED]
/ \  ASCII Ribbon Campaign  | Against HTML Mail and News




Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread John Baldwin
On Tuesday 12 December 2006 16:59, Marc G. Fournier wrote:
 
 --On Tuesday, December 12, 2006 17:49:33 -0400 Marc G. Fournier
 [EMAIL PROTECTED] wrote:
 
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
 
 
  - --On Tuesday, December 12, 2006 17:40:48 -0400 Marc G. Fournier
  [EMAIL PROTECTED] wrote:
 
  I'll try the -e none the next time it crashes, unless someone else has
  another  idea for doing it?  Actually, I'll try the -e none after its up
  again, instead  of waiting ...
 
  Nope, now it just echos back the ~B as if I was typing regular characters ...
  i  hate to do it, but am turning off ssh and going to telnet until I can get
  a  core dump ...
 
 'k, I must have screwed something up altogether then ... I can't break into DDB
 using 'send break' through a telnet session ...
 
 I have:
 
    options KDB
    options DDB
    options KDB_UNATTENDED
    options BREAK_TO_DEBUGGER
 
 In my kernel ... is there something else I'm missing?  Its the same settings I
 have on my other box that I *can* do this with ...
 
 The only difference is that this is an HP Proliant box, and I'm using REMCONS
 through iLO to connect to the console ...

KDB_UNATTENDED should make it do a coredump and not bother with dropping into 
ddb when it panics.

-- 
John Baldwin


Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 23:40:36 +0100 Max Laier [EMAIL PROTECTED] wrote:

 On Tuesday 12 December 2006 22:59, Marc G. Fournier wrote:
 --On Tuesday, December 12, 2006 17:49:33 -0400 Marc G. Fournier

 [EMAIL PROTECTED] wrote:
  -BEGIN PGP SIGNED MESSAGE-
  Hash: SHA1
 
 
 
  - --On Tuesday, December 12, 2006 17:40:48 -0400 Marc G. Fournier
 
  [EMAIL PROTECTED] wrote:
  I'll try the -e none the next time it crashes, unless someone else
  has another  idea for doing it?  Actually, I'll try the -e none
  after its up again, instead  of waiting ...
 
  Nope, now it just echos back the ~B as if I was typing regular
  characters ... i  hate to do it, but am turning off ssh and going to
  telnet until I can get a  core dump ...

 'k, I must have screwed something up altogether then ... I can't break
 into DDB using 'send break' through a telnet session ...

 I have:

    options KDB
    options DDB
    options KDB_UNATTENDED
    options BREAK_TO_DEBUGGER

 In my kernel ... is there something else I'm missing?  Its the same
 settings I have on my other box that I *can* do this with ...

 The only difference is that this is an HP Proliant box, and I'm using
 REMCONS through iLO to connect to the console ...

 sysctl debug.kdb.enter=1

You mentioned this once already, but don't I need to be able to type at a
command prompt?  In my case, the server has crashed, but not rebooted or dropped
into DDB ... I need to somehow send a 'break' to drop it into DDB ...




Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 17:40:17 -0500 John Baldwin [EMAIL PROTECTED] wrote:

 KDB_UNATTENDED should make it do a coredump and not bother with dropping into
 ddb when it panics.

It's not ... it just prints everything out and then just hangs there ...




Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 17:40:17 -0500 John Baldwin [EMAIL PROTECTED] wrote:

 KDB_UNATTENDED should make it do a coredump and not bother with dropping into
 ddb when it panics.

Would setting this to 1 help me any:

# sysctl -d debug.debugger_on_panic
debug.debugger_on_panic: Run debugger on kernel panic



Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 17:40:17 -0500 John Baldwin [EMAIL PROTECTED] wrote:

 KDB_UNATTENDED should make it do a coredump and not bother with dropping into
 ddb when it panics.

'k, I'm updating my kernel/world to today's, removed KDB_UNATTENDED and changed
BREAK_TO... to ALT_BREAK_TO... to see if its escape sequence will serve me
better ...



how to disable exec attributes on ntfs mounts ?

2006-12-12 Thread martinko
Hello,

I wonder why there is no -M option for mount_ntfs(8) as it is there for
mount_msdosfs(8) ??

I am trying to get rid of executable attributes for files but with -m
option only it is not possible as it changes directories too.

I'm using the following options for msdosfs:
msdosfs rw,noatime,noauto,-m640,-M750,-Lsk_SK.ISO8859-2

Any ideas about how to achieve the same for ntfs please ??

Cheers,

Martin



Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Tuesday, December 12, 2006 20:43:07 -0400 Marc G. Fournier [EMAIL PROTECTED] wrote:

 'k, I'm updating my kernel/world to todays, removed KDB_UNATTENDED and
 changed  BREAK_TO... to ALT_BREAK_TO... to see if its escape sequence will
 serve me  better ...

Nope, no different ... I suspect it's because I'm going in through iLO, and it's
somehow trapping what I'm sending :(

Oh well, unless jmc is out there and has a better idea, I guess I'll just have
to hope that it drops into it on its own next time :(  I'm out of ideas here
...



Re: kqueue LOR

2006-12-12 Thread Bruce Evans

On Tue, 12 Dec 2006, John Baldwin wrote:


On Tuesday 12 December 2006 13:43, Suleiman Souhlal wrote:



Why is memory barrier usage not encouraged? As you said, they can be used to
reduce the number of atomic (LOCKed) operations, in some cases.
...
Admittedly, they are harder to use than atomic operations, but it might
still be worth having something similar.


How would MI code know when using memory barriers is good?  This is already
hard to know for atomic ops -- if there would be more than a couple of atomic
ops then it is probably better to use 1 mutex lock/unlock and no atomic
ops, since this reduces the total number of atomic ops in most cases, but
it is hard for MI code to know how many a couple is.  (This also depends
on the SMP option -- without SMP, locking is automatic so atomic ops are
very fast but mutexes are still slow since they do a lot more than an
atomic op.)


Memory barriers just specify ordering, they don't ensure a cache flush so
another CPU reads up to date values.  You can use memory barriers in
conjunction with atomic operations on a variable to ensure that you can
safely read other variables (which is what locks do).  For example, in this


I thought that the acquire/release variants of atomic ops guarantee
this.  They seem to be documented to do this, while mutexes don't seem
to be documented to do this.  The MI (?) implementation of mutexes
depends on atomic_cmpset_{acq,rel}_ptr() doing this.

Bruce


Re: bus_dmamem_alloc in drm / radeon

2006-12-12 Thread Kai Lockwood
I am having nearly the same problem. I startx and the machine either
hangs or kernel panics. I tried to look at the crash dumps but I am no
expert with kgdb.


Brian A. Seklecki wrote:


A highly repeatable situation:

FreeBSD soundwave 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Wed Sep 13 
14:51:18 EDT 2006 [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP  i386



drm0: ATI Radeon RV280 9250 port 0xdd00-0xddff mem 
0xf000-0xf7ff,0xfe9e-0xfe9e irq 22 at device 1.0 on pci2

info: [drm] Initialized radeon 1.24.0 20060225
info: [drm] Setting GART location based on old memory map

* bus_dmamem_alloc failed to align memory properly.
info: [drm] Loading R200 Microcode
info: [drm] writeback test succeeded in 1 usecs

The Xorg process goes solid in ioctl on /dev/dri/card0 presumably

  768 seklecki  1 1280   149M 10516K RUN  4:56 100.15% Xorg

   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)
   768 Xorg RET   ioctl -1 errno 16 Device busy
   768 Xorg CALL  ioctl(0x8,0x20006444 ,0)

I can provide full dmesg(8) if needed.


l8*
-lava (Brian A. Seklecki - Pittsburgh, PA, USA)
   http://www.spiritual-machines.org/

...from back in the heady days when helpdesk meant nothing, 
diskquota
meant everything, and lives could be bought and sold for a couple of 
pages

of laser printout - and frequently were.


Re: kqueue LOR

2006-12-12 Thread Tor Egge
 Hmm, may be, since vnode must be interlocked by ffs_sync() after
 MNTK_SUSPENDED set, and before MNTK_SUSPEND, mount interlock is not
 needed in ufs_itimes.
 
 Tor ?

If neither IN_CHANGE nor IN_UPDATE is set then it might be unsafe to set
IN_MODIFIED in ufs_itimes() since the file system might be suspended or in the
process of being suspended with the vnode sync loop in ffs_sync() having
iterated past the vnode.

I don't think the mount interlock is needed to check for MNTK_SUSPEND being set
in ufs_itimes() as long as the vnode interlock is held.  If a stale value is
read without MNTK_SUSPEND set then the vnode sync loop in ffs_sync() cannot
have iterated past the vnode, thus it should still be safe to set IN_MODIFIED.
All writes by the CPU performing the vnode sync loop before it released the
vnode interlock for the same vnode should be visible to the CPU in ufs_itimes()
after it has obtained the vnode interlock.

- Tor Egge


pf killing NFS

2006-12-12 Thread Charles Sprickman

Hi all,

I'm running a 6.2-RC1 box (cvsup'd today) that has two broadcom nics.  One 
is an internal network (nfs) and the other is external.


PF has this rule for all traffic on the private net:

[EMAIL PROTECTED] /home/jails]# pfctl -sr|grep bge1
pass in quick on bge1 inet from 192.168.1.0/24 to any
pass out quick on bge1 inet from any to 192.168.1.0/24

No state since these are quick and symmetrical.

Doing something like ls /usr/ports will just hang until interrupted. 
Using tcp for nfs makes it workable, but very slow.


If I disable pf (pfctl -d), both types of mounts work, and speed is 
excellent.  I also just found that if I remove the scrub in all 
statement and change it to scrub in on bge0, things are fine.


Any idea what's going on?  The tcpdump output confuses me (see bad 
cksum!), so I'm posting some snippets here.


Looking at tcpdump, things look a bit odd. 192.168.1.111 is the nfs 
client (6.2-RC1), 192.168.1.100 is the nfs server (4.11):


[EMAIL PROTECTED] /home/spork]# tcpdump -i bge1 -v
tcpdump: listening on bge1, link-type EN10MB (Ethernet), capture size 96 
bytes


00:59:16.269659 IP (tos 0x0, ttl  64, id 5395, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e132)!)
192.168.1.111.1861387036 > 192.168.1.100.nfs: 104 access [|nfs]

bad checksum before even hitting the wire??

00:59:16.269920 IP (tos 0x0, ttl  64, id 46705, offset 0, flags [none],
proto: UDP (17), length: 148) 192.168.1.100.nfs >
192.168.1.111.1861387036: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]


We get a reply (dir is mode 755)

00:59:16.270010 IP (tos 0x0, ttl  64, id 5396, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e131)!)
192.168.1.111.1861387037 > 192.168.1.100.nfs: 104 access [|nfs]


Again, bad checksum FROM nfs client to server...

00:59:16.270211 IP (tos 0x0, ttl  64, id 58236, offset 0, flags [none],
proto: UDP (17), length: 148) 192.168.1.100.nfs >
192.168.1.111.1861387037: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
00:59:16.270306 IP (tos 0x0, ttl  64, id 5397, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e130)!)
192.168.1.111.1861387038 > 192.168.1.100.nfs: 104 access [|nfs]


Now to confuse things further, if I disable pf (pfctl -d), speeds are 
great, but I still get these bad checksum errors:


01:04:21.498293 IP (tos 0x0, ttl  64, id 5482, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e0db)!)
192.168.1.111.1861387048 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.498607 IP (tos 0x0, ttl  64, id 16228, offset 0, flags [none],
proto: UDP (17), length: 148) 192.168.1.100.nfs >
192.168.1.111.1861387048: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
01:04:21.498675 IP (tos 0x0, ttl  64, id 5483, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e0da)!)
192.168.1.111.1861387049 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.498900 IP (tos 0x0, ttl  64, id 13349, offset 0, flags [none],
proto: UDP (17), length: 148) 192.168.1.100.nfs >
192.168.1.111.1861387049: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
01:04:21.498924 IP (tos 0x0, ttl  64, id 5484, offset 0, flags [none],
proto: UDP (17), length: 132, bad cksum 0 (->e0d9)!)
192.168.1.111.1861387050 > 192.168.1.100.nfs: 104 access [|nfs]
01:04:21.499195 IP (tos 0x0, ttl  64, id 34907, offset 0, flags [none],
proto: UDP (17), length: 148) 192.168.1.100.nfs >
192.168.1.111.1861387050: reply ok 120 access attr: DIR 755 ids 0/0 [|nfs]
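
For reference, the hex value tcpdump prints in parentheses after "bad cksum 0" is the IPv4 header checksum it expected to find. The sketch below (Python, standard library only) rebuilds the 20-byte header from the fields tcpdump printed and recomputes the RFC 1071 checksum. It assumes a plain header with no IP options and no flags set, and the helper names are made up for illustration. A checksum field of 0 on locally generated packets is commonly seen when the NIC fills in the checksum after the capture point (checksum offload), but that is only a possible explanation here, not something this capture proves.

```python
import socket
import struct


def ipv4_header_checksum(header: bytes) -> int:
    """One's-complement sum of 16-bit words (RFC 1071), complemented."""
    if len(header) % 2:
        header += b"\x00"  # pad odd-length input; a real IPv4 header is always even
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def build_ipv4_header(ident, length, ttl, proto, src, dst, cksum=0):
    """Hypothetical helper: 20-byte header, IHL 5, tos 0, no flags, offset 0."""
    return struct.pack(
        "!BBHHHBBH4s4s",
        0x45, 0, length, ident, 0, ttl, proto, cksum,
        socket.inet_aton(src), socket.inet_aton(dst),
    )


# Fields from the first request in the capture: id 5395, length 132, ttl 64, UDP (17).
hdr = build_ipv4_header(5395, 132, 64, 17, "192.168.1.111", "192.168.1.100")
print(hex(ipv4_header_checksum(hdr)))  # 0xe132
```

The result matches the value tcpdump reports as expected, which suggests the header fields themselves are intact and only the checksum field was still empty when the packet was captured.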




g_vfs_done() failures on 6.2-RC1

2006-12-12 Thread Jan Mikkelsen

Hi,

I have seen g_vfs_done() failures with absurd offsets in the face of heavy 
I/O.  Recovery doesn't seem to happen, leading to the need to reboot the 
system.  The problem seems to occur without any underlying disk device 
failure.


An example from yesterday:

This message repeats on the order of 10s of thousands of times, with no 
earlier message:


g_vfs_done():da1s1d[READ(offset=5036583429229836288, length=16384)]error = 5

Bsdlabel correctly reports that /dev/da1s1d has 1748318312 512-byte blocks, 
so the offset is clearly wrong.  The filesystem was using softupdates and 
was doing a few rm -rfs on two cvs repositories.  After this error the rm 
didn't die from SIGTERM or SIGKILL.  (Unfortunately, I didn't check wchan 
for the rm process.  Sorry.)
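
As a quick sanity check on that claim, the arithmetic can be written out. This is only a sketch in Python; the block count and the offset are copied from this message, and the 512-byte block size is the one bsdlabel reports. The function name is made up for illustration.

```python
SECTOR_SIZE = 512                       # bytes per block, as reported by bsdlabel
PARTITION_BLOCKS = 1748318312           # size of /dev/da1s1d in blocks
REPORTED_OFFSET = 5036583429229836288   # offset from the g_vfs_done() message

partition_bytes = PARTITION_BLOCKS * SECTOR_SIZE


def offset_in_range(offset: int, length: int = 16384) -> bool:
    """True if a read of `length` bytes at `offset` fits inside the partition."""
    return 0 <= offset and offset + length <= partition_bytes


print(partition_bytes)                      # 895138975744 (about 834 GiB)
print(offset_in_range(REPORTED_OFFSET))     # False
print(REPORTED_OFFSET // partition_bytes)   # how many partition sizes past the start
```

The requested offset lies millions of partition sizes beyond the end of the device, so the read could never have been valid regardless of the state of the underlying disks.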


The shutdown took three hours.  I didn't have console access, so I don't 
know the console messages at the time.  The machine did respond to pings 
during at least the first hour.  After it came back up, the filesystems were 
all reported as clean.  Attempting to finish off the rm produced this 
result:


bad block 8819084429375818952, ino 92865791
pid 49 (softdepflush), uid 0 inumber 92865791 on /work: bad block
bad block -8123569960048088809, ino 92865791
pid 49 (softdepflush), uid 0 inumber 92865791 on /work: bad block
handle_workitem_freeblocks: block count
g_vfs_done():da1s1d[READ(offset=1154660658434844672, length=16384)]error = 5
bad block -9114721846648257515, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
g_vfs_done():da1s1d[READ(offset=8698001308483434496, length=16384)]error = 5
bad block -8102232258315484873, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
g_vfs_done():da1s1d[READ(offset=4586979512427630592, length=16384)]error = 5
bad block -3438510379221006390, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
g_vfs_done():da1s1d[READ(offset=196654394503331840, length=16384)]error = 5
g_vfs_done():da1s1d[READ(offset=26142581273591808, length=16384)]error = 5
bad block 504981533259792482, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
bad block 1538054898336656903, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
g_vfs_done():da1s1d[READ(offset=249387551018614784, length=16384)]error = 5
bad block 18582847101533720, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
g_vfs_done():da1s1d[READ(offset=259247319150690304, length=16384)]error = 5
bad block -3429473246997783577, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
bad block -3335830404336954747, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
bad block -1007814018434232494, ino 92865789
pid 49 (softdepflush), uid 0 inumber 92865789 on /work: bad block
handle_workitem_freeblocks: block count

A reboot to single user mode and an fsck cleaned things up.

In this case it is a machine running 6.2-RC1/amd64 with patches on a 
SuperMicro motherboard, 2 x Xeon 5140 CPUs, 4GB ECC memory and an Areca SATA 
raid controller.  The raid array is RAID-6, with write-through controller 
cache and drive write cache disabled.  The controller reported no I/O 
errors, no volumes are degraded.  I have also seen very similar problems on 
a dual-Opteron machine with ataraid (in that case, 6.1-RELEASE), again 
undegraded and no device I/O errors reported.


The patches:

- Daichi Goto's unionfs-p16 has been applied.
- The Areca driver is 1.20.00.12 from the Areca website.
- sym(4) patch (see PR/89550), but no sym controller present.
- SMP + FAST_IPSEC + SUIDDIR + device crypto.

So:  I've seen this problem on a few machines under heavy I/O load, with 
ataraid and with arcmsr.  I've seen others report similar problems, but I've 
seen no resolution.  Does anyone have any idea what the problem is?  Has 
anyone else seen similar problems?  Where to from here?


Thanks,

Jan Mikkelsen
[EMAIL PROTECTED]



Re: 6.2-RC1 em(4) issue - freezes during CD boot

2006-12-12 Thread Chris Buechler

Jack Vogel wrote:
UH, can't do anything if you don't give any information, saying its a 'box'
and it has an 'em' is useless. We have installed RC1 on a number of
systems without problem.


Yes, I understand that.  I was after what info would be useful.  You 
gave me a few ideas, thanks. 

As for ACPI, the boot menu option recently changed to enable ACPI, so 
I'm assuming it must be disabled by default now.  I tried with ACPI on 
6.2-RC1 and got the same result. 

I was able to narrow this down to something that has changed since 
6.2-BETA3.  The BETA3 CD boots, installs, and works fine.  The RC1 CD 
won't boot, as I described in my first message. 

With the em card taken out, the 6.2-RC1 CD boots and installs just fine, 
so it's definitely the em card. 


Hardware:
Dual AMD Athlon MP 2800+, Tyan Tiger MPX motherboard, 2 GB reg ECC RAM, 
Adaptec 2110S PCI-X SCSI RAID, onboard xl(4) NIC, Intel PRO/1000 MT PCI 
desktop adapter (ugh, I know it's not well suited for the box, but this 
is a personal server and a cheap card that's adequate for my needs).  
Drives:  3 x 18 GB SCSI in RAID 5, 1 x 73 GB SCSI on the RAID controller but 
as a standalone drive, and a 500 GB IDE drive on the onboard IDE controller. 


Cheers,
-Chris


Re: pf killing NFS

2006-12-12 Thread Luke Dean



On Wed, 13 Dec 2006, Charles Sprickman wrote:


Hi all,

I'm running a 6.2-RC1 box (cvsup'd today) that has two Broadcom NICs.  One is 
on an internal network (NFS) and the other is external.


PF has this rule for all traffic on the private net:

[EMAIL PROTECTED] /home/jails]# pfctl -sr|grep bge1
pass in quick on bge1 inet from 192.168.1.0/24 to any
pass out quick on bge1 inet from any to 192.168.1.0/24

No state since these are quick and symmetrical.

Doing something like ls /usr/ports will just hang until interrupted. Using 
tcp for nfs makes it workable, but very slow.


If I disable pf (pfctl -d), both types of mounts work, and speed is 
excellent.  I also just found that if I remove the scrub in all statement 
and change it to scrub in on bge0, things are fine.


I believe it's a bad idea to run NFS traffic through scrub unless you use 
the no-df option with it.  I just don't scrub my internal network 
traffic at all.

I got this from man pf.conf:

 scrub has the following options:

 no-df
   Clears the dont-fragment bit from a matching IP packet.  Some
   operating systems are known to generate fragmented packets with the
   dont-fragment bit set.  This is particularly true with NFS.  Scrub
   will drop such fragmented dont-fragment packets unless no-df is
   specified.
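
As a rough sketch of the two options mentioned above (the interface names
bge0 and bge1 come from the original post; the rest of the ruleset is
assumed and not shown), the scrub line in pf.conf might look something
like this:

```
# Hypothetical pf.conf excerpt, not the poster's actual ruleset.
# bge0 = external interface, bge1 = internal NFS network.

# Option 1: scrub only on the external interface, so NFS traffic
# on bge1 is never passed through scrub.
scrub in on bge0 all

# Option 2 (alternative): keep scrubbing on all interfaces, but clear
# the dont-fragment bit so fragmented NFS packets are not dropped.
# scrub in all no-df
```

Only one of the two scrub lines would normally be active.  The file can be
syntax-checked with "pfctl -nf /etc/pf.conf" before reloading it with
"pfctl -f /etc/pf.conf".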



Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Danny Braniss
 --On Tuesday, December 12, 2006 20:43:07 -0400 Marc G. Fournier 
 [EMAIL PROTECTED] wrote:
 
  'k, I'm updating my kernel/world to today's, removed KDB_UNATTENDED and
  changed BREAK_TO... to ALT_BREAK_TO... to see if its escape sequence will
  serve me better ...
 
 Nope, no different ... I suspect it's because I'm going in through iLO, and
 it's somehow trapping what I'm sending :(
 
 Oh well, unless jmc is out there and has a better idea, I guess I'll just
 have to hope that it drops into it on its own next time :(  I'm out of
 ideas here ...
 
have you tried enabling ttyd0 in /etc/ttys?

danny




Strange panics

2006-12-12 Thread Alex Povolotsky

Hello!

I'm getting lots of panics on a quite lightly loaded box under 6.1-RELEASE.

The box ran two jails without any flaws for several weeks; after 
adding a third one it began to panic at least once a day.


I have no idea what could cause this.

Here is my dmesg.boot (partial; I can send it all):



Copyright (c) 1992-2006 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
   The Regents of the University of California. All rights reserved.
FreeBSD 6.1-RELEASE #0: Tue Dec 12 08:44:28 MSK 2006
   [EMAIL PROTECTED]:/usr/obj/usr/src/sys/SMP
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2793.01-MHz 686-class CPU)
 Origin = GenuineIntel  Id = 0xf43  Stepping = 3
 Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
 Features2=0x641d<SSE3,RSVD2,MON,DS_CPL,CNTX-ID,CX16,<b14>>
 AMD Features=0x2010<NX,LM>
 Logical CPUs per core: 2
real memory  = 2147352576 (2047 MB)
avail memory = 2096328704 (1999 MB)
ACPI APIC Table: <A M I  OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
cpu0 (BSP): APIC ID:  0
cpu1 (AP): APIC ID:  1
cpu2 (AP): APIC ID:  6
cpu3 (AP): APIC ID:  7

and here is the kgdb output:

Fatal trap 12: page fault while in kernel mode
cpuid = 2; apic id = 06
fault virtual address   = 0xac
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc06ee9d0
stack pointer           = 0x28:0xe91e2b18
frame pointer           = 0x28:0xe91e2b34
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 16260 (tcpserver)
trap number             = 12
panic: page fault
cpuid = 2
Uptime: 19h33m1s

(kgdb) bt
#0  doadump () at pcpu.h:165
#1  0xc06507c9 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:402
#2  0xc0650af1 in panic (fmt=0xc08958ae %s) at /usr/src/sys/kern/kern_shutdown.c:558
#3  0xc084a2cc in trap_fatal (frame=0xe91e2ad8, eva=172) at /usr/src/sys/i386/i386/trap.c:836
#4  0xc084a00b in trap_pfault (frame=0xe91e2ad8, usermode=0, eva=172) at /usr/src/sys/i386/i386/trap.c:744
#5  0xc0849c45 in trap (frame=
      {tf_fs = -1065156600, tf_es = -1056636888, tf_ds = 40, tf_edi = 55, tf_esi = 0, tf_ebp = -383898828, tf_isp = -383898876, tf_ebx = -383898480, tf_edx = -944133120, tf_ecx = 0, tf_eax = 4, tf_trapno = 12, tf_err = 2, tf_eip = -1066473008, tf_cs = 32, tf_eflags = 66182, tf_esp = -880992256, tf_ss = 0}) at /usr/src/sys/i386/i386/trap.c:434
#6  0xc0836c4a in calltrap () at /usr/src/sys/i386/i386/exception.s:139
#7  0xc06ee9d0 in ip_ctloutput (so=0x4, sopt=0xe91e2c90) at atomic.h:146
#8  0xc06feb73 in tcp_ctloutput (so=0xc811b6f4, sopt=0xe91e2c90) at /usr/src/sys/netinet/tcp_usrreq.c:1038
#9  0xc068bdf8 in sosetopt (so=0xc811b6f4, sopt=0xe91e2c90) at /usr/src/sys/kern/uipc_socket.c:1560
#10 0xc0691115 in kern_setsockopt (td=0xc7b9ac00, s=0, level=4, name=4, val=0xc7b9ac00, valseg=UIO_USERSPACE, valsize=0)
    at /usr/src/sys/kern/uipc_syscalls.c:1351
#11 0xc0691046 in setsockopt (td=0xc7b9ac00, uap=0x4) at /usr/src/sys/kern/uipc_syscalls.c:1307
#12 0xc084a613 in syscall (frame=
      {tf_fs = -1078001605, tf_es = 59, tf_ds = -1078001605, tf_edi = -1077942204, tf_esi = 3, tf_ebp = -1077942408, tf_isp = -383898268, tf_ebx = 0, tf_edx = 2, tf_ecx = 134545464, tf_eax = 105, tf_trapno = 12, tf_err = 2, tf_eip = 186394799, tf_cs = 51, tf_eflags = 642, tf_esp = -1077942452, tf_ss = 59}) at /usr/src/sys/i386/i386/trap.c:981
#13 0xc0836c9f in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:200

#14 0x0033 in ?? ()
Previous frame inner to this frame (corrupt stack?)


=== cut ===

Fatal trap 12: page fault while in kernel mode
cpuid = 0; apic id = 00
fault virtual address   = 0xac
fault code              = supervisor write, page not present
instruction pointer     = 0x20:0xc06ee9d0
stack pointer           = 0x28:0xe8f77b18
frame pointer           = 0x28:0xe8f77b34
code segment            = base 0x0, limit 0xf, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 86881 (tcpserver)
trap number             = 12
panic: page fault
cpuid = 0
Uptime: 1h55m49s
Dumping 2047 MB (2 chunks)
 chunk 0: 1MB (158 pages) ... ok
 chunk 1: 2047MB (524000 pages) 2031 2015 1999 1983 1967 1951 1935 1919 
1903 1887 1871 1855 1839 1823 1807 1791 1775 1759 1743 1727 1711 1695 
1679 1663 1647 1631 1615 1599 1583 1567 1551 1535 1519 1503 1487 1471 
1455 1439 1423 1407 1391 1375 1359 1343 1327 1311 1295 1279 1263 1247 
1231 1215 1199 1183 1167 1151 1135 1119 1103 1087 1071 1055 1039 1023 
1007 991 975 959 943 927 911 895 879 863 847 831 815 799 783 767 751 735 
719 703 687 671 655 639 623 607 591 575 559 543 527 511 495 479 463 447 
431 415 

Re: g_vfs_done() failures on 6.2-RC1

2006-12-12 Thread Scott Long

Jan Mikkelsen wrote:


- Daichi Goto's unionfs-p16 has been applied.
- The Areca driver is 1.20.00.12 from the Areca website.
- sym(4) patch (see PR/89550), but no sym controller present.
- SMP + FAST_IPSEC + SUIDDIR + device crypto.

So:  I've seen this problem on a few machines under heavy I/O load, with 
ataraid and with arcmsr.  I've seen others report similar problems, but 
I've seen no resolution.  Does anyone have any idea what the problem 
is?  Has anyone else seen similar problems?  Where to from here?


Thanks,



You mention that you are using a driver from the Areca website.  Have
you tried using the stock driver that comes with FreeBSD?  I don't know
if it will be better or not, but I was planning on doing a refresh of
the stock driver, and I'd hate to introduce instability that wasn't 
there before.


Scott


Re: g_vfs_done() failures on 6.2-RC1

2006-12-12 Thread Matthew D. Fuller
On Wed, Dec 13, 2006 at 12:25:49AM -0700 I heard the voice of
Scott Long, and lo! it spake thus:
 
 You mention that you are using a driver from the Areca website.
 Have you tried using the stock driver that comes with FreeBSD?  I
 don't know if it will be better or not, but I was planning on doing
 a refresh of the stock driver, and I'd hate to introduce instability
 that wasn't there before.

I don't think it's related to that.  I know a guy who saw similar
errors on an ICH-6(?) controller (with no RAID) on some random
RELENG_6, ending up in somewhat wonky filesystems.  I was inclined to
blame the controller, since it ends up with SATA slave devices.


-- 
Matthew Fuller (MF4839)   |  [EMAIL PROTECTED]
Systems/Network Administrator |  http://www.over-yonder.net/~fullermd/
   On the Internet, nobody can hear you scream.


Re: BREAK TO DDB over SSH (Was: Re: panic: sleeping thread)

2006-12-12 Thread Marc G. Fournier
--On Wednesday, December 13, 2006 08:53:32 +0200 Danny Braniss 
[EMAIL PROTECTED] wrote:


 have you tried enabling ttyd0 in /etc/ttys?

Although I'm willing to try just about anything, to what effect?  I don't have 
anything attached to any serial ports ...

-- 
Marc G. Fournier   Hub.Org Networking Services (http://www.hub.org)
Email . [EMAIL PROTECTED]  MSN . [EMAIL PROTECTED]
Yahoo . yscrappy   Skype: hub.orgICQ . 7615664
