Re: -CURRENT panics in NFS

2021-02-27 Thread Mateusz Guzik
Thanks. I adjusted the namecache. However, the NFS fix provided by
Rick should go in regardless.

On 2/27/21, Juraj Lutter  wrote:
>
>
>> On 27 Feb 2021, at 21:49, Mateusz Guzik  wrote:
>>
>> Can you dump 'struct componentname *cnp'? This should do the trick:
>> f 12
>> p cnp
>>
>> Most notably I want to know if the name to be added is a literal dot.
>>
>
> Yes, it is a dot (the directory itself):
>
> cn_nameptr = 0xfe0011428018 ".", cn_namelen = 1
>
> otis
>
>


-- 
Mateusz Guzik 
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: -CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter



> On 27 Feb 2021, at 21:49, Mateusz Guzik  wrote:
> 
> Can you dump 'struct componentname *cnp'? This should do the trick:
> f 12
> p cnp
> 
> Most notably I want to know if the name to be added is a literal dot.
> 

Yes, it is a dot (the directory itself):

cn_nameptr = 0xfe0011428018 ".", cn_namelen = 1

otis



Re: -CURRENT panics in NFS

2021-02-27 Thread Mateusz Guzik
You should be able to just use kgdb on the old kernel and the
crashdump you already collected, provided both are still around.
Alternatively, boot a kernel built with this debug patch and without the fix:

diff --git a/sys/kern/vfs_cache.c b/sys/kern/vfs_cache.c
index fef1e31d197b..c4d2990b155d 100644
--- a/sys/kern/vfs_cache.c
+++ b/sys/kern/vfs_cache.c
@@ -2266,6 +2266,9 @@ cache_enter_time(struct vnode *dvp, struct vnode *vp, struct componentname *cnp,
 	KASSERT(cnp->cn_namelen <= NAME_MAX,
 	    ("%s: passed len %ld exceeds NAME_MAX (%d)", __func__, cnp->cn_namelen,
 	    NAME_MAX));
+	if (dvp == vp) {
+		panic("%s: same vnodes; cnp [%s] len %ld\n", __func__, cnp->cn_nameptr, cnp->cn_namelen);
+	}
 	VNPASS(dvp != vp, dvp);
 	VNPASS(!VN_IS_DOOMED(dvp), dvp);
 	VNPASS(dvp->v_type != VNON, dvp);


On 2/27/21, Juraj Lutter  wrote:
> I am now running a patched kernel, without problems.
>
> I can boot the unpatched one and get the output of these ddb commands.
>
> otis
>
> —
> Juraj Lutter
> XMPP: juraj (at) lutter.sk
> GSM: +421907986576
>
>> On 27 Feb 2021, at 21:49, Mateusz Guzik  wrote:
>>
>> Can you dump 'struct componentname *cnp'? This should do the trick:
>> f 12
>> p cnp
>>
>
>
>


-- 
Mateusz Guzik 


Re: -CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter
I am now running a patched kernel, without problems.

I can boot the unpatched one and get the output of these ddb commands.

otis

—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

> On 27 Feb 2021, at 21:49, Mateusz Guzik  wrote:
> 
> Can you dump 'struct componentname *cnp'? This should do the trick:
> f 12
> p cnp
> 




Re: -CURRENT panics in NFS

2021-02-27 Thread Mateusz Guzik
Can you dump 'struct componentname *cnp'? This should do the trick:
f 12
p cnp

Most notably I want to know if the name to be added is a literal dot.

That case is handled if necessary, but the assert was added to start
making the interface stricter. If the name is a dot I'll be inclined
to remove the assert for 13.x to avoid problems with other callers of
the sort.

Otherwise I'll have to check what's going on there.
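
For illustration, the dot check discussed above can be sketched in
user-space C. This is only a model under stated assumptions: `enum
name_kind` and `classify_name()` are hypothetical names, not namecache
interfaces, and the two arguments mirror the cn_nameptr and cn_namelen
fields of struct componentname.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical classification of a lookup name; not a kernel type. */
enum name_kind { NAME_REGULAR, NAME_DOT, NAME_DOTDOT };

/*
 * Classify a path component given as pointer plus length, the way
 * cn_nameptr and cn_namelen describe it. Only the first len bytes are
 * inspected, so the name does not need to be NUL-terminated.
 */
static enum name_kind
classify_name(const char *name, size_t len)
{
	assert(name != NULL);
	if (len == 1 && name[0] == '.')
		return (NAME_DOT);
	if (len == 2 && name[0] == '.' && name[1] == '.')
		return (NAME_DOTDOT);
	return (NAME_REGULAR);
}
```

With a check like this, a stricter interface could reject NAME_DOT
outright, while a more tolerant one could simply return without adding
an entry.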

On 2/27/21, Juraj Lutter  wrote:
> Hi,
>
> thank you for the swift reaction. This patch fixed my problem.
>
> otis
>
> —
> Juraj Lutter
> XMPP: juraj (at) lutter.sk
> GSM: +421907986576
>
>> On 27 Feb 2021, at 16:53, Rick Macklem  wrote:
>>
>> I reproduced the problem and the attached trivial patch
>> seems to fix it. Please test the patch if you can.
>>
>
>


-- 
Mateusz Guzik 


Re: -CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter
Hi,

thank you for the swift reaction. This patch fixed my problem.

otis

—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

> On 27 Feb 2021, at 16:53, Rick Macklem  wrote:
> 
> I reproduced the problem and the attached trivial patch
> seems to fix it. Please test the patch if you can.
> 



Re: -CURRENT panics in NFS

2021-02-27 Thread Rick Macklem
I reproduced the problem and the attached trivial patch
seems to fix it. Please test the patch if you can.

Mateusz, I assume the directory shouldn't try and add
a cache entry for itself?
I don't test NFSv3 much and I don't test "rdirplus"
much, so it slipped through the cracks.
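
A caller-side guard along these lines can be sketched in user-space C.
This is not the attached rdirplus.patch (its contents are not shown in
this thread); `struct model_vnode`, `struct model_dirent` and
`count_cacheable()` are hypothetical stand-ins used only to illustrate
skipping an entry that refers to the directory itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a vnode; only identity matters here. */
struct model_vnode {
	uint64_t fileid;
};

/* Hypothetical stand-in for one entry of a readdirplus reply. */
struct model_dirent {
	const char *name;
	struct model_vnode *vp;
};

/*
 * Count the entries that would be handed to the name cache, skipping
 * any entry whose vnode is the directory being read (dvp == vp).
 */
static size_t
count_cacheable(struct model_vnode *dvp, const struct model_dirent *ents,
    size_t n)
{
	size_t i, cacheable = 0;

	assert(dvp != NULL);
	for (i = 0; i < n; i++) {
		if (ents[i].vp == dvp)
			continue;	/* the directory itself, e.g. "." */
		cacheable++;
	}
	return (cacheable);
}
```

The guard compares vnode identity rather than the name, so it matches
the condition the assertion checks (dvp != vp) no matter how the entry
is spelled.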

Thanks for reporting it, rick


From: owner-freebsd-curr...@freebsd.org  on 
behalf of Juraj Lutter 
Sent: Saturday, February 27, 2021 9:31 AM
To: freebsd-current
Subject: Re: -CURRENT panics in NFS



And a kgdb backtrace:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=textdump@entry=0) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c7b2a in db_dump (dummy=, dummy2=, 
dummy3=, dummy4=) at /usr/src/sys/ddb/db_command.c:575
#3  0x804c78ee in db_command (last_cmdp=, 
cmd_table=, dopager=dopager@entry=1) at 
/usr/src/sys/ddb/db_command.c:482
#4  0x804c762d in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:535
#5  0x804cac36 in db_trap (type=, code=) 
at /usr/src/sys/ddb/db_main.c:270
#6  0x80c59d04 in kdb_trap (type=type@entry=3, code=code@entry=0, 
tf=, tf@entry=0xfe00d01c3d40) at 
/usr/src/sys/kern/subr_kdb.c:727
#7  0x810bc1ee in trap (frame=0xfe00d01c3d40) at 
/usr/src/sys/amd64/amd64/trap.c:576
#8  
#9  kdb_enter (why=0x812accc9 "panic", msg=) at 
/usr/src/sys/kern/subr_kdb.c:506
#10 0x80c0d5d2 in vpanic (fmt=, ap=, 
ap@entry=0xfe00d01c3ea0) at /usr/src/sys/kern/kern_shutdown.c:907
#11 0x80c0d363 in panic (fmt=0x81e9a178  
"\177\256&\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:843
#12 0x80cd6d74 in cache_enter_time (dvp=0xf80079321e00, 
vp=0xf80079321e00, cnp=cnp@entry=0xfe00d01c4030, 
tsp=tsp@entry=0xfe00d01c40e0, dtsp=)
at /usr/src/sys/kern/vfs_cache.c:2274
#13 0x80ae2bd6 in nfsrpc_readdirplus (vp=, 
vp@entry=0xf80079321e00, uiop=, 
uiop@entry=0xfe00d01c4540,
cookiep=cookiep@entry=0xfe00d01c44e0, 
cred=cred@entry=0xf80079307e00, p=, 
p@entry=0xfe00de06be00, nap=nap@entry=0xfe00d01c4400,
attrflagp=0xfe00d01c44f0, eofp=0xfe00d01c44f4, stuff=0x0) at 
/usr/src/sys/fs/nfsclient/nfs_clrpcops.c:3766
#14 0x80aed4ec in ncl_readdirplusrpc (vp=vp@entry=0xf80079321e00, 
uiop=uiop@entry=0xfe00d01c4540, cred=0xf80079307e00, 
td=td@entry=0xfe00de06be00)
at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:2490
#15 0x80afdc93 in ncl_doio (vp=vp@entry=0xf80079321e00, 
bp=bp@entry=0xfe000ee1c610, cr=0xfe00d01c3d00, 
cr@entry=0xf80079307e00, td=td@entry=0xfe00de06be00,
called_from_strategy=called_from_strategy@entry=0) at 
/usr/src/sys/fs/nfsclient/nfs_clbio.c:1686
#16 0x80afce3c in ncl_bioread (vp=, 
vp@entry=0xf80079321e00, uio=, ioflag=ioflag@entry=0, 
cred=)
at /usr/src/sys/fs/nfsclient/nfs_clbio.c:604
#17 0x80af1baf in nfs_readdir (ap=ap@entry=0xfe00d01c4918) at 
/usr/src/sys/fs/nfsclient/nfs_clvnops.c:2383
#18 0x80ce490f in vop_sigdefer (vop=, 
a=0xfe00d01c4918) at /usr/src/sys/kern/vfs_default.c:1471
#19 0x81181f38 in VOP_READDIR_APV (vop=0x81af00d8 
, a=a@entry=0xfe00d01c4918) at vnode_if.c:1939
#20 0x80d0b23b in VOP_READDIR (vp=0xf80079321e00, 
uio=0xfe00d01c48d0, cred=, eofflag=0xfe00d01c48cc, 
ncookies=0x0, cookies=0x0) at ./vnode_if.h:985
#21 kern_getdirentries (td=, fd=, buf=0x801851000 
, count=4096, 
basep=basep@entry=0xfe00d01c49b0,
residp=residp@entry=0x0, bufseg=UIO_USERSPACE) at 
/usr/src/sys/kern/vfs_syscalls.c:4142
#22 0x80d0b449 in sys_getdirentries (td=0x81e9a178 
, uap=0xfe00de06c1e8) at /usr/src/sys/kern/vfs_syscalls.c:4089
#23 0x810bd00e in syscallenter (td=) at 
/usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#24 amd64_syscall (td=0xfe00de06be00, traced=0) at 
/usr/src/sys/amd64/amd64/trap.c:1156
#25 
#26 0x0008012a83fa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffd928

—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

> On 27 Feb 2021, at 15:18, Juraj Lutter  wrote:
>
> Reliably reproducible:
>



rdirplus.patch
Description: rdirplus.patch


Re: -CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter
And a kgdb backtrace:

(kgdb) bt
#0  __curthread () at /usr/src/sys/amd64/include/pcpu_aux.h:55
#1  doadump (textdump=textdump@entry=0) at /usr/src/sys/kern/kern_shutdown.c:399
#2  0x804c7b2a in db_dump (dummy=, dummy2=, 
dummy3=, dummy4=) at /usr/src/sys/ddb/db_command.c:575
#3  0x804c78ee in db_command (last_cmdp=, 
cmd_table=, dopager=dopager@entry=1) at 
/usr/src/sys/ddb/db_command.c:482
#4  0x804c762d in db_command_loop () at 
/usr/src/sys/ddb/db_command.c:535
#5  0x804cac36 in db_trap (type=, code=) 
at /usr/src/sys/ddb/db_main.c:270
#6  0x80c59d04 in kdb_trap (type=type@entry=3, code=code@entry=0, 
tf=, tf@entry=0xfe00d01c3d40) at 
/usr/src/sys/kern/subr_kdb.c:727
#7  0x810bc1ee in trap (frame=0xfe00d01c3d40) at 
/usr/src/sys/amd64/amd64/trap.c:576
#8  
#9  kdb_enter (why=0x812accc9 "panic", msg=) at 
/usr/src/sys/kern/subr_kdb.c:506
#10 0x80c0d5d2 in vpanic (fmt=, ap=, 
ap@entry=0xfe00d01c3ea0) at /usr/src/sys/kern/kern_shutdown.c:907
#11 0x80c0d363 in panic (fmt=0x81e9a178  
"\177\256&\201\377\377\377\377") at /usr/src/sys/kern/kern_shutdown.c:843
#12 0x80cd6d74 in cache_enter_time (dvp=0xf80079321e00, 
vp=0xf80079321e00, cnp=cnp@entry=0xfe00d01c4030, 
tsp=tsp@entry=0xfe00d01c40e0, dtsp=)
at /usr/src/sys/kern/vfs_cache.c:2274
#13 0x80ae2bd6 in nfsrpc_readdirplus (vp=, 
vp@entry=0xf80079321e00, uiop=, 
uiop@entry=0xfe00d01c4540,
cookiep=cookiep@entry=0xfe00d01c44e0, 
cred=cred@entry=0xf80079307e00, p=, 
p@entry=0xfe00de06be00, nap=nap@entry=0xfe00d01c4400,
attrflagp=0xfe00d01c44f0, eofp=0xfe00d01c44f4, stuff=0x0) at 
/usr/src/sys/fs/nfsclient/nfs_clrpcops.c:3766
#14 0x80aed4ec in ncl_readdirplusrpc (vp=vp@entry=0xf80079321e00, 
uiop=uiop@entry=0xfe00d01c4540, cred=0xf80079307e00, 
td=td@entry=0xfe00de06be00)
at /usr/src/sys/fs/nfsclient/nfs_clvnops.c:2490
#15 0x80afdc93 in ncl_doio (vp=vp@entry=0xf80079321e00, 
bp=bp@entry=0xfe000ee1c610, cr=0xfe00d01c3d00, 
cr@entry=0xf80079307e00, td=td@entry=0xfe00de06be00,
called_from_strategy=called_from_strategy@entry=0) at 
/usr/src/sys/fs/nfsclient/nfs_clbio.c:1686
#16 0x80afce3c in ncl_bioread (vp=, 
vp@entry=0xf80079321e00, uio=, ioflag=ioflag@entry=0, 
cred=)
at /usr/src/sys/fs/nfsclient/nfs_clbio.c:604
#17 0x80af1baf in nfs_readdir (ap=ap@entry=0xfe00d01c4918) at 
/usr/src/sys/fs/nfsclient/nfs_clvnops.c:2383
#18 0x80ce490f in vop_sigdefer (vop=, 
a=0xfe00d01c4918) at /usr/src/sys/kern/vfs_default.c:1471
#19 0x81181f38 in VOP_READDIR_APV (vop=0x81af00d8 
, a=a@entry=0xfe00d01c4918) at vnode_if.c:1939
#20 0x80d0b23b in VOP_READDIR (vp=0xf80079321e00, 
uio=0xfe00d01c48d0, cred=, eofflag=0xfe00d01c48cc, 
ncookies=0x0, cookies=0x0) at ./vnode_if.h:985
#21 kern_getdirentries (td=, fd=, buf=0x801851000 
, count=4096, 
basep=basep@entry=0xfe00d01c49b0,
residp=residp@entry=0x0, bufseg=UIO_USERSPACE) at 
/usr/src/sys/kern/vfs_syscalls.c:4142
#22 0x80d0b449 in sys_getdirentries (td=0x81e9a178 
, uap=0xfe00de06c1e8) at /usr/src/sys/kern/vfs_syscalls.c:4089
#23 0x810bd00e in syscallenter (td=) at 
/usr/src/sys/amd64/amd64/../../kern/subr_syscall.c:189
#24 amd64_syscall (td=0xfe00de06be00, traced=0) at 
/usr/src/sys/amd64/amd64/trap.c:1156
#25 
#26 0x0008012a83fa in ?? ()
Backtrace stopped: Cannot access memory at address 0x7fffd928

—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

> On 27 Feb 2021, at 15:18, Juraj Lutter  wrote:
> 
> Reliably reproducible:
> 



Re: -CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter
Reliably reproducible:

VNASSERT failed: dvp != vp not true at /usr/src/sys/kern/vfs_cache.c:2269 
(cache_enter_time)
0xf80079321e00: type VDIR
usecount 4, writecount 0, refcount 3 seqc users 0 mountedhere 0
hold count flags ()
flags (VV_ROOT|VV_VMSIZEVNLOCK)
v_object 0xf801eeaf1d68 ref 0 pages 2 cleanbuf 1 dirtybuf 0
lock type nfs: SHARED (count 1)
fileid 34 fsid 0x3a3a00ff02
panic: condition dvp != vp not met at /usr/src/sys/kern/vfs_cache.c:2269 
(cache_enter_time)
cpuid = 1
time = 1614435453
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe00d01c3e10
vpanic() at vpanic+0x181/frame 0xfe00d01c3e60
panic() at panic+0x43/frame 0xfe00d01c3ec0
cache_enter_time() at cache_enter_time+0x1574/frame 0xfe00d01c3fa0
nfsrpc_readdirplus() at nfsrpc_readdirplus+0xcb6/frame 0xfe00d01c43d0
ncl_readdirplusrpc() at ncl_readdirplusrpc+0xdc/frame 0xfe00d01c4520
ncl_doio() at ncl_doio+0x423/frame 0xfe00d01c45b0
ncl_bioread() at ncl_bioread+0x5cc/frame 0xfe00d01c4740
nfs_readdir() at nfs_readdir+0x18f/frame 0xfe00d01c4850
vop_sigdefer() at vop_sigdefer+0x2f/frame 0xfe00d01c4880
VOP_READDIR_APV() at VOP_READDIR_APV+0x38/frame 0xfe00d01c48a0
kern_getdirentries() at kern_getdirentries+0x1fb/frame 0xfe00d01c4990
sys_getdirentries() at sys_getdirentries+0x29/frame 0xfe00d01c49c0
amd64_syscall() at amd64_syscall+0x12e/frame 0xfe00d01c4af0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe00d01c4af0
--- syscall (554, FreeBSD ELF64, sys_getdirentries), rip = 0x8012a83fa, rsp = 
0x7fffd928, rbp = 0x7fffd960 ---
KDB: enter: panic
[ thread pid 1879 tid 101207 ]
Stopped at  kdb_enter+0x37: movq    $0,0x128bdde(%rip)
db>


—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576

> On 27 Feb 2021, at 15:10, Juraj Lutter  wrote:
> 
> - poudriere data stored on NFS
> - NFS server 12-STABLE
> - NFS client (that panicked) 14-CURRENT
> - Panic string:
> 
> condition dvp != vp not met at /usr/src/sys/kern/vfs_cache.c:2269 
> (cache_enter_time)
> 
> backtrace:
> 
> Tracing pid 27294 tid 100893 td 0xfe00ea1a3500
> kdb_enter() at kdb_enter+0x37/frame 0xfe00ea3dee10
> vpanic() at vpanic+0x1b2/frame 0xfe00ea3dee60
> panic() at panic+0x43/frame 0xfe00ea3deec0
> cache_enter_time() at cache_enter_time+0x1574/frame 0xfe00ea3defa0
> nfsrpc_readdirplus() at nfsrpc_readdirplus+0xcb6/frame 0xfe00ea3df3d0
> ncl_readdirplusrpc() at ncl_readdirplusrpc+0xdc/frame 0xfe00ea3df520
> ncl_doio() at ncl_doio+0x423/frame 0xfe00ea3df5b0
> ncl_bioread() at ncl_bioread+0x5cc/frame 0xfe00ea3df740
> nfs_readdir() at nfs_readdir+0x18f/frame 0xfe00ea3df850
> vop_sigdefer() at vop_sigdefer+0x2f/frame 0xfe00ea3df880
> VOP_READDIR_APV() at VOP_READDIR_APV+0x38/frame 0xfe00ea3df8a0
> kern_getdirentries() at kern_getdirentries+0x1fb/frame 0xfe00ea3df990
> sys_getdirentries() at sys_getdirentries+0x29/frame 0xfe00ea3df9c0
> amd64_syscall() at amd64_syscall+0x12e/frame 0xfe00ea3dfaf0
> fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe00ea3dfaf0
> 
> 
> running processes:
> 
> 27295 27179 27179 0  S+  piperd  0xf800090b4ba0  xargs
> 27294 27179 27179 0  R+  CPU 1   find
> 27179   816 27179 0  S+  wait    0xf80181b69528  sh
> 
> Dump header from device: /dev/vtbd0p3
>  Architecture: amd64
>  Architecture Version: 2
>  Dump Length: 1860571136
>  Blocksize: 512
>  Compression: none
>  Dumptime: 2021-02-27 14:59:59 +0100
>  Hostname: b14.builder.wilbury.net
>  Magic: FreeBSD Kernel Dump
>  Version String: FreeBSD 14.0-CURRENT #0 main-n245107-172f2fc11cc: Fri Feb 26 
> 15:20:00 CET 2021
>r...@b14.builder.wilbury.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
>  Panic String: condition dvp != vp not met at 
> /usr/src/sys/kern/vfs_cache.c:2269 (cache_enter_time)
>  Dump Parity: 1481068399
>  Bounds: 0
>  Dump Status: good
> 
> —
> Juraj Lutter
> XMPP: juraj (at) lutter.sk
> GSM: +421907986576
> 



-CURRENT panics in NFS

2021-02-27 Thread Juraj Lutter
- poudriere data stored on NFS
- NFS server 12-STABLE
- NFS client (that panicked) 14-CURRENT
- Panic string:

condition dvp != vp not met at /usr/src/sys/kern/vfs_cache.c:2269 
(cache_enter_time)

backtrace:

Tracing pid 27294 tid 100893 td 0xfe00ea1a3500
kdb_enter() at kdb_enter+0x37/frame 0xfe00ea3dee10
vpanic() at vpanic+0x1b2/frame 0xfe00ea3dee60
panic() at panic+0x43/frame 0xfe00ea3deec0
cache_enter_time() at cache_enter_time+0x1574/frame 0xfe00ea3defa0
nfsrpc_readdirplus() at nfsrpc_readdirplus+0xcb6/frame 0xfe00ea3df3d0
ncl_readdirplusrpc() at ncl_readdirplusrpc+0xdc/frame 0xfe00ea3df520
ncl_doio() at ncl_doio+0x423/frame 0xfe00ea3df5b0
ncl_bioread() at ncl_bioread+0x5cc/frame 0xfe00ea3df740
nfs_readdir() at nfs_readdir+0x18f/frame 0xfe00ea3df850
vop_sigdefer() at vop_sigdefer+0x2f/frame 0xfe00ea3df880
VOP_READDIR_APV() at VOP_READDIR_APV+0x38/frame 0xfe00ea3df8a0
kern_getdirentries() at kern_getdirentries+0x1fb/frame 0xfe00ea3df990
sys_getdirentries() at sys_getdirentries+0x29/frame 0xfe00ea3df9c0
amd64_syscall() at amd64_syscall+0x12e/frame 0xfe00ea3dfaf0
fast_syscall_common() at fast_syscall_common+0xf8/frame 0xfe00ea3dfaf0


running processes:

27295 27179 27179 0  S+  piperd  0xf800090b4ba0  xargs
27294 27179 27179 0  R+  CPU 1   find
27179   816 27179 0  S+  wait    0xf80181b69528  sh

Dump header from device: /dev/vtbd0p3
  Architecture: amd64
  Architecture Version: 2
  Dump Length: 1860571136
  Blocksize: 512
  Compression: none
  Dumptime: 2021-02-27 14:59:59 +0100
  Hostname: b14.builder.wilbury.net
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 14.0-CURRENT #0 main-n245107-172f2fc11cc: Fri Feb 26 
15:20:00 CET 2021
r...@b14.builder.wilbury.net:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
  Panic String: condition dvp != vp not met at 
/usr/src/sys/kern/vfs_cache.c:2269 (cache_enter_time)
  Dump Parity: 1481068399
  Bounds: 0
  Dump Status: good

—
Juraj Lutter
XMPP: juraj (at) lutter.sk
GSM: +421907986576



Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-15 Thread Lev Serebryakov
On 14.07.2020 1:47, Yuri Pankov wrote:

>> AVAGO MegaRAID SAS FreeBSD mrsas driver version: 07.709.04.00-fbsd
>> mfi0:  port 0x6000-0x60ff mem 
>> 0xc730-0xc730,0xc720-0xc72f irq 26 at device
>> 0.0 numa-domain 0 on pci3
>> mfi0: Using MSI
>> mfi0: Megaraid SAS driver Ver 4.23
>> mfi0: FW MaxCmds = 928, limiting to 128
>> mfi0: MaxCmd = 928, Drv MaxCmd = 128, MaxSgl = 70, state = 0xb73c03a0
>> .
>> mfi0: 54944 (boot + 6s/0x0020/info) - Firmware initialization started (PCI 
>> ID 005d/1000/0809/15d9)
>> pcib4: mfi0: 54945 (boot + 6s/0x0020/info) - Firmware 
>> version 4.290.00-4536
>>
>>
>> I have posted screenshots of the panic at:
>>  www.tegenbosch28.nl/FreeBSD/Crash-LSI3108
>>
>> But basically it crashes in
>>  mfi_tbolt_send_frame() +0x132
>>
>> So is there anybody out there that can help me with analyzing and fixing 
>> this panic?
> 
> I guess it's not the answer you are looking for, but you could try the mrsas 
> driver and check if it behaves better for you, by setting 'set 
> hw.mfi.mrsas_enable=1' from loader prompt.

 I'm puzzled. I have a SuperMicro add-on card based on the LSI/Avago/Broadcom
3008, and I'm using the "mpr" driver:

mpr0:  port 0xe000-0xe0ff mem 0xf724-0xf724,0xf720-0xf723 irq 16 at device 0.0 on pci1
mpr0: Firmware: 14.00.00.00, Driver: 23.00.00.00-fbsd
mpr0: IOCCapabilities: 7a85c

 "man mpr" mentions LSI-3108 too, and "man mfi" doesn't mention LSI-3xxx chips.
Why does "mfi" attach to LSI-3xxx? What is the proper driver for these chips?


-- 
// Lev Serebryakov





Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-15 Thread Mark Linimon
On Wed, Jul 15, 2020 at 11:56:43AM +0200, Willem Jan Withagen wrote:
> A bit of a pain, since pkg does not do it

because ...

> you need to manually fetch the tar from Broadcom first.

Finally:

> [pkg] also does not tell you why

Just ask it:

  portsjail% cd sysutils/storcli
  portsjail% make -V IGNORE
  You must manually fetch the distribution file 
(007.1211.._Unified_StorCLI.zip) from 
https://docs.broadcom.com/docs-and-downloads/raid-controllers/raid-controllers-common-files/007.1211.._Unified_StorCLI.zip,
 place it in /home/linimon/ports/default/distfiles and then run make again
  portsjail% make -V LICENSE
  storcli
  portsjail% make -V LICENSE_TEXT
  Source recipient must acknowledge license. Reproduction or redistribution 
prohibited. See 
https://www.broadcom.com/cs/Satellite?pagename=AVG2/Utilities/EulaMsg

mcl


Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-15 Thread Willem Jan Withagen

On 14-7-2020 22:59, mike tancsa wrote:

On 7/14/2020 5:14 AM, Willem Jan Withagen wrote:

On 14-7-2020 07:45, Andriy Gapon wrote:

On 14/07/2020 03:39, Willem Jan Withagen wrote:

And what I read from the manual page, mrsas plays even nicer with
CAM which is a
plus.

If by "nicer" you mean that mfi does not integrate with CAM at all,
then you are
right :-)
Also, last I looked mfi has some pretty serious bugs in its direct
interface to
GEOM.  We've seen all kinds of crashes with mfi at work.

Right that was what I meant.
Disadvantage is that mfiutil no longer works.
But then if you JBOD, it does not really matter.
Unless it still uses caching for JBODs and then I'd like to know the
state of the
battery.


Take a look at storcli or MegaCli from ports

You can do pretty well everything you need with those 2 tools to talk to
mrsas attached disks

eg

MegaCli -CfgEachDskRaid0 WT NORA Direct CachedBadBBU -Automatic -a0
or
storcli /c0 show all
storcli /c0 show help
storcli /c0 set jbod=on (enable jbod mode for drives)
storcli /c0/e252/s0 set jbod (sets a disk into jbod mode)

Great suggestion.

storcli seems to be the successor to MegaCli, so I only installed storcli.
It is a bit of a pain, since PKG does not do it, and it also does not tell
you why.

You need to manually fetch the tar from Broadcom first.

But then it works like a charm, even upgrading the firmware while the
system is running off that controller:


root@quad-a:/usr/ports/sysutils/storcli # storcli /c0 download file=/tmp/smc3108.rom

Download Completed.
Flashing image to adapter...
CLI Version = 007.1211.. Nov 07, 2019
Operating system = FreeBSD 13.0-CURRENT
Controller = 0
Status = Success
Description = F/W Flash Completed. Please reboot the system for the changes to take effect


Current package version = 24.9.0-0022
New package version = 24.21.0-0100

Thanx,
--WjW



Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-14 Thread mike tancsa
On 7/14/2020 5:14 AM, Willem Jan Withagen wrote:
> On 14-7-2020 07:45, Andriy Gapon wrote:
>> On 14/07/2020 03:39, Willem Jan Withagen wrote:
>>> And what I read from the manual page, mrsas plays even nicer with
>>> CAM which is a
>>> plus.
>> If by "nicer" you mean that mfi does not integrate with CAM at all,
>> then you are
>> right :-)
>> Also, last I looked mfi has some pretty serious bugs in its direct
>> interface to
>> GEOM.  We've seen all kinds of crashes with mfi at work.
> Right that was what I meant.
> Disadvantage is that mfiutil no longer works.
> But then if you JBOD, it does not really matter.
> Unless it still uses caching for JBODs and then I'd like to know the
> state of the
> battery.
>
Take a look at storcli or MegaCli from ports

You can do pretty well everything you need with those 2 tools to talk to
mrsas attached disks

eg

MegaCli -CfgEachDskRaid0 WT NORA Direct CachedBadBBU -Automatic -a0
or
storcli /c0 show all
storcli /c0 show help
storcli /c0 set jbod=on (enable jbod mode for drives)
storcli /c0/e252/s0 set jbod (sets a disk into jbod mode)




Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-14 Thread Willem Jan Withagen

On 14-7-2020 11:18, Hans Petter Selasky wrote:

On 2020-07-14 02:39, Willem Jan Withagen wrote:

I guess that there are reasons not to do this by default.


I've seen the exact same panic.

+1 for fixing it :-)


I do not have the knowledge to fix this panic.
So the only thing I/we can do is:

Get extra information into the mfi manpages.

And perhaps get the preference between mrsas and mfi reversed,
but I would not know where to start that discussion.

--WjW


Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-14 Thread Hans Petter Selasky

On 2020-07-14 02:39, Willem Jan Withagen wrote:

I guess that there are reasons not to do this by default.


I've seen the exact same panic.

+1 for fixing it :-)

--HPS


Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-14 Thread Willem Jan Withagen

On 14-7-2020 07:45, Andriy Gapon wrote:

On 14/07/2020 03:39, Willem Jan Withagen wrote:

And what I read from the manual page, mrsas plays even nicer with CAM which is a
plus.

If by "nicer" you mean that mfi does not integrate with CAM at all, then you are
right :-)
Also, last I looked mfi has some pretty serious bugs in its direct interface to
GEOM.  We've seen all kinds of crashes with mfi at work.

Right that was what I meant.
Disadvantage is that mfiutil no longer works.
But then if you JBOD, it does not really matter.
Unless it still uses caching for JBODs and then I'd like to know the
state of the battery.


Whatever the reason why mrsas is not always preferred over mfi, it must be
pretty nebulous, like POLA for existing users.  From a technical point of
view, mrsas appears to be superior.

Right, the POLA argument...

The least it would warrant is a warning in the mfi/mfiutil manpage that
mrsas is a lot more modern and that it should be preferred if the user has
no specific reason to select mfi.

And perhaps it is too complicated for the build of the boot images for
ISOs and sticks, but there it would help a lot. Right now, booting with this
controller and a disk in the system leads to a panic, and a heavy one that
requires a hard reset/power-cycle, since the console is dead.


--WjW




Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-13 Thread Andriy Gapon
On 14/07/2020 03:39, Willem Jan Withagen wrote:
> And what I read from the manual page, mrsas plays even nicer with CAM which 
> is a
> plus.

If by "nicer" you mean that mfi does not integrate with CAM at all, then you are
right :-)
Also, last I looked mfi has some pretty serious bugs in its direct interface to
GEOM.  We've seen all kinds of crashes with mfi at work.

Whatever the reason why mrsas is not always preferred over mfi, it must be
pretty nebulous, like POLA for existing users.  From a technical point of
view, mrsas appears to be superior.

-- 
Andriy Gapon


Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-13 Thread Willem Jan Withagen

On 14-7-2020 00:47, Yuri Pankov wrote:

Willem Jan Withagen wrote:

Hi,

I have this Supermicro SUPERSERVER® 2028TP,
which holds four nodes, each with an X10DRT-PT motherboard
and an LSI-3108 SAS controller connecting to 6 disks.

Trying to run the most recent current snapshot on it.
# uname -a
FreeBSD quadbox-d.digiware.nl 13.0-CURRENT FreeBSD 13.0-CURRENT #0 
r363032: Thu Jul  9 04:13:17 UTC 2020 
r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC 
amd64


I have installed the OS on a SATA flash DOM.
Booting works fine as long as there are no disks connected to 
LSI-3108 controller.

Booting with SAS disks connected results in a panic.
Attaching a SAS disk to the LSI-3108 controller gives a panic as well.



AVAGO MegaRAID SAS FreeBSD mrsas driver version: 07.709.04.00-fbsd
mfi0:  port 0x6000-0x60ff mem 0xc730-0xc730,0xc720-0xc72f irq 26 at device 0.0 numa-domain 0 on pci3
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: FW MaxCmds = 928, limiting to 128
mfi0: MaxCmd = 928, Drv MaxCmd = 128, MaxSgl = 70, state = 0xb73c03a0
.
mfi0: 54944 (boot + 6s/0x0020/info) - Firmware initialization started 
(PCI ID 005d/1000/0809/15d9)
pcib4: mfi0: 54945 (boot + 6s/0x0020/info) - 
Firmware version 4.290.00-4536



I have posted screenshots of the panic at:
 www.tegenbosch28.nl/FreeBSD/Crash-LSI3108

But basically it crashes in
 mfi_tbolt_send_frame() +0x132

So is there anybody out there that can help me with analyzing and 
fixing this panic?


I guess it's not the answer you are looking for, but you could try the 
mrsas driver and check if it behaves better for you, by setting 'set 
hw.mfi.mrsas_enable=1' from loader prompt.


That was a great suggestion.
And what I read from the manual page, mrsas plays even nicer with CAM 
which is a plus.


I guess that there are reasons not to do this by default.
So one gets mrsas unless it does not attach to that specific card in the 
system?


--WjW



Re: Current panics on connecting disks to a LSI-3108 controller

2020-07-13 Thread Yuri Pankov

Willem Jan Withagen wrote:

Hi,

I have this Supermicro SUPERSERVER® 2028TP,
which holds four nodes, each with an X10DRT-PT motherboard
and an LSI-3108 SAS controller connecting to 6 disks.

Trying to run the most recent current snapshot on it.
# uname -a
FreeBSD quadbox-d.digiware.nl 13.0-CURRENT FreeBSD 13.0-CURRENT #0 
r363032: Thu Jul  9 04:13:17 UTC 2020 
r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


I have installed the OS on a SATA flash DOM.
Booting works fine as long as there are no disks connected to LSI-3108 
controller.

Booting with SAS disks connected results in a panic.
Attaching a SAS disk to the LSI-3108 controller gives a panic as well.



AVAGO MegaRAID SAS FreeBSD mrsas driver version: 07.709.04.00-fbsd
mfi0:  port 0x6000-0x60ff mem 0xc730-0xc730,0xc720-0xc72f irq 26 at device 0.0 numa-domain 0 on pci3
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: FW MaxCmds = 928, limiting to 128
mfi0: MaxCmd = 928, Drv MaxCmd = 128, MaxSgl = 70, state = 0xb73c03a0
.
mfi0: 54944 (boot + 6s/0x0020/info) - Firmware initialization started 
(PCI ID 005d/1000/0809/15d9)
pcib4: mfi0: 54945 (boot + 6s/0x0020/info) - 
Firmware version 4.290.00-4536



I have posted screenshots of the panic at:
     www.tegenbosch28.nl/FreeBSD/Crash-LSI3108

But basically it crashes in
     mfi_tbolt_send_frame() +0x132

So is there anybody out there that can help me with analyzing and fixing 
this panic?


I guess it's not the answer you are looking for, but you could try the 
mrsas driver and check if it behaves better for you, by setting 'set 
hw.mfi.mrsas_enable=1' from loader prompt.
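
If the tunable helps, it can presumably be made persistent instead of being
typed at the loader prompt on every boot. A minimal fragment, assuming the
usual loader.conf(5) variable syntax applies to this tunable; only the tunable
name and value come from the suggestion above, the file placement is an
assumption:

```
# /boot/loader.conf (assumed location for a persistent setting)
# Ask mfi(4) to step aside so that mrsas(4) can attach to the controller.
hw.mfi.mrsas_enable="1"
```

After rebooting, the boot messages should show whether the controller attached
as mrsas0 or mfi0.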



Current panics on connecting disks to a LSI-3108 controller

2020-07-13 Thread Willem Jan Withagen

Hi,

I have this Supermicro SUPERSERVER®2028TP
which holds four nodes, each with an X10DRT-PT motherboard
and a LSI-3108 SAS controller connecting to 6 disks.

Trying to run the most recent current snapshot on it.
# uname -a
FreeBSD quadbox-d.digiware.nl 13.0-CURRENT FreeBSD 13.0-CURRENT #0 
r363032: Thu Jul  9 04:13:17 UTC 2020 
r...@releng1.nyi.freebsd.org:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64


I have installed the OS on a SATA flash DOM.
Booting works fine as long as there are no disks connected to LSI-3108 
controller.

Booting with SAS disks connected results in a panic.
Attaching a SAS disk to the LSI-3108 controller gives a panic as well.



AVAGO MegaRAID SAS FreeBSD mrsas driver version: 07.709.04.00-fbsd
mfi0:  port 0x6000-0x60ff mem 0xc730-0xc730,0xc720-0xc72f irq 26 at device 0.0 numa-domain 0 on pci3
mfi0: Using MSI
mfi0: Megaraid SAS driver Ver 4.23
mfi0: FW MaxCmds = 928, limiting to 128
mfi0: MaxCmd = 928, Drv MaxCmd = 128, MaxSgl = 70, state = 0xb73c03a0
.
mfi0: 54944 (boot + 6s/0x0020/info) - Firmware initialization started 
(PCI ID 005d/1000/0809/15d9)
pcib4: mfi0: 54945 (boot + 6s/0x0020/info) - 
Firmware version 4.290.00-4536



I have posted screenshots of the panic at:
    www.tegenbosch28.nl/FreeBSD/Crash-LSI3108

But basically it crashes in
    mfi_tbolt_send_frame() +0x132

So is there anybody out there that can help me with analyzing and fixing 
this panic?


Thanx,
--WjW



Re: 12-Current panics on boot (didn't a week ago.)

2018-03-31 Thread Joe Maloney
The drm-next-kmod, and drm-stable-kmod modules panic for me.  I will attach
logs when I can.

On Friday, March 30, 2018, Andrew Reilly  wrote:

> Hi Jonathan, all,
>
> I've just compiled and booted a kernel derived from current-GENERIC
> but with nooptions TCP_BLACKBOX, and much to my surprise it boots.
> Possible link to network-related activities is that the next line
> of boot output that was not being displayed during the crash is:
>
> [ath_hal] loaded
>
> That's vaguely network-shaped: could it be an issue?
>
> Please let me know if there's anything else that I could test or
> poke, in order to find the real culprit.
>
> My make.conf says:
>
> KERNCONF=ZEN
> WRKDIRPREFIX=/usr/obj/ports
> MALLOC_PRODUCTION=yes
>
> My /usr/src/sys/amd64/conf/ZEN says:
>
> include GENERIC
> nooptions TCP_BLACKBOX
>
> Uname -a says:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r331768M: Sat
> Mar 31 10:47:52 AEDT 2018 root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN
> amd64
>
> Cheers,
>
> Andrew
>
>
> Here's the top part of the new dmesg.boot, FYI:
> Copyright (c) 1992-2018 The FreeBSD Project.
> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> The Regents of the University of California. All rights reserved.
> FreeBSD is a registered trademark of The FreeBSD Foundation.
> FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 10:47:52 AEDT 2018
> root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN amd64
> FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM
> 6.0.0)
> WARNING: WITNESS option enabled, expect reduced performance.
> VT(vga): resolution 640x480
> CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.45-MHz K8-class
> CPU)
>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>   Features=0x178bfbff APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>   Features2=0x7ed8320b SSE4.1,SSE4.2,MOVBE,POPCNT,AESNI,XSAVE,OSXSAVE,AVX,F16C,RDRAND>
>   AMD Features=0x2e500800
>   AMD Features2=0x35c233ff Prefetch,OSVW,SKINIT,WDT,TCE,Topology,PCXC,PNXC,DBE,PL2I,MWAITX>
>   Structured Extended Features=0x209c01a9 BMI1,AVX2,SMEP,BMI2,RDSEED,ADX,SMAP,CLFLUSHOPT,SHA>
>   XSAVE Features=0xf
>   AMD Extended Feature Extensions ID EBX=0x7
>   SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
>   TSC: P-state invariant, performance statistics
> real memory  = 34359738368 (32768 MB)
> avail memory = 33271214080 (31729 MB)
> Event timer "LAPIC" quality 600
> ACPI APIC Table: 
> FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
> FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s)
> random: unblocking device.
> Firmware Warning (ACPI): Optional FADT field Pm2ControlBlock has valid
> Length but zero Address: 0x/0x1 (20180313/tbfadt-796)
> ioapic0  irqs 0-23 on motherboard
> ioapic1  irqs 24-55 on motherboard
> SMP: AP CPU #7 Launched!
> SMP: AP CPU #3 Launched!
> SMP: AP CPU #2 Launched!
> SMP: AP CPU #6 Launched!
> SMP: AP CPU #5 Launched!
> SMP: AP CPU #4 Launched!
> SMP: AP CPU #1 Launched!
> Timecounter "TSC-low" frequency 1497224985 Hz quality 1000
> random: entropy device external interface
> [ath_hal] loaded
> module_register_init: MOD_LOAD (vesa, 0x8109f600, 0) error 19
> random: registering fast source Intel Secure Key RNG
> random: fast provider: "Intel Secure Key RNG"
> kbd1 at kbdmux0
> netmap: loaded module
> nexus0
> vtvga0:  on motherboard
> cryptosoft0:  on motherboard
> aesni0:  on motherboard
> acpi0:  on motherboard
> acpi0: Power Button (fixed)
> cpu0:  on acpi0
> cpu1:  on acpi0
> cpu2:  on acpi0
> cpu3:  on acpi0
> cpu4:  on acpi0
> cpu5:  on acpi0
> cpu6:  on acpi0
> cpu7:  on acpi0
> attimer0:  port 0x40-0x43 irq 0 on acpi0
> Timecounter "i8254" frequency 1193182 Hz quality 0
> Event timer "i8254" frequency 1193182 Hz quality 100
> atrtc0:  port 0x70-0x71 on acpi0
> atrtc0: registered as a time-of-day clock, resolution 1.00s
> Event timer "RTC" frequency 32768 Hz quality 0
> hpet0:  iomem 0xfed0-0xfed003ff irq 0,8 on
> acpi0
> Timecounter "HPET" frequency 14318180 Hz quality 950
> Event timer "HPET" frequency 14318180 Hz quality 350
> Event timer "HPET1" frequency 14318180 Hz quality 350
> Event timer "HPET2" frequency 14318180 Hz quality 350
> Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
> acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
> pcib0:  port 0xcf8-0xcff on acpi0
> pci0:  on pcib0
> amdsmn0:  on hostb0
> amdtemp0:  on hostb0
>
>
> On Sun, Mar 25, 2018 at 04:35:31AM +, Jonathan Looney wrote:
> > For now, you can update through r331485 and then take TCP_BLACKBOX out of
> > your kernel config file. That won’t really “fix” 

Re: 12-Current panics on boot (didn't a week ago.)

2018-03-30 Thread Andrew Reilly
Hi Jonathan, all,

I've just compiled and booted a kernel derived from current-GENERIC
but with nooptions TCP_BLACKBOX, and much to my surprise it boots.
Possible link to network-related activities is that the next line
of boot output that was not being displayed during the crash is:

[ath_hal] loaded

That's vaguely network-shaped: could it be an issue?

Please let me know if there's anything else that I could test or
poke, in order to find the real culprit.

My make.conf says:

KERNCONF=ZEN
WRKDIRPREFIX=/usr/obj/ports
MALLOC_PRODUCTION=yes

My /usr/src/sys/amd64/conf/ZEN says:

include GENERIC
nooptions TCP_BLACKBOX

Uname -a says:
FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 
10:47:52 AEDT 2018 root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN  amd64

Cheers,

Andrew


Here's the top part of the new dmesg.boot, FYI:
Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #0 r331768M: Sat Mar 31 10:47:52 AEDT 2018
root@Zen:/usr/obj/usr/src/amd64.amd64/sys/ZEN amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 
6.0.0)
WARNING: WITNESS option enabled, expect reduced performance.
VT(vga): resolution 640x480
CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.45-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  Features=0x178bfbff
  Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD Features2=0x35c233ff
  Structured Extended Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x7
  SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory  = 34359738368 (32768 MB)
avail memory = 33271214080 (31729 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 8 CPUs
FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s)
random: unblocking device.
Firmware Warning (ACPI): Optional FADT field Pm2ControlBlock has valid Length 
but zero Address: 0x/0x1 (20180313/tbfadt-796)
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-55 on motherboard
SMP: AP CPU #7 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #6 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #1 Launched!
Timecounter "TSC-low" frequency 1497224985 Hz quality 1000
random: entropy device external interface
[ath_hal] loaded
module_register_init: MOD_LOAD (vesa, 0x8109f600, 0) error 19
random: registering fast source Intel Secure Key RNG
random: fast provider: "Intel Secure Key RNG"
kbd1 at kbdmux0
netmap: loaded module
nexus0
vtvga0:  on motherboard
cryptosoft0:  on motherboard
aesni0:  on motherboard
acpi0:  on motherboard
acpi0: Power Button (fixed)
cpu0:  on acpi0
cpu1:  on acpi0
cpu2:  on acpi0
cpu3:  on acpi0
cpu4:  on acpi0
cpu5:  on acpi0
cpu6:  on acpi0
cpu7:  on acpi0
attimer0:  port 0x40-0x43 irq 0 on acpi0
Timecounter "i8254" frequency 1193182 Hz quality 0
Event timer "i8254" frequency 1193182 Hz quality 100
atrtc0:  port 0x70-0x71 on acpi0
atrtc0: registered as a time-of-day clock, resolution 1.00s
Event timer "RTC" frequency 32768 Hz quality 0
hpet0:  iomem 0xfed0-0xfed003ff irq 0,8 on acpi0
Timecounter "HPET" frequency 14318180 Hz quality 950
Event timer "HPET" frequency 14318180 Hz quality 350
Event timer "HPET1" frequency 14318180 Hz quality 350
Event timer "HPET2" frequency 14318180 Hz quality 350
Timecounter "ACPI-fast" frequency 3579545 Hz quality 900
acpi_timer0: <32-bit timer at 3.579545MHz> port 0x808-0x80b on acpi0
pcib0:  port 0xcf8-0xcff on acpi0
pci0:  on pcib0
amdsmn0:  on hostb0
amdtemp0:  on hostb0


On Sun, Mar 25, 2018 at 04:35:31AM +, Jonathan Looney wrote:
> For now, you can update through r331485 and then take TCP_BLACKBOX out of
> your kernel config file. That won’t really “fix” anything, but should at
> least get you a booting system (assuming the new code from r331347 is
> really triggering a problem).
> 
> 
> I’ll take another look to see if I missed something in the commit. But, at
> the moment, I’m hard-pressed to see how r331347 would cause the problem you
> describe.
> 
> 
> Jonathan
> 
> On Sat, Mar 24, 2018 at 9:17 PM Andrew Reilly 
> 

Re: 12-Current panics on boot (didn't a week ago.)

2018-03-25 Thread Jonathan Looney
For now, you can update through r331485 and then take TCP_BLACKBOX out of
your kernel config file. That won’t really “fix” anything, but should at
least get you a booting system (assuming the new code from r331347 is
really triggering a problem).


I’ll take another look to see if I missed something in the commit. But, at
the moment, I’m hard-pressed to see how r331347 would cause the problem you
describe.


Jonathan

On Sat, Mar 24, 2018 at 9:17 PM Andrew Reilly 
wrote:

> OK, I've completed the search: r331346 works, r331347 panics
> somewhere in the initialization of random.
>
> In the 331347 change (Add the "TCP Blackbox Recorder") I can't see
> anything obvious to tweak, unfortunately.  It's a fair chunk of new
> code but it's all network-stack related, and my kernel is panicking
> long before any network activity happens.
>
> Any suggestions?
>
> Cheers,
>
> Andrew
>
> On Sat, Mar 24, 2018 at 05:23:18PM -0600, Warner Losh wrote:
> > Thanks Andrew... I can't recreate this on my VM nor my real hardware.
> >
> > Warner
> >
> > On Sat, Mar 24, 2018 at 5:22 PM, Andrew Reilly 
> > wrote:
> >
> > > So, r331464 crashes in the same place, on my system.  r331064 still boots
> > > OK.  I'll keep searching.
> > >
> > > One week ago there was a change to randomdev to poll for signals every so
> > > often, as a defence against very large reads.  That wouldn't have
> > > introduced a race somewhere,
> > > or left things in an unexpected state, perhaps?  That change (r331070) by
> > > cem@ is just a few revisions after the one that is working for me.  I'll
> > > start looking there...
> > >
> > > Cheers,
> > >
> > > Andrew
> > >
> > > On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> > > > Hi Warner,
> > > >
> > > > The breakage was in 331470,  and at least one version earlier, that I
> > > > updated past when it panicked.
> > > >
> > > > I'm guessing that kdb's inability to dump would be down to it not having
> > > > found any disk devices yet, right?  So yes, bisecting to narrow down the
> > > > issue is probably the best bet.  I'll try your r331464: if that works that
> > > > leaves only four or five revisions.  Of course the breakage could be
> > > > hardware specific.
> > > >
> > > > Cheers,
> > > > --
> > > > Andrew
> > >
>


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-25 Thread Herbert J. Skuhra
On Sun, 25 Mar 2018 05:21:10 +0200, Andrew Reilly wrote:
> 
> OK, I've completed the search: r331346 works, r331347 panics
> somewhere in the initialization of random.
> 
> In the 331347 change (Add the "TCP Blackbox Recorder") I can't see
> anything obvious to tweak, unfortunately.  It's a fair chunk of new
> code but it's all network-stack related, and my kernel is panicking
> long before any network activity happens.
> 
> Any suggestions?

Does your system boot if you upgrade to at least r331485 and remove
"options TCP_BLACKBOX" from sys/amd64/conf/GENERIC (if you build and
run GENERIC)?

--
Herbert


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Andrew Reilly
OK, I've completed the search: r331346 works, r331347 panics
somewhere in the initialization of random.

In the 331347 change (Add the "TCP Blackbox Recorder") I can't see
anything obvious to tweak, unfortunately.  It's a fair chunk of new
code but it's all network-stack related, and my kernel is panicking
long before any network activity happens.

Any suggestions?

Cheers,

Andrew

On Sat, Mar 24, 2018 at 05:23:18PM -0600, Warner Losh wrote:
> Thanks Andrew... I can't recreate this on my VM nor my real hardware.
> 
> Warner
> 
> On Sat, Mar 24, 2018 at 5:22 PM, Andrew Reilly 
> wrote:
> 
> > So, r331464 crashes in the same place, on my system.  r331064 still boots
> > OK.  I'll keep searching.
> >
> > One week ago there was a change to randomdev to poll for signals every so
> > often, as a defence against very large reads.  That wouldn't have
> > introduced a race somewhere,
> > or left things in an unexpected state, perhaps?  That change (r331070) by
> > cem@ is just a few revisions after the one that is working for me.  I'll
> > start looking there...
> >
> > Cheers,
> >
> > Andrew
> >
> > On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> > > Hi Warner,
> > >
> > > The breakage was in 331470,  and at least one version earlier, that I
> > > updated past when it panicked.
> > >
> > > I'm guessing that kdb's inability to dump would be down to it not having
> > > found any disk devices yet, right?  So yes, bisecting to narrow down the
> > > issue is probably the best bet.  I'll try your r331464: if that works that
> > > leaves only four or five revisions.  Of course the breakage could be
> > > hardware specific.
> > >
> > > Cheers,
> > > --
> > > Andrew
> >


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Andrew Reilly
So, r331464 crashes in the same place, on my system.  r331064 still boots OK.  
I'll keep searching.

One week ago there was a change to randomdev to poll for signals every so 
often, as a defence against very large reads.  That wouldn't have introduced a 
race somewhere,
or left things in an unexpected state, perhaps?  That change (r331070) by cem@ 
is just a few revisions after the one that is working for me.  I'll start 
looking there...

Cheers,

Andrew

On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> Hi Warner,
> 
> The breakage was in 331470,  and at least one version earlier, that I updated 
> past when it panicked.
> 
> I'm guessing that kdb's inability to dump would be down to it not having 
> found any disk devices yet, right?  So yes, bisecting to narrow down the 
> issue is probably the best bet.  I'll try your r331464: if that works that 
> leaves only four or five revisions.  Of course the breakage could be hardware 
> specific.
> 
> Cheers,
> -- 
> Andrew


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Warner Losh
Thanks Andrew... I can't recreate this on my VM nor my real hardware.

Warner

On Sat, Mar 24, 2018 at 5:22 PM, Andrew Reilly 
wrote:

> So, r331464 crashes in the same place, on my system.  r331064 still boots
> OK.  I'll keep searching.
>
> One week ago there was a change to randomdev to poll for signals every so
> often, as a defence against very large reads.  That wouldn't have
> introduced a race somewhere,
> or left things in an unexpected state, perhaps?  That change (r331070) by
> cem@ is just a few revisions after the one that is working for me.  I'll
> start looking there...
>
> Cheers,
>
> Andrew
>
> On Sun, Mar 25, 2018 at 07:49:17AM +1100, Andrew Reilly wrote:
> > Hi Warner,
> >
> > The breakage was in 331470,  and at least one version earlier, that I
> > updated past when it panicked.
> >
> > I'm guessing that kdb's inability to dump would be down to it not having
> > found any disk devices yet, right?  So yes, bisecting to narrow down the
> > issue is probably the best bet.  I'll try your r331464: if that works that
> > leaves only four or five revisions.  Of course the breakage could be
> > hardware specific.
> >
> > Cheers,
> > --
> > Andrew
>


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Andrew Reilly
Hi Warner,

The breakage was in 331470,  and at least one version earlier, that I updated 
past when it panicked.

I'm guessing that kdb's inability to dump would be down to it not having found 
any disk devices yet, right?  So yes, bisecting to narrow down the issue is 
probably the best bet.  I'll try your r331464: if that works that leaves only 
four or five revisions.  Of course the breakage could be hardware specific.

Cheers,
-- 
Andrew

On 25 March 2018 1:14:40 am AEDT, Warner Losh  wrote:
>Also, what rev failed? I booted r331464 last night w/o issue.
>
>Warner
>
>On Fri, Mar 23, 2018 at 9:56 PM, Andrew Reilly 
>wrote:
>
>> Hi all,
>>
>> For reasons that still escape me, I haven't been able to get a kernel
>dump
>> to debug, sorry.
>>
>> Just thought that I'd generate a fairly low-quality report, to see if
>> anyone has some ideas.
>>
>> The last kernel that I have that booted OK (and I'm now running) is:
>> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M:
>Sat
>> Mar 17 07:54:51 AEDT 2018
>root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
>> amd64
>>
>> The machine is a:
>> CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.46-MHz
>K8-class
>> CPU)
>>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1 
>Stepping=1
>>   Features=0x178bfbff> APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>>
>> Kernels built from head as of a couple of hours ago get through
>launching
>> the other CPUs and then stops somewhere in random, apparently:
>>
>> SMP: AP CPU #2 Launched!
>> Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
>> random: entpanic: mtx_lock() of spin mutex (null) @
>> /usr/src/sys/kern/subr_bus.c:617
>> cpuid = 0
>> time = 1
>> KDB: stack backtrace:
>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
>> 0xfe4507a0
>> vpanic() at vpanic+0x18d/frame 0xfe450800
>> doadump () at doadump/frame 0xfe450880
>> __mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfe4508d0
>> devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame
>0xfe450900
>> g_dev_taste() at g_dev_taste+0x370/frame 0xfe450a10
>> g_new_provider_event() at g_new_provider_event+0xfa/frame
>> 0xfe450a30
>> g_run_events() at g_run_events+0x151/frame 0xfe450a70
>> fork_exit() at fork_exit+0x84/frame 0xfe450ab0
>> fork_trampoline() at fork_trampoline+0xe/frame 0xfe450ab0
>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>> KDB: enter: panic
>> [ thread pid 14 tid 100052 ]
>> Stopped at kdb_enter+0x3b: movq$0,kdb_why
>> db> dump
>> Cannot dump: no dump device specified.
>> db>
>>
>> Now dumping worked fine the last time the kernel panicked: I have
>> dumpdev=AUTO in rc.conf and I have swap on nvd0p3 (first) and
>> /dev/zvol/root/swap
>> (second, larger than the first.)
>>
>> Root on the nvd0p2 is ZFS, and ther's a four-drive raidZ with user
>> directories and what-not on them, and another ZFS on an external USB
>drive
>> that I use
>> for backups, unmounted.
>>
>> In the new kernels, we clearly aren't even getting as far as finding
>the
>> hubs and controllers, let alone the drives.
>>
>> I've attached dmesg.boot from the last boot from last week's good
>kernel.
>> (While briefly in yoyo mode I turned the SMT back on, so now there
>are 16
>> cores
>> instead of the eight mentioned in the crash dump.  Didn't help, but I
>> haven't turned it back off yet.)
>>
>> Cheers,
>>
>> Andrew
>>
>>
>> ___
>> freebsd-current@freebsd.org mailing list
>> https://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to
>"freebsd-current-unsubscr...@freebsd.org"
>>
>>


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Warner Losh
Also, what rev failed? I booted r331464 last night w/o issue.

Warner

On Fri, Mar 23, 2018 at 9:56 PM, Andrew Reilly 
wrote:

> Hi all,
>
> For reasons that still escape me, I haven't been able to get a kernel dump
> to debug, sorry.
>
> Just thought that I'd generate a fairly low-quality report, to see if
> anyone has some ideas.
>
> The last kernel that I have that booted OK (and I'm now running) is:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M: Sat
> Mar 17 07:54:51 AEDT 2018 
> root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
> amd64
>
> The machine is a:
> CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.46-MHz K8-class
> CPU)
>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>   Features=0x178bfbff APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>
> Kernels built from head as of a couple of hours ago get through launching
> the other CPUs and then stop somewhere in random, apparently:
>
> SMP: AP CPU #2 Launched!
> Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
> random: entpanic: mtx_lock() of spin mutex (null) @
> /usr/src/sys/kern/subr_bus.c:617
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe4507a0
> vpanic() at vpanic+0x18d/frame 0xfe450800
> doadump () at doadump/frame 0xfe450880
> __mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfe4508d0
> devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfe450900
> g_dev_taste() at g_dev_taste+0x370/frame 0xfe450a10
> g_new_provider_event() at g_new_provider_event+0xfa/frame
> 0xfe450a30
> g_run_events() at g_run_events+0x151/frame 0xfe450a70
> fork_exit() at fork_exit+0x84/frame 0xfe450ab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe450ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 14 tid 100052 ]
> Stopped at kdb_enter+0x3b: movq$0,kdb_why
> db> dump
> Cannot dump: no dump device specified.
> db>
>
> Now dumping worked fine the last time the kernel panicked: I have
> dumpdev=AUTO in rc.conf and I have swap on nvd0p3 (first) and
> /dev/zvol/root/swap
> (second, larger than the first.)
>
> Root on the nvd0p2 is ZFS, and there's a four-drive raidZ with user
> directories and what-not on them, and another ZFS on an external USB drive
> that I use
> for backups, unmounted.
>
> In the new kernels, we clearly aren't even getting as far as finding the
> hubs and controllers, let alone the drives.
>
> I've attached dmesg.boot from the last boot from last week's good kernel.
> (While briefly in yoyo mode I turned the SMT back on, so now there are 16
> cores
> instead of the eight mentioned in the crash dump.  Didn't help, but I
> haven't turned it back off yet.)
>
> Cheers,
>
> Andrew
>
>
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>
>


Re: 12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Warner Losh
That lock has been there for a long, long time  (like 5 or 6 major
releases)... It's surprising that it's causing issues now.

Can you bisect versions to find when this starts happening?

Warner
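
For reference, bisecting here means repeatedly building and booting the
revision halfway between the last kernel known to boot and the first one known
to panic. A minimal sketch of that search in Python, under stated assumptions:
the function name first_bad_revision and the simulated_boot stand-in are
hypothetical (in practice each probe is a manual build and boot), the endpoints
r331064 (boots) and r331470 (panics) are the ones reported in this thread, and
the stand-in encodes the result reported elsewhere in the thread (r331346
works, r331347 panics).

```python
from typing import Callable, Tuple


def first_bad_revision(good: int, bad: int,
                       boots: Callable[[int], bool]) -> Tuple[int, int]:
    """Binary-search for the first revision that fails to boot.

    `good` must be a revision known to boot and `bad` a later revision known
    to panic. Returns (first failing revision, number of kernels tested).
    """
    if good >= bad:
        raise ValueError("good revision must be older than bad revision")
    tested = 0
    while bad - good > 1:
        mid = (good + bad) // 2
        tested += 1
        if boots(mid):
            good = mid   # midpoint boots: the breakage is later
        else:
            bad = mid    # midpoint panics: the breakage is here or earlier
    return bad, tested


def simulated_boot(rev: int) -> bool:
    # Hypothetical stand-in for "build this revision and see if it boots".
    # It assumes every revision before r331347 boots and every later one panics.
    return rev < 331347


first_bad, tested = first_bad_revision(331064, 331470, simulated_boot)
print("first bad revision: r%d after %d test boots" % (first_bad, tested))
```

With roughly 400 revisions between the endpoints, halving the range each time
needs at most nine build-and-boot cycles, which is why bisecting is practical
even when every probe is a full kernel build. The sketch assumes a single
transition from working to broken; an intermittent failure would need repeated
boots per revision.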

On Fri, Mar 23, 2018 at 9:56 PM, Andrew Reilly 
wrote:

> Hi all,
>
> For reasons that still escape me, I haven't been able to get a kernel dump
> to debug, sorry.
>
> Just thought that I'd generate a fairly low-quality report, to see if
> anyone has some ideas.
>
> The last kernel that I have that booted OK (and I'm now running) is:
> FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M: Sat
> Mar 17 07:54:51 AEDT 2018 
> root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC
> amd64
>
> The machine is a:
> CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.46-MHz K8-class
> CPU)
>   Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
>   Features=0x178bfbff APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
>
> Kernels built from head as of a couple of hours ago get through launching
> the other CPUs and then stop somewhere in random, apparently:
>
> SMP: AP CPU #2 Launched!
> Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
> random: entpanic: mtx_lock() of spin mutex (null) @
> /usr/src/sys/kern/subr_bus.c:617
> cpuid = 0
> time = 1
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe4507a0
> vpanic() at vpanic+0x18d/frame 0xfe450800
> doadump () at doadump/frame 0xfe450880
> __mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfe4508d0
> devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfe450900
> g_dev_taste() at g_dev_taste+0x370/frame 0xfe450a10
> g_new_provider_event() at g_new_provider_event+0xfa/frame
> 0xfe450a30
> g_run_events() at g_run_events+0x151/frame 0xfe450a70
> fork_exit() at fork_exit+0x84/frame 0xfe450ab0
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe450ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 14 tid 100052 ]
> Stopped at kdb_enter+0x3b: movq$0,kdb_why
> db> dump
> Cannot dump: no dump device specified.
> db>
>
> Now dumping worked fine the last time the kernel panicked: I have
> dumpdev=AUTO in rc.conf and I have swap on nvd0p3 (first) and
> /dev/zvol/root/swap
> (second, larger than the first.)
>
> Root on the nvd0p2 is ZFS, and there's a four-drive raidZ with user
> directories and what-not on them, and another ZFS on an external USB drive
> that I use
> for backups, unmounted.
>
> In the new kernels, we clearly aren't even getting as far as finding the
> hubs and controllers, let alone the drives.
>
> I've attached dmesg.boot from the last boot from last week's good kernel.
> (While briefly in yoyo mode I turned the SMT back on, so now there are 16
> cores
> instead of the eight mentioned in the crash dump.  Didn't help, but I
> haven't turned it back off yet.)
>
> Cheers,
>
> Andrew
>
>
> ___
> freebsd-current@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
>
>


12-Current panics on boot (didn't a week ago.)

2018-03-24 Thread Andrew Reilly
Hi all,

For reasons that still escape me, I haven't been able to get a kernel dump to 
debug, sorry.

Just thought that I'd generate a fairly low-quality report, to see if anyone 
has some ideas.

The last kernel that I have that booted OK (and I'm now running) is:
FreeBSD Zen.ac-r.nu 12.0-CURRENT FreeBSD 12.0-CURRENT #1 r331064M: Sat Mar 17 
07:54:51 AEDT 2018 root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC  amd64

The machine is a:
CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.46-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  Features=0x178bfbff

Kernels built from head as of a couple of hours ago get through launching the 
other CPUs and then stop somewhere in random, apparently:

SMP: AP CPU #2 Launched!
Timecounter "TSC-low" frequency 1497223020 Hz quality 1000
random: entpanic: mtx_lock() of spin mutex (null) @ 
/usr/src/sys/kern/subr_bus.c:617
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe4507a0
vpanic() at vpanic+0x18d/frame 0xfe450800
doadump () at doadump/frame 0xfe450880
__mtx_lock_flags() at __mtx_lock_flags+0x163/frame 0xfe4508d0
devctl_queue_data_f() at devctl_queue_data_f+0x6a/frame 0xfe450900
g_dev_taste() at g_dev_taste+0x370/frame 0xfe450a10
g_new_provider_event() at g_new_provider_event+0xfa/frame 0xfe450a30
g_run_events() at g_run_events+0x151/frame 0xfe450a70
fork_exit() at fork_exit+0x84/frame 0xfe450ab0
fork_trampoline() at fork_trampoline+0xe/frame 0xfe450ab0
--- trap 0, rip = 0, rsp = 0, rbp = 0 ---
KDB: enter: panic
[ thread pid 14 tid 100052 ]
Stopped at      kdb_enter+0x3b: movq    $0,kdb_why
db> dump
Cannot dump: no dump device specified.
db> 

Now dumping worked fine the last time the kernel panicked: I have dumpdev=AUTO 
in rc.conf and I have swap on nvd0p3 (first) and /dev/zvol/root/swap
(second, larger than the first.)

Root on the nvd0p2 is ZFS, and there's a four-drive raidZ with user directories 
and what-not on them, and another ZFS on an external USB drive that I use
for backups, unmounted.

In the new kernels, we clearly aren't even getting as far as finding the hubs 
and controllers, let alone the drives.

I've attached dmesg.boot from the last boot from last week's good kernel.  
(While briefly in yoyo mode I turned the SMT back on, so now there are 16 cores
instead of the eight mentioned in the crash dump.  Didn't help, but I haven't 
turned it back off yet.)

Cheers,

Andrew

Copyright (c) 1992-2018 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 12.0-CURRENT #1 r331064M: Sat Mar 17 07:54:51 AEDT 2018
root@Zen:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
FreeBSD clang version 6.0.0 (tags/RELEASE_600/final 326565) (based on LLVM 
6.0.0)
WARNING: WITNESS option enabled, expect reduced performance.
VT(vga): resolution 640x480
CPU: AMD Ryzen 7 1700 Eight-Core Processor   (2994.46-MHz K8-class CPU)
  Origin="AuthenticAMD"  Id=0x800f11  Family=0x17  Model=0x1  Stepping=1
  
Features=0x178bfbff
  
Features2=0x7ed8320b
  AMD Features=0x2e500800
  AMD 
Features2=0x35c233ff
  Structured Extended 
Features=0x209c01a9
  XSAVE Features=0xf
  AMD Extended Feature Extensions ID EBX=0x7
  SVM: (disabled in BIOS) NP,NRIP,VClean,AFlush,DAssist,NAsids=32768
  TSC: P-state invariant, performance statistics
real memory  = 34359738368 (32768 MB)
avail memory = 33272578048 (31731 MB)
Event timer "LAPIC" quality 600
ACPI APIC Table: 
FreeBSD/SMP: Multiprocessor System Detected: 16 CPUs
FreeBSD/SMP: 1 package(s) x 2 cache groups x 4 core(s) x 2 hardware threads
random: unblocking device.
Firmware Warning (ACPI): Optional FADT field Pm2ControlBlock has valid Length 
but zero Address: 0x/0x1 (20180313/tbfadt-796)
ioapic0: Changing APIC ID to 17
ioapic1: Changing APIC ID to 18
ioapic0  irqs 0-23 on motherboard
ioapic1  irqs 24-55 on motherboard
SMP: AP CPU #12 Launched!
SMP: AP CPU #5 Launched!
SMP: AP CPU #9 Launched!
SMP: AP CPU #13 Launched!
SMP: AP CPU #3 Launched!
SMP: AP CPU #1 Launched!
SMP: AP CPU #2 Launched!
SMP: AP CPU #8 Launched!
SMP: AP CPU #15 Launched!
SMP: AP CPU #4 Launched!
SMP: AP CPU #7 Launched!
SMP: 

Re: current panics

2003-11-07 Thread Soren Schmidt
It seems Kirill Ponomarew wrote:
> I got a panic during boot:
>
> ioapic0: Changing APIC ID to 2
> ioapic0 Version 0.3 irqs 0-23 on motherboard
> panic: Can't find ExtINT pin to route through!
> cpuid=0;
>
> Is it a known problem?

Dunno, but I get it as well when I set interrupt mode to APIC in
the BIOS, if I choose PIC it works (but without an APIC of course).

-Søren


Re: current panics

2003-11-07 Thread Kirill Ponomarew
Hi,

On Thu, Nov 06, 2003 at 01:42:30PM +0100, Soren Schmidt wrote:
 
> Dunno, but I get it as well when I set interrupt mode to APIC in
> the BIOS, if I choose PIC it works (but without an APIC of course).

I had the same situation as you. But jhb@ fixed it already.

-Kirill




current panics

2003-11-06 Thread Kirill Ponomarew
Hi,

I got a panic during boot:

ioapic0: Changing APIC ID to 2
ioapic0 Version 0.3 irqs 0-23 on motherboard
panic: Can't find ExtINT pin to route through!
cpuid=0;

Is it a known problem?

-Kirill




RE: current panics

2003-11-06 Thread John Baldwin

On 06-Nov-2003 Kirill Ponomarew wrote:
> Hi,
>
> I got a panic during boot:
>
> ioapic0: Changing APIC ID to 2
> ioapic0 Version 0.3 irqs 0-23 on motherboard
> panic: Can't find ExtINT pin to route through!
> cpuid=0;
>
> Is it a known problem?

No, can you provide a boot -v dmesg?

-- 

John Baldwin [EMAIL PROTECTED]http://www.FreeBSD.org/~jhb/
Power Users Use the Power to Serve!  -  http://www.FreeBSD.org/


Re: current panics

2003-11-06 Thread Kirill Ponomarew
Hi,

On Thu, Nov 06, 2003 at 11:34:06AM -0500, John Baldwin wrote:
 
> > Is it a known problem?
>
> No, can you provide a boot -v dmesg?

You fixed it with your last commit to src/sys/i386/acpica/madt.c,v 1.4.
No panic now ;-)

-Kirill




Four -CURRENT panics (backtrace included)

2003-09-04 Thread Xin LI
Hello everyone,

I have encountered several panics in recent kernels. The kernel was compiled right after 
cvsup, so the date will apply to the source code.

panic 1:

beastie# gdb -k kernel.debug /var/crash/vmcore.0
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-undermydesk-freebsd...
panic: sent too much
panic messages:
---
---
Reading symbols from 
/usr/obj/usr/src/sys/BEASTIE/modules/usr/src/sys/modules/acpi/acpi.ko.debug...done.
Loaded symbols for 
/usr/obj/usr/src/sys/BEASTIE/modules/usr/src/sys/modules/acpi/acpi.ko.debug
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
240 dumping++;
(kgdb) where
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
#1  0xc019b6ef in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372
#2  0xc019ba77 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
#3  0xc022c240 in tcp_input (m=0xc0a3f300, off0=20) at 
/usr/src/sys/netinet/tcp_input.c:2310
#4  0xc0221893 in ip_input (m=0xc0a3f300) at /usr/src/sys/netinet/ip_input.c:950
#5  0xc020cca2 in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:236
#6  0xc0189692 in ithread_loop (arg=0xc09f9c00) at /usr/src/sys/kern/kern_intr.c:534
#7  0xc01886bf in fork_exit (callout=0xc0189510 ithread_loop, arg=0x0, frame=0x0)
at /usr/src/sys/kern/kern_fork.c:796
(kgdb) bt full
#0  doadump () at /usr/src/sys/kern/kern_shutdown.c:240
No locals.
#1  0xc019b6ef in boot (howto=256) at /usr/src/sys/kern/kern_shutdown.c:372
No locals.
#2  0xc019ba77 in panic () at /usr/src/sys/kern/kern_shutdown.c:550
td = (struct thread *) 0xc0a07ab0
bootopt = 256
newpanic = 1
ap = 0xc6242b20 Hf
buf = sent too much, '\0' repeats 242 times
#3  0xc022c240 in tcp_input (m=0xc0a3f300, off0=20) at 
/usr/src/sys/netinet/tcp_input.c:2310
th = (struct tcphdr *) 0xc0c3f834
ip = (struct ip *) 0xc0c3f820
ipov = (struct ipovly *) 0x16bc
inp = (struct inpcb *) 0xc176aab0
optp = (u_char *) 0xc0c3f848 \001\001\b\n
optlen = 12
len = -1460824075
tlen = 0
off = -1460824075
drop_hdrlen = 52
tp = (struct tcpcb *) 0xc189a858
thflags = 5820
so = (struct socket *) 0xc1888700
todrop = -1460824075
acked = -1460824075
ourfinisacked = -1460824075
needoutput = 0
tiwin = 57600
to = {to_flags = 1, to_tsval = 517689, to_tsecr = 169603, to_cc = 0, to_ccecho 
= 0, to_mss = 0, 
  to_requested_s_scale = 0 '\0', to_pad = 0 '\0'}
taop = (struct rmxp_tao *) 0xa8ed97f5
tao_noncached = {tao_cc = 3324259320, tao_ccsent = 3324259328, tao_mssopt = 
10971}
headlocked = 1
next_hop = (struct sockaddr_in *) 0x0
rstreason = -1460824075
#4  0xc0221893 in ip_input (m=0xc0a3f300) at /usr/src/sys/netinet/ip_input.c:950
ip = (struct ip *) 0xc0c3f820
fp = (struct ipq *) 0xc02efe4d
ia = (struct in_ifaddr *) 0xc1578c00
ifa = (struct ifaddr *) 0x0
i = 0
hlen = 20
checkif = 1
sum = 0
pkt_dst = {s_addr = 126970842}
divert_info = 0
args = {m = 0xc0191c00, oif = 0x0, next_hop = 0x0, rule = 0x0, eh = 0x0, ro = 
0xe000, 
  dst = 0xc0369ff4, flags = 233, f_id = {dst_ip = 3224305229, src_ip = 3324259512, 
dst_port = 7120, 
src_port = 49177, proto = 244 '?, flags = 159 '\237'}, divert_rule = 0, retval = 
3224267492}
#5  0xc020cca2 in swi_net (dummy=0x0) at /usr/src/sys/net/netisr.c:236
ni = (struct netisr *) 0xc0367030
m = (struct mbuf *) 0xc0a3f300
bits = 2147483648
i = 0
#6  0xc0189692 in ithread_loop (arg=0xc09f9c00) at /usr/src/sys/kern/kern_intr.c:534
ithd = (struct ithd *) 0xc09f9c00
ih = (struct intrhand *) 0xc09fdd80
td = (struct thread *) 0xc0a07ab0
p = (struct proc *) 0xc0a06790
#7  0xc01886bf in fork_exit (callout=0xc0189510 ithread_loop, arg=0x0, frame=0x0)
at /usr/src/sys/kern/kern_fork.c:796
p = (struct proc *) 0xc0a06790
td = (struct thread *) 0xc0a07ab0


Panic #2:

beastie# gdb -k kernel.debug /var/crash/vmcore.1
GNU gdb 5.2.1 (FreeBSD)
Copyright 2002 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.

-current panics of today

2002-01-09 Thread Michael Reifenberger

Hi,
attached are three backtraces (sorry, no matching kernel.debug for them)
of some panics of today.
The first was during a copy operation from CDROM to /tmp (md disk).
The next were during background fsck-ing after the first dump...

Bye!

Michael Reifenberger
^.*Plaut.*$, IT, R/3 Basis, GPS


GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-unknown-freebsd...
(no debugging symbols found)...
IdlePTD at phsyical address 0x003d
initial pcb at physical address 0x00279840
panicstr: bwrite: buffer is not busy???
panic messages:
---
panic: vm_page_free: freeing wired page


syncing disks... panic: bwrite: buffer is not busy???
Uptime: 2h14m53s

dumping to dev ad0s3b, offset 3933440
dump ata0: resetting devices .. done
127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 
106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 
80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 
51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 
22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
---
#0  0xc0164022 in dumpsys ()
(kgdb) #0  0xc0164022 in dumpsys ()
#1  0xc0163e08 in boot ()
#2  0xc0164257 in panic ()
#3  0xc01928d7 in bwrite ()
#4  0xc0192e8e in bawrite ()
#5  0xc01424c6 in spec_fsync ()
#6  0xc0141f8d in spec_vnoperate ()
#7  0xc01d6f60 in ffs_sync ()
#8  0xc019f6f6 in sync ()
#9  0xc0163a69 in boot ()
#10 0xc0164257 in panic ()
#11 0xc01ea886 in vm_page_free_toq ()
#12 0xc01ea2b9 in vm_page_free ()
#13 0xc01e1bd9 in vm_fault1 ()
#14 0xc01e1651 in vm_fault ()
#15 0xc021236f in trap_pfault ()
#16 0xc0211e8f in trap ()
#17 0xc0210b96 in generic_copyin ()
#18 0xc01d7f46 in ffs_write ()
#19 0xc01a532f in vn_write ()
#20 0xc017e1bb in dofilewrite ()
#21 0xc017dfb9 in write ()
#22 0xc0212be8 in syscall ()
#23 0xc020675d in syscall_with_err_pushed ()
#24 0x80489ed in ?? ()
#25 0x804857e in ?? ()
#26 0x8048135 in ?? ()
(kgdb) 

GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type show copying to see the conditions.
There is absolutely no warranty for GDB.  Type show warranty for details.
This GDB was configured as i386-unknown-freebsd...
(no debugging symbols found)...
IdlePTD at phsyical address 0x003d
initial pcb at physical address 0x00279840
panicstr: bremfree: removing a buffer not on a queue
panic messages:
---
Fatal trap 12: page fault while in kernel mode
fault virtual address   = 0x100
fault code  = supervisor read, page not present
instruction pointer = 0x8:0xc01d24b0
stack pointer   = 0x10:0xc9ffd894
frame pointer   = 0x10:0xc9ffd8a4
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= interrupt enabled, resume, IOPL = 0
current process = 416 (fsck_ufs)
trap number = 12
panic: page fault

syncing disks... panic: bremfree: removing a buffer not on a queue
Uptime: 2m39s
pfs_vncache_unload(): 1 entries remaining

dumping to dev ad0s3b, offset 3933440
dump ata0: resetting devices .. done
127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 
106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 
80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 
51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 
22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
---
#0  0xc0164022 in dumpsys ()
(kgdb) #0  0xc0164022 in dumpsys ()
#1  0xc0163e08 in boot ()
#2  0xc0164257 in panic ()
#3  0xc0192626 in bremfree ()
#4  0xc0193b67 in vfs_bio_awrite ()
#5  0xc01d8734 in ffs_fsync ()
#6  0xc01d6d46 in ffs_sync ()
#7  0xc019f6f6 in sync ()
#8  0xc0163a69 in boot ()
#9  0xc0164257 in panic ()
#10 0xc021275c in trap_fatal ()
#11 0xc0212485 in trap_pfault ()
#12 0xc0211e8f in trap ()
#13 0xc01d24b0 in softdep_disk_io_initiation ()
#14 0xc014259d in spec_strategy ()
#15 0xc0141f8d in spec_vnoperate ()
#16 0xc01de555 in ufs_strategy ()
#17 0xc01dee29 in ufs_vnoperate ()
#18 0xc0192af8 in bwrite ()
#19 0xc0192e8e in bawrite ()
#20 0xc01cd652 in cgaccount ()
#21 0xc01cca89 in ffs_snapshot ()
#22 0xc01d5682 in ffs_mount ()
#23 0xc019ee0b in vfs_mount ()
#24 0xc019e7a7 in mount ()
#25 0xc0212be8 in syscall ()
#26 0xc020675d in 

Re: 5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-13 Thread Tom Uffner

John Baldwin wrote:
> On 05-Mar-01 Tom Uffner wrote:
> > John Baldwin wrote:
> > > On 04-Mar-01 Tom Uffner wrote:
> > > > all of the snapshots since the 24th have exhibited this same or
> > > > very similar behavior.
> > > Does it happen for snapshots before the 24th?
> > no, it does not, at least not for the 5.0-20010210-CURRENT snap.
>
> Can you try cvsupping the src/sys tree one day at a time to see what day
> the kernel starts breaking for you?

ok, now i'm really confused. i built GENERIC kernels for every day
from 2/10 to 2/25 and none of them panic. the snaps for the 24th
and 25th both panic during boot (only on my Vaio, not my other test systems)

could something outside the kernel have changed that made a difference?
or could it be from building with different options? my /etc/make.conf has 
"COPTFLAGS= -O -pipe -march=i686". i presume that the snaps were built
without any optimization, could this make a difference?

-- 
Tom Uffner   [EMAIL PROTECTED]

Give a man a fish and you feed him for a day;
 teach him to use the Net and he won't bother you for weeks.

To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: 5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-05 Thread Maxim Sobolev

Tom Uffner wrote:

> all of the snapshots since the 24th have exhibited this same or
> very similar behavior.
>
> when booting from the kern & mfsroot floppies i get:
>
> .
> .
> .
> unknown: PNP0e03 can't assign resources
> unknown: PNP0700 can't assign resources
> unknown: PNP0501 can't assign resources
> unknown: PNP0401 can't assign resources
> pccard: card inserted, slot 0
> kernel trap 9 with interrupts disabled
>
> Fatal trap 9: general protection fault while in kernel mode
> instruction pointer = 0x8:0xc02e3858
> stack pointer   = 0x10:0xc78c8f50
> frame pointer   = 0x10:0xc78c8f64
> code segment= base 0x0, limit 0xf, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags= resume, IOPL = 0
> current process = 19 (irq9: uhci0)
> trap number = 9
> panic: general protection fault

Looks like another `ltr %si' panic.

-Maxim





5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-04 Thread Tom Uffner

all of the snapshots since the 24th have exhibited this same or
very similar behavior. 

when booting from the kern & mfsroot floppies i get:

.
.
.
unknown: PNP0e03 can't assign resources
unknown: PNP0700 can't assign resources
unknown: PNP0501 can't assign resources
unknown: PNP0401 can't assign resources
pccard: card inserted, slot 0
kernel trap 9 with interrupts disabled


Fatal trap 9: general protection fault while in kernel mode
instruction pointer = 0x8:0xc02e3858
stack pointer   = 0x10:0xc78c8f50
frame pointer   = 0x10:0xc78c8f64
code segment= base 0x0, limit 0xf, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags= resume, IOPL = 0
current process = 19 (irq9: uhci0)
trap number = 9
panic: general protection fault


i only get this on my vaio (PCG-XG9); several other pc's i
tried these boot floppies on boot and run sysinstall just fine.
none of my other test boxes have USB though.




RE: 5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-04 Thread John Baldwin


On 04-Mar-01 Tom Uffner wrote:
> all of the snapshots since the 24th have exhibited this same or
> very similar behavior.

Does it happen for snapshots before the 24th?

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




Re: 5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-04 Thread Tom Uffner

John Baldwin wrote:
 
> On 04-Mar-01 Tom Uffner wrote:
> > all of the snapshots since the 24th have exhibited this same or
> > very similar behavior.
>
> Does it happen for snapshots before the 24th?
 
no, it does not, at least not for the 5.0-20010210-CURRENT snap.
it boots from the floppies and once installed, from the disk. 

oh well, so much for the idea that it would be easier to get past 
the libc change by installing a snapshot...

Copyright (c) 1992-2001 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.0-20010210-CURRENT #0: Sat Feb 10 16:04:54 GMT 2001
[EMAIL PROTECTED]:/usr/src/sys/compile/GENERIC
Timecounter "i8254"  frequency 1193182 Hz
Timecounter "TSC"  frequency 496311413 Hz
CPU: Pentium III/Pentium III Xeon/Celeron (496.31-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0x681  Stepping = 1
  Features=0x383f9ff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,MMX,FXSR,SSE>
real memory  = 134152192 (131008K bytes)
avail memory = 125415424 (122476K bytes)
Preloaded elf kernel "kernel" at 0xc0506000.
Pentium Pro MTRR support enabled
WARNING: Driver mistake: destroy_dev on 154/0
Using $PIR table, 8 entries at 0xc00fdf40
npx0: math processor on motherboard
npx0: INT 16 interface
pcib0: Intel 82443BX (440 BX) host to PCI bridge at pcibus 0 on motherboard
pci0: PCI bus on pcib0
pcib1: PCI-PCI bridge at device 1.0 on pci0
pci1: PCI bus on pcib1
pci1: display, VGA at 0.0 (no driver attached)
isab0: PCI-ISA bridge at device 7.0 on pci0
isa0: ISA bus on isab0
atapci0: Intel PIIX4 ATA33 controller port 0xfc90-0xfc9f at device 7.1 on pci0
ata0: at 0x1f0 irq 14 on atapci0
ata1: at 0x170 irq 15 on atapci0
uhci0: Intel 82371AB/EB (PIIX4) USB controller port 0xfca0-0xfcbf irq 9 at device 7.2 on pci0
usb0: Intel 82371AB/EB (PIIX4) USB controller on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
pci0: bridge, PCI-unknown at 7.3 (no driver attached)
pci0: serial bus, FireWire at 8.0 (no driver attached)
pcm0: Yamaha DS-1E (YMF744) port 0xfc8c-0xfc8f,0xfcc0-0xfcff mem 0xfedf8000-0xfedf irq 9 at device 9.0 on pci0
pci0: simple comms at 10.0 (no driver attached)
pcic-pci0: Ricoh RL5C478 PCI-CardBus Bridge at device 12.0 on pci0
pcic-pci1: Ricoh RL5C478 PCI-CardBus Bridge at device 12.1 on pci0
atkbdc0: Keyboard controller (i8042) at port 0x60,0x64 on isa0
atkbd0: AT Keyboard flags 0x1 irq 1 on atkbdc0
kbd0 at atkbd0
psm0: PS/2 Mouse irq 12 on atkbdc0
psm0: model Generic PS/2 mouse, device ID 0
fdc0: NEC 72065B or clone at port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on isa0
fdc0: FIFO enabled, 8 bytes threshold
fd0: 1440-KB 3.5" drive on fdc0 drive 0
pcic0: Intel i82365 at port 0x3e0 iomem 0xd on isa0
pcic0: Polling mode
pccard0: PC Card bus -- kludge version on pcic0
pccard1: PC Card bus -- kludge version on pcic0
pmtimer0 on isa0
ppc0: Parallel port at port 0x378-0x37f irq 7 on isa0
ppc0: Generic chipset (ECP/PS2/NIBBLE) in COMPATIBLE mode
ppc0: FIFO with 16/16/8 bytes threshold
plip0: PLIP network interface on ppbus0
lpt0: Printer on ppbus0
lpt0: Interrupt-driven port
ppi0: Parallel I/O on ppbus0
sc0: System console at flags 0x100 on isa0
sc0: VGA 16 virtual consoles, flags=0x300
sio0 at port 0x3f8-0x3ff irq 4 flags 0x10 on isa0
sio0: type 16550A
sio1: configured irq 3 not in bitmap of probed irqs 0
sn0: ioaddr is 0x300
sn0: test1 failed
vga0: Generic ISA VGA at port 0x3c0-0x3df iomem 0xa-0xb on isa0
unknown: PNP0303 can't assign resources
unknown: PNP0e03 can't assign resources
unknown: PNP0700 can't assign resources
unknown: PNP0501 can't assign resources
unknown: PNP0401 can't assign resources
pccard: card inserted, slot 0
ata1-slave: ata_command: timeout waiting for intr
ata1-slave: identify failed
ad0: 17301MB IBM-DARA-218000 [35152/16/63] at ata0-master UDMA33
acd0: DVD-ROM TOSHIBA DVD-ROM SD-C2202 at ata1-master using PIO4
Mounting root from ufs:/dev/ad0s2a
WARNING: / was not properly dismounted
lp0: IPv6 not supported
ed1 at port 0x300-0x31f irq 3 slot 0 on pccard0
ed1: address 00:e0:98:70:10:ee, type Linksys (16 bit)




Re: 5.0-20010304-CURRENT panics during boot on Sony Vaio

2001-03-04 Thread John Baldwin


On 05-Mar-01 Tom Uffner wrote:
> John Baldwin wrote:
> >
> > On 04-Mar-01 Tom Uffner wrote:
> > > all of the snapshots since the 24th have exhibited this same or
> > > very similar behavior.
> >
> > Does it happen for snapshots before the 24th?
> >
> no, it does not, at least not for the 5.0-20010210-CURRENT snap.
> it boots from the floppies and once installed, from the disk.
>
> oh well, so much for the idea that it would be easier to get past
> the libc change by installing a snapshot...

Can you try cvsupping the src/sys tree one day at a time to see what day the
kernel starts breaking for you?
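
If stepping one day at a time gets tedious, the same search can be halved at each step. The following is only a sketch under stated assumptions: sketch_first_bad_day(), sketch_boots_fn and the example predicate are hypothetical names, days are plain integers, and the breakage is assumed to persist once it has been introduced (if it comes and goes, bisection can point at the wrong day).

```c
#include <stdbool.h>

/*
 * Hypothetical callback: returns true if the kernel built from the
 * sources of day `day` boots without panicking.
 */
typedef bool (*sketch_boots_fn)(int day);

/*
 * Return the first failing day in (good, bad], assuming the kernel of
 * day `good` boots, the kernel of day `bad` panics, and every day after
 * the first bad one is bad as well.
 */
static int
sketch_first_bad_day(int good, int bad, sketch_boots_fn boots)
{
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		if (boots(mid))
			good = mid;	/* breakage was introduced later */
		else
			bad = mid;	/* breakage is at or before mid */
	}
	return (bad);
}

/* Illustrative predicate only: pretend everything before day 20 boots. */
static bool
sketch_example_boots(int day)
{
	return (day < 20);
}
```

In practice the callback would mean "check out src/sys as of that day, build GENERIC, and try to boot it", so each call is expensive and halving the range saves real time.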

-- 

John Baldwin [EMAIL PROTECTED] -- http://www.FreeBSD.org/~jhb/
PGP Key: http://www.baldwin.cx/~john/pgpkey.asc
"Power Users Use the Power to Serve!"  -  http://www.FreeBSD.org/




current panics in mount(2)

2001-01-22 Thread Bruce Evans

My nfs server now always panics when it attempts to export ufs
filesystems.  This is caused by my mount(8) being slightly out of
date.  This shouldn't be a problem, but `struct export_args' contains
a `struct ucred' which contains a `struct mtx', so when `struct mtx'
shrunk by 1 pointer yesterday, the out of date mount(8) started
supplying garbage for all the export args following the ucred one.
FreeBSD does very little checking of the export args and panics in
the following malloc() in vfs_hang_addrlist():

	i = sizeof(struct netcred) + argp->ex_addrlen + argp->ex_masklen;
	np = (struct netcred *)malloc(i, M_NETADDR, M_WAITOK | M_ZERO);

ISTR a PR about lack of checking of export args.

Somehow there were few problems when `struct mtx' was added to
`struct ucred'.  The critical args were probably usually 0.
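
For illustration, the kind of check that is missing before that malloc() might look like the sketch below. It is not the actual kernel code: struct sketch_export_lens, SKETCH_MAX_SOCKADDR_LEN and sketch_netcred_size() are hypothetical names, the field types are assumed to be int, and the real upper bound would have to come from the kernel's own definitions.

```c
#include <stddef.h>

/* Hypothetical upper bound on a socket address length (an assumption). */
#define SKETCH_MAX_SOCKADDR_LEN 255u

/* Hypothetical stand-in for the two length fields of struct export_args. */
struct sketch_export_lens {
	int ex_addrlen;
	int ex_masklen;
};

/*
 * Validate the user-supplied lengths and compute the allocation size.
 * Returns 0 and stores the size in *sizep on success, or -1 if either
 * length is negative or larger than the assumed bound.
 */
static int
sketch_netcred_size(const struct sketch_export_lens *argp, size_t base,
    size_t *sizep)
{
	if (argp->ex_addrlen < 0 || argp->ex_masklen < 0)
		return (-1);
	if ((unsigned)argp->ex_addrlen > SKETCH_MAX_SOCKADDR_LEN ||
	    (unsigned)argp->ex_masklen > SKETCH_MAX_SOCKADDR_LEN)
		return (-1);
	*sizep = base + (size_t)argp->ex_addrlen + (size_t)argp->ex_masklen;
	return (0);
}
```

With a check of this shape, garbage lengths from a mismatched mount(8) would be rejected with an error instead of reaching the allocator.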

Bruce






Re: current panics in mount(2)

2001-01-22 Thread Alfred Perlstein

* Garrett Wollman [EMAIL PROTECTED] [010122 10:27] wrote:
> On Tue, 23 Jan 2001 01:28:11 +1100 (EST), Bruce Evans [EMAIL PROTECTED] said:
>
> > Somehow there were few problems when `struct mtx' was added to
> > `struct ucred'.  The critical args were probably usually 0.
>
> It's a bug that mount(2) uses a bare `struct ucred' and not a
> well-defined, user-exportable credential structure (like struct
> cmsgcred used for SCM_CREDS ancillary data).

I looked at fixing this once, but got scared off because of binary
compatibility issues.  Would 'fixing' mount to use cmsgcred be
acceptable?

-- 
-Alfred Perlstein - [[EMAIL PROTECTED]|[EMAIL PROTECTED]]
"I have the heart of a child; I keep it in a jar on my desk."





Re: current panics in mount(2)

2001-01-22 Thread Warner Losh

In message [EMAIL PROTECTED] Alfred Perlstein writes:
: I looked at fixing this once, but got scared off because of binary
: compatibility issues.  Would 'fixing' mount to use cmsgcred be
: acceptable?

I think so.  Right now we have lots of killer, panic inducing binary
incompatibilities.  One more that doesn't suffer from the panic
inducing part would be an excellent idea.

Warner





Re: current panics in mount(2)

2001-01-22 Thread Garrett Wollman

On Mon, 22 Jan 2001 10:54:04 -0800, Alfred Perlstein [EMAIL PROTECTED] said:

> I looked at fixing this once, but got scared off because of binary
> compatibility issues.  Would 'fixing' mount to use cmsgcred be
> acceptable?

No, it should use a structure appropriately named and designed for its
own purpose.  (By preference, it should be binary-compatible with
4.x to make upgrades easier in six months' time.)

-GAWollman






Re: current panics in mount(2)

2001-01-22 Thread Mike Smith


I got quite upset about this last time, and I guess it's time to do it 
again.

Folks, *please* stop exporting "pure" kernel structures to userland.  
Make a sanitised, versioned structure and just copy your damn args back 
and forth.  'struct ucred' should probably never have been exported to 
userspace, and it *certainly* never should have been exported to 
userspace with a mutex in it!

I also asked the perpetrator of this error to do something about it 
following the last debacle it caused.  In the meantime, perhaps we could 
ask that one of the SMPng rules of engagement mandate that no mutex 
structures or structure members should ever be exported as part of a 
userspace interface?
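
To make the "sanitised, versioned structure" idea concrete, here is a minimal sketch. Every name in it (struct sketch_xcred, struct sketch_kcred, sketch_xcred_to_kcred(), the SKETCH_ constants) is hypothetical and chosen for illustration; it is not an existing FreeBSD interface.

```c
#include <stdint.h>
#include <string.h>

#define SKETCH_XCRED_VERSION 1u
#define SKETCH_NGROUPS 16

/* Hypothetical userland-visible credential: fixed-width fields, no locks. */
struct sketch_xcred {
	uint32_t xc_version;	/* must equal SKETCH_XCRED_VERSION */
	uint32_t xc_uid;
	uint32_t xc_ngroups;	/* number of valid entries in xc_groups */
	uint32_t xc_groups[SKETCH_NGROUPS];
};

/* Hypothetical kernel-private credential; its layout is free to change. */
struct sketch_kcred {
	void *kc_lock;		/* stands in for an embedded mutex */
	uint32_t kc_uid;
	uint32_t kc_ngroups;
	uint32_t kc_groups[SKETCH_NGROUPS];
};

/*
 * Copy the userland form into the kernel form field by field.
 * Returns 0 on success, -1 on an unknown version or a bad group count.
 */
static int
sketch_xcred_to_kcred(const struct sketch_xcred *xc, struct sketch_kcred *kc)
{
	if (xc->xc_version != SKETCH_XCRED_VERSION)
		return (-1);
	if (xc->xc_ngroups > SKETCH_NGROUPS)
		return (-1);
	memset(kc, 0, sizeof(*kc));
	kc->kc_uid = xc->xc_uid;
	kc->kc_ngroups = xc->xc_ngroups;
	memcpy(kc->kc_groups, xc->xc_groups,
	    xc->xc_ngroups * sizeof(xc->xc_groups[0]));
	return (0);
}
```

Because only the exported structure crosses the boundary, the kernel-private one can gain or lose a mutex without userland noticing, and a stale binary fails the version check instead of feeding the kernel garbage.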

> My nfs server now always panics when it attempts to export ufs
> filesystems.  This is caused by my mount(8) being slightly out of
> date.  This shouldn't be a problem, but `struct export_args' contains
> a `struct ucred' which contains a `struct mtx', so when `struct mtx'
> shrunk by 1 pointer yesterday, the out of date mount(8) started
> supplying garbage for all the export args following the ucred one.
> FreeBSD does very little checking of the export args and panics in
> the following malloc() in vfs_hang_addrlist():
>
> 	i = sizeof(struct netcred) + argp->ex_addrlen + argp->ex_masklen;
> 	np = (struct netcred *)malloc(i, M_NETADDR, M_WAITOK | M_ZERO);
>
> ISTR a PR about lack of checking of export args.
>
> Somehow there were few problems when `struct mtx' was added to
> `struct ucred'.  The critical args were probably usually 0.
>
> Bruce

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
   V I C T O R Y   N O T   V E N G E A N C E







Re: current panics in mount(2)

2001-01-22 Thread Jason Evans

On Mon, Jan 22, 2001 at 12:16:38PM -0800, Mike Smith wrote:
> In the meantime, perhaps we could
> ask that one of the SMPng rules of engagement mandate that no mutex
> structures or structure members should ever be exported as part of a
> userspace interface?

This sounds fine in principle, but the real problem is that kernel
structures are exported.  In order for us to fix some of the places where
structures are exported and an embedded mutex becomes necessary, we would
have to go out of our way to fix an existing design flaw.

Under normal circumstances, I would agree with you that broken code should
be fixed as it is modified.  However, the amount of work that the SMPng
project is already taking on is overwhelming.  Placing this additional
burden on the SMPng developers would in my opinion be detrimental to the
medium-term success of the project.

Jason





Re: current panics in mount(2)

2001-01-22 Thread Mike Smith

> On Mon, Jan 22, 2001 at 12:16:38PM -0800, Mike Smith wrote:
> > In the meantime, perhaps we could
> > ask that one of the SMPng rules of engagement mandate that no mutex
> > structures or structure members should ever be exported as part of a
> > userspace interface?
>
> This sounds fine in principle, but the real problem is that kernel
> structures are exported.  In order for us to fix some of the places where
> structures are exported and an embedded mutex becomes necessary, we would
> have to go out of our way to fix an existing design flaw.

This would seem to be more or less obvious, yes.

> Under normal circumstances, I would agree with you that broken code should
> be fixed as it is modified.  However, the amount of work that the SMPng
> project is already taking on is overwhelming.  Placing this additional
> burden on the SMPng developers would in my opinion be detrimental to the
> medium-term success of the project.

I think that the alternative is also fairly undesirable.  However, you're 
in the hot seat on this one, so it's your call.

-- 
... every activity meets with opposition, everyone who acts has his
rivals and unfortunately opponents also.  But not because people want
to be opponents, rather because the tasks and relationships force
people to take different points of view.  [Dr. Fritz Todt]
   V I C T O R Y   N O T   V E N G E A N C E







Re: current panics

2000-03-24 Thread Bernd Walter

On Thu, Mar 23, 2000 at 12:29:28PM +0100, Niels Chr. Bank-Pedersen wrote:
> Hi,
>
> I know it isn't much (no debugger compiled in (yet)), but is
> anybody else seeing panics like this:
>
>   mode = 0100644, inum = 214354, fs = /data0
>   panic: ffs_valloc: dup alloc
>
>   syncing disks... 23 13 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3
>   giving up on 2 buffers
>   Uptime: 3m24s
>   Automatic reboot in 15 seconds - press a key on the console to abort
>
> and
>
>   dev = #vinum/1, block = 9757, fs = /data10
>   panic: ffs_blkfree: freeing free frag
>
>   syncing disks... 63 13 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
>   giving up on 1 buffers
>   Uptime: 3m34s
>   Automatic reboot in 15 seconds - press a key on the console to abort
>   Rebooting...
>
> Happening within a couple of minutes on -current kernels from 22/3 and
> 23/3 but not on a kernel from around the 18/3.
> Running SU and vinum (both panics on a vinum fs), but otherwise
> it's just a plain nfs-server.

What vinum config are you using?
'vinum list' is best for this.

-- 
B.Walter  COSMO-Project  http://www.cosmo-project.de
[EMAIL PROTECTED] Usergroup[EMAIL PROTECTED]






current panics

2000-03-23 Thread Niels Chr. Bank-Pedersen

Hi,

I know it isn't much (no debugger compiled in (yet)), but is
anybody else seeing panics like this:

  mode = 0100644, inum = 214354, fs = /data0
  panic: ffs_valloc: dup alloc
  
  syncing disks... 23 13 4 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 3 
  giving up on 2 buffers
  Uptime: 3m24s
  Automatic reboot in 15 seconds - press a key on the console to abort

and

  dev = #vinum/1, block = 9757, fs = /data10
  panic: ffs_blkfree: freeing free frag
  
  syncing disks... 63 13 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
  giving up on 1 buffers
  Uptime: 3m34s
  Automatic reboot in 15 seconds - press a key on the console to abort
  Rebooting...

Happening within a couple of minutes on -current kernels from 22/3 and
23/3 but not on a kernel from around the 18/3.
Running SU and vinum (both panics on a vinum fs), but otherwise
it's just a plain nfs-server.


/Niels Chr.

-- 
 Niels Christian Bank-Pedersen, NCB1-RIPE.
 Network Manager, Tele Danmark NET, IP-section.

 "Hey, are any of you guys out there actually *using* RFC 2549?"

