Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-18 Thread Andriy Gapon
On 18/07/2016 20:58, Don Lewis wrote:
> This is with kgdb from ports:
> 
> [snip]
> #13 0x824be23a in assfail (
> a=0x80 , 
> f=0xfe085a435d90 "", l=0)
> at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
> #14 0x8215b928 in dbuf_dirty (db=, tx=)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1232
> #15 0x8215c0e4 in dbuf_dirty (db=, tx=)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
> #16 0x8215c0e4 in dbuf_dirty (db=, tx=)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
> #17 0x82166fd9 in dmu_write_uio_dnode (dn=, 
> uio=, size=, tx=)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1193
> #18 0x82166e92 in dmu_write_uio_dbuf (zdb=0xf806bacc0b88, 
> uio=0xfe085a4368f0, size=65536, tx=0xf803a4043600)
> at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1244
> #19 0x82224bac in zfs_write (vp=, uio=, 
> ioflag=, cr=, ct=)
> [snip]
> 
> There's not a lot for me to get traction with ...

I think that it should be possible to get to the interesting data
starting with zdb in frame 18, but it can take some effort.
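
As a rough illustration of that pointer walk, here is a minimal C sketch. The struct definitions are simplified stand-ins, not the real ZFS headers: they assume, from memory of dbuf.h and dnode.h, that dmu_buf_impl_t embeds its public dmu_buf_t as the first member and reaches the dnode through db_dnode_handle->dnh_dnode. The helper name dnode_from_zdb is hypothetical. Check the layout against the sources for r302256 before relying on it in kgdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins for the ZFS types: only the fields used in the walk. */
typedef struct dnode_phys {
	uint8_t dn_nlevels;
} dnode_phys_t;

typedef struct dnode {
	dnode_phys_t *dn_phys;
	uint8_t dn_next_nlevels[4];
} dnode_t;

typedef struct dnode_handle {
	dnode_t *dnh_dnode;
} dnode_handle_t;

typedef struct dmu_buf {
	uint64_t db_object;
} dmu_buf_t;

typedef struct dmu_buf_impl {
	dmu_buf_t db;			/* assumed to be the first member */
	dnode_handle_t *db_dnode_handle;
	uint8_t db_level;
} dmu_buf_impl_t;

/*
 * Hypothetical helper: go from the dmu_buf_t pointer that survived
 * optimization (zdb in frame 18) to the dnode it belongs to.
 */
static dnode_t *
dnode_from_zdb(dmu_buf_t *zdb)
{
	dmu_buf_impl_t *db;

	assert(zdb != NULL);
	/* Valid only because the public part sits at offset 0. */
	db = (dmu_buf_impl_t *)zdb;
	return (db->db_dnode_handle->dnh_dnode);
}
```

In a debugger session the same steps would be a cast of zdb followed by two dereferences. Note that zdb in frame 18 is not necessarily the same dbuf as db in frame 14, so this walk is expected to recover dn and dn->dn_phys rather than the db that tripped the assert.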

> This is also not a very repeatable thing for me.  I've only had it
> happen once even though this machine is kept very busy building ports.

Okay.  It seems to have happened at least once for one other person.

-- 
Andriy Gapon
___
freebsd-current@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"


Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-18 Thread Don Lewis
On 18 Jul, Andriy Gapon wrote:
> On 18/07/2016 20:40, Don Lewis wrote:
>> On 18 Jul, Andriy Gapon wrote:
>>> On 08/07/2016 07:13, Don Lewis wrote:
>>>> My package building machine just crashed with this panic during a
>>>> poudriere run:
>>>>
>>>> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0)
>>>> || dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] >
>>>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] >
>>>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg
>>>
>>> Don,
>>>
>>> do you have a crash dump?
>>> It would be interesting to see a pretty-print of dn, dn->dn_phys, db and
>>> tx in the frame where the assert is hit.
>> 
>> I do.  Unfortunately kgdb reports that the values of dn and db were
>> optimized out.
>> 
> 
> Well... You can try to use kgdb7111 from ports, perhaps it would work
> better. Also, it's often possible to find values of wanted variables by
> finding a relevant value that's not optimized out and then following
> through pointers, etc. to get to the right values.  In other cases it's
> possible to get the values by examining the disassembly and values of
> registers.

This is with kgdb from ports:

[snip]
#13 0x824be23a in assfail (
a=0x80 , 
f=0xfe085a435d90 "", l=0)
at /usr/src/sys/cddl/compat/opensolaris/kern/opensolaris_cmn_err.c:81
#14 0x8215b928 in dbuf_dirty (db=, tx=)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1232
#15 0x8215c0e4 in dbuf_dirty (db=, tx=)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
#16 0x8215c0e4 in dbuf_dirty (db=, tx=)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1383
#17 0x82166fd9 in dmu_write_uio_dnode (dn=, 
uio=, size=, tx=)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1193
#18 0x82166e92 in dmu_write_uio_dbuf (zdb=0xf806bacc0b88, 
uio=0xfe085a4368f0, size=65536, tx=0xf803a4043600)
at /usr/src/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:1244
#19 0x82224bac in zfs_write (vp=, uio=, 
ioflag=, cr=, ct=)
[snip]

There's not a lot for me to get traction with ...

This is also not a very repeatable thing for me.  I've only had it
happen once even though this machine is kept very busy building ports.



Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-18 Thread Andriy Gapon
On 18/07/2016 20:40, Don Lewis wrote:
> On 18 Jul, Andriy Gapon wrote:
>> On 08/07/2016 07:13, Don Lewis wrote:
>>> My package building machine just crashed with this panic during a
>>> poudriere run:
>>>
>>> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) 
>>> || dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > 
>>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > 
>>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg
>>
>> Don,
>>
>> do you have a crash dump?
>> It would be interesting to see a pretty-print of dn, dn->dn_phys, db and
>> tx in the frame where the assert is hit.
> 
> I do.  Unfortunately kgdb reports that the values of dn and db were
> optimized out.
> 

Well... You can try to use kgdb7111 from ports, perhaps it would work
better. Also, it's often possible to find values of wanted variables by
finding a relevant value that's not optimized out and then following
through pointers, etc. to get to the right values.  In other cases it's
possible to get the values by examining the disassembly and values of
registers.

Finally, if you can get anything useful out of the dump it would make
sense to follow up to
https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=203419


-- 
Andriy Gapon


Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-18 Thread Don Lewis
On 18 Jul, Andriy Gapon wrote:
> On 08/07/2016 07:13, Don Lewis wrote:
>> My package building machine just crashed with this panic during a
>> poudriere run:
>> 
>> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) 
>> || dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > 
>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > 
>> db->db_level || dn->dn_next_nlevels[(tx->tx_txg
> 
> Don,
> 
> do you have a crash dump?
> It would be interesting to see a pretty-print of dn, dn->dn_phys, db and
> tx in the frame where the assert is hit.

I do.  Unfortunately kgdb reports that the values of dn and db were
optimized out.



Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-18 Thread Andriy Gapon
On 08/07/2016 07:13, Don Lewis wrote:
> My package building machine just crashed with this panic during a
> poudriere run:
> 
> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) || 
> dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > 
> db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > db->db_level 
> || dn->dn_next_nlevels[(tx->tx_txg

Don,

do you have a crash dump?
It would be interesting to see a pretty-print of dn, dn->dn_phys, db and
tx in the frame where the assert is hit.
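
To show which fields matter, here is a minimal C sketch that evaluates only the four clauses visible in the truncated panic message. The remainder of the expression is cut off in the message and is deliberately not reconstructed here. The snapshot struct and the function name are hypothetical. TXG_SIZE = 4 and txgoff = tx->tx_txg & TXG_MASK are assumptions taken from memory of the ZFS sources and should be verified against the tree.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	TXG_SIZE	4		/* assumed value, see txg.h */
#define	TXG_MASK	(TXG_SIZE - 1)

/* Hypothetical snapshot of the values one would read from the dump. */
struct levels_snapshot {
	uint8_t dn_phys_nlevels;		/* dn->dn_phys->dn_nlevels */
	uint8_t dn_next_nlevels[TXG_SIZE];	/* dn->dn_next_nlevels[] */
	uint8_t db_level;			/* db->db_level */
	uint64_t tx_txg;			/* tx->tx_txg */
};

/* Returns true if any of the four visible clauses of the assert holds. */
static bool
visible_clauses_hold(const struct levels_snapshot *s)
{
	unsigned int txgoff, prev;

	assert(s != NULL);
	txgoff = (unsigned int)(s->tx_txg & TXG_MASK);
	prev = (unsigned int)((s->tx_txg - 1) & TXG_MASK);

	return ((s->dn_phys_nlevels == 0 && s->db_level == 0) ||
	    s->dn_phys_nlevels > s->db_level ||
	    s->dn_next_nlevels[txgoff] > s->db_level ||
	    s->dn_next_nlevels[prev] > s->db_level);
}
```

Since the assert fired, every clause of the full expression must have been false, so values recovered from the dump should make this function return false. A true result would suggest that the values were read from the wrong frame or the wrong dbuf.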

> cpuid = 2
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a435eb0
> vpanic() at vpanic+0x182/frame 0xfe085a435f30
> panic() at panic+0x43/frame 0xfe085a435f90
> assfail() at assfail+0x1a/frame 0xfe085a435fa0
> dbuf_dirty() at dbuf_dirty+0x3e8/frame 0xfe085a436060
> dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a436120
> dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a4361e0
> dmu_write_uio_dnode() at dmu_write_uio_dnode+0x129/frame 0xfe085a436270
> dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfe085a4362a0
> zfs_freebsd_write() at zfs_freebsd_write+0x87c/frame 0xfe085a4364c0
> VOP_WRITE_APV() at VOP_WRITE_APV+0x16f/frame 0xfe085a4365d0
> vn_write() at vn_write+0x218/frame 0xfe085a436650
> vn_io_fault1() at vn_io_fault1+0x1d2/frame 0xfe085a4367b0
> vn_io_fault() at vn_io_fault+0x197/frame 0xfe085a436830
> dofilewrite() at dofilewrite+0x87/frame 0xfe085a436880
> kern_writev() at kern_writev+0x68/frame 0xfe085a4368d0
> sys_write() at sys_write+0x84/frame 0xfe085a436920
> amd64_syscall() at amd64_syscall+0x2db/frame 0xfe085a436a30
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe085a436a30
> --- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b9ddea, rsp = 
> 0x7fffe568, rbp = 0x7fffe760 ---
> KDB: enter: panic
> 
> The pool is a two disk mirror.  I'm currently running a scrub.
> 
> RAM is ECC.
> 
> FreeBSD zipper.catspoiler.org 11.0-ALPHA5 FreeBSD 11.0-ALPHA5 #14 r302256M: 
> Thu Jun 30 00:05:32 PDT 2016 
> d...@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC  amd64


-- 
Andriy Gapon


Re: zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-08 Thread Don Lewis
On  7 Jul, To: freebsd-current@FreeBSD.org wrote:
> My package building machine just crashed with this panic during a
> poudriere run:
> 
> panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) || 
> dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > 
> db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > db->db_level 
> || dn->dn_next_nlevels[(tx->tx_txg
> cpuid = 2
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a435eb0
> vpanic() at vpanic+0x182/frame 0xfe085a435f30
> panic() at panic+0x43/frame 0xfe085a435f90
> assfail() at assfail+0x1a/frame 0xfe085a435fa0
> dbuf_dirty() at dbuf_dirty+0x3e8/frame 0xfe085a436060
> dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a436120
> dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a4361e0
> dmu_write_uio_dnode() at dmu_write_uio_dnode+0x129/frame 0xfe085a436270
> dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfe085a4362a0
> zfs_freebsd_write() at zfs_freebsd_write+0x87c/frame 0xfe085a4364c0
> VOP_WRITE_APV() at VOP_WRITE_APV+0x16f/frame 0xfe085a4365d0
> vn_write() at vn_write+0x218/frame 0xfe085a436650
> vn_io_fault1() at vn_io_fault1+0x1d2/frame 0xfe085a4367b0
> vn_io_fault() at vn_io_fault+0x197/frame 0xfe085a436830
> dofilewrite() at dofilewrite+0x87/frame 0xfe085a436880
> kern_writev() at kern_writev+0x68/frame 0xfe085a4368d0
> sys_write() at sys_write+0x84/frame 0xfe085a436920
> amd64_syscall() at amd64_syscall+0x2db/frame 0xfe085a436a30
> Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe085a436a30
> --- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b9ddea, rsp = 
> 0x7fffe568, rbp = 0x7fffe760 ---
> KDB: enter: panic
> 
> The pool is a two disk mirror.  I'm currently running a scrub.
> 
> RAM is ECC.
> 
> FreeBSD zipper.catspoiler.org 11.0-ALPHA5 FreeBSD 11.0-ALPHA5 #14 r302256M: 
> Thu Jun 30 00:05:32 PDT 2016 
> d...@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC  amd64

The scrub didn't find any problems:

  pool: zroot
 state: ONLINE
status: Some supported features are not enabled on the pool. The pool can
        still be used, but some features are unavailable.
action: Enable all features using 'zpool upgrade'. Once this is done,
        the pool may no longer be accessible by software that does not support
        the features. See zpool-features(7) for details.
  scan: scrub repaired 0 in 13h50m with 0 errors on Fri Jul  8 10:58:01 2016
config:

        NAME        STATE     READ WRITE CKSUM
        zroot       ONLINE       0     0     0
          mirror-0  ONLINE       0     0     0
            ada0p3  ONLINE       0     0     0
            ada1p3  ONLINE       0     0     0

errors: No known data errors



zfs solaris assert panic in 11.0-ALPHA5 r302256

2016-07-07 Thread Don Lewis
My package building machine just crashed with this panic during a
poudriere run:

panic: solaris assert: (dn->dn_phys->dn_nlevels == 0 && db->db_level == 0) || 
dn->dn_phys->dn_nlevels > db->db_level || dn->dn_next_nlevels[txgoff] > 
db->db_level || dn->dn_next_nlevels[(tx->tx_txg-1) & TXG_MASK] > db->db_level 
|| dn->dn_next_nlevels[(tx->tx_txg
cpuid = 2
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe085a435eb0
vpanic() at vpanic+0x182/frame 0xfe085a435f30
panic() at panic+0x43/frame 0xfe085a435f90
assfail() at assfail+0x1a/frame 0xfe085a435fa0
dbuf_dirty() at dbuf_dirty+0x3e8/frame 0xfe085a436060
dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a436120
dbuf_dirty() at dbuf_dirty+0xba4/frame 0xfe085a4361e0
dmu_write_uio_dnode() at dmu_write_uio_dnode+0x129/frame 0xfe085a436270
dmu_write_uio_dbuf() at dmu_write_uio_dbuf+0x42/frame 0xfe085a4362a0
zfs_freebsd_write() at zfs_freebsd_write+0x87c/frame 0xfe085a4364c0
VOP_WRITE_APV() at VOP_WRITE_APV+0x16f/frame 0xfe085a4365d0
vn_write() at vn_write+0x218/frame 0xfe085a436650
vn_io_fault1() at vn_io_fault1+0x1d2/frame 0xfe085a4367b0
vn_io_fault() at vn_io_fault+0x197/frame 0xfe085a436830
dofilewrite() at dofilewrite+0x87/frame 0xfe085a436880
kern_writev() at kern_writev+0x68/frame 0xfe085a4368d0
sys_write() at sys_write+0x84/frame 0xfe085a436920
amd64_syscall() at amd64_syscall+0x2db/frame 0xfe085a436a30
Xfast_syscall() at Xfast_syscall+0xfb/frame 0xfe085a436a30
--- syscall (4, FreeBSD ELF64, sys_write), rip = 0x800b9ddea, rsp = 
0x7fffe568, rbp = 0x7fffe760 ---
KDB: enter: panic

The pool is a two disk mirror.  I'm currently running a scrub.

RAM is ECC.

FreeBSD zipper.catspoiler.org 11.0-ALPHA5 FreeBSD 11.0-ALPHA5 #14 r302256M: Thu 
Jun 30 00:05:32 PDT 2016 
d...@zipper.catspoiler.org:/usr/obj/usr/src/sys/GENERIC  amd64
