Re: [zfs-discuss] System crash on zpool attach object_count == usedobjs failed assertion

2010-03-03 Thread Nigel Smith
I've just run zdb against the two pools on my home OpenSolaris box,
and now both are showing this failed assertion, with the counts off by one.

  # zdb rpool > /dev/null
  Assertion failed: object_count == usedobjs (0x18da2 == 0x18da3), file ../zdb.c, line 1460
  Abort (core dumped)

  # zdb rz2pool > /dev/null
  Assertion failed: object_count == usedobjs (0x2ba25 == 0x2ba26), file ../zdb.c, line 1460
  Abort (core dumped)
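
A quick way to confirm that the two counts really differ by one is to
subtract the hex values in the shell. This is only a sketch: the helper
name assert_delta is made up for illustration, and it assumes the
message keeps the "(0x... == 0x...)" form shown above.

```shell
#!/bin/sh
# Hypothetical helper (not part of zdb): print usedobjs - object_count
# for an "object_count == usedobjs" assertion message passed as $1.
assert_delta() {
    # Extract the "(0xAAAA == 0xBBBB)" pair as two space-separated words.
    pair=$(printf '%s\n' "$1" |
        sed -n 's/.*(\(0x[0-9a-fA-F]*\) == \(0x[0-9a-fA-F]*\)).*/\1 \2/p')
    # Fail if the message did not contain a pair of hex counts.
    [ -n "$pair" ] || return 1
    # Split the pair into $1 (object_count) and $2 (usedobjs).
    set -- $pair
    echo $(( $2 - $1 ))
}

# The rpool message from above; prints 1.
assert_delta 'Assertion failed: object_count == usedobjs (0x18da2 == 0x18da3), file ../zdb.c, line 1460'
```

The same call on the rz2pool message also gives 1.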

The last time I checked them with zdb, probably a few months back,
they were fine.

And since the pools otherwise seem to be behaving without problem,
I've had no reason to run zdb.

'zpool status' looks fine, and the pools mount without problem.
'zpool scrub' works without problem.

I have been upgrading to most of the recent 'dev' builds of OpenSolaris.
I wonder if there is some bug in the code that could cause this assertion.

One possibly unusual thing is that I have not yet upgraded the
versions of the pools.

  # uname -a
  SunOS opensolaris 5.11 snv_133 i86pc i386 i86pc  
  # zpool upgrade
  This system is currently running ZFS pool version 22.

  The following pools are out of date, and can be upgraded.  After being
  upgraded, these pools will no longer be accessible by older software versions.

  VER  POOL
  ---  
  13   rpool
  16   rz2pool

The assertion is being tracked by this bug:

  http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6801840

..but in that report, the counts are not off by one.
Unfortunately, there is little indication of any progress being made.

Maybe some other 'zfs-discuss' readers could try zdb on their pools,
if using a recent dev build, and see if they get a similar problem...
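
For anyone with several pools, something like the following could run
the check across all of them. It is a rough sketch, not a tested tool:
check_pools and the ZDB override variable are names invented here, and
it simply treats a non-zero exit from zdb (such as the abort above) as
a failure.

```shell
#!/bin/sh
# ZDB is a hypothetical override hook so the loop can be dry-run
# with a stand-in command; by default the real zdb is used.
ZDB=${ZDB:-zdb}

# Read pool names on stdin (one per line), run zdb on each with its
# normal output discarded, and report whether it exited cleanly.
check_pools() {
    while read -r pool; do
        if "$ZDB" "$pool" >/dev/null </dev/null; then
            echo "$pool: ok"
        else
            # $? here is the exit status of the zdb run above.
            echo "$pool: zdb failed (exit $?)"
        fi
    done
}

# Usage (as root), feeding in the names of all imported pools:
#   zpool list -H -o name | check_pools
```

Any assertion message still appears on the terminal, since only
standard output is sent to /dev/null.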

Thanks
Nigel Smith


# mdb core
Loading modules: [ libumem.so.1 libc.so.1 libzpool.so.1 libtopo.so.1 libavl.so.1 libnvpair.so.1 ld.so.1 ]
> ::status
debugging core file of zdb (64-bit) from opensolaris
file: /usr/sbin/amd64/zdb
initial argv: zdb rpool
threading model: native threads
status: process terminated by SIGABRT (Abort), pid=883 uid=0 code=-1
panic message:
Assertion failed: object_count == usedobjs (0x18da2 == 0x18da3), file ../zdb.c, line 1460
> $C
fd7fffdff090 libc.so.1`_lwp_kill+0xa()
fd7fffdff0b0 libc.so.1`raise+0x19()
fd7fffdff0f0 libc.so.1`abort+0xd9()
fd7fffdff320 libc.so.1`_assert+0x7d()
fd7fffdff810 dump_dir+0x35a()
fd7fffdff840 dump_one_dir+0x54()
fd7fffdff850 libzpool.so.1`findfunc+0xf()
fd7fffdff940 libzpool.so.1`dmu_objset_find_spa+0x39f()
fd7fffdffa30 libzpool.so.1`dmu_objset_find_spa+0x1d2()
fd7fffdffb20 libzpool.so.1`dmu_objset_find_spa+0x1d2()
fd7fffdffb40 libzpool.so.1`dmu_objset_find+0x2c()
fd7fffdffb70 dump_zpool+0x197()
fd7fffdffc10 main+0xa3d()
fd7fffdffc20 0x406e6c()
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] System crash on zpool attach object_count == usedobjs failed assertion

2010-03-03 Thread Nigel Smith
Hi Stephen 

If your system is crashing while attaching the new device,
are you getting a core dump file?

If so, it would be interesting to examine the file with mdb,
to see the stack backtrace, as this may give a clue to what's going wrong.
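
As a rough sketch of what I mean, the two mdb dcmds I would start with
are ::status and $C, and they can be fed to mdb on standard input
rather than typed interactively. The wrapper name core_summary and the
MDB override variable are made up for this example, and it assumes the
dump you get is a userland core file that mdb can open directly.

```shell
#!/bin/sh
# MDB is a hypothetical override hook; by default the real mdb is used.
MDB=${MDB:-mdb}

# Print the termination status and stack backtrace for the core file
# named in $1 by piping the two dcmds into mdb.
core_summary() {
    printf '%s\n' '::status' '$C' | "$MDB" "$1"
}

# Usage:
#   core_summary core
```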

What storage controller are you using for the disks?
And what device driver is the controller using?

Thanks
Nigel Smith