Re: [zfs-discuss] today panic ...

2007-03-28 Thread Victor Latushkin

Hi Gino,

this looks like an instance of bug 6458218 (see 
http://bugs.opensolaris.org/view_bug.do?bug_id=6458218)


The fix for this bug is integrated into snv_60.

Kind regards,
Victor


Gino Ruopolo wrote:

Hi All,

Last week we had a panic caused by ZFS and then we had a corrupted zpool!
Today we are doing some tests with the same data, but on a different 
server/storage array.  While copying the data ... panic!
And again we had a corrupted zpool!!



Mar 28 12:38:19 SERVER144 genunix: [ID 403854 kern.notice] assertion failed: ss 
!= NULL, file: ../../common/fs/zfs/space_map.c, line: 125
Mar 28 12:38:19 SERVER144 unix: [ID 10 kern.notice]
Mar 28 12:38:19 SERVER144 genunix: [ID 802836 kern.notice] fe80002db620 
fb9acff3 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db6a0 
zfs:space_map_remove+239 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db710 
zfs:space_map_load+17d ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db740 
zfs:zfsctl_ops_root+2fb80397 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db7c0 
zfs:metaslab_group_alloc+186 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db850 
zfs:metaslab_alloc_dva+ab ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db8a0 
zfs:zfsctl_ops_root+2fb81189 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db8c0 
zfs:zio_dva_allocate+3f ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db8d0 
zfs:zio_next_stage+72 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db8f0 
zfs:zio_checksum_generate+5f ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db900 
zfs:zio_next_stage+72 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db950 
zfs:zio_write_compress+136 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db960 
zfs:zio_next_stage+72 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db990 
zfs:zio_wait_for_children+49 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db9a0 
zfs:zio_wait_children_ready+15 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db9b0 
zfs:zfsctl_ops_root+2fb9a1e6 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002db9e0 
zfs:zio_wait+2d ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dba70 
zfs:arc_write+cc ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbb00 
zfs:dmu_objset_sync+141 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbb20 
zfs:dsl_dataset_sync+23 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbb70 
zfs:dsl_pool_sync+6b ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbbd0 
zfs:spa_sync+fa ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbc60 
zfs:txg_sync_thread+115 ()
Mar 28 12:38:19 SERVER144 genunix: [ID 655072 kern.notice] fe80002dbc70 
unix:thread_start+8 ()
Mar 28 12:38:19 SERVER144 unix: [ID 10 kern.notice]
Mar 28 12:38:19 SERVER144 genunix: [ID 672855 kern.notice] syncing file 
systems...
Mar 28 12:38:19 SERVER144 genunix: [ID 733762 kern.notice]  1
Mar 28 12:38:20 SERVER144 genunix: [ID 904073 kern.notice]  done
Mar 28 12:38:21 SERVER144 genunix: [ID 111219 kern.notice] dumping to 
/dev/dsk/c1t0d0s4, offset 1677983744, content: kernel
Mar 28 12:38:26 SERVER144 genunix: [ID 409368 kern.notice] ^M100% done: 129179 
pages dumped, compression ratio 5.16,
Mar 28 12:38:26 SERVER144 genunix: [ID 851671 kern.notice] dump succeeded

Suggestions?
We have about 2TB free on that zpool and were copying about 70GB.

tnx,
Gino
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




Re: [zfs-discuss] today panic ...

2007-03-28 Thread Wade . Stuart

[EMAIL PROTECTED] wrote on 03/28/2007 06:34:12 AM:

 Hi Gino,

 this looks like an instance of bug 6458218 (see
 http://bugs.opensolaris.org/view_bug.do?bug_id=6458218)

 The fix for this bug is integrated into snv_60.

 Kind regards,
 Victor

I know I may be somewhat of an outsider here, but we use full Solaris
releases + patches.  Is there any way to see in the bug reports that a fix
has made it back to a Solaris release or patch, the same way the Nevada bit
push is noted?  Or is there another way to tell where bits are pushed for
real (tm) Solaris releases?  I seem to spend a lot of time trying to find
specific ZFS-related patches or status when it comes to supported Solaris
releases...

-Wade

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] today panic ...

2007-03-28 Thread Malachi de Ælfweald

I was thinking of something similar. When we go to download the various bits
(iso-a.zip through iso-e.zip and the md5sums), it seems like there should
also be Release Notes on the list of files being downloaded.  Similar to the
Java release notes, I would expect it to point out which bugs were fixed,
major changes to how the main tools work, etc...

Although, I guess this isn't really the right list for that

Malachi

On 3/28/07, [EMAIL PROTECTED] [EMAIL PROTECTED] wrote:

[EMAIL PROTECTED] wrote on 03/28/2007 06:34:12 AM:

 Hi Gino,

 this looks like an instance of bug 6458218 (see
 http://bugs.opensolaris.org/view_bug.do?bug_id=6458218)

 The fix for this bug is integrated into snv_60.

 Kind regards,
 Victor

I know I may be somewhat of an outsider here, but we use full Solaris
releases + patches.  Is there any way to see in the bug reports that a fix
has made it back to a Solaris release or patch, the same way the Nevada bit
push is noted?  Or is there another way to tell where bits are pushed for
real (tm) Solaris releases?  I seem to spend a lot of time trying to find
specific ZFS-related patches or status when it comes to supported Solaris
releases...

-Wade

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] today panic ...

2007-03-28 Thread Matthew Ahrens

Gino Ruopolo wrote:

Hi All,

Last week we had a panic caused by ZFS and then we had a corrupted
zpool! Today we are doing some tests with the same data, but on a
different server/storage array.  While copying the data ... panic! 
And again we had a corrupted zpool!!


This is bug 6458218, which was fixed in snv_60 and will be fixed in s10u4.

To recover from this situation, try running build 60 or later, and put
'set zfs:zfs_recover=1' in /etc/system.  This should allow you to read
your pool again.  (However, we can't recommend running in this state
forever; you should backup and restore your pool ASAP.)
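The steps Matt describes can be sketched roughly as follows. The `/etc/system` path and the `zfs:zfs_recover` tunable are the ones named above; the pool and dataset names in the comments are placeholders. To stay runnable without touching a live system, the sketch edits a scratch copy of `/etc/system` rather than the real file:

```shell
# Work on a scratch copy so this sketch doesn't modify the real /etc/system.
cp /etc/system /tmp/system.recovery 2>/dev/null || touch /tmp/system.recovery

# Add the recovery tunable if it isn't already present.
grep -q 'zfs:zfs_recover' /tmp/system.recovery ||
    echo 'set zfs:zfs_recover=1' >> /tmp/system.recovery

# Show the resulting setting.
grep 'zfs_recover' /tmp/system.recovery

# On a real system you would append that line to /etc/system itself,
# reboot into snv_60 or later so the tunable takes effect, then back up
# the pool immediately, e.g. (names are placeholders):
#   zpool import tank
#   zfs snapshot tank/data@rescue
#   zfs send tank/data@rescue > /backup/tank-data.zfs
```

The `grep` guard just makes the append idempotent, so re-running the sketch doesn't add the line twice.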

We're very sorry that you've encountered this bug.  Unfortunately, it
was very difficult to track down, so it existed for quite some time.
Thankfully, it is now fixed, so you shouldn't hit it anymore.

--matt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Today PANIC :(

2007-02-28 Thread Neil Perrin

Gino,

We have seen this before, but only very rarely, and we never got a good crash
dump.  Coincidentally, we saw it only yesterday on a server here, and are
currently investigating it.  Did you also get a dump we can access?  That
would help.  If not, can you tell us which ZFS version you were running?
At the moment I'm not sure how you can even recover from it.  Sorry about
this problem.

FYI this is bug:

http://bugs.opensolaris.org/view_bug.do?bug_id=6458218

Neil.

Gino Ruopolo wrote On 02/28/07 02:17,:

Feb 28 05:47:31 server141 genunix: [ID 403854 kern.notice] assertion failed: ss 
== NULL, file: ../../common/fs/zfs/space_map.c, line: 81
Feb 28 05:47:31 server141 unix: [ID 10 kern.notice]
Feb 28 05:47:31 server141 genunix: [ID 802836 kern.notice] fe8000d559f0 
fb9acff3 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55a70 
zfs:space_map_add+c2 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55aa0 
zfs:space_map_free+22 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55ae0 
zfs:space_map_vacate+38 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55b40 
zfs:zfsctl_ops_root+2fdbc7e7 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55b70 
zfs:vdev_sync_done+2b ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55bd0 
zfs:spa_sync+215 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55c60 
zfs:txg_sync_thread+115 ()
Feb 28 05:47:31 server141 genunix: [ID 655072 kern.notice] fe8000d55c70 
unix:thread_start+8 ()
Feb 28 05:47:31 server141 unix: [ID 10 kern.notice]
Feb 28 05:47:31 server141 genunix: [ID 672855 kern.notice] syncing file 
systems...
Feb 28 05:47:32 server141 genunix: [ID 733762 kern.notice]  1
Feb 28 05:47:33 server141 genunix: [ID 904073 kern.notice]  done 


What happened this time? Any suggestions?

thanks,
gino
 
 
This message posted from opensolaris.org

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss