Re: [zfs-discuss] Serious ZFS problems

2007-09-06 Thread Neil Perrin


Tim Spriggs wrote:
 Hello,
 
 I think I have gained sufficient fool status for testing the
 fool-proof-ness of ZFS. I have a cluster of T1000 servers running
 Solaris 10, and two x4100s running an OpenSolaris distribution (Nexenta)
 at b68. Each T1000 hosts several zones, each of which has its own
 zpool associated with it. Each zpool is a mirror between an IBM
 N series NAS and another OSOL box serving iSCSI from zvols. To
 move zones around, I move the zone configuration, then move the zpool
 from one T1000 to another and bring the zone up. Now for the problem.
 
 For the sake of brevity:
 
 T1000-1: zpool export pool1
 T1000-2: zpool export pool2
 T1000-3: zpool import -f pool1
 T1000-4: zpool import -f pool2
 and other similar operations to move zone data around.
 
 Then I 'init 6'd all the T1000s. The reason for the init 6 was so that
 all of the pools would completely let go of the iSCSI LUNs, so I could
 remove the static configurations from each T1000.
 
 Upon reboot, pool1 has the following problem:
 
 WARNING: can't process intent log for pool1

During pool startup (spa_load()), zil_claim() is called on
each dataset in the pool, and the first thing it tries to do is
open the dataset (dmu_objset_open()). If that fails, the
"can't process intent log ..." message is printed. So you have a pretty
serious pool consistency problem.

I guess more information is needed. Running zdb on the pool would
be useful, or zdb -l <device> to display the labels (on an exported pool).
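
For example (the device name is only a placeholder; substitute the actual
iSCSI LUN devices backing the mirror, typically their s0 slices):

# zdb pool1
# zdb -l /dev/dsk/cXtYdZs0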

 
 and then attempts to export the pool fail with:
 
 cannot open 'pool1': I/O error
 
 
 pool2 consistently makes a T1000 (Solaris 10) kernel panic when imported.
 It will also make an x4100 (OpenSolaris) panic.
 
 
 Any ideas?
 
 Thanks in advance.
 -Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Serious ZFS problems

2007-09-06 Thread Tim Spriggs
Neil Perrin wrote:


 Tim Spriggs wrote:
 Hello,

 I think I have gained sufficient fool status for testing the
 fool-proof-ness of ZFS. I have a cluster of T1000 servers running
 Solaris 10, and two x4100s running an OpenSolaris distribution (Nexenta)
 at b68. Each T1000 hosts several zones, each of which has its
 own zpool associated with it. Each zpool is a mirror
 between an IBM N series NAS and another OSOL box serving iSCSI from
 zvols. To move zones around, I move the zone configuration, then
 move the zpool from one T1000 to another and bring the zone up. Now
 for the problem.

 For the sake of brevity:

 T1000-1: zpool export pool1
 T1000-2: zpool export pool2
 T1000-3: zpool import -f pool1
 T1000-4: zpool import -f pool2
 and other similar operations to move zone data around.

 Then I 'init 6'd all the T1000s. The reason for the init 6 was so
 that all of the pools would completely let go of the iSCSI LUNs, so I
 could remove the static configurations from each T1000.

 Upon reboot, pool1 has the following problem:

 WARNING: can't process intent log for pool1

 During pool startup (spa_load()), zil_claim() is called on
 each dataset in the pool, and the first thing it tries to do is
 open the dataset (dmu_objset_open()). If that fails, the
 "can't process intent log ..." message is printed. So you have a pretty
 serious pool consistency problem.

 I guess more information is needed. Running zdb on the pool would
 be useful, or zdb -l <device> to display the labels (on an exported pool).

I can't export one of the pools. Here is the zpool status -x output for 
reference:

# zpool status -x
  pool: zs-scat-dmz
 state: ONLINE
status: One or more devices has experienced an error resulting in data
corruption.  Applications may be affected.
action: Restore the file in question if possible.  Otherwise restore the
entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                               STATE     READ WRITE CKSUM
        zs-scat-dmz                        ONLINE       0     0    72
          mirror                           ONLINE       0     0    72
            c2t015045BB9E322A0046CBDAAEd0  ONLINE       0     0   144
            c1t80d0                        ONLINE       0     0   144

errors: 0 data errors, use '-v' for a list
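
For the per-file list the status output mentions, the corresponding command
would be:

# zpool status -v zs-scat-dmz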

# zdb -l /dev/dsk/c2t015045BB9E322A0046CBDAAEd0

LABEL 0


LABEL 1

failed to unpack label 1

LABEL 2


LABEL 3

root @ T1000-3 zdb -l /dev/dsk/c1t80d0

LABEL 0


LABEL 1

failed to unpack label 1

LABEL 2


LABEL 3


I find that appending s0 to the device gives me better information:

# zdb -l /dev/dsk/c1t80d0s0

LABEL 0

    version=3
    name='zs-scat-dmz'
    state=0
    txg=1188440
    pool_guid=949639000150966246
    top_guid=15919546701143465277
    guid=4814968902145809239
    vdev_tree
        type='mirror'
        id=0
        guid=15919546701143465277
        whole_disk=0
        metaslab_array=13
        metaslab_shift=28
        ashift=9
        asize=42977198080
        children[0]
                type='disk'
                id=0
                guid=5615878807187049290
                path='/dev/dsk/c2t015045BB9E322A0046CBDAAEd0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                whole_disk=1
                DTL=55
        children[1]
                type='disk'
                id=1
                guid=4814968902145809239
                path='/dev/dsk/c1t80d0s0'
                devid='id1,[EMAIL PROTECTED]/a'
                whole_disk=1
                DTL=50

LABEL 1

    version=3
    name='zs-scat-dmz'
    state=0
    txg=1188440
    pool_guid=949639000150966246
    top_guid=15919546701143465277
    guid=4814968902145809239
    vdev_tree
        type='mirror'
        id=0
        guid=15919546701143465277
        whole_disk=0
        metaslab_array=13
        metaslab_shift=28
        ashift=9
        asize=42977198080
        children[0]
                type='disk'
                id=0

[zfs-discuss] Serious ZFS problems

2007-09-05 Thread Tim Spriggs
Hello,

I think I have gained sufficient fool status for testing the
fool-proof-ness of ZFS. I have a cluster of T1000 servers running
Solaris 10, and two x4100s running an OpenSolaris distribution (Nexenta)
at b68. Each T1000 hosts several zones, each of which has its own
zpool associated with it. Each zpool is a mirror between an IBM
N series NAS and another OSOL box serving iSCSI from zvols. To
move zones around, I move the zone configuration, then move the zpool
from one T1000 to another and bring the zone up. Now for the problem.

For the sake of brevity:

T1000-1: zpool export pool1
T1000-2: zpool export pool2
T1000-3: zpool import -f pool1
T1000-4: zpool import -f pool2
and other similar operations to move zone data around.
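
Expanded for a single pool/zone, one move looks roughly like the sketch
below (the zone name and config-file path are illustrative, not an exact
transcript of my commands):

T1000-1: zoneadm -z zone1 halt
T1000-1: zonecfg -z zone1 export -f /tmp/zone1.cfg   (file copied to T1000-3)
T1000-1: zpool export pool1
T1000-3: zonecfg -z zone1 -f /tmp/zone1.cfg
T1000-3: zpool import -f pool1
T1000-3: zoneadm -z zone1 attach
T1000-3: zoneadm -z zone1 boot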

Then I 'init 6'd all the T1000s. The reason for the init 6 was so that
all of the pools would completely let go of the iSCSI LUNs, so I could
remove the static configurations from each T1000.
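
The static-configuration removal itself is just iscsiadm on each host. The
target name and portal below are placeholders, not my real ones:

# iscsiadm list static-config
# iscsiadm remove static-config iqn.2007-09.org.example:pool1-lun,10.0.0.1:3260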

Upon reboot, pool1 has the following problem:

WARNING: can't process intent log for pool1

and then attempts to export the pool fail with:

cannot open 'pool1': I/O error


pool2 consistently makes a T1000 (Solaris 10) kernel panic when imported.
It will also make an x4100 (OpenSolaris) panic.


Any ideas?

Thanks in advance.
-Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss