Re: [zfs-discuss] Serious ZFS problems
Tim Spriggs wrote:
> Hello,
>
> I think I have gained sufficient fool status for testing the
> fool-proof-ness of zfs. I have a cluster of T1000 servers running
> Solaris 10 and two x4100s running an OpenSolaris distribution
> (Nexenta) at b68. Each T1000 hosts several zones, each of which has
> its own zpool associated with it. Each zpool is a mirror between an
> IBM N series NAS and another OSOL box serving iSCSI from zvols. To
> move zones around, I move the zone configuration, then move the zpool
> from one T1000 to another and bring the zone up.
>
> Now for the problem. For the sake of brevity:
>
>   T1000-1: zpool export pool1
>   T1000-2: zpool export pool2
>   T1000-3: zpool import -f pool1
>   T1000-4: zpool import -f pool2
>
> and other similar operations to move zone data around. Then I 'init
> 6'd all the T1000s. The reason for the init 6 was so that all of the
> pools would completely let go of the iSCSI LUNs, letting me remove
> the static configurations from each T1000.
>
> Upon reboot, pool1 has the following problem:
>
>   WARNING: can't process intent log for pool1

During pool startup (spa_load()), zil_claim() is called on each dataset
in the pool, and the first thing it tries to do is open the dataset
(dmu_objset_open()). If that open fails, the "can't process intent
log..." warning is printed. So you have a pretty serious pool
consistency problem.

I guess more information is needed. Running zdb on the pool would be
useful, or 'zdb -l <device>' to display the labels (on an exported
pool).

> and then attempts to export the pool fail with:
>
>   cannot open 'pool1': I/O error
>
> pool2 can consistently make a T1000 (Sol 10) kernel panic when
> imported. It will also make an x4100 (OSOL) panic.
>
> Any ideas? Thanks in advance.
>
> -Tim
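A minimal sketch of how the diagnostics Neil asks for might be
collected, assuming the pool and device names that appear later in the
thread (zs-scat-dmz and the two mirror sides on their s0 slices):

    # Walk the pool metadata with zdb; this form needs the pool to be
    # imported (zdb finds it via the zpool.cache by default):
    zdb zs-scat-dmz

    # Dump the four vdev labels on each side of the mirror; zdb -l
    # reads the device node directly, so it also works on an exported
    # pool:
    for d in c2t015045BB9E322A0046CBDAAEd0s0 c1t80d0s0; do
        echo "== $d =="
        zdb -l /dev/dsk/$d
    done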
Re: [zfs-discuss] Serious ZFS problems
Neil Perrin wrote:
> Tim Spriggs wrote:
>> Hello,
>>
>> [...]
>>
>> Upon reboot, pool1 has the following problem:
>>
>>   WARNING: can't process intent log for pool1
>
> During pool startup (spa_load()), zil_claim() is called on each
> dataset in the pool, and the first thing it tries to do is open the
> dataset (dmu_objset_open()). If that open fails, the "can't process
> intent log..." warning is printed. So you have a pretty serious pool
> consistency problem.
>
> I guess more information is needed. Running zdb on the pool would be
> useful, or 'zdb -l <device>' to display the labels (on an exported
> pool).

I can't export one of the pools. Here is the zpool status -x output for
reference:

# zpool status -x
  pool: zs-scat-dmz
 state: ONLINE
status: One or more devices has experienced an error resulting in data
        corruption. Applications may be affected.
action: Restore the file in question if possible. Otherwise restore the
        entire pool from backup.
   see: http://www.sun.com/msg/ZFS-8000-8A
 scrub: none requested
config:

        NAME                               STATE     READ WRITE CKSUM
        zs-scat-dmz                        ONLINE       0     0    72
          mirror                           ONLINE       0     0    72
            c2t015045BB9E322A0046CBDAAEd0  ONLINE       0     0   144
            c1t80d0                        ONLINE       0     0   144

errors: 0 data errors, use '-v' for a list

# zdb -l /dev/dsk/c2t015045BB9E322A0046CBDAAEd0
LABEL 0
LABEL 1
failed to unpack label 1
LABEL 2
LABEL 3

root @ T1000-3 zdb -l /dev/dsk/c1t80d0
LABEL 0
LABEL 1
failed to unpack label 1
LABEL 2
LABEL 3

I find that appending s0 to the device gives me better information:

# zdb -l /dev/dsk/c1t80d0s0
LABEL 0
    version=3
    name='zs-scat-dmz'
    state=0
    txg=1188440
    pool_guid=949639000150966246
    top_guid=15919546701143465277
    guid=4814968902145809239
    vdev_tree
        type='mirror'
        id=0
        guid=15919546701143465277
        whole_disk=0
        metaslab_array=13
        metaslab_shift=28
        ashift=9
        asize=42977198080
        children[0]
            type='disk'
            id=0
            guid=5615878807187049290
            path='/dev/dsk/c2t015045BB9E322A0046CBDAAEd0s0'
            devid='id1,[EMAIL PROTECTED]/a'
            whole_disk=1
            DTL=55
        children[1]
            type='disk'
            id=1
            guid=4814968902145809239
            path='/dev/dsk/c1t80d0s0'
            devid='id1,[EMAIL PROTECTED]/a'
            whole_disk=1
            DTL=50
LABEL 1
    version=3
    name='zs-scat-dmz'
    state=0
    txg=1188440
    pool_guid=949639000150966246
    top_guid=15919546701143465277
    guid=4814968902145809239
    vdev_tree
        type='mirror'
        id=0
        guid=15919546701143465277
        whole_disk=0
        metaslab_array=13
        metaslab_shift=28
        ashift=9
        asize=42977198080
        children[0]
            type='disk'
            id=0
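The s0 behaviour is consistent with whole_disk=1 in the labels: when
ZFS is given a whole disk it writes an EFI label and keeps the pool
data, including the four vdev labels, inside slice 0, so 'zdb -l'
against the bare disk node presumably reads from the wrong offsets. A
follow-up sketch, reusing the same device names, is to check that both
sides' labels agree on txg and the pool/top-level GUIDs; a mismatch
would suggest one side of the mirror is stale:

    # Pull out the fields that should be identical on both mirror sides:
    for d in c2t015045BB9E322A0046CBDAAEd0s0 c1t80d0s0; do
        echo "== $d =="
        zdb -l /dev/dsk/$d | egrep 'txg|pool_guid|top_guid|state'
    done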
[zfs-discuss] Serious ZFS problems
Hello,

I think I have gained sufficient fool status for testing the
fool-proof-ness of zfs. I have a cluster of T1000 servers running
Solaris 10 and two x4100s running an OpenSolaris distribution (Nexenta)
at b68. Each T1000 hosts several zones, each of which has its own zpool
associated with it. Each zpool is a mirror between an IBM N series NAS
and another OSOL box serving iSCSI from zvols. To move zones around, I
move the zone configuration, then move the zpool from one T1000 to
another and bring the zone up.

Now for the problem. For the sake of brevity:

  T1000-1: zpool export pool1
  T1000-2: zpool export pool2
  T1000-3: zpool import -f pool1
  T1000-4: zpool import -f pool2

and other similar operations to move zone data around. Then I 'init
6'd all the T1000s. The reason for the init 6 was so that all of the
pools would completely let go of the iSCSI LUNs, letting me remove the
static configurations from each T1000.

Upon reboot, pool1 has the following problem:

  WARNING: can't process intent log for pool1

and then attempts to export the pool fail with:

  cannot open 'pool1': I/O error

pool2 can consistently make a T1000 (Sol 10) kernel panic when
imported. It will also make an x4100 (OSOL) panic.

Any ideas? Thanks in advance.

-Tim
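For reference, here is the move procedure from the message written out
end to end as a sketch; the zone name, pool name, and zonepath are
placeholders, and it assumes a Solaris 10 update with zoneadm
detach/attach support:

    # On the source T1000:
    zoneadm -z zone1 halt
    zoneadm -z zone1 detach     # writes SUNWdetached.xml into the zonepath
    zpool export pool1

    # On the destination T1000:
    zpool import pool1
    zonecfg -z zone1 create -a /pool1/zones/zone1  # config from the detached zone
    zoneadm -z zone1 attach
    zoneadm -z zone1 boot

After a clean export, a plain 'zpool import' should be enough; needing
'import -f', as in the commands above, means the pool still looks
active to another host, which is worth understanding before forcing it.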