I create zpools via iSCSI targets in a similar way. At first I used the onboard
NVIDIA SATA ports on a Tyan S2927 and saw random disk problems like the ones
you are describing. I switched to SuperMicro AOC-SAT2-MV8 or AOC-USAS-L8i
controllers and haven't had a problem since.

Are your 1TB disks made by Seagate? If they are Barracuda 7200.11 drives, you
should upgrade the drive firmware before all of your disks stop working.
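You can check the firmware level from OpenSolaris with "iostat -En", which
prints the vendor, product and firmware revision for every disk. Roughly:

  # iostat -En
  c1t0d0  Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
  Vendor: ATA  Product: ST31000340AS  Revision: SD15  Serial No: ...

If the Revision field shows one of the affected 7200.11 firmware versions
(SD15 was one of them, if I remember right), get the updated firmware from
Seagate before doing anything else.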
kristof wrote:
> I'm trying to set up a redundant zpool using 2 servers and OpenSolaris b105
> with COMSTAR iSCSI.
>
> Here are my setup details:
>
> 2 servers, each with:
> - Tyan S2925
> - 6 x 1TB disks
> - 2 onboard nge NICs
> - 1 PCIe IB card: MHEA28-1TC
>
> Partition layout:
>
>  * Id  Act  Bhead  Bsect  Bcyl  Ehead  Esect  Ecyl  Rsect      Numsect
>    191 128  0      1      2     254    63     1023  32130      58605120
>    191 0    254    63     1023  254    63     1023  58637250   625121280
>    191 0    254    63     1023  254    63     1023  683758530  1250242560
>
> - First primary partition (disks 1 & 2) used for rpool
> - Second primary partition is exposed via iSCSI (COMSTAR)
> - Third partition is not used so far
>
> r...@comstar1:~/iser# stmfadm list-lu -v
> LU Name: 600144F05850C200000049884CF70001
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c1t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F05850C200000049884CFB0002
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c1t1d0p2
>     View Entry Count  : 1
> LU Name: 600144F05850C200000049884D060003
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c2t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F05850C200000049884D090004
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c2t1d0p2
>     View Entry Count  : 1
> LU Name: 600144F05850C200000049884D100005
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c5t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F05850C200000049884D130006
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c5t1d0p2
>     View Entry Count  : 1

(This second set of LUs is presumably the same listing taken on comstar2:)

> LU Name: 600144F0B2174500000049884D550002
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c1t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F0B2174500000049884D5F0003
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c1t1d0p2
>     View Entry Count  : 1
> LU Name: 600144F0B2174500000049884E300004
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c2t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F0B2174500000049884E330005
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c2t1d0p2
>     View Entry Count  : 1
> LU Name: 600144F0B2174500000049884E380006
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c3t0d0p2
>     View Entry Count  : 1
> LU Name: 600144F0B2174500000049884E3A0007
>     Operational Status: Online
>     Provider Name     : sbd
>     Alias             : /dev/rdsk/c3t1d0p2
>     View Entry Count  : 1
>
> On both servers I created a target group & host group:
>
> r...@comstar1:~/iser# stmfadm list-tg -v
> Target Group: comstarcluster
>         Member: iqn.1986-03.com.sun:02:72169083-d7a2-cf5f-8b5a-f253fca09ad3
>
> Host Group: cluster
>         Member: iqn.1986-03.com.sun:01:e00000000000.492d42bc
>         Member: iqn.1986-03.com.sun:01:e00000000000.492d42bd
>
> Target Group: comstarcluster
>         Member: iqn.1986-03.com.sun:02:687afe9c-97fb-6733-896d-f4fe742ace59
>
> r...@comstar2:~# stmfadm list-hg -v
> Host Group: cluster
>         Member: iqn.1986-03.com.sun:01:e00000000000.492d42bc
>         Member: iqn.1986-03.com.sun:01:e00000000000.492d42bd
>
> Then I added a view per server (6 LUNs per view).
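(Side note for anyone reproducing this setup: the message doesn't show the
LU/view creation commands, but on b105 each of the LUs and views above would
have been created with something along these lines, using the first LU as an
example:

  sbdadm create-lu /dev/rdsk/c1t0d0p2
  stmfadm add-view -t comstarcluster -h cluster 600144F05850C200000049884CF70001

The GUID passed to add-view is the LU name that create-lu prints.)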
> Finally I configured static iSCSI discovery on server comstar1 and created
> a mirrored zpool:
>
> r...@comstar1:~/iser# iscsiadm list static-config
> Static Configuration Target:
>     iqn.1986-03.com.sun:02:687afe9c-97fb-6733-896d-f4fe742ace59,192.168.100.2:3260
> Static Configuration Target:
>     iqn.1986-03.com.sun:02:687afe9c-97fb-6733-896d-f4fe742ace59,192.168.101.2:3260
> Static Configuration Target:
>     iqn.1986-03.com.sun:02:72169083-d7a2-cf5f-8b5a-f253fca09ad3,127.0.0.1:3260
>
> - 192.168.100.2 is ibd0 on server comstar2
> - 192.168.101.2 is ibd1 on server comstar2
>
> zpool create storagepoolb \
>   mirror c3t600144F05850C200000049884CF70001d0 c3t600144F0B2174500000049884D550002d0 \
>   mirror c3t600144F05850C200000049884CFB0002d0 c3t600144F0B2174500000049884D5F0003d0 \
>   mirror c3t600144F05850C200000049884D060003d0 c3t600144F0B2174500000049884E300004d0 \
>   mirror c3t600144F05850C200000049884D090004d0 c3t600144F0B2174500000049884E330005d0 \
>   mirror c3t600144F05850C200000049884D100005d0 c3t600144F0B2174500000049884E380006d0 \
>   mirror c3t600144F05850C200000049884D130006d0 c3t600144F0B2174500000049884E3A0007d0
>
>   pool: storagepoolb
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME                                       STATE     READ WRITE CKSUM
>         storagepoolb                               ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884CF70001d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884D550002d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884CFB0002d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884D5F0003d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D060003d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E300004d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D090004d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E330005d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D100005d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E380006d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D130006d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E3A0007d0  ONLINE       0     0     0
>
> Then I created some filesystems and volumes:
>
> NAME                        USED  AVAIL  REFER  MOUNTPOINT
> storagepoolb               4.87G  1.71T    21K  /storagepoolb
> storagepoolb/clients         18K  1.71T    18K  /storagepoolb/clients
> storagepoolb/images        4.87G  1.71T    18K  /storagepoolb/images
> storagepoolb/images/vista    16K  1.71T    16K  -
> storagepoolb/images/xp     4.87G  1.71T  4.87G  -
>
> I exposed storagepoolb/images/xp via iSCSI (COMSTAR) and connected to it
> from the same server, so I could convert my iSCSI boot image (a VDI file)
> onto the raw device.
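(Presumably the xp zvol was exported the same way as the p2 partitions,
something like:

  sbdadm create-lu /dev/zvol/rdsk/storagepoolb/images/xp
  stmfadm add-view -t comstarcluster -h cluster <GUID from create-lu>

with the VDI then written onto the raw device, e.g. via "VBoxManage
internalcommands converttoraw" plus dd.)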
> So far so good. Now I tried to boot my thin client from this exposed
> volume. The boot started, but after some time it was "hanging". I checked
> the log files on server comstar1 and found the following error messages:
>
> Feb  3 18:21:59 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:21:59 comstar1        /scsi_vhci/d...@g600144f05850c200000049884d060003 (sd14): Command Timeout on path /iscsi (iscsi0)
> Feb  3 18:21:59 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:21:59 comstar1        /scsi_vhci/d...@g600144f05850c200000049884d130006 (sd17): Command Timeout on path /iscsi (iscsi0)
> Feb  3 18:21:59 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:21:59 comstar1 last message repeated 1 time
> Feb  3 18:22:09 comstar1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/d...@g600144f05850c200000049884d130006 (sd17):
> Feb  3 18:22:09 comstar1        device busy too long
> Feb  3 18:22:09 comstar1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/d...@g600144f05850c200000049884d060003 (sd14):
> Feb  3 18:22:09 comstar1        device busy too long
> Feb  3 18:22:14 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:23:15 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:23:15 comstar1        (sd17): path (iscsi0), reset 1 failed
> Feb  3 18:23:15 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
> Feb  3 18:23:15 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:24:16 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:24:16 comstar1        (sd14): path (iscsi0), reset 1 failed
> Feb  3 18:24:16 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
> Feb  3 18:24:21 comstar1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/d...@g600144f05850c200000049884d130006 (sd17):
> Feb  3 18:24:21 comstar1        device busy too long
> Feb  3 18:24:21 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:24:21 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:24:21 comstar1        (sd17): path (iscsi0), reset 1 failed
> Feb  3 18:24:21 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
> Feb  3 18:24:21 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:24:26 comstar1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/d...@g600144f05850c200000049884d060003 (sd14):
> Feb  3 18:24:26 comstar1        device busy too long
> Feb  3 18:24:26 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:24:26 comstar1 scsi: [ID 243001 kern.warning] WARNING: /scsi_vhci (scsi_vhci0):
> Feb  3 18:24:26 comstar1        (sd14): path (iscsi0), reset 1 failed
> Feb  3 18:24:26 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x0
> Feb  3 18:24:26 comstar1 scsi_vhci: [ID 734749 kern.warning] WARNING: vhci_scsi_reset 0x1
> Feb  3 18:24:31 comstar1 scsi: [ID 107833 kern.warning] WARNING: /scsi_vhci/d...@g600144f05850c200000049884d130006 (sd17):
> Feb  3 18:24:31 comstar1        device busy too long
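(The pattern above, command timeouts followed by vhci_scsi_reset attempts that
end in "reset 1 failed", means the initiator gave up on outstanding commands
and then could not even reset the iSCSI path. The reports below look like
"fmdump -eV" output; when the box is wedged like this, these two commands are
useful for narrowing down where things stall:

  iscsiadm list target -v    # session/connection state per target, initiator side
  fmdump -eV                 # raw FMA error telemetry, like the reports below
)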
> Feb 03 2009 18:22:09.990389013 ereport.io.scsi.cmd.disk.dev.rqs.derr
> nvlist version: 0
>         class = ereport.io.scsi.cmd.disk.dev.rqs.derr
>         ena = 0xc68a392180200001
>         detector = (embedded nvlist)
>         nvlist version: 0
>                 version = 0x0
>                 scheme = dev
>                 device-path = /iscsi/[email protected]:02:72169083-d7a2-cf5f-8b5a-f253fca09ad30001,2
>                 devid = id1,s...@n600144f05850c200000049884d060003
>         (end detector)
>
>         driver-assessment = retry
>         op-code = 0x28
>         cdb = 0x28 0x0 0x25 0x42 0x53 0x0 0x0 0x0 0x10 0x0
>         pkt-reason = 0x0
>         pkt-state = 0x3f
>         pkt-stats = 0x0
>         stat-code = 0x2
>         key = 0x6
>         asc = 0x29
>         ascq = 0x0
>         sense-data = 0x70 0x0 0x6 0x0 0x0 0x0 0x0 0xa 0x0 0x0 0x0 0x0 0x29 0x0 0x0 0x0 0x0 0x0 0x0 0x0
>         __ttl = 0x1
>         __tod = 0x49887d41 0x3b082315
>
> Feb 03 2009 18:22:09.990389005 ereport.io.scsi.cmd.disk.recovered
> nvlist version: 0
>         class = ereport.io.scsi.cmd.disk.recovered
>         ena = 0xc68a392180200001
>         detector = (embedded nvlist)
>         nvlist version: 0
>                 version = 0x0
>                 scheme = dev
>                 device-path = /iscsi/[email protected]:02:72169083-d7a2-cf5f-8b5a-f253fca09ad30001,2
>                 devid = id1,s...@n600144f05850c200000049884d060003
>         (end detector)
>
>         driver-assessment = recovered
>         op-code = 0x28
>         cdb = 0x28 0x0 0x25 0x42 0x53 0x0 0x0 0x0 0x10 0x0
>         pkt-reason = 0x0
>         pkt-state = 0x1f
>         pkt-stats = 0x0
>         __ttl = 0x1
>         __tod = 0x49887d41 0x3b08230d
>
> Feb 03 2009 18:22:09.990388931 ereport.io.scsi.cmd.disk.recovered
> nvlist version: 0
>         class = ereport.io.scsi.cmd.disk.recovered
>         ena = 0xc68a392180200001
>         detector = (embedded nvlist)
>         nvlist version: 0
>                 version = 0x0
>                 scheme = dev
>                 device-path = /iscsi/[email protected]:02:72169083-d7a2-cf5f-8b5a-f253fca09ad30001,2
>                 devid = id1,s...@n600144f05850c200000049884d060003
>         (end detector)
>
>         driver-assessment = recovered
>         op-code = 0x28
>         cdb = 0x28 0x0 0x0 0x0 0x3 0x0 0x0 0x0 0x10 0x0
>         pkt-reason = 0x0
>         pkt-state = 0x1f
>         pkt-stats = 0x0
>         __ttl = 0x1
>         __tod = 0x49887d41 0x3b0822c3
>
> Feb 03 2009 18:24:37.976452834 ereport.fs.zfs.io
> nvlist version: 0
>         class = ereport.fs.zfs.io
>         ena = 0xc8b183a3b8100001
>         detector = (embedded nvlist)
>         nvlist version: 0
>                 version = 0x0
>                 scheme = zfs
>                 pool = 0x8b47077e6fdf66ba
>                 vdev = 0x9ca5aa13d8563613
>         (end detector)
>
>         pool = storagepoolb
>         pool_guid = 0x8b47077e6fdf66ba
>         pool_context = 0
>         pool_failmode = wait
>         vdev_guid = 0x9ca5aa13d8563613
>         vdev_type = disk
>         vdev_path = /dev/dsk/c3t600144F05850C200000049884D060003d0s0
>         vdev_devid = id1,s...@n600144f05850c200000049884d060003/a
>         parent_guid = 0x78f0b1a8acd2242e
>         parent_type = mirror
>         zio_err = 5
>         zio_offset = 0x10ca000
>         zio_size = 0x2000
>         zio_objset = 0x2f
>         zio_object = 0x1
>         zio_level = 0
>         zio_blkid = 0x2458
>         __ttl = 0x1
>         __tod = 0x49887dd5 0x3a337ce2
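For what it's worth, the interesting fields in the first report decode as
follows (standard SCSI values, nothing setup-specific):

  op-code  0x28         READ(10)
  key      0x6          UNIT ATTENTION
  asc/ascq 0x29/0x0     power on, reset, or bus device reset occurred

and zio_err = 5 in the ZFS report is EIO. In other words, reads were failing
right after the target had been reset, which lines up with the reset storm in
syslog: no media errors from the LUN itself, just resets and failed reads.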
> Also zpool status was showing errors:
>
>   pool: storagepoolb
>  state: ONLINE
> status: One or more devices has experienced an unrecoverable error.  An
>         attempt was made to correct the error.  Applications are unaffected.
> action: Determine if the device needs to be replaced, and clear the errors
>         using 'zpool clear' or replace the device with 'zpool replace'.
>    see: http://www.sun.com/msg/ZFS-8000-9P
>  scrub: none requested
> config:
>
>         NAME                                       STATE     READ WRITE CKSUM
>         storagepoolb                               ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884CF70001d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884D550002d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884CFB0002d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884D5F0003d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D060003d0  ONLINE       2     0     0
>             c3t600144F0B2174500000049884E300004d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D090004d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E330005d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D100005d0  ONLINE       0     0     0
>             c3t600144F0B2174500000049884E380006d0  ONLINE       0     0     0
>           mirror                                   ONLINE       0     0     0
>             c3t600144F05850C200000049884D130006d0  ONLINE       2     0     0
>             c3t600144F0B2174500000049884E3A0007d0  ONLINE       0     0     0
>
> errors: No known data errors
>
> Can someone tell me what is going wrong?
>
> Thanks in advance
>
> Kristof
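Since zpool status shows no data errors, you can clear the error counters and
run a scrub to double-check the mirrors:

  zpool clear storagepoolb
  zpool scrub storagepoolb
  zpool status -v storagepoolb

If the read errors and timeouts come back under load, I would look at the disk
controllers and the drive firmware first; in my case it turned out to be the
controller.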
