[zfs-discuss] Drives going offline in Zpool
Hi,

I have a Dell MD1200 enclosure connected to two heads (Dell R710). The heads have a PERC H800 card, and each drive is configured as a RAID0 virtual disk in the RAID controller. One of the drives crashed and was replaced by a spare. Resilvering was triggered, but it fails to complete because drives keep going offline; I have to reboot the head (R710) for the drives to come back online. This has happened repeatedly: the resilver hung at 4% done, the head was rebooted, it hung again at 27% done, and so on. The issue happens with both Solaris 11.1 and OmniOS, and with both heads. It is a 100 TB pool with 69 TB used. The data is critical and I cannot afford to lose it. Can I recover the data somehow, at least partially? I have verified there is no hardware issue with the H800 and have also upgraded the H800 firmware.

Current OS: Solaris 11.1

Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@12,0 (sd26):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@c,0 (sd20):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@18,0 (sd32):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1c,0 (sd36):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1b,0 (sd35):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1e,0 (sd38):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@19,0 (sd33):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@1d,0 (sd37):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@27,0 (sd47):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone
Mar 22 21:47:55 solaris scsi: [ID 107833 kern.warning] WARNING: /pci@0,0/pci8086,340e@7/pci1028,1f15@0/sd@26,0 (sd46):
Mar 22 21:47:55 solaris    Command failed to complete...Device is gone

# zpool status -v
  pool: test
 state: DEGRADED
status: One or more devices is currently being resilvered.  The pool will
        continue to function in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Mar 20 19:13:40 2013
        27.4T scanned out of 69.6T at 183M/s, 67h11m to go
        2.43T resilvered, 39.32% done
config:

        NAME                        STATE     READ WRITE CKSUM
        test                        DEGRADED     0     0     0
          raidz1-0                  DEGRADED     0     0     0
            c8t0d0                  ONLINE       0     0     0
            c8t1d0                  DEGRADED     0     0     0
            c8t2d0                  DEGRADED     0     0     0
            c8t3d0                  ONLINE       0     0     0
            spare-4                 DEGRADED     0     0     0
              12459181442598970150  UNAVAIL      0     0     0
              c8t45d0               DEGRADED     0     0     0  (resilvering)
          raidz1-1                  ONLINE       0     0     0
            c8t5d0                  ONLINE       0     0     0
            c8t6d0                  ONLINE       0     0     0
            c8t7d0                  ONLINE       0     0     0
            c8t8d0                  ONLINE       0     0     0
            c8t9d0                  ONLINE       0     0     0
          raidz1-3                  DEGRADED     0     0     0
            c8t12d0                 ONLINE       0     0     0
            c8t13d0                 ONLINE       0     0     0
            c8t14d0                 ONLINE       0     0     0
            c8t15d0                 DEGRADED     0     0     0
            c8t16d0                 ONLINE       0     0     0
            c8t17d0                 ONLINE       0     0     0
            c8t18d0                 ONLINE       0     0     0
            c8t19d0                 ONLINE       0     0     0
            c8t20d0                 DEGRADED     0     0     0
            c8t21d0                 DEGRADED     0     0     0
            spare-10                DEGRADED     0     0     0
              c8t22d0               DEGRADED     0     0     0
              c8t47d0               DEGRADED     0
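One way to attempt a partial recovery while the resilver problem is being investigated is to import the pool read-only, so that no resilver or other writes are started and the drives are stressed as little as possible. The sketch below assumes the pool is named test (as in the status output above) and that the original head has released it; it is a general illustration, not a verified procedure for this MD1200/H800 setup:

  # import without starting a resilver or allowing any writes,
  # mounting all datasets under /mnt
  zpool import -o readonly=on -R /mnt test

  # confirm the datasets are visible, then copy the critical data
  # to another machine (rsync, zfs send, etc.)
  zfs list -r test

If the pool was not cleanly exported on the other head, adding -f to the import may be needed in addition to the readonly option.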
[zfs-discuss] SSD for L2arc
Hi,

Can I know how to configure an SSD to be used for L2ARC? Basically I want to improve read performance. To increase write performance, will an SSD for the ZIL help? From what I read on forums, the ZIL is only used for MySQL/transaction-based writes, and I have regular writes only.

Thanks.

Regards,
Ram
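A minimal sketch of how the devices are usually attached, assuming a pool named test and SSDs that show up as c2t48d0 and c2t49d0 (both device names are placeholders for whatever the system actually reports):

  # add an SSD as an L2ARC (cache) device
  zpool add test cache c2t48d0

  # add an SSD as a separate log device (SLOG) for the ZIL
  zpool add test log c2t49d0

  # verify placement
  zpool status test

Note that a separate log device only speeds up synchronous writes (NFS, databases, fsync-heavy workloads); ordinary asynchronous writes will not get faster from a SLOG, whereas the cache device can help any repeated-read workload once it warms up.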
[zfs-discuss] Slow zfs writes
Hi,

My OmniOS host is experiencing slow ZFS writes (around 30 times slower than usual). iostat reports the errors below even though the pool is healthy. This has been happening for the past 4 days, although no change was made to the system. Are the hard disks faulty? Please help.

root@host:~# zpool status -v
  pool: test
 state: ONLINE
status: The pool is formatted using a legacy on-disk format.  The pool can
        still be used, but some features are unavailable.
action: Upgrade the pool using 'zpool upgrade'.  Once this is done, the
        pool will no longer be accessible on software that does not support
        feature flags.
config:

        NAME         STATE     READ WRITE CKSUM
        test         ONLINE       0     0     0
          raidz1-0   ONLINE       0     0     0
            c2t0d0   ONLINE       0     0     0
            c2t1d0   ONLINE       0     0     0
            c2t2d0   ONLINE       0     0     0
            c2t3d0   ONLINE       0     0     0
            c2t4d0   ONLINE       0     0     0
          raidz1-1   ONLINE       0     0     0
            c2t5d0   ONLINE       0     0     0
            c2t6d0   ONLINE       0     0     0
            c2t7d0   ONLINE       0     0     0
            c2t8d0   ONLINE       0     0     0
            c2t9d0   ONLINE       0     0     0
          raidz1-3   ONLINE       0     0     0
            c2t12d0  ONLINE       0     0     0
            c2t13d0  ONLINE       0     0     0
            c2t14d0  ONLINE       0     0     0
            c2t15d0  ONLINE       0     0     0
            c2t16d0  ONLINE       0     0     0
            c2t17d0  ONLINE       0     0     0
            c2t18d0  ONLINE       0     0     0
            c2t19d0  ONLINE       0     0     0
            c2t20d0  ONLINE       0     0     0
            c2t21d0  ONLINE       0     0     0
            c2t22d0  ONLINE       0     0     0
            c2t23d0  ONLINE       0     0     0
          raidz1-4   ONLINE       0     0     0
            c2t24d0  ONLINE       0     0     0
            c2t25d0  ONLINE       0     0     0
            c2t26d0  ONLINE       0     0     0
            c2t27d0  ONLINE       0     0     0
            c2t28d0  ONLINE       0     0     0
            c2t29d0  ONLINE       0     0     0
            c2t30d0  ONLINE       0     0     0
          raidz1-5   ONLINE       0     0     0
            c2t31d0  ONLINE       0     0     0
            c2t32d0  ONLINE       0     0     0
            c2t33d0  ONLINE       0     0     0
            c2t34d0  ONLINE       0     0     0
            c2t35d0  ONLINE       0     0     0
            c2t36d0  ONLINE       0     0     0
            c2t37d0  ONLINE       0     0     0
          raidz1-6   ONLINE       0     0     0
            c2t38d0  ONLINE       0     0     0
            c2t39d0  ONLINE       0     0     0
            c2t40d0  ONLINE       0     0     0
            c2t41d0  ONLINE       0     0     0
            c2t42d0  ONLINE       0     0     0
            c2t43d0  ONLINE       0     0     0
            c2t44d0  ONLINE       0     0     0
        spares
          c5t10d0    AVAIL
          c5t11d0    AVAIL
          c2t45d0    AVAIL
          c2t46d0    AVAIL
          c2t47d0    AVAIL

# iostat -En
c4t0d0   Soft Errors: 0 Hard Errors: 5 Transport Errors: 0
Vendor: iDRAC    Product: Virtual CD       Revision: 0323 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 5 No Device: 0 Recoverable: 0
Illegal Request: 1 Predictive Failure Analysis: 0
c3t0d0   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: iDRAC    Product: LCDRIVE          Revision: 0323 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0
Illegal Request: 0 Predictive Failure Analysis: 0
c4t0d1   Soft Errors: 0 Hard Errors: 0 Transport Errors: 0
Vendor: iDRAC    Product: Virtual Floppy   Revision: 0323 Serial No:
Size: 0.00GB <0 bytes>
Media Error: 0 Device Not Ready: 0 No Device: 0 Recoverable: 0

root@host:~# fmadm faulty
--------------- ------------------------------------  -------------- ---------
TIME            EVENT-ID                              MSG-ID         SEVERITY
--------------- ------------------------------------  -------------- ---------
Jan 05 08:21:09 7af1ab3c-83c2-602d-d4b9-f9040db6944a  ZFS-8000-HC    Major

Host        : host
Platform    : PowerEdge-R810
Product_sn  :

Fault class : fault.fs.zfs.io_failure_wait
Affects     : zfs://pool=test
                  faulted but still in service
Problem in  : zfs://pool=test
                  faulted but still in service

Description : The ZFS pool has experienced currently unrecoverable I/O
              failures.  Refer to http://illumos.org/msg/ZFS-8000-HC for
              more information.

Response    : No automated response will be taken.

Impact      : Read and write I/Os cannot be serviced.

Action      : Make sure the affected devices are connected, then run
              'zpool clear'.

Regards,
Ram
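A quick way to check whether a single slow or failing disk is dragging the whole pool down is to watch per-device service times while the slow writes are happening. This is a generic diagnostic sketch (the interval and grep pattern are arbitrary), not something specific to this host:

  # extended per-device statistics, 5-second samples; look for one disk whose
  # asvc_t (average service time) or %b (busy) is far higher than its peers
  iostat -xn 5

  # cumulative per-device error counters since boot
  iostat -En | egrep 'Errors|Vendor'

  # per-vdev throughput at the ZFS level, to see where the I/O is going
  zpool iostat -v test 5

A raidz vdev is roughly as fast as its slowest member, so one disk with service times in the hundreds of milliseconds can make writes look many times slower even though zpool status still reports everything ONLINE.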
Re: [zfs-discuss] Slow zfs writes
Hi Roy,

You are right, so it looks like a re-distribution issue. Initially there were two vdevs with 24 disks (disks 0-23) for close to a year. After that we added 24 more disks and created additional vdevs. The initial vdevs have filled up, and so the write speed declined. Now, how do I find which files are on a particular vdev or disk? That way I can remove them and copy them back to redistribute the data. Is there any other way to solve this?

Total capacity of pool: 98 TB
Used: 44 TB
Free: 54 TB

root@host:# zpool iostat -v
                 capacity     operations    bandwidth
pool           alloc   free   read  write   read  write
-------------  -----  -----  -----  -----  -----  -----
test           54.0T  62.7T     52  1.12K  2.16M  5.78M
  raidz1       11.2T  2.41T     13     30   176K   146K
    c2t0d0         -      -      5     18  42.1K  39.0K
    c2t1d0         -      -      5     18  42.2K  39.0K
    c2t2d0         -      -      5     18  42.5K  39.0K
    c2t3d0         -      -      5     18  42.9K  39.0K
    c2t4d0         -      -      5     18  42.6K  39.0K
  raidz1       13.3T   308G     13    100   213K   521K
    c2t5d0         -      -      5     94  50.8K   135K
    c2t6d0         -      -      5     94  51.0K   135K
    c2t7d0         -      -      5     94  50.8K   135K
    c2t8d0         -      -      5     94  51.1K   135K
    c2t9d0         -      -      5     94  51.1K   135K
  raidz1       13.4T  19.1T      9    455   743K  2.31M
    c2t12d0        -      -      3    137  69.6K   235K
    c2t13d0        -      -      3    129  69.4K   227K
    c2t14d0        -      -      3    139  69.6K   235K
    c2t15d0        -      -      3    131  69.6K   227K
    c2t16d0        -      -      3    141  69.6K   235K
    c2t17d0        -      -      3    132  69.5K   227K
    c2t18d0        -      -      3    142  69.6K   235K
    c2t19d0        -      -      3    133  69.6K   227K
    c2t20d0        -      -      3    143  69.6K   235K
    c2t21d0        -      -      3    133  69.5K   227K
    c2t22d0        -      -      3    143  69.6K   235K
    c2t23d0        -      -      3    133  69.5K   227K
  raidz1       2.44T  16.6T      5    103   327K   485K
    c2t24d0        -      -      2     48  50.8K  87.4K
    c2t25d0        -      -      2     49  50.7K  87.4K
    c2t26d0        -      -      2     49  50.8K  87.3K
    c2t27d0        -      -      2     49  50.8K  87.3K
    c2t28d0        -      -      2     49  50.8K  87.3K
    c2t29d0        -      -      2     49  50.8K  87.3K
    c2t30d0        -      -      2     49  50.8K  87.3K
  raidz1       8.18T  10.8T      5    295   374K  1.54M
    c2t31d0        -      -      2    131  58.2K   279K
    c2t32d0        -      -      2    131  58.1K   279K
    c2t33d0        -      -      2    131  58.2K   279K
    c2t34d0        -      -      2    132  58.2K   279K
    c2t35d0        -      -      2    132  58.1K   279K
    c2t36d0        -      -      2    133  58.3K   279K
    c2t37d0        -      -      2    133  58.2K   279K
  raidz1       5.42T  13.6T      5    163   383K   823K
    c2t38d0        -      -      2     61  59.4K   146K
    c2t39d0        -      -      2     61  59.3K   146K
    c2t40d0        -      -      2     61  59.4K   146K
    c2t41d0        -      -      2     61  59.4K   146K
    c2t42d0        -      -      2     61  59.3K   146K
    c2t43d0        -      -      2     62  59.2K   146K
    c2t44d0        -      -      2     62  59.3K   146K

On Mon, Feb 11, 2013 at 10:23 PM, Roy Sigurd Karlsbakk <r...@karlsbakk.net> wrote:

> > root@host:~# fmadm faulty
> > --------------- ------------------------------------  -------------- ---------
> > TIME            EVENT-ID                              MSG-ID         SEVERITY
> > --------------- ------------------------------------  -------------- ---------
> > Jan 05 08:21:09 7af1ab3c-83c2-602d-d4b9-f9040db6944a  ZFS-8000-HC    Major
> >
> > Host        : host
> > Platform    : PowerEdge-R810
> > Product_sn  :
> >
> > Fault class : fault.fs.zfs.io_failure_wait
> > Affects     : zfs://pool=test
> >                   faulted but still in service
> > Problem in  : zfs://pool=test
> >                   faulted but still in service
> >
> > Description : The ZFS pool has experienced currently unrecoverable I/O
> >               failures.  Refer to http://illumos.org/msg/ZFS-8000-HC for
> >               more information.
> >
> > Response    : No automated response will be taken.
> >
> > Impact      : Read and write I/Os cannot be serviced.
> >
> > Action      : Make sure the affected devices are connected, then run
> >               'zpool clear'.
>
> The pool looks healthy to me, but it isn't very well balanced. Have you
> been adding new VDEVs on your way to grow it? Check if some of the VDEVs
> are fuller than others. I don't have an OI/IllumOS system available ATM,
> but IIRC this can be done with iostat -v. Older versions of ZFS striped
> to all VDEVs regardless of fill, which slowed down the write speeds
> rather horribly if some VDEVs were full (90%). This shouldn't be the case
> with OmniOS, but it *may* be the case with an old zpool version. I don't
> know. I'd check fill
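There is no user-visible mapping from files to vdevs, so in practice rebalancing is done by rewriting the data and letting the allocator spread the new copies across all vdevs, including the emptier ones. A rough sketch of the copy-and-rename approach, assuming a dataset named test/data with enough free space in the pool for a temporary second copy (the dataset names are placeholders):

  # snapshot and make a second copy; the new writes land mostly on the
  # emptier vdevs
  zfs snapshot test/data@rebalance
  zfs send test/data@rebalance | zfs receive test/data_new

  # once the copy is verified, swap the datasets and drop the old,
  # poorly balanced copy
  zfs rename test/data test/data_old
  zfs rename test/data_new test/data
  zfs destroy -r test/data_old

This only works if the pool can hold both copies at once, and anything written to test/data after the snapshot has to be carried over separately (for example with an incremental send). Short of bp_rewrite landing, there is no in-place way to rebalance existing blocks.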
[zfs-discuss] Bp rewrite
Hi,

Does anyone know if there is any progress on bp_rewrite? It is much awaited, to solve the re-distribution issue and to allow moving vdevs.

Regards,
Ram