Re: [zfs-discuss] ZFS no longer working with FC devices.
On Sun, May 23, 2010 at 12:02 PM, Torrey McMahon tmcmah...@yahoo.com wrote:
> On 5/23/2010 11:49 AM, Richard Elling wrote:
>> FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life (EOSL) in 2006. Personally, I hate them with a passion and would like to extend an offer to use my tractor to bury the beast :-).
>
> I'm sure I can get some others to help.
>
> Can I smash the gbics? Those were my favorite. :-)

I'd be more than happy to take someone up on the offer, but I'd need a good deal on a more current FC array. Since this is my home environment I am limited by my insignificant pay and the wife factor (who does indulge me from time to time). Without a corporate IT budget I make do with everything from free to what I can afford used.

To be honest I'd rather be using an IBM DS4K series array.

The current stress test is creating 700 1GB files (50% of array capacity) from /dev/urandom, and then I will scrub.

If all goes well, it's back to u8 and tuning it.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] ZFS no longer working with FC devices.
On May 24, 2010, at 4:06 AM, Demian Phillips wrote:
> On Sun, May 23, 2010 at 12:02 PM, Torrey McMahon tmcmah...@yahoo.com wrote:
>> On 5/23/2010 11:49 AM, Richard Elling wrote:
>>> FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life (EOSL) in 2006. Personally, I hate them with a passion and would like to extend an offer to use my tractor to bury the beast :-).
>>
>> I'm sure I can get some others to help.
>>
>> Can I smash the gbics? Those were my favorite. :-)
>
> I'd be more than happy to take someone up on the offer, but I'd need a good deal on a more current FC array. Since this is my home environment I am limited by my insignificant pay and the wife factor (who does indulge me from time to time). Without a corporate IT budget I make do with everything from free to what I can afford used.
>
> To be honest I'd rather be using an IBM DS4K series array.
>
> The current stress test is creating 700 1GB files (50% of array capacity) from /dev/urandom, and then I will scrub.

Unfortunately, /dev/urandom is too slow for direct stress testing. It can be used as a seed for random data files that are then used for stress testing.

-- richard

--
ZFS and NexentaStor training, Rotterdam, July 13-15, 2010
http://nexenta-rotterdam.eventbrite.com/
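One way to read the seeding suggestion above is sketched below in Python. This is only an illustration, not anything posted in the thread: the function names, the 1 MiB seed size, and the `stress_NNNN.dat` file naming are all made up for the example. The idea is to pay the cost of the kernel random source once, then build each large file by repeating a rotated copy of that seed.

```python
import os


def make_seed(seed_bytes=1 << 20):
    """Read one block of random data from the kernel (slow, so do it once)."""
    return os.urandom(seed_bytes)


def write_test_file(path, seed, file_bytes, index=0):
    """Fill `path` with `file_bytes` bytes by repeating a rotated copy of `seed`."""
    if not seed:
        raise ValueError("seed must not be empty")
    # Rotate the seed by a per-file offset so the files are not byte-identical.
    shift = index % len(seed)
    block = seed[shift:] + seed[:shift]
    remaining = file_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            chunk = block[:remaining]  # whole block unless this is the final partial write
            f.write(chunk)
            remaining -= len(chunk)


def write_test_set(directory, count, file_bytes, seed=None):
    """Create `count` files of `file_bytes` bytes each and return their paths."""
    if seed is None:
        seed = make_seed()
    os.makedirs(directory, exist_ok=True)
    paths = []
    for i in range(count):
        # Hypothetical naming scheme, chosen only for this sketch.
        path = os.path.join(directory, "stress_%04d.dat" % i)
        write_test_file(path, seed, file_bytes, index=i)
        paths.append(path)
    return paths
```

For the test described in the thread, the call would be something like `write_test_set("/pool/stress", 700, 1 << 30)`, where the directory is a placeholder. Because the data repeats, this is probably only a fair disk workload when compression and dedup are off on the target dataset.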
Re: [zfs-discuss] ZFS no longer working with FC devices.
On Sat, May 22, 2010 at 11:33 AM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
> On Fri, 21 May 2010, Demian Phillips wrote:
>> For years I have been running a zpool using a Fibre Channel array with no problems. I would scrub every so often and dump huge amounts of data (tens or hundreds of GB) around and it never had a problem outside of one confirmed (by the array) disk failure.
>>
>> I upgraded to sol10x86 05/09 last year and since then I have discovered any sufficiently high I/O from ZFS starts causing timeouts and off-lining disks. Long term this leads to failure (once rebooted and cleaned, all is well) because you can no longer scrub reliably.
>
> The problem could be with the device driver, your FC card, or the array itself. In my case, issues I had blamed on my motherboard or Solaris were due to a defective FC card, and replacing the card resolved the problem.
>
> If the problem is that your storage array is becoming overloaded with requests, then try adding this to your /etc/system file:
>
> * Set device I/O maximum concurrency
> * http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
> set zfs:zfs_vdev_max_pending = 5
>
> Bob
> --
> Bob Friesenhahn
> bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
> GraphicsMagick Maintainer, http://www.GraphicsMagick.org/

I've gone back to Solaris 10 11/06. It's working fine, but I notice some differences in performance that I think are key to the problem.

With the latest Solaris 10 (u8), throughput according to zpool iostat was hitting about 115MB/sec, sometimes a little higher.

With 11/06 it maxes out at 40MB/sec.

Both setups are using mpio devices as far as I can tell.

Next is to go back to u8 and see if the tuning you suggested will help. It really looks to me like the OS is asking too much of the FC chain I have.

The really puzzling thing is that I was just told about a brand-new Dell Solaris x86 production box, using current and supported FC devices and a supported SAN, that gets the same kind of problems when a scrub is run. I'm going to investigate that and see if we can get a fix from Oracle, as that system does have a support contract. It may shed some light on the issue I am seeing on the older hardware.
Re: [zfs-discuss] ZFS no longer working with FC devices.
On May 23, 2010, at 6:01 AM, Demian Phillips wrote:
> On Sat, May 22, 2010 at 11:33 AM, Bob Friesenhahn bfrie...@simple.dallas.tx.us wrote:
>> On Fri, 21 May 2010, Demian Phillips wrote:
>>> For years I have been running a zpool using a Fibre Channel array with no problems. I would scrub every so often and dump huge amounts of data (tens or hundreds of GB) around and it never had a problem outside of one confirmed (by the array) disk failure.
>>>
>>> I upgraded to sol10x86 05/09 last year and since then I have discovered any sufficiently high I/O from ZFS starts causing timeouts and off-lining disks. Long term this leads to failure (once rebooted and cleaned, all is well) because you can no longer scrub reliably.
>>
>> The problem could be with the device driver, your FC card, or the array itself. In my case, issues I had blamed on my motherboard or Solaris were due to a defective FC card, and replacing the card resolved the problem.
>>
>> If the problem is that your storage array is becoming overloaded with requests, then try adding this to your /etc/system file:
>>
>> * Set device I/O maximum concurrency
>> * http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
>> set zfs:zfs_vdev_max_pending = 5

I would lower it even further. Perhaps 2.

> I've gone back to Solaris 10 11/06. It's working fine, but I notice some differences in performance that I think are key to the problem.

Yep, lots of performance improvements were added later.

> With the latest Solaris 10 (u8), throughput according to zpool iostat was hitting about 115MB/sec, sometimes a little higher.

That should be about right for the A5100.

> With 11/06 it maxes out at 40MB/sec.
>
> Both setups are using mpio devices as far as I can tell.
>
> Next is to go back to u8 and see if the tuning you suggested will help. It really looks to me like the OS is asking too much of the FC chain I have.

I think that is a nice way of saying it.

> The really puzzling thing is that I was just told about a brand-new Dell Solaris x86 production box, using current and supported FC devices and a supported SAN, that gets the same kind of problems when a scrub is run. I'm going to investigate that and see if we can get a fix from Oracle, as that system does have a support contract. It may shed some light on the issue I am seeing on the older hardware.

The scrub workload is no different than any other stress test. I'm sure you can run a benchmark or three on the raw device and get the same error messages.

FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life (EOSL) in 2006. Personally, I hate them with a passion and would like to extend an offer to use my tractor to bury the beast :-).

-- richard
Re: [zfs-discuss] ZFS no longer working with FC devices.
On 5/23/2010 11:49 AM, Richard Elling wrote:
> FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life (EOSL) in 2006. Personally, I hate them with a passion and would like to extend an offer to use my tractor to bury the beast :-).

I'm sure I can get some others to help.

Can I smash the gbics? Those were my favorite. :-)
Re: [zfs-discuss] ZFS no longer working with FC devices.
On Fri, 21 May 2010, Demian Phillips wrote:
> For years I have been running a zpool using a Fibre Channel array with no problems. I would scrub every so often and dump huge amounts of data (tens or hundreds of GB) around and it never had a problem outside of one confirmed (by the array) disk failure.
>
> I upgraded to sol10x86 05/09 last year and since then I have discovered any sufficiently high I/O from ZFS starts causing timeouts and off-lining disks. Long term this leads to failure (once rebooted and cleaned, all is well) because you can no longer scrub reliably.

The problem could be with the device driver, your FC card, or the array itself. In my case, issues I had blamed on my motherboard or Solaris were due to a defective FC card, and replacing the card resolved the problem.

If the problem is that your storage array is becoming overloaded with requests, then try adding this to your /etc/system file:

* Set device I/O maximum concurrency
* http://www.solarisinternals.com/wiki/index.php/ZFS_Evil_Tuning_Guide#Device_I.2FO_Queue_Size_.28I.2FO_Concurrency.29
set zfs:zfs_vdev_max_pending = 5

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
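To get a rough feel for what this tunable changes, the arithmetic can be sketched in Python. This assumes, per the tuning guide linked above, that zfs_vdev_max_pending caps the I/Os ZFS keeps outstanding per device, so the worst-case load on the array scales with the number of disks in the pool. The function names are invented for this sketch, the 14-disk count in the usage note is purely hypothetical (the thread never says how many disks are in the pool), and 35 is only what is commonly cited as the default for releases of that era, so treat it as an assumption too.

```python
def outstanding_ios(leaf_vdevs, max_pending):
    """Worst-case I/Os in flight if every device's queue is full."""
    return leaf_vdevs * max_pending


def compare_settings(leaf_vdevs, settings):
    """Map each candidate zfs_vdev_max_pending value to its worst-case total."""
    return {value: outstanding_ios(leaf_vdevs, value) for value in settings}
```

For a hypothetical 14-disk pool, `compare_settings(14, [35, 5, 2])` gives 490, 70, and 28 outstanding I/Os respectively, which suggests why a small value such as 5 (or the 2 proposed elsewhere in the thread) might take pressure off an older FC loop.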