Re: [zfs-discuss] Long resilver time
On Dec 26, 2010, at 5:33 AM, Jackson Wang wrote:
> Dear Richard,
> Thanks for your reply.
>
> Actually there is no other disk/controller fault in this system. A NexentaStor engineer, Andrew, just added the line "allow-bus-device-reset=0" to /kernel/drv/sd.conf on the NexentaStor system, and the resilver speed went up. Before the parameter was added, the system had been resilvering for more than two days without completing. After the engineer added that line and rebooted the system, the resilver took only about 10 hours to complete. Do you know what happened? Thanks!!

This occurs when a device is misbehaving and not responding to commands. When a device does not respond to commands for more than 60 seconds, the sd driver will issue a bus reset, which affects other devices on the "bus." This can happen regardless of the I/O workload. The workaround disables the bus resets, as described in the sd man page.
-- richard

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
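[Editor's note: per the exchange above, the workaround amounts to a one-line addition to the sd driver configuration. A sketch is below; the exact placement within sd.conf can vary by system, and a reboot is required for the change to take effect, so treat this as illustrative rather than a verified recipe.]

```
# /kernel/drv/sd.conf
# Disable SCSI bus/device resets so a single unresponsive disk cannot
# trigger resets that stall other devices on the same bus during a
# resilver. Reboot after editing for the setting to take effect.
allow-bus-device-reset=0;
```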
Re: [zfs-discuss] Long resilver time
Do you have SSDs in the pool? Which ones, and any errors on those?

On 26 Dec 2010 13:35, "Jackson Wang" wrote:
> Dear Richard,
> Thanks for your reply.
>
> Actually there is no other disk/controller fault in this system. A NexentaStor engineer, Andrew, just added the line "allow-bus-device-reset=0" to /kernel/drv/sd.conf on the NexentaStor system, and the resilver speed went up. Before the parameter was added, the system had been resilvering for more than two days without completing. After the engineer added that line and rebooted the system, the resilver took only about 10 hours to complete. Do you know what happened? Thanks!!
> [...]
Re: [zfs-discuss] Long resilver time
Dear Richard,
Thanks for your reply.

Actually there is no other disk/controller fault in this system. A NexentaStor engineer, Andrew, just added the line "allow-bus-device-reset=0" to /kernel/drv/sd.conf on the NexentaStor system, and the resilver speed went up. Before the parameter was added, the system had been resilvering for more than two days without completing. After the engineer added that line and rebooted the system, the resilver took only about 10 hours to complete. Do you know what happened? Thanks!!

On Sun, Dec 26, 2010 at 1:24 PM, Richard Elling wrote:
> On Dec 21, 2010, at 8:18 AM, Jackson Wang wrote:
> > Dear Richard,
> > I am a Nexenta user and I have the same problem of the resilver taking too long. I tried the suggestion from the link in your post, "zfs set resilver_speed=10% pool_name", but Nexenta does not have the resilver_speed property. How can I solve my issue on Nexenta? Please advise. Thanks!
>
> In general, resilver will take as long as needed. If your resilver is going very, very slow, then there could be other issues causing the slowness. Has the system been logging error messages related to the I/O subsystem during the resilver?
> -- richard

--
InfoTech Technology Corp.
威傑科技有限公司
http://www.infowize.com.tw

Jackson Wang 王仁傑
M: 0916163480
T: 02-26791430 / 03-5834432 / 070-1020-9886
F: 0940-472248

Tech Supp: supp...@infowize.com.tw
Sales Supp: sa...@infowize.com.tw
Re: [zfs-discuss] Long resilver time
On Dec 21, 2010, at 8:18 AM, Jackson Wang wrote:
> Dear Richard,
> I am a Nexenta user and I have the same problem of the resilver taking too long. I tried the suggestion from the link in your post, "zfs set resilver_speed=10% pool_name", but Nexenta does not have the resilver_speed property. How can I solve my issue on Nexenta? Please advise. Thanks!

In general, resilver will take as long as needed. If your resilver is going very, very slow, then there could be other issues causing the slowness. Has the system been logging error messages related to the I/O subsystem during the resilver?
-- richard
Re: [zfs-discuss] Long resilver time
Dear Richard,
How can I get the important ZFS fixes onto NexentaStor? My NexentaStor version is currently v3.0.4 Enterprise.

--
This message posted from opensolaris.org
Re: [zfs-discuss] Long resilver time
Dear Richard,
I am a Nexenta user and I have the same problem of the resilver taking too long. I tried the suggestion from the link in your post, "zfs set resilver_speed=10% pool_name", but Nexenta does not have the resilver_speed property. How can I solve my issue on Nexenta? Please advise. Thanks!
Re: [zfs-discuss] Long resilver time
Err... I meant Nexenta Core.

-J

On Mon, Sep 27, 2010 at 12:02 PM, Jason J. W. Williams <jasonjwwilli...@gmail.com> wrote:
> 134 it is. This is an OpenSolaris rig that's going to be replaced within the next 60 days, so I just need to get it to something that won't throw false checksum errors like the 120-123 builds do and has decent rebuild times.
>
> Future boxes will be NexentaStor.
>
> Thank you guys. :)
>
> -J
> [...]
Re: [zfs-discuss] Long resilver time
134 it is. This is an OpenSolaris rig that's going to be replaced within the next 60 days, so I just need to get it to something that won't throw false checksum errors like the 120-123 builds do and has decent rebuild times.

Future boxes will be NexentaStor.

Thank you guys. :)

-J

On Sun, Sep 26, 2010 at 2:21 PM, Richard Elling wrote:
> On Sep 26, 2010, at 1:16 PM, Roy Sigurd Karlsbakk wrote:
> > Are you sure upgrading to OI is safe at this point? 134 is stable unless you start fiddling with dedup, and OI is hardly tested. For a production setup, I'd recommend 134.
>
> For a production setup? For production I'd recommend something that is supported, preferably NexentaStor 3 (which is b134 + important ZFS fixes :-)
> -- richard
> [...]
Re: [zfs-discuss] Long resilver time
On Sep 26, 2010, at 1:16 PM, Roy Sigurd Karlsbakk wrote:
>>> Upgrading is definitely an option. What is the current snv favorite for ZFS stability? I apologize; with all the Oracle/Sun changes I haven't been paying as close attention to bug reports on zfs-discuss as I used to.
>>
>> OpenIndiana b147 is the latest binary release, and it also includes the fix for
>> CR6494473, "ZFS needs a way to slow down resilvering"
>> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473
>> http://www.openindiana.org
>
> Are you sure upgrading to OI is safe at this point? 134 is stable unless you start fiddling with dedup, and OI is hardly tested. For a production setup, I'd recommend 134.

For a production setup? For production I'd recommend something that is supported, preferably NexentaStor 3 (which is b134 + important ZFS fixes :-)
-- richard

--
OpenStorage Summit, October 25-27, Palo Alto, CA
http://nexenta-summit2010.eventbrite.com

Richard Elling
rich...@nexenta.com +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com
Re: [zfs-discuss] Long resilver time
> > Upgrading is definitely an option. What is the current snv favorite for ZFS stability? I apologize; with all the Oracle/Sun changes I haven't been paying as close attention to bug reports on zfs-discuss as I used to.
>
> OpenIndiana b147 is the latest binary release, and it also includes the fix for
> CR6494473, "ZFS needs a way to slow down resilvering"
> http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473
> http://www.openindiana.org

Are you sure upgrading to OI is safe at this point? 134 is stable unless you start fiddling with dedup, and OI is hardly tested. For a production setup, I'd recommend 134.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
[Translated from Norwegian:] In all pedagogy it is essential that the curriculum be presented intelligibly. It is an elementary imperative for all pedagogues to avoid excessive use of idioms of foreign origin. In most cases, adequate and relevant synonyms exist in Norwegian.
Re: [zfs-discuss] Long resilver time
On Sep 26, 2010, at 11:03 AM, Jason J. W. Williams wrote:
> Upgrading is definitely an option. What is the current snv favorite for ZFS stability? I apologize; with all the Oracle/Sun changes I haven't been paying as close attention to bug reports on zfs-discuss as I used to.

OpenIndiana b147 is the latest binary release, and it also includes the fix for
CR6494473, "ZFS needs a way to slow down resilvering"
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6494473
http://www.openindiana.org
-- richard
Re: [zfs-discuss] Long resilver time
Upgrading is definitely an option. What is the current snv favorite for ZFS stability? I apologize; with all the Oracle/Sun changes I haven't been paying as close attention to bug reports on zfs-discuss as I used to.

-J

On Sep 26, 2010, at 10:22, Roy Sigurd Karlsbakk wrote:
> > I just witnessed a resilver that took 4h for 27gb of data. [...]
>
> It surely seems a long time for 27 gigs. Scrub takes its time, but for this 50TB setup with currently ~29TB used, on WD Green drives (yeah, I know they're bad, but I didn't know that at the time I installed the box, and they have worked flawlessly for a year or so), scrub takes a bit of time, but nothing comparable to what you're reporting:
>
>    scrub: scrub completed after 47h57m with 0 errors on Fri Sep  3 16:57:26 2010
>
> Also, snv123 is quite old; is upgrading to 134 an option?
> [...]
Re: [zfs-discuss] Long resilver time
On Sun, 26 Sep 2010, Edward Ned Harvey wrote:
> 27G on a 6-disk raidz2 means approx 6.75G per disk. Ideally, the disk could write 7G = 56 Gbit in a couple of minutes if it were all sequential and there were no other activity in the system. So you're right to suspect something is suboptimal, but the root cause is inefficient resilvering code in zfs, specifically for raidzN. The resilver code spends a *lot* of time seeking, because it's not optimized by disk layout. This may change some day, but not in the near future.

Part of the problem is that the zfs designers decided that the filesystems should remain up and usable during a resilver. Without this requirement things would be a lot easier. For example, we could just run some utility and wait many hours (perhaps fewer hours than a zfs resilver takes) before the filesystems are allowed to be usable. Few of us want to return to that scenario.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] Long resilver time
- Original Message -
> I just witnessed a resilver that took 4h for 27gb of data. Setup is 3x raid-z2 stripes with 6 disks per raid-z2. Disks are 500gb in size. No checksum errors.
>
> It seems like an exorbitantly long time. The other 5 disks in the stripe with the replaced disk were at 90% busy and ~150 io/s each during the resilver. Does this seem unusual to anyone else? Could it be due to heavy fragmentation, or do I have a disk in the stripe going bad? Post-resilver no disk is above 30% util or noticeably higher than any other disk.
>
> Thank you in advance. (kernel is snv123)

It surely seems a long time for 27 gigs. Scrub takes its time, but for this 50TB setup with currently ~29TB used, on WD Green drives (yeah, I know they're bad, but I didn't know that at the time I installed the box, and they have worked flawlessly for a year or so), scrub takes a bit of time, but nothing comparable to what you're reporting:

   scrub: scrub completed after 47h57m with 0 errors on Fri Sep  3 16:57:26 2010

Also, snv123 is quite old; is upgrading to 134 an option?

Vennlige hilsener / Best regards

roy
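[Editor's note: as a quick sanity check on the scrub figure quoted above, the numbers imply a healthy aggregate read rate across the pool. "~29TB" is approximate, so the result is only indicative.]

```python
# Pool-wide scrub throughput implied by the report above:
# ~29 TB read in 47h57m with 0 errors.
used_tb = 29
hours = 47 + 57 / 60                                 # 47h57m
mb_per_s = used_tb * 1024 * 1024 / (hours * 3600)    # TB -> MB, time -> s
print(f"~{mb_per_s:.0f} MB/s aggregate across the pool")
```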
Re: [zfs-discuss] Long resilver time
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jason J. W. Williams
>
> I just witnessed a resilver that took 4h for 27gb of data. Setup is 3x raid-z2 stripes with 6 disks per raid-z2. Disks are 500gb in size. No checksum errors.

27G on a 6-disk raidz2 means approx 6.75G per disk. Ideally, the disk could write 7G = 56 Gbit in a couple of minutes if it were all sequential and there were no other activity in the system. So you're right to suspect something is suboptimal, but the root cause is inefficient resilvering code in zfs, specifically for raidzN. The resilver code spends a *lot* of time seeking, because it's not optimized by disk layout. This may change some day, but not in the near future. Mirrors don't suffer the same effect; at least, if they do, it's far less dramatic.

For now, all you can do is: (a) factor this into your decision to use mirrors versus raidz, (b) ensure there are no snapshots and minimal I/O during the resilver, and (c) if you opt for raidz, keep the number of disks in a raidz to a minimum. It is preferable to use 3 vdevs each of 7-disk raidz instead of a 21-disk raidz3. Your setup of 3x raidz2 is pretty reasonable, and a 4h resilver, although slow, is successful, which is more than you could say for a 21-disk raidz3.
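[Editor's note: the back-of-the-envelope arithmetic above can be sketched as follows. The 60 MB/s sustained write rate is an assumed figure for a 500 GB disk of that era, not a measurement from the thread; it is chosen to match the "7G in a couple of minutes" estimate.]

```python
# Ideal (purely sequential) resilver time for 27 GB on a 6-disk raidz2,
# following the per-disk reasoning in the post above.
data_gb = 27           # pool data to resilver
disks = 6              # disks per raidz2 vdev
parity = 2             # raidz2 parity columns
# Each data column holds data_gb / (disks - parity), i.e. ~6.75 GB,
# which is what the replacement disk must write.
per_disk_gb = data_gb / (disks - parity)
seq_write_mb_s = 60.0  # assumed sustained sequential write rate
ideal_minutes = per_disk_gb * 1024 / seq_write_mb_s / 60
print(f"~{per_disk_gb:.2f} GB per disk, ~{ideal_minutes:.1f} min if sequential")
```

The observed four hours is roughly two orders of magnitude above this sequential ideal, which is consistent with the seek-bound raidz resilver behavior described in the post.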