[zfs-discuss] Fwd: Is there any way to stop a resilver?

2010-09-29 Thread LIC mesh
This is an iSCSI/COMSTAR array.

The head was running 2009.06 stable with ZFS pool version 14. We updated it
to build 134 (keeping the old OS drives) but did not upgrade the zpool, so
it's still at version 14.

The targets are all running 2009.06 stable, each exporting 4 raidz1 LUNs of
6 drives apiece. 8 shelves have 1TB drives; the other 8 have 2TB drives.

The head sees the filesystem as comprised of 8 vdevs of 8 iSCSI LUNs each,
with SSD ZIL and SSD L2ARC.
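
As a sanity check on the layout described above, here is a rough sketch of the arithmetic in Python. It assumes 16 targets (8 shelves with 1TB drives plus 8 with 2TB drives), 4 raidz1 LUNs per target, 6 drives per LUN, and 8 LUNs per head-side vdev. The variable names are illustrative only and do not come from any tool.

```python
# Layout figures taken from the message above; names are illustrative.
TARGETS = 16            # 8 shelves of 1TB drives + 8 shelves of 2TB drives
LUNS_PER_TARGET = 4     # each target exports 4 raidz1 LUNs
DRIVES_PER_LUN = 6      # each raidz1 LUN is built from 6 drives
LUNS_PER_HEAD_VDEV = 8  # the head groups 8 iSCSI LUNs into each vdev

total_luns = TARGETS * LUNS_PER_TARGET             # LUNs visible to the head
total_drives = total_luns * DRIVES_PER_LUN         # physical drives behind them
head_vdevs = total_luns // LUNS_PER_HEAD_VDEV      # top-level vdevs on the head

print(total_luns, total_drives, head_vdevs)        # 64 384 8
```

Under these assumptions the head pool sits on 64 LUNs backed by 384 drives, which matches the "64 LUNs ... in 8 vdevs" figure given elsewhere in the thread.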



On Wed, Sep 29, 2010 at 11:49 AM, Scott Meilicke 
scott.meili...@craneaerospace.com wrote:

 What version of OS?
 Are snapshots running? (If so, turn them off.)

 So are there eight disks?



 On 9/29/10 8:46 AM, LIC mesh licm...@gmail.com wrote:

 It's always running less than an hour.

 It usually starts at an estimate of around 300,000 hours (1 minute in),
 climbs to an estimate in the millions (about 30 minutes in), and then restarts.

 Never gets past 0.00% completion, and K resilvered on any LUN.

 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs.
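
The estimates quoted above are consistent with a simple linear extrapolation from progress so far. The sketch below assumes the time-remaining figure is computed as elapsed * (1 - fraction_done) / fraction_done; that formula is an assumption for illustration, not something confirmed in this thread, and the function names are hypothetical.

```python
def eta_hours(elapsed_s: float, fraction_done: float) -> float:
    """Remaining hours if progress continues at the average rate so far."""
    return elapsed_s * (1.0 - fraction_done) / fraction_done / 3600.0


def implied_fraction(elapsed_s: float, eta_h: float) -> float:
    """Invert eta_hours: the fraction done that would produce a given estimate."""
    return elapsed_s / (elapsed_s + eta_h * 3600.0)


# One minute in, with an estimate of 300,000 hours (figures from the message).
f = implied_fraction(60, 300_000)
print(f"{f * 100:.2f}%")  # 0.00%
```

Under that assumption, a 300,000-hour estimate one minute in corresponds to a completed fraction far too small to register at two decimal places, which would fit the reported 0.00% completion.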




 On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke 
 scott.meili...@craneaerospace.com wrote:

 Has it been running long? Initially the numbers are *way* off. After a
 while it settles down into something reasonable.

 How many disks, and what size, are in your raidz2?

 -Scott


 On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote:

 Is there any way to stop a resilver?

 We gotta stop this thing - at minimum, completion time is 300,000 hours,
 and maximum is in the millions.

 It's a raidz2 array, so it has the redundancy; we just need to get the data off.

  --
 We value your opinion! How may we serve you better?
 Please click the survey link to tell us how we are doing:
 http://www.craneae.com/surveys/satisfaction.htm

 Your feedback is of the utmost importance to us. Thank you for your time.

 Crane Aerospace & Electronics Confidentiality Statement:
 The information contained in this email message may be privileged and is
 confidential information intended only for the use of the recipient, or any
 employee or agent responsible to deliver it to the intended recipient. Any
 unauthorized use, distribution or copying of this information is strictly
 prohibited and may be unlawful. If you have received this communication in
 error, please notify the sender immediately and destroy the original message
 and all attachments from your electronic files.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Fwd: Is there any way to stop a resilver?

2010-09-29 Thread Scott Meilicke
(I left the list off last time - sorry)

No, the resilver should only be happening if there was a spare available. Is
the whole thing scrubbing? It looks like it. Can you stop it with:

zpool scrub -s pool

So... Word of warning, I am no expert at this stuff. Think about what I am
suggesting before you do it :). Although stopping a scrub is pretty
innocuous.
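
For anyone scripting this, a minimal Python sketch that wraps the command above might look like the following. The helper names and the dry-run default are assumptions made for illustration; the only real command involved is `zpool scrub -s <pool>`, and nothing is executed unless dry_run is set to False.

```python
import subprocess


def scrub_stop_cmd(pool: str) -> list[str]:
    """Build the argv for `zpool scrub -s <pool>` (stop an in-progress scrub)."""
    if not pool or pool.startswith("-"):
        raise ValueError("pool name must be non-empty and must not look like an option")
    return ["zpool", "scrub", "-s", pool]


def stop_scrub(pool: str, dry_run: bool = True) -> list[str]:
    """Return the command; run it only when dry_run is False."""
    cmd = scrub_stop_cmd(pool)
    if not dry_run:
        # Raises CalledProcessError if zpool exits with a non-zero status.
        subprocess.run(cmd, check=True)
    return cmd


print(" ".join(stop_scrub("pool")))  # zpool scrub -s pool
```

Defaulting to a dry run keeps the sketch safe to try on a machine without ZFS and makes the exact command visible before anything touches the pool.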

-Scott

On 9/29/10 9:22 AM, LIC mesh licm...@gmail.com wrote:

 You almost have it - each iSCSI target is made up of 4 of the raidz vdevs:
 4 * 6 = 24 disks.
 
 16 targets total.
 
 We have one LUN with a status of UNAVAIL, but didn't know if removing it
 outright would help. It's actually available and healthy as far as the target
 is concerned, so we think it went UNAVAIL as a result of iSCSI timeouts. We've
 since fixed the switches' buffers, etc.
 
 See:
 http://pastebin.com/pan9DBBS
 
 
 
 On Wed, Sep 29, 2010 at 12:17 PM, Scott Meilicke
 scott.meili...@craneaerospace.com wrote:
 OK, let me see if I have this right:
 
 8 shelves, 1T disks, 24 disks per shelf = 192 disks
 8 shelves, 2T disks, 24 disks per shelf = 192 disks
 Each raidz is six disks.
 64 raidz vdevs
 Each iSCSI target is made up of 8 of these raidz vdevs (8 x 6 disks = 48
 disks)
 Then the head takes these eight targets and makes a raidz2. So the raidz2
 depends upon all 384 disks. So when a failure occurs, the resilver is
 accessing all 384 disks.
 
 If I have this right, which I seriously doubt :), then that will either
 take an enormous amount of time to complete, or never complete. It looks like
 never.
 
 Recovery:
 
 From the head, can you see which vdev has failed? If so, can you remove it to
 stop the resilver?
 
 
 
 
 
 


--
Scott Meilicke | Enterprise Systems Administrator | Crane Aerospace &
Electronics | +1 425-743-8153 | M: +1 206-406-2670


