Re: [zfs-discuss] ZFS flash issue
Thanks Cindy, Enda for the info. -- This message posted from opensolaris.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] zfs proerty aclmode gone in 147?
On Tue, Sep 28, 2010 at 12:18:49PM -0700, Paul B. Henson wrote: On Sat, 25 Sep 2010, Ralph Böhme wrote: Darwin's ACL model is nice and slick; the new NFSv4 one in 147 is just braindead. chmod resulting in ACLs being discarded is a bizarre design decision. Agreed. What's the point of ACLs that disappear? Sun didn't want to fix the acl/chmod interaction; maybe one of the new OpenSolaris forks will do the right thing... I've researched this enough (mainly by reading most of the ~240 or so relevant zfs-discuss posts and several bug reports) to conclude the following.

I've researched this by reading the specs for NFSv4, the withdrawn draft POSIX 1e, and Darwin ACLs, and have implemented mapping between them in a UNIX AFP fileserver.

- ACLs derived from POSIX mode_t and/or POSIX Draft ACLs that result in DENY ACEs are enormously confusing to users.
- ACLs derived from POSIX mode_t and/or POSIX Draft ACLs that result in DENY ACEs are susceptible to ACL re-ordering when modified from Windows clients - which insist on DENY ACEs first - leading to much confusion.

IMO the approach of intertwining the UNIX mode with ACEs was a bad idea in the first place, but it's in the spec, so of course the implementations that follow it must honour it. POSIX 1e does something similar, but my point here is that this is not necessarily the most clever, clean and safe spec. Note that Darwin (OS X) does _not_ do this mumbo-jumbo, so ...

- This all gets more confusing when hand-crafted ZFS inheritable ACEs are mixed with chmod(2)s under the old aclmode=groupmask setting.

The old aclmode=passthrough setting was dangerous and had to be removed, period. (Doing chmod(600) would not necessarily deny other users/groups access -- that's very, very broken.)

... in Darwin this will not remove any ACL from the object. The Darwin kernel evaluates permissions in a first-match paradigm, evaluating the ACL before the mode, and it does not intertwine ACL and mode.
It's a slick, clean, easy to understand and safe design. With this model I can stick an ACL on an object saying "deny unredeemed_hacker everything" and be sure that this ACL will stay there without being removed by any chmod. Fixing one NFSv4 spec ACL design issue (mapping mode and ACL) by just removing ACLs when the mapping must be done is spec-conforming, but IMO a bad idea. I haven't yet really studied the details of this implementation change in 147, so maybe I'm complaining too early. Regards, -r -- This message posted from opensolaris.org
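For the curious, the sticky deny-ACE behaviour can be demonstrated from an OS X shell (a sketch using Darwin's chmod ACL syntax; the user and file names are made up):

```shell
# OS X only: append a deny ACE, then change the mode bits.
touch secret.txt
chmod +a "unredeemed_hacker deny read,write,delete" secret.txt
chmod 600 secret.txt   # touches only the mode bits
ls -le secret.txt      # the deny ACE is still listed, unaffected by chmod
```

Because Darwin evaluates the ACL first and separately from the mode, no chmod(2) ever discards that entry.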
Re: [zfs-discuss] [osol-discuss] [illumos-Developer] zpool upgrade and zfs upgrade behavior on b145
Hi Cindy, I did see your first email pointing to that bug: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6538600. Apologies for not addressing it earlier. It is my opinion that the behavior Mike and I (http://illumos.org/issues/217), or anyone else upgrading pools right now, are seeing is an entirely new and different bug. The bug you point to, originally submitted in 2007, says it manifests itself before a reboot. Also, you say exporting and importing clears the problem. After several reboots, zdb still shows the older pool version, which means that this is a new bug, or perhaps the bug you are referencing does not describe clearly and accurately what it should and is incomplete. Suppose an export and import can update the pool label config on a large storage pool, great. How would someone go about exporting the rpool the operating system is on? As far as I know, it's impossible to export the zpool the operating system is running on. I don't think it can be done, but I'm new, so maybe I'm missing something. One option I have not explored that might work: booting to a live CD that has the same or higher pool version present and then doing:

zpool import
zpool import -f rpool
zpool export rpool

and then rebooting into the operating system. Perhaps this might be an option that works to update the label config / zdb for rpool, but I think fixing the root problem would be much more beneficial for everyone in the long run. Given that zdb is a troubleshooting/debugging tool, I would think it's necessary for it to be aware of the proper pool version to work properly, and so admins know what's really going on with their pools. The bottom line here is that if zdb is going to be part of zfs, it needs to display what is currently on disk, including the label config. If I were an admin thinking about trusting hundreds of GBs of data to zfs, I would want the debugger to show me what's really on the disks.
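For anyone wanting to see the discrepancy being described, comparing what the on-disk vdev labels say against what the pool itself reports goes roughly like this (the device name is made up for illustration):

```shell
# Version according to the on-disk vdev labels (what zdb reads):
zdb -l /dev/rdsk/c0t0d0s0 | grep version

# Version according to the running pool:
zpool get version rpool
```

If the bug is present, the first command keeps showing the pre-upgrade version across reboots while the second shows the new one.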
Additionally, even though zpool and zfs get version display the true and updated versions, I'm not convinced that the problem is zdb, as the label config is almost certainly set by the zpool and/or zfs commands. Somewhere, something that is supposed to happen when initiating a zpool upgrade is not happening, but since I know virtually nothing of the internals of zfs, I do not know where. Sincerely, -Chris
Re: [zfs-discuss] [osol-discuss] [illumos-Developer] zpool upgrade and zfs upgrade behavior on b145
Additionally, even though zpool and zfs get version display the true and updated versions, I'm not convinced that the problem is zdb, as the label config is almost certainly set by the zpool and/or zfs commands. Somewhere, something is not happening that is supposed to when initiating a zpool upgrade, but since I know virtually nothing of the internals of zfs, I do [...]

The problem is likely in the boot block or in grub. The development version did not update the boot block; newer versions of beadm do fix boot blocks. For now, I'd recommend you upgrade the boot block on all halves of a bootable mirror before you upgrade the zpool version or the zfs version. export/import won't help. Casper
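A sketch of what updating the boot blocks by hand looks like (the disk names are made up; run it against each side of the root mirror):

```shell
# x86: reinstall the GRUB stage1/stage2 boot blocks on each root-mirror disk
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0
installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t1d0s0

# SPARC equivalent:
installboot -F zfs /usr/platform/`uname -i`/lib/fs/zfs/bootblk /dev/rdsk/c0t0d0s0
```

Do this before the zpool/zfs upgrade, so the boot blocks can always read the pool they sit in front of.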
Re: [zfs-discuss] zfs proerty aclmode gone in 147?
On 9/28/2010 2:13 PM, Nicolas Williams wrote: The version of samba bundled with Solaris 10 seems to insist on chmod'ing stuff. I've tried all of the various options that should disable mapping to mode bits, yet still, randomly, when people copy files in over CIFS, ACLs get destroyed by chmod interaction and access control is broken. I finally ended up having to preload a shared object that overrides chmod and turns it into a null-op. Oh my! After another re-read of man zfs in onnv135 and the spec for aclmode there, it seems they've really removed the only useful setting for aclmode. To circumvent this I'll probably also have to wrap all chmod(2) calls (in code though, not by preloading like you had to) in my app and turn them into null-ops when performed on a ZFS volume (which my app knows), alongside wrapping all filesystem object creation actions in umask(2) calls in order to get the desired mode. Bleah! -- This message posted from opensolaris.org
Re: [zfs-discuss] [osol-discuss] [illumos-Developer] zpool upgrade and zfs upgrade behavior on b145
Well, strangely enough, I just logged into an OS b145 machine. Its rpool is not mirrored, just a single disk. I know that zdb reported zpool version 22 after at least the first 3 reboots after the rpool upgrade, so I stopped checking. zdb now reports version 27. This machine has probably been rebooted about five or six times since the pool version upgrade. One should not have to reboot six times! More mystery to this pool upgrade behavior! -Chris
Re: [zfs-discuss] My filesystem turned from a directory into a special character device
Interesting thread. So how would you go about fixing this? I suspect you have to track down the vnode, the znode_t, and eventually modify one of the kernel buffers for the znode_phys_t. If you're left with the decision to completely rebuild, then repairing this might be the only choice some people have. Dave On 09/27/10 11:56, Victor Latushkin wrote: On Sep 27, 2010, at 8:30 PM, Scott Meilicke wrote: I am running Nexenta CE 3.0.3. I have a file system that at some point in the last week went from a directory per 'ls -l' to a special character device. This results in not being able to get into the file system. Here is my file system, scott2, along with a new file system I just created, as seen by ls -l:

drwxr-xr-x 4 root root    4 Sep 27 09:14 scott
crwxr-xr-x 9 root root 0, 0 Sep 20 11:51 scott2

Notice the 'c' vs. 'd' at the beginning of the permissions list. I had been fiddling with permissions last week, then had problems with a kernel panic. Are you still running with aok/zfs_recover being set? Have you seen this issue before the panic? Perhaps this is related? May be. Any ideas how to get access to my file system? This can be fixed, but it is a bit more complicated and error prone than setting a couple of variables. Regards, Victor
Re: [zfs-discuss] zfs proerty aclmode gone in 147?
On Wed, Sep 29, 2010 at 03:44:57AM -0700, Ralph Böhme wrote: On 9/28/2010 2:13 PM, Nicolas Williams wrote: The version of samba bundled with Solaris 10 seems to insist on chmod'ing stuff. I've tried all of the various Just in case it's not clear, I did not write the quoted text. (One can tell from the level of quotation that an attribution is missing and that none of my text was quoted.) Nico --
[zfs-discuss] rpool spare
Using ZFS v22, is it possible to add a hot spare to rpool? Thanks
Re: [zfs-discuss] rpool spare
Hi Tony, The current behavior is that you can add a spare to a root pool. If the spare kicks in automatically, you would need to apply the boot blocks manually before you could boot from the spared-in disk. A good alternative is to create a two-way or three-way mirrored root pool. We're tracking the root pool boot issues. If a bug isn't filed for this issue, I will file it. Thanks, Cindy On 09/29/10 08:31, Tony MacDoodle wrote: Using ZFS v22, is it possible to add a hot spare to rpool? Thanks
[zfs-discuss] Is there any way to stop a resilver?
Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Is there any way to stop a resilver?
Has it been running long? Initially the numbers are way off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off. We value your opinion! How may we serve you better? Please click the survey link to tell us how we are doing: http://www.craneae.com/ContactUs/VoiceofCustomer.aspx Your feedback is of the utmost importance to us. Thank you for your time. Crane Aerospace Electronics Confidentiality Statement: The information contained in this email message may be privileged and is confidential information intended only for the use of the recipient, or any employee or agent responsible to deliver it to the intended recipient. Any unauthorized use, distribution or copying of this information is strictly prohibited and may be unlawful. If you have received this communication in error, please notify the sender immediately and destroy the original message and all attachments from your electronic files.
Re: [zfs-discuss] Is there any way to stop a resilver?
It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in), and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T, in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are *way* off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Is there any way to stop a resilver?
What version of OS? Are snapshots running (turn them off)? So are there eight disks? On 9/29/10 8:46 AM, LIC mesh licm...@gmail.com wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are way off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Is there any way to stop a resilver?
What caused the resilvering to kick off in the first place? Lin On Sep 29, 2010, at 8:46 AM, LIC mesh wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are way off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
[zfs-discuss] Fwd: Is there any way to stop a resilver?
This is an iSCSI/COMSTAR array. The head was running 2009.06 stable with version 14 ZFS, but we updated that to build 134 (kept the old OS drives) - we did not, however, update the zpool - it's still version 14. The targets are all running 2009.06 stable, exporting 4 raidz1 LUNs each of 6 drives - 8 shelves have 1TB drives, the other 8 have 2TB drives. The head sees the filesystem as comprised of 8 vdevs of 8 iSCSI LUNs each, with SSD ZIL and SSD L2ARC. On Wed, Sep 29, 2010 at 11:49 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: What version of OS? Are snapshots running (turn them off)? So are there eight disks? On 9/29/10 8:46 AM, LIC mesh licm...@gmail.com wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are *way* off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Is there any way to stop a resilver?
Most likely an iSCSI timeout, but that was before my time here. Since then, there have been various individual drives lost along the way on the shelves, but never a whole LUN, so, theoretically, /except/ for iSCSI timeouts, there has been no great reason to resilver. On Wed, Sep 29, 2010 at 11:51 AM, Lin Ling lin.l...@oracle.com wrote: What caused the resilvering to kick off in the first place? Lin On Sep 29, 2010, at 8:46 AM, LIC mesh wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are *way* off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Mac OS X clients with ZFS server
Hi all, Thanks to some clues from people on this list, I have finally resolved this issue! To summarise, I was having problems with timeouts when applications on my MacBook Pro tried to create new files on an NFS file system that was mounted from my server running snv_130 (writes to existing files were fine). The solution was to assign a static IP address to the MBP and ensure that proper forward and reverse DNS entries were present (in all of my tests to date, the MBP was using DHCP to get its IP address, and I haven't bothered populating my DNS with DHCP-related entries). Once the MBP was using a static IP address as described above, creating new files on NFS-mounted file systems works as flawlessly as one would expect! Thanks again to everyone who chimed in with ideas. Now, if I could only stop Aqua from doing the brain-dead "click mouse for input focus" and "auto-raise the window that has input focus" things, I'd be really happy... -- Rich Teer, Publisher Vinylphile Magazine www.vinylphilemag.com
Re: [zfs-discuss] rpool spare
Tony, A brief follow-up: the issue of applying the boot blocks automatically to a spare for a root pool is covered by existing CR 6668666. See this URL for more details: http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6668666 Thanks, Cindy On 09/29/10 08:38, Cindy Swearingen wrote: Hi Tony, The current behavior is that you can add a spare to a root pool. If the spare kicks in automatically, you would need to apply the boot blocks manually before you could boot from the spared-in disk. A good alternative is to create a two-way or three-way mirrored root pool. We're tracking the root pool boot issues. If a bug isn't filed for this issue, I will file it. Thanks, Cindy On 09/29/10 08:31, Tony MacDoodle wrote: Using ZFS v22, is it possible to add a hot spare to rpool? Thanks
Re: [zfs-discuss] Fwd: Is there any way to stop a resilver?
(I left the list off last time, sorry.) No, the resilver should only be happening if there was a spare available. Is the whole thing scrubbing? It looks like it. Can you stop it with a zpool scrub -s pool? So... word of warning, I am no expert at this stuff. Think about what I am suggesting before you do it :). Although stopping a scrub is pretty innocuous. -Scott On 9/29/10 9:22 AM, LIC mesh licm...@gmail.com wrote: You almost have it - each iSCSI target is made up of 4 of the raidz vdevs - 4 * 6 = 24 disks. 16 targets total. We have one LUN with a status of UNAVAIL, but didn't know if removing it outright would help - it's actually available and well as far as the target is concerned, so we thought it went UNAVAIL as a result of iSCSI timeouts - we've since fixed the switches' buffers, etc. See: http://pastebin.com/pan9DBBS On Wed, Sep 29, 2010 at 12:17 PM, Scott Meilicke scott.meili...@craneaerospace.com wrote: OK, let me see if I have this right:

8 shelves, 1T disks, 24 disks per shelf = 192 disks
8 shelves, 2T disks, 24 disks per shelf = 192 disks

Each raidz is six disks. 64 raidz vdevs. Each iSCSI target is made up of 8 of these raidz vdevs (8 x 6 disks = 48 disks). Then the head takes these eight targets, and makes a raidz2. So the raidz2 depends upon all 384 disks. So when a failure occurs, the resilver is accessing all 384 disks. If I have this right, which I seriously doubt :), then that will either take an enormous amount of time to complete, or never finish. It looks like never. Recovery: from the head, can you see which vdev has failed? If so, can you remove it to stop the resilver? On 9/29/10 8:57 AM, LIC mesh licm...@gmail.com wrote: This is an iSCSI/COMSTAR array. The head was running 2009.06 stable with version 14 ZFS, but we updated that to build 134 (kept the old OS drives) - did not, however, update the zpool - it's still version 14.
The targets are all running 2009.06 stable, exporting 4 raidz1 LUNs each of 6 drives - 8 shelves have 1TB drives, the other 8 have 2TB drives. The head sees the filesystem as comprised of 8 vdevs of 8 iSCSI LUNs each, with SSD ZIL and SSD L2ARC. On Wed, Sep 29, 2010 at 11:49 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: What version of OS? Are snapshots running (turn them off)? So are there eight disks? On 9/29/10 8:46 AM, LIC mesh licm...@gmail.com wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered, on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs. On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are way off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
-- Scott Meilicke | Enterprise Systems Administrator | Crane Aerospace Electronics | +1 425-743-8153 | M: +1 206-406-2670
Re: [zfs-discuss] Resilver endlessly restarting at completion
The endless resilver problem still persists on OI b147. It restarts when it should complete. I see no other solution than to copy the data to safety and recreate the array. Any hints would be appreciated, as that takes days unless I can stop or pause the resilvering. On Mon, Sep 27, 2010 at 1:13 PM, Tuomas Leikola tuomas.leik...@gmail.com wrote: Hi! My home server had some disk outages due to flaky cabling and whatnot, and started resilvering to a spare disk. During this, another disk or two dropped, and were reinserted into the array. So no devices were actually lost, they just were intermittently away for a while each. The situation is currently as follows:

  pool: tank
 state: ONLINE
status: One or more devices has experienced an unrecoverable error. An
        attempt was made to correct the error. Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
        using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scrub: resilver in progress for 5h33m, 22.47% done, 19h10m to go
config:

        NAME                       STATE     READ WRITE CKSUM
        tank                       ONLINE       0     0     0
          raidz1-0                 ONLINE       0     0     0
            c11t1d0p0              ONLINE       0     0     0
            c11t2d0                ONLINE       0     0     5
            c11t6d0p0              ONLINE       0     0     0
            spare-3                ONLINE       0     0     0
              c11t3d0p0            ONLINE       0     0     0  106M resilvered
              c9d1                 ONLINE       0     0     0  104G resilvered
            c11t4d0p0              ONLINE       0     0     0
            c11t0d0p0              ONLINE       0     0     0
            c11t5d0p0              ONLINE       0     0     0
            c11t7d0p0              ONLINE       0     0     0  93.6G resilvered
          raidz1-2                 ONLINE       0     0     0
            c6t2d0                 ONLINE       0     0     0
            c6t3d0                 ONLINE       0     0     0
            c6t4d0                 ONLINE       0     0     0  2.50K resilvered
            c6t5d0                 ONLINE       0     0     0
            c6t6d0                 ONLINE       0     0     0
            c6t7d0                 ONLINE       0     0     0
            c6t1d0                 ONLINE       0     0     1
        logs
          /dev/zvol/dsk/rpool/log  ONLINE       0     0     0
        cache
          c6t0d0p0                 ONLINE       0     0     0
        spares
          c9d1                     INUSE     currently in use

errors: No known data errors

And this has been going on for a week now, always restarting when it should complete. The questions in my mind atm:

1. How can I determine the cause for each resilver? Is there a log?
2. Why does it resilver the same data over and over, and not just the changed bits?
3. Can I force remove c9d1, as it is no longer needed, but c11t3 can be resilvered instead?

I'm running OpenSolaris 134, but the event originally happened on 111b. I upgraded and tried quiescing snapshots and IO, none of which helped. I've already ordered some new hardware to recreate this entire array as raidz2, among other things, but there's about a week of time when I can run debuggers and traces if instructed to. - Tuomas
Re: [zfs-discuss] Resilver endlessly restarting at completion
Answers below... Tuomas Leikola wrote:

The endless resilver problem still persists on OI b147. Restarts when it should complete. I see no other solution than to copy the data to safety and recreate the array. Any hints would be appreciated as that takes days unless I can stop or pause the resilvering.

[full pool status and message quoted from the original post above; snipped]

1. How can I determine the cause for each resilver? Is there a log?
If you're running OI b147 then you should be able to do the following:

# echo ::zfs_dbgmsg | mdb -k > /var/tmp/dbg.out

Send me the output.

2. Why does it resilver the same data over and over, and not just the changed bits?

If you're having drives fail prior to the initial resilver finishing then it will restart and do all the work over again. Are drives still failing randomly for you?

3. Can I force remove c9d1 as it is no longer needed but c11t3 can be resilvered instead?

You can detach the spare and let the resilver work on only c11t3. Can you send me the output of 'zdb - tank 0'?

Thanks, George
Re: [zfs-discuss] Is there any way to stop a resilver?
Can you post the output of 'zpool status'? Thanks, George

LIC mesh wrote: Most likely an iSCSI timeout, but that was before my time here. Since then, there have been various individual drives lost along the way on the shelves, but never a whole LUN, so, theoretically, /except/ for iSCSI timeouts, there has been no great reason to resilver.

On Wed, Sep 29, 2010 at 11:51 AM, Lin Ling lin.l...@oracle.com wrote: What caused the resilvering to kick off in the first place? Lin

On Sep 29, 2010, at 8:46 AM, LIC mesh wrote: It's always running less than an hour. It usually starts at around a 300,000h estimate (at 1m in), goes up to an estimate in the millions (about 30 mins in) and restarts. Never gets past 0.00% completion, and 0K resilvered on any LUN. 64 LUNs, 32x5.44T, 32x10.88T in 8 vdevs.

On Wed, Sep 29, 2010 at 11:40 AM, Scott Meilicke scott.meili...@craneaerospace.com wrote: Has it been running long? Initially the numbers are *way* off. After a while it settles down into something reasonable. How many disks, and what size, are in your raidz2? -Scott

On 9/29/10 8:36 AM, LIC mesh licm...@gmail.com wrote: Is there any way to stop a resilver? We gotta stop this thing - at minimum, completion time is 300,000 hours, and maximum is in the millions. Raidz2 array, so it has the redundancy, we just need to get data off.
Re: [zfs-discuss] Resilver endlessly restarting at completion
Thanks for taking an interest. Answers below. On Wed, Sep 29, 2010 at 9:01 PM, George Wilson george.r.wil...@oracle.com wrote:

On Mon, Sep 27, 2010 at 1:13 PM, Tuomas Leikola tuomas.leik...@gmail.com wrote: (continuous resilver loop) has been going on for a week now, always restarting when it should complete. The questions in my mind atm:

1. How can I determine the cause for each resilver? Is there a log?

If you're running OI b147 then you should be able to do the following: # echo ::zfs_dbgmsg | mdb -k > /var/tmp/dbg.out Send me the output.

Sending verbose output in a separate email. I'm not very familiar with this but it does show some restarting lines.

2. Why does it resilver the same data over and over, and not just the changed bits?

If you're having drives fail prior to the initial resilver finishing then it will restart and do all the work over again. Are drives still failing randomly for you?

Drives haven't been dropping since the initial incidents. It's run to completion a few times now without (visible) issues with the drives. Then again I think there is some magic to reinsert a device back into the array if there is some intermittent SATA disconnection.

3. Can I force remove c9d1 as it is no longer needed but c11t3 can be resilvered instead?

You can detach the spare and let the resilver work on only c11t3. Can you send me the output of 'zdb - tank 0'?

Detach commands complain there's not enough replicas. Of course I can physically remove the device, at which point a scrub would suffice (the disks must be relatively well up-to-date by now..) Sending zdb output in a separate mail as soon as it completes..
[zfs-discuss] Migrating to an aclmode-less world
Currently I'm still using OpenSolaris b134 and I had used the 'aclmode' property on my file systems. However, the aclmode property has now been dropped: http://arc.opensolaris.org/caselog/PSARC/2010/029/20100126_mark.shellenbaum

I'm wondering what will happen to the ACLs on these files and directories if I upgrade to a newer Solaris version (OpenIndiana b147 perhaps). I'm sharing the file systems using CIFS. I was using very simple ACLs like below for easy inheritance of ACLs, which worked OK for my needs.

# zfs set aclinherit=passthrough tank/home/fred/projects
# zfs set aclmode=passthrough tank/home/fred/projects
# chmod A=\
owner@:rwxpdDaARWcCos:fd-:allow,\
group@:rwxpdDaARWcCos:fd-:allow,\
everyone@:rwxpdDaARWcCos:fd-:deny \
/tank/home/fred/projects
# chown fred:fred /tank/home/fred/projects
# zfs set sharesmb=name=projects tank/home/fred/projects

Cheers, Simon
Re: [zfs-discuss] When Zpool has no space left and no snapshots
On Wed, September 22, 2010 21:25, Aleksandr Levchuk wrote: I ran out of space, and consequently could not rm or truncate files. (It makes sense because it's copy-on-write and any transaction needs to be written to disk. It worked out really well - all I had to do was destroy some snapshots.) If there are no snapshots to destroy, how do you prepare for a situation when a ZFS pool loses its last free byte?

Add some more space somewhere around 90% full, or earlier :-). If you do get stuck, you can add another vdev when full, too. Just remember that you're stuck with whatever you add forever, since there's no way to remove a vdev from a pool.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info
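[Editor's note: one way to prepare in advance, sketched below with made-up dataset names and sizes — a commonly suggested trick, not a quote from this thread: reserve a slice of the pool up front, then release the reservation when you need emergency headroom for deletes.]

```shell
# Hypothetical names/sizes: park ~5% of the pool as emergency headroom.
zfs create -o reservation=10G tank/reserve   # space no other dataset can consume
# ... later, when the pool fills up and rm/truncate start failing:
zfs set reservation=none tank/reserve        # release the headroom so deletes can proceed
```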
[zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side (was: zfs property aclmode gone in 147?)
rb == Ralph Böhme ra...@rsrc.de writes:

rb The Darwin kernel evaluates permissions in a first
rb match paradigm, evaluating the ACL before the mode

well...I think it would be better to AND them together like AFS did. In that case it doesn't make any difference in which order you do it because AND is commutative. The Darwin method you describe means one might remove permissions with chmod but still have access granted under first-match by the ACL. I just tested, and Darwin does indeed work this way. :(

One way to get from NFSv4 to what I want is that you might add EVEN MORE complexity and have ``tagged ACL groups'':

* all the existing ACL tools and NFS/SMB clients targeting the #(null) tag,
* traditional 'chmod' unix permissions targeting the #(unix) tag.
* The evaluation within a tag-group is first-match like now,
* The result of each tag-group is ANDed together for the final evaluation.

When accommodating Darwin ACLs or Windows ACLs or Linux NFSv4 ACLs or translated POSIX ACLs, the result of the imperfect translation can be shoved into a tag-group if it's unclean. The way I would implement the userspace, tools would display all tag groups if given some new argument, but they would always be incapable of editing any tag group except #(null).
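[Editor's note: the AND-combination argument is easy to convince yourself of with a toy calculation. The masks below are made up (read=4, write=2, execute=1); this is arithmetic illustrating commutativity, not real ACL evaluation code.]

```shell
# Toy masks: the mode grants rw- (6), the first-matching ACE grants r-- (4).
mode_mask=6
acl_mask=4
effective=$(( mode_mask & acl_mask ))
echo $effective   # prints 4: only read survives. AND is commutative, so
                  # mode-then-ACL and ACL-then-mode give the same answer.
```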
Another chroot-like tool would swap a given tag-group for #(null) for all child processes:

car...@awabagal:~/bar$ ls -v\# foo
-rw-r--r-- 1 carton carton 0 Sep 29 18:31 foo
   0#(unix):owner@:execute:deny
   1#(unix):owner@:read_data/write_data/append_data/write_xattr/write_attributes/write_acl/write_owner:allow
   2#(unix):group@:write_data/append_data/execute:deny
   3#(unix):group@:read_data:allow
   4#(unix):everyone@:write_data/append_data/write_xattr/execute/write_attributes/write_acl/write_owner:deny
   5#(unix):everyone@:read_data/read_xattr/read_attributes/read_acl/synchronize:allow
car...@awabagal:~/bar$ chmod A+owner@:write_data:deny foo
car...@awabagal:~/bar$ ls -v\# foo
-rw-r--r-- 1 carton carton 0 Sep 29 18:31 foo
   0#(null):owner@:write_data:deny
   # 0#(unix):owner@:execute:deny
   1#(unix):owner@:read_data/write_data/append_data/write_xattr/write_attributes/write_acl/write_owner:allow
   2#(unix):group@:write_data/append_data/execute:deny
   3#(unix):group@:read_data:allow
   4#(unix):everyone@:write_data/append_data/write_xattr/execute/write_attributes/write_acl/write_owner:deny
   5#(unix):everyone@:read_data/read_xattr/read_attributes/read_acl/synchronize:allow
car...@awabagal:~/bar$ echo lala > foo
-bash: foo: Permission denied
car...@awabagal:~/bar$ chpacl baz ls -v\# foo
-rw-r--r-- 1 carton carton 0 Sep 29 18:31 foo
   # 0#root:owner@:write_data:deny      -- #root is what's mapped to #(null) at boot
   # 0#(unix):owner@:execute:deny
   1#(unix):owner@:read_data/write_data/append_data/write_xattr/write_attributes/write_acl/write_owner:allow
   2#(unix):group@:write_data/append_data/execute:deny
   3#(unix):group@:read_data:allow
   4#(unix):everyone@:write_data/append_data/write_xattr/execute/write_attributes/write_acl/write_owner:deny
   5#(unix):everyone@:read_data/read_xattr/read_attributes/read_acl/synchronize:allow
car...@awabagal:~/bar$ chpacl '(null)' true
chpacl: '(null)' is reserved.
car...@awabagal:~/bar$ chpacl baz chmod A+owner@:read_data:deny foo
car...@awabagal:~/bar$ chpacl baz ls -v\# foo
-rw-r--r-- 1 carton carton 0 Sep 29 18:31 foo
   0#(null):owner@:read_data:deny
   # 0#root:owner@:write_data:deny
   # 0#(unix):owner@:execute:deny
   1#(unix):owner@:read_data/write_data/append_data/write_xattr/write_attributes/write_acl/write_owner:allow
   2#(unix):group@:write_data/append_data/execute:deny
   3#(unix):group@:read_data:allow
   4#(unix):everyone@:write_data/append_data/write_xattr/execute/write_attributes/write_acl/write_owner:deny
   5#(unix):everyone@:read_data/read_xattr/read_attributes/read_acl/synchronize:allow
car...@awabagal:~/bar$ cat foo
-bash: foo: Permission denied
car...@awabagal:~/bar$ chpacl baz cat foo      -- current tagspace is irrelevant to ACL evaluation
-bash: foo: Permission denied
car...@awabagal:~/bar$ ls -v\# foo
-rw-r--r-- 1 carton carton 0 Sep 29 18:31 foo
   0#(null):owner@:write_data:deny
   # 0#baz:owner@:read_data:deny
   # 0#(unix):owner@:execute:deny
   1#(unix):owner@:read_data/write_data/append_data/write_xattr/write_attributes/write_acl/write_owner:allow
   2#(unix):group@:write_data/append_data/execute:deny
   3#(unix):group@:read_data:allow
   4#(unix):everyone@:write_data/append_data/write_xattr/execute/write_attributes/write_acl/write_owner:deny
   5#(unix):everyone@:read_data/read_xattr/read_attributes/read_acl/synchronize:allow
Re: [zfs-discuss] When Zpool has no space left and no snapshots
You can truncate a file:

# echo > bigfile

That will free up space without the 'rm'.

-Original Message- From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of David Dyer-Bennet Sent: Wednesday, September 29, 2010 12:59 PM To: zfs-discuss@opensolaris.org Subject: Re: [zfs-discuss] When Zpool has no space left and no snapshots

[David's message quoted in full; snipped - see above]
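[Editor's note: the truncation trick relies on shell redirection over an existing file. A minimal, self-contained sketch with a throwaway file name (made up for illustration):]

```shell
# Create a dummy "bigfile", then truncate it in place -- no rm needed.
dd if=/dev/zero of=bigfile bs=1024 count=16 2>/dev/null
echo > bigfile            # the trick from the post; leaves a one-byte newline
: > bigfile               # POSIX idiom that leaves the file truly empty
wc -c < bigfile           # prints 0
rm -f bigfile             # cleanup of the demo file
```

Note David's follow-up caveat below: on ZFS the new (empty) version is written copy-on-write before the old blocks are freed, and blocks held by snapshots are not freed at all.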
Re: [zfs-discuss] When Zpool has no space left and no snapshots
On Wed, September 29, 2010 15:17, Matt Cowger wrote: You can truncate a file: echo > bigfile That will free up space without the 'rm'

Copy-on-write; the new version gets written to the disk before the old version is released, it doesn't just overwrite. AND, if it's in any snapshots, the old version doesn't get released.

-- David Dyer-Bennet, d...@dd-b.net; http://dd-b.net/ Snapshots: http://dd-b.net/dd-b/SnapshotAlbum/data/ Photos: http://dd-b.net/photography/gallery/ Dragaera: http://dragaera.info
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side (was: zfs property aclmode gone in 147?)
Keep in mind that Windows lacks a mode_t. We need to interop with Windows. If a Windows user cannot completely change file perms because there's a mode_t completely out of their reach... they'll be frustrated. Thus an ACL-and-mode model where both are applied doesn't work. It'd be nice, but it won't work. The mode has to be entirely encoded by the ACL. But we can't resort to interesting encoding tricks as Windows users won't understand them. Nico --
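[Editor's note: a toy sketch of what "the mode has to be entirely encoded by the ACL" means. The digit-to-permission table below is deliberately simplified and hypothetical; the real Solaris translation also emits deny ACEs and many more permission bits.]

```shell
# Map each octal digit of a three-digit mode (e.g. 644) to an allow ACE.
mode=644
to_perms() {
  case $1 in
    7) echo read_data/write_data/execute ;;
    6) echo read_data/write_data ;;
    4) echo read_data ;;
    *) echo none ;;
  esac
}
echo "owner@:$(to_perms ${mode%??}):allow"                    # owner@:read_data/write_data:allow
echo "group@:$(to_perms $(d=${mode#?}; echo ${d%?})):allow"   # group@:read_data:allow
echo "everyone@:$(to_perms ${mode#??}):allow"                 # everyone@:read_data:allow
```

With an encoding like this, a Windows client sees (and can rewrite) everything that determines access, which is the interop property Nico is after.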
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side (was: zfs property aclmode gone in 147?)
Keep in mind that Windows lacks a mode_t. We need to interop with Windows.

Oh my, I see. Another itch to scratch. Now at least Windows users are happy while me and maybe others are not. -r
[zfs-discuss] Resilver making the system unresponsive
This must be resilver day :)

I just had a drive failure. The hot spare kicked in, and access to the pool over NFS was effectively zero for about 45 minutes. Currently the pool is still resilvering, but for some reason I can access the file system now.

Resilver speed has been beaten to death I know, but is there a way to avoid this? For example, is more enterprisey hardware less susceptible to resilvers? This box is used for development VMs, but there is no way I would consider this for production with this kind of performance hit during a resilver.

My hardware:
Dell 2950
16G ram
16 disk SAS chassis
LSI 3801 (I think) SAS card (1068e chip)
Intel x25-e SLOG off of the internal PERC 5/i RAID controller
Seagate 750G disks (7200.11)

I am running Nexenta CE 3.0.3 (SunOS rawhide 5.11 NexentaOS_134f i86pc i386 i86pc Solaris)

  pool: data01
 state: DEGRADED
status: One or more devices is currently being resilvered. The pool will
        continue to function, possibly in a degraded state.
action: Wait for the resilver to complete.
  scan: resilver in progress since Wed Sep 29 14:03:52 2010
        1.12T scanned out of 5.00T at 311M/s, 3h37m to go
        82.0G resilvered, 22.42% done
config:

        NAME           STATE     READ WRITE CKSUM
        data01         DEGRADED     0     0     0
          raidz2-0     ONLINE       0     0     0
            c1t8d0     ONLINE       0     0     0
            c1t9d0     ONLINE       0     0     0
            c1t10d0    ONLINE       0     0     0
            c1t11d0    ONLINE       0     0     0
            c1t12d0    ONLINE       0     0     0
            c1t13d0    ONLINE       0     0     0
            c1t14d0    ONLINE       0     0     0
          raidz2-1     DEGRADED     0     0     0
            c1t22d0    ONLINE       0     0     0
            c1t15d0    ONLINE       0     0     0
            c1t16d0    ONLINE       0     0     0
            c1t17d0    ONLINE       0     0     0
            c1t23d0    ONLINE       0     0     0
            spare-5    REMOVED      0     0     0
              c1t20d0  REMOVED      0     0     0
              c8t18d0  ONLINE       0     0     0  (resilvering)
            c1t21d0    ONLINE       0     0     0
        logs
          c0t1d0       ONLINE       0     0     0
        spares
          c8t18d0      INUSE     currently in use

errors: No known data errors

Thanks for any insights.
-Scott
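[Editor's note: a back-of-the-envelope sanity check of zpool's ETA, using the figures from the status above and assuming binary units throughout.]

```shell
# 5.00T total, 1.12T already scanned, scanning at 311M/s:
remaining_mb=$(( (5120 - 1147) * 1024 ))   # ~3.88T left, in MiB (5120G - 1147G)
secs=$(( remaining_mb / 311 ))
echo "$(( secs / 3600 ))h$(( secs % 3600 / 60 ))m"   # prints 3h38m, matching zpool's "3h37m to go"
```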
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side (was: zfs property aclmode gone in 147?)
On Wed, Sep 29, 2010 at 03:09:22PM -0700, Ralph Böhme wrote: Keep in mind that Windows lacks a mode_t. We need to interop with Windows. Oh my, I see. Another itch to scratch. Now at least Windows users are happy while me and maybe others are not.

Yes. Pardon me for forgetting to mention this earlier. There's so many wrinkles here... But this is one of the biggers; I should not have forgotten it. Nico --
Re: [zfs-discuss] tagged ACL groups: let's just keep digging until we come out the other side (was: zfs property aclmode gone in 147?)
On Wed, Sep 29, 2010 at 05:21:51PM -0500, Nicolas Williams wrote: Yes. Pardon me for forgetting to mention this earlier. There's so many wrinkles here... But this is one of the biggers; I should not have

s/biggers/biggest/

forgotten it. Nico --
Re: [zfs-discuss] Resilver making the system unresponsive
I should add I have 477 snapshots across all file systems. Most of them are hourly snaps (225 of them anyway).

On Sep 29, 2010, at 3:16 PM, Scott Meilicke wrote:
[original message quoted in full; snipped - see above]

Scott Meilicke
Re: [zfs-discuss] Resilver making the system unresponsive
Yeah, I'm having a combination of this and the resilver constantly restarting issue. And nothing to free up space.

It was recommended to me to replace any expanders I had between the HBA and the drives with extra HBAs, but my array doesn't have expanders. If yours does, you may want to try that. Otherwise, wait it out :(

On Wed, Sep 29, 2010 at 6:37 PM, Scott Meilicke sc...@kmclan.net wrote:
[Scott's message, and the original it quoted, snipped - see above]