Re: [zfs-discuss] zpool replace not concluding + duplicate drive label

2011-10-27 Thread Andrew Freedman
On 28/10/2011, at 3:06 PM, Daniel Carosone wrote:

> On Thu, Oct 27, 2011 at 10:49:22AM +1100, afree...@mac.com wrote:
>> Hi all,
>> 
>> I'm seeing some puzzling behaviour with my RAID-Z.
>> 
> 
> Indeed.  Start with zdb -l on each of the disks to look at the labels in more 
> detail.
> 
> --
> Dan.

I'm reluctant to include a monstrous wall of text, so I've placed the output at 
http://dl.dropbox.com/u/19420697/zdb.out.

Immediately I'm struck by the sad dearth of information on da6, the similarity 
of the da0 + da0/old subtree to the zpool status output, and my total lack 
of knowledge of how to use this data in any beneficial fashion.


Re: [zfs-discuss] zpool replace not concluding + duplicate drive label

2011-10-27 Thread Daniel Carosone
On Thu, Oct 27, 2011 at 10:49:22AM +1100, afree...@mac.com wrote:
> Hi all,
> 
> I'm seeing some puzzling behaviour with my RAID-Z.
> 

Indeed.  Start with zdb -l on each of the disks to look at the labels in more 
detail.
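
For example, something along these lines (the da* names are placeholders;
substitute whatever members your raidz actually has):

  # dump the four vdev labels from each member; healthy members should all
  # agree on the pool GUID, the vdev tree and the label txg
  for d in da0 da5 da6; do
    echo "== ${d} =="
    zdb -l /dev/${d}
  done

A member whose labels are missing, stale, or inconsistent with the others is
the one to look at more closely.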

--
Dan.



Re: [zfs-discuss] zpool replace

2011-08-15 Thread Mark J Musante


Hi Doug,

The "vms" pool was created in a non-redundant way, so there is no way to 
get the data off of it unless you can put back the original c0t3d0 disk.


If you can still plug in the disk, you can always do a zpool replace on it 
afterwards.


If not, you'll need to restore from backup, preferably to a pool with 
raidz or mirroring so zfs can repair faults automatically.
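
A rough sketch of the first path, assuming the original c0t3d0 comes back 
readable (the xvm import/destroy is only there to free up c0t5d0 for reuse):

  # with the original disk reattached, clear the suspended pool
  zpool clear vms

  # release c0t5d0 from the old "xvm" pool so it can be reused
  zpool import -f xvm
  zpool destroy xvm

  # then swap the flaky disk out for the spare
  zpool replace vms c0t3d0 c0t5d0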



On Mon, 15 Aug 2011, Doug Schwabauer wrote:


Help - I've got a bad disk in a zpool and need to replace it.  I've got an 
extra drive that's not being used, although it's still marked as if it belongs 
to a pool. 
So I need to get the "xvm" pool destroyed, c0t5d0 marked as available, and 
replace c0t3d0 with c0t5d0.

root@kc-x4450a # zpool status -xv
  pool: vms
 state: UNAVAIL
status: One or more devices are faulted in response to IO failures.
action: Make sure the affected devices are connected, then run 'zpool clear'.
   see: http://www.sun.com/msg/ZFS-8000-HC
 scrub: none requested
config:

    NAME        STATE     READ WRITE CKSUM
    vms         UNAVAIL      0     3     0  insufficient replicas
      c0t2d0    ONLINE       0     0     0
      c0t3d0    UNAVAIL      0     6     0  experienced I/O failures
      c0t4d0    ONLINE       0     0     0

errors: Permanent errors have been detected in the following files:

    vms:<0x5>
    vms:<0xb>
root@kc-x4450a # zpool replace -f vms c0t3d0 c0t5d0
cannot replace c0t3d0 with c0t5d0: pool I/O is currently suspended
root@kc-x4450a # zpool import
  pool: xvm
    id: 14176680653869308477
 state: DEGRADED
status: The pool was last accessed by another system.
action: The pool can be imported despite missing or damaged devices.  The
    fault tolerance of the pool may be compromised if imported.
   see: http://www.sun.com/msg/ZFS-8000-EY
config:

    xvm           DEGRADED
      mirror-0    DEGRADED
        c0t4d0    FAULTED  corrupted data
        c0t5d0    ONLINE

Thanks!

-Doug




Regards,
markm


Re: [zfs-discuss] zpool replace lockup / replace process now stalled, how to fix?

2010-05-21 Thread Michael Donaghy
For the record, in case anyone else experiences this behaviour: I tried 
various things which failed, and finally, as a last-ditch effort, upgraded my 
FreeBSD, giving me zpool v14 rather than v13 - and now it's resilvering as it 
should.
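
For anyone hitting the same thing, the version check is straightforward (a 
rough sketch; whether you also bump the on-disk pool version is a separate 
decision, since older kernels can't import a newer-format pool):

  # show which pool versions this kernel supports, and which pools lag behind
  zpool upgrade -v
  zpool upgrade

  # optionally move a pool up to the newest version the kernel supports
  zpool upgrade tank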

Michael

On Monday 17 May 2010 09:26:23 Michael Donaghy wrote:
> Hi,
> 
> I recently moved to a FreeBSD/ZFS system for the sake of data integrity,
>  after losing my data on Linux. I've now had my first hard disk failure;
>  the BIOS refused to even boot with the failed drive (ad18) connected, so I
>  removed it. I have another drive, ad16, which had enough space to replace
>  the failed one, so I partitioned it and attempted to use "zpool replace"
>  to replace the failed partitions with new ones, i.e. "zpool replace tank
>  ad18s1d ad16s4d". This seemed to simply hang, with no processor or disk
>  use; any "zpool status" commands also hung. Eventually I attempted to
>  reboot the system, which also eventually hung; after waiting a while,
>  having no other option, rightly or wrongly, I hard-rebooted. Exactly the
>  same behaviour happened with the other zpool replace.
> 
> Now, my zpool status looks like:
> arcueid ~ $ zpool status
>   pool: tank
>  state: DEGRADED
>  scrub: none requested
> config:
> 
> NAME             STATE     READ WRITE CKSUM
> tank             DEGRADED     0     0     0
>   raidz2         DEGRADED     0     0     0
>     ad4s1d       ONLINE       0     0     0
>     ad6s1d       ONLINE       0     0     0
>     ad9s1d       ONLINE       0     0     0
>     ad17s1d      ONLINE       0     0     0
>     replacing    DEGRADED     0     0     0
>       ad18s1d    UNAVAIL      0 9.62K     0  cannot open
>       ad16s4d    ONLINE       0     0     0
>     ad20s1d      ONLINE       0     0     0
>   raidz2         DEGRADED     0     0     0
>     ad4s1e       ONLINE       0     0     0
>     ad6s1e       ONLINE       0     0     0
>     ad17s1e      ONLINE       0     0     0
>     replacing    DEGRADED     0     0     0
>       ad18s1e    UNAVAIL      0 11.2K     0  cannot open
>       ad16s4e    ONLINE       0     0     0
>     ad20s1e      ONLINE       0     0     0
> 
> errors: No known data errors
> 
> It looks like the replace has taken in some sense, but ZFS doesn't seem to
>  be resilvering as it should. Attempting to zpool offline doesn't work:
>  arcueid ~ # zpool offline tank ad18s1d
> cannot offline ad18s1d: no valid replicas
> Attempting to scrub causes a similar hang to before. Data is still readable
> (from the zvol which is the only thing actually on this filesystem),
>  although slowly.
> 
> What should I do to recover this / trigger a proper replace of the failed
> partitions?
> 
> Many thanks,
> Michael


Re: [zfs-discuss] zpool replace leaves pool degraded after resilvering

2009-07-09 Thread William Bauer
2009.06 is v111b, but you're running v111a.  I don't know, but perhaps the a->b 
transition addressed this issue, among others?


Re: [zfs-discuss] zpool replace leaves pool degraded after resilvering

2009-07-09 Thread Maurilio Longo
I forgot to mention this is a 

SunOS biscotto 5.11 snv_111a i86pc i386 i86pc

version.

Maurilio.


Re: [zfs-discuss] zpool replace - choke point

2008-12-05 Thread Marion Hakanson
[EMAIL PROTECTED] said:
> Thanks for the tips.  I'm not sure if they will be relevant, though.  We
> don't talk directly with the AMS1000.  We are using a USP-VM to virtualize
> all of our storage and we didn't have to add anything to the drv
> configuration files to see the new disk (mpxio was already turned on).  We
> are using the Sun drivers and mpxio and we didn't require any tinkering to
> see the new LUNs.

Yes, the fact that the USP-VM was recognized automatically by Solaris drivers
is a good sign.  I suggest that you check to see what queue-depth and disksort
values you ended up with from the automatic settings:

  echo "*ssd_state::walk softstate |::print -t struct sd_lun un_throttle" \
   | mdb -k

The "ssd_state" would be "sd_state" on an x86 machine (Solaris-10).
The "un_throttle" above will show the current max_throttle (queue depth);
replace it with "un_min_throttle" to see the min, and "un_f_disksort_disabled"
to see the current queue-sort setting.
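
Spelled out (same caveat as above: use "sd_state" instead of "ssd_state" on
an x86 box):

  # minimum throttle currently in effect
  echo "*ssd_state::walk softstate |::print -t struct sd_lun un_min_throttle" \
   | mdb -k

  # whether the queue-sorting (disksort) behaviour has been disabled
  echo "*ssd_state::walk softstate |::print -t struct sd_lun un_f_disksort_disabled" \
   | mdb -k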

The HDS docs for the 9500 series suggested 32 as the max_throttle to use, and
the default setting (Solaris-10) was 256 (hopefully with the USP-VM you get
something more reasonable).  And while 32 did work for us, i.e. no operations
were ever lost as far as I could tell, the array back-end (the drives
themselves and the internal SATA shelf connections) has an actual queue
depth of four for each array controller.  The AMS1000 has the same limitation
for SATA shelves, according to our HDS engineer.

In short, Solaris, especially with ZFS, functions much better if it does
not try to send more FC operations to the array than the actual physical
devices can handle.  We were actually seeing NFS client operations hang
for minutes at a time when the SAN-hosted NFS server was making its ZFS
devices busy -- and this was true even if clients were using different
devices than the busy ones.  We do not see these hangs after making the
described changes, and I believe this is because the OS is no longer waiting
around for a response from devices that aren't going to respond in a
reasonable amount of time.

Yes, having the USP between the host and the AMS1000 will affect things;
there's probably some huge cache in there somewhere.  But unless you've
got hundreds of GB of cache in there, at some point a resilver operation
is going to end up running at the speed of the actual back-end device.

Regards,

Marion




Re: [zfs-discuss] zpool replace - choke point

2008-12-04 Thread Alan Rubin
Thanks for the tips.  I'm not sure if they will be relevant, though.  We don't 
talk directly with the AMS1000.  We are using a USP-VM to virtualize all of our 
storage and we didn't have to add anything to the drv configuration files to 
see the new disk (mpxio was already turned on).  We are using the Sun drivers 
and mpxio and we didn't require any tinkering to see the new LUNs.


Re: [zfs-discuss] zpool replace - choke point

2008-12-04 Thread Marion Hakanson
[EMAIL PROTECTED] said:
> I think we found the choke point.  The silver lining is that it isn't the
> T2000 or ZFS.  We think it is the new SAN, an Hitachi AMS1000, which has
> 7200RPM SATA disks with the cache turned off.  This system has a very small
> cache, and when we did turn it on for one of the replacement LUNs we saw a
> 10x improvement - until the cache filled up about 1 minute later (was using
> zpool iostat).  Oh well. 

We have experience with a T2000 connected to the HDS 9520V, predecessor
to the AMS arrays, with SATA drives, and it's likely that your AMS1000 SATA
has similar characteristics.  I didn't see if you're using Sun's drivers to
talk to the SAN/array, but we are using Solaris-10 (and Sun drivers + MPXIO),
and since the Hitachi storage isn't automatically recognized (sd/ssd,
scsi_vhci), it took a fair amount of tinkering to get parameters adjusted
to work well with the HDS storage.

The combination that has given us best results with ZFS is:
 (a) Tell the array to ignore SYNCHRONIZE_CACHE requests from the host.
 (b) Balance drives within each AMS disk shelf across both array controllers.
 (c) Set the host's max queue depth to 4 for the SATA LUNs (sd/ssd driver).
 (d) Set the host's disable_disksort flag (sd/ssd driver) for HDS LUNs.

Here's the reference we used for setting the parameters in Solaris-10:
  http://wikis.sun.com/display/StorageDev/Parameter+Configuration
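
As a rough illustration of (c): the blunt approach is a single /etc/system
tunable, which caps the queue depth for every LUN behind the ssd driver
(sd on x86), so only use it if that's acceptable.  Per-LUN tuning, and the
disksort flag in (d), is better done per vendor/product following the
reference above for your Solaris 10 update.

  * /etc/system -- cap the FC queue depth for all ssd-driven LUNs
  set ssd:ssd_max_throttle=4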

Note that the AMS uses read-after-write verification on SATA drives,
so you only have half the IOPS for writes that the drives are capable
of handling.  We've found that small RAID volumes (e.g. a two-drive
mirror) are unbelievably slow, so you'd want to go toward having more
drives per RAID group, if possible.

Honestly, if I recall correctly what I saw in your "iostat" listings
earlier, your situation is not nearly as "bad" as with our older array.
You don't seem to be driving those HDS LUNs to the extreme busy states
that we have seen on our 9520V.  It was not unusual for us to see LUNs
at 100% busy, 100% wait, with 35 ops total in the "actv" and "wait" columns,
and I don't recall seeing any 100%-busy devices in your logs.

But getting the FC queue-depth (max-throttle) setting to match what the
array's back-end I/O can handle greatly reduced the long "zpool status"
and other I/O-related hangs that we were experiencing.  And disabling
the host-side FC queue-sorting greatly improved the overall latency of
the system when busy.  Maybe it'll help yours too.

Regards,

Marion




Re: [zfs-discuss] zpool replace - choke point

2008-12-03 Thread Alan Rubin
I think we found the choke point.  The silver lining is that it isn't the T2000 
or ZFS.  We think it is the new SAN, an Hitachi AMS1000, which has 7200RPM SATA 
disks with the cache turned off.  This system has a very small cache, and when 
we did turn it on for one of the replacement LUNs we saw a 10x improvement - 
until the cache filled up about 1 minute later (was using zpool iostat).  Oh 
well.


Re: [zfs-discuss] zpool replace - choke point

2008-12-02 Thread Alan Rubin
It's something we've considered here as well.


Re: [zfs-discuss] zpool replace - choke point

2008-12-02 Thread Matt Walburn
Would any of this have to do with the system being a T2000? Would ZFS
resilvering be affected by single-threadedness, the slowish UltraSPARC T1
clock speed, or the lack of strong FPU performance?

On 12/1/08, Alan Rubin <[EMAIL PROTECTED]> wrote:
> We will be considering it in the new year,  but that will not happen in time
> to affect our current SAN migration.


-- 
--
Matt Walburn
http://mattwalburn.com


Re: [zfs-discuss] zpool replace - choke point

2008-12-01 Thread Alan Rubin
We will be considering it in the new year,  but that will not happen in time to 
affect our current SAN migration.


Re: [zfs-discuss] zpool replace - choke point

2008-12-01 Thread Blake
Have you considered moving to 10/08?  ZFS resilver performance is
much improved in this release, and I suspect that code might help you.

You can easily test upgrading with Live Upgrade.  I did the transition
using LU and was very happy with the results.

For example, I added a disk to a mirror and resilvering the new disk
took about 6 min for almost 300GB, IIRC.
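
The LU sequence is roughly this (the BE name and media path are made up;
adjust them for your install media):

  # clone the running boot environment
  lucreate -n s10u6

  # upgrade the clone from the 10/08 media, activate it, and reboot into it
  luupgrade -u -n s10u6 -s /mnt/sol10u6-media
  luactivate s10u6
  init 6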

Blake



On Mon, Dec 1, 2008 at 11:04 PM, Alan Rubin <[EMAIL PROTECTED]> wrote:
> I had posted at the Sun forums, but it was recommended to me to try here as 
> well.  For reference, please see 
> http://forums.sun.com/thread.jspa?threadID=5351916&tstart=0.
>
> In the process of a large SAN migration project we are moving many large 
> volumes from the old SAN to the new. We are making use of the 'replace' 
> function to replace the old volumes with similar or larger new volumes. This 
> process is moving very slowly, sometimes as slow as one percent of the 
> data every 10 minutes. Is there any way to streamline this 
> method? The system is Solaris 10 08/07. How much is dependent on the activity 
> of the box? How about on the architecture of the box? The primary system in 
> question at this point is a T2000 with 8GB of RAM and a 4-core CPU. This 
> server has 6 4Gb fibre channel connections to our SAN environment. At times 
> this server is quite busy because it is our backup server, but performance 
> seems no better when backup operations have ceased their daily activities.
>
> Our pools are only stripes. Would we expect better performance from a mirror 
> or raidz pool? It is worrisome that, if the environment were compromised by a 
> failed disk, it could take so long to replace it and restore the usual 
> redundancy (if it were a mirror or raidz pool).
>
> I have previously applied the kernel change described here: 
> http://blogs.digitar.com/jjww/?itemid=52
>
> I just moved a 1TB volume which took approx. 27h.


Re: [zfs-discuss] zpool replace not working

2008-07-28 Thread Breandan Dezendorf
Marc,
Thanks - you were right - I had two identical drives and I mixed them  
up.  It's going through the resilver process now...  I expect it will  
run all night.

Breandan

On Jul 27, 2008, at 11:20 PM, Marc Bevand wrote:

> It looks like you *think* you are trying to add the new drive, when you are
> in fact re-adding the old (failing) one. A new drive should never show up as
> ONLINE in a pool with no action on your part, if only because it contains no
> partition and no vdev label with the right pool GUID.
>
> If I am right, try to add the other drive.
>
> If I am wrong, you somehow managed to confuse ZFS. You can prevent ZFS from
> thinking c2d1 is already part of the pool by deleting the partition table on
> it:
>  $ dd if=/dev/zero of=/dev/rdsk/c2d1p0 bs=512 count=1
>  $ zpool import
>  (it should show you the pool is now ready to be imported)
>  $ zpool import tank
>  $ zpool replace tank c2d1
>
> At this point it should be resilvering...
>
> -marc
>



Re: [zfs-discuss] zpool replace not working

2008-07-27 Thread Marc Bevand
It looks like you *think* you are trying to add the new drive, when you are in 
fact re-adding the old (failing) one. A new drive should never show up as 
ONLINE in a pool with no action on your part, if only because it contains no 
partition and no vdev label with the right pool GUID.

If I am right, try to add the other drive.

If I am wrong, you somehow managed to confuse ZFS. You can prevent ZFS from 
thinking c2d1 is already part of the pool by deleting the partition table on 
it:
  $ dd if=/dev/zero of=/dev/rdsk/c2d1p0 bs=512 count=1
  $ zpool import
  (it should show you the pool is now ready to be imported)
  $ zpool import tank
  $ zpool replace tank c2d1

At this point it should be resilvering...

-marc
