Re: [zfs-discuss] zpool cannot replace a replacing device

2008-12-07 Thread Brian Couper
hi,

--- replacing UNAVAIL 0 543 0 insufficient replicas
-- 17096229131581286394 FAULTED 0 581 0 was /dev/dsk/c0t2d0s0/old
-- 11342560969745958696 FAULTED 0 582 0 was /dev/dsk/c0t2d0s0

Looking at that, I don't think you have fixed the original fault. It's still
getting write errors; that's why the resilvering has stopped, I reckon.

Are there any spare drive connections on the system? Could you free one up,
so you can plug the drive into a different connection?
You will need to resolve the hardware error: is it the drive, the cable, or the
hard drive controller?
Remember, a hard drive's best trick is to act alive and well when it's really at
death's door.
One of ZFS's best features is its ability to sniff out hardware faults.

To restart the resilver, do a zpool clear and a zpool online. This will force the
zpool and the hard drive online and the resilver will start. Do a zpool status
-v to monitor the process, and watch the error count on the drive. Don't do
this until you really think you have the error fixed.
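
The sequence above, as a sketch; "tank" and c0t2d0 are placeholders for your pool and the faulted disk, not names from the original post:

```shell
# Reset the pool's error counters, then force the drive back online.
zpool clear tank
zpool online tank c0t2d0

# Watch the resilver and keep an eye on the drive's error counts.
zpool status -v tank
```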

How is your backup situation? Get your critical data off the zpool before
attempting to repair it or change anything.

What I would do is get a new drive, connect it to a different hard drive
connection, and use a new cable. Remove the old drive and unplug it. I would not
try to replace the faulty drive while it is still connected; things are just going
to get confusing.

Your zpool status will then show the drive as missing; zpool replace it with the
new drive.
Your zpool will be fixed in a few hours. Your zpool may report errors across
other drives; as long as the counts stay low (around 50 or fewer), just use zpool
clear. Your hardware fault may have been causing trouble for ages without you
knowing.
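
Spelled out as commands, assuming the new disk shows up as c3t0d0 (a made-up name for illustration) and the pool is called "tank":

```shell
# With the old drive physically removed, it should show as UNAVAIL/removed.
zpool status tank

# Resilver onto the new disk in the old one's place.
zpool replace tank c0t2d0 c3t0d0

# Monitor progress; expect a few hours on large disks.
zpool status -v tank
```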

Am an amateur ZFS er  so use my advice with caution. 

Brian,
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zpool cannot replace a replacing device

2008-12-07 Thread Brian Couper
zpool replace data 11342560969745958696 c0t2d0 might replace the drive, BUT
you will have to sort out the hardware error first.
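
That suggestion as a runnable-looking sketch: when a device appears in zpool status only as a numeric GUID, you can name it by that GUID.

```shell
# Replace the GUID-only vdev with the physical device c0t2d0.
zpool replace data 11342560969745958696 c0t2d0

# Confirm the resilver has started and watch the error counts.
zpool status -v data
```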

For now, forget about ZFS and what it says about the zpool status. Concentrate
on fixing the hardware error. Use the manufacturer's drive-check boot CD to check
the drive again. I know you checked it once before, but my money is on the hard
drive being faulty; I reckon you will get errors on the drive if you check it
again.

If it passes without any errors, and without wiping the drive, try zpool clear
and zpool online. It may not get any more write errors.

Is the drive showing up in the format command?

Remember, this small error has all the signs of going pear-shaped on you, so 
back up your data now while you can still read it!


Re: [zfs-discuss] zpool cannot replace a replacing device

2008-12-07 Thread Brian Couper
I'm at the limit of my knowledge now.

Google "man zpool".

UNAVAIL is coming up because the zpool was imported with the drive missing.
Try exporting the pool, rebooting then importing it with the drive connected.

UNAVAIL
The device could not be opened. If a pool is imported when a device was 
unavailable, then the device will be identified by a unique identifier instead 
of its path since the path was never correct in the first place.

Have a read of zpool attach (zpool attach [-f] pool device new_device); it
might work.

You could also try adding the drive as a hot spare.
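
The suggestions above as a sketch; "data" and the device names are placeholders, and these are alternatives to try one at a time, not all together:

```shell
# Export, reboot with the drive connected, then re-import.
# Device paths get re-resolved at import time.
zpool export data
zpool import data

# Attach a new disk as a mirror of an existing device.
zpool attach -f data c0t1d0 c0t2d0

# Or add the drive as a hot spare instead.
zpool add data spare c0t2d0
```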

That's all the help I can give, sorry; I don't know how to change or edit the
internals of ZFS.


Re: [zfs-discuss] Problem importing degraded Pool

2008-12-02 Thread Brian Couper
Hi,

Attach both original drives to the system; the faulty one may only have had a
few checksum errors.

zpool status -v should hopefully show your data pool, provided you have not
started to replace the faulty drive yet. If it doesn't see the pool, zpool
export then zpool import and hope for the best.

If you get back to the original failed state, with your pool degraded but
readable, it can be easily fixed, most of the time.

Do a zpool status -v (mind the -v).

What is it saying about your pool? I suspect the faulty drive has checksum
errors and has been offlined.

Power down the system and add the spare 3rd drive, so you have all
3 drives connected. DO NOT MOVE the original drives to different connections in
the system; that is just going to cause more trouble.

While you're inside the system, check all the connections to the hard drives.

Power up the system.

Look up the ZFS commands. Read and understand what you're about to do.

You need to force the failed drive online:
#zpool online pool device
Do a zpool clear to clear the error log on the faulty pool:
#zpool clear pool

Now you have 2 choices here: back up your critical data to the new 3rd drive, or
replace the faulty drive.

zpool replace [-f] pool device [new_device]
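
The whole sequence above as one sketch; "data" and the device names are placeholders for your pool and disks, not names from the original post:

```shell
# Force the failed drive back online, then wipe the pool's error log.
zpool online data c0t1d0
zpool clear data

# Resilver onto the new 3rd drive.
zpool replace data c0t1d0 c0t3d0

# Monitor; repeat the online/clear steps if the drive faults again.
zpool status -v data
```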

Now, ZFS is almost certainly going to complain like hell about the faulty pool
during the copy/replace operation.

To be blunt, your data is either readable or it's not. Run zpool clear and force
the faulty drive online every time it gets put offline; this may be several
times!
ZFS will tell you exactly what files have been lost, if any. The process could
take several hours. Do a zpool scrub once it's finished, then back up your
data.

Use zpool status -v to monitor progress.

If you don't get a lot of errors from the faulty drive, you could try a low-level
format to fix the drive. After you have got the data off it ;)

One final word: a striped zpool with copies=2 is about as much use as a
chocolate fireguard when it comes to protecting data. Use 3+ drives and raidz;
it's far better.
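
What that raidz suggestion looks like in practice; the pool name and device names are made up for illustration:

```shell
# Create a pool with one raidz vdev across three disks.
# Unlike a stripe with copies=2, this survives the loss of any one disk.
zpool create tank raidz c0t1d0 c0t2d0 c0t3d0
zpool status tank
```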

I'm no expert; I've been using ZFS for 7 months. When I first started using it,
ZFS found 4 faulty drives in my setup, and other operating systems said they were
good drives! So I have used ZFS to its full recovery potential!

Brian,


Re: [zfs-discuss] Problem importing degraded Pool

2008-12-02 Thread Brian Couper
Hi,

Hmm, I think it's safe to say your zpool and its data are gone forever.
Use the Samsung disk-checker boot CD and see if it can fix your faulty disk.
Then connect all 3 drives to your system and use raidz; your data will then be
well protected.

Brian,


Re: [zfs-discuss] zfs, raidz, spare and jbod

2008-08-21 Thread Brian Couper
Hi,

One of the things you could have done to continue the resilver is zpool clear.
This would have let you continue to replace the drive you pulled out. Once that
was done, you could then have figured out what was wrong with the second faulty
drive.

The second drive only had checksum errors; ZFS was doing its job, and the data
on the drive was usable. zpool clear would have kept the pool online, albeit with
lots of complaints.
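
A minimal sketch of riding out checksum errors with zpool clear; "tank" is a placeholder pool name:

```shell
# Reset the CKSUM counters so the pool stays online during the replace.
zpool clear tank

# -v lists any individual files ZFS could not reconstruct.
zpool status -v tank
```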

I have had to use zpool clear multiple times on one of my zpools after a PSU
failure took out a HDD and damaged another. Mobo and RAM died too :(
The damaged drive racked up thousands and thousands of errors while I replaced
the dead drive; in the end I only lost one small file.

I'm no expert, but that's how I got round a similar problem.
 
 