Re: RAID1 drive replacement help?
Paul M wrote:
> Thinking about this some more, I suspect that what may be happening is
> that the disk still thinks it is a spare. Try blowing away the RAID
> partition, possibly even replace it with a regular partition and write
> data to it just to make sure. Then delete that, recreate the RAID
> partition and try again to reconstruct the component. (It may also be
> possible to achieve this with the -r option to raidctl, but I'm
> unfamiliar with the operation of this switch.)

I will nuke the RAID partition: I'll relabel it as a regular partition and format a file system on it, then relabel it again as a RAID partition. I assume that counts as a nuke-'em-from-space. :)

Once it's nuked, what is the series of steps to add it as a component to the array? I want to make sure I get it right this time.

> Essentially, you configured the disk as a spare, now you want to
> override that configuration and configure it as a component. The man
> page does say that the spare and the component it was reconstructed
> from are interchangeable, but I think the system is getting confused
> as to just what wd1d is.
>
> OK... Taking a different approach, you could keep wd1d as the spare,
> but add a 3rd disk to replace the failed component and simply
> reconstruct onto that (using the -B switch to raidctl).

I will look in the box to see if I can get another drive in there. I may be space constrained...

> Also - don't forget about the syslog.

Sorry, but I'm not clear on what you mean here? Could you clarify?

Thanks,
Jeff

--
Jeffrey C. Smith        Phone: 512.692.7607
RevolutionONE           Cell : 512.965.3898
j...@revolutionone.com
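For the archive, the "nuke from space" Paul describes might look like the sketch below. This is an assumption-laden outline, not a verified recipe: it assumes the replacement disk is wd1, that raid0 stays up degraded on /dev/wd0d throughout, and that zeroing the front of the partition is enough to destroy a stale RAIDframe component label (the likely source of the "still thinks it is a spare" confusion). Double-check device names before running anything like this.

```shell
# Everything here touches only the replacement disk (wd1); raid0 keeps
# running degraded on /dev/wd0d the whole time.

# 1. Zero the start of the d partition so any stale RAIDframe component
#    label is destroyed. The count of 10 MB is illustrative, not a
#    documented minimum.
dd if=/dev/zero of=/dev/rwd1d bs=1m count=10

# 2. Optionally prove the space works as a plain partition: change d
#    from RAID to 4.2BSD in the label editor, newfs it, mount it and
#    write a file, then unmount.
disklabel -E wd1        # interactively set d's fstype to 4.2BSD
newfs /dev/rwd1d
# ... mount /dev/wd1d somewhere, write a test file, umount ...

# 3. Relabel d back to fstype RAID.
disklabel -E wd1        # interactively set d's fstype back to RAID

# 4. Then retry the straight in-place reconstruction:
raidctl -v -R /dev/wd1d raid0
```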
Re: RAID1 drive replacement help?
Paul M wrote:
> According to the man page (if my memory is correct), the name
> component1 is a placeholder used by raidctl when it is unable to
> access the drive - in other words this component is bad. Remove the
> drive completely and it will still list it as component1. So
> component1 is not the name. You need to use (again, if my memory is
> correct):
>
>   # raidctl -v -R /dev/wd1d raid0

Paul,

Thanks for the reply. I still seem to be having trouble. Here's the latest status of the array:

  # raidctl -v -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: failed
  No spares.
  Component label for /dev/wd0d:
     Row: 0, Column: 0, Num Rows: 1, Num Columns: 2
     Version: 2, Serial Number: 100, Mod Counter: 1237272102
     Clean: No, Status: 0
     sectPerSU: 128, SUsPerPU: 1, SUsPerRU: 1
     Queue size: 100, blocksize: 512, numBlocks: 238613376
     RAID Level: 1
     Autoconfig: Yes
     Root partition: Yes
     Last configured as: raid0
  component1 status is: failed.  Skipping label.
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

Now, I try to reconstruct /dev/wd1d (the failed drive):

  # raidctl -v -R /dev/wd1d raid0
  raidctl: /dev/wd1d is not a component of this device

Still no luck. Any more ideas???

Thanks,
Jeff
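One way to see why raidctl refuses /dev/wd1d might be to read the component label the disk actually carries. A sketch, with a caveat: the -g flag reads a component label in NetBSD's raidctl, and OpenBSD's RAIDframe tools were derived from NetBSD's, so its availability on OpenBSD 4.5 is an assumption here, not something confirmed in this thread.

```shell
# Dump the on-disk component label for wd1d. If the row/column fields
# or "Last configured as" describe a spare rather than column 1 of
# raid0, that would explain "is not a component of this device".
raidctl -g /dev/wd1d raid0

# Compare against the healthy component's label:
raidctl -g /dev/wd0d raid0
```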
Re: RAID1 drive replacement help?
Paul M wrote:
> On 18/09/2009, at 2:28 PM, Jeffrey C. Smith wrote:
>> Now, I try to reconstruct /dev/wd1d (the failed drive):
>>
>>   # raidctl -v -R /dev/wd1d raid0
>>   raidctl: /dev/wd1d is not a component of this device
>>
>> Still no luck. Any more ideas???
>>
>> Thanks,
>> Jeff
>
> Has your raid0.conf file changed? The one you posted earlier shows
> that /dev/wd1d *is* a component of that array.

It has not changed. Here it is:

  # more /etc/raid0.conf
  START array
  1 2 0

  START disks
  /dev/wd0d
  /dev/wd1d

  START layout
  128 1 1 1

  START queue
  fifo 100

> What do disklabel wd1 and dmesg | grep wd show? Anything suspicious?

Here are the disk labels:

  # disklabel wd0
  # Inside MBR partition 3: type A6 start 63 size 241248042
  # /dev/rwd0c:
  type: ESDI
  disk: ESDI/IDE disk
  label: IC35L120AVV207-1
  flags:
  bytes/sector: 512
  sectors/track: 63
  tracks/cylinder: 255
  sectors/cylinder: 16065
  cylinders: 15017
  total sectors: 241254720
  rpm: 3600
  interleave: 1
  trackskew: 0
  cylinderskew: 0
  headswitch: 0           # microseconds
  track-to-track seek: 0  # microseconds
  drivedata: 0

  16 partitions:
  #          size    offset  fstype [fsize bsize cpg]
    a:    2104452        63  4.2BSD   2048 16384   1
    b:     530145   2104515    swap
    c:  241254720         0  unused
    d:  238613445   2634660    RAID

  # disklabel wd1
  # Inside MBR partition 3: type A6 start 63 size 241248042
  # /dev/rwd1c:
  type: ESDI
  disk: ESDI/IDE disk
  label: IC35L120AVV207-1
  flags:
  bytes/sector: 512
  sectors/track: 63
  tracks/cylinder: 255
  sectors/cylinder: 16065
  cylinders: 15017
  total sectors: 241254720
  rpm: 3600
  interleave: 1
  trackskew: 0
  cylinderskew: 0
  headswitch: 0           # microseconds
  track-to-track seek: 0  # microseconds
  drivedata: 0

  16 partitions:
  #          size    offset  fstype [fsize bsize cpg]
    a:    2104452        63  4.2BSD   2048 16384   1
    b:     530145   2104515    swap
    c:  241254720         0  unused
    d:  238613445   2634660    RAID

Here's the entire dmesg (lots of wdx/raid0 stuff at the bottom):

  # dmesg
  OpenBSD 4.5 (GENERIC) #0: Tue Aug 11 17:54:58 CDT 2009
      r...@gateway.whiteinstruments.com:/mnt/sys/arch/i386/compile/GENERIC
  cpu0: Intel(R) Pentium(R) 4 CPU 1400MHz (GenuineIntel 686-class) 1.39 GHz
  cpu0: FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM
  real mem  = 133296128 (127MB)
  avail mem = 120193024 (114MB)
  mainbus0 at root
  bios0 at mainbus0: AT/286+ BIOS, date 08/02/01, BIOS32 rev. 0 @ 0xffe90, SMBIOS rev. 2.3 @ 0xf0450 (97 entries)
  bios0: vendor Dell Computer Corporation version A04 date 08/02/2001
  bios0: Dell Computer Corporation OptiPlex GX400
  acpi0 at bios0: rev 0
  acpi0: tables DSDT FACP SSDT BOOT
  acpi0: wakeup devices VBTN(S4) PCI0(S5) USB0(S3) USB1(S3) PCI1(S5) KBD_(S3)
  acpitimer0 at acpi0: 3579545 Hz, 24 bits
  acpiprt0 at acpi0: bus 0 (PCI0)
  acpiprt1 at acpi0: bus 2 (PCI1)
  acpicpu0 at acpi0
  acpibtn0 at acpi0: VBTN
  bios0: ROM list: 0xc/0x9800 0xc9800/0x2800
  cpu0 at mainbus0: (uniprocessor)
  pci0 at mainbus0 bus 0: configuration mode 1 (bios)
  pchb0 at pci0 dev 0 function 0 Intel 82850 Host rev 0x02
  intelagp0 at pchb0
  agp0 at intelagp0: aperture at 0xf000, size 0x800
  ppb0 at pci0 dev 1 function 0 Intel 82850/82860 AGP rev 0x02
  pci1 at ppb0 bus 1
  vga1 at pci1 dev 0 function 0 NVIDIA Riva TNT2 rev 0x15
  wsdisplay0 at vga1 mux 1: console (80x25, vt100 emulation)
  wsdisplay0: screen 1-5 added (80x25, vt100 emulation)
  ppb1 at pci0 dev 30 function 0 Intel 82801BA Hub-to-PCI rev 0x04
  pci2 at ppb1 bus 2
  dc0 at pci2 dev 8 function 0 Macronix PMAC 98715 rev 0x25: irq 10, address 00:80:c6:fa:cd:20
  dcphy0 at dc0 phy 31: internal PHY
  rl0 at pci2 dev 10 function 0 Realtek 8139 rev 0x10: irq 11, address 00:50:fc:5f:45:61
  rlphy0 at rl0 phy 0: RTL internal PHY
  xl0 at pci2 dev 12 function 0 3Com 3c905C 100Base-TX rev 0x78: irq 11, address 00:06:5b:4a:5e:0f
  exphy0 at xl0 phy 24: 3Com internal media interface
  ichpcib0 at pci0 dev 31 function 0 Intel 82801BA LPC rev 0x04
  pciide0 at pci0 dev 31 function 1 Intel 82801BA IDE rev 0x04: DMA, channel 0 wired to compatibility, channel 1 wired to compatibility
  wd0 at pciide0 channel 0 drive 0: IC35L120AVV207-1
  wd0: 16-sector PIO, LBA48, 117800MB, 241254720 sectors
  wd1 at pciide0 channel 0 drive 1: IC35L120AVV207-1
  wd1: 16-sector PIO, LBA48, 117800MB, 241254720 sectors
  wd0(pciide0:0:0): using PIO mode 4, Ultra-DMA mode 2
  wd1(pciide0:0:1): using PIO mode 4, Ultra-DMA mode 2
  atapiscsi0 at pciide0 channel 1 drive 0
  scsibus0 at atapiscsi0: 2 targets
  cd0 at scsibus0 targ 0 lun 0: LG, CD-ROM CRD-8482B, 1.05 ATAPI 5/cdrom removable
  cd0(pciide0:1:0): using PIO mode 4, Ultra-DMA mode 2
  uhci0 at pci0 dev 31 function 2 Intel 82801BA USB rev 0x04: irq 11
  ichiic0 at pci0 dev 31 function 3 Intel 82801BA SMBus rev 0x04: irq 10
  iic0 at ichiic0
  uhci1 at pci0 dev 31 function 4 Intel 82801BA USB rev 0x04: irq 9
  auich0 at pci0 dev 31
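Nothing in the labels above looks off: both d partitions agree. A quick check like that can be scripted; here is a self-contained sketch that parses disklabel-style partition lines, with the sample input inlined from the labels above rather than read live from the disks (the `check_d` helper name is made up for illustration).

```shell
# Extract "size fstype" of the d partition from disklabel output on
# stdin. For a RAID1 pair, both disks' d partitions must agree.
check_d() {
    awk '$1 == "d:" { print $2, $4 }'
}

# Sample lines taken from the wd0/wd1 labels posted above.
wd0_d=$(printf '  d: 238613445 2634660 RAID\n' | check_d)
wd1_d=$(printf '  d: 238613445 2634660 RAID\n' | check_d)

[ "$wd0_d" = "$wd1_d" ] && echo "d partitions match: $wd0_d"
```

On a live system the same helper could be fed with `disklabel wd0 | check_d` instead of the inlined sample.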
Re: RAID1 drive replacement help?
Paul M wrote:
> On 16/09/2009, at 10:46 AM, Jeffrey C. Smith wrote:
>> I am trying to add a new drive to replace a failed drive on my RAID1
>> OpenBSD system. I have read the available documentation but can't
>> get the drive added permanently. Here's what I've done so far-
>> ...
>> Any feedback would be most appreciated...
>
> You did not state what the failed disk was - was it also wd1d?

Sorry. Yes it is...

> So, you replaced the failed drive, then configured this new drive as
> a spare, and then reconstructed onto it. Is that correct?

Yes. And as a spare the array is up and healthy:

  # raidctl -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: spared
  Spares:
             /dev/wd1d: used_spare
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

> I think this is wrong because the new disk is not a spare, it is one
> of the components. I think what you need is the -R option to raidctl,
> rather than the -F option.

OK. So I rebooted and the status is now failed:

  # raidctl -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: failed
  No spares.
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

I try the -R option:

  # raidctl -v -R component1 raid0
  Reconstruction status:

It doesn't complain and comes back almost immediately (strange). I check status again:

  # raidctl -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: failed
  No spares.
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

Says reconstruction is complete but the device is still failed? Maybe it wants the device name:

  # raidctl -v -R wd1d raid0
  raidctl: wd1d is not a component of this device

Nope. Seems component1 is the correct name.

So: raidctl does not complain when I run the reconstruct command (raidctl -v -R component1 raid0), and it even starts the reconstruction progress indicator, but it quickly returns after that and does not seem to do anything.

To recap: the only way I can get the drive into the array is to add it as a spare and use the -F option to reconstruct it. After that the array is up and healthy. If I reboot, the wd1 drive (component1) fails and I'm back to where I started.

Any ideas out there???

Thanks much,
Jeff
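The failure mode recapped above - healthy until reboot, then component1 failed again - can be spotted mechanically from `raidctl -s` output. A small self-contained sketch (the sample status text is inlined from the output shown earlier in this message, so nothing here actually queries a RAID device):

```shell
# Classify raidctl -s output: fully optimal, degraded, or running on a
# used spare (the state that will not survive a reboot in this thread).
status='raid0 Components:
           /dev/wd0d: optimal
          component1: spared
Spares:
           /dev/wd1d: used_spare'

if printf '%s\n' "$status" | grep -q used_spare; then
    state='running on a spare (will degrade again at reboot)'
elif printf '%s\n' "$status" | grep -q failed; then
    state='degraded'
else
    state='optimal'
fi
echo "$state"
```

On a live system the `status` variable would be `status=$(raidctl -s raid0)` instead of the inlined sample.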
RAID1 drive replacement help?
I am trying to add a new drive to replace a failed drive on my RAID1 OpenBSD system. I have read the available documentation but can't get the drive added permanently. Here's what I've done so far.

First, add the drive as a spare:

  # raidctl -a /dev/wd1d raid0

Checking the status gives:

  # raidctl -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: failed
  Spares:
             /dev/wd1d: spare
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

Forcing the reconstruction of the array:

  # raidctl -vF component1 raid0
  Reconstruction status:
  100% |                              | ETA: 00:00 /

Now the status looks like this (after about 2 hours of rebuild time):

  # raidctl -s raid0
  raid0 Components:
             /dev/wd0d: optimal
            component1: spared
  Spares:
             /dev/wd1d: used_spare
  Parity status: clean
  Reconstruction is 100% complete.
  Parity Re-write is 100% complete.
  Copyback is 100% complete.

As far as I can tell the second drive is now part of the array as a spare. This is the second time I've completed these steps, which explains why the parity is clean - I already rebuilt the parity earlier.

My config file looks like this:

  # more /etc/raid0.conf
  START array
  1 2 0

  START disks
  /dev/wd0d
  /dev/wd1d

  START layout
  128 1 1 1

  START queue
  fifo 100

If I reboot the machine it will come up, but the second drive (/dev/wd1d) is still in the failed state. I suspect that I need to reconfigure the array to make wd1d a permanent part of the array. However, I am not sure how that would be done, and I don't want to make a mistake and trash the array.

What do I need to do to make the spare a permanent part of the array, so that on the next system boot it will have both drives in the optimal state?

Any feedback would be most appreciated...

Thanks,
Jeff
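For reference, the steps above mix two distinct replacement paths that raidctl supports, and later messages in this thread suggest that mix is what bites. A hedged sketch of the two paths as separate procedures (device names follow this thread; wd2d is hypothetical, and the exact flag behaviour should be confirmed against raidctl(8) on the running release):

```shell
# Path A: in-place reconstruction. The replacement disk occupies the
# same device name (/dev/wd1d) as the failed component, so rebuild it
# directly as a component - no spare is involved, and the rebuilt
# component label should survive reboots:
raidctl -v -R /dev/wd1d raid0

# Path B: hot-spare rebuild. Add a *different* disk as a spare, fail
# over onto it, and (once the original component's location holds a
# working disk again) copy the data back with -B:
raidctl -v -a /dev/wd2d raid0      # wd2d is hypothetical here
raidctl -v -F component1 raid0
raidctl -v -B raid0                # copyback to the replaced component
```

The thread's symptom - a used_spare that reverts to "failed" after reboot - is consistent with using Path B's -a/-F steps on the very disk that Path A would have rebuilt in place.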