Re: Hot swapping failed disk /dev/sda in RAID 1 array

2016-07-20 Thread Urs Thuermann
Peter Ludikovsky writes:

> Ad 1: Yes, the SATA controller has to support Hot-Swap. You _can_ remove
> the device nodes by running
> # echo 1 > /sys/block/<device>/device/delete

Thanks, I now have my RAID array fully working again.  This is what I
have done:

1. As you suggested above, I deleted the drive (the /dev/sda* device
   nodes and the entries in /proc/partitions):

echo 1 > /sys/block/sda/device/delete

2. Hot-plugged the new drive.  Obviously, my controller doesn't
   support hot-plug notification or isn't configured to notify the
   kernel.  Using Google, I found the command to have the kernel
   rescan for drives:

echo "- - -" > /sys/class/scsi_host/host0/scan

3. The rest is straightforward:

fdisk /dev/sda  [Add partition /dev/sda1 with type 0xfd]
mdadm /dev/md0 --add /dev/sda1
update-grub
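
Since update-grub only regenerates the configuration, the boot loader
itself probably also needs reinstalling on the new drive so that the
machine stays bootable if /dev/sdb ever fails.  A minimal sketch,
assuming GRUB 2 with MBR booting as before:

grub-install /dev/sda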

Now everything is up again and both drives are in sync, without a reboot:

# cat /proc/mdstat 
Personalities : [raid1] 
md0 : active raid1 sda1[2] sdb1[1]
  1953381376 blocks super 1.2 [2/2] [UU]
  bitmap: 1/15 pages [4KB], 65536KB chunk

unused devices: <none>
# uptime
 11:49:01 up 106 days, 22:44, 23 users,  load average: 0.13, 0.19, 0.15

I only wonder if it's normal that the drives are numbered 2 and 1
instead of 0 and 1.
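
(For what it's worth, the bracketed figures appear to be mdadm's
internal device numbers rather than the RAID slot roles.  A quick way
to check the actual roles, assuming the array is still /dev/md0:

# mdadm --detail /dev/md0

The RaidDevice column in that output should still show roles 0 and 1.)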

> Ad 2: Depends on the controller, see 1. It might recognize the new
> drive, or not. It might see the correct device, or not.

Next time I reboot the machine I will check whether there are any BIOS
settings to make the controller support hot-plugging.

urs



Re: Hot swapping failed disk /dev/sda in RAID 1 array

2016-07-19 Thread Pascal Hambourg

On 19/07/2016 at 16:01, Urs Thuermann wrote:


>    Shouldn't the device nodes and entries in /proc/partitions
>    disappear when the drive is pulled?  Or does the BIOS or the SATA
>    controller have to support this?
>
> 2. Can I hotplug the new drive and rebuild the RAID array?


As others replied, the SATA controller must support hot-plug, but also 
must be configured in AHCI mode in the BIOS settings so that the kernel 
is notified when a device is added or removed.
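
A quick way to check from a running system whether the controller is
actually operating in AHCI mode (just a sketch; the exact wording of
the kernel messages varies by chipset and kernel version):

# lspci | grep -i -e sata -e ahci
# dmesg | grep -i ahci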




Re: Hot swapping failed disk /dev/sda in RAID 1 array

2016-07-19 Thread Andy Smith
Hi Urs,

On Tue, Jul 19, 2016 at 04:01:39PM +0200, Urs Thuermann wrote:
> 2. Can I hotplug the new drive and rebuild the RAID array?

It should work, if your SATA port supports hotplug. Plug the new
drive in and see if the new device node appears. If it does then
you're probably good to go.
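
To watch for the kernel noticing the new disk you can follow the
kernel log with something like:

# dmesg -w

(assuming a reasonably recent util-linux dmesg; otherwise just re-run
dmesg or check /var/log/kern.log)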

You can dump out the partition table from an existing drive with
something like:

# sfdisk -d /dev/sdb > sdb.out

And then partition the new drive the same with something like:

# sfdisk /dev/sdc < sdb.out

(assuming sdb is your working existing drive and sdc is the device
node of the new drive)

Then add the new device to the md with something like:

# mdadm /dev/md0 --add /dev/sdc1

(assuming your array is md0; adjust to suit)

At that point /proc/mdstat should show a rebuild taking place.
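
You can keep an eye on the resync with something like:

# watch cat /proc/mdstat

or:

# mdadm --detail /dev/md0

(again assuming your array is md0; the State line there should show
the recovery in progress)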

If you run into difficulty, try asking on the linux-raid mailing list
- it's very good for support, and it's best to ask there before doing
anything that you have the slightest doubt about!

Cheers,
Andy

-- 
http://bitfolk.com/ -- No-nonsense VPS hosting



Re: Hot swapping failed disk /dev/sda in RAID 1 array

2016-07-19 Thread Peter Ludikovsky
Ad 1: Yes, the SATA controller has to support Hot-Swap. You _can_ remove
the device nodes by running
# echo 1 > /sys/block/<device>/device/delete

Ad 2: Depends on the controller, see 1. It might recognize the new
drive, or not. It might see the correct device, or not.

Ad 3: As long as the second HDD is within the BIOS boot order, that
should work.
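
On Debian, the set of drives GRUB gets installed to can also be
reviewed and changed interactively; a sketch, assuming the grub-pc
package is in use:

# dpkg-reconfigure grub-pc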

Regards,
/peter

On 19.07.2016 at 16:01, Urs Thuermann wrote:
> In my RAID 1 array /dev/md0 consisting of two SATA drives /dev/sda1
> and /dev/sdb1 the first drive /dev/sda has failed.  I have called
> mdadm --fail and mdadm --remove on that drive and then pulled the
> cables and removed the drive.  The RAID array continues to work fine
> but in degraded mode.
> 
> I have some questions:
> 
> 1. The block device nodes /dev/sda and /dev/sda1 still exist and the
>partitions are still listed in /proc/partitions.
> 
>That causes I/O errors when running LVM tools or fdisk -l or other
>tools that try to access/scan all block devices.
> 
>Shouldn't the device nodes and entries in /proc/partitions
>disappear when the drive is pulled?  Or does the BIOS or the SATA
>controller have to support this?
> 
> 2. Can I hotplug the new drive and rebuild the RAID array?  Since
>removal of the old drive seems not to be detected I wonder if the
>new drive will be detected correctly.  Will the kernel continue
>with the old drive's size and partitioning, as is still found in
>/proc/partitions?  Will a call
> 
> blockdev --rereadpt /dev/sda
> 
>help?
> 
> 3. Alternatively, I could reboot the system.  I have called
> 
> grub-install /dev/sdb
> 
>and hope this suffices to make the system bootable again.
>Would that be safer?
> 
> Any other suggestions?
> 
> 
> urs
> 





Hot swapping failed disk /dev/sda in RAID 1 array

2016-07-19 Thread Urs Thuermann
In my RAID 1 array /dev/md0 consisting of two SATA drives /dev/sda1
and /dev/sdb1 the first drive /dev/sda has failed.  I have called
mdadm --fail and mdadm --remove on that drive and then pulled the
cables and removed the drive.  The RAID array continues to work fine
but in degraded mode.

I have some questions:

1. The block device nodes /dev/sda and /dev/sda1 still exist and the
   partitions are still listed in /proc/partitions.

   That causes I/O errors when running LVM tools or fdisk -l or other
   tools that try to access/scan all block devices.

   Shouldn't the device nodes and entries in /proc/partitions
   disappear when the drive is pulled?  Or does the BIOS or the SATA
   controller have to support this?

2. Can I hotplug the new drive and rebuild the RAID array?  Since
   removal of the old drive seems not to be detected I wonder if the
   new drive will be detected correctly.  Will the kernel continue
   with the old drive's size and partitioning, as is still found in
   /proc/partitions?  Will a call

blockdev --rereadpt /dev/sda

   help?

3. Alternatively, I could reboot the system.  I have called

grub-install /dev/sdb

   and hope this suffices to make the system bootable again.
   Would that be safer?

Any other suggestions?


urs