Marcus,

Holy moly, that is beautiful.

So glad to understand better what's in the box.

Also, please note that I'm not trying to suggest implementing lots of crap; I'm perfectly clear that high security is correlated with low complexity.


On 2016-02-21 00:29, Marcus MERIGHI wrote:
ti...@openmailbox.org (Tinker), 2016.02.20 (Sat) 16:43 (CET):
..
You appear to mean bioctl(8). That's the only place I could find the word
'patrol'. bioctl(8) can control more than softraid(4) devices.

bio(4):
     The following device drivers register with bio for volume
     management:

           ami(4)         American Megatrends Inc. MegaRAID
                          PATA/SATA/SCSI RAID controller
           arc(4)         Areca Technology Corporation SAS/SATA RAID
                          controller
           cac(4)         Compaq Smart Array 2/3/4 SCSI RAID controller
           ciss(4)        Compaq Smart Array SAS/SATA/SCSI RAID
                          controller
           ips(4)         IBM SATA/SCSI ServeRAID controller
           mfi(4)         LSI Logic & Dell MegaRAID SAS RAID controller
           mpi(4)         LSI Logic Fusion-MPT Message Passing Interface
           mpii(4)        LSI Logic Fusion-MPT Message Passing Interface
                          II
           softraid(4)    Software RAID

It is talking about controlling a hardware RAID controller in that 'patrol'
paragraph, isn't it?

So by this you mean that patrolling is really implemented for softraid?? (Karel and Constantine don't agree??)

So I just do "bioctl -t start sdX", where sdX is the name of my softraid device, and it'll do the "scrub", i.e. read through all underlying physical media to check its internal integrity? For RAID1C that would mean checking data readability and that the checksums are correct. And "doas bioctl softraid0" will show me the % status, and if I don't get any errors before it goes back to normal, that means the patrol was successful, right?

(And as usual, patrol is implemented to run at the lowest priority, so it should not interfere too much with ordinary SSD softraid operation.)
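
For concreteness, here is the sequence I have in mind (the device names are just examples from my side, and I'm assuming -t is even honoured for a softraid device, which per the bioctl(8) wording above may only apply to hardware controllers):

$ doas bioctl -t start sd6    # kick off the patrol/scrub on the RAID device
$ doas bioctl softraid0       # watch the volume status / % done while it runs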

* Rebuild - I think I saw some console dump of the status of a rebuild
process on the net, so MAYBE or NO..?

That's what it looks like:

$ doas bioctl softraid0
Volume      Status               Size Device
softraid0 0 Rebuild    12002360033280 sd6     RAID5 35% done
          0 Rebuild     4000786726912 0:0.0   noencl <sd2a>
          1 Online      4000786726912 0:1.0   noencl <sd3a>
          2 Online      4000786726912 0:2.0   noencl <sd4a>
          3 Online      4000786726912 0:3.0   noencl <sd5a>

Yey!!

Wait, can you explain to me what I would write instead of "device" and "channel:target[.lun]" in "bioctl -R device" and "bioctl -R channel:target[.lun]", AND what effect those would have?
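
To make the question concrete, I imagine the two forms would be used something like this (all names and addresses here are just placeholders on my part, please correct me):

$ doas bioctl -R /dev/sd2a sd6    # rebuild volume sd6 onto the chunk /dev/sd2a
$ doas bioctl -R 0:1.0 sd6        # the same, addressing the chunk as channel:target.lun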

Say my sd0 and sd1 SSDs already run a RAID1C. Can I then make softraid extend that RAID1C with my sd2 SSD by "rebuilding" onto it, as a way to live-copy all my data to sd2? That would work as a kind of live attach, even if an expensive one.

Does it work for a softraid that's live already?

* Hotspare - MAYBE, "man softraid" says "Currently there is no automated
mechanism to recover from failed disks.", but that is not so specific
wording, and I think I read a hint somewhere that there is hotspare
functionality.

bioctl(8)
     -H channel:target[.lun]
             If the device at channel:target[.lun] is currently marked
             ``Unused'', promote it to being a ``Hot Spare''.

That's the only mention of 'hot spare'. And again talking about
controlling a hardware RAID controller, isn't it?

What is 'not so specific' about 'no' (as in "Currently there is *no*
automated mechanism to recover from failed disks")?

Awesome.

I guess "bioctl softraid0" will list which hotspares there are currently, and that "-d" will drop a hotspare.


The fact that there is hotspare functionality means that there are cases in which softraid will take a disk out of use.

That would be when the disk reports itself as COMPLETELY out of use ALL BY ITSELF, such as detaching itself at the SATA controller level or reporting failure via some SMART command?

A disk that merely half-breaks, with broken sectors and a 99% I/O slowdown, will not be taken offline though, so I guess I should buy enterprise drives with I/O access time guarantees then.

* Hotswap - MAYBE, this would depend on if there's rebuild. Only disconnect
("bioctl -O" I think; "bioctl -d" is to.. unmount or self-destruct a
softraid?)

bioctl -O should fail the chunk specified, simulating hardware failure.
After this command you have an 'Offline' chunk in the 'bioctl' output.

bioctl -d 'detach', not 'destroy'; just as sdX appears when you assemble
  a softraid volume, this makes it go away. Better unmount before...

So "-d" is to take down a whole softraid. "-O" could work to take out a single physical disk but it's unclean.


So then, there is a very unrefined hotswapping functionality in that "-O" can be used to take out a single physical drive, and "-R" (if I understood it correctly above) can be used to plug in a drive.

Preferable would be to "hotswap" the whole softraid by simply taking it offline altogether ("bioctl -d [raid dev]") and then taking it online altogether ("bioctl -c 1 -l [comma-separated devs] softraid0").
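
That is, roughly (mount point and device names are just examples of mine):

$ umount /mnt/raid                                     # unmount first, as you say
$ doas bioctl -d sd6                                   # take the whole softraid volume offline
$ doas bioctl -c 1 -l /dev/sd2a,/dev/sd3a softraid0    # bring it back up from its chunks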


The man pages are sometimes over-minimalistic from the perspective of an individual user who's trying to learn; this is why I'm asking for your clarification.

I am quite sure the man pages are kept as condensed as they are on
purpose.

You can always read mplayer(1) if you want something lengthy ;-)

So your clarifications would still be much appreciated.

Nothing authoritative from me!
I am just trying to flatten your learning curve.

Bye, Marcus


Awesome. Thank you so much!
