On 5/6/22 9:03 AM, Proton wrote:
Hi,

I'm using softraid 1C on my remote dedicated server, built on two NVMe disks.
It works really well from a performance perspective and provides some data
protection, but there is no way to check device health status because SMART
doesn't work. I guess bioctl will tell me only whether the devices are
'online', but nothing more?

Well... a softraid device isn't a physical device, so I'm not sure
what you would get that you couldn't get out of bioctl.  I have:
  bioctl softraid0
in my /etc/daily.local, and I also have a backup system that checks softraid
status on all systems (hey, as long as I'm in the neighborhood and doing
stuff as root...)
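
A periodic check along those lines could be scripted roughly as below.
This is only a sketch: the function names are made up for illustration,
and it ASSUMES that the output of "bioctl softraid0" starts with one
header line and that every healthy volume or chunk line contains the
word "Online". Check that against the output on your own system before
relying on it.

```python
import subprocess


def degraded_lines(bioctl_output):
    """Return the status lines that do not contain the word 'Online'.

    Assumption: the first non-empty line is a column header and every
    later line describes a volume or one of its chunks.
    """
    lines = [ln for ln in bioctl_output.splitlines() if ln.strip()]
    return [ln for ln in lines[1:] if "Online" not in ln.split()]


def check_volume(volume="softraid0"):
    """Run `bioctl <volume>` and return the lines that look unhealthy."""
    result = subprocess.run(
        ["bioctl", volume], capture_output=True, text=True, check=True
    )
    return degraded_lines(result.stdout)
```

An empty list from check_volume() means nothing looked wrong; anything
else is worth mailing to root.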

You can look at the SMART status of the underlying physical devices in
the softraid set exactly as you would non-softraid drives.

So, if you put a lot of faith in SMART (I don't), what are you missing?

Are there any "poor man's" methods for checking the state of devices that you
would suggest performing periodically, like 'cat /dev/rsd0c > /dev/null' +
'cat /dev/rsd1c > /dev/null'?
Will potential I/O errors or timeouts be reported to stderr or to some system
log file?

Doing read tests like that over the entire underlying drives seems like
a good idea to me. I haven't implemented it, so I can't say how it would
respond to real problems, but I can think of only one good way to find
out.  (From experience: how things act when a drive fails is hard to
predict and really hard to test.  So even a dozen "this is how it behaved"
results don't tell you what happens for the NEXT failure.)

I would definitely want to put some rate limiting on it so you don't
kill performance overall.
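
A rate-limited read pass could look something like the sketch below.
It is untested against real failing hardware; the function name, the
50 MiB/s default and the 1 MiB chunk size are arbitrary choices for
illustration, not recommendations. It reads the device from start to
end, throws the data away, sleeps whenever it gets ahead of the allowed
average rate, and hands back any I/O error the read raises.

```python
import time


def throttled_read(path, max_bytes_per_sec=50 * 1024 * 1024,
                   chunk_size=1024 * 1024):
    """Read `path` start to end, discarding the data, at a capped rate.

    Returns (bytes_read, error). error is None after a clean read to
    end of file, otherwise the OSError raised by open() or read().
    """
    total = 0
    start = time.monotonic()
    try:
        # buffering=0 gives unbuffered reads of chunk_size bytes each.
        with open(path, "rb", buffering=0) as dev:
            while True:
                chunk = dev.read(chunk_size)
                if not chunk:
                    return total, None
                total += len(chunk)
                # Time the data read so far should have taken at the
                # cap, minus the time actually spent: sleep off the rest.
                ahead = total / max_bytes_per_sec - (time.monotonic() - start)
                if ahead > 0:
                    time.sleep(ahead)
    except OSError as exc:
        return total, exc
```

Called as throttled_read("/dev/rsd0c") and throttled_read("/dev/rsd1c")
from a root cron job, a non-None error is the thing to report. Whether
a dying drive actually shows up as a read error here rather than as a
hang or a kernel log message is exactly the part I can't predict.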

As a last resort I can reboot into the Linux rescue system from time to time,
but this would not be very convenient.

Should I forget about NVMe and use another option: LSI MegaRAID HW with SSD
disks attached?

What would you gain there?  Then you could only access what the
controller thinks of the drive's state through bioctl (which
you seemed to think was inadequate for softraid).

In the HW vs. SW RAID argument, I'm firmly in the "either way" camp,
but if I understand your query, you are LOSING info here.

(I've also heard stories about SSDs and HW RAID not playing well
together, but I'm not prepared to defend or refute that statement.
On the other hand, I've seen SSDs work differently enough from what
HW and SW expect that ... nothing would surprise me).

Nick.
