Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread hw
On Sat, 2020-11-14 at 14:37 -0700, Warren Young wrote:
> On Nov 14, 2020, at 5:56 AM, hw  wrote:
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > > On Nov 11, 2020, at 2:01 PM, hw  wrote:
> > > > I have yet to see software RAID that doesn't kill the performance.
> > > 
> > > When was the last time you tried it?
> > 
> > I'm currently using it, and the performance sucks.
> 
> Be specific.  Give chip part numbers, drivers used, whether this is on-board 
> software RAID or something entirely different like LVM or MD RAID, etc.  For 
> that matter, I don’t even see that you’ve identified whether this is CentOS 
> 6, 7 or 8.  (I hope it isn't older!)

I don't need to be specific because I have seen the difference in
practical usage over the last 20 years.  I'm not setting up
scientific testing environments that would cost tremendous amounts
of money; I'm using available and cost-efficient hardware and software.

> > Perhaps it's
> > not the software itself or the CPU but the on-board controllers
> > or other components being incapable of handling multiple disks in a
> > software raid.  That's something I can't verify.
> 
> Sure you can.  Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.

No, I can't.  I don't have tons of different CPUs, mainboards, controller
cards and electronic diagnostic equipment around to do that, and what
would you even benchmark?  Is the user telling you that the software they
are using in a VM that is stored on an NFS server, run by another server
connected to it, is now running faster or slower?  Are you doing SQL queries
to create reports that are rarely required and take a while to run your
benchmark?  And what is even relevant?

I am seeing that a particular piece of software running in a VM is now running
no slower, and maybe even faster, than before the failed disk was replaced.
That means the hardware RAID with 8 disks in RAID 1+0 is not faster, and is even
slower, than the software RAID (two disks, as RAID 0 each) on otherwise the
same hardware.  The CPU load on the storage server is also higher, which in
this case does not matter.  I'm happy with the result so far, and that is what
matters.

If the disks were connected to the mainboard instead, the software might
be running slower.  I can't benchmark that, either, because I can't connect
the disks to the SATA ports on the board.  If there were 8 disks in a
RAID 1+0, all connected to the board, it might be a lot slower.  I can't
benchmark that either; the board doesn't have that many SATA connectors.

I only have two new disks and no additional or different hardware.  Telling
me to specify particular chips and such is totally pointless.  Benchmarking
is not feasible, and it would be pointless as well.

Sure you can do some kind of benchmarking in a lab if you can afford it, but
how does that correlate to the results you'll be getting in practice?  Even if
you involve users, those users will be different from the users I'm dealing
with.

> In a 2-disk array, a proper software RAID system should give 2x a single 
> disk’s performance for both read and write in RAID-0, but single-disk write 
> performance for RAID-1.
>
> Such values should scale reasonably as you add disks: RAID-0 over 8 disks 
> gives 8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.
> 
> These are rough numbers, but what you’re looking for are failure cases where 
> it’s 1x a single disk for read or write.  That tells you there’s a bottleneck 
> or serialization condition, such that you aren’t getting the parallel I/O you 
> should be expecting.

And?

> > > Why would you expect that a modern 8-core Intel CPU would impede I/O
> > 
> > It doesn't matter what I expect.
> 
> It *does* matter if you know what the hardware’s capable of.

I can expect hardware to do something as much as I want; it will always only
do whatever it does regardless.

> TLS is a much harder problem than XOR checksumming for traditional RAID, yet 
> it imposes [approximately zero][1] performance penalty on modern server 
> hardware, so if your CPU can fill a 10GE pipe with TLS, then it should have 
> no problem dealing with the simpler calculations needed by the ~2 Gbit/sec 
> flat-out max data rate of a typical RAID-grade 4 TB spinning HDD.
> 
> Even with 8 in parallel in the best case where they’re all reading linearly, 
> you’re still within a small multiple of the Ethernet case, so we should still 
> expect the software RAID stack not to become CPU-bound.
> 
> And realize that HDDs don’t fall into this max data rate case often outside 
> of benchmarking.  Once you start throwing ~5 ms seek times into the mix, the 
> CPU’s job becomes even easier.
> 
> [1]: https://stackoverflow.com/a/548042/142454

This may all be nice and good in theory.  In practice, I'm seeing up to 30% CPU
during an mdraid resync for a single 2-disk array.  How much performance impact
does that indicate for "normal" operations?
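
The md resync rate is throttled by sysctls, so the CPU cost seen during a
rebuild is tunable and doesn't directly indicate the cost of normal operation.
A minimal sketch (the sysctl names are the standard md ones; the value is just
an example):

  # current throttling limits, in KB/s per device
  sysctl dev.raid.speed_limit_min dev.raid.speed_limit_max

  # example: cap the resync rate while production load is running
  sysctl -w dev.raid.speed_limit_max=50000

  # watch resync progress and the effective speed
  cat /proc/mdstat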

> > > > And where
> > > > do you get cost-efficient 

Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread Valeri Galtsev


> On Nov 14, 2020, at 8:45 PM, hw  wrote:
> 
> On Sat, 2020-11-14 at 18:55 +0100, Simon Matter wrote:
>>> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
 On Nov 11, 2020, at 2:01 PM, hw  wrote:
> I have yet to see software RAID that doesn't kill the performance.
 
 When was the last time you tried it?
>>> 
>>> I'm currently using it, and the performance sucks.  Perhaps it's
>>> not the software itself or the CPU but the on-board controllers
>>> or other components being incapable of handling multiple disks in a
>>> software raid.  That's something I can't verify.
>>> 
 Why would you expect that a modern 8-core Intel CPU would impede I/O in
 any measurable way as compared to the outdated single-core 32-bit RISC
 CPU typically found on hardware RAID cards?  These are the same CPUs,
 mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
 links, a much tougher task than mediating spinning disk I/O.
>>> 
>>> It doesn't matter what I expect.
>>> 
> And where
> do you get cost-efficient cards that can do JBOD?
 
 $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
>>> 
>>> That says it's for HP.  So will you still get firmware updates once
>>> the warranty is expired?  Does it exclusively work with HP hardware?
>>> 
>>> And are these good?
>>> 
 Search for “LSI JBOD” for tons more options.  You may have to fiddle
 with the firmware to get it to stop trying to do clever RAID stuff,
 which lets you do smart RAID stuff like ZFS instead.
 
> What has HP been thinking?
 
 That the hardware vs software RAID argument is over in 2020.
 
>>> 
>>> Do you have a reference for that, like a final statement from HP?
>>> Did they stop developing RAID controllers, or do they ship their
>>> servers now without them and tell customers to use btrfs or mdraid?
>> 
>> HPE and the other large vendors won't tell you directly because they love
>> to sell you their outdated SAS/SATA Raid stuff. They were quite slow to
>> introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also
>> clear to them that NVMe is the future and that it's used with software
>> redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's
>> 4AA4-7186ENW.pdf file which also mentions it.
>> 
>> In fact local storage was one reason why we turned away from HPE and Dell
>> after many years because we just didn't want to invest in outdated
>> technology.
>> 
> 
> I'm currently running an mdadm raid-check on two RAID-1 arrays, and the
> server shows 2 processes with 24--27% CPU each and two others at around 5%.
> And you want to tell me that the CPU load is almost non-existent.

The hardware vs software RAID discussion is like a clash of two different 
religions. I am, BTW, on your religious side: hardware RAID. For a different reason: 
in hardware RAID it is a small piece of code (hence well debugged) plus dedicated 
hardware. Thus, things like a kernel panic (of the main system, the one that 
would be running software RAID) do not affect hardware RAID function, whereas 
the software RAID function will not be fulfilled in case of a kernel panic. And whereas 
an unclean filesystem can be dealt with, an “unclean” RAID pretty much can not.

But again, it is akin to religion, and after both sides shoot out all their 
ammunition, everyone returns still being on the same side one was on before 
the “discussion”.

So, I would just suggest… Hm, never mind; everyone, do what you feel is right ;-)

Valeri

> I've also constantly seen much better performance with hardware RAID than
> with software RAID over the years and ZFS having the worst performance of
> anything, even with SSD caches.
> 
> It speaks for itself, and, like I said, I have yet to see a software RAID
> that doesn't bring the performance down.  Show me one that doesn't.
> 
> Are there any hardware RAID controllers designed for NVMe storage you could
> use to compare software RAID with?  Are there any ZFS or btrfs hardware
> controllers you could compare with?
> 
> 


Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread John Pierce
On Sat, Nov 14, 2020 at 6:32 PM hw  wrote:

>
> I don't like the idea of flashing one.  I don't have the firmware and I
> don't
> know if they can be flashed with Linux.  Aren't there any good --- and cost
> efficient --- ones that do JBOD by default, preferably including 16-port
> cards
> with mini-SAS connectors?
>
>
The firmware is freely downloadable from LSI/Broadcom, and Linux has
sas2flash or sas3flash (for the 2x08/3008 chips respectively) command line
tools to do the flashing.

It's pretty much standard procedure for the ZFS crowd to flash those...  The
2x08 cards often come with "IR" firmware that does limited RAID, and it's
preferable to flash them with the IT firmware ("IT" stands for
Initiator-Target), which puts them in plain HBA mode.
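
As a rough sketch of that procedure (the controller index and the firmware
image names here are examples, not taken from this thread; get the images for
your exact card from Broadcom):

  # list the controllers the tool can see
  sas2flash -listall

  # erase the existing IR flash on controller 0 (do not power off half-way)
  sas2flash -o -c 0 -e 6

  # write the IT firmware and, optionally, the boot ROM
  sas2flash -o -c 0 -f 2118it.bin -b mptsas2.rom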


-- 
-john r pierce
  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread hw
On Sat, 2020-11-14 at 18:55 +0100, Simon Matter wrote:
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > > On Nov 11, 2020, at 2:01 PM, hw  wrote:
> > > > I have yet to see software RAID that doesn't kill the performance.
> > > 
> > > When was the last time you tried it?
> > 
> > I'm currently using it, and the performance sucks.  Perhaps it's
> > not the software itself or the CPU but the on-board controllers
> > or other components being incapable of handling multiple disks in a
> > software raid.  That's something I can't verify.
> > 
> > > Why would you expect that a modern 8-core Intel CPU would impede I/O in
> > > any measurable way as compared to the outdated single-core 32-bit RISC
> > > CPU typically found on hardware RAID cards?  These are the same CPUs,
> > > mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
> > > links, a much tougher task than mediating spinning disk I/O.
> > 
> > It doesn't matter what I expect.
> > 
> > > > And where
> > > > do you get cost-efficient cards that can do JBOD?
> > > 
> > > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> > 
> > That says it's for HP.  So will you still get firmware updates once
> > the warranty is expired?  Does it exclusively work with HP hardware?
> > 
> > And are these good?
> > 
> > > Search for “LSI JBOD” for tons more options.  You may have to fiddle
> > > with the firmware to get it to stop trying to do clever RAID stuff,
> > > which lets you do smart RAID stuff like ZFS instead.
> > > 
> > > > What has HP been thinking?
> > > 
> > > That the hardware vs software RAID argument is over in 2020.
> > > 
> > 
> > Do you have a reference for that, like a final statement from HP?
> > Did they stop developing RAID controllers, or do they ship their
> > servers now without them and tell customers to use btrfs or mdraid?
> 
> HPE and the other large vendors won't tell you directly because they love
> to sell you their outdated SAS/SATA Raid stuff. They were quite slow to
> introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also
> clear to them that NVMe is the future and that it's used with software
> redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's
> 4AA4-7186ENW.pdf file which also mentions it.
> 
> In fact local storage was one reason why we turned away from HPE and Dell
> after many years because we just didn't want to invest in outdated
> technology.
> 

I'm currently running an mdadm raid-check on two RAID-1 arrays, and the
server shows 2 processes with 24--27% CPU each and two others at around 5%.
And you want to tell me that the CPU load is almost non-existent.

I've also constantly seen much better performance with hardware RAID than
with software RAID over the years and ZFS having the worst performance of
anything, even with SSD caches.

It speaks for itself, and, like I said, I have yet to see a software RAID
that doesn't bring the performance down.  Show me one that doesn't.

Are there any hardware RAID controllers designed for NVMe storage you could
use to compare software RAID with?  Are there any ZFS or btrfs hardware
controllers you could compare with?




Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread hw
On Sat, 2020-11-14 at 07:11 -0800, John Pierce wrote:
> On Sat, Nov 14, 2020, 4:57 AM hw  wrote:
> 
> > On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> > 
> > > > And where
> > > > do you get cost-efficient cards that can do JBOD?
> > > 
> > > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> > 
> > That says it's for HP.  So will you still get firmware updates once
> > the warranty is expired?  Does it exclusively work with HP hardware?
> > 
> > And are these good?
> > 
> 
> That specific card is a bad choice, it's the very obsolete SAS1068E chip,
> which was SAS 1.0, with a max of 2 TB per disk.
> 

Thanks!  That's probably why it isn't so expensive.

> Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.
> 
> Any oem card with these chips can be flashed with generic LSI/Broadcom IT
> firmware.
> 

I don't like the idea of flashing one.  I don't have the firmware and I don't
know if they can be flashed with Linux.  Aren't there any good --- and cost
efficient --- ones that do JBOD by default, preferably including 16-port cards
with mini-SAS connectors?




Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread Warren Young
On Nov 14, 2020, at 5:56 AM, hw  wrote:
> 
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>> On Nov 11, 2020, at 2:01 PM, hw  wrote:
>>> I have yet to see software RAID that doesn't kill the performance.
>> 
>> When was the last time you tried it?
> 
> I'm currently using it, and the performance sucks.

Be specific.  Give chip part numbers, drivers used, whether this is on-board 
software RAID or something entirely different like LVM or MD RAID, etc.  For 
that matter, I don’t even see that you’ve identified whether this is CentOS 6, 
7 or 8.  (I hope it isn't older!)

> Perhaps it's
> not the software itself or the CPU but the on-board controllers
> or other components being incapable of handling multiple disks in a
> software raid.  That's something I can't verify.

Sure you can.  Benchmark RAID-0 vs RAID-1 in 2, 4, and 8 disk arrays.

In a 2-disk array, a proper software RAID system should give 2x a single disk’s 
performance for both read and write in RAID-0, but single-disk write 
performance for RAID-1.

Such values should scale reasonably as you add disks: RAID-0 over 8 disks gives 
8x performance, RAID-1 over 8 disks gives 4x write but 8x read, etc.

These are rough numbers, but what you’re looking for are failure cases where 
it’s 1x a single disk for read or write.  That tells you there’s a bottleneck 
or serialization condition, such that you aren’t getting the parallel I/O you 
should be expecting.
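
A quick way to get such numbers without a lab is a sequential-read pass with
fio against the array and against a single member disk; a sketch, with /dev/md0
and /dev/sda as placeholder device names:

  # sequential read throughput of the array (non-destructive)
  fio --name=array --filename=/dev/md0 --rw=read --bs=1M --direct=1 \
      --ioengine=libaio --iodepth=16 --runtime=60 --time_based --group_reporting

  # the same against one member disk, for the baseline
  fio --name=single --filename=/dev/sda --rw=read --bs=1M --direct=1 \
      --ioengine=libaio --iodepth=16 --runtime=60 --time_based --group_reporting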

>> Why would you expect that a modern 8-core Intel CPU would impede I/O
> 
> It doesn't matter what I expect.

It *does* matter if you know what the hardware’s capable of.

TLS is a much harder problem than XOR checksumming for traditional RAID, yet it 
imposes [approximately zero][1] performance penalty on modern server hardware, 
so if your CPU can fill a 10GE pipe with TLS, then it should have no problem 
dealing with the simpler calculations needed by the ~2 Gbit/sec flat-out max 
data rate of a typical RAID-grade 4 TB spinning HDD.

Even with 8 in parallel in the best case where they’re all reading linearly, 
you’re still within a small multiple of the Ethernet case, so we should still 
expect the software RAID stack not to become CPU-bound.

And realize that HDDs don’t fall into this max data rate case often outside of 
benchmarking.  Once you start throwing ~5 ms seek times into the mix, the CPU’s 
job becomes even easier.

[1]: https://stackoverflow.com/a/548042/142454
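
The kernel actually measures these checksum routines at boot, so the claim is
easy to check on any given box; a sketch (output format varies by kernel
version):

  # RAID-5 XOR and RAID-6 syndrome throughput as benchmarked by the kernel
  dmesg | grep -E 'xor: |raid6: '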

> 
>>> And where
>>> do you get cost-efficient cards that can do JBOD?
>> 
>> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
> 
> And are these good?

You asked for “cost-efficient,” which I took to be a euphemism for “cheapest 
thing that could possibly work.”

If you’re willing to spend money, then I fully expect you can find JBOD cards 
you’ll be happy with.

Personally, I get servers with enough SFF-8087 SAS connectors on them to 
address all the disks in the system.  I haven’t bothered with add-on SATA cards 
in years.

I use ZFS, so absolute flat-out benchmark speed isn't my primary consideration.
Data durability and data set features matter to me far more.

>>> What has HP been thinking?
>> 
>> That the hardware vs software RAID argument is over in 2020.
> 
> Do you have a reference for that, like a final statement from HP?

Since I’m not posting from an hpe.com email address, I think it’s pretty 
obvious that that is my opinion, not an HP corporate statement.

I base it on observing the Linux RAID market since the mid-90s.  The massive 
consolidation for hardware RAID is a big part of it.  That’s what happens when 
a market becomes “mature,” which is often the step just prior to “moribund.”

> Did they stop developing RAID controllers, or do they ship their
> servers now without them

Were you under the impression that HP was trying to provide you the best 
possible technology for all possible use cases, rather than make money by 
maximizing the ratio of cash in vs cash out?

Just because they’re serving it up on a plate doesn’t mean you hafta pick up a 
fork.


Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread Simon Matter
> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>> On Nov 11, 2020, at 2:01 PM, hw  wrote:
>> > I have yet to see software RAID that doesn't kill the performance.
>>
>> When was the last time you tried it?
>
> I'm currently using it, and the performance sucks.  Perhaps it's
> not the software itself or the CPU but the on-board controllers
> or other components being incapable of handling multiple disks in a
> software raid.  That's something I can't verify.
>
>> Why would you expect that a modern 8-core Intel CPU would impede I/O in
>> any measurable way as compared to the outdated single-core 32-bit RISC
>> CPU typically found on hardware RAID cards?  These are the same CPUs,
>> mind, that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
>> links, a much tougher task than mediating spinning disk I/O.
>
> It doesn't matter what I expect.
>
>> > And where
>> > do you get cost-efficient cards that can do JBOD?
>>
>> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
>
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
>
> And are these good?
>
>> Search for “LSI JBOD” for tons more options.  You may have to fiddle
>> with the firmware to get it to stop trying to do clever RAID stuff,
>> which lets you do smart RAID stuff like ZFS instead.
>>
>> > What has HP been thinking?
>>
>> That the hardware vs software RAID argument is over in 2020.
>>
>
> Do you have a reference for that, like a final statement from HP?
> Did they stop developing RAID controllers, or do they ship their
> servers now without them and tell customers to use btrfs or mdraid?

HPE and the other large vendors won't tell you directly because they love
to sell you their outdated SAS/SATA Raid stuff. They were quite slow to
introduce NVMe storage, be it as PCIe cards or U.2 format, but it's also
clear to them that NVMe is the future and that it's used with software
redundancy provided by MDraid, ZFS, Btrfs etc. Just search for HPE's
4AA4-7186ENW.pdf file which also mentions it.

In fact local storage was one reason why we turned away from HPE and Dell
after many years because we just didn't want to invest in outdated
technology.

Regards,
Simon



Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread John Pierce
On Sat, Nov 14, 2020, 4:57 AM hw  wrote:

> On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
>
> > > And where
> > > do you get cost-efficient cards that can do JBOD?
> >
> > $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
>
> That says it's for HP.  So will you still get firmware updates once
> the warranty is expired?  Does it exclusively work with HP hardware?
>
> And are these good?
>

That specific card is a bad choice, it's the very obsolete SAS1068E chip,
which was SAS 1.0, with a max of 2 TB per disk.

Cards based on the SAS 2008, 2308, and 3008 chips are a much better choice.

Any oem card with these chips can be flashed with generic LSI/Broadcom IT
firmware.


Re: [CentOS] ssacli start rebuild?

2020-11-14 Thread hw
On Wed, 2020-11-11 at 16:38 -0700, Warren Young wrote:
> On Nov 11, 2020, at 2:01 PM, hw  wrote:
> > I have yet to see software RAID that doesn't kill the performance.
> 
> When was the last time you tried it?

I'm currently using it, and the performance sucks.  Perhaps it's
not the software itself or the CPU but the on-board controllers
or other components being incapable of handling multiple disks in a
software raid.  That's something I can't verify.

> Why would you expect that a modern 8-core Intel CPU would impede I/O in any 
> measurable way as compared to the outdated single-core 32-bit RISC CPU
> typically found on hardware RAID cards?  These are the same CPUs, mind, that 
> regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much 
> tougher task than mediating spinning disk I/O.

It doesn't matter what I expect.

> > And where
> > do you get cost-efficient cards that can do JBOD?
> 
> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1

That says it's for HP.  So will you still get firmware updates once
the warranty is expired?  Does it exclusively work with HP hardware?

And are these good?

> Search for “LSI JBOD” for tons more options.  You may have to fiddle with the 
> firmware to get it to stop trying to do clever RAID stuff, which lets you do 
> smart RAID stuff like ZFS instead.
> 
> > What has HP been thinking?
> 
> That the hardware vs software RAID argument is over in 2020.
> 

Do you have a reference for that, like a final statement from HP?
Did they stop developing RAID controllers, or do they ship their
servers now without them and tell customers to use btrfs or mdraid?




Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Simon Matter
>
>
>> On Nov 11, 2020, at 6:00 PM, John Pierce  wrote:
>>
>> On Wed, Nov 11, 2020 at 3:38 PM Warren Young  wrote:
>>
>>> On Nov 11, 2020, at 2:01 PM, hw  wrote:

 I have yet to see software RAID that doesn't kill the performance.
>>>
>>> When was the last time you tried it?
>>>
>>> Why would you expect that a modern 8-core Intel CPU would impede I/O in
>>> any measurable way as compared to the outdated single-core 32-bit RISC
>>> CPU
>>> typically found on hardware RAID cards?  These are the same CPUs, mind,
>>> that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
>>> links, a
>>> much tougher task than mediating spinning disk I/O.
>>
>>
>> the only 'advantage' hardware raid has is write-back caching.
>
> Just for my information: how do you map failed software RAID drive to
> physical port of, say, SAS-attached enclosure. I’d love to hot replace
> failed drives in software RAIDs, have over hundred physical drives
> attached to a machine. Do not criticize, this is box installed by someone
> else, I have “inherited” it. To replace I have to query drive serial
> number, power off the machine and pulling drives one at a time read the
> labels...

There are different methods depending on how the disks are attached. In
some cases you can use a tool to show the corresponding disk or slot.
Otherwise, once you have hot removed the drive from the RAID, you can
either dd to the broken drive or make some traffic on the still working
RAID, and you'll see the disk immediately when looking at the disks' busy
LEDs.
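
A minimal sketch of that approach (device names are placeholders); mapping the
serial numbers first avoids pulling the wrong disk, and reading the drive is
enough to light its activity LED:

  # map kernel names to serial numbers / WWNs
  ls -l /dev/disk/by-id/ | grep -v part

  # read the suspect drive (already removed from the md array) to make it blink
  dd if=/dev/sdX of=/dev/null bs=1M status=progress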

I've used Linux Software RAID during the last two decades and it has
always worked nicely while I started to hate hardware RAID more and more.
Now with U.2 NVMe SSD drives, at least when we started using them, there
were no RAID controllers available at all. And performance with Linux
Software RAID1 on AMD EPYC boxes is amazing :-)
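
For reference, such a mirror is a one-liner with mdadm; a sketch with example
device names:

  mdadm --create /dev/md0 --level=1 --raid-devices=2 \
      /dev/nvme0n1 /dev/nvme1n1
  mkfs.xfs /dev/md0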

Regards,
Simon



Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Valeri Galtsev


> On Nov 11, 2020, at 8:04 PM, John Pierce  wrote:
> 
> in large raids, I label my disks with the last 4 or 6 digits of the drive
> serial number (or for SAS disks, the WWN). This is visible via smartctl,
> and I record it with the zpool documentation I keep on each server
> (typically a text file on a cloud drive).  

I get info about software RAID failures from a cron job executing raid-check 
(which comes with the mdadm rpm). I can get the S/N of a failed drive (they are 
not dead-dead, one can still query them) using smartctl, but I am too lazy to 
have all the serial numbers of the drives printed and affixed to the fronts of 
the drive trays… but so far I see no other way ;-(

Valeri 

>   zpools don't actually care
> WHAT slot a given pool member is in, you can shut the box down, shuffle all
> the disks, boot back up and find them all and put them back in the pool.
> 
> the physical error reports that precede a drive failure should list the
> drive identification beyond just the /dev/sdX kind of thing, which is
> subject to change if you add more SAS devices.
> 
> I once researched what it would take to implement the drive failure lights
> on the typical brand name server/storage chassis, there's a command for
> manipulating SES devices such as those lights, the catch is figuring out
> the mapping between the drives and lights, it's not always evident, so would
> require trial and error.
> 
> 
> 
> On Wed, Nov 11, 2020 at 5:37 PM Valeri Galtsev 
> wrote:
> 
>> 
>> 
>>> On Nov 11, 2020, at 6:00 PM, John Pierce  wrote:
>>> 
>>> On Wed, Nov 11, 2020 at 3:38 PM Warren Young  wrote:
>>> 
 On Nov 11, 2020, at 2:01 PM, hw  wrote:
> 
> I have yet to see software RAID that doesn't kill the performance.
 
 When was the last time you tried it?
 
 Why would you expect that a modern 8-core Intel CPU would impede I/O in
 any measurable way as compared to the outdated single-core 32-bit RISC
>> CPU
 typically found on hardware RAID cards?  These are the same CPUs, mind,
 that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
>> links, a
 much tougher task than mediating spinning disk I/O.
>>> 
>>> 
>>> the only 'advantage' hardware raid has is write-back caching.
>> 
>> Just for my information: how do you map failed software RAID drive to
>> physical port of, say, SAS-attached enclosure. I’d love to hot replace
>> failed drives in software RAIDs, have over hundred physical drives attached
>> to a machine. Do not criticize, this is box installed by someone else, I
>> have “inherited” it. To replace I have to query drive serial number, power
>> off the machine and pulling drives one at a time read the labels...
>> 
>> With hardware RAID that is not an issue, I always know which physical port
>> failed drive is in. And I can tell controller to “indicate” specific drive
>> (it blinks respective port LED). Always hot replacing drives in hardware
>> RAIDs, no one ever knows it has been done. And I’d love to deal same way
>> with drives in software RAIDs.
>> 
>> Thanks in advance for any advice. And my apologies for “stealing the thread”
>> 
>> Valeri
>> 
>>> with ZFS you can get much the same performance boost out of a small fast
>>> SSD used as a ZIL / SLOG.
>>> 
>>> --
>>> -john r pierce
>>> recycling used bits in santa cruz
>> 
> 
> 
> -- 
> -john r pierce
>  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Valeri Galtsev


> On Nov 11, 2020, at 8:07 PM, John Pierce  wrote:
> 
> On Wed, Nov 11, 2020 at 5:47 PM Valeri Galtsev 
> wrote:
> 
>> I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a
>> MegaRAID hardware RAID adapter. As far as I recollect it is the same electronics
>> board. I reflashed a couple of HBAs to make them MegaRAID boards.
>> 
> 
> you can reflash SOME megaraid cards to put them in IT 'hba' mode, but not
> others.
> 
> 
>> 
>> One thing though bothers me about LSI: now that it was last bought by
>> Intel, its future fate worries me. Intel has already pushed 3ware, which it
>> acquired in the same package with LSI, into oblivion…
>> 
> 
> It's Avago (formerly Agilent, and before that HP) which bought LSI, 3Ware,
> and then Broadcom, and renamed itself Broadcom.
> 

I am apparently wrong, at least about LSI; it still belongs to Broadcom, thanks!

Long before Broadcom acquired LSI and 3ware, I was awfully displeased by their 
WiFi chip: the infamous BCM43xx. It is a 32-bit chip sitting on a 64-bit bus. No [sane] 
open source programmer will be happy to write a driver for that. For ages we were 
using the ndiswrapper…. As much as I disliked Broadcom for their wireless chipset, 
I loved them for their Ethernet one. And I recollect this was long before the 
acquisition of LSI and 3ware by Broadcom. Or am I wrong?

Valeri

> 
> -- 
> -john r pierce
>  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Warren Young
On Nov 11, 2020, at 7:04 PM, Warren Young  wrote:
> 
> zpool mount -d /dev/disk/by-partlabel

Oops, I’m mixing the zpool and zfs commands.  It’d be “zpool import”.

And you do this just once: afterward, the automatic on-boot import brings the 
drives back in using the names they had before, so when you’ve got some 
low-skill set of remote hands in front of the machine, and you’re looking at a 
failure indication in zpool status, you just say “Swap out the drive in the 
third cage, left side, four slots down.”


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread John Pierce
On Wed, Nov 11, 2020 at 5:47 PM Valeri Galtsev 
wrote:

> I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a
> MegaRAID hardware RAID adapter. As far as I recollect it is the same electronics
> board. I reflashed a couple of HBAs to make them MegaRAID boards.
>

you can reflash SOME megaraid cards to put them in IT 'hba' mode, but not
others.


>
> One thing though bothers me about LSI: now that it was last bought by
> Intel, its future fate worries me. Intel has already pushed 3ware, which it
> acquired in the same package with LSI, into oblivion…
>

It's Avago (formerly Agilent, and before that HP) which bought LSI, 3Ware,
and then Broadcom, and renamed itself Broadcom.


-- 
-john r pierce
  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread John Pierce
in large raids, I label my disks with the last 4 or 6 digits of the drive
serial number (or for SAS disks, the WWN).  This is visible via smartctl,
and I record it with the zpool documentation I keep on each server
(typically a text file on a cloud drive).  zpools don't actually care
WHAT slot a given pool member is in; you can shut the box down, shuffle all
the disks, boot back up and find them all and put them back in the pool.
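
A sketch of collecting those identifiers in one pass (assumes whole-disk
/dev/sd? nodes; SAS WWNs show up in the same smartctl -i output):

  for d in /dev/sd?; do
      printf '%s ' "$d"
      smartctl -i "$d" | awk -F': *' '/Serial Number|LU WWN/ {printf "%s ", $2}'
      echo
  done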

the physical error reports that precede a drive failure should list the
drive identification beyond just the /dev/sdX kind of thing, which is
subject to change if you add more SAS devices.

I once researched what it would take to implement the drive failure lights
on the typical brand name server/storage chassis.  There's a command for
manipulating SES devices such as those lights; the catch is figuring out
the mapping between the drives and lights, which is not always evident, so it
would require trial and error.
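
On chassis with a SES-capable backplane the locate/ident LED can usually be
driven from userspace without that trial and error; a sketch using ledctl from
the ledmon package (sg_ses is the lower-level alternative), with a placeholder
device name:

  # blink the locate LED of the slot holding /dev/sdX
  ledctl locate=/dev/sdX

  # ...swap the drive, then turn the light off again
  ledctl locate_off=/dev/sdX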



On Wed, Nov 11, 2020 at 5:37 PM Valeri Galtsev 
wrote:

>
>
> > On Nov 11, 2020, at 6:00 PM, John Pierce  wrote:
> >
> > On Wed, Nov 11, 2020 at 3:38 PM Warren Young  wrote:
> >
> >> On Nov 11, 2020, at 2:01 PM, hw  wrote:
> >>>
> >>> I have yet to see software RAID that doesn't kill the performance.
> >>
> >> When was the last time you tried it?
> >>
> >> Why would you expect that a modern 8-core Intel CPU would impede I/O in
> >> any measurable way as compared to the outdated single-core 32-bit RISC
> CPU
> >> typically found on hardware RAID cards?  These are the same CPUs, mind,
> >> that regularly crunch through TLS 1.3 on line-rate fiber Ethernet
> links, a
> >> much tougher task than mediating spinning disk I/O.
> >
> >
> > the only 'advantage' hardware raid has is write-back caching.
>
> Just for my information: how do you map failed software RAID drive to
> physical port of, say, SAS-attached enclosure. I’d love to hot replace
> failed drives in software RAIDs, have over hundred physical drives attached
> to a machine. Do not criticize, this is box installed by someone else, I
> have “inherited” it. To replace I have to query drive serial number, power
> off the machine and pulling drives one at a time read the labels...
>
> With hardware RAID that is not an issue, I always know which physical port
> failed drive is in. And I can tell controller to “indicate” specific drive
> (it blinks respective port LED). Always hot replacing drives in hardware
> RAIDs, no one ever knows it has been done. And I’d love to deal same way
> with drives in software RAIDs.
>
> Thanks in advance for any advice. And my apologies for “stealing the thread”
>
> Valeri
>
> > with ZFS you can get much the same performance boost out of a small fast
> > SSD used as a ZIL / SLOG.
> >
> > --
> > -john r pierce
> >  recycling used bits in santa cruz
>


-- 
-john r pierce
  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Warren Young
On Nov 11, 2020, at 6:37 PM, Valeri Galtsev  wrote:
> 
> how do you map failed software RAID drive to physical port of, say, 
> SAS-attached enclosure.

With ZFS, you set a partition label on the whole-drive partition pool member, 
then mount the pool with something like “zpool mount -d 
/dev/disk/by-partlabel”, which then shows the logical disk names in commands 
like “zpool status” rather than opaque “/dev/sdb3” type things.

It is then up to you to assign sensible drive names like “cage-3-left-4” for 
the 4th drive down on the left side of the third drive cage.  Or, maybe your 
organization uses asset tags, so you could label the disk the same way, 
“sn123456”, which you find by looking at the front of each slot.
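
A compressed sketch of that workflow (pool name, device and partition number
are made up; note the correction elsewhere in this thread that the command is
"zpool import", not "zpool mount"):

  # name the pool member partition after its physical location
  sgdisk --change-name=1:cage-3-left-4 /dev/sdb

  # re-import the pool so it picks up the by-partlabel names
  zpool export tank
  zpool import -d /dev/disk/by-partlabel tank

  # zpool status now lists cage-3-left-4 instead of sdb1
  zpool status tank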


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Valeri Galtsev


> On Nov 11, 2020, at 5:38 PM, Warren Young  wrote:
> 
> On Nov 11, 2020, at 2:01 PM, hw  wrote:
>> 
>> I have yet to see software RAID that doesn't kill the performance.
> 
> When was the last time you tried it?
> 
> Why would you expect that a modern 8-core Intel CPU would impede I/O in any 
> measurable way as compared to the outdated single-core 32-bit RISC CPU
> typically found on hardware RAID cards?  These are the same CPUs, mind, that 
> regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much 
> tougher task than mediating spinning disk I/O.
> 
>> And where
>> do you get cost-efficient cards that can do JBOD?
> 
> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> Search for “LSI JBOD” for tons more options.  You may have to fiddle with the 
> firmware to get it to stop trying to do clever RAID stuff, which lets you do 
> smart RAID stuff like ZFS instead.
> 
>> What has HP been thinking?
> 
> That the hardware vs software RAID argument is over in 2020.

I’d rather have distributed redundant storage on multiple machines… but I still 
have [mostly] hardware RAIDs ;-)

Valeri



Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Valeri Galtsev


> On Nov 11, 2020, at 5:38 PM, Warren Young  wrote:
> 
> On Nov 11, 2020, at 2:01 PM, hw  wrote:
>> 
>> I have yet to see software RAID that doesn't kill the performance.
> 
> When was the last time you tried it?
> 
> Why would you expect that a modern 8-core Intel CPU would impede I/O in any 
> measurable way as compared to the outdated single-core 32-bit RISC CPU
> typically found on hardware RAID cards?  These are the same CPUs, mind, that 
> regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much 
> tougher task than mediating spinning disk I/O.
> 
>> And where
>> do you get cost-efficient cards that can do JBOD?
> 
> $69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1
> 
> Search for “LSI JBOD” for tons more options.  You may have to fiddle with the 
> firmware to get it to stop trying to do clever RAID stuff, which lets you do 
> smart RAID stuff like ZFS instead.

I’m sure you can reflash an LSI card to make it a SATA or SAS HBA, or a MegaRAID 
hardware RAID adapter. As far as I recollect it is the same electronics board.
I reflashed a couple of HBAs to make them MegaRAID boards.

One thing though bothers me about LSI: now that it was last bought by Intel, 
its future fate worries me. Intel has already pushed 3ware, which it acquired in 
the same package with LSI, into oblivion…

Valeri

>> What has HP been thinking?
> 
> That the hardware vs software RAID argument is over in 2020.
> 


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Valeri Galtsev


> On Nov 11, 2020, at 6:00 PM, John Pierce  wrote:
> 
> On Wed, Nov 11, 2020 at 3:38 PM Warren Young  wrote:
> 
>> On Nov 11, 2020, at 2:01 PM, hw  wrote:
>>> 
>>> I have yet to see software RAID that doesn't kill the performance.
>> 
>> When was the last time you tried it?
>> 
>> Why would you expect that a modern 8-core Intel CPU would impede I/O in
>> any measurable way as compared to the outdated single-core 32-bit RISC CPU
>> typically found on hardware RAID cards?  These are the same CPUs, mind,
>> that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a
>> much tougher task than mediating spinning disk I/O.
> 
> 
> the only 'advantage' hardware raid has is write-back caching.

Just for my information: how do you map a failed software RAID drive to the 
physical port of, say, a SAS-attached enclosure? I’d love to hot replace failed 
drives in software RAIDs; I have over a hundred physical drives attached to a 
machine. Do not criticize, this is a box installed by someone else, I have 
“inherited” it. To replace a drive I have to query the drive serial number, 
power off the machine and, pulling drives one at a time, read the labels...

With hardware RAID that is not an issue, I always know which physical port the 
failed drive is in. And I can tell the controller to “indicate” a specific drive (it 
blinks the respective port LED). I am always hot replacing drives in hardware RAIDs, 
and no one ever knows it has been done. And I’d love to deal the same way with 
drives in software RAIDs.

Thanks in advance for any advice. And my apologies for “stealing the thread”

Valeri

> with ZFS you can get much the same performance boost out of a small fast
> SSD used as a ZIL / SLOG.
> 
> -- 
> -john r pierce
>  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread John Pierce
On Wed, Nov 11, 2020 at 3:38 PM Warren Young  wrote:

> On Nov 11, 2020, at 2:01 PM, hw  wrote:
> >
> > I have yet to see software RAID that doesn't kill the performance.
>
> When was the last time you tried it?
>
> Why would you expect that a modern 8-core Intel CPU would impede I/O in
> any measurable way as compared to the outdated single-core 32-bit RISC CPU
> typically found on hardware RAID cards?  These are the same CPUs, mind,
> that regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a
> much tougher task than mediating spinning disk I/O.


the only 'advantage' hardware raid has is write-back caching.

with ZFS you can get much the same performance boost out of a small fast
SSD used as a ZIL / SLOG.
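
Adding such a device to an existing pool is a single command; a sketch where
the pool name and device path are examples (a mirrored SLOG is usually
preferred so a log-device failure can't bite):

  # attach a fast SSD as a separate log (SLOG) device
  zpool add tank log /dev/disk/by-id/nvme-EXAMPLE-part1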

-- 
-john r pierce
  recycling used bits in santa cruz


Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Warren Young
On Nov 11, 2020, at 2:01 PM, hw  wrote:
> 
> I have yet to see software RAID that doesn't kill the performance.

When was the last time you tried it?

Why would you expect that a modern 8-core Intel CPU would impede I/O in any 
measurable way as compared to the outdated single-core 32-bit RISC CPU 
typically found on hardware RAID cards?  These are the same CPUs, mind, that 
regularly crunch through TLS 1.3 on line-rate fiber Ethernet links, a much 
tougher task than mediating spinning disk I/O.

> And where
> do you get cost-efficient cards that can do JBOD?

$69, 8 SATA/SAS ports: https://www.newegg.com/p/0ZK-08UH-0GWZ1

Search for “LSI JBOD” for tons more options.  You may have to fiddle with the 
firmware to get it to stop trying to do clever RAID stuff, which lets you do 
smart RAID stuff like ZFS instead.

> What has HP been thinking?

That the hardware vs software RAID argument is over in 2020.



Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread hw
On Wed, 2020-11-11 at 11:34 +0100, Thomas Bendler wrote:
> Am Mi., 11. Nov. 2020 um 07:28 Uhr schrieb hw :
> 
> > [...]
> > With this experience, these controllers are now deprecated.  RAID
> > controllers
> > that can't rebuild an array after a disk has failed and has been replaced
> > are virtually useless.
> > [...]
> 
> HW RAID is often delivered with quite limited functionality. Because of
> this I switched in most cases to software RAID meanwhile and configured the
> HW RAID as JBOD. The funny thing is, when you use the discs previously used
> in the HW RAID in such a scenario, the software RAID detects them as RAID
> disks. It looks like a significant amount of HW RAID controllers use the
> Linux software RAID code in their firmware.
> 

I have yet to see software RAID that doesn't kill the performance.  And where
do you get cost-efficient cards that can do JBOD?  I don't have any.

It turned out that the controller does not rebuild the array even with a disk
that is the same model and capacity as the others.  What has HP been thinking?




Re: [CentOS] ssacli start rebuild?

2020-11-11 Thread Thomas Bendler
Am Mi., 11. Nov. 2020 um 07:28 Uhr schrieb hw :

> [...]
> With this experience, these controllers are now deprecated.  RAID
> controllers
> that can't rebuild an array after a disk has failed and has been replaced
> are virtually useless.
> [...]


HW RAID is often delivered with quite limited functionality. Because of
this I have meanwhile switched to software RAID in most cases and configured
the HW RAID as JBOD. The funny thing is, when you use the discs previously
used in the HW RAID in such a scenario, the software RAID detects them as
RAID disks. It looks like a significant number of HW RAID controllers use the
Linux software RAID code in their firmware.
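
That detection is visible with mdadm's examine mode, which understands external
metadata formats (DDF, IMSM) as well as its native superblocks; a sketch with a
placeholder device:

  # show whatever RAID metadata is present on a disk pulled from a controller
  mdadm --examine /dev/sdX
  mdadm --examine --scan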

Kind regards Thomas
-- 
Linux ... enjoy the ride!


Re: [CentOS] ssacli start rebuild?

2020-11-10 Thread hw
On Mon, 2020-11-09 at 16:30 +0100, Thomas Bendler wrote:
> Am Fr., 6. Nov. 2020 um 20:38 Uhr schrieb hw :
> 
> > [...]
> > Some search results indicate that it's possible that other disks in the
> > array have read errors and might prevent rebuilding for RAID 5.  I don't
> > know if there are read errors, and if it's read errors, I think it would
> > mean that these errors would have to affect just the disk which is
> > mirroring the disk that failed, this being a RAID 1+0.  But if the RAID
> > is striped across all the disks, that could be any or all of them.
> > 
> > The array is still in production and still works, so it should just
> > rebuild.
> > Now the plan is to use another 8TB disk once it arrives, make a new RAID 1
> > with the two new disks and copy the data over.  The remaining 4TB disks can
> > then be used to make a new array.
> > 
> > Learn from this that it can be a bad idea to use a RAID 0 for backups and
> > that
> > at least one generation of backups must be on redundant storage ...
> > 
> 
> Just checked on one of my HP boxes, you can indeed not figure out if one of
> the discs has read errors. Do you have the option to reboot the box and
> check on the controller directly?
> 

Thanks!  The controller (its BIOS) doesn't show up during boot, so I can't
check there for errors.

The controller is extremely finicky:  The plan to make a RAID 1 from the two
new drives has failed because the array with the failed drive is unusable
when the failed drive is missing entirely.

In the process of moving the 8TB drives back and forth, it turned out that
when an array that was made from them is missing one drive, that array is
unusable --- and when the missing drive is put back in, the array
remains 'Ready for Rebuild' without the rebuild starting.  There is also no
way to delete an array that is missing a drive.
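
For anyone hitting the same state, the controller's own view of the physical
and logical drives is worth capturing first; a sketch assuming the controller
sits in slot 3 as elsewhere in this thread:

  ssacli ctrl slot=3 pd all show status
  ssacli ctrl slot=3 ld all show detail
  ssacli ctrl slot=3 show config detail | grep -i rebuild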

So the theory that the array isn't being rebuilt because other disks have
errors is likely wrong.  That means that whenenver a disk fails and is
being replaced, there is no way to rebuild the array (unless it would happen
automatically, which it doesn't).

With this experience, these controllers are now deprecated.  RAID controllers
that can't rebuild an array after a disk has failed and has been replaced
are virtually useless.




Re: [CentOS] ssacli start rebuild?

2020-11-09 Thread Thomas Bendler
Am Fr., 6. Nov. 2020 um 20:38 Uhr schrieb hw :

> [...]
> Some search results indicate that it's possible that other disks in the
> array have read errors and might prevent rebuilding for RAID 5.  I don't
> know if there are read errors, and if it's read errors, I think it would
> mean that these errors would have to affect just the disk which is
> mirroring the disk that failed, this being a RAID 1+0.  But if the RAID
> is striped across all the disks, that could be any or all of them.
>
> The array is still in production and still works, so it should just
> rebuild.
> Now the plan is to use another 8TB disk once it arrives, make a new RAID 1
> with the two new disks and copy the data over.  The remaining 4TB disks can
> then be used to make a new array.
>
> Learn from this that it can be a bad idea to use a RAID 0 for backups and
> that
> at least one generation of backups must be on redundant storage ...
>

Just checked on one of my HP boxes, you can indeed not figure out if one of
the discs has read errors. Do you have the option to reboot the box and
check on the controller directly?

Kind regards Thomas


Re: [CentOS] ssacli start rebuild?

2020-11-06 Thread hw
On Fri, 2020-11-06 at 12:08 +0100, Thomas Bendler wrote:
> Am Fr., 6. Nov. 2020 um 00:52 Uhr schrieb hw :
> 
> > [...]
> >   logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild)
> > [...]
> 
> Have you checked the rebuild priority:
> 
> ❯ ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"
>   ~
>Rebuild Priority: Medium
> ❯
> 
> Slot needs to be adjusted to your configuration.

Yes, I've set it to high:


ssacli ctrl slot=3 show config detail | grep Prior
   Rebuild Priority: High
   Expand Priority: Medium
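
For completeness, the priority itself is changed with a modify command; a
sketch matching the slot used above:

  ssacli ctrl slot=3 modify rebuildpriority=high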


Some search results indicate that it's possible that other disks in the
array have read errors and might prevent rebuilding for RAID 5.  I don't
know if there are read errors, and if it's read errors, I think it would
mean that these errors would have to affect just the disk which is
mirroring the disk that failed, this being a RAID 1+0.  But if the RAID
is striped across all the disks, that could be any or all of them.

The array is still in production and still works, so it should just rebuild.
Now the plan is to use another 8TB disk once it arrives, make a new RAID 1
with the two new disks and copy the data over.  The remaining 4TB disks can
then be used to make a new array.

Learn from this that it can be a bad idea to use a RAID 0 for backups and that
at least one generation of backups must be on redundant storage ...




Re: [CentOS] ssacli start rebuild?

2020-11-06 Thread Thomas Bendler
Am Fr., 6. Nov. 2020 um 00:52 Uhr schrieb hw :

> [...]
>   logicaldrive 1 (14.55 TB, RAID 1+0, Ready for Rebuild)
> [...]


Have you checked the rebuild priority:

❯ ssacli ctrl slot=0 show config detail | grep "Rebuild Priority"
  ~
   Rebuild Priority: Medium
❯

Slot needs to be adjusted to your configuration.

Kind regards Thomas
-- 
Linux ... enjoy the ride!