Re: Rebuild after disk fail

2020-01-18 Thread Russell Coker via luv-main
On Saturday, 18 January 2020 6:44:51 PM AEDT Craig Sanders via luv-main wrote:
> I personally would never use anything less than RAID-1 (or equivalent, such
> as a mirrored pair on zfs) for any storage. Which means, of course, that I'm
> used to paying double for my storage capacity - i can't just buy one, I
> have to buy a pair.  Not as a substitute for regular backups, but for
> convenience when only one drive of a pair has died.
> 
> Drives die, and the time & inconvenience of dealing with that (and the lost
> data) cost far more than the price of a second drive for raid-1/mirror.

I generally agree that RAID-1 is the way to go.  But if you can't do that then 
BTRFS "dup" and ZFS "copies=2" are good options, especially with SSD.

So far I have not seen an SSD entirely die; the worst I've seen is an SSD that 
stopped accepting writes (which causes an immediate kernel panic with a 
filesystem like BTRFS).  I've also seen SSDs return corrupt data while 
claiming it was good, but not in huge quantities.

I also haven't seen a total hard drive failure (like stiction) for many 
years.  The worst hard drive problem I've seen was about 12,000 read errors, 
which sounds like a lot but is a very small portion of a 3TB disk, and "dup" 
or "copies=2" should get most of your data back in that situation.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Russell Coker via luv-main
On Saturday, 18 January 2020 2:34:52 PM AEDT Andrew McGlashan via luv-main 
wrote:
> Hi,
> 
> On 18/1/20 2:14 pm, Andrew McGlashan via luv-main wrote:
> > btrfs -- I never, ever considered that to be real production ready
> > and I believe that even Red Hat has moved away from it somewhat
> > (not sure to what extent).
> 
> Some links, none of which are new, as this occurred some time ago now.
> 
> https://news.ycombinator.com/item?id=14907771

I think this link is the most useful.

BTRFS has worked quite solidly for me for years.  The main deficiency of BTRFS 
is that RAID-5 and RAID-6 are not usable as of the last reports I read.  For a 
home server RAID-1 is all you need (2 or 3 largish SATA disks in a RAID-1 
give plenty of storage).  The way BTRFS allows you to extend a RAID-1 
filesystem by adding a new disk of any size and rebalancing is really handy 
for home use.  The ZFS requirement that all disks be the same size and be 
upgraded in lock step is no problem for corporate use, but it is less 
convenient at home.
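
As a concrete sketch of that grow-and-rebalance step (device and mount point 
names are illustrative):

  btrfs device add /dev/sdd /srv      # add the new disk to the existing filesystem
  btrfs balance start /srv            # redistribute the existing data across all devices
  btrfs filesystem usage /srv         # check how space is now spread over the devices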

Generally I recommend using BTRFS for workstations and servers that have 2 
disks.  Use ZFS for big storage.

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Russell Coker via luv-main
On Sunday, 19 January 2020 3:47:00 PM AEDT Craig Sanders via luv-main wrote:
> NVME SSDs are **much** faster than SATA SSDs.  SATA 3 is 6 Gbps (600 MBps),
> so taking protocol overhead into account SATA drives max out at around 550
> MBps.
> 
> NVME drives run at **up to** PCI-e bus speeds - with 4 lanes, that's a
> little under 40 Gbps for PCIe v3 (approx 4000 MBps minus protocol
> overhead), double that for PCIe v4.  That's the theoretical maximum speed,
> anyway. In practice, most NVME SSDs run quite a bit slower than that, about
> 2 GBps - that's still almost 4 times as fast as a SATA SSD.
> 
> Some brands and models (e.g. those from samsung and crucial) run at around
> 3200 to 3500 MBps, but they cost more (e.g. a 1TB Samsung 970 EVO PLUS
> (MZ-V7S1T0BW) costs around $300, while the 1TB Kingston A2000
> (SA2000M8/1000G) costs around $160 but is only around 1800 MBps).

Until recently I had a work Thinkpad with NVMe.  It could sustain almost 
5GB/s until the CPU overheated and throttled it (there was an ACPI bug that 
caused it to falsely treat 60C as the thermal throttle point instead of 80C).

But when it came to random writes the speed was much lower, particularly for 
sustained writes.  Things like upgrading a Linux distribution in a VM image 
cause sustained write rates to go well below 1GB/s.

The NVMe interface is good, but having a CPU and storage that can sustain it 
is another issue.
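
If you want to measure that sort of sustained random-write load yourself, fio 
is the usual tool; a rough sketch (the target file, size and runtime are 
arbitrary choices, not a benchmark standard):

  fio --name=sustained-randwrite --filename=/mnt/test/fio.dat --size=16G \
      --rw=randwrite --bs=4k --ioengine=libaio --iodepth=32 --direct=1 \
      --runtime=300 --time_based --group_reporting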

-- 
My Main Blog http://etbe.coker.com.au/
My Documents Blog http://doc.coker.com.au/

___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Andrew Greig via luv-main

Hi Craig

On 19/1/20 3:47 pm, Craig Sanders via luv-main wrote:

That would be a very good idea.  Most modern motherboards will have more than
enough NVME and SATA slots for that (e.g. most Ryzen x570 motherboards have
2 or 3 NVME slots for extremely fast SSDs, plus 6 or 8 SATA ports for SATA
HDDs and SSDs. They also have enough RAM slots for 64GB DDR-4 RAM, and have at
least 2 or 3 PCI-e v4 slots - you'll use one for your graphics card).

2 SSDs for the rootfs including your home dir, and 2 HDDs for your /data bulk
storage filesystem.  And more than enough drive ports for future expansion if
you ever need it.


---

some info on nvme vs sata:

NVME SSDs are **much** faster than SATA SSDs.  SATA 3 is 6 Gbps (600 MBps), so
taking protocol overhead into account SATA drives max out at around 550 MBps.

NVME drives run at **up to** PCI-e bus speeds - with 4 lanes, that's a little
under 40 Gbps for PCIe v3 (approx 4000 MBps minus protocol overhead), double
that for PCIe v4.  That's the theoretical maximum speed, anyway. In practice,
most NVME SSDs run quite a bit slower than that, about 2 GBps - that's still
almost 4 times as fast as a SATA SSD.

Some brands and models (e.g. those from samsung and crucial) run at around
3200 to 3500 MBps, but they cost more (e.g. a 1TB Samsung 970 EVO PLUS
(MZ-V7S1T0BW) costs around $300, while the 1TB Kingston A2000 (SA2000M8/1000G)
costs around $160 but is only around 1800 MBps).

AFAIK there are no NVME drives that run at full PCI-e v4 speed (~8 GBps with
4 lanes) yet, it's still too new. That's not a problem, PCI-e is designed to
be backwards-compatible with earlier versions, so any current NVME drive will
work in pcie v4 slots.

NVME SSDs cost about the same as SATA SSDs of the same capacity so there's no
reason not to get them if your motherboard has NVME slots (which are pretty
much standard these days).


BTW, the socket that NVME drives plug into is called "M.2".  M.2 supports
both SATA & NVME protocols.  SATA M.2 runs at 6 Gbps.  NVME runs at PCI-e bus
speed. So you have to be careful when you buy to make sure you get an NVME M.2
drive and not a SATA drive in M.2 form-factor...some retailers will try to
exploit the confusion over this.

craig

--


Hi Craig

here is the output of blkid

/dev/sdb1: LABEL="Data" UUID="73f55e83-2038-4a0d-9c05-8f7e2e741517" 
UUID_SUB="77fdea4e-3157-45af-bba4-7db8eb04ff08" TYPE="btrfs" 
PARTUUID="d5d96658-01"
/dev/sdc1: LABEL="Data" UUID="73f55e83-2038-4a0d-9c05-8f7e2e741517" 
UUID_SUB="8ad739f7-675e-4aeb-ab27-299b34f6ace5" TYPE="btrfs" 
PARTUUID="a1948e65-01"


I tried the first UUID for sdc1 and the machine hung, but it gave me an 
opportunity to edit the fstab and reboot. When checking the UUIDs I 
discovered that the first entry for both drives was identical. Should I 
be using the SUB UUID for sdc1 for the entry in fstab?
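
For reference: every member device of a multi-device btrfs filesystem 
reports the same filesystem UUID, and UUID_SUB is just a per-device 
identifier, so a single fstab entry using the shared UUID is the usual 
approach.  A minimal sketch, assuming the array is mounted at /data:

  # /etc/fstab -- one entry covers both sdb1 and sdc1, since they form one filesystem
  UUID=73f55e83-2038-4a0d-9c05-8f7e2e741517  /data  btrfs  defaults  0  0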


Kind regards

Andrew



___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Andrew Greig via luv-main

Thanks Craig,

As they say in the Medibank commercial "I feel better now!"


Andrew

On 19/1/20 3:47 pm, Craig Sanders via luv-main wrote:

On Sat, Jan 18, 2020 at 11:06:50PM +1100, Andrew Greig wrote:

Yes, the problem was my motherboard would not handle enough disks, and we
formatted sdc with btrfs and left sdb alone so that btrfs could arrange
things between them.

I was hoping to get an understanding of how the RAID drives remembered the
"Balance" command when the whole of the root filesystem was replaced on
a new SSD.

Your rootfs and your /data filesystem(*) are entirely separate.  Don't confuse
them.

The /data filesystem needed to be re-balanced when you added the second drive
(making it into a raid-1 array). 'btrfs balance' reads and rewrites all the
existing data on a btrfs filesystem so that it is distributed equally over all
drives in the array.  For RAID-1, that means mirroring all the data on the
first drive onto the second, so that there's a redundant copy of everything.

Your rootfs is only a single partition, it doesn't have a raid-1 mirror, so
re-balancing isn't necessary (and would do nothing).


BTW, there's nothing being "remembered". 'btrfs balance' just re-balances the
existing data over all drives in the array. It's a once-off operation that
runs to completion and then exits. All **NEW** data will be automatically
distributed across the array.  If you ever add another drive to the array, or
convert it to raid-0 (definitely NOT recommended), you'll need to re-balance
it again. Until and unless that happens you don't need to even think about
re-balancing; it's no longer relevant.



(*) I think you had your btrfs raid array mounted at /data, but I may be
mis-remembering that.  To the best of my knowledge, you have two entirely
separate btrfs filesystems - one is the root filesystem, mounted as / (it also
has /home on it, which IIRC you have made a separate btrfs sub-volume for).
Anyway, it's a single-partition btrfs fs with no raid. The other is a 2 drive
btrfs fs using raid-1, which I think is mounted as /data.



I thought that control would have rested with /etc/fstab.  How do the
drives know to balance themselves, is there a command resident in sdc1?

/etc/fstab tells the system which filesystems to mount. It gets read at boot
time by the system start up scripts.



My plan is to have auto backups, and given that my activity has seen an SSD
go down in 12 months, maybe at 10 months I should build a new box, something
which will handle 64GB RAM and have a decent Open Source Graphics driver.
And put the / on a pair of 1TB SSDs.

That would be a very good idea.  Most modern motherboards will have more than
enough NVME and SATA slots for that (e.g. most Ryzen x570 motherboards have
2 or 3 NVME slots for extremely fast SSDs, plus 6 or 8 SATA ports for SATA
HDDs and SSDs. They also have enough RAM slots for 64GB DDR-4 RAM, and have at
least 2 or 3 PCI-e v4 slots - you'll use one for your graphics card).

2 SSDs for the rootfs including your home dir, and 2 HDDs for your /data bulk
storage filesystem.  And more than enough drive ports for future expansion if
you ever need it.


---

some info on nvme vs sata:

NVME SSDs are **much** faster than SATA SSDs.  SATA 3 is 6 Gbps (600 MBps), so
taking protocol overhead into account SATA drives max out at around 550 MBps.

NVME drives run at **up to** PCI-e bus speeds - with 4 lanes, that's a little
under 40 Gbps for PCIe v3 (approx 4000 MBps minus protocol overhead), double
that for PCIe v4.  That's the theoretical maximum speed, anyway. In practice,
most NVME SSDs run quite a bit slower than that, about 2 GBps - that's still
almost 4 times as fast as a SATA SSD.

Some brands and models (e.g. those from samsung and crucial) run at around
3200 to 3500 MBps, but they cost more (e.g. a 1TB Samsung 970 EVO PLUS
(MZ-V7S1T0BW) costs around $300, while the 1TB Kingston A2000 (SA2000M8/1000G)
costs around $160 but is only around 1800 MBps).

AFAIK there are no NVME drives that run at full PCI-e v4 speed (~8 GBps with
4 lanes) yet, it's still too new. That's not a problem, PCI-e is designed to
be backwards-compatible with earlier versions, so any current NVME drive will
work in pcie v4 slots.

NVME SSDs cost about the same as SATA SSDs of the same capacity so there's no
reason not to get them if your motherboard has NVME slots (which are pretty
much standard these days).


BTW, the socket that NVME drives plug into is called "M.2".  M.2 supports
both SATA & NVME protocols.  SATA M.2 runs at 6 Gbps.  NVME runs at PCI-e bus
speed. So you have to be careful when you buy to make sure you get an NVME M.2
drive and not a SATA drive in M.2 form-factor...some retailers will try to
exploit the confusion over this.

craig

--
craig sanders 
___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Craig Sanders via luv-main
On Sat, Jan 18, 2020 at 11:06:50PM +1100, Andrew Greig wrote:
> Yes, the problem was my motherboard would not handle enough disks, and we
> formatted sdc with btrfs and left sdb alone so that btrfs could arrange
> things between them.
>
> I was hoping to get an understanding of how the RAID drives remembered the
> "Balance" command when the the whole of the root filesystem was replaced on
> a new SSD.

Your rootfs and your /data filesystem(*) are entirely separate.  Don't confuse
them.

The /data filesystem needed to be re-balanced when you added the second drive
(making it into a raid-1 array). 'btrfs balance' reads and rewrites all the
existing data on a btrfs filesystem so that it is distributed equally over all
drives in the array.  For RAID-1, that means mirroring all the data on the
first drive onto the second, so that there's a redundant copy of everything.
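
In command terms that step is roughly the following (a sketch only; the 
device and mount point names are illustrative, not the actual ones):

  btrfs device add /dev/sdc1 /data                           # add the second drive to the filesystem
  btrfs balance start -dconvert=raid1 -mconvert=raid1 /data  # mirror data and metadata onto it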

Your rootfs is only a single partition, it doesn't have a raid-1 mirror, so
re-balancing isn't necessary (and would do nothing).


BTW, there's nothing being "remembered". 'btrfs balance' just re-balances the
existing data over all drives in the array. It's a once-off operation that
runs to completion and then exits. All **NEW** data will be automatically
distributed across the array.  If you ever add another drive to the array, or
convert it to raid-0 (definitely NOT recommended), you'll need to re-balance
it again. Until and unless that happens you don't need to even think about
re-balancing; it's no longer relevant.
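
If you ever want to check, a running balance shows up like this (a sketch, 
assuming the array is mounted at /data):

  btrfs balance status /data    # prints "No balance found" once the one-off operation has finished
  btrfs filesystem df /data     # shows which profile (single, DUP, RAID1, ...) data and metadata use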



(*) I think you had your btrfs raid array mounted at /data, but I may be
mis-remembering that.  To the best of my knowledge, you have two entirely
separate btrfs filesystems - one is the root filesystem, mounted as / (it also
has /home on it, which IIRC you have made a separate btrfs sub-volume for).
Anyway, it's a single-partition btrfs fs with no raid. The other is a 2 drive
btrfs fs using raid-1, which I think is mounted as /data.


> I thought that control would have rested with /etc/fstab.  How do the
> drives know to balance themselves, is there a command resident in sdc1?

/etc/fstab tells the system which filesystems to mount. It gets read at boot
time by the system start up scripts.
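
A couple of commands make that relationship visible (findmnt ships with 
util-linux; the /data mount point is illustrative):

  findmnt --fstab       # list what /etc/fstab declares
  sudo mount -a         # mount anything in fstab that isn't already mounted
  findmnt /data         # confirm what is actually mounted at /data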


> My plan is to have auto backups, and given that my activity has seen an SSD
> go down in 12 months, maybe at 10 months I should build a new box, something
> which will handle 64GB RAM and have a decent Open Source Graphics driver.
> And put the / on a pair of 1TB SSDs.

That would be a very good idea.  Most modern motherboards will have more than
enough NVME and SATA slots for that (e.g. most Ryzen x570 motherboards have
2 or 3 NVME slots for extremely fast SSDs, plus 6 or 8 SATA ports for SATA
HDDs and SSDs. They also have enough RAM slots for 64GB DDR-4 RAM, and have at
least 2 or 3 PCI-e v4 slots - you'll use one for your graphics card).

2 SSDs for the rootfs including your home dir, and 2 HDDs for your /data bulk
storage filesystem.  And more than enough drive ports for future expansion if
you ever need it.


---

some info on nvme vs sata:

NVME SSDs are **much** faster than SATA SSDs.  SATA 3 is 6 Gbps (600 MBps), so
taking protocol overhead into account SATA drives max out at around 550 MBps.

NVME drives run at **up to** PCI-e bus speeds - with 4 lanes, that's a little
under 40 Gbps for PCIe v3 (approx 4000 MBps minus protocol overhead), double
that for PCIe v4.  That's the theoretical maximum speed, anyway. In practice,
most NVME SSDs run quite a bit slower than that, about 2 GBps - that's still
almost 4 times as fast as a SATA SSD.
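
If you want to see what your own NVMe drive has negotiated, lspci reports the 
PCIe link generation and lane count (a sketch; the link status fields 
generally need root to read):

  sudo lspci -vv | grep -A 40 'Non-Volatile memory controller' | grep -E 'LnkCap:|LnkSta:'
  # LnkSta shows the negotiated speed (8GT/s = PCIe v3, 16GT/s = v4) and width (e.g. x4)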

Some brands and models (e.g. those from samsung and crucial) run at around
3200 to 3500 MBps, but they cost more (e.g. a 1TB Samsung 970 EVO PLUS
(MZ-V7S1T0BW) costs around $300, while the 1TB Kingston A2000 (SA2000M8/1000G)
costs around $160 but is only around 1800 MBps).

AFAIK there are no NVME drives that run at full PCI-e v4 speed (~8 GBps with
4 lanes) yet, it's still too new. That's not a problem, PCI-e is designed to
be backwards-compatible with earlier versions, so any current NVME drive will
work in pcie v4 slots.

NVME SSDs cost about the same as SATA SSDs of the same capacity so there's no
reason not to get them if your motherboard has NVME slots (which are pretty
much standard these days).


BTW, the socket that NVME drives plug into is called "M.2".  M.2 supports
both SATA & NVME protocols.  SATA M.2 runs at 6 Gbps.  NVME runs at PCI-e bus
speed. So you have to be careful when you buy to make sure you get an NVME M.2
drive and not a SATA drive in M.2 form-factor...some retailers will try to
exploit the confusion over this.

craig

--
craig sanders 
___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main


Re: Rebuild after disk fail

2020-01-18 Thread Andrew Greig via luv-main

Hi Craig,

Yes, the problem was my motherboard would not handle enough disks, and 
we formatted sdc with btrfs and left sdb alone so that btrfs could 
arrange things between them.


I was hoping to get an understanding of how the RAID drives remembered 
the "Balance" command when the the whole of the root filesystem was 
replaced on a new SSD. I thought that control would have rested with 
/etc/fstab.  How do the drives know to balance themselves, is there a 
command resident in sdc1?


My plan is to have auto backups, and given that my activity has seen an 
SSD go down in 12 months, maybe at 10 months I should build a new box, 
something which will handle 64GB RAM and have a decent Open Source 
Graphics driver. And put the / on a pair of 1TB SSDs.


Many thanks

Andrew


On 18/1/20 6:44 pm, Craig Sanders via luv-main wrote:

On Sat, Jan 18, 2020 at 02:14:46PM +1100, Andrew McGlashan wrote:

Just some thoughts

Way back, SSDs were expensive and less reliable than today.

Given the cost of SSDs today, I would consider even RAIDING the SSDs.

If it's physically possible to install a second SSD of the same storage
capacity or larger then he absolutely should do so.  I vaguely recall
suggesting he should get a second SSD for the rootfs ages ago, but my
understanding / assumption was that there was only physical space and
connectors for one SSD in the machine.

The 'btrfs snapshot' + 'btrfs send' suggestion was just a way of regularly
backing up a single-drive btrfs filesystem onto his raid-1 btrfs array so that
little or nothing was lost in case of another drive failure. It's less than
ideal, but a LOT better than nothing.
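
In rough command form that kind of backup looks like this (paths are 
illustrative, not the actual layout):

  # take a read-only snapshot of the rootfs, then send it to the raid-1 array
  btrfs subvolume snapshot -r / /snapshots/root-$(date +%F)
  btrfs send /snapshots/root-$(date +%F) | btrfs receive /data/backups/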

I personally would never use anything less than RAID-1 (or equivalent, such
as a mirrored pair on zfs) for any storage. Which means, of course, that I'm
used to paying double for my storage capacity - i can't just buy one, I have
to buy a pair.  Not as a substitute for regular backups, but for convenience
when only one drive of a pair has died.

Drives die, and the time & inconvenience of dealing with that (and the lost
data) cost far more than the price of a second drive for raid-1/mirror.

craig

--
craig sanders 

___
luv-main mailing list
luv-main@luv.asn.au
https://lists.luv.asn.au/cgi-bin/mailman/listinfo/luv-main