Re: continued zfs-related lockups

2024-11-19 Thread Greg Troxel
I have published a repository related to debugging/understanding ZFS on NetBSD: https://codeberg.org/gdt/netbsd-zfs This has a patch that adds comments, rototills ARC sizing (smaller), disables prefetching, and printfs ARC eviction behavior (when interesting), plus a script
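Not from the repository itself, but a minimal sketch of how one might watch ARC behavior while testing such a patch. It assumes the NetBSD port exposes the usual arcstats counters under the kstat.zfs sysctl tree (the node names are an assumption; check `sysctl kstat` on your kernel):

```
#!/bin/sh
# Poll ARC size against its target (c) and ceiling (c_max) every 10 seconds.
# Assumes kstat.zfs.misc.arcstats.* sysctl nodes exist on this kernel.
while true; do
    date
    sysctl kstat.zfs.misc.arcstats.size \
           kstat.zfs.misc.arcstats.c \
           kstat.zfs.misc.arcstats.c_max 2>/dev/null \
        || echo "arcstats sysctl nodes not found"
    sleep 10
done
```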

Re: Need advice on ZFS

2024-10-08 Thread Sergio de Almeida Lenzi
On Wed, 2024-08-14 at 09:22 +, janic...@posteo.de wrote: > Hello, > > I have been using ZFS on my NetBSD 10 server for a while to store all my family > members' data and have never had any problem or data loss. I even > managed to hack SSD caching into it - the NetBSD ver

Re: ZFS Partition GUID all zeros?

2024-09-13 Thread Jeff Rizzo
On 9/12/24 11:55 PM, Michael van Elst wrote: The gpt command creates GUIDs, including the partition GUIDs when you create the GPT itself. When adding, modifying or removing partitions these are not changed (just the "type GUIDs"). Ah hah - this is the crucial bit of information.  I think I just

Re: ZFS Partition GUID all zeros?

2024-09-13 Thread Michael van Elst
r...@tastylime.net (Jeff Rizzo) writes: >I was getting some help with related problems on IRC earlier today (I >think they're mostly sorted), and I didn't even think about this one: >the `gpt` command seems to have created partitions with all-zeros GUIDs. The gpt command creates GUIDs, inclu

Re: ZFS Partition GUID all zeros?

2024-09-12 Thread Martin Husemann
On Thu, Sep 12, 2024 at 08:39:36PM -0700, Jeff Rizzo wrote: > So... how bad is it that I have four GPT partitions with the same GUID?  :)  > Can I edit them? And how do we think this happened? Very bad, and we should add some asserts to gpt(8) to catch this! You should be able to fix it with "gpt
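The suggested command is cut off above; a hedged guess at the fix, assuming the truncated suggestion is gpt(8)'s uuid subcommand (verify the affected indexes with gpt show first; flags per gpt(8)):

```
# Inspect the table and the (all-zeros) partition GUIDs first.
gpt show -a wd0
# Regenerate the GUID of partition index 1; repeat per affected index.
# Assumption: this is the "gpt ..." command truncated above.
gpt uuid -i 1 wd0
```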

ZFS Partition GUID all zeros?

2024-09-12 Thread Jeff Rizzo
I was getting some help with related problems on IRC earlier today (I think they're mostly sorted), and I didn't even think about this one: the `gpt` command seems to have created partitions with all-zeros GUIDs. Background: I'm replacing some 4TB disks in a ZFS raidz2 with

Re: Need advice on ZFS

2024-08-14 Thread janicetr
Hello, I have been using ZFS on my NetBSD 10 server for a while to store all my family members' data and have never had any problem or data loss. I even managed to hack SSD caching into it - the NetBSD version doesn't seem to support cache files, only (SSD) disks, but with vnd0 (and a bit mo
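A hedged reconstruction of the vnd(4) trick described above (file path, size, and pool name are hypothetical; zpool may want the full /dev path, e.g. /dev/vnd0d):

```
# Back a "cache disk" with a file, since the port only accepts disks.
dd if=/dev/zero of=/cache/zfs-l2arc.img bs=1m count=16384   # 16 GB file
vndconfig vnd0 /cache/zfs-l2arc.img
zpool add tank cache vnd0
```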

Re: Need advice on ZFS

2024-08-09 Thread Joel
On 8/9/2024 13:18, Ted Spradley wrote: On Thu, 08 Aug 2024 18:11:07 -0400 Greg Troxel wrote: You obviously need to be creating multiple backups and taking at least some of them off site. That's true regardless of filesystem. I would say your exposure from all things not zfs flakine

Re: Need advice on ZFS

2024-08-09 Thread Ted Spradley
On Thu, 08 Aug 2024 18:11:07 -0400 Greg Troxel wrote: > You obviously need to be creating multiple backups and taking at least > some of them off site. That's true regardless of filesystem. I would > say your exposure from all things not zfs flakiness is far greater than > y

Re: Need advice on ZFS

2024-08-08 Thread Will Senn
On 8/8/24 14:53, Jonathan A. Kollasch wrote: On Thu, Aug 08, 2024 at 02:19:18PM -0500, Ted Spradley wrote: Should I trust NetBSD's ZFS with my user's data? I understand that our ZFS isn't being kept up as well as FreeBSD's, that's why I have FreeBSD on one box on my h

Re: Need advice on ZFS

2024-08-08 Thread Greg Troxel
Ted Spradley writes: > Should I trust NetBSD's ZFS with my user's data? > > I understand that our ZFS isn't being kept up as well as FreeBSD's, > that's why I have FreeBSD on one box on my home network, but I'd like > to keep all *my* hosts on NetBSD

Re: Need advice on ZFS

2024-08-08 Thread Jonathan A. Kollasch
On Thu, Aug 08, 2024 at 02:19:18PM -0500, Ted Spradley wrote: > Should I trust NetBSD's ZFS with my user's data? > > I understand that our ZFS isn't being kept up as well as FreeBSD's, > that's why I have FreeBSD on one box on my home network, but I'd l

Need advice on ZFS

2024-08-08 Thread Ted Spradley
Should I trust NetBSD's ZFS with my user's data? I understand that our ZFS isn't being kept up as well as FreeBSD's, that's why I have FreeBSD on one box on my home network, but I'd like to keep all *my* hosts on NetBSD (my users run Windows). I have one user (my wif

continued zfs-related lockups

2024-07-09 Thread Greg Troxel
I am having continued zfs-related lockups on two systems and am posting some anecdata/comments. I am building a LOCKDEBUG kernel to see if that changes anything. Both systems are up-to-date netbsd-10. System 1 is bare metal, 32G RAM. System 2 is xen, 4000M RAM in the dom0. Issues described

Re: Samba, ZFS and xattr

2023-08-21 Thread Greg Troxel
Hauke Fath writes: > On 8/14/23 22:14, Chavdar Ivanov wrote: >>> supermicro# zfs set xattr=on pool0/backup/timemachine >>> property 'xattr' not supported on NetBSD: permission denied >>> >>> If I'm not mistaken, this should be the step to

Re: Samba, ZFS and xattr

2023-08-21 Thread Hauke Fath
On 8/14/23 22:14, Chavdar Ivanov wrote: supermicro# zfs set xattr=on pool0/backup/timemachine property 'xattr' not supported on NetBSD: permission denied If I'm not mistaken, this should be the step to set xattr? According to Oracle's documentation, yes; it should be on b

Re: Samba, ZFS and xattr

2023-08-14 Thread Chavdar Ivanov
On Mon, 14 Aug 2023 at 18:56, Manuel Kuklinski wrote: > > On Monday, 14 August 2023 at 12:49:53 -0400, Greg Troxel wrote: > > Did you try to set it and verify that it doesn't work? I have the > > impression that xattr is fine in zfs -- but that is an impres

Re: Samba, ZFS and xattr

2023-08-14 Thread Manuel Kuklinski
On Monday, 14 August 2023 at 12:49:53 -0400, Greg Troxel wrote: > Did you try to set it and verify that it doesn't work? I have the > impression that xattr is fine in zfs -- but that is an impression, not > knowledge. > Hi! Already tried this: supermicro# zfs set xatt

Samba, ZFS and xattr

2023-08-14 Thread Manuel Kuklinski
Hi! I'm new to NetBSD, but so far I'm happy with my decision to switch from another UNIX-like OS, especially since ZFS has been supported for some time now. I'm relying on ZFS and Samba for Macintosh clients and need xattr for "fruit:timemachine = yes" in /usr/pkg/etc/sam
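For reference, a hedged sketch of the Time Machine share this sets up; the share name and path are hypothetical, and note that current vfs_fruit(8) spells the option "fruit:time machine" (with a space). fruit needs streams_xattr, which is why xattr support on the dataset matters here:

```
# Excerpt for /usr/pkg/etc/samba/smb.conf (illustrative values)
[timemachine]
    path = /pool0/backup/timemachine
    vfs objects = catia fruit streams_xattr
    fruit:time machine = yes
    read only = no
```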

Re: ZFS Bogosity

2023-08-14 Thread Michael van Elst
ently, i.e. small disks with "physical" driver names and a partition suffix, large disks with a name (or a UUID as unique default name). Making small disks use wedges, independent of disklabel or GPT, would unify both. > > Making zfs scan disklabel partitions derived from hw.disknam

Re: ZFS Bogosity

2023-08-14 Thread Greg Troxel
have unwanted side effects > when you scan non-disk devices. Makes sense, but the man page should probably be louder about this, then. I'll try to get this into the zfs howto. >> What happens on FreeBSD? Are they so firm on gpt-only, geom and >> zfs-on-whole-disk that this doesn't

Re: ZFS Bogosity

2023-08-13 Thread Jay F. Shachter
…think this matches the symptoms you saw? > Indeed, it matches exactly the symptoms that I saw, except that I did not test whether it would have worked if I had created a ZFS pool on a whole unpartitioned wd0, since that would have destroyed 10 operating systems, including NetBSD. (Yes, I truly do

Re: ZFS Bogosity

2023-08-13 Thread Michael van Elst
disknames->dev if not, but it seems best to minimally munge > upstream. That's an optimization to avoid scanning and probing all entries in /dev/ which can take some time and may have unwanted side effects when you scan non-disk devices. > What happens on FreeBSD? Are they so firm on gpt-

Re: ZFS Bogosity

2023-08-13 Thread Greg Troxel
mlel...@serpens.de (Michael van Elst) writes: > g...@lexort.com (Greg Troxel) writes: > >>David Brownlee writes: >>> https://gnats.netbsd.org/57583 > >>Do you think this is just a bug that it fails to look at wd3e >>etc. wrongly if there is /dev/zfs?

Re: ZFS Bogosity

2023-08-13 Thread Michael van Elst
g...@lexort.com (Greg Troxel) writes: >David Brownlee writes: >> https://gnats.netbsd.org/57583 >Do you think this is just a bug that it fails to look at wd3e >etc. wrongly if there is /dev/zfs? The code scans all devices in the specified device directory, unless it's
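The directory being scanned can be steered explicitly with zpool's standard -d option, which sidesteps probing everything under /dev (pool name hypothetical):

```
# Scan only a specific directory for pool members:
zpool import -d /dev/wedges          # list importable pools found there
zpool import -d /dev tank            # import "tank", scanning /dev only
```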

Re: ZFS Bogosity

2023-08-13 Thread Greg Troxel
David Brownlee writes: > https://gnats.netbsd.org/57583 Do you think this is just a bug that it fails to look at wd3e etc. wrongly if there is /dev/zfs? What is the point of /dev/zfs (is that how zpool/zfs control works?) and is there any reason this should matter? Do you think this is t

Re: ZFS Bogosity

2023-08-13 Thread David Brownlee
On Thu, 10 Aug 2023 at 22:16, Jay F. Shachter wrote: > > Esteemed Colleagues: > > I have a multiboot computer on which Solaris, Linux, and NetBSD 10 > BETA have all been successfully installed (I couldn't install NetBSD > 9.3) and they are all sharing storage on a ZFS pool,

ZFS Bogosity

2023-08-10 Thread Jay F. Shachter
Esteemed Colleagues: I have a multiboot computer on which Solaris, Linux, and NetBSD 10 BETA have all been successfully installed (I couldn't install NetBSD 9.3) and they are all sharing storage on a ZFS pool, because all three of those operating systems support, or can be made to support

Re: zfs pool behavior - is it ever freed?

2023-07-31 Thread Greg Troxel
which I am not really proposing for committing this minute. But I suggest anyone trying to run zfs on 8G and below try it. I think it would be interesting to hear how it affects systems with lots of memory. On a system with 6000 MB (yes, that's not a power of 2 - xen config), I end up with

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread Michael van Elst
g...@lexort.com (Greg Troxel) writes: >mlel...@serpens.de (Michael van Elst) writes: >> t...@netbsd.org (Tobias Nygren) writes: >> >>>There exists ZFS code which hooks into UVM to drain memory -- but part >>>of it is ifdef __i386 for some reason. See arc_kmem_

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread Greg Troxel
mlel...@serpens.de (Michael van Elst) writes: > t...@netbsd.org (Tobias Nygren) writes: > >>There exists ZFS code which hooks into UVM to drain memory -- but part >>of it is ifdef __i386 for some reason. See arc_kmem_reap_now(). > > That's an extra for 32bit system

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread Greg Troxel
mechanism to free memory under pressure -- which there >> > is not. >> >> There exists ZFS code which hooks into UVM to drain memory -- but part >> of it is ifdef __i386 for some reason. See arc_kmem_reap_now(). > > FWIW, with jemalloc, there is the possibility

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread Michael van Elst
t...@netbsd.org (Tobias Nygren) writes: >There exists ZFS code which hooks into UVM to drain memory -- but part >of it is ifdef __i386 for some reason. See arc_kmem_reap_now(). That's an extra for 32bit systems (later code replaced __i386 with the proper macro) where kernel address sp

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread tlaronde
…is not. > There exists ZFS code which hooks into UVM to drain memory -- but part > of it is ifdef __i386 for some reason. See arc_kmem_reap_now(). FWIW, with jemalloc, there is the possibility to configure it to give back memory to the system. Since jemalloc is incorporated in Ne

Re: zfs pool behavior - is it ever freed?

2023-07-29 Thread Tobias Nygren
On Fri, 28 Jul 2023 20:04:56 -0400 Greg Troxel wrote: > The upstream code tries to find a min/target/max under the assumption > that there is a mechanism to free memory under pressure -- which there > is not. There exists ZFS code which hooks into UVM to drain memory -- but part of it

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Greg Troxel
Tobias Nygren writes: > On Thu, 27 Jul 2023 06:43:45 -0400 > Greg Troxel wrote: > >> Thus it seems there is a limit for zfs usage, but it is simply >> sometimes too high depending on available RAM. > > I use this patch on my RPi4, which I feel improves thi

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Michael van Elst
On Fri, Jul 28, 2023 at 12:26:57PM -0400, Greg Troxel wrote: > mlel...@serpens.de (Michael van Elst) writes: > > > g...@lexort.com (Greg Troxel) writes: > > > >>I'm not either, but if there is a precise description/code of what they > >>did, that lowers the barrier to us stealing* it. (* There is

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Greg Troxel
mlel...@serpens.de (Michael van Elst) writes: > g...@lexort.com (Greg Troxel) writes: > >>I'm not either, but if there is a precise description/code of what they >>did, that lowers the barrier to us stealing* it. (* There is of course >>a long tradition of improvements from various *BSD being app

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Michael van Elst
g...@lexort.com (Greg Troxel) writes: >I'm not either, but if there is a precise description/code of what they >did, that lowers the barrier to us stealing* it. (* There is of course >a long tradition of improvements from various *BSD being applied to >others.) The FreeBSD code is already there

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Greg Troxel
…various *BSD being applied to others.) > I moved to FreeBSD from Net a few years ago (mainly to get ZFS), and > have had similar issues under heavy load with a large ARC. It wouldn't > crash or hang, but it would always favour killing something over > flushing the ARC under pressure. I d

Re: zfs pool behavior - is it ever freed?

2023-07-28 Thread Mr Roooster
On Thu, 27 Jul 2023 at 19:28, Greg Troxel wrote: > > Mike Pumford writes: > [snip] > > > If I've read it right there needs to be a mechanism for memory > > pressure to force ZFS to release memory. Doing it after all the > > processes have been swapped to disk

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Michael van Elst
On Thu, Jul 27, 2023 at 06:42:02PM +0100, Mike Pumford wrote: > > Now I might be reading it wrong but that suggest to me that it would be an > awful idea to run ZFS on a system that needs memory for things other than > filesystem caching as there is no way for those memory needs to

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Brett Lymn
for production use. > 4G will not work; we have no reports of successful long-term operation > > When you run out, it's ugly. External tickle after sync(8) works to > reboot. Other wdog approaches unclear. > > > Additional data welcome of course. From my e

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Greg Troxel
Mike Pumford writes: > Now I might be reading it wrong but that suggests to me that it would > be an awful idea to run ZFS on a system that needs memory for things > other than filesystem caching, as there is no way for those memory > needs to force ZFS to give up its pool usage. As

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Mike Pumford
On 27/07/2023 13:47, Michael van Elst wrote: Swapping out userland pages is done much earlier, so with high ZFS utilization you end up with a system that has a huge part of real memory allocated to the kernel. When you run out of swap (and processes already get killed), then you see some

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Greg Troxel
David Brownlee writes: > I would definitely like to see something like this in-tree soonest for > low memory (<6GB?) machines, but I'd prefer not to affect machines > with large amounts of memory used as dedicated ZFS fileservers (at > least not until its easily tunable)

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread David Brownlee
On Thu, 27 Jul 2023 at 13:24, Greg Troxel wrote: > > Tobias Nygren writes: > > > I use this patch on my RPi4, which I feel improves things. > > People might find it helpful. > > That looks very helpful; I'll try it. > > > There ought to be writable sysct

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Michael van Elst
g...@lexort.com (Greg Troxel) writes: > RAM and/or responds to pressure. That's why we see almost no reports > of trouble except for zfs. There is almost no pressure on pools and several effects prevent pressure from actually draining pool caches. There is almost no pressure on

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Greg Troxel
Tobias Nygren writes: > I use this patch on my RPi4, which I feel improves things. > People might find it helpful. That looks very helpful; I'll try it. > There ought to be writable sysctl knobs for some of the ZFS > tuneables, but looks like it isn't implemented in NetBS

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Tobias Nygren
On Thu, 27 Jul 2023 06:43:45 -0400 Greg Troxel wrote: > Thus it seems there is a limit for zfs usage, but it is simply > sometimes too high depending on available RAM. I use this patch on my RPi4, which I feel improves things. People might find it helpful. There ought to be writable

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread David Brownlee
Potentially supporting datapoint: I've found issues with netbsd-9 with ZFS on 4GB. Memory pressure was incredibly high and the system went away every few months. Currently running fine on -9 & -10 machines with between 8GB and 192GB The three 8GB ZFS machines (netbsd-9+raidz1, netbsd-

Re: zfs pool behavior - is it ever freed?

2023-07-27 Thread Greg Troxel
I have a bit of data, perhaps merged with some off list comments: People say that a 16G machine is ok with zfs, and I have seen no reports of real trouble. When I run my box with 4G, it locks up. When I run my box with 8G, I end up with pool usage in the 3 G to 3.5 G range. It feels
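A hedged way to measure the kind of pool usage reported here: NetBSD's vmstat -m lists kernel memory-pool statistics, and the ZFS pool caches can be picked out by name. The name patterns below are an assumption; inspect the full output on your own system first:

```
# Show ZFS-related kernel pool caches (assumed name patterns such as
# zio_*, arc_*, dnode_t, dmu_*, zfs_*; adjust after eyeballing vmstat -m).
vmstat -m | egrep '^(zio|arc|dnode|dmu|zfs|sa_cache)'
```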

Re: zfs pool behavior - is it ever freed?

2023-07-22 Thread Hauke Fath
On Sat, 22 Jul 2023 14:13:06 +0200, Hauke Fath wrote: > It has a pair of SSDs (older intel SLC sata) for system partitions and > L2ARC, [...] Got my acronyms wrong, I meant SLOG*. I understand that L2ARC is largely pointless, and a waste of good RAM. Cheerio, Hauke *

Re: zfs pool behavior - is it ever freed?

2023-07-22 Thread Hauke Fath
term even if not xen)? 800 MB > Are you using NFS from the domU to dom0? domU running zfs? Something > else? The machine is not running Xen - I found the Dom0 too limiting for the purpose. It has a pair of SSDs (older intel SLC sata) for system partitions and L2ARC, and a pair of 4TB

Re: zfs pool behavior - is it ever freed?

2023-07-22 Thread Greg Troxel
Hauke Fath writes: > On Fri, 21 Jul 2023 08:31:46 -0400, Greg Troxel wrote: > [zfs memory pressure] > >> Are others having this problem? > > I have two machines, one at home (-10) and one at work (-9), in a > similar role as yours (fileserver and builds). While both h

Re: zfs pool behavior - is it ever freed?

2023-07-22 Thread Hauke Fath
On Fri, 21 Jul 2023 08:31:46 -0400, Greg Troxel wrote: [zfs memory pressure] > Are others having this problem? I have two machines, one at home (-10) and one at work (-9), in a similar role as yours (fileserver and builds). While both have had their moments, those have never been zfs rela

Re: zfs pool behavior - is it ever freed?

2023-07-21 Thread Greg Troxel
This script worked to reboot after a wedge. Assuming one has a watchdog of course.

#!/bin/sh
if [ `id -u` != 0 ]; then
    echo run as root
    exit 1
fi
wdogctl -e -p 360 tco0
while true; do
    echo -n "LOOP: "; date
    date > /tank0/n0/do-wdog
    sync
    wdogctl -t

zfs pool behavior - is it ever freed?

2023-07-21 Thread Greg Troxel
I'm having trouble with zfs causing a system to run out of memory, when I think it should work ok. I have tried to err on the side of TMI. I have a semi-old computer (2010) that is: netbsd-10 amd64 8GB RAM 1T SSD cpu0: "Pentium(R) Dual-Core CPU E5700 @ 3.00GHz"

Re: Root On ZFS with Encryption

2022-01-09 Thread nia
On Sun, Jan 02, 2022 at 08:52:18PM +, Xianwen Chen wrote: > > > Dear all, > > Happy new year! > > I wonder if the workaround described in [0] could be further extended to make the > root ZFS fully encrypted? Has anyone tried it? > > Yours sincerel

Root On ZFS with Encryption

2022-01-02 Thread 陈贤文
Dear all, Happy new year! I wonder if the workaround described in [0] could be further extended to make the root ZFS fully encrypted? Has anyone tried it? Yours sincerely, Xianwen [0] https://wiki.netbsd.org/wiki/RootOnZFS/
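Not the RootOnZFS procedure itself, but a hedged sketch of the building block the wiki workaround layers on: a pool on a cgd(4) device. Device names and the verification method are illustrative; see cgdconfig(8):

```
# Generate a parameters file for the underlying partition (hypothetical wd1a),
# then configure the cgd and build the pool on top of it.
cgdconfig -g -V re-enter -o /etc/cgd/wd1a aes-xts 256
cgdconfig cgd0 /dev/wd1a          # uses /etc/cgd/wd1a by default
zpool create tank cgd0            # or the full /dev path, e.g. /dev/cgd0d
```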

Re: Default group on ZFS

2021-12-20 Thread Hauke Fath
On Mon, 20 Dec 2021 18:23:35 +0100, J. Hannken-Illjes wrote: > Looks like the attached diff is sufficient. > Will you commit or should I? Thanks! I'll give it a test run, and commit. Cheerio, Hauke -- The ASCII Ribbon CampaignHauke Fath () No HTML/RTF in email

Re: Default group on ZFS

2021-12-20 Thread J. Hannken-Illjes
> On 20. Dec 2021, at 14:59, Hauke Fath wrote: > > On 12/20/21 1:46 PM, Hauke Fath wrote: >> on BSD ffs, group ownership of a newly created file defaults to the >> enclosing directory's. >> ZFS appears to go with SysV in defaulting to the owner's primary

Re: Default group on ZFS

2021-12-20 Thread Hauke Fath
On 12/20/21 1:46 PM, Hauke Fath wrote: on BSD ffs, group ownership of a newly created file defaults to the enclosing directory's. ZFS appears to go with SysV in defaulting to the owner's primary group. Is there a mount option to change that? I know I could setgid all the director

Default group on ZFS

2021-12-20 Thread Hauke Fath
Hi, on BSD ffs, group ownership of a newly created file defaults to the enclosing directory's. ZFS appears to go with SysV in defaulting to the owner's primary group. Is there a mount option to change that? I know I could setgid all the directories on the zfs volume, but that'
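The setgid workaround mentioned above, spelled out (group name and path are hypothetical):

```
# Make new files inherit the directory's group, BSD-style, on ZFS:
chgrp -R staff /tank/share
find /tank/share -type d -exec chmod g+s {} +
```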

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-23 Thread Pouya Tafti
On Sat, 21 Aug 2021 at 17:03 -, Michael van Elst wrote: > pouya+lists.net...@nohup.io (Pouya Tafti) writes: > > >Aug 20 06:04:33 basil smartd[1106]: Device: /dev/rsd5d [SAT], SMART > >Prefailure Attribute: 1 Raw_Read_Error_Rate changed from 65 to 79 > >Aug 20 06:04:33 basil smartd[1106]: Dev

ZFS root & boot install story?

2021-08-22 Thread Carl Brewer
Hey, According to a wiki article from 16 months ago, ZFS root "works", but is a bit fiddly to set up : https://wiki.netbsd.org/wiki/RootOnZFS/ It seems a bit weird.  Has the story changed since that wiki article was written? I'm going to be replacing an old NetBSD serv

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-21 Thread Michael van Elst
pouya+lists.net...@nohup.io (Pouya Tafti) writes: >Aug 20 06:04:33 basil smartd[1106]: Device: /dev/rsd5d [SAT], SMART Prefailure >Attribute: 1 Raw_Read_Error_Rate changed from 65 to 79 >Aug 20 06:04:33 basil smartd[1106]: Device: /dev/rsd5d [SAT], SMART Prefailure >Attribute: 7 Seek_Error_Rate

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-21 Thread Pouya Tafti
logs, this time the device wasn't detached by the controller, but smartd(8) logged some read errors throughout the day. But these also kept showing up before the pool became unresponsive. zpool status shows no errors and I did a successful scrub of both pools (primary and backup) after rebo

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-20 Thread Pouya Tafti
Duplicate, please ignore. Apologies for the noise. On Fri, 20 Aug 2021 at 06:34 +0200, Pouya Tafti wrote: > After a recent drive failure in my primary zfs pool, I set > up a secondary pool on a cgd(4) device on a single new sata > hdd (zfs on gpt on cgd on gpt on a 4TB Seagate Ironwolf

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-20 Thread Pouya Tafti
On Fri, 20 Aug 2021 at 06:13 -, Michael van Elst wrote: [snip] > Yes. It could be the drive itself, but I'd suspect the > backplane or cables. The PSU is also a possible candidate. Thanks. Retrying the replication in another bay now before opening up the box.

diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-20 Thread Pouya Tafti
After a recent drive failure in my primary zfs pool, I set up a secondary pool on a cgd(4) device on a single new sata hdd (zfs on gpt on cgd on gpt on a 4TB Seagate Ironwolf hdd) to back up the primary. I initially scrubbed the entire disk without apparent incident using a temporary cryptographic

Re: diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-19 Thread Michael van Elst
pouya+lists.net...@nohup.io (Pouya Tafti) writes: Your disk controller gives the error reason: >[ 57131.573806] mpii0: physical device removed from slot 7 >Apart from the drive, I also have little faith in the >backplane, cables, SAS controller (which I reflashed), RAM, >etc., although here it l

diagnosis for disk drive errors (zfs on cgd on sata disk)

2021-08-19 Thread Pouya Tafti
After a recent drive failure in my primary zfs pool, I set up a secondary pool on a cgd(4) device on a single new sata hdd (zfs on gpt on cgd on gpt on a 4TB Seagate Ironwolf hdd) to back up the primary. I initially scrubbed the entire disk without apparent incident using a temporary cryptographic

Re: zfs resilver in(de)finite loop?

2021-08-14 Thread Pouya Tafti
On Sun, 15 Aug 2021 at 00:57 +0200, Pouya Tafti wrote: > Thanks. What I ended up doing was detach the replacement > device, offline the old one, and then re-issue replace. > It's resilvering the replacement now. Will see how it goes. This 'worked' in the sense that the resilvering finished with

Re: zfs resilver in(de)finite loop?

2021-08-14 Thread Pouya Tafti
On Sat, 14 Aug 2021 at 18:52 -0400, Brad Spencer wrote: [snip] > As a general rule, although I will say not a required-hard-rule, it > would be a good idea to take a not-fully-failed ZFS member offline before > doing a replacement. If the member has failed completely, that is > diffe
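The order Brad recommends, using the pool and wedge names from the original report in this thread:

```
# Take the failing member offline first, then replace and watch the resilver.
zpool offline pond wedges/slot4zfs
zpool replace pond wedges/slot4zfs wedges/slot7zfs
zpool status pond
```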

Re: zfs resilver in(de)finite loop?

2021-08-14 Thread Pouya Tafti
On Sat, 14 Aug 2021 at 23:29 +0200, Pouya Tafti wrote: > I'm new to ZFS and this is the first time I'm dealing with > disk errors. So I don't know if this is normal behaviour > and I should just wait or if I was wrong to issue replace > rather than take the drive of

Re: zfs resilver in(de)finite loop?

2021-08-14 Thread Brad Spencer
[snip] So... it looks like it may have tried to resilver the failing drive when you performed the replacement or had started to resilver the failing drive as you performed the replacement. In another OS with ZFS I have seen something like this resilver restarting behavior. In my case it ultimate

zfs resilver in(de)finite loop?

2021-08-14 Thread Pouya Tafti
). I wasn't sure whether in such a case one should take the problematic drive offline and resilver, or a simple replace would do, but assumed zfs would be smart enough to do the Right Thing as it knows about the errors. So I issued # zpool replace pond wedges/slot4zfs wedges/slot7zfs many

Re: cgd + zfs

2021-08-11 Thread Pouya Tafti
Thank you everyone for all the helpful and informative replies. I ended up using zfs without encryption in a configuration similar to what David had suggested. To summarise: 1. I was concerned hiding the SAS drives behind cgd could interfere with low-level fault tolerance mechanisms of zfs

Re: zfs on raw vs gpt

2021-07-31 Thread Michael van Elst
pouya+lists.net...@nohup.io (Pouya Tafti) writes: >On Fri, 30 Jul 2021 at 19:10 +0100, David Brownlee wrote: >> I started setting up using raw /dev/wdX or /dev/sdX devices, but >> switched across to ZFS gpt partitions. Both work fine, but for me: >> [...] >Your list o

Re: zfs on raw vs gpt

2021-07-31 Thread Pouya Tafti
On Fri, 30 Jul 2021 at 19:10 +0100, David Brownlee wrote: > [...] > I started setting up using raw /dev/wdX or /dev/sdX devices, but > switched across to ZFS gpt partitions. Both work fine, but for me: > [...] Your list of pros and cons was very helpful, thank you. I decided to

Re: zfs on raw vs gpt

2021-07-30 Thread David Brownlee
On Thu, 29 Jul 2021 at 21:08, Pouya Tafti wrote: > > Hi NetBSD users, > > Any advice on using zfs on raw disks vs gpt partitions? I'm > going to use entire disks for zfs and don't need > root-on-zfs. One advantage of using partitions seems to be > to protect agains

zfs on raw vs gpt

2021-07-29 Thread Pouya Tafti
Hi NetBSD users, Any advice on using zfs on raw disks vs gpt partitions? I'm going to use entire disks for zfs and don't need root-on-zfs. One advantage of using partitions seems to be to protect against the risk of having to replace a disk by a marginally smaller one. But I'
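A hedged sketch of the partition approach, deliberately leaving headroom so a marginally smaller replacement disk still fits. The sector count assumes 512-byte sectors on a nominal 4TB disk (~7814037168 sectors) and is illustrative, as is the "zfs" partition type alias; see gpt(8):

```
gpt create sd0
gpt add -t zfs -l tank0 -s 7800000000 sd0   # a few GB short of the full disk
dkctl sd0 listwedges                        # find the resulting dkN wedge
zpool create tank /dev/dk0                  # wedge number may differ
```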

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-19 Thread Mr Roooster
> It depends where the failure occurs I expect. A drive could read just fine, but then a damaged cable may cause enough noise that the data doesn't always make it to the controller correctly. > I also didn't realize that wd(4) would issue another read when there is a > failure, b

Re: cgd + zfs

2021-07-19 Thread Pouya Tafti
On Sun, 18 Jul 2021 at 09:29, Pouya Tafti wrote: > > Thanks! This is an interesting suggestion. I'm > > wondering though, wouldn't having a two-drive mirror create > > an asymmetry in how many failed drives you could tolerate? > > If you lost both mirrors the whole pool would be gone (I > > ass

Re: cgd + zfs

2021-07-19 Thread David Brownlee
Dell T320 - SAS9217-8i with 8 > > drives (plus one on onboard ahcisata), 6 in a RAIDZ2, two in a zfs > > mirror and one for boot > > Thanks! This is an interesting suggestion. I'm wondering though, wouldn't > having a two-drive mirror create an asymmetry in how many

Re: cgd + zfs

2021-07-18 Thread Pouya Tafti
> [pouya+lists.net...@nohup.io] > >> I'm now thinking, would it make sense to do the layering the other > >> way around, i.e. have cgd on top of a zvol? I wonder if there would > >> be any resilience (and possibly performance) advantage to having zfs > >

Re: cgd + zfs

2021-07-18 Thread Pouya Tafti
grade. (This suggestion is much relevant if you have other systems > where you can easily hook up a 4 drive RAIDZ2, but not a 6 drive > RAIDZ2 :-p) > > I have a somewhat similar setup on a Dell T320 - SAS9217-8i with 8 > drives (plus one on onboard ahcisata), 6 in a RAIDZ2, two in a zf

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Michael van Elst
…further read attempts aren't honored. >I also didn't realize that wd(4) would issue another read when there is a >failure, but maybe that's in zfs glue code. ZFS also seems to do a retry by itself, so you have to wait through the wd retries twice. On Solaris, ZFS can issue a request with th

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Greg Troxel
…interesting point. I find this confusing, because I thought an uncorrectable read error would, for disks I've dealt with, cause the sector to be marked as permanently failed and pending reallocation. I also didn't realize that wd(4) would issue another read when there is a failure, but maybe

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Brad Spencer
Mr Roooster writes: [snip] > The wd driver is retrying, (IIRC it retries 3 times) and succeeding on > the second or 3rd attempt. (See xfer 338, retry 0, followed by a 'soft > error corrected' with the same xfer number 10 seconds later. This is > the retry succeeding). >

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Mr Roooster
retry 1 > ``` > > The whole syslog is full of these messages. What surprises me is that > there are "uncorrectable" data errors in the syslog. Nevertheless, the > data can still be read - albeit very slowly. My assumption was that the > redundancies of RAIDZ2 are being used t

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-17 Thread Michael van Elst
lrwxr-xr-x 1 root wheel 8 Jul 17 22:03 /dev/wedges/0d2c1666-075f-4a70-bd04-c3f2913fbc80@ -> /dev/dk7
lrwxr-xr-x 1 root wheel 8 Jul 17 22:03 /dev/wedges/41993b4b-180c-4df3-bd6b-25b5cbc036cb@ -> /dev/dk6

The zpool can then be created using these links. For this to work, devpubd needs t

Re: cgd + zfs

2021-07-17 Thread Michael van Elst
a...@absd.org (David Brownlee) writes: >zfs is setup to use wedges via /dev/wedges and then adjusted >rc.d/devpubd to run _before_ zfs, so I have stable zfs devices if >anything renumbers on reboot Since the standard devpubd hooks now run without /usr, we could make this official. Th
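A hedged sketch of the setup David describes, using the wedge symlinks shown earlier in the thread (the mirror layout and the pairing of these two particular wedges are illustrative; the rc.d ordering tweak is site-specific and not shown):

```
# Enable devpubd so the /dev/wedges symlinks exist before zfs starts.
echo "devpubd=YES" >> /etc/rc.conf
/etc/rc.d/devpubd start
# Build the pool on stable, renumber-proof paths:
zpool create tank mirror \
    /dev/wedges/0d2c1666-075f-4a70-bd04-c3f2913fbc80 \
    /dev/wedges/41993b4b-180c-4df3-bd6b-25b5cbc036cb
```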

Re: cgd + zfs

2021-07-17 Thread Brad Spencer
Greg Troxel writes: > Brad Spencer writes: > >>> I'm now thinking, would it make sense to do the layering the other >>> way around, i.e. have cgd on top of a zvol? I wonder if there would >>> be any resilience (and possibly performance) advantage to h

Re: cgd + zfs

2021-07-17 Thread Greg Troxel
Brad Spencer writes: >> I'm now thinking, would it make sense to do the layering the other >> way around, i.e. have cgd on top of a zvol? I wonder if there would >> be any resilience (and possibly performance) advantage to having zfs >> directly access the hard

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann
…device and online it using 'zpool online'. see: http://illumos.org/msg/ZFS-8000-2Q scan: none requested config:

NAME        STATE     READ WRITE CKSUM
tank        DEGRADED     0     0     0
  raidz2-0  DE

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann
Hi, On 16.07.21 23:21, RVP wrote: On Fri, 16 Jul 2021, Matthias Petermann wrote: I will overwrite the disk with zeros once as a test. According to the S.M.A.R.T. values, the number of "pending" sectors has already decreased - from 18 to 15. ``` 197 200    0 no  online  positive    Curre

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread RVP
On Fri, 16 Jul 2021, Matthias Petermann wrote: I will overwrite the disk with zeros once as a test. According to the S.M.A.R.T. values, the number of "pending" sectors has already decreased - from 18 to 15. ``` 197 200 0 no online positive Current pending sector 15 ``` I w
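The zero-fill test being discussed, spelled out. The device name is hypothetical and this destroys the disk's contents, so triple-check the target; writing every sector gives the drive the chance to reallocate the pending ones:

```
# Overwrite the whole raw disk with zeros, then re-check SMART attributes.
dd if=/dev/zero of=/dev/rwd1d bs=1m
smartctl -A /dev/rwd1d      # "Current pending sector" should drop toward 0
```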

Re: ZFS RAIDZ2 and wd uncorrectable data error - why does ZFS not notice the hardware error?

2021-07-16 Thread Matthias Petermann
Hi Michael, On 16.07.21 16:46, Michael van Elst wrote: smartmontools has more features and also understands rare setups with e.g. RAID controllers, early USB enclosures or vendor-specific (usually undocumented) parameters. It also comes with smartd to monitor drives continously. For plain SATA
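For plain-SATA monitoring as described, a minimal sketch (device name hypothetical):

```
# One-shot query of health and attributes:
smartctl -a /dev/rwd0d
# Continuous monitoring: run smartd from pkgsrc and list each drive in
# /usr/pkg/etc/smartd.conf, one line per device, e.g.:
#   /dev/rwd0d -a -m root
```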
