Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On Mon 25/03/2024 at 23:40, David Christensen wrote:
> On 3/25/24 15:05, Gareth Evans wrote:
>> On Fri 22/03/2024 at 21:01, Gareth Evans wrote:
>>> As anyone interested can see from the ref to #15933 in the below, there
>>> seems to have been considerable effort in getting to grips with this bug
>>> (actually multiple bugs), and it looks like a fix may be forthcoming,
>>> though not sure at the time of writing if there may be some further
>>> polishing first
>>>
>>> https://github.com/openzfs/zfs/pull/16019
>>
>> https://github.com/openzfs/zfs/issues/15933
>>
>> is now closed as completed with fix
>>
>> https://github.com/openzfs/zfs/commit/102b468b5e190973fbaee6fe682727eb33079811
>>
>> which for the moment necessarily adds synchronous writes.
>>
>> FYI.
>> Gareth
>
> Thank you for keeping an eye on this.
>
> Looking at the github commit, the C code makes me worry -- it does not
> appear to use traditional C/C++ thread-safe programming techniques such
> as I learned in CS and used when I did systems programming (e.g. guard
> functions, critical sections, locks, semaphores, etc.). Do I need to
> look at more enclosing code to see such, are those techniques missing,
> are there some newer techniques I do not understand, or something else?

I don't know, I will have a look too, though my C[++] is almost as rusty
as my Rust :)

G
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On 3/25/24 15:05, Gareth Evans wrote:
> On Fri 22/03/2024 at 21:01, Gareth Evans wrote:
>> As anyone interested can see from the ref to #15933 in the below, there
>> seems to have been considerable effort in getting to grips with this bug
>> (actually multiple bugs), and it looks like a fix may be forthcoming,
>> though not sure at the time of writing if there may be some further
>> polishing first
>>
>> https://github.com/openzfs/zfs/pull/16019
>
> https://github.com/openzfs/zfs/issues/15933
>
> is now closed as completed with fix
>
> https://github.com/openzfs/zfs/commit/102b468b5e190973fbaee6fe682727eb33079811
>
> which for the moment necessarily adds synchronous writes.
>
> FYI.
> Gareth

Thank you for keeping an eye on this.

Looking at the github commit, the C code makes me worry -- it does not
appear to use traditional C/C++ thread-safe programming techniques such
as I learned in CS and used when I did systems programming (e.g. guard
functions, critical sections, locks, semaphores, etc.). Do I need to
look at more enclosing code to see such, are those techniques missing,
are there some newer techniques I do not understand, or something else?

David
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On Fri 22/03/2024 at 21:01, Gareth Evans wrote:
> As anyone interested can see from the ref to #15933 in the below, there seems
> to have been considerable effort in getting to grips with this bug (actually
> multiple bugs), and it looks like a fix may be forthcoming, though not sure
> at the time of writing if there may be some further polishing first
>
> https://github.com/openzfs/zfs/pull/16019

https://github.com/openzfs/zfs/issues/15933

is now closed as completed with fix

https://github.com/openzfs/zfs/commit/102b468b5e190973fbaee6fe682727eb33079811

which for the moment necessarily adds synchronous writes.

FYI.
Gareth
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
> On 27 Feb 2024, at 23:47, Gareth Evans wrote:
> On Tue 27/02/2024 at 22:52, David Christensen wrote:
>> ...
>> These appear to be the ZFS packages for the available Debian releases:
>>
>> https://packages.debian.org/buster/zfs-dkms
>>
>> buster              zfs-dkms (0.7.12-2+deb10u2)
>> buster-backports    zfs-dkms (2.0.3-9~bpo10+1)
>> bullseye            zfs-dkms (2.0.3-9+deb11u1)
>> bullseye-backports  zfs-dkms (2.1.11-1~bpo11+1)
>> bookworm            zfs-dkms (2.1.11-1)
>> bookworm-backports  zfs-dkms (2.2.2-4~bpo12+1)
>> trixie              zfs-dkms (2.2.2-4)
>>
>> The question is, how far back to go? Is OpenZFS 2.1.x buggy? OpenZFS
>> 2.0.x? What is 0.7.12 -- OpenZFS, ZFS-on-Linux, or something else --
>> and is it buggy?
>
> This seems to be very "involved"! The discussion in #15526 suggests a
> coreutils upgrade (particularly re. "cp") in combination with the addition of
> the zpool block cloning feature seems to have triggered the issue, which may
> have gone undetected for some time.
>
>>> After downgrading coreutils from 9.3 to 8.32, I am no longer able to
>>> reproduce this corruption.
>> This seems to solve the corruption issue on my end too.
> -- https://github.com/openzfs/zfs/issues/15526#issuecomment-1810472547
>
> See also
> https://www.reddit.com/r/zfs/comments/1826lgs/psa_its_not_block_cloning_its_a_data_corruption/
>
> Debian users can't follow the gentoo/emerge-based reproduction/trigger steps
> for build of golang in
> https://github.com/openzfs/zfs/issues/15526 (for zfs 2.2.0)
> and
> https://github.com/openzfs/zfs/issues/15933 (for 2.2.3)
>
> If anyone can recommend steps to debianise these (15933 seem most likely to
> be useful, and slightly different), I would be happy to test openzfs 2.2.2-4
> from bookworm-backports on deb 12.5
>
> Given that the original gentoo reporter, who seems to have tested
> extensively, considered the issue closed after upgrade to openzfs 2.2.2
>
> https://bugs.gentoo.org/917224#c26
>
> I wonder if the 2.2.3 issue is similar/related, or perhaps there are multiple
> triggers.
>
> Watching with interest.
>
> Best wishes,
> Gareth

As anyone interested can see from the ref to #15933 in the below, there
seems to have been considerable effort in getting to grips with this bug
(actually multiple bugs), and it looks like a fix may be forthcoming,
though not sure at the time of writing if there may be some further
polishing first

https://github.com/openzfs/zfs/pull/16019
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On Tue 27/02/2024 at 22:52, David Christensen wrote:
> ...
> These appear to be the ZFS packages for the available Debian releases:
>
> https://packages.debian.org/buster/zfs-dkms
>
> buster              zfs-dkms (0.7.12-2+deb10u2)
> buster-backports    zfs-dkms (2.0.3-9~bpo10+1)
> bullseye            zfs-dkms (2.0.3-9+deb11u1)
> bullseye-backports  zfs-dkms (2.1.11-1~bpo11+1)
> bookworm            zfs-dkms (2.1.11-1)
> bookworm-backports  zfs-dkms (2.2.2-4~bpo12+1)
> trixie              zfs-dkms (2.2.2-4)
>
> The question is, how far back to go? Is OpenZFS 2.1.x buggy? OpenZFS
> 2.0.x? What is 0.7.12 -- OpenZFS, ZFS-on-Linux, or something else --
> and is it buggy?

This seems to be very "involved"! The discussion in #15526 suggests a
coreutils upgrade (particularly re. "cp") in combination with the addition
of the zpool block cloning feature seems to have triggered the issue, which
may have gone undetected for some time.

>> After downgrading coreutils from 9.3 to 8.32, I am no longer able to
>> reproduce this corruption.
> This seems to solve the corruption issue on my end too.
-- https://github.com/openzfs/zfs/issues/15526#issuecomment-1810472547

See also
https://www.reddit.com/r/zfs/comments/1826lgs/psa_its_not_block_cloning_its_a_data_corruption/

Debian users can't follow the gentoo/emerge-based reproduction/trigger steps
for build of golang in
https://github.com/openzfs/zfs/issues/15526 (for zfs 2.2.0)
and
https://github.com/openzfs/zfs/issues/15933 (for 2.2.3)

If anyone can recommend steps to debianise these (15933 seem most likely to
be useful, and slightly different), I would be happy to test openzfs 2.2.2-4
from bookworm-backports on deb 12.5

Given that the original gentoo reporter, who seems to have tested
extensively, considered the issue closed after upgrade to openzfs 2.2.2

https://bugs.gentoo.org/917224#c26

I wonder if the 2.2.3 issue is similar/related, or perhaps there are multiple
triggers.

Watching with interest.

Best wishes,
Gareth
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On 2/26/24 20:52, Gareth Evans wrote:
> Replied to OP by mistake, reposting to list.
>
> On Sun 25/02/2024 at 05:34, David Christensen wrote:
>> debian-user:
>>
>> Is Debian 12.5.0 amd64 affected by OpenZFS bug #15526?
>>
>> https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.5.0-amd64-netinst.iso
>>
>> https://packages.debian.org/bookworm/zfs-dkms
>>
>> https://github.com/openzfs/zfs/issues/15526
>
> Hi David,
>
> Given the complexity of the issues, I'm not sure if this truly answers
> your question, but
>
> https://github.com/openzfs/zfs/issues/15933
>
> seems to suggest that or a similar issue is still ongoing with Open ZFS
> 2.2.3, which is later than the version currently available from bookworm
> or bookworm-backports.
>
> It seems bookworm-backports might eventually provide the solution, if at
> all, per the Debian wiki on ZFS:
>
> "it is recommended by Debian ZFS on Linux Team to install ZFS related
> packages from Backports archive. Upstream stable patches will be tracked
> and compatibility is always maintained."
>
> https://wiki.debian.org/ZFS
>
> Currently:
>
> $ apt policy zfs-dkms
> zfs-dkms:
>   Installed: 2.2.2-4~bpo12+1
>   Candidate: 2.2.2-4~bpo12+1
>   Version table:
>  *** 2.2.2-4~bpo12+1 100
>         100 https://deb.debian.org/debian bookworm-backports/contrib amd64 Packages
>         100 https://deb.debian.org/debian bookworm-backports/contrib i386 Packages
>         100 /var/lib/dpkg/status
>      2.1.11-1 500
>         500 https://deb.debian.org/debian bookworm/contrib amd64 Packages
>         500 https://deb.debian.org/debian bookworm/contrib i386 Packages
>
> Hope that helps.
> Gareth

Thank you for citing OpenZFS bug #15933.

These appear to be the ZFS packages for the available Debian releases:

https://packages.debian.org/buster/zfs-dkms

buster              zfs-dkms (0.7.12-2+deb10u2)
buster-backports    zfs-dkms (2.0.3-9~bpo10+1)
bullseye            zfs-dkms (2.0.3-9+deb11u1)
bullseye-backports  zfs-dkms (2.1.11-1~bpo11+1)
bookworm            zfs-dkms (2.1.11-1)
bookworm-backports  zfs-dkms (2.2.2-4~bpo12+1)
trixie              zfs-dkms (2.2.2-4)

The question is, how far back to go? Is OpenZFS 2.1.x buggy? OpenZFS
2.0.x? What is 0.7.12 -- OpenZFS, ZFS-on-Linux, or something else -- and
is it buggy?

David
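As a rough aid to the "how far back" question, here is a minimal Python sketch that groups the releases listed above by upstream OpenZFS series. The version strings are copied from the list in this message; the names `ZFS_DKMS`, `upstream_version` and `by_series` are made up for illustration. It only strips the Debian revision (and any epoch) from each version string and says nothing about which series are actually affected by #15526 or #15933.

```python
# Release -> zfs-dkms version, copied from the list in the message above.
ZFS_DKMS = {
    "buster": "0.7.12-2+deb10u2",
    "buster-backports": "2.0.3-9~bpo10+1",
    "bullseye": "2.0.3-9+deb11u1",
    "bullseye-backports": "2.1.11-1~bpo11+1",
    "bookworm": "2.1.11-1",
    "bookworm-backports": "2.2.2-4~bpo12+1",
    "trixie": "2.2.2-4",
}


def upstream_version(debian_version):
    """Return the upstream part of a Debian version string.

    Debian versions have the form [epoch:]upstream[-revision]; the
    revision is whatever follows the last hyphen.
    """
    version = debian_version
    if ":" in version:
        version = version.split(":", 1)[1]   # drop the epoch
    if "-" in version:
        version = version.rsplit("-", 1)[0]  # drop the Debian revision
    return version


def by_series(packages):
    """Group release names by upstream major.minor series."""
    groups = {}
    for release, version in packages.items():
        series = ".".join(upstream_version(version).split(".")[:2])
        groups.setdefault(series, []).append(release)
    return groups


if __name__ == "__main__":
    for series, releases in by_series(ZFS_DKMS).items():
        print(series, ", ".join(releases))
```

Grouping by series rather than by full version keeps the question at the level it was asked (2.0.x, 2.1.x, 2.2.x), so each series can then be checked against the upstream issue reports by hand.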
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
On Tue 27/02/2024 at 04:52, Gareth Evans wrote:
> https://github.com/openzfs/zfs/issues/15933
>
> seems to suggest that or a similar issue is still ongoing with Open ZFS
> 2.2.3 ...

I wonder if that might be a regression, since what I think is the same
issue as openzfs #15526 appeared to be resolved in ZFS 2.2.2

"I've been unable to reproduce this after upgrading to zfs-2.2.2"
https://bugs.gentoo.org/917224

I imagine openzfs problems (and solutions) are likely to be the same
across distributions?

Kind regards,
G
Re: Debian 12.5.0 amd64 and OpenZFS bug #15526
Replied to OP by mistake, reposting to list.

On Sun 25/02/2024 at 05:34, David Christensen wrote:
> debian-user:
>
> Is Debian 12.5.0 amd64 affected by OpenZFS bug #15526?
>
> https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.5.0-amd64-netinst.iso
>
> https://packages.debian.org/bookworm/zfs-dkms
>
> https://github.com/openzfs/zfs/issues/15526

Hi David,

Given the complexity of the issues, I'm not sure if this truly answers
your question, but

https://github.com/openzfs/zfs/issues/15933

seems to suggest that or a similar issue is still ongoing with Open ZFS
2.2.3, which is later than the version currently available from bookworm
or bookworm-backports.

It seems bookworm-backports might eventually provide the solution, if at
all, per the Debian wiki on ZFS:

"it is recommended by Debian ZFS on Linux Team to install ZFS related
packages from Backports archive. Upstream stable patches will be tracked
and compatibility is always maintained."

https://wiki.debian.org/ZFS

Currently:

$ apt policy zfs-dkms
zfs-dkms:
  Installed: 2.2.2-4~bpo12+1
  Candidate: 2.2.2-4~bpo12+1
  Version table:
 *** 2.2.2-4~bpo12+1 100
        100 https://deb.debian.org/debian bookworm-backports/contrib amd64 Packages
        100 https://deb.debian.org/debian bookworm-backports/contrib i386 Packages
        100 /var/lib/dpkg/status
     2.1.11-1 500
        500 https://deb.debian.org/debian bookworm/contrib amd64 Packages
        500 https://deb.debian.org/debian bookworm/contrib i386 Packages

Hope that helps.
Gareth
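For anyone who wants to script this check across several machines, here is a minimal Python sketch that pulls the installed and candidate versions out of `apt policy` output. The sample text is the output quoted in the message above; the function name `parse_apt_policy` is hypothetical, and the parsing assumes the output keeps the layout shown here.

```python
import re

# Sample copied from the `apt policy zfs-dkms` output quoted above.
SAMPLE = """\
zfs-dkms:
  Installed: 2.2.2-4~bpo12+1
  Candidate: 2.2.2-4~bpo12+1
  Version table:
 *** 2.2.2-4~bpo12+1 100
        100 https://deb.debian.org/debian bookworm-backports/contrib amd64 Packages
        100 https://deb.debian.org/debian bookworm-backports/contrib i386 Packages
        100 /var/lib/dpkg/status
     2.1.11-1 500
        500 https://deb.debian.org/debian bookworm/contrib amd64 Packages
        500 https://deb.debian.org/debian bookworm/contrib i386 Packages
"""

# A version-table entry is "<version> <priority>", optionally prefixed by
# "***" for the installed version. Source lines start with the priority
# and end in a path or "Packages", so they do not match.
ENTRY = re.compile(r"^(?:\*\*\*\s+)?(\S+)\s+(-?\d+)$")


def parse_apt_policy(text):
    """Return (installed, candidate, [(version, priority), ...])."""
    installed = candidate = None
    versions = []
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("Installed:"):
            installed = stripped.split(":", 1)[1].strip()
        elif stripped.startswith("Candidate:"):
            candidate = stripped.split(":", 1)[1].strip()
        else:
            match = ENTRY.match(stripped)
            if match:
                versions.append((match.group(1), int(match.group(2))))
    return installed, candidate, versions


if __name__ == "__main__":
    installed, candidate, versions = parse_apt_policy(SAMPLE)
    print("installed:", installed)
    print("candidate:", candidate)
    for version, priority in versions:
        print(f"available: {version} (priority {priority})")
```

On a live system the same function could be fed the captured output of `apt policy zfs-dkms`, though `apt` itself warns that its command-line output is not a stable interface for scripts, so treat this as a convenience rather than something to depend on.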
Debian 12.5.0 amd64 and OpenZFS bug #15526
debian-user:

Is Debian 12.5.0 amd64 affected by OpenZFS bug #15526?

https://cdimage.debian.org/debian-cd/current/amd64/iso-cd/debian-12.5.0-amd64-netinst.iso

https://packages.debian.org/bookworm/zfs-dkms

https://github.com/openzfs/zfs/issues/15526

David