Re: current status of zfs block_cloning on CURRENT?
On Mon, Apr 24, 2023 at 9:49 PM Charlie Li wrote:

> Charlie Li wrote:
> > Pete Wright wrote:
> >> i've seen a few threads about the block_cloning feature causing data
> >> corruption issues on CURRENT and have been keen to avoid enabling it
> >> until the dust settles. i was under the impression that we either
> >> reverted or disabled block_cloning on CURRENT, but when i ran "zpool
> >> upgrade" on a pool today it reported block_cloning was enabled. this
> >> is on a system i rebuilt yesterday.
> >>
> > The dust has settled.
> Barely...
> >> i was hoping to get some clarity on the effect of having this feature
> >> enabled, is this enough to trigger the data corruption bug or does
> >> something on the zfs filesystem itself have to be enabled to trigger
> >> this?
> >>
> > The initial problem with block_cloning [0][1] was fixed in commits
> > e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and
> > 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit
> > 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption
> > problem [2][3] was fixed in commit
> > 63ee747febbf024be0aace61161241b53245449e. All were committed between
> > 15-17 April.
> >
> > [0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
> > [1] https://github.com/openzfs/zfs/pull/14739
> > [2] https://github.com/openzfs/zfs/issues/14753
> > [3] https://github.com/openzfs/zfs/pull/14761
> >
> Given mjg@'s thread reporting further crashes/panics, you may want to
> keep the sysctl disabled if you upgraded the pool already.

I thought the plan was to keep it disabled until after 14. And even then, when it comes back in, it will be a new feature. It should never be enabled.

Warner
Re: current status of zfs block_cloning on CURRENT?
Charlie Li wrote:
> Pete Wright wrote:
>> i've seen a few threads about the block_cloning feature causing data
>> corruption issues on CURRENT and have been keen to avoid enabling it
>> until the dust settles. i was under the impression that we either
>> reverted or disabled block_cloning on CURRENT, but when i ran "zpool
>> upgrade" on a pool today it reported block_cloning was enabled. this
>> is on a system i rebuilt yesterday.
>>
> The dust has settled.

Barely...

>> i was hoping to get some clarity on the effect of having this feature
>> enabled, is this enough to trigger the data corruption bug or does
>> something on the zfs filesystem itself have to be enabled to trigger
>> this?
>>
> The initial problem with block_cloning [0][1] was fixed in commits
> e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and
> 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit
> 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption
> problem [2][3] was fixed in commit
> 63ee747febbf024be0aace61161241b53245449e. All were committed between
> 15-17 April.
>
> [0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
> [1] https://github.com/openzfs/zfs/pull/14739
> [2] https://github.com/openzfs/zfs/issues/14753
> [3] https://github.com/openzfs/zfs/pull/14761

Given mjg@'s thread reporting further crashes/panics, you may want to keep the sysctl disabled if you upgraded the pool already.

--
Charlie Li
…nope, still don't have an exit line.
Re: current status of zfs block_cloning on CURRENT?
Pete Wright wrote:
> i've seen a few threads about the block_cloning feature causing data
> corruption issues on CURRENT and have been keen to avoid enabling it
> until the dust settles. i was under the impression that we either
> reverted or disabled block_cloning on CURRENT, but when i ran "zpool
> upgrade" on a pool today it reported block_cloning was enabled. this
> is on a system i rebuilt yesterday.

The dust has settled.

> i was hoping to get some clarity on the effect of having this feature
> enabled, is this enough to trigger the data corruption bug or does
> something on the zfs filesystem itself have to be enabled to trigger
> this?

The initial problem with block_cloning [0][1] was fixed in commits e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption problem [2][3] was fixed in commit 63ee747febbf024be0aace61161241b53245449e. All were committed between 15-17 April.

[0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
[1] https://github.com/openzfs/zfs/pull/14739
[2] https://github.com/openzfs/zfs/issues/14753
[3] https://github.com/openzfs/zfs/pull/14761

--
Charlie Li
…nope, still don't have an exit line.
current status of zfs block_cloning on CURRENT?
hi everyone,

i've seen a few threads about the block_cloning feature causing data corruption issues on CURRENT and have been keen to avoid enabling it until the dust settles. i was under the impression that we either reverted or disabled block_cloning on CURRENT, but when i ran "zpool upgrade" on a pool today it reported block_cloning was enabled. this is on a system i rebuilt yesterday.

i was hoping to get some clarity on the effect of having this feature enabled, is this enough to trigger the data corruption bug or does something on the zfs filesystem itself have to be enabled to trigger this?

i also noticed there is no entry for this feature in zpool-features(7), hence i thought i was safe to upgrade my pool.

thanks in advance,
-pete

--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA
Re: /lib/libc.so.7 vs. __libc_start1@FBSD_1.7 in main [so: 14] recently ?
On Sun, Apr 23, 2023 at 10:22 PM Mark Millard wrote:

> [Warner answered my specific question separately. This is about
> something else.]
>
> On Apr 23, 2023, at 20:57, Warner Losh wrote:
>
> > On Sun, Apr 23, 2023 at 9:40 PM Simon J. Gerraty wrote:
> >> Mark Millard wrote:
> >> > I will not get into why, but I executed a git built for 1400082
> >> > in a 1400081 world context and got what was to me a surprise,
> >> > given that /lib/libc.so.7 is part of 13.2-RELEASE :
> >> >
> >> > ld-elf.so.1: /usr/local/bin/git: Undefined symbol "__libc_start1@FBSD_1.7"
> >>
> >> This is a symptom of trying to run a prog built for target on a host
> >> which is not the same.
> >>
> >> I hit this a lot recently while updating Makefile.depend files for
> >> userland.
> >>
> >> There are a number of makefiles (eg for sh, csh, awk) which need to run
> >> a tool on the host to generate something.
> >> When trying to build 14.0 on a 13.1 host each of those tools failed with
> >> the above issue until actually built for the host.
> >
> > Your path is messed up then. We always run (a copy of) the host's binaries
> > for these tools.
>
> For the kernel's vers.c generation, git is used but
> does not get a build of its own under buildworld or
> buildkernel as far as I know: not a bootstrap or
> staged tool.

Correct. The host's git is assumed to always be good and always executing in a sane env. And you can just remove / take git out of the path if you hit problems here.

> > If you were running the 14 binaries on 13 as part of the
> > build process, the path is messed up. I'm not surprised for dirdep
> > since it doesn't do all the staging activities that buildworld.
>
> git use is not covered by buildworld or kernel-toolchain
> staging activities as far as I know.
>
> Is git the only example of such for things used by buildworld
> or buildkernel ?

buildkernel is the only place I know that git is used to get the tip of git branch for messages. I think that reproducible builds omit this.

Warner

> >> AFAIK the non-DIRDEPS_BUILD build does a separate pass through the tree
> >> to do the target build-tools to build these things.
> >
> > Yes and no... We copy the host's tools when we can, and build a matched set of
> > binary and libraries when the host one isn't good enough. I think it's a path
> > issue you are seeing...
> >
> > Also, "copy" isn't a physical copy because macos hates copied binaries
> > due to security concerns.
> >
> >> The DIRDEPS_BUILD uses a pseudo MACHINE "host" to deal with such things,
> >> ideally those tools would be built in a subdirectory of sh, csh etc, so
> >> that one can choose to build only that tool if desired - sometimes you
> >> want to build the app (eg awk) for the host as well but usually not.
> >
> > Yea, buildworld deals with this by creating new binaries and installing them in
> > a special directory, which is somewhat similar (though we always build
> > them rather than on demand like dirdep hopes to do).
>
> ===
> Mark Millard
> marklmi at yahoo.com
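Editorial sketch of the "take git out of the path" suggestion above. This is only an illustration, not something from the FreeBSD build system: the function name `path_without_tool` is hypothetical, and the approach of dropping whole directories is an assumption that may be too blunt in practice, since a directory such as /usr/local/bin holds many other tools besides git.

```python
import os


def path_without_tool(path_value: str, tool: str) -> str:
    """Return a PATH string with every directory holding an executable `tool` removed.

    Hypothetical helper: it drops the whole directory, so any other
    program living next to `tool` disappears from the PATH as well.
    """
    kept = []
    for directory in path_value.split(os.pathsep):
        candidate = os.path.join(directory, tool)
        # Skip directories in which `tool` exists and is executable.
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            continue
        kept.append(directory)
    return os.pathsep.join(kept)
```

A build could then be started with a modified environment, for example by passing `dict(os.environ, PATH=path_without_tool(os.environ["PATH"], "git"))` as the `env` argument of `subprocess.run`.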
Re: another crash and going forward with zfs
On 4/18/23, Pawel Jakub Dawidek wrote:
> On 4/18/23 05:14, Mateusz Guzik wrote:
>> On 4/17/23, Pawel Jakub Dawidek wrote:
>>> Correct me if I'm wrong, but from my understanding there were zero
>>> problems with block cloning when it wasn't in use or now disabled.
>>>
>>> The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to exactly
>>> avoid mess like this and give us more time to sort all the problems out
>>> while making it easy for people to try it.
>>>
>>> If there is no plan to revert the whole import, I don't see what value
>>> removing just block cloning will bring if it is now disabled by default
>>> and didn't cause any problems when disabled.
>>>
>>
>> The feature definitely was not properly stress tested and what not and
>> trying to do it keeps running into panics. Given the complexity of the
>> feature I would expect there are many bug lurking, some of which
>> possibly related to the on disk format. Not having to deal with any of
>> this is can be arranged as described above and is imo the most
>> sensible route given the timeline for 14.0
>
> Block cloning doesn't create, remove or modify any on-disk data until it
> is in use.
>
> Again, if we are not going to revert the whole merge, I see no point in
> reverting block cloning as until it is enabled, its code is not
> executed. This allow people who upgraded the pools to do nothing special
> and it will allow people to test it easily.
>

Some people will zpool upgrade out of habit or whatever after moving to 14.0, which will then make them unable to go back to 13.x if woes show up.

Woes don't even have to be zfs-related. This is a major release; one has to expect some breakage, and for some users the best way forward may be to downgrade (e.g., with boot environments). As is, they won't be able to do that if they zpool upgrade.

If someone *does* zpool upgrade and there is further data corruption due to block cloning (which you really can't rule out given that the feature so far did not survive under load), the telephone game is going to turn this into "14.0 corrupts data" and no amount of clarifying about an optional feature is going to help the press.

If anything, the real question is how the feature got merged upstream when:
1. FreeBSD CI for the project is offline
2. There is no Linux support
3. ... basic usage showed numerous bugs

Should the feature get whipped into shape, it can be a 14.1 candidate.

--
Mateusz Guzik
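Editorial sketch relating to the enabled versus in-use distinction argued over above. It assumes the usual three pool feature states (disabled, enabled, active) and that the values are obtained with `zpool get -H -o value feature@block_cloning POOL` and `sysctl -n vfs.zfs.bclone_enabled`; the function `summarize` and its wording are hypothetical, and it deliberately makes no claim about downgrade compatibility.

```python
# Pool feature flags are reported in one of these three states (assumed).
FEATURE_STATES = ("disabled", "enabled", "active")


def summarize(feature_state: str, sysctl_value: str) -> str:
    """Describe a block_cloning feature state together with the sysctl knob.

    feature_state: value printed by
        zpool get -H -o value feature@block_cloning POOL
    sysctl_value: value printed by
        sysctl -n vfs.zfs.bclone_enabled
    """
    state = feature_state.strip()
    if state not in FEATURE_STATES:
        raise ValueError(f"unexpected feature state: {state!r}")
    knob = "on" if sysctl_value.strip() == "1" else "off"

    if state == "disabled":
        return "feature not enabled on this pool"
    if state == "enabled":
        return f"feature enabled on pool, no cloned blocks yet (sysctl {knob})"
    # Remaining case: "active".
    return f"feature active, cloned blocks present (sysctl {knob})"
```

For the situation in the original question (pool upgraded, sysctl left at its default of disabled), `summarize("enabled", "0")` gives "feature enabled on pool, no cloned blocks yet (sysctl off)".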
Re: dbus broken?
unsubscribe

On Mon, Apr 24, 2023 at 10:02 AM Lizbeth Mutterhunt, Ph.D wrote:

> When nothing at all comes to mind anymore, one doesn't shout anything either. My mute reminder!
>
> Begin forwarded message:
>
> From: "Lizbeth Mutterhunt, Ph.D"
> Date: 24 April 2023 at 08:58:54 CEST
> To: current-us...@freebsd.org
> Subject: dbus broken?
>
> As I tried the CURRENT image from Apr. 4th, I noticed the dbus package
> is broken, even when building from ports.
>
> The image from Apr. 20th is OK again. Something went wrong on Apr. 4th
> (missing dbus config).
>
> Just for information: everything is now fine again on my Acer
> Chromebook 314!
>
> Thanks for your patience!
>
> Liz
>
> When nothing at all comes to mind anymore, one doesn't shout anything either. My mute reminder!

--
George Kontostanos
---
Fwd: dbus broken?
When nothing at all comes to mind anymore, one doesn't shout anything either. My mute reminder!

Begin forwarded message:

> From: "Lizbeth Mutterhunt, Ph.D"
> Date: 24 April 2023 at 08:58:54 CEST
> To: current-us...@freebsd.org
> Subject: dbus broken?
>
> As I tried the CURRENT image from Apr. 4th, I noticed the dbus package
> is broken, even when building from ports.
>
> The image from Apr. 20th is OK again. Something went wrong on Apr. 4th
> (missing dbus config).
>
> Just for information: everything is now fine again on my Acer
> Chromebook 314!
>
> Thanks for your patience!
>
> Liz
>
> When nothing at all comes to mind anymore, one doesn't shout anything either. My mute reminder!
Re: /lib/libc.so.7 vs. __libc_start1@FBSD_1.7 in main [so: 14] recently ?
Warner Losh wrote:
> > > ld-elf.so.1: /usr/local/bin/git: Undefined symbol "__libc_start1@FBSD_1.7"
> >
> > This is a symptom of trying to run a prog built for target on a host
> > which is not the same.
>
> Your path is messed up then. We always run (a copy of) the host's binaries

I wasn't using the targets you are referring to.

But the point I was making was that trying to run a target binary on a host which is not the same will result in the error observed, and is thus a likely explanation.

--sjg
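Editorial sketch for picking apart the runtime linker message quoted in this thread. It assumes the line has exactly the shape shown above (`ld-elf.so.1: BINARY: Undefined symbol "NAME@VERSION"`); the function name `parse_undefined_symbol` is hypothetical, and other rtld messages are not handled.

```python
import re

# Matches only the message shape quoted in the thread (an assumption).
_UNDEFINED = re.compile(
    r'^ld-elf\.so\.1: (?P<binary>\S+): '
    r'Undefined symbol "(?P<symbol>[^@"]+)(?:@(?P<version>[^"]+))?"$'
)


def parse_undefined_symbol(line: str):
    """Split an rtld 'Undefined symbol' line into binary, symbol and version.

    Returns a dict with keys 'binary', 'symbol' and 'version'
    (version is None when the symbol carries no @VERSION suffix),
    or None when the line does not have the expected shape.
    """
    match = _UNDEFINED.match(line.strip())
    if match is None:
        return None
    return match.groupdict()
```

Applied to the line from the thread, this separates the binary (/usr/local/bin/git) from the symbol (__libc_start1) and the symbol version it was linked against (FBSD_1.7), which is the part the older libc.so.7 does not provide.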