Re: current status of zfs block_cloning on CURRENT?

2023-04-24 Thread Warner Losh
On Mon, Apr 24, 2023 at 9:49 PM Charlie Li  wrote:

> Charlie Li wrote:
> > Pete Wright wrote:
> >> i've seen a few threads about the block_cloning feature causing data
> >> corruption issues on CURRENT and have been keen to avoid enabling it
> >> until the dust settles.  i was under the impression that we either
> >> reverted or disabled block_cloning on CURRENT, but when i ran "zpool
> >> upgrade" on a pool today it reported block_cloning was enabled.  this
> >> is on a system i rebuilt yesterday.
> >>
> > The dust has settled.
> Barely...
> >> i was hoping to get some clarity on the effect of having this feature
> >> enabled, is this enough to trigger the data corruption bug or does
> >> something on the zfs filesystem itself have to be enabled to trigger
> >> this?
> >>
> > The initial problem with block_cloning [0][1] was fixed in commits
> > e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and
> > 1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit
> > 068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption
> > problem [2][3] was fixed in commit
> > 63ee747febbf024be0aace61161241b53245449e. All were committed between
> > 15 and 17 April.
> >
> > [0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
> > [1] https://github.com/openzfs/zfs/pull/14739
> > [2] https://github.com/openzfs/zfs/issues/14753
> > [3] https://github.com/openzfs/zfs/pull/14761
> >
> Given mjg@'s thread reporting further crashes/panics, you may want to
> keep the sysctl disabled if you upgraded the pool already.
>

I thought the plan was to keep it disabled until after 14. And even then,
when it comes back in, it will be a new feature. It should never be enabled.

Warner


Re: current status of zfs block_cloning on CURRENT?

2023-04-24 Thread Charlie Li

Charlie Li wrote:

Pete Wright wrote:
i've seen a few threads about the block_cloning feature causing data 
corruption issues on CURRENT and have been keen to avoid enabling it 
until the dust settles.  i was under the impression that we either 
reverted or disabled block_cloning on CURRENT, but when i ran "zpool 
upgrade" on a pool today it reported block_cloning was enabled.  this 
is on a system i rebuilt yesterday.



The dust has settled.

Barely...
i was hoping to get some clarity on the effect of having this feature 
enabled, is this enough to trigger the data corruption bug or does 
something on the zfs filesystem itself have to be enabled to trigger 
this?


The initial problem with block_cloning [0][1] was fixed in commits 
e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and 
1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit 
068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption 
problem [2][3] was fixed in commit 
63ee747febbf024be0aace61161241b53245449e. All were committed between 
15 and 17 April.


[0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
[1] https://github.com/openzfs/zfs/pull/14739
[2] https://github.com/openzfs/zfs/issues/14753
[3] https://github.com/openzfs/zfs/pull/14761
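
To check whether a given source tree already contains these fixes, a
minimal sketch using git (the /usr/src path is illustrative):

  # exits 0 (and prints "fixed") if the fix is an ancestor of HEAD
  git -C /usr/src merge-base --is-ancestor \
      63ee747febbf024be0aace61161241b53245449e HEAD && echo fixed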

Given mjg@'s thread reporting further crashes/panics, you may want to 
keep the sysctl disabled if you upgraded the pool already.
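
For anyone who has already upgraded, a minimal sketch of how to check the
feature state and keep the cloning code path pinned off (the pool name
"tank" is illustrative):

  # "enabled" means on-disk support is committed but unused;
  # "active" means cloned blocks already exist on the pool
  zpool get feature@block_cloning tank

  # keep the cloning code path off, and persist that across reboots
  sysctl vfs.zfs.bclone_enabled=0
  echo 'vfs.zfs.bclone_enabled=0' >> /etc/sysctl.conf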


--
Charlie Li
…nope, still don't have an exit line.





Re: current status of zfs block_cloning on CURRENT?

2023-04-24 Thread Charlie Li

Pete Wright wrote:
i've seen a few threads about the block_cloning feature causing data 
corruption issues on CURRENT and have been keen to avoid enabling it 
until the dust settles.  i was under the impression that we either 
reverted or disabled block_cloning on CURRENT, but when i ran "zpool 
upgrade" on a pool today it reported block_cloning was enabled.  this is 
on a system i rebuilt yesterday.



The dust has settled.
i was hoping to get some clarity on the effect of having this feature 
enabled, is this enough to trigger the data corruption bug or does 
something on the zfs filesystem itself have to be enabled to trigger this?


The initial problem with block_cloning [0][1] was fixed in commits 
e0bb199925565a3770733afd1a4d8bb2d4d0ce31 and 
1959e122d9328b31a62ff7508e1746df2857b592, with a sysctl added in commit 
068913e4ba3dd9b3067056e832cefc5ed264b5cc. A different data corruption 
problem [2][3] was fixed in commit 
63ee747febbf024be0aace61161241b53245449e. All were committed between 
15 and 17 April.


[0] https://github.com/openzfs/zfs/pull/13392#issuecomment-1504239103
[1] https://github.com/openzfs/zfs/pull/14739
[2] https://github.com/openzfs/zfs/issues/14753
[3] https://github.com/openzfs/zfs/pull/14761

--
Charlie Li
…nope, still don't have an exit line.





current status of zfs block_cloning on CURRENT?

2023-04-24 Thread Pete Wright

hi everyone,
i've seen a few threads about the block_cloning feature causing data 
corruption issues on CURRENT and have been keen to avoid enabling it 
until the dust settles.  i was under the impression that we either 
reverted or disabled block_cloning on CURRENT, but when i ran "zpool 
upgrade" on a pool today it reported block_cloning was enabled.  this is 
on a system i rebuilt yesterday.


i was hoping to get some clarity on the effect of having this feature 
enabled, is this enough to trigger the data corruption bug or does 
something on the zfs filesystem itself have to be enabled to trigger this?


i also noticed there is no entry for this feature in zpool-features(7), 
hence i thought i was safe to upgrade my pool.
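
for anyone else checking before they commit, a minimal sketch ("tank" is an
illustrative pool name):

  # list pools that still have features to enable, and which ones
  zpool upgrade

  # show per-feature state (disabled/enabled/active) without upgrading
  zpool get all tank | grep feature@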


thanks in advance,
-pete

--
Pete Wright
p...@nomadlogic.org
@nomadlogicLA




Re: /lib/libc.so.7 vs. __libc_start1@FBSD_1.7 in main [so: 14] recently ?

2023-04-24 Thread Warner Losh
On Sun, Apr 23, 2023 at 10:22 PM Mark Millard  wrote:

> [Warner answered my specific question separately. This is about
> something else.]
>
> On Apr 23, 2023, at 20:57, Warner Losh  wrote:
>
> > On Sun, Apr 23, 2023 at 9:40 PM Simon J. Gerraty wrote:
> >> Mark Millard  wrote:
> >> > I will not get into why, but I executed a git built for 1400082
> >> > in a 1400081 world context and got what was to me a surprise,
> >> > given that /lib/libc.so.7 is part of 13.2-RELEASE :
> >> >
> >> > ld-elf.so.1: /usr/local/bin/git: Undefined symbol
> >> > "__libc_start1@FBSD_1.7"
> >>
> >> This is a symptom of trying to run a program built for the target on a
> >> host which is not the same.
> >>
> >> I hit this a lot recently while updating Makefile.depend files for
> >> userland.
> >>
> >> There are a number of makefiles (e.g. for sh, csh, awk) which need to
> >> run a tool on the host to generate something.
> >> When trying to build 14.0 on a 13.1 host, each of those tools failed
> >> with the above issue until actually built for the host.
> >
> > Your path is messed up then. We always run (a copy of) the host's
> > binaries for these tools.
>
> For the kernel's vers.c generation, git is used but
> does not get a build of its own under buildworld or
> buildkernel as far as I know: not a bootstrap or
> staged tool.
>

Correct. The host's git is assumed to always be good and to always execute
in a sane environment. And you can just take git out of the path if you hit
problems here.
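
For example, a hedged sketch (the PATH value is illustrative; the point is
only that it omits whichever directory holds git):

  env PATH=/sbin:/bin:/usr/sbin:/usr/bin make -C /usr/src buildkernel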


> > If you were running the 14 binaries on 13 as part of the
> > build process, the path is messed up. I'm not surprised for dirdep
> > since it doesn't do all the staging activities that buildworld does.
>
> git use is not covered by buildworld or kernel-toolchain
> staging activities as far as I know.
>
> Is git the only example of such for things used by buildworld
> or buildkernel ?
>

buildkernel is the only place I know of where git is used, to get the tip of
the git branch for version messages. I think reproducible builds omit this.

Warner


> >> AFAIK the non-DIRDEPS_BUILD build does a separate pass through the tree
> >> running the build-tools target to build these things.
> >
> > Yes and no... We copy the host's tools when we can, and build a matched
> > set of binaries and libraries when the host ones aren't good enough. I
> > think it's a path issue you are seeing...
> >
> > Also, "copy" isn't a physical copy because macos hates copied binaries
> due to security concerns.
> >
> >>
> >> The DIRDEPS_BUILD uses a pseudo MACHINE "host" to deal with such things,
> >> ideally those tools would be built in a subdirectory of sh, csh etc, so
> >> that one can choose to build only that tool if desired - sometimes you
> >> want to build the app (eg awk) for the host as well but usually not.
> >
> > Yea, buildworld deals with this by creating new binaries and installing
> > them in a special directory, which is somewhat similar (though we always
> > build them rather than on demand as dirdep hopes to do).
>
>
> ===
> Mark Millard
> marklmi at yahoo.com
>
>
>


Re: another crash and going forward with zfs

2023-04-24 Thread Mateusz Guzik
On 4/18/23, Pawel Jakub Dawidek  wrote:
> On 4/18/23 05:14, Mateusz Guzik wrote:
>> On 4/17/23, Pawel Jakub Dawidek  wrote:
>>> Correct me if I'm wrong, but from my understanding there were zero
>>> problems with block cloning when it wasn't in use or now disabled.
>>>
>>> The reason I've introduced vfs.zfs.bclone_enabled sysctl, was to exactly
>>> avoid mess like this and give us more time to sort all the problems out
>>> while making it easy for people to try it.
>>>
>>> If there is no plan to revert the whole import, I don't see what value
>>> removing just block cloning will bring if it is now disabled by default
>>> and didn't cause any problems when disabled.
>>>
>>
>> The feature definitely was not properly stress tested, and trying to do
>> so keeps running into panics. Given the complexity of the feature I would
>> expect there are many bugs lurking, some of which are possibly related to
>> the on-disk format. Not having to deal with any of this can be arranged
>> as described above and is IMO the most sensible route given the timeline
>> for 14.0.
>
> Block cloning doesn't create, remove or modify any on-disk data until it
> is in use.
>
> Again, if we are not going to revert the whole merge, I see no point in
> reverting block cloning as until it is enabled, its code is not
> executed. This allows people who upgraded their pools to do nothing special,
> and it will allow people to test it easily.
>

Some people will zpool upgrade out of habit or whatever after moving
to 14.0, which will then make them unable to go back to 13.x if woes
show up.

Woes don't even have to be ZFS-related. This is a major release; one has to
suspect there will be some breakage, and the best way forward for some users
may be to downgrade (e.g., with boot environments). As is, they won't be able
to do that if they zpool upgrade.
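
A minimal sketch of that safety net with bectl(8) (the BE name is
illustrative); note it only helps as long as the pool has not been upgraded
past what 13.x can import:

  # snapshot the running system into a boot environment before upgrading
  bectl create 13.2-pre-upgrade

  # if 14.0 misbehaves, make it the default again and reboot
  bectl activate 13.2-pre-upgrade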

If someone *does* zpool upgrade and there is further data corruption
due to block cloning (which you really can't rule out given that the
feature so far did not survive under load), the telephone game is going to
turn this into "14.0 corrupts data", and no amount of clarifying about
an optional feature is going to help the press.

If anything, the real question is how the feature got merged upstream when:
1. FreeBSD CI for the project is offline
2. There is no Linux support
3. ... basic usage showed numerous bugs

Should the feature get whipped into shape, it can be a 14.1 candidate.

-- 
Mateusz Guzik 



Re: dbus broken?

2023-04-24 Thread George Kontostanos
unsubscribe


On Mon, Apr 24, 2023 at 10:02 AM Lizbeth Mutterhunt, Ph.D
 wrote:
>
>
>
> When nothing at all comes to mind, you don't say anything either. My mute reminder!
>
> Begin forwarded message:
>
> From: "Lizbeth Mutterhunt, Ph.D" 
> Date: 24 April 2023 at 08:58:54 CEST
> To: current-us...@freebsd.org
> Subject: dbus broken?
>
> When I tried the CURRENT image from Apr. 4th, I noticed that the dbus
> package is broken, even when building from ports.
>
> The image from Apr. 20th is OK again. Something went wrong on Apr. 4th
> (missing dbus config).
>
> Just for informational purposes: everything is fine again on my Acer
> Chromebook 314!
>
> Thx for your patience!
>
> Liz
>
> When nothing at all comes to mind, you don't say anything either. My mute reminder!



-- 
George Kontostanos
---



Fwd: dbus broken?

2023-04-24 Thread Lizbeth Mutterhunt, Ph.D


When nothing at all comes to mind, you don't say anything either. My mute reminder!

Begin forwarded message:

> Van: "Lizbeth Mutterhunt, Ph.D" 
> Datum: 24 april 2023 om 08:58:54 CEST
> Aan: current-us...@freebsd.org
> Onderwerp: dbus broken?
> 
> When I tried the CURRENT image from Apr. 4th, I noticed that the dbus
> package is broken, even when building from ports.
>
> The image from Apr. 20th is OK again. Something went wrong on Apr. 4th
> (missing dbus config).
>
> Just for informational purposes: everything is fine again on my Acer
> Chromebook 314!
> 
> Thx for your patience!
> 
> Liz 
> 
> When nothing at all comes to mind, you don't say anything either. My mute reminder!


Re: /lib/libc.so.7 vs. __libc_start1@FBSD_1.7 in main [so: 14] recently ?

2023-04-24 Thread Simon J. Gerraty
Warner Losh  wrote:
> > ld-elf.so.1: /usr/local/bin/git: Undefined symbol "__libc_start1@FBSD_1.7"
> 
> This is a symptom of trying to run a program built for the target on a
> host which is not the same.

> Your path is messed up then. We always run (a copy of) the host's binaries

I wasn't using the targets you are referring to.

But the point I was making was that trying to run a target binary on a host
which is not the same will result in the error observed, and is thus a
likely explanation.
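
A quick way to confirm that diagnosis, as a hedged sketch: compare what the
binary needs against what the host's libc provides.

  # symbol versions defined by the host libc (a 13.x libc lacks FBSD_1.7)
  readelf -V /lib/libc.so.7 | grep FBSD_1.7

  # versioned symbols the failing binary requires
  readelf --dyn-syms /usr/local/bin/git | grep __libc_start1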

--sjg