Re: ZFS: time to drop Big Scary Warning

2021-03-25 Thread Andreas Gustafsson
Greg Troxel wrote:
> > That's a good test, but how does zfs compare on the same test with, let's
> > say, ffs or ext2fs (filesystems that offer persistence)?
> 
> With the same system, booted in the same way, but with 3 different
> filesystems mounted on /tmp, I get similar numbers of failures:
> 
> tmpfs 12
> ffs2  13
> zfs   18
> 
> So tmpfs/ffs2 are ~equal and zfs has a few more failures (but it all
> looks a bit random and non-repeatable).  So it's hard to sort out "zfs
> is buggy" vs "some tests fail in timing-related hard-to-understand ways
> and that seems provoked slightly more with /tmp on zfs".

Since I'm all too familiar with the randomly failing tests in the
NetBSD test suite, let me sort them out for you.  Of the test cases in
the zfs run that didn't match the releng results, these two are known
random failures reported in PR 55770:

>   ./usr.bin/cc/t_tsan_data_race:data_race_pie
>   ./usr.bin/c++/t_tsan_data_race:data_race_pie

This is probably the known random failure reported in PR 55692:

>   ./fs/nfs/t_rquotad:get_nfs_be_1_group

This failure is already reported in PR 55603, though your report is
the first of it failing on real hardware:

>   ./modules/t_x86_pte:svs_g_bit_set

This one is reported as randomly failing in PR 55331, but since that
problem appears to be tmpfs related, it may be the case that tmpfs and
zfs are both buggy:

>   ./lib/libarchive/t_libarchive:libarchive

The remaining four failed test cases are _not_ known to fail randomly,
and as far as I know, do not have existing PRs.  Since they all
involve file system operations, it seems likely that they are in fact
zfs related in some way:

>   ./bin/cp/t_cp:file_to_file
>   ./lib/libc/stdlib/t_mktemp:mktemp_large_template
>   ./lib/libc/sys/t_stat:stat_chflags
>   ./usr.bin/ztest/t_ztest:assert
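
The triage above amounts to a lookup from failing test case to known PR,
with whatever is left over being a candidate for a new report.  A minimal
Python sketch of that bookkeeping, using only the test names and PR numbers
quoted in this message.  The names `zfs_only_failures`, `known_prs` and
`unreported` are made up for illustration, and the PR 55692 entry is only a
"probably" match as stated above:

```python
# Test cases from the zfs run that did not match the releng results,
# in the order they are discussed in this message.
zfs_only_failures = [
    "usr.bin/cc/t_tsan_data_race:data_race_pie",
    "usr.bin/c++/t_tsan_data_race:data_race_pie",
    "fs/nfs/t_rquotad:get_nfs_be_1_group",
    "modules/t_x86_pte:svs_g_bit_set",
    "lib/libarchive/t_libarchive:libarchive",
    "bin/cp/t_cp:file_to_file",
    "lib/libc/stdlib/t_mktemp:mktemp_large_template",
    "lib/libc/sys/t_stat:stat_chflags",
    "usr.bin/ztest/t_ztest:assert",
]

# Known PRs as quoted in this message (nothing here is looked up elsewhere).
known_prs = {
    "usr.bin/cc/t_tsan_data_race:data_race_pie": 55770,
    "usr.bin/c++/t_tsan_data_race:data_race_pie": 55770,
    "fs/nfs/t_rquotad:get_nfs_be_1_group": 55692,  # "probably", per the message
    "modules/t_x86_pte:svs_g_bit_set": 55603,
    "lib/libarchive/t_libarchive:libarchive": 55331,
}

# Anything without a known PR is a candidate for a new, possibly zfs-related report.
unreported = [t for t in zfs_only_failures if t not in known_prs]

for test in unreported:
    print(test)
```

This prints the four test cases listed last above, i.e. the ones that
currently have no PR.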

-- 
Andreas Gustafsson, g...@gson.org


Re: ZFS: time to drop Big Scary Warning

2021-03-25 Thread Christos Zoulas
I think that is good enough. We should document the timing-related tests and 
try to fix them!

christos

> On Mar 25, 2021, at 2:06 PM, Greg Troxel wrote:
> 
> chris...@astron.com (Christos Zoulas) writes:
> 
>> That's a good test, but how does zfs compare on the same test with, let's
>> say, ffs or ext2fs (filesystems that offer persistence)?
> 
> With the same system, booted in the same way, but with 3 different
> filesystems mounted on /tmp, I get similar numbers of failures:
> 
> tmpfs 12
> ffs2  13
> zfs   18
> 
> So tmpfs/ffs2 are ~equal and zfs has a few more failures (but it all
> looks a bit random and non-repeatable).  So it's hard to sort out "zfs
> is buggy" vs "some tests fail in timing-related hard-to-understand ways
> and that seems provoked slightly more with /tmp on zfs".
> 
> Did you mean something else?
> 
> 





Re: ZFS: time to drop Big Scary Warning

2021-03-25 Thread Greg Troxel

chris...@astron.com (Christos Zoulas) writes:

> That's a good test, but how does zfs compare on the same test with, let's
> say, ffs or ext2fs (filesystems that offer persistence)?

With the same system, booted in the same way, but with 3 different
filesystems mounted on /tmp, I get similar numbers of failures:

tmpfs 12
ffs2  13
zfs   18

So tmpfs/ffs2 are ~equal and zfs has a few more failures (but it all
looks a bit random and non-repeatable).  So it's hard to sort out "zfs
is buggy" vs "some tests fail in timing-related hard-to-understand ways
and that seems provoked slightly more with /tmp on zfs".

Did you mean something else?




Re: ZFS: time to drop Big Scary Warning

2021-03-23 Thread Christos Zoulas
In article ,
Greg Troxel wrote:
>which is also similar, but slightly different.
>
>So overall I conclude that there's nothing terrible going on, and that
>these results are in the same class of mostly passing but somewhat
>irregular as the base case.  So work to do, but it doesn't support "ZFS
>is scary".
>
>(Of course, the system stayed up through the tests and has no apparent
>trouble, or I would have said.)
>
>As an aside, it would be nice if atf-test used TMPDIR or had an argument
>to say where to run the tests.

That's a good test, but how does zfs compare on the same test with, let's
say, ffs or ext2fs (filesystems that offer persistence)?

Best,

christos



Re: ZFS: time to drop Big Scary Warning

2021-03-23 Thread Greg Troxel

I got a suggestion to run atf with a ZFS tmp.  This is all with current
from around March 1, and is straight current, no Xen.

Creating tank0/tmp with its mountpoint set to /tmp created the dataset,
but the mount itself failed with some sort of "busy" error; I already had
a tmpfs mounted there.  After rebooting, zfs got mounted first and then
tmpfs on top of it, so I unmounted tmpfs and was left with a zfs /tmp.
So I'm not sure what's up, but it feels like a tmpfs issue more than a zfs
issue, and not a big deal.  Or maybe it's a feature that you can't mount
over tmpfs.


With /tmp being tmpfs, my results are similar to the releng runs.  I've
indented the entries that don't match the releng results by two spaces.

Failed test cases:
  lib/libc/sys/t_futex_ops:futex_wait_timeout_deadline
lib/libc/sys/t_ptrace_waitid:syscall_signal_on_sce
lib/libc/sys/t_truncate:truncate_err
  lib/librumpclient/t_exec:threxec
net/if_wg/t_misc:wg_rekey
  usr.bin/cc/t_tsan_data_race:data_race
usr.bin/make/t_make:archive
usr.bin/c++/t_tsan_data_race:data_race
usr.sbin/cpuctl/t_cpuctl:nointr
usr.sbin/cpuctl/t_cpuctl:offline
fs/ffs/t_quotalimit:slimit_le_1_user
modules/t_x86_pte:rwx

Summary for 903 test programs:
9570 passed test cases.
12 failed test cases.
73 expected failed test cases.
530 skipped test cases.

With /tmp being zfs:tank0/tmp, I get

Failed test cases:
  ./bin/cp/t_cp:file_to_file
  ./lib/libarchive/t_libarchive:libarchive
  ./lib/libc/stdlib/t_mktemp:mktemp_large_template
./lib/libc/sys/t_ptrace_waitid:syscall_signal_on_sce
  ./lib/libc/sys/t_stat:stat_chflags
./lib/libc/sys/t_truncate:truncate_err
./net/if_wg/t_misc:wg_rekey
  ./usr.bin/cc/t_tsan_data_race:data_race_pie
./usr.bin/make/t_make:archive
  ./usr.bin/ztest/t_ztest:assert
./usr.bin/c++/t_tsan_data_race:data_race
  ./usr.bin/c++/t_tsan_data_race:data_race_pie
./usr.sbin/cpuctl/t_cpuctl:nointr
./usr.sbin/cpuctl/t_cpuctl:offline
  ./fs/nfs/t_rquotad:get_nfs_be_1_group
./modules/t_x86_pte:rwx
  ./modules/t_x86_pte:svs_g_bit_set

Summary for 903 test programs:
9567 passed test cases.
17 failed test cases.
72 expected failed test cases.
529 skipped test cases.

which is also similar, but slightly different.
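
The difference between the two runs can be extracted mechanically.  A
minimal Python sketch, assuming the two "Failed test cases" lists above are
pasted in as plain text; `parse`, `tmpfs_failed` and `zfs_failed` are
illustrative names, not part of atf:

```python
def parse(report):
    """Turn a pasted 'Failed test cases' list into a set of test names."""
    names = set()
    for line in report.splitlines():
        name = line.strip()          # indentation only marks releng mismatches
        if name.startswith("./"):    # the zfs run lists paths with a leading ./
            name = name[2:]
        if name:
            names.add(name)
    return names


tmpfs_failed = parse("""
  lib/libc/sys/t_futex_ops:futex_wait_timeout_deadline
lib/libc/sys/t_ptrace_waitid:syscall_signal_on_sce
lib/libc/sys/t_truncate:truncate_err
  lib/librumpclient/t_exec:threxec
net/if_wg/t_misc:wg_rekey
  usr.bin/cc/t_tsan_data_race:data_race
usr.bin/make/t_make:archive
usr.bin/c++/t_tsan_data_race:data_race
usr.sbin/cpuctl/t_cpuctl:nointr
usr.sbin/cpuctl/t_cpuctl:offline
fs/ffs/t_quotalimit:slimit_le_1_user
modules/t_x86_pte:rwx
""")

zfs_failed = parse("""
  ./bin/cp/t_cp:file_to_file
  ./lib/libarchive/t_libarchive:libarchive
  ./lib/libc/stdlib/t_mktemp:mktemp_large_template
./lib/libc/sys/t_ptrace_waitid:syscall_signal_on_sce
  ./lib/libc/sys/t_stat:stat_chflags
./lib/libc/sys/t_truncate:truncate_err
./net/if_wg/t_misc:wg_rekey
  ./usr.bin/cc/t_tsan_data_race:data_race_pie
./usr.bin/make/t_make:archive
  ./usr.bin/ztest/t_ztest:assert
./usr.bin/c++/t_tsan_data_race:data_race
  ./usr.bin/c++/t_tsan_data_race:data_race_pie
./usr.sbin/cpuctl/t_cpuctl:nointr
./usr.sbin/cpuctl/t_cpuctl:offline
  ./fs/nfs/t_rquotad:get_nfs_be_1_group
./modules/t_x86_pte:rwx
  ./modules/t_x86_pte:svs_g_bit_set
""")

only_zfs = sorted(zfs_failed - tmpfs_failed)    # failed only with /tmp on zfs
only_tmpfs = sorted(tmpfs_failed - zfs_failed)  # failed only with /tmp on tmpfs
both = sorted(zfs_failed & tmpfs_failed)        # failed in both runs

print("only zfs:  ", len(only_zfs))
print("only tmpfs:", len(only_tmpfs))
print("both:      ", len(both))
```

With these two lists, eight test cases fail in both runs, nine fail only
with /tmp on zfs, and four fail only with /tmp on tmpfs.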

So overall I conclude that there's nothing terrible going on, and that
these results are in the same class of mostly passing but somewhat
irregular as the base case.  So work to do, but it doesn't support "ZFS
is scary".

(Of course, the system stayed up through the tests and has no apparent
trouble, or I would have said.)

As an aside, it would be nice if atf-test used TMPDIR or had an argument
to say where to run the tests.
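
Purely as an illustration of the behaviour wished for here (an explicit
argument first, then TMPDIR, then a default), a Python sketch follows.
`pick_work_dir` and the `--work-dir` option are hypothetical names; nothing
in it describes how atf actually behaves:

```python
import argparse


def pick_work_dir(argv, env, default="/tmp"):
    """Choose a scratch directory: explicit option, then TMPDIR, then default."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--work-dir", default=None)  # hypothetical option name
    args, _unknown = parser.parse_known_args(argv)   # ignore unrelated arguments
    return args.work_dir or env.get("TMPDIR") or default


# The explicit argument wins over the environment.
print(pick_work_dir(["--work-dir", "/tank0/tmp"], {"TMPDIR": "/var/tmp"}))
# Without the argument, TMPDIR is used.
print(pick_work_dir([], {"TMPDIR": "/var/tmp"}))
# With neither, fall back to the default.
print(pick_work_dir([], {}))
```

Taking the argument before the environment variable keeps a one-off run
from depending on whatever TMPDIR happens to be set in the shell.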




Re: ZFS: time to drop Big Scary Warning

2021-03-20 Thread Greg Troxel

"J. Hannken-Illjes" writes:

>> On 19. Mar 2021, at 21:18, Michael wrote:
>> 
>> On Fri, 19 Mar 2021 15:57:18 -0400
>> Greg Troxel wrote:
>> 
>>> Even in current, zfs has a Big Scary Warning.  Lots of people are using
>>> it and it seems quite solid, especially by -current standards.  So it
>>> feels like time to drop the warning.
>>> 
>>> I am not proposing dropping the warning in 9.
>>> 
>>> Objections/comments?
>> 
>> I've been using it on sparc64 without issues for a while now.
>> Does nfs sharing work these days? I dimly remember problems there.
>
> If you mean misc/55042: Panic when creating a directory on a NFS served ZFS
> it should be fixed in -current.

I have a box running current/amd64 from about March 4, with a zpool on a
disklabel partition, and a filesystem from that exported, mounted on a
9/amd64 box, and did the mkdir test and it was totally fine.   I was
able to have the maproot segfault happen, before the fix.  So yes, this
is fixed.


So summarizing:

  nobody has said there is any remaining serious issue

  many remember issues about NFS (true) but they all seem ok now

and I just looked over the open PRs and w.r.t. current don't see
anything serious.





Re: ZFS: time to drop Big Scary Warning

2021-03-19 Thread J. Hannken-Illjes
> On 19. Mar 2021, at 21:18, Michael wrote:
> 
> Hello,
> 
> On Fri, 19 Mar 2021 15:57:18 -0400
> Greg Troxel wrote:
> 
>> Even in current, zfs has a Big Scary Warning.  Lots of people are using
>> it and it seems quite solid, especially by -current standards.  So it
>> feels like time to drop the warning.
>> 
>> I am not proposing dropping the warning in 9.
>> 
>> Objections/comments?
> 
> I've been using it on sparc64 without issues for a while now.
> Does nfs sharing work these days? I dimly remember problems there.

If you mean misc/55042: Panic when creating a directory on a NFS served ZFS
it should be fixed in -current.

--
J. Hannken-Illjes - hann...@eis.cs.tu-bs.de - TU Braunschweig




Re: ZFS: time to drop Big Scary Warning

2021-03-19 Thread Michael
Hello,

On Fri, 19 Mar 2021 15:57:18 -0400
Greg Troxel wrote:

> Even in current, zfs has a Big Scary Warning.  Lots of people are using
> it and it seems quite solid, especially by -current standards.  So it
> feels like time to drop the warning.
> 
> I am not proposing dropping the warning in 9.
> 
> Objections/comments?

I've been using it on sparc64 without issues for a while now.
Does nfs sharing work these days? I dimly remember problems there.

have fun
Michael


ZFS: time to drop Big Scary Warning

2021-03-19 Thread Greg Troxel

Even in current, zfs has a Big Scary Warning.  Lots of people are using
it and it seems quite solid, especially by -current standards.  So it
feels like time to drop the warning.

I am not proposing dropping the warning in 9.

Objections/comments?

