Re: svn commit: r362848 - in stable/12/sys: net netinet sys

2020-07-20 Thread Konstantin Belousov
On Tue, Jul 21, 2020 at 07:20:44AM +1000, Peter Jeremy wrote:
> On 2020-Jul-19 14:48:28 +0300, Konstantin Belousov wrote:
> >On Sun, Jul 19, 2020 at 09:21:02PM +1000, Peter Jeremy wrote:
> >> I'm sending this to -stable, rather than the src groups because I
> >> don't believe the problem is the commit itself, rather the commit
> >> has uncovered a latent problem elsewhere.
> >> 
> >> On 2020-Jul-01 18:03:38 +, Michael Tuexen  wrote:
> >> >Author: tuexen
> >> >Date: Wed Jul  1 18:03:38 2020
> >> >New Revision: 362848
> >> >URL: https://svnweb.freebsd.org/changeset/base/362848
> >> >
> >> >Log:
> >> >  MFC r353480: Use event handler in SCTP
> >> 
> >> I have no idea how, but this update breaks booting amd64 for me (r362847
> >> works and this doesn't).  I have a custom kernel with ZFS but no SCTP so I
> >> have no real idea how this could break booting - presumably the
> >> eventhandler change has uncovered a bug somewhere else.
> >> 
> >> The symptoms are that I get:
> >> Mounting from zfs:zroot/ROOT/r363310 failed with error 6; retrying for 3 more seconds
> >> Mounting from zfs:zroot/ROOT/r363310 failed with error 6
> >> 
> >> (r363310 is where I was trying to update to and I didn't change the BE
> >> name as I was searching for the problem and error 6 is ENXIO).
> >> 
> >> I tried to reproduce the problem with GENERIC but it hangs after
> >> displaying the EFI framebuffer information (I've seen that before and
> >> suspect it is a loader problem but haven't dug into it).
> 
> I've confirmed that particular problem is bug 209821.  I've disabled
> EFI and GENERIC r362848 boots and runs successfully.
Did you mistype the PR number?  The referenced bug talks about a very
early hang, while your report said that the kernel boots up to the point
of mounting root.

> 
> >> Does anyone have any ideas?
> >
> >Did you check that the physical devices where your ZFS pool is located
> >are detected, and that the kernel messages for their drivers are as usual?
> >Overall, is there anything strange in the verbose dmesg?
> 
> There's nothing obviously strange (in particular, I can see the physical
> boot/root disk) but the faulty kernel appears to have moved the msgbuf
> somewhere unexpected so it's not saved across reboots and I'm limited to
> eyeballing the messages via DDB.
> 
> Since GENERIC worked, I did some more experimenting and tracked the
> problem down to a lack of "options ACPI_DMAR" in my kernel config.
> That makes more sense, though I have no idea why it suddenly became
> mandatory for my system.
No, this does not make much sense either, since DMAR is disabled
by default.  Did you enable it?

BTW, you are using stable, right?  There were some code reorganization
commits in HEAD moving DMAR code around, but they were not merged to
stable.
___
freebsd-stable@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/freebsd-stable
To unsubscribe, send any mail to "freebsd-stable-unsubscr...@freebsd.org"


Re: svn commit: r362848 - in stable/12/sys: net netinet sys

2020-07-20 Thread Peter Jeremy
On 2020-Jul-19 14:48:28 +0300, Konstantin Belousov  wrote:
>On Sun, Jul 19, 2020 at 09:21:02PM +1000, Peter Jeremy wrote:
>> I'm sending this to -stable, rather than the src groups because I
>> don't believe the problem is the commit itself, rather the commit
>> has uncovered a latent problem elsewhere.
>> 
>> On 2020-Jul-01 18:03:38 +, Michael Tuexen  wrote:
>> >Author: tuexen
>> >Date: Wed Jul  1 18:03:38 2020
>> >New Revision: 362848
>> >URL: https://svnweb.freebsd.org/changeset/base/362848
>> >
>> >Log:
>> >  MFC r353480: Use event handler in SCTP
>> 
>> I have no idea how, but this update breaks booting amd64 for me (r362847
>> works and this doesn't).  I have a custom kernel with ZFS but no SCTP so I
>> have no real idea how this could break booting - presumably the
>> eventhandler change has uncovered a bug somewhere else.
>> 
>> The symptoms are that I get:
>> Mounting from zfs:zroot/ROOT/r363310 failed with error 6; retrying for 3 more seconds
>> Mounting from zfs:zroot/ROOT/r363310 failed with error 6
>> 
>> (r363310 is where I was trying to update to and I didn't change the BE
>> name as I was searching for the problem and error 6 is ENXIO).
>> 
>> I tried to reproduce the problem with GENERIC but it hangs after
>> displaying the EFI framebuffer information (I've seen that before and
>> suspect it is a loader problem but haven't dug into it).

I've confirmed that particular problem is bug 209821.  I've disabled
EFI and GENERIC r362848 boots and runs successfully.

>> Does anyone have any ideas?
>
>Did you check that the physical devices where your ZFS pool is located
>are detected, and that the kernel messages for their drivers are as usual?
>Overall, is there anything strange in the verbose dmesg?

There's nothing obviously strange (in particular, I can see the physical
boot/root disk) but the faulty kernel appears to have moved the msgbuf
somewhere unexpected so it's not saved across reboots and I'm limited to
eyeballing the messages via DDB.

Since GENERIC worked, I did some more experimenting and tracked the
problem down to a lack of "options ACPI_DMAR" in my kernel config.
That makes more sense, though I have no idea why it suddenly became
mandatory for my system.

-- 
Peter Jeremy




Re: zfs meta data slowness

2020-07-20 Thread Eugene Grosbein
19.07.2020 21:17, mike tancsa wrote:
> Are there any tweaks that can be done to speed up or improve zfs
> metadata performance ? I have a backup server with a lot of snapshots
> (40,000)  and just doing a listing can take a great deal of time.  Best
> case scenario is about 24 seconds, worst case, I have seen it up to 15
> minutes.  (FreeBSD 12.1-STABLE r363078)

I believe you have to perform kernel profiling for your specific use case.



Re: ls colour (COLORTERM / CLICOLOR)

2020-07-20 Thread James Wright




On 20/07/2020 14:26, Kyle Evans wrote:
> On Sat, Jul 18, 2020 at 7:51 PM James Wright wrote:
>> Updated to 12.1-STABLE r363215 a few days ago (previous build was
>> circa 1st June) but seem to have lost "ls" colour output with
>> "COLORTERM=yes" set in my env.
>>
>> Setting "CLICOLOR=yes" seems to enable it again, however the man page
>> states that setting either should work?
>
> Hi,
>
> Indeed, sorry for the flip-flopping. The short version of the
> situation is that I had flipped ls(1) to --color=auto by default based
> on a misunderstanding of defaults elsewhere due to shell aliases that
> I hadn't realized were in use. The ls(1) binary is historically and
> almost universally configured for non-colored by default where color
> support exists, and you should instead use an appropriate shell alias
> for ls: `ls -G` or `ls --color=auto`.
>
> I can see where the manpage could describe the differences a little
> better. CLICOLOR (On FreeBSD) historically meant that we'll enable
> color if the terminal supports it, and setting it would have the same
> effect as the above shell alias. COLORTERM is less aggressive and
> won't imply any specific --color option; you would still need
> --color=auto to go with it for it to have any effect.
>
> Thanks,
>
> Kyle Evans

Thank you for clarifying the differences between CLICOLOR and COLORTERM,
that makes sense to me now. I'll set the shell alias and remove COLORTERM.
Only raised this in case it was an unintended consequence of recent
changes. :-)





Re: ls colour (COLORTERM / CLICOLOR)

2020-07-20 Thread Kyle Evans
On Sat, Jul 18, 2020 at 7:51 PM James Wright wrote:
>
> Updated to 12.1-STABLE r363215 a few days ago (previous build was
> circa 1st June) but seem to have lost "ls" colour output with
> "COLORTERM=yes" set in my env.
>
> Setting "CLICOLOR=yes" seems to enable it again, however the man page
> states that setting either should work?
>

Hi,

Indeed, sorry for the flip-flopping. The short version of the
situation is that I had flipped ls(1) to --color=auto by default based
on a misunderstanding of defaults elsewhere due to shell aliases that
I hadn't realized were in use. The ls(1) binary is historically and
almost universally configured for non-colored by default where color
support exists, and you should instead use an appropriate shell alias
for ls: `ls -G` or `ls --color=auto`.

I can see where the manpage could describe the differences a little
better. CLICOLOR (On FreeBSD) historically meant that we'll enable
color if the terminal supports it, and setting it would have the same
effect as the above shell alias. COLORTERM is less aggressive and
won't imply any specific --color option; you would still need
--color=auto to go with it for it to have any effect.
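
For illustration, a minimal sketch of the alias approach, assuming an
sh-compatible shell. The startup file name (~/.shrc) is an assumption, and
csh/tcsh use a different alias syntax:

```shell
# Sketch for an sh-compatible startup file such as ~/.shrc (assumed location).
# Colour is requested explicitly instead of relying on the ls(1) default.
alias ls='ls -G'

# Alternative spelling named above; use it in place of -G if preferred:
# alias ls='ls --color=auto'
```

With an alias like this in place, COLORTERM is no longer needed for ls
colour output.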

Thanks,

Kyle Evans


Re: zfs meta data slowness

2020-07-20 Thread Ronald Klop

Hi,

My first suggestion would be to remove a lot of snapshots. But that may not
match your business case.
Maybe you can provide more information about your setup:
Amount of RAM, CPU?
output of "zpool status"
output of "zfs list" if possible to share
Type of disks/ssds?
What is the load of the system? I/O per second, etc.
Do you use dedup, GELI?
Something else special about the setup.
output of "top -b"

That kind of information.
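
As a rough sketch, the items above could be gathered into one report with
a small sh script. The helper names run_section and collect_zfs_report are
made up for this example, and the choice of sysctl names and the snapshot
count are my assumptions about what is useful, not a fixed recipe:

```shell
#!/bin/sh
# run_section is a hypothetical helper: print a titled section and run the
# given command, or note that the command is not installed.
run_section() {
    title=$1
    shift
    printf '=== %s ===\n' "$title"
    if command -v "$1" >/dev/null 2>&1; then
        "$@" 2>&1
    else
        printf '(%s not found, skipped)\n' "$1"
    fi
    printf '\n'
}

# collect_zfs_report is a hypothetical wrapper covering the requested items.
collect_zfs_report() {
    run_section "RAM and CPU"    sysctl hw.physmem hw.ncpu hw.model
    run_section "zpool status"   zpool status
    run_section "zfs list"       zfs list
    run_section "snapshot count" sh -c 'zfs list -H -t snapshot -o name | wc -l'
    run_section "top -b"         top -b
}
```

Running `collect_zfs_report > zfs-report.txt` on the affected machine would
produce a single file to attach to a reply. Review it before sharing, since
dataset names may be sensitive.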

Regards,
Ronald.


From: mike tancsa
Date: Sunday, 19 July 2020 16:17
To: FreeBSD-STABLE Mailing List
Subject: zfs meta data slowness


Are there any tweaks that can be done to speed up or improve zfs
metadata performance ? I have a backup server with a lot of snapshots
(40,000)  and just doing a listing can take a great deal of time.  Best
case scenario is about 24 seconds, worst case, I have seen it up to 15
minutes.  (FreeBSD 12.1-STABLE r363078)


ARC Efficiency:                                 79.33b
        Cache Hit Ratio:                92.81%  73.62b
        Cache Miss Ratio:               7.19%   5.71b
        Actual Hit Ratio:               92.78%  73.60b

        Data Demand Efficiency:         96.47%  461.91m
        Data Prefetch Efficiency:       1.00%   262.73m

        CACHE HITS BY CACHE LIST:
          Anonymously Used:             0.01%   3.86m
          Most Recently Used:           3.91%   2.88b
          Most Frequently Used:         96.06%  70.72b
          Most Recently Used Ghost:     0.01%   5.31m
          Most Frequently Used Ghost:   0.01%   10.47m

        CACHE HITS BY DATA TYPE:
          Demand Data:                  0.61%   445.60m
          Prefetch Data:                0.00%   2.63m
          Demand Metadata:              99.36%  73.15b
          Prefetch Metadata:            0.03%   21.00m

        CACHE MISSES BY DATA TYPE:
          Demand Data:                  0.29%   16.31m
          Prefetch Data:                4.56%   260.10m
          Demand Metadata:              95.02%  5.42b
          Prefetch Metadata:            0.14%   7.75m


Other than increasing the metadata max, I haven't really changed any tunables.


ZFS Tunables (sysctl):
kern.maxusers                                4416
vm.kmem_size                                 66691842048
vm.kmem_size_scale                           1
vm.kmem_size_min                             0
vm.kmem_size_max                             1319413950874
vfs.zfs.trim.max_interval                    1
vfs.zfs.trim.timeout                         30
vfs.zfs.trim.txg_delay                       32
vfs.zfs.trim.enabled                         1
vfs.zfs.vol.immediate_write_sz               32768
vfs.zfs.vol.unmap_sync_enabled               0
vfs.zfs.vol.unmap_enabled                    1
vfs.zfs.vol.recursive                        0
vfs.zfs.vol.mode                             1
vfs.zfs.version.zpl                          5
vfs.zfs.version.spa                          5000
vfs.zfs.version.acl                          1
vfs.zfs.version.ioctl                        7
vfs.zfs.debug                                0
vfs.zfs.super_owner                          0
vfs.zfs.immediate_write_sz                   32768
vfs.zfs.sync_pass_rewrite                    2
vfs.zfs.sync_pass_dont_compress              5
vfs.zfs.sync_pass_deferred_free              2
vfs.zfs.zio.dva_throttle_enabled             1
vfs.zfs.zio.exclude_metadata                 0
vfs.zfs.zio.use_uma                          1
vfs.zfs.zio.taskq_batch_pct                  75
vfs.zfs.zil_maxblocksize                     131072
vfs.zfs.zil_slog_bulk                        786432
vfs.zfs.zil_nocacheflush                     0
vfs.zfs.zil_replay_disable                   0
vfs.zfs.cache_flush_disable                  0
vfs.zfs.standard_sm_blksz                    131072
vfs.zfs.dtl_sm_blksz                         4096
vfs.zfs.min_auto_ashift                      9
vfs.zfs.max_auto_ashift                      13
vfs.zfs.vdev.trim_max_pending                1
vfs.zfs.vdev.bio_delete_disable              0
vfs.zfs.vdev.bio_flush_disable               0
vfs.zfs.vdev.def_queue_depth                 32
vfs.zfs.vdev.queue_depth_pct                 1000
vfs.zfs.vdev.write_gap_limit                 4096
vfs.zfs.vdev.read_gap_limit                  32768
vfs.zfs.vdev.aggregation_limit_non_rotating  131072
vfs.zfs.vdev.aggregation_limit               1048576
vfs.zfs.vdev.initializing_max_active         1
vfs.zfs.vdev.initializing_min_active         1
vfs.zfs.vdev.removal_max_active              2
vfs.zfs.vdev.removal_min_active              1
vfs.zfs.vdev.trim_max_active                 64
vfs.zfs.vdev.trim_min_active                 1
vfs.zfs.vdev.scrub_max_active                2
vfs.zfs.vdev.scrub_min_active                1
vfs.zfs.vdev.async_write_max_active          10
vfs.

FreeBSD CI Weekly Report 2020-07-19

2020-07-20 Thread Li-Wen Hsu
(Please send the followup to freebsd-testing@ and note Reply-To is set.)

FreeBSD CI Weekly Report 2020-07-19
===================================

Here is a summary of the FreeBSD Continuous Integration results for the period
from 2020-07-13 to 2020-07-19.

During this period, we have:

* 1699 builds (93.3% (-2.4) passed, 6.7% (+2.4) failed) of buildworld and
  buildkernel (GENERIC and LINT) were executed on aarch64, amd64, armv6,
  armv7, i386, mips, mips64, powerpc, powerpc64, powerpcspe, riscv64,
  sparc64 architectures for head, stable/12, stable/11 branches.
* 191 test runs (88.0% (+1.3) passed, 12.0% (+0) unstable, 0% (-1.3)
  exception) were executed on amd64, i386, riscv64 architectures for head,
  stable/12, stable/11 branches.
* 36 doc and www builds (100% (+0) passed)

Test case status (on 2020-07-19 23:59):
| Branch/Architecture | Total    | Pass      | Fail  | Skipped |
| ------------------- | -------- | --------- | ----- | ------- |
| head/amd64          | 7859 (0) | 7768 (+1) | 0 (0) | 91 (-1) |
| head/i386           | 7857 (0) | 7759 (+1) | 0 (0) | 98 (-1) |
| 12-STABLE/amd64     | 7617 (0) | 7557 (0)  | 0 (0) | 60 (0)  |
| 12-STABLE/i386      | 7615 (0) | 7550 (+3) | 0 (0) | 65 (-3) |
| 11-STABLE/amd64     | 6912 (0) | 6861 (0)  | 0 (0) | 51 (0)  |
| 11-STABLE/i386      | 6910 (0) | 6854 (0)  | 0 (0) | 56 (0)  |

(The statistics from experimental jobs are omitted)
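
As a quick sanity check on the table, each Total should equal
Pass + Fail + Skipped. The sketch below verifies that with sh arithmetic.
The helper name check_row is made up for this example, and the numbers are
copied from the table above:

```shell
#!/bin/sh
# check_row is a hypothetical helper: compare Total with Pass+Fail+Skipped.
check_row() {
    name=$1
    total=$2
    pass=$3
    fail=$4
    skipped=$5
    if [ $((pass + fail + skipped)) -eq "$total" ]; then
        echo "$name ok"
    else
        echo "$name MISMATCH"
    fi
}

check_row head/amd64      7859 7768 0 91
check_row head/i386       7857 7759 0 98
check_row 12-STABLE/amd64 7617 7557 0 60
check_row 12-STABLE/i386  7615 7550 0 65
check_row 11-STABLE/amd64 6912 6861 0 51
check_row 11-STABLE/i386  6910 6854 0 56
```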

If any of the issues found by CI are in your area of interest or expertise
please investigate the PRs listed below.

The latest web version of this report is available at
https://hackmd.io/@FreeBSD-CI/report-20200719 and the archive is available
at https://hackmd.io/@FreeBSD-CI/ . Any help is welcome.


## Failing jobs

* https://ci.freebsd.org/job/FreeBSD-head-amd64-gcc6_build/
  ```
  /usr/local/bin/x86_64-unknown-freebsd12.1-ld: /tmp/obj/workspace/src/amd64.amd64/lib/clang/liblldb/liblldb.a(IOHandlerCursesGUI.o): in function `curses::Window::Box(unsigned int, unsigned int)':
  /workspace/src/contrib/llvm-project/lldb/source/Core/IOHandlerCursesGUI.cpp:361: undefined reference to `box'
  /usr/local/bin/x86_64-unknown-freebsd12.1-ld: /workspace/src/contrib/llvm-project/lldb/source/Core/IOHandlerCursesGUI.cpp:361: undefined reference to `box'
  collect2: error: ld returned 1 exit status
  ```

  From kevans@:
  One of ncurses' scripts that generates box and a bunch of other symbols is
  shooting blanks with gcc6; however, it seems fine on gcc9.

## Regressions

* lib.libexecinfo.backtrace_test.backtrace_fmt_basic starts failing on amd64
  after r360915
  https://bugs.freebsd.org/246537

* lib.msun.ctrig_test.test_inf_inputs starts failing after llvm10 import
  https://bugs.freebsd.org/244732

* Lock-order reversals triggered by tests under sys.net.if_lagg_test.* on i386
  https://bugs.freebsd.org/244163
  Discovered by newly enabled sys.net.* tests.
  ([r357857](https://svnweb.freebsd.org/changeset/base/357857))
  
* sys.net.if_lagg_test.lacp_linkstate_destroy_stress panics i386 kernel
  https://bugs.freebsd.org/244168
  Discovered by newly enabled sys.net.* tests.
  ([r357857](https://svnweb.freebsd.org/changeset/base/357857))
  Fix in review: https://reviews.freebsd.org/D25284

## Failing and Flaky tests (from experimental jobs)

* https://ci.freebsd.org/job/FreeBSD-head-amd64-dtrace_test/
  * cddl.usr.sbin.dtrace.common.misc.t_dtrace_contrib.tst_dynopt_d
    * https://bugs.freebsd.org/237641

* https://ci.freebsd.org/job/FreeBSD-head-amd64-test_zfs/
  * There are ~13 failing and ~109 skipped cases, including flaky ones; see
    https://ci.freebsd.org/job/FreeBSD-head-amd64-test_zfs/lastCompletedBuild/testReport/
    for more details
  * Work on cleaning up these failing cases is in progress

* https://ci.freebsd.org/job/FreeBSD-head-amd64-test_ltp/
  * Total 3749 tests, 2277 success, 647 failures, 825 skipped

## Disabled Tests

* sys.fs.tmpfs.mount_test.large
  https://bugs.freebsd.org/212862
* sys.fs.tmpfs.link_test.kqueue
  https://bugs.freebsd.org/213662
* sys.kqueue.libkqueue.kqueue_test.main
  https://bugs.freebsd.org/233586
* sys.kern.ptrace_test.ptrace__PT_KILL_competing_stop
  https://bugs.freebsd.org/220841
* lib.libc.regex.exhaust_test.regcomp_too_big (i386 only)
  https://bugs.freebsd.org/237450
* sys.netinet.socket_afinet.socket_afinet_bind_zero
  https://bugs.freebsd.org/238781
* sys.netpfil.pf.names.names
* sys.netpfil.pf.synproxy.synproxy
  https://bugs.freebsd.org/238870
* sys.kern.ptrace_test.ptrace__follow_fork_child_detached_unrelated_debugger 
  https://bugs.freebsd.org/239292
* sys.kern.ptrace_test.ptrace__follow_fork_both_attached_unrelated_debugger 
  https://bugs.freebsd.org/239397
* sys.kern.ptrace_test.ptrace__parent_sees_exit_after_child_debugger
  https://bugs.freebsd.org/239399
* sys.kern.ptrace_test.ptrace__follow_fork_parent_detached_unrelated_debugger
  https://bugs.freebsd.org/239425
* sys.sys.qmath_test.qdivq_s64q
  https://bugs.freebsd.org/240219
* sys.kern.ptrace_test.pt