Re: make-memstick.sh creates in 14.0-CURRENT run-away processes

2023-09-03 Thread Matthias Apitz
On Friday, August 18, 2023 at 06:17:42 +0200, Matthias Apitz wrote:

> 
> I used to use the script "make-memstick.sh" in 13.0-CURRENT to
> create memstick images to install the system on smaller devices where
> the OS can't be built from the sources, and it always worked fine for many
> years. Now I'm ready to do so with my freshly compiled system (sources
> from git, August 5):
> 
> $ uname -a
> FreeBSD jet 14.0-CURRENT FreeBSD 14.0-CURRENT amd64 1400094 #0 
> main-n264568-1d7ffb373c9d: Sat Aug  5 17:22:47 CEST 2023 
> guru@jet:/usr/obj/usr/src/amd64.amd64/sys/GENERIC amd64
> 
> but the image is not produced, and some processes create temp
> files of 800+ GByte. Here are the details:
> 
> root@jet:/usr/src/release/amd64 # ./make-memstick.sh /home/guru/140.root 
> ~guru/memstick.img
> Calculated size of `/home/guru/memstick.img.part': 23795073024 bytes, 263113 
> inodes
> Extent size set to 32768
> /home/guru/memstick.img.part: 22692.8MB (46474752 sectors) block size 32768, 
> fragment size 4096
> using 27 cylinder groups of 869.44MB, 27822 blks, 10240 inodes.
> super-block backups (for fsck -b #) at:
>   192,  1780800,  3561408,  5342016,  7122624,  8903232, 10683840,
>  12464448, 14245056, 16025664, 17806272, 19586880, 21367488, 23148096,
>  24928704, 26709312, 28489920, 30270528, 32051136, 33831744, 35612352,
>  37392960, 39173568, 40954176, 42734784, 44515392, 46296000
> Populating `/home/guru/memstick.img.part'
> Image `/home/guru/memstick.img.part' complete
> Creating `/tmp/efiboot.iFachZ'
> /tmp/efiboot.iFachZ: 65528 sectors in 65528 FAT32 clusters (512 bytes/cluster)
> BytesPerSec=512 SecPerClust=1 ResSectors=32 FATs=2 Media=0xf0 SecPerTrack=63 
> Heads=255 HiddenSecs=0 HugeSectors=66584 FATsecs=512 RootCluster=2 FSInfo=1 
> Backup=2
> Populating `/tmp/efiboot.iFachZ'
> Image `/tmp/efiboot.iFachZ' complete
> 
> It says 'complete' but never ends growing the file /tmp/mkimg-oGNnFb:
> 
> root@jet:/usr/home/guru # ls -ltrah /tmp | tail -6
> drwx------   2 guru wheel  512B Aug 18 15:43 tmux-1001
> -rw-------   1 root wheel   33M Aug 18 17:18 efiboot.iFachZ
> -rw-------   1 root wheel    0B Aug 18 17:18 mkimg-4eMWKW
> drwxrwxrwt  21 root wheel  1.0K Aug 18 17:18 .
> drwxr-xr-x  22 root wheel  1.0K Aug 18 17:43 ..
> -rw-------   1 root wheel  850G Aug 18 17:53 mkimg-oGNnFb
> 
> root@jet:/usr/home/guru # ls -ltrh mem*
> -rw-r--r--  1 root wheel   22G Aug 18 17:18 memstick.img.part
> -rw-r--r--  1 root wheel    0B Aug 18 17:18 memstick.img
> 
> Only a hard reset and reboot helps.
> 

(Sorry for the delay; I was away on vacation.)

The last part of the script ./make-memstick.sh, which should produce the final
image but whose mkimg processes never end, is:


...
# Make an ESP in a file.
espfilename=$(mktemp /tmp/efiboot.XXXXXX)
make_esp_file ${espfilename} ${fat32min} ${BASEBITSDIR}/boot/loader.efi

mkimg -s mbr \
-b ${BASEBITSDIR}/boot/mbr \
-p efi:=${espfilename} \
-p freebsd:-"mkimg -s bsd -b ${BASEBITSDIR}/boot/boot -p freebsd-ufs:=${2}.part" \
-a 2 \
-o ${2}
...

I've split the two processes, which were connected by the pipe, and run
them one after the other as:


#!/bin/sh

# set -x

BASEBITSDIR=/home/guru/140.root
img=/home/guru/zdata/memstick.img

espfilename=/home/guru/zdata/efiboot.7S9yjL

mkimg -s bsd -b ${BASEBITSDIR}/boot/boot \
  -p freebsd-ufs:=${img}.part > ${img}.part.mkimg

mkimg -s mbr \
-b ${BASEBITSDIR}/boot/mbr \
-p efi:=${espfilename} \
-p freebsd:=${img}.part.mkimg \
-a 2 \
-o ${img}

# ls -l /home/guru/zdata
# -rw-r--r--  1 root wheel 23795073024 Sep  3 17:12 memstick.img.part
# -rw-------  1 root wheel    34091008 Sep  3 17:12 efiboot.7S9yjL
# -rw-r--r--  1 root wheel 23795081216 Sep  3 18:25 memstick.img.part.mkimg
# -rw-r--r--  1 root wheel 23829172736 Sep  3 18:34 memstick.img

The resulting file 'memstick.img' (copied with dd to a USB key)
boots fine.
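
The byte counts in the listing above also show how the pieces nest. The following is only a sanity-check sketch that uses nothing but those four sizes; the variable names are made up for illustration. The BSD-labelled image should be the UFS image plus a small fixed overhead, and the final image should be the ESP plus the BSD-labelled image plus whatever the MBR scheme adds.

```sh
#!/bin/sh
# Byte counts copied from the "ls -l /home/guru/zdata" output above.
ufs_part=23795073024      # memstick.img.part
esp=34091008              # efiboot.7S9yjL
bsd_img=23795081216       # memstick.img.part.mkimg
final_img=23829172736     # memstick.img

# Extra bytes added by "mkimg -s bsd" around the UFS partition.
bsd_overhead=$((bsd_img - ufs_part))

# Extra bytes added by "mkimg -s mbr" around the ESP and the BSD image.
mbr_overhead=$((final_img - esp - bsd_img))

echo "bsd scheme overhead: ${bsd_overhead} bytes"
echo "mbr scheme overhead: ${mbr_overhead} bytes"
```

The differences come out as 8192 and 512 bytes, which is consistent with a small label/boot area in front of the UFS partition and a single 512-byte sector in front of the MBR partitions. So the two-step variant produces an image of the expected size, and the problem seems confined to the piped (`type:-command`) form of the partition specification.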

Now I'm clueless about why the pipe between

mkimg ... -p freebsd:-"mkimg -s bsd ..." ...

does not work as it should.

matthias


-- 
Matthias Apitz, ✉ g...@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



Re: An attempted test of main's "git: 2ad756a6bbb3" "merge openzfs/zfs@95f71c019" that did not go as planned

2023-09-03 Thread Alexander Motin

Mark,

On 03.09.2023 22:54, Mark Millard wrote:

After that ^t produced the likes of:

load: 6.39  cmd: sh 4849 [tx->tx_quiesce_done_cv] 10047.33r 0.51u 121.32s 1% 
13004k


So the full state is not "tx->tx", but is actually 
"tx->tx_quiesce_done_cv", which means the thread is waiting for a new 
transaction to be opened, which in turn requires some previous one to be 
quiesced and then synced.



#0 0x80b6f103 at mi_switch+0x173
#1 0x80bc0f24 at sleepq_switch+0x104
#2 0x80aec4c5 at _cv_wait+0x165
#3 0x82aba365 at txg_wait_open+0xf5
#4 0x82a11b81 at dmu_free_long_range+0x151


Here it seems the transaction commit is being waited on because of a large 
number of delete operations, which ZFS tries to spread across separate TXGs.  
You should probably see some large and growing number in sysctl 
kstat.zfs.misc.dmu_tx.dmu_tx_dirty_frees_delay .



#5 0x829a87d2 at zfs_rmnode+0x72
#6 0x829b658d at zfs_freebsd_reclaim+0x3d
#7 0x8113a495 at VOP_RECLAIM_APV+0x35
#8 0x80c5a7d9 at vgonel+0x3a9
#9 0x80c5af7f at vrecycle+0x3f
#10 0x829b643e at zfs_freebsd_inactive+0x4e
#11 0x80c598cf at vinactivef+0xbf
#12 0x80c590da at vput_final+0x2aa
#13 0x80c68886 at kern_funlinkat+0x2f6
#14 0x80c68588 at sys_unlink+0x28
#15 0x8106323f at amd64_syscall+0x14f
#16 0x8103512b at fast_syscall_common+0xf8


What we don't see here is what the quiesce and sync threads of the pool are 
actually doing.  The sync thread has plenty of different jobs, including 
async write, async destroy, scrub and others, that may all delay each 
other.


Before you reboot the system, depending on how alive it is, could you 
save a number of outputs of `procstat -akk`, or at least specifically 
`procstat -akk | grep txg_thread_enter` if the full output is hard to 
capture?  Or observe what they are doing in some other way.


`zpool status`, `zpool get all` and `sysctl -a` would not hurt either.
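
One possible way to gather the repeated outputs asked for above is a small sampling loop. This is only a sketch: `collect_samples` is a hypothetical helper, not an existing tool, and the output directory, sample count and interval in the usage comments are arbitrary choices.

```sh
#!/bin/sh
# collect_samples OUTDIR COUNT INTERVAL CMD [ARGS...]
# Hypothetical helper: runs CMD COUNT times, INTERVAL seconds apart,
# and saves each run's stdout and stderr to OUTDIR/<cmd>.<n>.txt.
collect_samples() {
    outdir=$1 count=$2 interval=$3
    shift 3
    mkdir -p "$outdir" || return 1
    name=$(basename "$1")
    i=1
    while [ "$i" -le "$count" ]; do
        "$@" > "$outdir/$name.$i.txt" 2>&1
        i=$((i + 1))
        # Sleep only between samples, not after the last one.
        [ "$i" -le "$count" ] && sleep "$interval"
    done
    return 0
}

# Possible use on the affected machine, as root:
#   collect_samples /var/tmp/txg 5 10 procstat -akk
#   collect_samples /var/tmp/txg 5 10 sysctl kstat.zfs.misc.dmu_tx.dmu_tx_dirty_frees_delay
#   collect_samples /var/tmp/txg 1 0  zpool status
```

If the pool itself is stuck, writing the samples to it might block as well, so an output directory on other storage is probably the safer choice.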

PS: I may be wrong, but USB in "USB3 NVMe SSD storage" makes me shiver. 
Make sure there are no storage problems, like huge delays, timeouts, 
etc., which can show up, for example, as busy percentages regularly spiking 
far above 100% in your `gstat -spod`.


--
Alexander Motin



An attempted test of main's "git: 2ad756a6bbb3" "merge openzfs/zfs@95f71c019" that did not go as planned

2023-09-03 Thread Mark Millard
ThreadRipper 1950X (32 hardware threads) doing bulk -J128
with USE_TMPFS=no , no ALLOW_MAKE_JOBS , no
ALLOW_MAKE_JOBS_PACKAGES , USB3 NVMe SSD storage/ZFS-boot-media,
debug system build in use :

[00:03:44] Building 34214 packages using up to 128 builders
[00:03:44] Hit CTRL+t at any time to see build progress and stats
[00:03:44] [01] [00:00:00] Builder starting
[00:04:37] [01] [00:00:53] Builder started
[00:04:37] [01] [00:00:00] Building ports-mgmt/pkg | pkg-1.20.6
[00:05:53] [01] [00:01:16] Finished ports-mgmt/pkg | pkg-1.20.6: Success
[00:06:15] [01] [00:00:00] Building print/indexinfo | indexinfo-0.3.1
[00:06:15] [02] [00:00:00] Builder starting
. . .
[00:06:18] [128] [00:00:00] Builder starting
[00:07:42] [01] [00:01:27] Finished print/indexinfo | indexinfo-0.3.1: Success
[00:07:45] [01] [00:00:00] Building devel/gettext-runtime | 
gettext-runtime-0.22_1
[00:18:45] [01] [00:11:00] Finished devel/gettext-runtime | 
gettext-runtime-0.22_1: Success
[00:19:06] [01] [00:00:00] Building devel/gmake | gmake-4.3_2
[00:24:13] [01] [00:05:07] Finished devel/gmake | gmake-4.3_2: Success
[00:24:39] [01] [00:00:00] Building devel/libtextstyle | libtextstyle-0.22
[00:31:08] [125] [00:24:50] Builder started
[00:31:08] [125] [00:00:00] Building print/t1utils | t1utils-1.32
[00:31:15] [33] [00:25:00] Builder started
[00:31:15] [81] [00:24:59] Builder started
[00:31:15] [33] [00:00:00] Building databases/xapian-core | xapian-core-1.4.23,1
[00:31:15] [13] [00:25:00] Builder started
[00:31:15] [81] [00:00:00] Building devel/bmake | bmake-20230723
[00:31:15] [13] [00:00:00] Building devel/evdev-proto | evdev-proto-5.8
[00:31:16] [41] [00:25:00] Builder started
[00:31:16] [41] [00:00:00] Building devel/pcre | pcre-8.45_3
. . .

(Looks like lang/go120 ignores the lack of ALLOW_MAKE_JOBS .
There may be others that still have significant parallel
activity.)

[main-amd64-bulk_a-default] [2023-09-03_13h48m45s] [parallel_build:] Queued: 
34588 Built: 727   Failed: 1 Skipped: 40   Ignored: 335   Fetched: 0 
Tobuild: 33485  Time: 01:36:51

(So about 1 hr after the last "Builder starting" it had
built 727.)

For the vast majority of the time: lots of cpdup processes showing
tx->tx for STATE most of the time, but still accumulating some
CPU time.

^T commonly showed various Builders in starting PHASE for
3min..6min.

Around 66% mean Idle time (guess from watching top).

After ^C, "gstat -spod" reports almost constantly
2200 to 2500 writes per second or so for *hours* (still
going on).

ztop reports 1500 to 3200 d/s or so almost always for
Dataset zamd64/poudriere/data/.m instead (also still going
on). Normally no other Dataset is shown.

With all the disk I/O activity, this is definitely "live"
in some sense. But I've no clue if it is just repeating
itself over and over vs. if it is making some sort of progress.

For reference for the ^C and after:

^C[01:39:00] [20] [00:00:03] Building sysutils/linux-c7-dosfstools | 
linux-c7-dosfstools-3.0.20
[01:39:00] [93] [00:07:12] Finished science/dimod | dimod-0.12.11: Success
[01:39:00] Error: Signal SIGINT caught, cleaning up and exiting
[01:39:02] [63] [00:06:34] Finished archivers/unarj | unarj-2.65_2: Success
[01:39:03] [128] [00:07:47] Finished sysutils/shuf | shuf-3.0: Success
[01:39:04] [113] [00:07:06] Finished devel/bsddialog | bsddialog-0.4.1: Success
[main-amd64-bulk_a-default] [2023-09-03_13h48m45s] [sigint:] Queued: 34588 
Built: 752   Failed: 1 Skipped: 40   Ignored: 335   Fetched: 0 
Tobuild: 33460  Time: 01:38:56
[01:39:06] Logs: 
/usr/local/poudriere/data/logs/bulk/main-amd64-bulk_a-default/2023-09-03_13h48m45s
[01:39:14] [12] [00:09:07] Finished archivers/rzip | rzip-2.1_1: Success
[01:39:14] Cleaning up
exit: cannot open ./var/run/49_nohang.pid: No such file or directory
exit: cannot open ./var/run/87_nohang.pid: No such file or directory

After that ^t produced the likes of:

load: 6.39  cmd: sh 4849 [tx->tx_quiesce_done_cv] 10047.33r 0.51u 121.32s 1% 
13004k
#0 0x80b6f103 at mi_switch+0x173
#1 0x80bc0f24 at sleepq_switch+0x104
#2 0x80aec4c5 at _cv_wait+0x165
#3 0x82aba365 at txg_wait_open+0xf5
#4 0x82a11b81 at dmu_free_long_range+0x151
#5 0x829a87d2 at zfs_rmnode+0x72
#6 0x829b658d at zfs_freebsd_reclaim+0x3d
#7 0x8113a495 at VOP_RECLAIM_APV+0x35
#8 0x80c5a7d9 at vgonel+0x3a9
#9 0x80c5af7f at vrecycle+0x3f
#10 0x829b643e at zfs_freebsd_inactive+0x4e
#11 0x80c598cf at vinactivef+0xbf
#12 0x80c590da at vput_final+0x2aa
#13 0x80c68886 at kern_funlinkat+0x2f6
#14 0x80c68588 at sys_unlink+0x28
#15 0x8106323f at amd64_syscall+0x14f
#16 0x8103512b at fast_syscall_common+0xf8

The console/logs do report "witness exhausted":

. . .
Sep  3 13:41:08 amd64-ZFS login[1751]: ROOT LOGIN (root) ON ttyv0
Sep  3 13:51:35 amd64-ZFS kernel: witness_lock_list_get: witness exhausted
Sep  3 14:26:38 amd64-ZFS kernel: pid 27418 (conftest), jid 

Re: kernel 100% CPU

2023-09-03 Thread Mateusz Guzik
On 9/3/23, Graham Perrin  wrote:
> On 03/09/2023 17:55, Mateusz Guzik wrote:
>> On 9/3/23, Graham Perrin  wrote:
>>> On 02/09/2023 18:31, Mateusz Guzik wrote:
>>>> On 9/2/23, Graham Perrin wrote:
>>>>> … I began the trace /after/ the issue became observable.
>>>>> Will it be more meaningful to begin a trace and then reproduce the
>>>>> issue
>>>>> (before the trace ends)?
>>>>>
>>>>> …
>>>> Looks like you have a lot of unrelated traffic in there.
>>>>
>>>> …
>>> Instead,  the
>>> two files from 09:21 this morning. Are these useful?
>>>
>>> Before this run of DTrace, I quit Firefox and other applications that
>>> might be causing noise (and the OS has been restarted since my last run
>>> of poudriere-bulk(8)).
>>>
>>> dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count();
>>> } tick-60s { exit(0); }' -o out.kern_stacks
>>>
>> Post your "sysctl -a" somewhere.
>>
> sysctl-a-2023-09-03-18-22.txt added to the MEGA folder is complete,
> including TSLOG-related lines.
>
> Alternatively, tslog under
>  is automatically
> pruned to exclude such lines. Hopefully not excessively pruned.
>
> TSLOG is one of three things in a Git stash that I apply before most
> builds, .
>

Sorry mate, neglected to specify: collect sysctl -a once you run into
the problem.

Once I look at that I'm probably going to ship some debug patches to
narrow it down.

-- 
Mateusz Guzik 



Re: 100% CPU time for sysctl command, not killable

2023-09-03 Thread Alexander Leidinger

On 2023-09-02 16:56, Mateusz Guzik wrote:

On 8/20/23, Alexander Leidinger  wrote:

Hi,

sysctl kern.maxvnodes=1048576000 results in 100% CPU and a non-killable
sysctl program. This is somewhat unexpected...



fixed here 
https://cgit.freebsd.org/src/commit/?id=32988c1499f8698b41e15ed40a46d271e757bba3


I confirm.

Thanks!
Alexander.

--
http://www.Leidinger.net alexan...@leidinger.net: PGP 0x8F31830F9F2772BF
http://www.FreeBSD.org    netch...@freebsd.org  : PGP 0x8F31830F9F2772BF



Re: kernel 100% CPU

2023-09-03 Thread Graham Perrin

On 03/09/2023 17:55, Mateusz Guzik wrote:

On 9/3/23, Graham Perrin  wrote:

On 02/09/2023 18:31, Mateusz Guzik wrote:

On 9/2/23, Graham Perrin wrote:

… I began the trace /after/ the issue became observable.
Will it be more meaningful to begin a trace and then reproduce the issue
(before the trace ends)?

…

Looks like you have a lot of unrelated traffic in there.

…

Instead,  the
two files from 09:21 this morning. Are these useful?

Before this run of DTrace, I quit Firefox and other applications that
might be causing noise (and the OS has been restarted since my last run
of poudriere-bulk(8)).

dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count();
} tick-60s { exit(0); }' -o out.kern_stacks


Post your "sysctl -a" somewhere.

sysctl-a-2023-09-03-18-22.txt added to the MEGA folder is complete, 
including TSLOG-related lines.


Alternatively, tslog under 
 is automatically 
pruned to exclude such lines. Hopefully not excessively pruned.


TSLOG is one of three things in a Git stash that I apply before most 
builds, .





Re: kernel 100% CPU

2023-09-03 Thread Mateusz Guzik
On 9/3/23, Graham Perrin  wrote:
> On 02/09/2023 18:31, Mateusz Guzik wrote:
>> On 9/2/23, Graham Perrin wrote:
>>> … I began the trace /after/ the issue became observable.
>>> Will it be more meaningful to begin a trace and then reproduce the issue
>>> (before the trace ends)?
>>>
>>> …
>> Looks like you have a lot of unrelated traffic in there.
>>
>> …
>
> Instead,  the
> two files from 09:21 this morning. Are these useful?
>
> Before this run of DTrace, I quit Firefox and other applications that
> might be causing noise (and the OS has been restarted since my last run
> of poudriere-bulk(8)).
>
> dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count();
> } tick-60s { exit(0); }' -o out.kern_stacks
>

Post your "sysctl -a" somewhere.


-- 
Mateusz Guzik 



Re: kernel 100% CPU

2023-09-03 Thread Graham Perrin

On 02/09/2023 18:31, Mateusz Guzik wrote:

On 9/2/23, Graham Perrin wrote:

… I began the trace /after/ the issue became observable.
Will it be more meaningful to begin a trace and then reproduce the issue
(before the trace ends)?

…

Looks like you have a lot of unrelated traffic in there.

…


Instead,  the 
two files from 09:21 this morning. Are these useful?


Before this run of DTrace, I quit Firefox and other applications that 
might be causing noise (and the OS has been restarted since my last run 
of poudriere-bulk(8)).


dtrace -x stackframes=100 -n 'profile-997 /arg0/ { @[stack()] = count(); 
} tick-60s { exit(0); }' -o out.kern_stacks






Re: kernel 100% CPU

2023-09-03 Thread Graham Perrin

On 03/09/2023 15:02, Mateusz Guzik wrote:

On 9/3/23, Graham Perrin  wrote:

…

The script is intended to run when you have git executing for a long time.


Ah, sorry for my poor understanding.

Re: 
, 
I ran it at a time when the symptom (kernel 100% CPU) was observable 
without a recent run of poudriere-bulk(8).





Re: FreeBSD-15 kernel panic when the amdtemp device is in the kernel

2023-09-03 Thread Gary Jennejohn
On Sun, 03 Sep 2023 15:17:36 +0200
"Herbert J. Skuhra"  wrote:

[SNIP]
> Probably best to file a PR: https://bugs.freebsd.org/bugzilla/
>

Bugzilla 273543

--
Gary Jennejohn



Re: kernel 100% CPU

2023-09-03 Thread Mateusz Guzik
On 9/3/23, Graham Perrin  wrote:
> On 03/09/2023 09:01, Juraj Lutter wrote:
>> … The script mjg@ provided is not a shell script.
>>
>> The script filename is “script.d” where you should put the
>> above-mentioned DTrace script (without the "dtrace -s script.d -o out”
>> line).
>
>
> Thanks, I guess that I'm still doing something wrong:
>
>
> root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu
>
> # time dtrace -s script.d -o /tmp/out
> dtrace: script 'script.d' matched 4 probes
> ^C0.246u 4.049s 27:25.70 0.2%   14+91k 261+0io 274pf+0w
> root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu
>
> # cat /tmp/out
>
> CPU ID    FUNCTION:NAME
>   3  2 :END
>
> root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu
>

The script is intended to run when you have git executing for a long time.

-- 
Mateusz Guzik 



Re: kernel 100% CPU

2023-09-03 Thread Graham Perrin

On 03/09/2023 09:01, Juraj Lutter wrote:

… The script mjg@ provided is not a shell script.

The script filename is “script.d” where you should put the
above-mentioned DTrace script (without the "dtrace -s script.d -o out” line).



Thanks, I guess that I'm still doing something wrong:


root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu 
# time dtrace -s script.d -o /tmp/out

dtrace: script 'script.d' matched 4 probes
^C0.246u 4.049s 27:25.70 0.2%   14+91k 261+0io 274pf+0w
root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu 
# cat /tmp/out


CPU ID    FUNCTION:NAME
  3  2 :END

root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu 
#





Re: FreeBSD-15 kernel panic when the amdtemp device is in the kernel

2023-09-03 Thread Herbert J. Skuhra
On Sat, 02 Sep 2023 18:02:03 +0200, Gary Jennejohn wrote:
> 
> On Sat, 02 Sep 2023 15:36:36 +0200
> "Herbert J. Skuhra"  wrote:
> 
> > On Fri, 01 Sep 2023 18:05:34 +0200, Gary Jennejohn wrote:
> > >
> > > On Fri, 1 Sep 2023 14:43:21 +
> > > Gary Jennejohn  wrote:
> > >
> > > > A git-bisect is probably required.
> > > >
> > >
> > > I did a bisect and the result was commit
> > > 9a7add6d01f3c5f7eba811e794cf860d2bce131d.
> > >
> > > However, that can't be correct because this commit was made on
> > > Mon Jul 17 19:29:20 2023 and my FBSD-14 kernel from August 13th
> > > boots successfully :(
> >
> > Commit date is August 19th, 2023(!):
> >
> > commit 9a7add6d01f3c5f7eba811e794cf860d2bce131d
> > Author: Colin Percival
> > AuthorDate: Mon Jul 17 19:29:20 2023 -0700
> > Commit: Colin Percival
> > CommitDate: Sat Aug 19 22:04:56 2023 -0700
> >
> >
> > Reverting this commit seems to resolve the issue for me:
> >
> > FreeBSD 15.0-CURRENT amd64 150 #0 main-n265137-2ad756a6bbb3
> >
> > $ git status
> > On branch main
> > Your branch is up to date with 'freebsd/main'.
> >
> > You are currently reverting commit 9a7add6d01f3.
> >   (all conflicts fixed: run "git revert --continue")
> >   (use "git revert --skip" to skip this patch)
> >   (use "git revert --abort" to cancel the revert operation)
> >
> > Changes to be committed:
> >   (use "git restore --staged ..." to unstage)
> > modified:   sys/kern/init_main.c
> >
> > # dmesg |egrep "(amdsmn|amdtemp)"
> > amdsmn0:  on hostb0
> > amdtemp0:  on hostb0
> >
> > $ sysctl kern.conftxt |grep amdt
> > device  amdtemp
> >
> 
> Really?  I did a git log and July 17 is what pops out for this commit.
> 
> Ah, I see that git log doesn't show the commit date.
> 
> So I guess that the git bisect really did find the commit which caused
> all our problems.
> 
> If reverting it fixes things then this requires some action from Colin
> Percival.
> 
> This would also explain why my FBSD-14 kernel from August 13 was
> OK.

Probably best to file a PR: https://bugs.freebsd.org/bugzilla/
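
On the date confusion quoted above: a commit carries both an author date and a commit date, and plain `git log` prints only the author date. The sketch below is not the FreeBSD tree; it builds a throwaway repository (the `demo` identity and the temporary directory are assumptions for illustration) with one empty commit that reuses the two dates quoted for 9a7add6d01f3, then shows both.

```sh
#!/bin/sh
# Throwaway repository with one commit whose author date and commit
# date differ, using the two dates quoted above for 9a7add6d01f3.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q .

GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.org \
GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.org \
GIT_AUTHOR_DATE='2023-07-17 19:29:20 -0700' \
GIT_COMMITTER_DATE='2023-08-19 22:04:56 -0700' \
git commit -q --allow-empty -m 'demo commit'

# Plain "git log" shows a single "Date:" line (the author date).
# "--format=fuller" prints AuthorDate and CommitDate separately.
git log -1 --format=fuller

# The same two values, extracted individually.
author_date=$(git log -1 --date=short --format=%ad)
commit_date=$(git log -1 --date=short --format=%cd)
echo "author: $author_date  commit: $commit_date"
```

When comparing a bisect result against the build date of a known-good kernel, the commit date is usually the more relevant of the two, since it reflects when the change actually landed on the branch.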

--
Herbert



Re: kernel 100% CPU, and ports-mgmt/poudriere-devel 'Inspecting ports tree for modifications to git checkout...' for an extraordinarily long time

2023-09-03 Thread Michael Gmelin



On Sat, 2 Sep 2023 09:53:38 +0100
Graham Perrin  wrote:

> Some inspections are extraordinarily time-consuming. Others complete 
> very quickly, as they should.
> 
> One recent inspection took more than half an hour.
> 
> Anyone else?
> 

Does `git clone https://git.freebsd.org/ports.git` work for you?
(currently it's not working from where I am). Maybe related.

Best
Michael


-- 
Michael Gmelin



Re: kernel 100% CPU

2023-09-03 Thread Juraj Lutter



> On 3 Sep 2023, at 09:56, Graham Perrin  wrote:
>> 
>> dtrace -s script.d -o out

This is the actual command.

The script mjg@ provided is not a shell script.

The script filename is “script.d” where you should put the
above-mentioned DTrace script (without the "dtrace -s script.d -o out” line).
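
To make the split concrete, here is a minimal sketch of the two steps from a shell: first save the D program (the one mjg@ posted in this thread, reproduced unchanged apart from whitespace and without its stale comment) as script.d, then hand that file to dtrace. The here-document is just one convenient way to create the file; any editor works equally well.

```sh
#!/bin/sh
# Step 1: save the D program as script.d. The quoted 'EOF' keeps the
# shell from expanding anything inside the here-document.
cat > script.d <<'EOF'
#pragma D option dynvarsize=32m

profile:::profile-997
/execname == "find"/
{
    @oncpu[stack(), "oncpu"] = count();
}

sched:::off-cpu
/execname == "find"/
{
    self->ts = timestamp;
}

sched:::on-cpu
/self->ts/
{
    @offcpu[stack(30), "offcpu"] = sum(timestamp - self->ts);
    self->ts = 0;
}

dtrace:::END
{
    normalize(@offcpu, 100);
    printa("%k\n%s\n%@d\n\n", @offcpu);
    printa("%k\n%s\n%@d\n\n", @oncpu);
}
EOF

# Step 2 (as root, while the symptom is present; stop with Ctrl-C):
#   dtrace -s script.d -o out
echo "wrote script.d; now run: dtrace -s script.d -o out"
```

The dtrace command line stays outside script.d; only the D program goes into the file.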


—
Juraj Lutter
o...@freebsd.org




Re: kernel 100% CPU

2023-09-03 Thread Graham Perrin

On 02/09/2023 18:31, Mateusz Guzik wrote:


Looks like you have a lot of unrelated traffic in there.

Run this script:
#pragma D option dynvarsize=32m

profile:::profile-997
/execname == "find"/
{
 @oncpu[stack(), "oncpu"] = count();
}

/*
  * The p_flag & 0x4 test filters out kernel threads.
  */

sched:::off-cpu
/execname == "find"/
{
 self->ts = timestamp;
}

sched:::on-cpu
/self->ts/
{
 @offcpu[stack(30), "offcpu"] = sum(timestamp - self->ts);
 self->ts = 0;
}

dtrace:::END
{
 normalize(@offcpu, 100);
 printa("%k\n%s\n%@d\n\n", @offcpu);
 printa("%k\n%s\n%@d\n\n", @oncpu);
}

dtrace -s script.d -o out



# pwd
/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu
# ./2023-09-02-18-31.sh
./2023-09-02-18-31.sh: profile:::profile-997: not found
./2023-09-02-18-31.sh: /execname: not found
./2023-09-02-18-31.sh: 6: Syntax error: "(" unexpected (expecting "}")
# whoami
root
# echo $
$
# echo $0
sh
# echo $SHELL
/bin/csh
# exit
root@mowa219-gjp4-8570p-freebsd:/home/grahamperrin/Documents/IT/BSD/FreeBSD/kernel-cpu 
#





Re: kernel 100% CPU …

2023-09-03 Thread Graham Perrin

On 02/09/2023 18:31, Mateusz Guzik wrote:


… upload it to freefall …


Sorry, that's no longer possible.

Do people have a preferred FreeBSD-oriented location/service for 
FreeBSD-specific files?


TIA

In the meantime: I find the symptom reproducible, quite frequently, 
without using poudriere. Currently:


% uname -aKU
FreeBSD mowa219-gjp4-8570p-freebsd 15.0-CURRENT FreeBSD 15.0-CURRENT 
amd64 150 #11 main-n265135-07bc20e4740d-dirty: Sat Sep  2 19:40:08 
BST 2023 
grahamperrin@mowa219-gjp4-8570p-freebsd:/usr/obj/usr/src/amd64.amd64/sys/GENERIC-NODEBUG 
amd64 150 150

%