On Sat, 21 Feb 2026 22:05:44 +0100,
Klemens Nanni <[email protected]> wrote:
>
> 15.02.2026 23:49, Klemens Nanni wrote:
> > 30.12.2025 22:18, Mark Kettenis wrote:
> >>> Date: Tue, 30 Dec 2025 19:05:03 +0100
> >>> From: Alexander Bluhm <[email protected]>
> >>>
> >>> On Sun, Nov 23, 2025 at 04:14:31PM +0000, Miod Vallat wrote:
> >>>>> All cores attached, i.e. octeon's hw_cpu_hatch() should've run
> >>>>> and bumped `ncpus' four times.
> >>>>
> >>>> Except that this only happens after cpu_boot_secondary_processors()
> >>>> completes, which is way later than softnet_percpu().
> >>>
> >>> octeon sets ncpus late, but ncpusfound is set early. Other subsystems
> >>> also use that variable. My mistake using the wrong one.
> >>
> >> No, you used the right one.
> >>
> >> * ncpusfound: the number of CPUs the hardware provides
> >>
> >> * ncpus: the number of CPUs OpenBSD is using
> >>
> >> So when running a GENERIC kernel on a machine with 32 CPUs, ncpusfound
> >> will be 32 but ncpus will be 1.
> >>
> >> Even when running a GENERIC.MP kernel, ncpusfound may be larger than
> >> ncpus. This could happen if the hardware has more than MAXCPUS CPUs.
> >>
> >> We probably should change octeon to bump ncpus when it attaches CPUs
> >> instead of when it spins them up.
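
The distinction above can be put as a rough sketch. This is not kernel
code: model_ncpus() and its multiprocessor/maxcpus parameters are
hypothetical names used only for illustration, and it assumes every CPU
up to the limit attaches successfully.

```c
#include <assert.h>

/*
 * Hypothetical model, not kernel code: the number of CPUs the kernel
 * ends up using (ncpus) given the number the hardware provides
 * (ncpusfound).  Assumes every CPU up to the limit attaches.
 */
unsigned int
model_ncpus(unsigned int ncpusfound, int multiprocessor, unsigned int maxcpus)
{
	assert(ncpusfound >= 1 && maxcpus >= 1);

	if (!multiprocessor)
		return 1;		/* GENERIC: only the boot CPU is used */
	if (ncpusfound > maxcpus)
		return maxcpus;		/* GENERIC.MP: capped at MAXCPUS */
	return ncpusfound;
}
```

With a made-up limit of 16, model_ncpus(32, 0, 16) gives 1 and
model_ncpus(32, 1, 16) gives 16, while ncpusfound stays 32 in both cases.
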
> >
> > I gave it a quick try today:
> >
> > root@ot8:.../~# ps xk |grep softnet
> > 68767 ?? DK 0:00.00 (softnet3)
> > 71399 ?? DK 0:00.00 (softnet4)
> > 75753 ?? DK 0:00.00 (softnet5)
> > 95091 ?? DK 0:00.00 (softnet6)
> > 69574 ?? DK 0:00.00 (softnet7)
> > 31463 ?? DK 0:00.00 (softnet0)
> > 26770 ?? DK 0:00.00 (softnet2)
> > 37751 ?? DK 0:00.00 (softnet1)
> > 23114 00 S+p 0:00.01 grep softnet
> >
> > root@ot8:.../~# sysctl hw | grep cpu
> > hw.ncpu=17
> > hw.cpuspeed=1800
> > hw.ncpufound=16
> > hw.ncpuonline=16
> >
> > Obvious off-by-one still, but incrementing on attach works.
>
> Needed to hoist the increment from below this check into the else:
>
> #ifdef MULTIPROCESSOR
> if (cpuno == 0) {
> ...
> } else {
> ...
> }
> #else
>
> >
> > If that's the right direction, I'll fix up the diff next week.
>
> Now it logically matches what arm64, macppc and powerpc64 (and probably
> other archs) do, so:
>
> root@ot8:.../~# sysctl hw | grep cpu
> hw.ncpu=16
> hw.cpuspeed=1800
> hw.ncpufound=16
>
> Test machine and router are happy with the final diff below.
>
> I have not done any performance measurements wrt. +7 softnet threads,
> but this strikes me as a correctness fix on its own.
>
> Also, octeon get_ncpusfound()'s local variable shadowing the global in
> kern/init_main.c is not helpful in that it at least produces grep noise,
> so I renamed that.
>
> Feedback? Objection? OK?
>
I've tried it on my EdgeRouter 4.
Here is what I get with a snapshot I just installed:
machine 1 -> ER4:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 1.39 GBytes 397 Mbits/sec 15928 sender
[ 5] 0.00-30.01 sec 1.39 GBytes 397 Mbits/sec receiver
machine 1 -> ER4 -> machine 2:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 1.71 GBytes 490 Mbits/sec 370688 sender
[ 5] 0.00-30.01 sec 1.71 GBytes 489 Mbits/sec receiver
The machines are in different VLANs.
Next, a cvs checkout with your diff applied:
machine 1 -> ER4:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 590 MBytes 165 Mbits/sec 65568 sender
[ 5] 0.00-30.06 sec 590 MBytes 165 Mbits/sec receiver
machine 1 -> ER4 -> machine 2:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 978 MBytes 273 Mbits/sec 2556800 sender
[ 5] 0.00-30.01 sec 976 MBytes 273 Mbits/sec receiver
plus extremely bad performance from the ER4 to NFS.
Next, a cvs checkout without your diff:
machine 1 -> ER4:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 1.32 GBytes 377 Mbits/sec 0 sender
[ 5] 0.00-30.05 sec 1.32 GBytes 376 Mbits/sec receiver
machine 1 -> ER4 -> machine 2:
[ ID] Interval Transfer Bitrate Retr
[ 5] 0.00-30.00 sec 1.73 GBytes 495 Mbits/sec 637120 sender
[ 5] 0.00-30.01 sec 1.73 GBytes 494 Mbits/sec receiver
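
To put a number on the regression, here is a small helper for the
relative change between two bitrates. pct_change() is a name made up
for this mail; the inputs are the sender bitrates in Mbits/sec from the
runs above.

```c
#include <assert.h>

/*
 * Relative change from `before' to `after', in percent.
 * Negative values mean the throughput dropped.
 */
double
pct_change(double before, double after)
{
	assert(before > 0.0);
	return (after - before) / before * 100.0;
}
```

Comparing the two cvs checkouts, pct_change(377, 165) is about -56% for
machine 1 -> ER4 and pct_change(495, 273) is about -45% for the
forwarding case, so the diff costs roughly half the throughput here.
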
> Index: mips64/mips64/cpu.c
> ===================================================================
> RCS file: /mount/openbsd/cvs/src/sys/arch/mips64/mips64/cpu.c,v
> diff -u -p -r1.85 cpu.c
> --- mips64/mips64/cpu.c 5 Jun 2025 09:29:54 -0000 1.85
> +++ mips64/mips64/cpu.c 21 Feb 2026 20:48:26 -0000
> @@ -99,6 +99,7 @@ cpuattach(struct device *parent, struct
> ci->ci_next = cpu_info_list->ci_next;
> cpu_info_list->ci_next = ci;
> ci->ci_flags |= CPUF_PRESENT;
> + ncpus++;
> }
> #else
> ci = &cpu_info_primary;
> Index: octeon/octeon/machdep.c
> ===================================================================
> RCS file: /mount/openbsd/cvs/src/sys/arch/octeon/octeon/machdep.c,v
> diff -u -p -r1.137 machdep.c
> --- octeon/octeon/machdep.c 24 Oct 2023 13:20:10 -0000 1.137
> +++ octeon/octeon/machdep.c 21 Feb 2026 20:49:01 -0000
> @@ -802,7 +802,7 @@ static u_int64_t
> get_ncpusfound(void)
> {
> uint64_t core_mask;
> - uint64_t i, ncpus = 0;
> + uint64_t i, n = 0;
> int chipid;
>
> chipid = octeon_get_chipid();
> @@ -818,9 +818,9 @@ get_ncpusfound(void)
>
> /* There has to be 1-to-1 mapping between cpuids and coreids. */
> for (i = 0; i < OCTEON_MAXCPUS && (core_mask & (1ul << i)) != 0; i++)
> - ncpus++;
> + n++;
>
> - return ncpus;
> + return n;
> }
>
> static enum octeon_board
> @@ -1333,8 +1333,6 @@ hw_cpu_hatch(struct cpu_info *ci)
>
> ci->ci_flags |= CPUF_RUNNING;
> membar_sync();
> -
> - ncpus++;
>
> spl0();
> (void)updateimask(0);
>
--
wbr, Kirill