On 11/27/17 10:06, Andrew Turner wrote:
On 26 Nov 2017, at 17:07, Nathan Whitehorn <nwhiteh...@freebsd.org> wrote:
On 11/26/17 01:46, Andrew Turner wrote:
On 25 Nov 2017, at 23:41, Nathan Whitehorn <nwhiteh...@freebsd.org> wrote:
Author: nwhitehorn
Date: Sat Nov 25 23:41:05 2017
New Revision: 326218
URL: https://svnweb.freebsd.org/changeset/base/326218
Log:
Remove some, but not all, assumptions that the BSP is CPU 0 and that CPUs
are numbered densely from there to n_cpus.
MFC after: 1 month
Modified:
head/sys/kern/kern_clock.c
head/sys/kern/kern_clocksource.c
head/sys/kern/kern_shutdown.c
head/sys/kern/kern_timeout.c
head/sys/kern/sched_ule.c
head/sys/kern/subr_pcpu.c
...
Modified: head/sys/kern/subr_pcpu.c
==============================================================================
--- head/sys/kern/subr_pcpu.c Sat Nov 25 23:23:24 2017 (r326217)
+++ head/sys/kern/subr_pcpu.c Sat Nov 25 23:41:05 2017 (r326218)
@@ -279,6 +279,8 @@ pcpu_destroy(struct pcpu *pcpu)
struct pcpu *
pcpu_find(u_int cpuid)
{
+ KASSERT(cpuid_to_pcpu[cpuid] != NULL,
+ ("Getting uninitialized PCPU %d", cpuid));
return (cpuid_to_pcpu[cpuid]);
}
This breaks on one arm64 simulator I have where the device tree lists 8 cpus,
but only 2 are enabled in the simulation. The ofw_cpu device nodes for these
are cpu0 and cpu4 so the call to cpu_find in ofw_cpu_attach will hit this
KASSERT when the CPU has not been enabled.
Andrew
cpu0: <Open Firmware CPU> on cpulist0
cpu1: <Open Firmware CPU> on cpulist0
panic: Getting uninitialized PCPU 1
cpuid = 0
time = 1
KDB: stack backtrace:
db_trace_self() at db_trace_self_wrapper+0x28
pc = 0xffff0000005f03e8 lr = 0xffff000000087048
sp = 0xffff0000000105a0 fp = 0xffff0000000107b0
That's unfortunate. Removing the new KASSERT is probably not the right option,
since all it is doing is indicating a pre-existing bug. Specifically, it is
preventing ofw_cpu from using a NULL pointer for CPU 4, which I suppose it was
previously happily doing (it would only get dereferenced if you had cpufreq
support, hence no previous panic).
I would expect cpu4 to have a non-NULL value as its cpu pointer will have been
set. It’s cpu1 that is the issue as it’s in the device tree, but doesn’t exist
in “hardware”. Because of this it fails when we try to enable it so free it’s
pcpu data clear the pointer.
Yes, I phrased that badly. As you say, the CPU ID that ofw_cpu has is
"1", which doesn't exist. To fix this, we need to change the CPU ID that
ofw_cpu has to match what the kernel thinks it is using.
A super quick-and-dirty fix would be to be fail attach on
CPU_ABSENT(device_get_unit(dev)), which restores the previous buggy behavior
but without the panic. If you do not actually use ofw_cpu for anything, we
could also just disable it temporarily on your platform (it's off for some PPC
systems, you will note, for similar reasons).
We used to use it to find CPUs to start, however this is no longer the case.
OK, so disabling it works fine for now.
A real fix is somewhat complex, since the driver relies on being able to get a
mapping from platform-specific numbers in the device tree to an entry in pcpu,
which intrinsically relies on reverse-engineering the platform's mapping
between some kind of hardware CPU ID and the kernel CPU numbering. I can't
think of a way to do that internally. We could make some kind of platform
macro, but the whole concept is even somewhat dubious at the moment since a
number of IBM systems with SMT don't even have a 1:1 relationship between CPUs
as FreeBSD defines them and device tree nodes, so it's possible we need a
serious rearchitecture of the driver.
On arm64 we use the ID from the device tree to assign the cpuid so there is a
1:1 mapping between the two. In most cases both are identical, the only case
that’s not correct is when we boot from a non-zero CPU, and I know of only one
SoC where this is true.
The "ID from the device tree" being the value of "reg"? Or something
else? I see a lot of variation in the encoding in sys/dts and there does
not seem to be an obvious single answer. This is making me think that
the only workable solution is something like a new ofw_machdep function
that gets the phandle for a given CPU ID.
-Nathan
Andrew
_______________________________________________
svn-src-head@freebsd.org mailing list
https://lists.freebsd.org/mailman/listinfo/svn-src-head
To unsubscribe, send any mail to "svn-src-head-unsubscr...@freebsd.org"