OK, for the archives:
Someone wrote to me off-list:
> > Other ramdisk-based systems (flashboot, flashrd) have needed to increase
> > NKPTP above the default of 4, and disable isadma (and associated devices).
> > I don't know if it might be relevant here, but easy enough to do that
> > it's probably worth trying.
A very interesting tip. We tried it, but it made no difference.
(Our kernel has always been far below 16 MB, apparently too small to
ever hit those two limits.)
--
However, it set us on the path to the eventual solution.
It did seem a good idea to try 'flashrd' and compare. Outcome: the
MULTIPROCESSOR variant of 'flashrd' worked normally.
That was followed by countless frustrating hours of banging our heads
against the wall: comparing & matching 'option' files, transplanting
ramdisk images back and forth. No experiment worked; it almost drove us
to desperation.
--
Eventually we found the cause: it has nothing to do with 'option's or
with the ramdisk image.
The makefile 'src/distrib/i386/common/Makefile.inc' adds
COPTS+= -mtune=i486
to the kernel make/gcc command (since OpenBSD 4.8).
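
For anyone who wants to check their own tree, here is a minimal sketch.
It assumes the source is checked out under /usr/src (an assumption; set
SRCDIR otherwise), and 'find_arch_flags' is just a hypothetical helper
name, not something from the OpenBSD tree. It only greps for gcc
-mtune/-march/-mcpu options; it does not prove what the kernel was
actually built with.

```shell
#!/bin/sh
# find_arch_flags FILE: print, with line numbers, every line of FILE that
# contains a -mtune=, -march= or -mcpu= compiler option.
# (Hypothetical helper name, for illustration only.)
find_arch_flags() {
    grep -nE -e '-m(tune|arch|cpu)=' "$1"
}

# SRCDIR is an assumption; point it at your own source checkout.
SRCDIR="${SRCDIR:-/usr/src}"
mk="$SRCDIR/distrib/i386/common/Makefile.inc"

# Only look if the file exists, so the sketch is harmless elsewhere.
if [ -f "$mk" ]; then
    find_arch_flags "$mk" || echo "no arch/tune flags found in $mk"
fi
```

If the line is there, removing it (or overriding COPTS) before building
the ramdisk kernel is the obvious experiment to repeat.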
Rather miffed by this discovery! The OpenBSD project does not
allow/support the use of non-default gcc arch options. Not even when
compiling userland apps*, let alone when compiling the kernel.
A reasonable policy, as long as you stick to it! Don't make a
nigh-on-unnoticeable deviation in such a canonical place as distrib/i386!
= =
*) "No support for non-default gcc arch options": we don't know whether
that's actually documented properly anywhere, but we learned it when
reporting that 'ntohs16()' was miscompiled under -march=i686, back in
OpenBSD 3.8 days:
http://www.mail-archive.com/[email protected]/msg19810.html
Thanks to all for the replies, ideas, etc.!
+++chefren
On 18-11-10 12:08, chefren wrote:
> We use a custom i386 RAMDISK_CD kernel: basically we add most options from
> GENERIC and GENERIC.MP.
>
> Upgrading from 4.6 to 4.8, this kernel hangs forever after:
> root on rd0a swap on rd0b dump on rd0b
>
> The problem turns out to be MP; activation of the secondary processors.
>
> The custom kernel works fine on a single-core machine, and a recompiled
> kernel without config lines
> option MULTIPROCESSOR
> cpu* at mainbus?
> also works fine everywhere.
>
>
> --
> The problem can be reproduced by simply adding those two MP config lines to
> the standard RAMDISK_CD kernel config.
>
>
> --
> Experiments with adding printf()s on a Dell 1950 (2 CPUs, 8 cores) suggest
> that the hang happens during:
> cpu_boot_secondary(&cpu_info[2])
> pmap_tlb_shootrange()
> i386_fast_ipi()
>
> But treat that as an inconclusive hint: we don't know whether the printf()s
> are 100% reliable, and VirtualBox (2 CPU, IOAPIC) seems to make it past that
> point and hang somewhere after init_main() has entered its intentional
> infinite waiting loop, and another computer (Core 2 Duo) doesn't hang but
> reboots immediately around that point.
>
>
> --
> Are we overlooking an option/driver that's needed for MP on i386?
>
> Or is this a kernel regression from 4.6 --> 4.8?
>
>
> +++chefren
>
--
http://idd.nl/
Chefren Hagens