It seems Alfred Perlstein wrote:
> > 
> > I suggest creative manpower is used to stabilize -current, instead
> > of fine trimming which API's should stay or not...
> I started a loop of make -j128 buildworld and buildkernel last
> night, I still haven't seen anything odd happen on my hardware.
> You and Poul-Henning have to figure out what's going on, no one
> else is able to reproduce this instability you're talking about.

Oohh, you don't read the mailing lists then; there have been plenty
of reports of hanging -current boxen since SMPng...

> There has to be a way for you guys to get us some reasonable
> tracebacks or diagnostics instead of just saying "it's broke".

It's close to impossible; the two symptoms I see here are either
spontaneous reboots, or solid hangs where only a reset can get
you out, so I can't say much other than "it's broke".

> Perhaps you can explain how you're able to trigger this instability
> with a test script?  Poul-Henning told me he just needed to do a
> make -j256 world, I did 10 of them without a problem...

Hmm, with a -current kernel from today 1200 CET I just need to
do a make depend on a GENERIC kernel, and wham, it locks up.
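
For anyone who wants to try this on their own box, here is a minimal
sketch of a repeat-until-it-breaks loop, assuming Python is available on
the test machine. The names run_until_failure, mark, max_runs and the log
path are made up for illustration and are not anything from the tree. It
only records which run was in progress; it cannot capture any kernel state.

```python
import os
import subprocess
import time


def mark(log, text):
    """Append a timestamped line and force it to disk before continuing."""
    log.write(f"{time.strftime('%Y-%m-%d %H:%M:%S')} {text}\n")
    log.flush()
    os.fsync(log.fileno())  # so the line has a chance of surviving a reset


def run_until_failure(cmd, log_path, max_runs=10, cwd=None):
    """Run cmd up to max_runs times.

    Returns the 1-based number of the first run that exits non-zero,
    or None if every run succeeded. A hard hang never returns at all,
    but the last "start" line in the log shows which run it died in.
    """
    with open(log_path, "a") as log:
        for i in range(1, max_runs + 1):
            mark(log, f"start run {i}: {' '.join(cmd)}")
            rc = subprocess.run(cmd, cwd=cwd).returncode
            mark(log, f"end run {i}: exit {rc}")
            if rc != 0:
                return i
    return None
```

Called as run_until_failure(["make", "depend"], log_path, cwd=...) from
the kernel compile directory, a "start" line with no matching "end" line
after the reset tells you the box went down during that run.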

> I'd also like to see what hardware you guys are running on and what
> kernel config.  I'm pretty sure that running with a weird value
> for HZ causes lockups on -stable, dunno about current.

Nothing special, a GENERIC kernel with SMP defined will do nicely. Running
without SMP improves matters, but on the fastest machine I'm still getting
lockups, though they are rare...

Hardware it hangs on here includes:

2*PPro@200 192MB FX chipset ATA disks on onboard controller (PIIX3)

2*PII@350 512MB BX chipset SCSI disks on NCR controller

2*PIII@1G 512MB ServerWorks chipset ATA disks on onboard + HPT controller.

It seems the faster the machine, the faster the lockup/hang...

Need I mention that they all work just fine(tm) under -stable and
-current back on PRE_SMPNG...

So, we (phk & I) are trying to figure out what is going on, but
there is little to go on but hunches...
So there is nothing special to it, guys, you just have to try...
Oh, btw, using a ccd/vinum/ATA-raid thingy makes the problem worse,
probably due to the higher interrupt rates.

> Basically if you're expecting me or the SMP team to figure out
> what's going on without more info, you're pretty much out of luck.

See above, not really possible. We have been trying to find some
(affordable) HW that could be used to preserve a log over a boot,
but so far I haven't been able to find anything that works and
is fast enough to not affect the system too much...

> ...wondering if the box Paul Saab gave me is actually SMP... :)

Yup, that would explain things :)


