> > > > >>I'll try to put my B&W G3 back in working condition ASAP to check if I
> > > > >>can reproduce the problem. It features:
> > > > >> cpu0 at mainbus0: 750 (Revision 0x202): 400 MHz: 1MB backside cache
> > > > >>so if this is indeed an old-G3 specific issue, I should be affected.
> > > > >
> > > > >... and I can reproduce this on the B&W G3 here. I'll try to
> > > > >investigate, but this machine is actively trying to commit suicide
> > > > >(when it does not spontaneously power off).
> > > >
> > > > Was this fixed?
> > >
> > > No. But I lost a lot of hair. And I could not find anything interesting
> > > in the various G3 errata documents.
> >
> > That said, I wonder if mpi@'s recent rewrite of the idle path on ppc
> > will change that behaviour. Something for the week-end...
>
> FWIW my G3 is running last week's snapshot and the problem is still
> here... sorry for your hair.
I had a look at that thread again, and it appears I'm not gonna grow
hair anytime soon.

Looking back at your register dump, the value of r2, which causes the
fault, matches the value of `ps' (in gdb speak), in other words srr1
(in powerpc speak), except for bit 0x4000, which is clear in r2. That
bit would be PSL_PR (set in userland, clear in kernel):
        r2      0x9032  36914
        ps      0xd032  53298
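
For anyone who wants to double-check the bit arithmetic, here is a
minimal sketch in plain C. The PSL_* values are the architectural MSR
bit definitions; the helper names (msr_diff, msr_is_user, check_dump)
are made up for illustration and are not taken from the kernel sources.

```c
#include <assert.h>
#include <stdint.h>

/* MSR bits as defined by the PowerPC architecture (BSD-style PSL_* names). */
#define PSL_EE 0x8000u /* external interrupts enabled */
#define PSL_PR 0x4000u /* problem state: set in user mode, clear in kernel */
#define PSL_ME 0x1000u /* machine check enabled */
#define PSL_IR 0x0020u /* instruction address translation */
#define PSL_DR 0x0010u /* data address translation */
#define PSL_RI 0x0002u /* recoverable interrupt */

/* Hypothetical helper: bits that differ between two MSR-like values. */
static uint32_t
msr_diff(uint32_t a, uint32_t b)
{
	return a ^ b;
}

/* Hypothetical helper: nonzero if the problem-state (user) bit is set. */
static int
msr_is_user(uint32_t msr)
{
	return (msr & PSL_PR) != 0;
}

/* Self-check against the two values quoted from the register dump. */
static void
check_dump(void)
{
	const uint32_t r2 = 0x9032, ps = 0xd032;

	assert(msr_diff(r2, ps) == PSL_PR);	/* only bit 0x4000 differs */
	assert(msr_is_user(ps));		/* srr1 has PSL_PR set */
	assert(!msr_is_user(r2));		/* r2 has PSL_PR clear */
	assert(r2 == (PSL_EE | PSL_ME | PSL_IR | PSL_DR | PSL_RI));
}
```

Calling check_dump() from a small test program confirms that the two
values differ only in PSL_PR, and that 0x9032 is exactly
EE|ME|IR|DR|RI, i.e. it looks like an MSR value rather than a pointer.
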
Therefore I wonder if there wouldn't be a corner case in this sequence
in locore.S FRAME_LEAVE():
        mtsrr1 %r3; \
        mfsprg %r2,2; /* restore r2 & r3 */ \
        mfsprg %r3,3
in which the mtsrr1 instruction would cause the following instruction
to execute incorrectly.

However, I have seen no mention of this or anything similar in the G3
errata, and trying to add `isync' instructions, or duplicating the
mfsprg instruction, did not help - well, it made the issue happen less
often, but not disappear completely.

Miod

PS: a funny coincidence is that we have an open issue at ${DAYJOB} using
similar processors, but a different operating system, where once in a
blue moon, a system call returns... 0x9032, for which we have no
explanation.