On Mon, Mar 16, 2026 at 12:18:05PM -0600, Theo de Raadt wrote:
> I'm surprised at your proposal.
> 
> If this condition gets detected, why do you think it is fine to
> continue?  A kernel data structure is seriously corrupted.

I'm not saying it's fine, sorry if my mail was too long to read. ;)

1. I'm not 100% sure the checks that trigger are correct, after all
  they're not using volatile reads.  Maaaaybe that's the bug but I
  have no idea right now.
  
2. Kurt had posted this on ports@ earlier, then on bugs@, so far no
  one has a fix and you recently tagged 7.9.  This diff is an attempt
  to make kmos' and users' lives easier before the next release.
  Obviously everybody would be happier with a proper fix.  Maybe this
  admittedly incomplete fix will spark a discussion?
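
To illustrate point 1: a check that forces a fresh load of each tag could
look roughly like the sketch below.  This is standalone C, not the sparc64
pmap.c code; struct fake_tte, FAKE_TAG_CTX(), read_once_u64() and
ctx_still_active() are made-up names and the tag layout is an assumption.
Note that volatile only stops the compiler from caching or eliding the
load; it does not order the read against stores from other CPUs.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a TSB entry; layout is an assumption. */
struct fake_tte {
	uint64_t tag;
	uint64_t data;
};

/* Hypothetical context extraction: 13 bits starting at bit 48. */
#define FAKE_TAG_CTX(t)	((int)(((t) >> 48) & 0x1fff))

/* Force the compiler to emit a real load every time this is called. */
static inline uint64_t
read_once_u64(const uint64_t *p)
{
	return *(const volatile uint64_t *)p;
}

/* Return 1 if any of the n entries still carries context ctx, else 0. */
static int
ctx_still_active(const struct fake_tte *tsb, size_t n, int ctx)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (FAKE_TAG_CTX(read_once_u64(&tsb[i].tag)) == ctx)
			return 1;
	}
	return 0;
}
```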

[...]

> > Since this db_enter() has been plaguing kmos' latest builds up to a
> > point that some ports/packages were corrupted, I'd suggest that we
> > disable the db_enter() now that we know that this error case can be
> > hit.  I've managed to hit this code path twice, months/weeks ago by
> > building large ports on a T4-2 LDOM.  I have ideas to test

FWIW I had a diff turning pm_refs into a struct refcnt, hoping that
the memory barriers in refcnt_rele() would magically fix this issue.
That wasn't enough, the same error triggered.  I guess the mixed
reference counting of struct vmspace and struct pmap makes it harder to
reason about.  But maybe this isn't the issue at hand at all.
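
For readers unfamiliar with the pattern, the kind of barrier placement a
release-style reference counter relies on can be sketched in portable C11
atomics as below.  This is not the kernel's struct refcnt implementation;
struct fake_refcnt and the fake_refcnt_*() helpers are hypothetical names
used only for illustration.

```c
#include <stdatomic.h>

/* Hypothetical reference counter; not the kernel's struct refcnt. */
struct fake_refcnt {
	atomic_uint refs;
};

/* Start with one reference held by the creator. */
static void
fake_refcnt_init(struct fake_refcnt *r)
{
	atomic_init(&r->refs, 1);
}

/* Taking a reference needs no ordering beyond atomicity. */
static void
fake_refcnt_take(struct fake_refcnt *r)
{
	atomic_fetch_add_explicit(&r->refs, 1, memory_order_relaxed);
}

/*
 * Drop a reference.  The release ordering publishes this thread's
 * earlier stores to the object before the count drops; the acquire
 * fence makes the last holder see every other thread's stores before
 * it tears the object down.  Returns 1 for the last reference.
 */
static int
fake_refcnt_rele(struct fake_refcnt *r)
{
	unsigned int old;

	old = atomic_fetch_sub_explicit(&r->refs, 1, memory_order_release);
	if (old == 1) {
		atomic_thread_fence(memory_order_acquire);
		return 1;
	}
	return 0;
}
```

The barriers only order accesses made around the counter itself, so state
whose lifetime is tracked by a second, separate counter is not covered.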

> > but I have
> > just gotten my hands back on said LDOM and right now I can't even
> > reproduce.  But maybe kmos can give it a try, look for printfs and
> > confirm that the system recovers when hitting such a condition.
> > 
> > Thoughts?  ok?
> > 
> > 
> > Index: pmap.c
> > ===================================================================
> > RCS file: /cvs/src/sys/arch/sparc64/sparc64/pmap.c,v
> > diff -u -p -r1.127 pmap.c
> > --- pmap.c  14 Dec 2025 12:37:22 -0000      1.127
> > +++ pmap.c  11 Mar 2026 22:39:23 -0000
> > @@ -2600,7 +2600,6 @@ ctx_free(struct pmap *pm)
> >             if (TSB_TAG_CTX(tsb_dmmu[i].tag) == oldctx ||
> >                 TSB_TAG_CTX(tsb_immu[i].tag) == oldctx) {
> >                     printf("ctx_free: context %d still active\n", oldctx);
> > -                   db_enter();
> >             }
> >     }
> >  #endif
> > 
> > 
> > -- 
> > jca
> > 
> 

-- 
jca
