On Sun, Feb 08, 2026 at 08:51:38AM -0800, Mike Larkin wrote:
> On Sun, Feb 08, 2026 at 11:18:26AM +0100, Walter Alejandro Iglesias wrote:
> > Hi Mike,
> >
> > Unfortunately, I'm still able to reproduce the panic.
> >
> > In case this is relevant.  With #667 the panic followed by the debugger
> > interface message happened right after the image was uncompressed.  With
> > current versions, I have to run startx to get a first single line
> > message panic, and then reboot again to get a second panic with the
> > debugger message.  I can send you pictures if you want.
> >
> 
> ok maybe...

The first lines of the panic report seem to be the same.  The
affected functions are pool_cache_get(), pool_get(), m_clget(), etc.

Sometimes I get just this single-line message:

  pool_cache_item_magic_check: mcl2k cpu free list modified: item addr \
    0xfffffd800bc00800+16 0x0!=0xced49a541a9360d4
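
Read as "found != expected", that message would mean the word 16 bytes
into a free mcl2k item held 0x0 where a per-item magic value was
expected.  The following is only a userland sketch of that kind of
check, not the kernel code: struct free_item, item_stamp() and
item_check() are made-up names, and the layout is an assumption chosen
so that the guard words start at offset 16 on a 64-bit machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical free-list item; NOT the kernel's real structure. */
struct free_item {
	struct free_item *next;	/* next free item */
	uint64_t nitems;	/* items on this list */
	uint64_t magic[2];	/* guard words, offset 16 on a 64-bit machine */
};

/* Stamp the guard words when the item is put on the free list. */
static void
item_stamp(struct free_item *it, const uint64_t secret[2])
{
	assert(it != NULL);
	it->magic[0] = secret[0] ^ (uint64_t)(uintptr_t)it;
	it->magic[1] = secret[1] ^ (uint64_t)(uintptr_t)it->next;
}

/* Return 0 if the guard words are intact, -1 if they were overwritten. */
static int
item_check(const struct free_item *it, const uint64_t secret[2])
{
	uint64_t want[2];
	size_t i;

	want[0] = secret[0] ^ (uint64_t)(uintptr_t)it;
	want[1] = secret[1] ^ (uint64_t)(uintptr_t)it->next;

	for (i = 0; i < 2; i++) {
		if (it->magic[i] != want[i]) {
			/* Same "addr+offset found!=expected" shape as above. */
			fprintf(stderr,
			    "free list modified: item addr %p+%zu 0x%llx!=0x%llx\n",
			    (const void *)it,
			    offsetof(struct free_item, magic) +
			    i * sizeof(uint64_t),
			    (unsigned long long)it->magic[i],
			    (unsigned long long)want[i]);
			return -1;
		}
	}
	return 0;
}
```

With an intact item, item_check() returns 0.  If something zeroes
magic[0] while the item sits on the free list, it returns -1 and prints
a line in the same found!=expected shape.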

> 
> We have another report of similar crashes but only when X is used.
> 
> I can't recall; does it fail for you (-current, without any of these diffs)
> if you don't run X? If you never tested that can you try? Just don't run X
> at all and do a ZZZ from the console. LMK.

Perhaps what I said led to confusion.  Most of the tests I've done were
running ZZZ right before logging into the console.  What I meant by
running X is that *after* the machine resumes from hibernation, I start
X to trigger the panic when it doesn't happen immediately.

Of course, I've also tried running ZZZ from within Xorg.  In that case,
after the image is uncompressed, I get a black screen.  I can confirm
that the debugger showing the panic is there behind it, because I can
successfully reboot the machine by blindly typing 'reboot'.


> 
> -ml
> 
> >
> > On Sat, Feb 07, 2026 at 10:59:54AM -0800, Mike Larkin wrote:
> > > On Sat, Feb 07, 2026 at 09:22:15AM -0800, Mike Larkin wrote:
> > > > On Sat, Feb 07, 2026 at 09:09:13AM -0800, Mike Larkin wrote:
> > > > > On Sat, Jan 17, 2026 at 07:53:14PM -0800, Mike Larkin wrote:
> > > > > > > On Sat, Jan 17, 2026 at 12:18:28PM +0100, Walter Alejandro Iglesias wrote:
> > > > > > > > This machine, apparently, hibernates correctly.  Then, booting after the
> > > > > > > > hibernation is complete, also apparently, unhibernates correctly.  But,
> > > > > > > > sometimes right after the image is uncompressed, sometimes after some
> > > > > > > > command is executed in the console, sometimes later, you always end up
> > > > > > > > with this same kernel panic (photos included in tarball):
> > > > > > > >
> > > > > > > >    https://en.roquesor.com/Downloads/panic.tar.gz
> > > > > > > >
> > > > > > > > As I mentioned to mlarkin@ and krw@ in private, I knew for certain that
> > > > > > > > this didn't happen with this machine.  I used to test how it hibernated
> > > > > > > > because I had it connected to a UPS when I used it (for many years) as a
> > > > > > > > home mail-web server.  And even though it's an old machine, I still use
> > > > > > > > it occasionally, which is why I took the time to do the trace back.
> > > > > > > >
> > > > > > > > After three days installing snapshots I could find out in which one the
> > > > > > > > panic starts happening and which modification causes it.
> > > > > > > >
> > > > > > > > This is the last snapshot working fine (the number of the beast :-):
> > > > > > > >
> > > > > > > >   https://openbsd.cs.toronto.edu/archive/2025-05-22/amd64
> > > > > > > >
> > > > > > > >   kern.version=OpenBSD 7.7-current (GENERIC.MP) #666: Wed May 21 00:12:25 MDT 2025
> > > > > > > >     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > > > > > >
> > > > > > > > And this is the first when the issue appears:
> > > > > > > >
> > > > > > > >   https://openbsd.cs.toronto.edu/archive/2025-05-23/amd64
> > > > > > > >
> > > > > > > >   kern.version=OpenBSD 7.7-current (GENERIC.MP) #667: Thu May 22 22:13:35 MDT 2025
> > > > > > > >     [email protected]:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > > > > > >
> > > > > > > > The modification that causes the issue is this:
> > > > > > > >
> > > > > > > >   https://marc.info/?l=openbsd-cvs&m=174779996216651&w=2
> > > > > > > >
> > > > > > > > After reverting those Mike's diffs in current, the panic disappears.
> > > > > > >
> > > > > >
> > > > > > > I'll take a look at that diff again but I don't see how this could be the
> > > > > > > problem. That diff just moved the allocation earlier. Maybe there's a double
> > > > > > > free. I'll look.
> > > > > >
> > > > > > -ml
> > > > > >
> > > > >
> > > > > Hi Walter
> > > > >
> > > > > Can you try this diff on -current and let me know if this fixes it?
> > > > >
> > > > > diff /export/bin/src/OpenBSD/hib
> > > > > path + /export/bin/src/OpenBSD/hib
> > > > > commit - 12762e4337611beea64f8029adc3932766410745
> > > > > blob - 2b7924651df0c61efc89543ceacf5e7e5881e4ca
> > > > > file + sys/kern/subr_hibernate.c
> > > > > --- sys/kern/subr_hibernate.c
> > > > > +++ sys/kern/subr_hibernate.c
> > > > > @@ -2012,7 +2012,7 @@ hibernate_free(void)
> > > > >       pmap_activate(curproc);
> > > > >
> > > > >       if (hibernate_temp_page) {
> > > > > -             pmap_kremove(hibernate_temp_page, PAGE_SIZE);
> > > > > +//           pmap_kremove(hibernate_temp_page, PAGE_SIZE);
> > > > >               km_free((void *)hibernate_temp_page, PAGE_SIZE,
> > > > >                   &kv_any, &kp_none);
> > > > >       }
> > > > >
> > > >
> > > > > actually disregard, this isn't quite right either. I'll let you know when I have
> > > > > something fully baked. stay tuned.
> > > >
> > >
> > > > You can try this one instead. It replaces the km_alloc/km_free management of the
> > > > temp page with a fixed stolen low VA (like we do for the other pages). I have
> > > > run this through about a dozen cycles here and it seems ok but since you can
> > > > repro, I'd like to know if I'm on the right track or not.
> > >
> > > Apply to -current on the affected machine and LMK.
> > >
> > > -ml
> > >
> > >
> > > diff /export/bin/src/OpenBSD/hib
> > > path + /export/bin/src/OpenBSD/hib
> > > commit - 12762e4337611beea64f8029adc3932766410745
> > > blob - ec4e1904b741e23ee1df50ab3e807e90283c2fbc
> > > file + sys/arch/amd64/include/hibernate_var.h
> > > --- sys/arch/amd64/include/hibernate_var.h
> > > +++ sys/arch/amd64/include/hibernate_var.h
> > > @@ -52,8 +52,9 @@
> > >  #define HIBERNATE_STACK_PAGE     (PAGE_SIZE * 32)
> > >
> > >  #define HIBERNATE_INFLATE_PAGE   (PAGE_SIZE * 33)
> > > +#define HIBERNATE_TEMP_PAGE      (PAGE_SIZE * 34)
> > > >  /* HIBERNATE_HIBALLOC_PAGE must be the last stolen page (see machdep.c) */
> > > -#define HIBERNATE_HIBALLOC_PAGE  (PAGE_SIZE * 34)
> > > +#define HIBERNATE_HIBALLOC_PAGE  (PAGE_SIZE * 35)
> > >
> > >  /* Use 4MB hibernation chunks */
> > >  #define HIBERNATE_CHUNK_SIZE             0x400000
> > > commit - 12762e4337611beea64f8029adc3932766410745
> > > blob - 2b7924651df0c61efc89543ceacf5e7e5881e4ca
> > > file + sys/kern/subr_hibernate.c
> > > --- sys/kern/subr_hibernate.c
> > > +++ sys/kern/subr_hibernate.c
> > > @@ -66,7 +66,6 @@ CTASSERT((offsetof(union hibernate_info, sec_size) + s
> > >   */
> > >
> > >  /* Temporary vaddr ranges used during hibernate */
> > > -vaddr_t hibernate_temp_page;
> > >  vaddr_t hibernate_copy_page;
> > >  vaddr_t hibernate_rle_page;
> > >
> > > @@ -1527,13 +1526,14 @@ hibernate_write_chunks(union hibernate_info *hib)
> > >                                   case -1:
> > >                                           return EIO;
> > >                                   case 0:
> > > -                                         
> > > pmap_kenter_pa(hibernate_temp_page,
> > > +                                         
> > > pmap_kenter_pa(HIBERNATE_TEMP_PAGE,
> > >                                                   inaddr & PMAP_PA_MASK,
> > >                                                   PROT_READ);
> > >
> > > -                                         
> > > bcopy((caddr_t)hibernate_temp_page,
> > > +                                         
> > > bcopy((caddr_t)HIBERNATE_TEMP_PAGE,
> > >                                                   
> > > (caddr_t)hibernate_copy_page,
> > >                                                   PAGE_SIZE);
> > > +
> > >                                           inaddr += hibernate_deflate(hib,
> > >                                                   temp_inaddr,
> > >                                                   &out_remaining);
> > > @@ -1972,8 +1972,6 @@ hibernate_suspend(void)
> > >  int
> > >  hibernate_alloc(void)
> > >  {
> > > - KASSERT(hibernate_temp_page == 0);
> > > -
> > >   /*
> > >    * If we weren't able to early allocate a piglet, don't proceed
> > >    */
> > > @@ -1984,23 +1982,7 @@ hibernate_alloc(void)
> > >   pmap_kenter_pa(HIBERNATE_HIBALLOC_PAGE, HIBERNATE_HIBALLOC_PAGE,
> > >       PROT_READ | PROT_WRITE);
> > >
> > > - /*
> > > -  * Allocate VA for the temp page.
> > > -  *
> > > -  * This will become part of the suspended kernel and will
> > > -  * be freed in hibernate_free, upon resume (or hibernate
> > > -  * failure)
> > > -  */
> > > - hibernate_temp_page = (vaddr_t)km_alloc(PAGE_SIZE, &kv_any,
> > > -     &kp_none, &kd_nowait);
> > > - if (!hibernate_temp_page)
> > > -         goto unmap;
> > > -
> > >   return (0);
> > > -unmap:
> > > - pmap_kremove(HIBERNATE_HIBALLOC_PAGE, PAGE_SIZE);
> > > - pmap_update(pmap_kernel());
> > > - return (ENOMEM);
> > >  }
> > >
> > >  /*
> > > @@ -2011,13 +1993,7 @@ hibernate_free(void)
> > >  {
> > >   pmap_activate(curproc);
> > >
> > > - if (hibernate_temp_page) {
> > > -         pmap_kremove(hibernate_temp_page, PAGE_SIZE);
> > > -         km_free((void *)hibernate_temp_page, PAGE_SIZE,
> > > -             &kv_any, &kp_none);
> > > - }
> > > -
> > > - hibernate_temp_page = 0;
> > > + pmap_kremove(HIBERNATE_TEMP_PAGE, PAGE_SIZE);
> > >   pmap_kremove(HIBERNATE_HIBALLOC_PAGE, PAGE_SIZE);
> > >   pmap_update(pmap_kernel());
> > >  }
> >
> > --
> > Walter

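For reference, the stolen-page layout after the second quoted diff can
be written out as a small compile-time check.  This is a sketch, not
part of the patch: SKETCH_PAGE_SIZE and stolen_page_index() are made-up
names, and the 4 KB page size is an assumption for amd64.  The three
HIBERNATE_* values are copied from the quoted hibernate_var.h hunk.

```c
#include <assert.h>

/* Assumption: 4 KB pages; hypothetical name to avoid clashing with PAGE_SIZE. */
#define SKETCH_PAGE_SIZE	4096UL

/* Values as they appear after the quoted hibernate_var.h change. */
#define HIBERNATE_INFLATE_PAGE	(SKETCH_PAGE_SIZE * 33)
#define HIBERNATE_TEMP_PAGE	(SKETCH_PAGE_SIZE * 34)
#define HIBERNATE_HIBALLOC_PAGE	(SKETCH_PAGE_SIZE * 35)

/* The header comment requires the hiballoc page to stay last. */
static_assert(HIBERNATE_INFLATE_PAGE < HIBERNATE_TEMP_PAGE,
    "temp page must follow the inflate page");
static_assert(HIBERNATE_TEMP_PAGE < HIBERNATE_HIBALLOC_PAGE,
    "hiballoc page must be the last stolen page");

/* Hypothetical helper: which stolen page slot a fixed VA falls in. */
static inline unsigned long
stolen_page_index(unsigned long va)
{
	return va / SKETCH_PAGE_SIZE;
}
```

Under these assumptions the temp page sits at a fixed address that is
known at build time, so there is nothing to allocate before suspend and
nothing to free after resume.
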
-- 
Walter
