> Date: Sun, 9 Jul 2023 17:24:41 -0500
> From: Scott Cheloha <scottchel...@gmail.com>
> 
> On Sun, Jul 09, 2023 at 08:11:43PM +0200, Claudio Jeker wrote:
> > On Sun, Jul 09, 2023 at 12:52:20PM -0500, Scott Cheloha wrote:
> > > This patch fixes resume/unhibernate on GPROF kernels where kgmon(8)
> > > has activated kernel profiling.
> > > 
> > > I think the problem is that code called from cpu_hatch() does not play
> > > nicely with _mcount(), so GPROF kernels crash during resume.  I can't
> > > point you to which code in particular, but keeping all CPUs out of
> > > _mcount() until the primary CPU has completed resume/unhibernate fixes
> > > the crash.
> > > 
> > > ok?
> > 
> > > To be honest, I'm not sure we need something like this. GPROF is already a
> > > special case and people running a GPROF kernel should probably stop the
> > > collection of profile data before suspend/hibernate.
> 
> Sorry, I was a little unclear in my original mail.
> 
> When I say "has activated kernel profiling" I mean "has *ever*
> activated kernel profiling".
> 
> Regardless of whether or not profiling is active at the moment we
> reach sleep_state(), if kernel profiling has *ever* been activated in
> the past, the resume crashes.

So isn't the real problem that some of the lower-level code involved
in the resume path isn't properly marked to not do the
instrumentation?  Traditionally that was assembly code and we'd use
NENTRY() (on amd64) or ENTRY_NP() (on some other architectures) to
prevent these functions from calling _mcount().  But that was only
ever done for code used during early bootstrap of the kernel.  And
these days there may be C code that needs this as well.
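
For C code, a minimal sketch of what such marking could look like,
assuming a GCC- or clang-style compiler: both document a
no_instrument_function attribute that suppresses the profiling call
under -pg.  NO_PROFILE, resume_helper() and normal_path() are
hypothetical names for illustration only; they are not taken from the
tree, and whether the kernel build would want this attribute is an
assumption.

```c
/*
 * Hypothetical sketch: NO_PROFILE, resume_helper() and normal_path()
 * are illustrative names, not existing kernel interfaces.
 */
#if defined(__GNUC__) || defined(__clang__)
#define NO_PROFILE __attribute__((__no_instrument_function__))
#else
#define NO_PROFILE	/* assumption: no equivalent on other compilers */
#endif

/*
 * Marked function: the compiler is asked not to emit a profiling
 * call on entry, even when the file is built with -pg.
 */
NO_PROFILE int
resume_helper(int state)
{
	return state + 1;
}

/* Unmarked function: instrumented as usual in a profiling build. */
int
normal_path(int state)
{
	return resume_helper(state) * 2;
}
```

The attribute only covers the function it is attached to, so every C
function reachable from the resume path before it is safe to profile
would need the same marking.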

With your diff, functions in the suspend/resume path will still call
_mcount(), which may not be safe.
