On Wed, Apr 15, 2020 at 10:41:27AM +0200, Stephen Berman wrote:
> On Mon, 13 Apr 2020 21:29:33 +0200 Stephen Berman <[email protected]> wrote:
> 
> > On Sun, 12 Apr 2020 18:47:08 +0100 Ken Moffat <[email protected]> wrote:
> >
> >> On Sat, Apr 11, 2020 at 06:43:22PM +0200, Stephen Berman wrote:
> >>> On Fri, 10 Apr 2020 15:51:22 -0500 Bruce Dubbs <[email protected]> wrote:
> >>>
> >>> > On 4/10/20 3:29 PM, Stephen Berman wrote:
> >>> >> I've built current development LFS using jhalfs and when I invoke (via
> >>> >> sudo or logged in as root) `shutdown -h now', the system appears to hang
> >>> >> while trying to detach the cdrom block device.  Here are the last two
> >>> >> lines printed to the terminal after issuing that command:
> >>> >> Bringing down the loopback interface..........[OK]
> >>> >> sr 5:0:0:0: tag#21 timing out command, waited 120s
> >>> >> and every 2 minutes, the last line repeats with a different tag#.
> [...]
> >> For finding which commit caused a problem, you really need to run
> >> git bisect.  I hope you are aware that the stable kernels maintained
> >> by Greg KH use a different git tree from Linus' upstream, and that
> >> while in Linus' tree there is a progression from 4.20.0 to 5.6.0, in
> >> Greg's stable trees the progressions are 4.20.0 to 4.20.last, 5.0.0
> >> to 5.0.last, etc.
> >
> > I didn't know this, thanks for pointing it out.
> >
> >> Picking random versions can help identify where a problem occurred,
> >> but it needs a plan.  I'll suggest you try 5.6.latest first (5.6.3
> >> at the moment), just in case.  Assuming that is still broken, try
> >> 5.6.0 itself to confirm the breakage.  Then try 4.20.0 to confirm
> >> that was ok (we know 4.20.12 was ok, but that is not in Linus's
> >> tree).  You could then bisect in Linus's tree.
> >
> > I have now built the just released 5.6.4 stable kernel and the problem
> > persists with it.  I'll try to get to the mainline kernels you suggest
> > in the next day or two.  (FWIW, on the same machine I also have openSUSE
> > Tumbleweed installed, which is currently running (a modified) 5.6.2
> > kernel, and here the machine powers off normally (but I do it via
> > KDE, not directly with `shutdown -h now'.))
> >
> >> Alternatively, if you can identify in which stable series the
> >> problem first appeared, you might bisect in that series.  But that
> >> is only recommended if the series is still supported (so, 5.4, and
> >> perhaps 5.5) and in any case GKH seems content if his kernels are
> >> bug compatible with Linus'.
> >>
> >> I'd better warn you that for us mere mortals git bisection does not
> >> always provide a clear answer, although "timeouts during shutdown"
> >> sounds like the sort of problem that will give a clear 'good or bad'
> >> answer.
> >
> > I hope so, but if I understand the bisection process correctly, I'm
> > afraid it may be too time-consuming, since the machine where this
> > problem occurs is my main work computer.  Is the process this?:
> > 1. Configure the kernel.  Is it necessary to do `make oldconfig' using the
> >    4.20.12 config file, or could I just run make with that config file
> >    on later kernel sources?
> > 2. Build and install the kernel and modules.
> > 3. Reboot and then do `shutdown -h now'.
> > 4. Reboot.
> > 5. Lather, rinse, repeat.
> 
> What I described here is a manual bisection; I don't see how I can use
> `git bisect' to do this, because I have to restart the process after
> each reboot.  If I've misunderstood, I'd appreciate it if someone would
> give me a step by step recipe for how to do it.
> 

Git stores its bisect state on disk under '.git/', so a bisection
works fine across reboots.

Look for information on 'git bisect' to better understand the
process.  It is absolutely normal for kernel problems to be bisected
on the machine where the problem is happening.
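
To illustrate the point about state living in '.git/', here is a
minimal sketch on a throwaway repository rather than the kernel tree.
Everything in it is made up for the demonstration: the temporary
repository, the eight commits, and the 'state' file that stands in
for "did shutdown power off the machine".  Only the 'git bisect'
subcommands are the real thing.

```shell
#!/bin/sh
# Toy bisection: the repo, commits and 'state' file are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Bisect Demo"

# Eight commits; by construction the "bug" first appears in commit 6.
for i in 1 2 3 4 5 6 7 8; do
    if [ "$i" -ge 6 ]; then echo bad > state; else echo good > state; fi
    echo "$i" > version
    git add state version
    git commit -q -m "commit $i"
done

first=$(git rev-list --max-parents=0 HEAD)
git bisect start >/dev/null
git bisect bad HEAD >/dev/null        # newest commit is known bad
git bisect good "$first" >/dev/null   # oldest commit is known good

# The bisect state is now on disk, so a reboot would not lose it.
test -f .git/BISECT_LOG

while :; do
    # Stand-in for "build, install, boot, try shutdown -h now".
    if grep -q bad state; then verdict=bad; else verdict=good; fi
    out=$(git bisect "$verdict")
    case $out in
        *"is the first bad commit"*) break ;;
    esac
done

# refs/bisect/bad points at the first bad commit once bisect is done.
culprit=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $culprit"

git bisect reset >/dev/null 2>&1
cd /
rm -rf "$repo"
```

In a clone of Linus's tree the same sequence would be 'git bisect
start', 'git bisect bad v5.3', 'git bisect good v4.20' (assuming
4.20.0 does turn out to be good), then build, install and boot
whatever git checks out, test, and run 'git bisect good' or 'git
bisect bad' after each reboot.  'git bisect log' shows where you are
if you lose track.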

But -

> Be that as it may, in the mean time I've cloned the mainline kernel
> repository, checked out 5.3.0 as the approximate midpoint between the
> known good (4.20.12) and bad (5.5.9) versions of the stable kernel,
> built and installed it, rebooted with this kernel, and here, too,
> `shutdown -h now' failed to power off the computer.  One difference from
> 5.5.9 and 5.6.4 is that there was no "timing out command" message after
> "Bringing down the loopback interface", even though I waited more than
> two minutes.
> 
Possibly two bugs, or an added descriptive message in a later commit
to highlight the problem.

> However, I have discovered a pattern which seems significant: if, after
> booting, I immediately do `shutdown -h now' (i.e., from the tty, without
> starting X or any other program), then the machine does power off, 20
> seconds after the "Bringing down the loopback interface" message; I've
> replicated this with 5.3.0 and 5.6.4.  But if I run another program
> before shutdown (the two cases I've tried are startx and emacs in the
> tty), then shutdown does not power off the machine.
> 

So, something in userspace is triggering this.  Which window manager
are you using for this?

> In the absence of a clear understanding of how to use `git bisect' for
> this issue, I will, as time permits, manually try versions from the
> mainline repository between 4.20.0 and 5.3.0.
> 
> Steve Berman

ĸen
-- 
The beauty of reading a page of de Selby is that it leads one
inescapably to the conclusion that one is not, of all nincompoops,
the greatest.            -- du Garbandier