On Fri, Nov 08, 2019 at 01:04:37PM -0800, Kevin J. McCarthy wrote:
On Fri, Nov 08, 2019 at 09:29:29PM +0100, Oswald Buddenhagen wrote:
I've pushed a commit up to the branch 'kevin/sigwinch-fixes'. Would
you mind giving that a try and seeing if it resolves the issue?
no luck. and i'm additionally getting a blank screen when invoking the
mailbox browser, until i ctrl-l (that was broken before, too, just
differently). as before, things normalize once i resize the window.
You mean as a result of the branch commit? That seems highly unlikely.
The patch simply removes the race condition of resetting the flag after
the redraw. All the other flag handlers in Mutt already do it that way.
when the code is racy in another place, then changing the conditions may
very well result in such a behavior difference.
Mutt installs its signal handlers twice - once before initscr(), and
once after. The second installation was originally only for slang,
but was changed to always run 18 years ago.
that unification was probably wrong - slang installs an own handler,
while ncurses won't overwrite an already installed one (if it's built
to even install an own handler at all, but that's the default).
That's what the code indicates. So for ncurses Mutt installs its
handler first, while for slang it installs afterward. The "Solaris 8
hack" was installing the handlers afterwards for ncurses too, but
installing the same handlers again shouldn't lead to this odd behavior.
Where is the race?
there would be if resetting the signal handlers flushed the queue, for
example.
but that double init cries HACK!! in the first place, and i wouldn't
be surprised if that was the source of the race.
By all means, comment it out and see: main.c:591.
yeah, but no luck with that. so much for that hypothesis ...
regardless, i'd revert that workaround to make the code clearer -
solaris 8 doesn't seem terribly relevant by now.
However I'm starting to suspect something funky with your terminal or
terminfo...
nothing wrong with the terminfo, as mutt runs just fine if started from
the command line inside konsole. only the direct launch via konsole -e
exposes the problem.
i tried strace'ing it, but guess what, this changes the situation
sufficiently to make the problem go away.
i had a quick look at mutt's signal handling. a few things stick out:
- the multiplexing into a single handler just to switch over the signal
in turn is rather pointless, and makes the code harder to understand
- the handling of the terminating singals calls signal-unsafe functions
- the SIGTSTP clause misses a fallthrough declaration. note that recent
compilers will complain about that and provide a pragma to declare
that properly - see for example
https://sourceforge.net/p/isync/isync/ci/33ee4a4ffed94164fd20c397b214101e07fcbe66/
- the installation of the empty sigchld handler is pointless, see
https://stackoverflow.com/questions/18437779/do-i-need-to-do-anything-with-a-sigchld-handler-if-i-am-just-using-wait-to-wai
however, i didn't find anything obviously wrong with the winch handling.
only the general note that i don't trust signal handling which isn't
self-pipe serialized with a proper main loop in interactive
applications.
i placed a write() inside the signal handler, and it confirms that there
is an early winch delivery, which happens after the index was already
drawn with the wrong geometry and key-wait state was entered (that means
that there is flicker, but it's not visible, because konsole delays
screen refreshing).
i wouldn't be surprised if konsole is internally racy and reports a
bogus screen size to ncurses. i'll investigate that and let you know.