Re: botched display - mutt or ncurses bug?
On Sat, Nov 09, 2019 at 12:11:41PM +0100, Oswald Buddenhagen wrote:
> i wouldn't be surprised if konsole is internally racy and reports a bogus
> screen size to ncurses. i'll investigate that and let you know.

yep, that was it. someone "removed a hack" by replacing it with a broken one. what an epic waste of time.

well, at least we reviewed mutt's signal handling and found some oddities. ^^
Re: botched display - mutt or ncurses bug?
On Fri, Nov 08, 2019 at 01:04:37PM -0800, Kevin J. McCarthy wrote:
> On Fri, Nov 08, 2019 at 09:29:29PM +0100, Oswald Buddenhagen wrote:
> > > I've pushed a commit up to the branch 'kevin/sigwinch-fixes'. Would you mind giving that a try and seeing if it resolves the issue?
> >
> > no luck. and i'm additionally getting a blank screen when invoking the mailbox browser, until i ctrl-l (that was broken before, too, just differently). as before, things normalize once i resize the window.
>
> You mean as a result of the branch commit? That seems highly unlikely. The patch simply removes the race condition of resetting the flag after the redraw. All the other flag handlers in Mutt already do it that way.

when the code is racy in another place, then changing the conditions may very well result in such a behavior difference.

> > > Mutt installs its signal handlers twice - once before initscr(), and once after. The second installation was originally only for slang, but was changed to always run 18 years ago.
> >
> > that unification was probably wrong - slang installs an own handler, while ncurses won't overwrite an already installed one (if it's built to even install an own handler at all, but that's the default).
>
> That's what the code indicates. So for ncurses Mutt installs its handler first, while for slang it installs afterward. The "Solaris 8 hack" was installing the handlers afterwards for ncurses too, but installing the same handlers again shouldn't lead to this odd behavior. Where is the race?

there would be if resetting the signal handlers flushed the queue, for example.

> > but that double init cries HACK!! in the first place, and i wouldn't be surprised if that was the source of the race.
>
> By all means, comment it out and see: main.c:591.

yeah, but no luck with that. so much for that hypothesis ...
regardless, i'd revert that workaround to make the code clearer - solaris 8 doesn't seem terribly relevant by now.

> However I'm starting to suspect something funky with your terminal or terminfo...

nothing wrong with the terminfo, as mutt runs just fine if started from the command line inside konsole. only the direct launch via konsole -e exposes the problem.

i tried strace'ing it, but guess what, this changes the situation sufficiently to make the problem go away.

i had a quick look at mutt's signal handling. a few things stick out:
- the multiplexing into a single handler just to switch over the signal in turn is rather pointless, and makes the code harder to understand
- the handling of the terminating signals calls signal-unsafe functions
- the SIGTSTP clause misses a fallthrough declaration. note that recent compilers will complain about that and provide a pragma to declare that properly - see for example https://sourceforge.net/p/isync/isync/ci/33ee4a4ffed94164fd20c397b214101e07fcbe66/
- the installation of the empty sigchld handler is pointless, see https://stackoverflow.com/questions/18437779/do-i-need-to-do-anything-with-a-sigchld-handler-if-i-am-just-using-wait-to-wai

however, i didn't find anything obviously wrong with the winch handling. only the general note that i don't trust signal handling which isn't self-pipe serialized with a proper main loop in interactive applications.

i placed a write() inside the signal handler, and it confirms that there is an early winch delivery, which happens after the index was already drawn with the wrong geometry and key-wait state was entered (that means that there is flicker, but it's not visible, because konsole delays screen refreshing).

i wouldn't be surprised if konsole is internally racy and reports a bogus screen size to ncurses. i'll investigate that and let you know.
Re: botched display - mutt or ncurses bug?
On Fri, Nov 08, 2019 at 09:29:29PM +0100, Oswald Buddenhagen wrote:
> > I've pushed a commit up to the branch 'kevin/sigwinch-fixes'. Would you mind giving that a try and seeing if it resolves the issue?
>
> no luck. and i'm additionally getting a blank screen when invoking the mailbox browser, until i ctrl-l (that was broken before, too, just differently). as before, things normalize once i resize the window.

You mean as a result of the branch commit? That seems highly unlikely. The patch simply removes the race condition of resetting the flag after the redraw. All the other flag handlers in Mutt already do it that way.

> > Mutt installs its signal handlers twice - once before initscr(), and once after. The second installation was originally only for slang, but was changed to always run 18 years ago.
>
> that unification was probably wrong - slang installs an own handler, while ncurses won't overwrite an already installed one (if it's built to even install an own handler at all, but that's the default).

That's what the code indicates. So for ncurses Mutt installs its handler first, while for slang it installs afterward. The "Solaris 8 hack" was installing the handlers afterwards for ncurses too, but installing the same handlers again shouldn't lead to this odd behavior. Where is the race?

> but that double init cries HACK!! in the first place, and i wouldn't be surprised if that was the source of the race.

By all means, comment it out and see: main.c:591. However I'm starting to suspect something funky with your terminal or terminfo...

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C 5308 ADEF 7684 8031 6BDA
Re: botched display - mutt or ncurses bug?
On Fri, Nov 08, 2019 at 08:13:07AM -0800, Kevin J. McCarthy wrote:
> On Fri, Nov 08, 2019 at 10:38:26AM +0100, Oswald Buddenhagen wrote:
> > anyway, i now have a hypothesis that is consistent with all observations so far: the trigger is an early resize event
>
> Do you mean a resize that you perform, or are you referring to the asynchronous update that konsole performs?

the latter. i fail to parse my mail differently ...

> I've pushed a commit up to the branch 'kevin/sigwinch-fixes'. Would you mind giving that a try and seeing if it resolves the issue?

no luck. and i'm additionally getting a blank screen when invoking the mailbox browser, until i ctrl-l (that was broken before, too, just differently). as before, things normalize once i resize the window.

> Mutt installs its signal handlers twice - once before initscr(), and once after. The second installation was originally only for slang, but was changed to always run 18 years ago.

that unification was probably wrong - slang installs an own handler, while ncurses won't overwrite an already installed one (if it's built to even install an own handler at all, but that's the default).

midnight commander chains sigwinch for slang, but not ncurses.

but that double init cries HACK!! in the first place, and i wouldn't be surprised if that was the source of the race.
Re: botched display - mutt or ncurses bug?
On Fri, Nov 08, 2019 at 10:38:26AM +0100, Oswald Buddenhagen wrote:
> anyway, i now have a hypothesis that is consistent with all observations so far: the trigger is an early resize event

Do you mean a resize that you perform, or are you referring to the asynchronous update that konsole performs?

> that would mean that the SIGWINCH handling in mutt is racy, at least during startup.

There are a couple odd things here.

Mutt installs its signal handlers twice - once before initscr(), and once after. The second installation was originally only for slang, but was changed to always run 18 years ago. The commit has a reference to an old ticket (not in Trac) noting the second installation helps with Solaris 8. I don't think this would cause a race issue though.

The SIGWINCH handler sets a flag, which is then processed outside the handler. However, I did note two places where the flag is reset *after* the resize/reflow. That does seem racy, and is easy enough to fix.

I've pushed a commit up to the branch 'kevin/sigwinch-fixes'. Would you mind giving that a try and seeing if it resolves the issue?

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C 5308 ADEF 7684 8031 6BDA
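To illustrate why the order of "reset the flag" and "redraw" matters, here is a small sketch of the pattern described above. It is not taken from Mutt's source: `winch_pending`, `on_winch`, the two `process_resize_*` functions and the simulated redraw are hypothetical names. If the flag is cleared after the redraw, a SIGWINCH that arrives while the redraw is running gets wiped out and the screen keeps the stale geometry; clearing it first leaves that second event pending for the next pass.

```c
#include <signal.h>

/* Hypothetical flag set by the handler and consumed by the main loop. */
volatile sig_atomic_t winch_pending = 0;

/* Handler: only records that a resize happened. */
void on_winch(int sig)
{
    (void)sig;
    winch_pending = 1;
}

typedef void (*redraw_fn)(void);

/* Racy order: a SIGWINCH delivered during redraw() is erased afterwards. */
int process_resize_racy(redraw_fn redraw)
{
    if (!winch_pending)
        return 0;
    redraw();
    winch_pending = 0;   /* too late: may discard a newer event */
    return 1;
}

/* Safer order: clear first, so an event during redraw() stays pending. */
int process_resize_fixed(redraw_fn redraw)
{
    if (!winch_pending)
        return 0;
    winch_pending = 0;
    redraw();
    return 1;
}

/* Simulation helper: a redraw during which a second resize arrives. */
void redraw_interrupted_by_resize(void)
{
    on_winch(SIGWINCH);
}
```

With the racy variant the simulated second resize is lost (the flag ends up 0); with the fixed variant the flag is still 1 afterwards, so the next loop iteration redraws again with the current size.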
Re: botched display - mutt or ncurses bug?
* Oswald Buddenhagen [11-08-19 04:41]:
> On Thu, Nov 07, 2019 at 06:43:57PM -0500, Thomas Dickey wrote:
> > If you had supplied a "typescript" file (from "script")
> > matching the screenshots, someone could examine that and
> > infer the problem.
> >
> i take it that you're interested. ;)
> but the log i have is from opening my inbox and thus not for public
> consumption.
>
> anyway, i now have a hypothesis that is consistent with all observations so
> far: the trigger is an early resize event (unlike apparently all other
> emulators, konsole initializes the screen to 80x40 and does the update to
> the actual window size asynchronously). apart from the fact that the
> persistent positioning error always appears in line 41, the situation always
> normalizes for the rest of the session once i resize the window.
> that would mean that the SIGWINCH handling in mutt is racy, at least during
> startup.

so alter your initial konsole window size and see if you still experience the anomaly

fwiw: I observe different but similar behavior with tmux/yakuake when the screen size differs between attachments.

--
(paka)Patrick Shanahan    Plainfield, Indiana, USA    @ptilopteri
http://en.opensuse.org    openSUSE Community Member    facebook/ptilopteri
Photos: http://wahoo.no-ip.org/piwigo    paka @ IRCnet freenode
Re: botched display - mutt or ncurses bug?
On Thu, Nov 07, 2019 at 06:43:57PM -0500, Thomas Dickey wrote:
> If you had supplied a "typescript" file (from "script")
> matching the screenshots, someone could examine that and
> infer the problem.

i take it that you're interested. ;)
but the log i have is from opening my inbox and thus not for public consumption.

anyway, i now have a hypothesis that is consistent with all observations so far: the trigger is an early resize event (unlike apparently all other emulators, konsole initializes the screen to 80x40 and does the update to the actual window size asynchronously). apart from the fact that the persistent positioning error always appears in line 41, the situation always normalizes for the rest of the session once i resize the window.
that would mean that the SIGWINCH handling in mutt is racy, at least during startup.
Re: botched display - mutt or ncurses bug?
Hi,

On Thursday, 2019-11-07 21:40:00 +0100, Oswald Buddenhagen wrote:
> anyone else seen such a thing?

Never. Running Mutt master built on Debian buster and Fedora 29 in Gnome terminal.

  Eike

--
OpenPGP/GnuPG encrypted mail preferred in all private communication.
GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918 630B 6A6C D5B7 6563 2D3A
Use LibreOffice! https://www.libreoffice.org/
Re: botched display - mutt or ncurses bug?
On Thu, Nov 07, 2019 at 09:40:00PM +0100, Oswald Buddenhagen wrote:
> hi,
>
> for some weeks/months now, i'm getting weird rendering artifacts in mutt. i
> first thought that it's konsole's fault, because it manifests only there,
> but the captured raw tty output reveals that the mutt output is already
> really weird (apart from the rendering bugs it's also rather inefficient,
> including obvious no-ops - but not consistently), and cat'ing it in xterm
> produces the same mess. running mutt through valgrind/memcheck reveals no
> problems ... but the rendering problem also goes away, so it could be some
> subtle initialization bug related to the runtime environment.
>
> while the bogus output most probably comes from ncurses, mutt appears to be
> the only ncurses application i tried that has such problems, so maybe it's
> abusing ncurses in some way.
>
> i'm running on up-to-date debian unstable (ncurses is slightly outdated
> there) with mutt master.
>
> you can find screenshots and my wild speculations in
> https://bugs.kde.org/show_bug.cgi?id=412598
>
> anyone else seen such a thing?

I remember seeing some weird artifacts with curses-based applications using vte-ng and unicode glyphs.

This is just a hail-mary shot. I hope this is helpful.

-Santiago.
Re: botched display - mutt or ncurses bug?
On Thu, Nov 07, 2019 at 09:40:00PM +0100, Oswald Buddenhagen wrote:
> hi,
>
> for some weeks/months now, i'm getting weird rendering artifacts in mutt. i
> first thought that it's konsole's fault, because it manifests only there,
...
> you can find screenshots and my wild speculations in
> https://bugs.kde.org/show_bug.cgi?id=412598

There's not enough information provided in the report.

If you had supplied a "typescript" file (from "script") matching the screenshots, someone could examine that and infer the problem.

--
Thomas E. Dickey
https://invisible-island.net
ftp://ftp.invisible-island.net