Re: botched display - mutt or ncurses bug?

2019-11-09 Thread Oswald Buddenhagen

On Sat, Nov 09, 2019 at 12:11:41PM +0100, Oswald Buddenhagen wrote:
i wouldn't be surprised if konsole is internally racy and reports a 
bogus screen size to ncurses. i'll investigate that and let you know.


yep, that was it. someone "removed a hack" by replacing it with a broken 
one. what an epic waste of time. well, at least we reviewed mutt's 
signal handling and found some oddities. ^^




Re: botched display - mutt or ncurses bug?

2019-11-09 Thread Oswald Buddenhagen

On Fri, Nov 08, 2019 at 01:04:37PM -0800, Kevin J. McCarthy wrote:

On Fri, Nov 08, 2019 at 09:29:29PM +0100, Oswald Buddenhagen wrote:
I've pushed a commit up to the branch 'kevin/sigwinch-fixes'.  Would 
you mind giving that a try and seeing if it resolves the issue?


no luck. and i'm additionally getting a blank screen when invoking the 
mailbox browser, until i ctrl-l (that was broken before, too, just 
differently). as before, things normalize once i resize the window.


You mean as a result of the branch commit?  That seems highly unlikely. 
The patch simply removes the race condition of resetting the flag after 
the redraw.  All the other flag handlers in Mutt already do it that way.


when the code is racy in another place, then changing the conditions may 
very well result in such a behavior difference.


Mutt installs its signal handlers twice - once before initscr(), and 
once after.  The second installation was originally only for slang, 
but was changed to always run 18 years ago.


that unification was probably wrong - slang installs its own handler, 
while ncurses won't overwrite an already installed one (if it's built 
to even install its own handler at all, but that's the default). 


That's what the code indicates.  So for ncurses Mutt installs its 
handler first, while for slang it installs afterward.  The "Solaris 8 
hack" was installing the handlers afterwards for ncurses too, but 
installing the same handlers again shouldn't lead to this odd behavior. 
Where is the race?


there would be if resetting the signal handlers flushed the queue, for 
example.


but that double init cries HACK!! in the first place, and i wouldn't 
be surprised if that was the source of the race.


By all means, comment it out and see: main.c:591.


yeah, but no luck with that. so much for that hypothesis ...
regardless, i'd revert that workaround to make the code clearer - 
solaris 8 doesn't seem terribly relevant by now.


However I'm starting to suspect something funky with your terminal or 
terminfo...


nothing wrong with the terminfo, as mutt runs just fine if started from 
the command line inside konsole. only the direct launch via konsole -e 
exposes the problem.


i tried strace'ing it, but guess what, this changes the situation 
sufficiently to make the problem go away.


i had a quick look at mutt's signal handling. a few things stick out:
- the multiplexing into a single handler just to switch over the signal 
  in turn is rather pointless, and makes the code harder to understand

- the handling of the terminating signals calls signal-unsafe functions
- the SIGTSTP clause misses a fallthrough declaration. note that recent 
  compilers will complain about that and provide a pragma to declare 
  that properly - see for example 
  https://sourceforge.net/p/isync/isync/ci/33ee4a4ffed94164fd20c397b214101e07fcbe66/
- the installation of the empty sigchld handler is pointless, see 
  https://stackoverflow.com/questions/18437779/do-i-need-to-do-anything-with-a-sigchld-handler-if-i-am-just-using-wait-to-wai
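
To illustrate the fallthrough point in the list above: GCC 7 and later accept a statement attribute that marks a fall-through as deliberate. The following is only a sketch. The macro name, the action flags and the function are hypothetical and are not taken from mutt's sources.

```c
#include <signal.h>

/* Use the statement attribute where the compiler supports it;
 * otherwise expand to a no-op. */
#if defined(__has_attribute)
# if __has_attribute(fallthrough)
#  define FALLTHROUGH __attribute__((fallthrough))
# endif
#endif
#ifndef FALLTHROUGH
# define FALLTHROUGH ((void)0)
#endif

/* Hypothetical action flags, for illustration only. */
enum { ACT_SUSPEND_CLEANUP = 1, ACT_REDRAW = 2 };

/* Returns a bitmask of actions for a signal (hypothetical dispatcher). */
int actions_for_signal(int signo)
{
    int actions = 0;

    switch (signo) {
    case SIGTSTP:
        actions |= ACT_SUSPEND_CLEANUP;
        FALLTHROUGH;            /* deliberate: a resume also needs a redraw */
    case SIGCONT:
        actions |= ACT_REDRAW;
        break;
    default:
        break;
    }
    return actions;
}
```

With the marker in place, -Wimplicit-fallthrough stays quiet for this case while still flagging accidental fall-throughs elsewhere.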


however, i didn't find anything obviously wrong with the winch handling. 
only the general note that i don't trust signal handling which isn't 
self-pipe serialized with a proper main loop in interactive 
applications.


i placed a write() inside the signal handler, and it confirms that there 
is an early winch delivery, which happens after the index was already 
drawn with the wrong geometry and key-wait state was entered (that means 
that there is flicker, but it's not visible, because konsole delays 
screen refreshing).


i wouldn't be surprised if konsole is internally racy and reports a 
bogus screen size to ncurses. i'll investigate that and let you know.




Re: botched display - mutt or ncurses bug?

2019-11-08 Thread Kevin J. McCarthy

On Fri, Nov 08, 2019 at 09:29:29PM +0100, Oswald Buddenhagen wrote:
I've pushed a commit up to the branch 'kevin/sigwinch-fixes'.  Would 
you mind giving that a try and seeing if it resolves the issue?


no luck. and i'm additionally getting a blank screen when invoking the 
mailbox browser, until i ctrl-l (that was broken before, too, just 
differently). as before, things normalize once i resize the window.


You mean as a result of the branch commit?  That seems highly unlikely. 
The patch simply removes the race condition of resetting the flag after 
the redraw.  All the other flag handlers in Mutt already do it that way.


Mutt installs its signal handlers twice - once before initscr(), and 
once after.  The second installation was originally only for slang, 
but was changed to always run 18 years ago.


that unification was probably wrong - slang installs its own handler, 
while ncurses won't overwrite an already installed one (if it's built 
to even install its own handler at all, but that's the default). 


That's what the code indicates.  So for ncurses Mutt installs its 
handler first, while for slang it installs afterward.  The "Solaris 8 
hack" was installing the handlers afterwards for ncurses too, but 
installing the same handlers again shouldn't lead to this odd behavior. 
Where is the race?


but that double init cries HACK!! in the first place, and i wouldn't 
be surprised if that was the source of the race.


By all means, comment it out and see: main.c:591.  However I'm starting 
to suspect something funky with your terminal or terminfo...


--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: botched display - mutt or ncurses bug?

2019-11-08 Thread Oswald Buddenhagen

On Fri, Nov 08, 2019 at 08:13:07AM -0800, Kevin J. McCarthy wrote:

On Fri, Nov 08, 2019 at 10:38:26AM +0100, Oswald Buddenhagen wrote:
anyway, i now have a hypothesis that is consistent with all 
observations so far: the trigger is an early resize event


Do you mean a resize that you perform, or are you referring to the 
asynchronous update that konsole performs?



the latter. i fail to parse my mail differently ...

I've pushed a commit up to the branch 'kevin/sigwinch-fixes'.  Would 
you mind giving that a try and seeing if it resolves the issue?


no luck. and i'm additionally getting a blank screen when invoking the 
mailbox browser, until i ctrl-l (that was broken before, too, just 
differently). as before, things normalize once i resize the window.


Mutt installs its signal handlers twice - once before initscr(), and 
once after.  The second installation was originally only for slang, 
but was changed to always run 18 years ago.


that unification was probably wrong - slang installs its own handler, 
while ncurses won't overwrite an already installed one (if it's built to 
even install its own handler at all, but that's the default). midnight 
commander chains sigwinch for slang, but not ncurses.
but that double init cries HACK!! in the first place, and i wouldn't be 
surprised if that was the source of the race.


Re: botched display - mutt or ncurses bug?

2019-11-08 Thread Kevin J. McCarthy

On Fri, Nov 08, 2019 at 10:38:26AM +0100, Oswald Buddenhagen wrote:
anyway, i now have a hypothesis that is consistent with all 
observations so far: the trigger is an early resize event


Do you mean a resize that you perform, or are you referring to the 
asynchronous update that konsole performs?


that would mean that the SIGWINCH handling in mutt is racy, at least 
during startup.


There are a couple odd things here.  Mutt installs its signal handlers 
twice - once before initscr(), and once after.  The second installation 
was originally only for slang, but was changed to always run 18 years 
ago.


The commit has a reference to an old ticket (not in Trac) noting the 
second installation helps with Solaris 8.  I don't think this would 
cause a race issue though.


The SIGWINCH handler sets a flag, which is then processed outside the 
handler.  However, I did note two places where the flag is reset *after* 
the resize/reflow.  That does seem racy, and is easy enough to fix.


I've pushed a commit up to the branch 'kevin/sigwinch-fixes'.  Would you
mind giving that a try and seeing if it resolves the issue?

--
Kevin J. McCarthy
GPG Fingerprint: 8975 A9B3 3AA3 7910 385C  5308 ADEF 7684 8031 6BDA




Re: botched display - mutt or ncurses bug?

2019-11-08 Thread Patrick Shanahan
* Oswald Buddenhagen  [11-08-19 04:41]:
> On Thu, Nov 07, 2019 at 06:43:57PM -0500, Thomas Dickey wrote:
> > If you had supplied a "typescript" file (from "script")
> > matching the screenshots, someone could examine that and
> > infer the problem.
> > 
> i take it that you're interested. ;)
> but the log i have is from opening my inbox and thus not for public
> consumption.
> 
> anyway, i now have a hypothesis that is consistent with all observations so
> far: the trigger is an early resize event (unlike apparently all other
> emulators, konsole initializes the screen to 80x40 and does the update to
> the actual window size asynchronously). apart from the fact that the
> persistent positioning error always appears in line 41, the situation always
> normalizes for the rest of the session once i resize the window.
> that would mean that the SIGWINCH handling in mutt is racy, at least during
> startup.

so alter your initial konsole window size and see if you still experience
the anomaly

fwiw: I observe different but similar behavior with tmux/yakuake when the
screen size differs between attachments.

-- 
(paka)Patrick Shanahan   Plainfield, Indiana, USA  @ptilopteri
http://en.opensuse.org    openSUSE Community Member    facebook/ptilopteri
Photos: http://wahoo.no-ip.org/piwigo   paka @ IRCnet freenode


Re: botched display - mutt or ncurses bug?

2019-11-08 Thread Oswald Buddenhagen

On Thu, Nov 07, 2019 at 06:43:57PM -0500, Thomas Dickey wrote:

If you had supplied a "typescript" file (from "script")
matching the screenshots, someone could examine that and
infer the problem.


i take it that you're interested. ;)
but the log i have is from opening my inbox and thus not for public 
consumption.


anyway, i now have a hypothesis that is consistent with all observations 
so far: the trigger is an early resize event (unlike apparently all 
other emulators, konsole initializes the screen to 80x40 and does the 
update to the actual window size asynchronously). apart from the fact 
that the persistent positioning error always appears in line 41, the 
situation always normalizes for the rest of the session once i resize 
the window.
that would mean that the SIGWINCH handling in mutt is racy, at least 
during startup.
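
For reference, the size a program sees at any moment is whatever the terminal driver currently reports. The following sketch shows how it can be queried directly. query_winsize is a hypothetical helper, and this is not how mutt or ncurses are known to do it internally.

```c
#include <sys/ioctl.h>
#include <termios.h>
#include <unistd.h>

/* Ask the tty behind fd for its current size.
 * Returns 0 and fills *rows and *cols on success; returns -1 if fd is
 * not a terminal or the reported size is zero (outputs left untouched). */
int query_winsize(int fd, int *rows, int *cols)
{
    struct winsize ws;

    if (ioctl(fd, TIOCGWINSZ, &ws) < 0)
        return -1;
    if (ws.ws_row == 0 || ws.ws_col == 0)
        return -1;
    *rows = ws.ws_row;
    *cols = ws.ws_col;
    return 0;
}
```

If the emulator sets the real size only after the child has started, a query made at startup returns the provisional value, and the correct one becomes visible only once the SIGWINCH for the update has been handled.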


Re: botched display - mutt or ncurses bug?

2019-11-07 Thread Eike Rathke
Hi,

On Thursday, 2019-11-07 21:40:00 +0100, Oswald Buddenhagen wrote:

> anyone else seen such a thing?

Never. Running Mutt master built on Debian buster and Fedora 29 in Gnome
terminal.

  Eike

-- 
OpenPGP/GnuPG encrypted mail preferred in all private communication.
GPG key 0x6A6CD5B765632D3A - 2265 D7F3 A7B0 95CC 3918  630B 6A6C D5B7 6563 2D3A
Use LibreOffice! https://www.libreoffice.org/




Re: botched display - mutt or ncurses bug?

2019-11-07 Thread Santiago Torres
On Thu, Nov 07, 2019 at 09:40:00PM +0100, Oswald Buddenhagen wrote:
> hi,
> 
> for some weeks/months now, i'm getting weird rendering artifacts in mutt. i
> first thought that it's konsole's fault, because it manifests only there,
> but the captured raw tty output reveals that the mutt output is already
> really weird (apart from the rendering bugs it's also rather inefficient,
> including obvious no-ops - but not consistently), and cat'ing it in xterm
> produces the same mess. running mutt through valgrind/memcheck reveals no
> problems ... but the rendering problem also goes away, so it could be some
> subtle initialization bug related to the runtime environment.
> 
> while the bogus output most probably comes from ncurses, mutt appears to be
> the only ncurses application i tried that has such problems, so maybe it's
> abusing ncurses in some way.
> 
> i'm running on up-to-date debian unstable (ncurses is slightly outdated
> there) with mutt master.
> 
> you can find screenshots and my wild speculations in
> https://bugs.kde.org/show_bug.cgi?id=412598
> 
> anyone else seen such a thing?

I remember seeing some weird artifacts with curses-based applications
using vte-ng and unicode glyphs.

This is just a hail-mary shot. I hope this is helpful.

-Santiago.




Re: botched display - mutt or ncurses bug?

2019-11-07 Thread Thomas Dickey
On Thu, Nov 07, 2019 at 09:40:00PM +0100, Oswald Buddenhagen wrote:
> hi,
> 
> for some weeks/months now, i'm getting weird rendering artifacts in mutt. i
> first thought that it's konsole's fault, because it manifests only there,

...

> you can find screenshots and my wild speculations in
> https://bugs.kde.org/show_bug.cgi?id=412598

There's not enough information provided in the report.

If you had supplied a "typescript" file (from "script")
matching the screenshots, someone could examine that and
infer the problem.

-- 
Thomas E. Dickey 
https://invisible-island.net
ftp://ftp.invisible-island.net

