Hello Stephen.

I hope you are rich enough to have not only survived the flooding,
but to also be able to repair the damage done!
(I have not seen any videos or photos, but i have read some
numbers, and it was quite frightening.)

Stephen Isard wrote in
 <5799-1630854952-558...@sneakemail.com>:
 |I'm sorry to report that the problem has not gone away entirely.  I 
 |appear to get two kinds of error conditions.  As I last reported
 |
 |> I've had the error message
 |>
 |> s-nail: TLS socket read error, retrying: Resource temporarily unavailable
 |> s-nail: IMAP: error:00000000:lib(0):func(0):reason(0)
 |>
 |> while in the editor, and when I saved and quit, the edited version DID
 |> appear in the message.
 |
 |That seems to happen when I lose my connection to the internet 
 |altogether, not just to the IMAP server, and the error message comes up 
 |in the s-nail window while I am still in the editor.
 |
 |In the other condition, the message
 |
 |s-nail: IMAP write error: error:00000000:lib(0):func(0):reason(0)
 |
 |appears only when I leave the editor, and I don't know what event causes 
 |it.  That's the circumstance in which the editor temporary file doesn't 
 |get read back into the message buffer, and the temporary file remains in 
 |place until the message is sent or abandoned, at which point it is 
 |deleted.
 |
 |What chain of events can lead to an IMAP write error?

Dear Stephen, i thought it was clear that this is all about your
editor, not about S-nail.  We will always read back the edited
file _if_ the size or its timestamp changed (aka whenever it seems
to have been modified, easy approach aka not checksum-checked),
_unless_ the editor exits with a non-0 exit status.
If your editor were

  #!/bin/sh -
  trap : ALRM HUP INT QUIT TERM
  the-real-editor "$@"
  exit 0

then your problem with not reading back the temporary file should
not happen.
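
The read-back rule described above can be sketched in shell.  This
is only an illustration under stated assumptions, not S-nail's
code: edit_and_check is a made-up helper name, and the reference
file created with mktemp and "touch -r" merely stands in for the
remembered timestamp.

```shell
#!/bin/sh -
# Hypothetical helper, not S-nail code: run an editor command on a file and
# print what the rule described above would do with the result.
# Usage: edit_and_check FILE EDITOR [EDITOR-ARGS...]
edit_and_check() {
  file=$1; shift
  ref=$(mktemp) || return 2
  touch -r "$file" "$ref"               # remember the old timestamp
  old_size=$(($(wc -c < "$file")))      # remember the old size

  status=0
  "$@" "$file" || status=$?             # run the editor, keep its exit status

  new_size=$(($(wc -c < "$file")))
  newer=$(find "$file" -newer "$ref")   # non-empty if the mtime advanced
  rm -f "$ref"

  if [ "$status" -ne 0 ]; then
    echo discard                        # non-0 exit status: never read back
  elif [ "$old_size" -ne "$new_size" ] || [ -n "$newer" ]; then
    echo read-back                      # size or timestamp changed
  else
    echo unchanged                      # looks unmodified, nothing to do
  fi
}
```

For example, `edit_and_check /tmp/draft ed` would print "discard"
whenever ed leaves with a non-0 status, no matter what it wrote to
the file before that.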

Regarding IMAP errors as such.  Well, IMAP runs over TCP, and if
you have a flaky internet connection then connection breaks are
a regular problem.  I can tell you, i was struggling for years
with SSH breaks, with POP3 breaks which meant redoing all the
stuff (because a now-do-sync is not possible), and with HTTPS
breakups (which is a _real_ problem if some python forum aka
tracker software thinks you are a spammer and locks you out for 72
hours or something because action/IP or whatever limits exist).

For now the MUA still uses that *imap-keepalive* mechanism of
keeping TCP connections alive.  This uses alarm(2) based
scheduling, meaning that whenever some timeout passes a SIGALRM is
sent, and then we perform some network I/O in the signal handler
in order to let the server know that it really has a peer.
The problem in the released version was that we do block SIGALRM
in the MUA while you are editing a file in compose mode, so that
this mechanism does not kick in, which thus could lead to the
server kicking us out.  Note, however, that such timeouts usually
are pretty high, many minutes, even half an hour or so.  The now
obsolete standard even said

  5.4.    Autologout Timer

     If a server has an inactivity autologout timer, the duration
     of that timer MUST be at least 30 minutes.  The receipt of
     ANY command from the client during that interval SHOULD
     suffice to reset the autologout timer.

The long hang you had seen happened because on blocking socket I/O
which has a SO_{RCV,SND}TIMEO set via setsockopt(2) the SSL
library fails with "Resource temporarily unavailable", and we did
not treat that as a permanent error, effectively causing an
endless loop.  But not endless, it seems the Linux network stack
has some limit, effectively causing a break after ~15 minutes.
I have no idea what limit _that_ is, though, i have
net.ipv4.tcp_keepalive_time=300 and this makes 5 minutes for me.
In fact there is no 900 anywhere here.  But i do not know,
obviously.
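
For anyone who wants to hunt for that limit, the relevant kernel
settings can be listed with a few lines of shell.  This is a
minimal sketch that assumes a Linux /proc; show_tcp_knob is a
made-up helper name.  One candidate, and this is only a guess that
nothing in this thread confirms, is net.ipv4.tcp_retries2: the
kernel documentation says its default of 15 yields a hypothetical
timeout of 924.6 seconds, which is roughly 15 minutes.

```shell
#!/bin/sh -
# Print one net.ipv4 setting from /proc, or "unavailable" if it cannot be
# read (for example on a non-Linux system).
show_tcp_knob() {
  path=/proc/sys/net/ipv4/$1
  if [ -r "$path" ]; then
    printf '%s = %s\n' "$1" "$(cat "$path")"
  else
    printf '%s = unavailable\n' "$1"
  fi
}

# The keepalive settings plus the retransmission limit guessed at above.
for knob in tcp_keepalive_time tcp_keepalive_intvl tcp_keepalive_probes \
    tcp_retries2; do
  show_tcp_knob "$knob"
done
```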

The problem that lasted after my first (second with that
SO_RCVTIMEO) patch was that now SIGALRM was allowed again while
editing, but that an installed alarm(2) timer is inherited by
child processes, which is why your version of ed(1) seems to have
exited with a non-0 exit status after having seen a SIGALRM,
though the default action of SIGALRM is to terminate the process,
as signal(7) says:

     SIGALRM      P1990      Term    Timer signal from alarm(2)

So the next patch on top of the first patch drops such installed
alarm(2) actions in child processes.
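
What a SIGALRM does to a process that has not prepared for it, and
what the "trap :" line in a wrapper script changes, can be seen
with a small shell experiment.  This is only a sketch: run_default
and run_trapped are made-up names, and the inner sh simply sends
the signal to itself instead of waiting for a real alarm(2) timer.

```shell
#!/bin/sh -
# Default disposition: SIGALRM terminates the inner shell, so its "exit 0"
# is never reached and the caller sees a status above 128.
run_default() {
  status=0
  sh -c 'kill -ALRM $$; exit 0' || status=$?
  echo "$status"
}

# With a trap installed the inner shell survives the signal and really
# does reach its "exit 0".
run_trapped() {
  status=0
  sh -c 'trap : ALRM; kill -ALRM $$; exit 0' || status=$?
  echo "$status"
}

echo "default: $(run_default)"   # typically 142, that is 128 + 14
echo "trapped: $(run_trapped)"
```

Any non-0 status like the first one is what makes the MUA skip
reading back the temporary file.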

With these three patches which are on [master, stable/stable,
stable/latest, stable/v14.9] (uff) i can wholeheartedly put the
blame on other people.

Like i said, personally i am super happy (as far as computing is
concerned) ever since i switched to my WireGuard VPN.  That is
datagram based, and can thus easily survive connection breaks etc.
I even have had occasions where i actively disconnected the Wifi,
recognized i had forgotten something, went online again, and the
ssh connection was still alive and well :-)  And the server VM has
a really really good network connection with only one failure in
the almost six years i have it, so over there it is no problem.
I now run a (TCP) IRC client proxy permanently over there, and it
is an unbelievably smooth experience.  TCP without breakups!!!
Maybe your friend should spend the five dollars or whatever the
most minimal thing costs, and use some minimal thing over there,
like AlpineLinux/virtual, with dovecot or something.  Then use
a datagram based VPN like WireGuard or OpenVPN or tinc.

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)
