D4975: Prevent misdetection of EOF on Linux
Lekensteyn added a comment. (FWIW, I ran a patched kpty 5.32.0-1 version for a while and didn't have the hang issue until I recently upgraded to an unpatched 5.33.0-1. Haven't had a moment so far to dig deeper into the issue though.) REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, ossi, hindenburg Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
ossi added a comment. yeah, but without knowing the root cause, your fix may be just wrong, or at least the comment/commit message may be misleading. also, https://community.kde.org/Policies/Commit_Policy#Don.27t_commit_code_you_don.27t_understand REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
Lekensteyn added a comment. In https://phabricator.kde.org/D4975#95754, @ossi wrote: > first try googling it; maybe there is even a response on stackoverflow (as so often happens). Was one of the first things to try, that only pointed to an issue in Emacs and another that clarified that FIONREAD does not work for things like UDP sockets. > then try posting it yourself to stackoverflow. > dunno whether such questions are welcome on the linux-kernel list, but if not, there might be other lists where it would fit. I doubt that SO would know the answer to this, *some* Linux kernel list would probably work, but then a stripped down testcase is likely needed and I don't really feel like wasting much more time on this. The problem is real. REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
ossi added a comment. first try googling it; maybe there is even a response on stackoverflow (as so often happens). then try posting it yourself to stackoverflow. dunno whether such questions are welcome on the linux-kernel list, but if not, there might be other lists where it would fit. INLINE COMMENTS > Lekensteyn wrote in kptydevice.cpp:307 > Constrained in what sense? Should the first known broken kernel version be > noted here? basically what you put in the commit message. REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
Lekensteyn added a comment. In https://phabricator.kde.org/D4975#95727, @ossi wrote: > please verify that this is in fact a kernel bug (include ml references), not something that should be expected for some weird reasons (compare the solaris path). I suspect it is a kernel bug, but have no ml references. It seems to be a race condition for which it is hard to prove that it does not exist. See https://bugs.kde.org/show_bug.cgi?id=372991#c10, poll returned but FIONREAD outputted 0 bytes. > if this is a bug, does it actually affect released versions which are not being "upgraded away" on short notice? i wouldn't want a workaround for a bug that had a very short life span. This issue has been there for several months (maybe even a year) and I regularly (multiple times a week) hit this condition when interrupting a program (today it even happened when interrupting a `git push` which was waiting for the SSH passphrase). I have tried to pinpoint the root cause (but failed even after spending hours), what else should I do or can I try? INLINE COMMENTS > ossi wrote in kptydevice.cpp:307 > this should be constrained more - affected versions, and the fact that it's a > bug. Constrained in what sense? Should the first known broken kernel version be noted here? REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
ossi added a comment. please verify that this is in fact a kernel bug (include ml references), not something that should be expected for some weird reasons (compare the solaris path). if this is a bug, does it actually affect released versions which are not being "upgraded away" on short notice? i wouldn't want a workaround for a bug that had a very short life span. INLINE COMMENTS > kptydevice.cpp:307 > +#else > +// On Linux, for some unknown reason 0 bytes can be returned after an > +// interrupt signal even if a slave is still attached. To avoid > marking this should be constrained more - affected versions, and the fact that it's a bug. > kptydevice.cpp:315 > +return true; > +} else if (available) { > +maybeEof = false; need no else after return. REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
cfeck added a reviewer: ossi. REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg, ossi Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
hindenburg added a comment. Thanks, this needs to get reviewed by the framework team that handles KPty. REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 To: Lekensteyn, #konsole, hindenburg Cc: #frameworks
D4975: Prevent misdetection of EOF on Linux
Lekensteyn created this revision. Restricted Application added a project: Frameworks. Restricted Application added a subscriber: Frameworks. REVISION SUMMARY On Linux, even if slave activity is detected on a master fd, FIONREAD can still return zero available bytes in the buffer. Previously this would prevent further signals from being emitted even if new IO arrived (since the readNotifier is disabled) which in its turn prevented the Konsole output from being updated. A race condition in the kernel is suspected, the first known broken kernel is v4.1.10-89-g5eb491ba5d06. BUG: 372991 TEST PLAN Tested in conjunction with konsole v16.12.1-60-g151215a9 with this debug patch applied: diff --git a/src/kptydevice.cpp b/src/kptydevice.cpp index 92b443b..4dfcd02 100644 --- a/src/kptydevice.cpp +++ b/src/kptydevice.cpp @@ -273,6 +273,7 @@ struct KPtyDevicePrivate : public KPtyPrivate { KRingBuffer writeBuffer; }; +#include bool KPtyDevicePrivate::_k_canRead() { Q_Q(KPtyDevice); @@ -310,6 +311,9 @@ bool KPtyDevicePrivate::_k_canRead() // select() will trigger again anyway if an EOF condition was found, and // only then we will accept it. if (!available && !maybeEof) { +FILE *fp = fopen("/tmp/EOF", "a"); +fprintf(fp, "Maybe EOF!?\n"); +fclose(fp); maybeEof = true; return true; } else if (available) { Indeed, the /tmp/EOF file is being written and the Konsole output no longer "hangs". Closing a tab (via EOF in bash or closing the tab) somehow does not trigger this debug print, perhaps the watcher is disabled elsewhere. (This is an observation and not a problem.) REPOSITORY R291 KPty REVISION DETAIL https://phabricator.kde.org/D4975 AFFECTED FILES src/kptydevice.cpp To: Lekensteyn, hindenburg, #konsole Cc: #frameworks