[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Martin Sandsmark changed:

    What       | Removed   | Added
    Resolution | ---       | UPSTREAM
    Status     | CONFIRMED | RESOLVED

--- Comment #20 from Martin Sandsmark ---
That explains why I haven't seen that in a while. Feel free to re-open if you manage to reproduce it with a kernel version that has that patch in; I can't do that here.

-- 
You are receiving this mail because:
You are watching all bug changes.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Peter Wu changed:

    What     | Removed | Added
    See Also |         | https://bugs.kde.org/show_bug.cgi?id=175283
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #19 from Peter Wu ---
This bug is possibly resolved in the Linux kernel with this commit:
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/?id=77dae6134440420bac334581a3ccee94cee1c054

This change (v4.11-rc2-16-g77dae6134440) has been backported as:

    v4.11.1-99-g34e01f920739
    v4.10.16-82-g58d479441029
    v4.9.28-70-g89c91ea37581
    v4.4.68-46-g9bd2cc56a089
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Simon Andric changed:

    What | Removed | Added
    CC   |         | simonandr...@gmail.com
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #18 from Martin Sandsmark ---
> What do you use for "quickly" triggering the issue? Something automated would
> be awesome :)

I started by trying to automate it: launching xxd in the background, sleeping, and then running killall xxd. That didn't seem to work. But since the script did this in a loop, pressing Ctrl+C just moved it on to the next iteration, which meant I could quickly press Ctrl+C a bunch of times to trigger the hang.

> It looks unrelated, that seems to be an issue with firing the timer, the
> issue in this bug is caused by a watcher that is removed.

The cause is probably different, but the result is more or less the same (spamming a ton of output -> konsole seemingly hangs).
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #17 from Peter Wu ---
Created attachment 104751
  --> https://bugs.kde.org/attachment.cgi?id=104751&action=edit
smaller kernel config that is able to boot archiso

(In reply to Martin Sandsmark from comment #15)
> Well, it wasn't that commit, managed to reproduce it with that reverted.

What about testing the previous commit? Simply reverting might not work, as there can be other patches that cause this issue.

> Running an arch install pretty similar to your setup but with virtualbox.
> Also made an unholy mess of shell scripts that seems to trigger it more
> consistently, only problem is that it takes forever to build the kernel (I'm
> too lazy to modify the config to trim it down).

I am attaching a kernel config on which I based my test kernels (the kernel config I used for bisection got lost). It also has ACPI debugging features enabled (a remnant of some other bug hunt); you could disable that if you want.

What do you use for "quickly" triggering the issue? Something automated would be awesome :)

(In reply to Martin Sandsmark from comment #16)
> also, I'm not sure if I'm somehow triggering this bug instead sometimes (or
> if these are related): https://bugs.kde.org/show_bug.cgi?id=230184

It looks unrelated. That seems to be an issue with firing the timer; the issue in this bug is caused by a watcher that is removed.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #16 from Martin Sandsmark ---
Also, I'm not sure if I'm sometimes triggering this bug instead (or if these are related):
https://bugs.kde.org/show_bug.cgi?id=230184
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #15 from Martin Sandsmark ---
Well, it wasn't that commit; I managed to reproduce it with that reverted.

I'm running an Arch install pretty similar to your setup, but with VirtualBox. I also made an unholy mess of shell scripts that seems to trigger it more consistently. The only problem is that it takes forever to build the kernel (I'm too lazy to modify the config to trim it down).
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #14 from Peter Wu ---
(In reply to Martin Sandsmark from comment #13)
> Are you sure that 4.1 was okay?

Not sure if it was okay, but I wasn't able to reproduce the issue. The difficult/annoying thing is that some versions trigger the problem easily, while on others it only shows up after trying for some time.

> There is only a single commit to any relevant tty code that I can spot
> between v4.1.1 and v4.1.10-89-g5eb491ba5d06:
> https://www.spinics.net/lists/stable-commits/msg47515.html
> (3b19e032295647b7be2aa3be62510db4aaeda759).

That commit first ended up in v4.1.5 as ba3961ad681981dc74fcd519b8f98be8bc3ac381.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Martin Sandsmark changed:

    What | Removed | Added
    CC   |         | martin.sandsm...@kde.org

--- Comment #13 from Martin Sandsmark ---
Are you sure that 4.1 was okay?

There is only a single commit to any relevant tty code that I can spot between v4.1.1 and v4.1.10-89-g5eb491ba5d06:
https://www.spinics.net/lists/stable-commits/msg47515.html
(3b19e032295647b7be2aa3be62510db4aaeda759).

There are of course a bunch of commits that might be the source in e.g. kernel/ as well, though. I'll see if I can narrow it down further.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #12 from Peter Wu ---
Proposed patch: https://git.reviewboard.kde.org/r/129984/
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #11 from Peter Wu ---
Looks like a race condition. Reproduced in QEMU 2.8.0 with Arch Linux and Konsole 16.12.2 with plasma-desktop 5.9.3 (and also openbox). Cannot reproduce with just one CPU (need -smp 2 or more).

Command to start QEMU:

    qemu-system-x86_64 -M pc,accel=kvm -m 2G -vga qxl -net user -net nic,model=virtio -drive if=virtio,file=plasma.qcow2 -initrd initrd.img -kernel arch/x86/boot/bzImage -append "rw root=/dev/vda1" -smp 4

A normal user was created with autologin (agetty --autologin). In openbox (where no background activity exists), it was necessary to run "while ls; do :; done" in a separate xterm process (which would display just the directory "x" in the home directory on an 8GB ext4 filesystem).

Tested kernels (BAD = got hang, OK = could not reproduce hang within a few minutes):

    BAD v4.9.11
    BAD v4.10.1
    BAD v4.1.38
    OK  v3.12.70 (Note: to update display, had to switch between QEMU monitor and back)
    OK  v3.18.48
    OK  v4.0.9 (needed "ln -s compiler-gcc5.h ../include/linux/compiler-gcc6.h" to fix GCC6 build)
    BAD v4.1.38 (yes, still bad, just to be sure)
    BAD v4.1.15
    OK  v4.1.1 (could not reproduce)
    OK  v4.1.7-44-g49b85054a83d (git bisect starts)
    BAD v4.1.13-91-g8d0fe5721d27
    [OK v4.1.10-89-g5eb491ba5d06 (actually, this is also BAD! see below)]
    BAD v4.1.12-7-g4508582e6a83
    [OK v4.1.10-174-gc1d40e01ad8c (actually, this is also BAD! see below)]
    BAD v4.1.11-13-g7b61554c25cb
    BAD v4.1.10-195-g614ea4ea2c3f (hmm, I suspect that this commit is bad)
    BAD v4.1.10-194-ga0533fb8cf60 (wait, I can still reproduce? did not expect that, bisect probably wrong)
    BAD v4.1.10-184-g0cf68c236f11
    BAD v4.1.10-179-g249af812dcf3
    BAD v4.1.10-174-gc1d40e01ad8c (not good, bad! Can also reproduce issue)
    BAD v4.1.10-89-g5eb491ba5d06 (finally got hang after trying for 5 minutes)
    OK  v4.1.1 (can still not reproduce after 8 minutes)
    [v4.1.10-89-g5eb491ba5d06 (can still not reproduce after 6 minutes)]
    [BAD v4.1.10-194-ga0533fb8cf60 (can still reproduce after 2 minutes)]

Gave up after 2-3 hours of trying...

The first known bad version is v4.1.10-89-g5eb491ba5d06. The first *possibly* good version is v4.1.7-44-g49b85054a83d, but a quick scan through the commits (keywords "tty", "console", "lock") does not yield suspicious commits.

I think I'll just try to patch this without knowing the root cause.
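The kernel versions in the list above look like `git describe` output, which has the form tag, number of commits since that tag, and an abbreviated commit hash prefixed with "g". The following is a minimal sketch, assuming that format, of how such a string can be split apart when sorting or comparing bisect results. The name `parse_describe` is hypothetical and not part of any tool mentioned in this thread.

```python
import re

# `git describe` prints <tag>-<commits since tag>-g<abbreviated hash>.
_DESCRIBE = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<sha>[0-9a-f]+)$")


def parse_describe(version):
    """Split a git-describe style string into (tag, commits_after_tag, short_hash).

    A bare tag such as "v4.1.1" is returned as (tag, 0, None).
    """
    match = _DESCRIBE.match(version)
    if match is None:
        return version, 0, None
    return match.group("tag"), int(match.group("count")), match.group("sha")


if __name__ == "__main__":
    for v in ("v4.1.1", "v4.1.7-44-g49b85054a83d", "v4.1.10-89-g5eb491ba5d06"):
        print(parse_describe(v))
```

Note that the tag is only the nearest earlier tag reachable from the commit, so the count says how far the tested tree is from that tag and nothing more.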
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #10 from Peter Wu ---
(In reply to Hello71 from comment #5)
> this is clearly some type of race condition, since it only breaks sometimes.
>
> you can try "taskset -c 0 konsole", which for me fixes the problem.

Even with "taskset -c 0 konsole" I can reproduce the problem if I repeat the experiment often enough.

(In reply to Egmont Koblinger from comment #9)
> Without having looked at the code, I'm wondering how FIONREAD can return
> 0... I mean I guess it's preceded by a select() or similar that waits for
> data to appear, and the reason for reaching FIONREAD is that it signaled
> some data available. But maybe this is where somehow (how?) multithreaded
> race condition-y stuff kicks in.

Comparing the "perf trace" output, you can basically observe the following pattern:

GOOD:
    xxd:     write(, ..., count: 68) ...
    konsole: write(, ..., count: 1) = 1 (is this "^C"?)
    konsole: poll() = 1
    xxd:     ... [continued] write = 67
    konsole: ioctl(, FIONREAD) = 0
    konsole: read(, ..., count: 2) = 2
    bash:    ... [continued] wait4() = (xxd)

BAD:
    xxd:     write(, ..., count: 68) ...
    konsole: write(, ..., count: 1) = 1
    konsole: poll() = 1
    konsole: ioctl(, FIONREAD) = 0
    konsole: read(, ..., count: 0) = 0
    xxd:     ... [continued] write = 67
    bash:    ... [continued] wait4() = (xxd)

In another (newer) trace I found this:

GOOD:
    xxd:     write(, ..., count: 68) ...
    konsole: write(, ..., count: 1) = 1
    konsole: poll() = 1
    konsole: ioctl(, FIONREAD) = 0
    konsole: read(, ..., count: 2) = 2
    xxd:     ... [continued] write = 67
    bash:    ... [continued] wait4() = (xxd)

BAD:
    xxd:     write(, ..., count: 68) = 67
    konsole: write(, ..., count: 1) = 1
    konsole: poll() = 1
    konsole: ioctl(, FIONREAD) = 0
    konsole: read(, ..., count: 0) = 0
    bash:    ... [continued] wait4() = (xxd)

So apparently there is a pending write from the slave device which seems to trigger poll (I guess?). But when the input buffer of the master device is checked, the data is not present (FIONREAD.available=0, hence read count 0).

Maybe it is a Linux kernel issue (currently 4.9.5-1-ARCH) or reliance on invalid assumptions? It shouldn't hurt to not disable the file watcher? If the class is destroyed, then the watcher is killed anyway?
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #9 from Egmont Koblinger ---
Without having looked at the code, I'm wondering how FIONREAD can return 0... I mean I guess it's preceded by a select() or similar that waits for data to appear, and the reason for reaching FIONREAD is that it signaled some data available. But maybe this is where somehow (how?) multithreaded race condition-y stuff kicks in.

As far as I know, terminal devices don't have the concept of EOF; 0 should mean "oops someone just grabbed that data before us". (There's a concept of EOF, as in "stty eof", in the other direction, from the keyboard to the kernel driver.)
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #8 from Peter Wu ---
Created attachment 104086
  --> https://bugs.kde.org/attachment.cgi?id=104086&action=edit
debug patch

I have traced the issue into kpty, KPtyDevicePrivate::_k_canRead. The ioctl(q->masterFd, FIONREAD, ) returns 0 available data. Subsequently reading 0 bytes also returns 0 (with errno=0). This is considered EOF, which then prevents further read events from being reported.

tty_ioctl(4) documents FIONREAD as: "Get the number of bytes in the input buffer." Not sure if "0 available bytes" should be interpreted as EOF though for ptmx devices.

Attached is a debug patch. I did not observe ioctl failing in the output. Put a gdb breakpoint on the "!readBytes" block (line with "readNotifier->setEnabled(false);") and invoke "call fflush(0)". If you "return", the output is processed normally.

    available 4095 fd=12 readBytes=4095 errno=0
    available 4095 fd=12 readBytes=4095 errno=0
    available 0 fd=12 readBytes=0 errno=0

Would you see any issues with returning false (without logging other warnings nor emitting signals) when available==0? Surely if there is really an EOF condition, read would return 0 which actually detects the EOF?
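The idea raised in the comment above, returning early when FIONREAD reports 0 instead of treating it as EOF, can be sketched roughly as follows. This is not the kpty code: it is a small Python illustration under stated assumptions, and the names `bytes_available` and `handle_readable` are hypothetical. The demo uses a pipe as a stand-in for the pty master so that it behaves deterministically. Handling of a genuine end of file is deliberately left out.

```python
import fcntl
import os
import struct
import termios


def bytes_available(fd):
    """Ask the kernel how many bytes are queued for reading on fd (FIONREAD)."""
    raw = fcntl.ioctl(fd, termios.FIONREAD, struct.pack("i", 0))
    return struct.unpack("i", raw)[0]


def handle_readable(fd):
    """Hypothetical handler, meant to run after poll()/select() reports fd readable.

    Returns the bytes read, or None when FIONREAD reports nothing queued.
    None means "nothing to do right now": the caller keeps watching the
    descriptor rather than concluding that the stream has ended.
    """
    available = bytes_available(fd)
    if available == 0:
        return None  # no data yet; do not issue a zero-length read
    return os.read(fd, available)


if __name__ == "__main__":
    # Assumption: a pipe stands in for the pty master in this demo.
    r, w = os.pipe()
    print(handle_readable(r))  # nothing has been written yet
    os.write(w, b"abc")
    print(handle_readable(r))
```

The design point is that a zero count only says the buffer is empty at that instant, so the handler reports "no data" and leaves the watcher enabled.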
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Egmont Koblinger changed:

    What | Removed | Added
    CC   |         | egm...@gmail.com

--- Comment #7 from Egmont Koblinger ---
I can second Hello71's observation: "taskset -c 0 konsole" works perfectly for me. Looks like a race condition around multiple threads, or something like this.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #6 from Kurt Hindenburg ---
From what I can tell, KDE4 Konsole doesn't have this issue. It appears the conversion to KF5 introduced this.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #5 from Hello71 ---
This is clearly some type of race condition, since it only breaks sometimes.

You can try "taskset -c 0 konsole", which for me fixes the problem.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Hello71 changed:

    What | Removed | Added
    CC   |         | alex_y...@yahoo.ca

--- Comment #4 from Hello71 ---
*** Bug 376462 has been marked as a duplicate of this bug. ***
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Kurt Hindenburg changed:

    What           | Removed     | Added
    Status         | UNCONFIRMED | CONFIRMED
    Ever confirmed | 0           | 1

--- Comment #3 from Kurt Hindenburg ---
Well, that's interesting... I'll try to research.
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Patrick Rudolph changed:

    What | Removed | Added
    CC   |         | patrick.rudolph@br-automation.com

--- Comment #2 from Patrick Rudolph ---
I can easily reproduce this issue as described by Peter. It got stuck 3 times out of 10 test runs. The problem only affects a single tab. All other tabs in the same konsole remain working.

* konsole-16.11.90-2.1 (SUSE backport)
* qtbase-5.6.1-8.1 (SLES12)
* plasma5-desktop-5.8.4-1.1 (SUSE backport)
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

--- Comment #1 from Peter Wu ---
Other reproducers:

    xxd /dev/urandom
    yes | nl
[konsole] [Bug 372991] Terminal gets stuck on interrupting a program that is outputting, preventing further output from being shown
https://bugs.kde.org/show_bug.cgi?id=372991

Peter Wu changed:

    What                         | Removed           | Added
    Attachment #102469 mime type | application/x-tar | application/x-gzip