Re: cannot start detached sessions (with -m -d) back to back
On Fri, 31 Dec 2021, RVP wrote:

OP: Try this patch:

---START---
diff -u screen-4.8.0{.orig,}/screen.c
--- screen-4.8.0.orig/screen.c 2020-02-05 20:09:38.0 +
+++ screen-4.8.0/screen.c 2021-12-31 01:56:30.470670972 +
@@ -1801,6 +1801,10 @@
     struct win *p = windows;
     windows = windows->w_next;
     FreeWindow(p);
+    if (p->w_pid > 0) {
+      debug1("Hangup(%d);\n", p->w_pid);
+      killpg(p->w_pid, SIGHUP);
+    }
   }
   if (ServerSocket != -1) {
---END---

I goofed that (use-after-free). Try this one please:

---START---
diff -urN screen-4.8.0.orig/screen.c screen-4.8.0/screen.c
--- screen-4.8.0.orig/screen.c 2021-12-31 03:27:41.988827000 +
+++ screen-4.8.0/screen.c 2021-12-31 03:26:47.995416000 +
@@ -1801,8 +1801,13 @@
   debug1("Finit(%d);\n", i);
   while (windows) {
     struct win *p = windows;
+    pid_t pid = p->w_pid;
     windows = windows->w_next;
     FreeWindow(p);
+    if (pid > 0) {
+      debug1("Hangup(%d);\n", pid);
+      killpg(pid, SIGHUP);
+    }
   }
   if (ServerSocket != -1) {
---END---

-RVP
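The difference between the two patches is that the first reads p->w_pid after FreeWindow(p) has released p, while the second copies the pid out first. A minimal standalone sketch of that loop shape follows. The names struct node, push and free_all are hypothetical stand-ins rather than screen's own types, and the pid is recorded in an array at the point where the patch calls killpg.

```c
#define _XOPEN_SOURCE 700
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for screen's struct win; only what the sketch needs. */
struct node {
    struct node *next;
    pid_t pid;
};

/* Push a node on the front of the list.
 * Returns the new head, or NULL if allocation fails. */
struct node *push(struct node *head, pid_t pid)
{
    struct node *n = malloc(sizeof(*n));
    if (n == NULL)
        return NULL;
    n->next = head;
    n->pid = pid;
    return n;
}

/* Free every node. Anything needed after free() is copied out first.
 * Pids greater than 0 are stored in out[] (at most max of them);
 * the return value is how many were stored. */
size_t free_all(struct node **headp, pid_t *out, size_t max)
{
    size_t count = 0;

    while (*headp != NULL) {
        struct node *p = *headp;
        pid_t pid = p->pid;          /* save before p is freed */

        *headp = p->next;
        free(p);                     /* p must not be dereferenced after this */

        if (pid > 0 && count < max)
            out[count++] = pid;      /* the patch calls killpg(pid, SIGHUP) here */
    }
    return count;
}
```

Reading p->pid after free(p) may appear to work in testing, which is probably why the first version looked fine at a glance.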
Re: cannot start detached sessions (with -m -d) back to back
On Fri, Dec 31, 2021 at 02:08:02 +, RVP wrote:

> On Fri, 31 Dec 2021, Valery Ushakov wrote:
>
> > I think screen is racing against the child process in MakeWindow.
>
> I think something else might also be going on:

Ouch. That probably also explains why I was getting corrupt
ktrace.out files sometimes, there was still that ghost shell that was
being traced.

That most likely is a contributing factor. If the stuck program is
still holding onto the slave side, the parent screen process gets
select return earlier, I guess.

When the pty that still has the ghost shell attached to the slave side
is reallocated, the ghost shell returns from read with

 16982  16982 sh       RET   read RESTART

reissues read and gets EOF. So if the new child wins the race, the
new shell gets the pty and the new screen session is started.

> 2. For some reason, on NetBSD, closing the master PTY-fd in screen does not
>    cause programs on the slave side to exit (on EOF).

This seems like a bug to me, but I don't know much about tty driver.

> Hope this helps...

Thanks!

-uwe
Re: cannot start detached sessions (with -m -d) back to back
On Fri, 31 Dec 2021, Valery Ushakov wrote:

> I think screen is racing against the child process in MakeWindow.

I think something else might also be going on:

On NetBSD 9.2_STABLE:

$ screen -S ses -d -m -p 0 sh   # create
$ ps -Au                        # check for screen & sh
...
root 1802 0.0 0.0 27148 1992 ?     Is  1:34am 0:00.00 SCREEN -S ses -d -m -p 0 sh (screen)
...
rvp  1085 0.0 0.1 23488 2252 pts/4 Ss+ 1:34am 0:00.00 sh
...
$ screen -S ses -X quit         # quit
$ screen -S ses -X quit         # confirm
No screen session found.
$ ls -l /dev/pts/               # check for pts/4 used by sh
total 0
crw--w 1 rvp tty 5, 0 Dec 31 01:38 0
crw--w 1 rvp tty 5, 1 Dec 31 01:38 1
crw--w 1 rvp tty 5, 2 Dec 31 01:33 2
crw--w 1 rvp tty 5, 3 Dec 31 01:38 3
$ ps -Au                        # but sh still exists!
...
rvp  1085 0.0 0.1 23488 2252 pts/4 Is+ 1:34am 0:00.00 sh
...
$ fstat -p 1085                 # and still holding on to pts/4
USER CMD  PID FD MOUNT    INUM              MODE       SZ|DV R/W
rvp  sh  1085 wd /tmp     85400958320606835 drwxrwxrwt 192   r
rvp  sh  1085  0 /dev/pts 11                crw-rw-rw- pts/4 rw
rvp  sh  1085  1 /dev/pts 11                crw-rw-rw- pts/4 rw
rvp  sh  1085  2 /dev/pts 11                crw-rw-rw- pts/4 rw
rvp  sh  1085 12 /        1405459           crw-rw-rw- tty   rw
$ ls -l /dev/pts/               # no pts/4
total 0
crw--w 1 rvp tty 5, 0 Dec 31 01:41 0
crw--w 1 rvp tty 5, 1 Dec 31 01:40 1
crw--w 1 rvp tty 5, 2 Dec 31 01:33 2
crw--w 1 rvp tty 5, 3 Dec 31 01:41 3
$ screen -S ses -d -m -p 0 sh   # other sh instances won't start
                                # running bash(1), however, closes
                                # the older instance on the same PTY
$ kill -HUP 1085                # kill still running sh
$ screen -S ses -d -m -p 0 sh   # now screen will run

1. I don't use screen, but, it seems to me that `-X quit' should kill off
   all programs upon termination.

2. For some reason, on NetBSD, closing the master PTY-fd in screen does not
   cause programs on the slave side to exit (on EOF).
OP: Try this patch:

---START---
diff -u screen-4.8.0{.orig,}/screen.c
--- screen-4.8.0.orig/screen.c 2020-02-05 20:09:38.0 +
+++ screen-4.8.0/screen.c 2021-12-31 01:56:30.470670972 +
@@ -1801,6 +1801,10 @@
     struct win *p = windows;
     windows = windows->w_next;
     FreeWindow(p);
+    if (p->w_pid > 0) {
+      debug1("Hangup(%d);\n", p->w_pid);
+      killpg(p->w_pid, SIGHUP);
+    }
   }
   if (ServerSocket != -1) {
---END---

Hope this helps...

-RVP
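For readers unfamiliar with killpg(3), here is a small standalone sketch of what the patch relies on: a child that leads its own process group is terminated when that group is sent SIGHUP. The function name hangup_child_group and the pipe used for synchronisation are assumptions made for this illustration; none of this is code from screen.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child into its own process group, send that group SIGHUP and
 * reap the child. Returns the signal that terminated the child, or -1
 * on any error. */
int hangup_child_group(void)
{
    int ready[2];
    if (pipe(ready) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(ready[0]);
        close(ready[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: restore the default SIGHUP action in case it was
         * inherited as ignored, then become a process group leader. */
        close(ready[0]);
        signal(SIGHUP, SIG_DFL);
        if (setpgid(0, 0) == -1 || write(ready[1], "x", 1) != 1)
            _exit(1);
        close(ready[1]);
        for (;;)
            pause();                 /* sleep until a signal ends the process */
    }

    /* Parent: wait until the child reports that its group exists. */
    close(ready[1]);
    char c;
    ssize_t got = read(ready[0], &c, 1);
    close(ready[0]);

    if (got != 1 || killpg(pid, SIGHUP) == -1) {
        kill(pid, SIGKILL);          /* do not leave a stray child behind */
        waitpid(pid, NULL, 0);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : -1;
}
```

In this sketch the child's pid doubles as the process group id because the child calls setpgid(0, 0). The patch makes the same kind of call with w_pid, so it presumably depends on the window's process leading its own group.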
Re: cannot start detached sessions (with -m -d) back to back
On Fri, Dec 31, 2021 at 03:18:08 +0300, Valery Ushakov wrote:

> I think screen is racing against the child process in MakeWindow.

I filed https://savannah.gnu.org/bugs/index.php?61749 for this.

-uwe
Re: cannot start detached sessions (with -m -d) back to back
On Thu, Dec 30, 2021 at 06:55:24 +0300, Valery Ushakov wrote:

> Building screen with debugging shows that successful session start has
> for the first read from the window:
>
>   + hit ev fd 5 type 1!
>   going to read from window fd 5
>    -> 5 bytes
>
> but failed attempt has
>
>   + hit ev fd 5 type 1!
>   going to read from window fd 5
>   Window 0: EOF - killing window
>
> where fd 5 is obtained from the cloning pty device /dev/ptmx (ptm(4))
>
> The comment in ptcread says:
>
>         /*
>          * We want to block until the slave
>          * is open, and there's something to read;
>          * but if we lost the slave or we're NBIO,
>          * then return the appropriate error instead.
>          */
>         ...
>                 if (!ISSET(tp->t_state, TS_CARR_ON)) {
>                         error = 0;   /* EOF */
>                         goto out;
>                 }

I think screen is racing against the child process in MakeWindow.
From a quick look it seems that the parent opens the master side and
saves the slave name. Then it calls ForkWindow and adds the master fd
to the list of descriptors to poll. Now the race is on, b/c it takes
some time for the child to open the slave side, and if the parent wins
the race, it will get EOF from the master, as the slave is not open
yet.

E.g. a failed attempt:

$ kdump | sed -n -e '/select/p' -e '/EOF/p' \
    -e '/\/dev\/pts/{' -e 'N' -e '/open/p' -e '}'
  4831   4831 screen   CALL  __select50(0x100,0x7f7fff3739f0,0x7f7fff373a10,0,0)
  4831   4831 screen   RET   __select50 1
       "Window 0: EOF (errno 0) - killing window\n"
  5218   5218 screen   NAMI  "/dev/pts/7"
  5218   5218 screen   RET   open 0

vs. a successful one:

 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 26839  26839 screen   NAMI  "/dev/pts/7"
 26839  26839 screen   RET   open 0
 19165  19165 screen   RET   __select50 1
       "serv_select_fn called\n"
 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 19165  19165 screen   RET   __select50 1

Using brute force to make screen pre-open the slave in the parent
process should take care of the race

--- pty.c~ 2021-12-29 23:50:37.231129335 +0300
+++ pty.c 2021-12-31 03:11:48.652558852 +0300
@@ -288,6 +288,7 @@ char **ttyn;
     }
   initmaster(f);
   *ttyn = TtyName;
+  pty_preopen = 1; /* XXX: uwe */
   return f;
 }
 #endif

This is the HAVE_SVR4_PTYS version of OpenPTY (yes, screen still uses
K&R function definitions :)

-uwe
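To illustrate the idea of pre-opening the slave outside of screen, here is a minimal sketch using the POSIX pty calls. It opens the master and then opens the slave in the same process, before any fork, so the slave side is already open by the time the master is polled. The function name open_pty_preopened is hypothetical, and whether this matches what pty_preopen does inside screen is an assumption.

```c
#define _XOPEN_SOURCE 700
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open a pty master and pre-open its slave side in the calling process.
 * On success stores both descriptors and returns 0. On failure returns
 * -1 and leaves both outputs set to -1. */
int open_pty_preopened(int *masterp, int *slavep)
{
    *masterp = -1;
    *slavep = -1;

    int m = posix_openpt(O_RDWR | O_NOCTTY);
    if (m == -1)
        return -1;

    /* Make the slave usable, then look up its path name. */
    if (grantpt(m) == -1 || unlockpt(m) == -1) {
        close(m);
        return -1;
    }
    const char *name = ptsname(m);
    if (name == NULL) {
        close(m);
        return -1;
    }

    /* O_NOCTTY: the parent must not pick up the slave as its own
     * controlling terminal. */
    int s = open(name, O_RDWR | O_NOCTTY);
    if (s == -1) {
        close(m);
        return -1;
    }

    *masterp = m;
    *slavep = s;
    return 0;
}
```

A caller would fork after this returns and keep its slave descriptor until the child has opened the slave itself. Closing it at some point still matters, since a parent that holds the slave open indefinitely would presumably never see EOF on the master when the child exits.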
Re: UEFI dual-boot with Windows
Not adding anything here, but this is something users ask every now and
then on different forums. It would be awesome if the wiki contained a
guide on how to set up a dual boot, Windows/NetBSD and Linux/NetBSD.
Including setting up rEFInd would be the icing on the cake.

Happy new year.

On Wed, 29 Dec 2021 at 21:28, Todd Gruhn wrote:

> My system was built a year ago. I boot NetBSD from one HD.
> I boot Windoze from another HD.
> It also allows me to boot either of 2 CD/DVDs.
>
> The 2 CDs come in handy when upgrading NetBSD.
>
> I can choose which device to boot when UEFI comes up.
>
> On Wed, Dec 29, 2021 at 6:19 PM Chavdar Ivanov wrote:
> >
> > I boot my netbsd-current system in uefi mode from the second disk by
> > selecting its .efi file; I lost my default rEFInd setup when I downgraded
> > the first disk from W11 to W10 and haven’t tried to recover it yet, it also
> > can be started by selecting its .efi file. I have never copied the system
> > kernel on the efi partition; there are three systems on the second disk
> > with their own efi partitions. This is on an HP envy 17 laptop, 5 years old.
> >
> > On Wed, 29 Dec 2021 at 17:19, Tobias Nygren wrote:
> >>
> >> On Wed, 29 Dec 2021 17:05:08 + (UTC)
> >> Benny Siegert wrote:
> >>
> >> > Hi!
> >> >
> >> > I re-installed Windows 10 on my machine, and it insisted on UEFI boot,
> >> > which killed my previous dual-booting setup with GRUB and legacy boot.
> >> >
> >> > NetBSD is on the second NVMe drive, while the first one is all Windows.
> >> >
> >> > After installing Windows, I manually installed rEFInd into the EFI
> >> > partition. For NetBSD, I copied bootx64.efi to /EFI/NetBSD (so as not to
> >> > overwrite the existing /EFI/Boot/bootx64.efi, which I assume is from
> >> > Windows). I also copied a GENERIC NetBSD-9.2 kernel to /netbsd.gz on the
> >> > EFI partition.
> >> >
> >> > After selecting NetBSD in rEFInd (which it auto-detects), I see the
> >> > NetBSD/x86 EFI boot (x64) banner. It proceeds to load a kernel from
> >> > "NAME=EFI system partition:netbsd.gz (howto 0x2)".
> >> >
> >> > Unfortunately, after the initial loader line with the sizes, the boot
> >> > seems to hang with no further output.
> >> >
> >> > Any ideas, hints or tips?
> >>
> >> I have a similar problem when I have a 4k sector NVMe drive installed.
> >> I suspect in my case it is a Dell firmware bug but not sure.
> >> It hangs for me when tearing down UEFI stuff before jumping to kernel.
> >>
> >> To rule out issues with the EFI system partition itself you could
> >> install a /EFI/NetBSD/boot.cfg to instruct bootx64.efi to load the
> >> kernel from hd1a:netbsd or whatever your FFS partition is named.
> >>
> >> -Tobias
> >
> > --
> >
>