Re: cannot start detached sessions (with -m -d) back to back

2021-12-30 Thread RVP

On Fri, 31 Dec 2021, RVP wrote:


> OP: Try this patch:
>
> ---START---
> diff -u screen-4.8.0{.orig,}/screen.c
> --- screen-4.8.0.orig/screen.c  2020-02-05 20:09:38.0 +
> +++ screen-4.8.0/screen.c   2021-12-31 01:56:30.470670972 +
> @@ -1801,6 +1801,10 @@
>  struct win *p = windows;
>  windows = windows->w_next;
>  FreeWindow(p);
> +if (p->w_pid > 0) {
> +  debug1("Hangup(%d);\n", p->w_pid);
> +  killpg(p->w_pid, SIGHUP);
> +}
>    }
>
>    if (ServerSocket != -1) {
> ---END---



I goofed that (use-after-free). Try this one please:

---START---
diff -urN screen-4.8.0.orig/screen.c screen-4.8.0/screen.c
--- screen-4.8.0.orig/screen.c  2021-12-31 03:27:41.988827000 +
+++ screen-4.8.0/screen.c   2021-12-31 03:26:47.995416000 +
@@ -1801,8 +1801,13 @@
   debug1("Finit(%d);\n", i);
   while (windows) {
 struct win *p = windows;
+pid_t pid = p->w_pid;
 windows = windows->w_next;
 FreeWindow(p);
+if (pid > 0) {
+  debug1("Hangup(%d);\n", pid);
+  killpg(pid, SIGHUP);
+}
   }

   if (ServerSocket != -1) {
---END---
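
The ordering issue fixed above can be shown in isolation. The following is only a minimal sketch, not screen's code: `struct node`, `win_push` and `win_teardown` are hypothetical names, and where the patch calls killpg() the sketch merely records the pid so the result can be inspected.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/types.h>

/* Hypothetical stand-in for screen's struct win; only the two fields
 * the teardown loop touches are modelled. */
struct node {
    struct node *next;
    pid_t pid;
};

/* Push a new node carrying `pid` on the front of the list.
 * Returns 0 on success, -1 if allocation fails. */
int win_push(struct node **head, pid_t pid)
{
    struct node *p = malloc(sizeof *p);
    if (p == NULL)
        return -1;
    p->pid = pid;
    p->next = *head;
    *head = p;
    return 0;
}

/* Free every node in the list.  Each pid is copied out BEFORE its node
 * is freed, so nothing reads freed memory.  Positive pids are stored in
 * out[] (at most `cap` of them); the return value is the number stored. */
size_t win_teardown(struct node **head, pid_t *out, size_t cap)
{
    size_t n = 0;

    while (*head != NULL) {
        struct node *p = *head;
        pid_t pid = p->pid;      /* save first: p is invalid after free() */

        *head = p->next;
        free(p);
        if (pid > 0 && n < cap)
            out[n++] = pid;      /* the patch calls killpg(pid, SIGHUP) here */
    }
    return n;
}
```

Reading p->pid after free(p), as the first patch did with p->w_pid, is undefined behaviour even if it often appears to work.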

-RVP


Re: cannot start detached sessions (with -m -d) back to back

2021-12-30 Thread Valery Ushakov
On Fri, Dec 31, 2021 at 02:08:02 +, RVP wrote:

> On Fri, 31 Dec 2021, Valery Ushakov wrote:
> 
> > I think screen is racing against the child process in MakeWindow.
> 
> I think something else might also be going on:

Ouch.  That probably also explains why I was sometimes getting corrupt
ktrace.out files: that ghost shell was still being traced.

That is most likely a contributing factor.  If the stuck program is
still holding onto the slave side, select in the parent screen process
returns earlier, I guess.

When the pty that still has the ghost shell attached to the slave side
is reallocated, the ghost shell returns from read with

 16982  16982 sh   RET   read RESTART

reissues read and gets EOF.  So if the new child wins the race, the
new shell gets the pty and the new screen session is started.


> 2. For some reason, on NetBSD, closing the master PTY-fd in screen does not
>    cause programs on the slave side to exit (on EOF).

This seems like a bug to me, but I don't know much about the tty driver.


> Hope this helps...

Thanks!


-uwe


Re: cannot start detached sessions (with -m -d) back to back

2021-12-30 Thread RVP

On Fri, 31 Dec 2021, Valery Ushakov wrote:


> I think screen is racing against the child process in MakeWindow.



I think something else might also be going on:

On NetBSD 9.2_STABLE:

$ screen -S ses -d -m -p 0 sh   # create
$ ps -Au                        # check for screen & sh
...
root 1802  0.0  0.0  27148  1992 ?     Is    1:34am 0:00.00 SCREEN -S ses -d -m -p 0 sh (screen)
...
rvp  1085  0.0  0.1  23488  2252 pts/4 Ss+   1:34am 0:00.00 sh
...
$ screen -S ses -X quit         # quit
$ screen -S ses -X quit         # confirm
No screen session found.
$ ls -l /dev/pts/               # check for pts/4 used by sh
total 0
crw--w  1 rvp  tty  5, 0 Dec 31 01:38 0
crw--w  1 rvp  tty  5, 1 Dec 31 01:38 1
crw--w  1 rvp  tty  5, 2 Dec 31 01:33 2
crw--w  1 rvp  tty  5, 3 Dec 31 01:38 3
$ ps -Au                        # but sh still exists!
...
rvp  1085  0.0  0.1  23488  2252 pts/4 Is+   1:34am 0:00.00 sh
...
$ fstat -p 1085                 # and still holding on to pts/4
USER CMD  PID   FD MOUNT     INUM               MODE       SZ|DV R/W
rvp  sh   1085  wd /tmp      85400958320606835  drwxrwxrwt 192   r
rvp  sh   1085   0 /dev/pts  11                 crw-rw-rw- pts/4 rw
rvp  sh   1085   1 /dev/pts  11                 crw-rw-rw- pts/4 rw
rvp  sh   1085   2 /dev/pts  11                 crw-rw-rw- pts/4 rw
rvp  sh   1085  12 /         1405459            crw-rw-rw- tty   rw
$ ls -l /dev/pts/               # no pts/4
total 0
crw--w  1 rvp  tty  5, 0 Dec 31 01:41 0
crw--w  1 rvp  tty  5, 1 Dec 31 01:40 1
crw--w  1 rvp  tty  5, 2 Dec 31 01:33 2
crw--w  1 rvp  tty  5, 3 Dec 31 01:41 3
$ screen -S ses -d -m -p 0 sh   # other sh instances won't start
                                # running bash(1), however, closes
                                # the older instance on the same PTY
$ kill -HUP 1085                # kill still running sh
$ screen -S ses -d -m -p 0 sh   # now screen will run

1. I don't use screen, but, it seems to me that `-X quit' should kill off all
   programs upon termination.

2. For some reason, on NetBSD, closing the master PTY-fd in screen does not
   cause programs on the slave side to exit (on EOF).

OP: Try this patch:

---START---
diff -u screen-4.8.0{.orig,}/screen.c
--- screen-4.8.0.orig/screen.c  2020-02-05 20:09:38.0 +
+++ screen-4.8.0/screen.c   2021-12-31 01:56:30.470670972 +
@@ -1801,6 +1801,10 @@
 struct win *p = windows;
 windows = windows->w_next;
 FreeWindow(p);
+if (p->w_pid > 0) {
+  debug1("Hangup(%d);\n", p->w_pid);
+  killpg(p->w_pid, SIGHUP);
+}
   }

   if (ServerSocket != -1) {
---END---
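
For readers unfamiliar with killpg(), here is a small self-contained sketch of the mechanism the patch relies on: sending SIGHUP to a child's process group. It is an illustration under stated assumptions, not screen's code. The function name `hangup_child_group` is hypothetical, and the pipe handshake exists only so the demo does not signal the group before the child has created it.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that puts itself in its own process group and sleeps,
 * hang up that whole group with killpg(), then reap the child.
 * Returns the number of the signal that terminated the child, 0 if it
 * exited normally, or -1 on error. */
int hangup_child_group(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                     /* child */
        close(fds[0]);
        signal(SIGHUP, SIG_DFL);        /* do not inherit an ignored SIGHUP */
        setpgid(0, 0);                  /* new group, pgid == our pid */
        if (write(fds[1], "x", 1) != 1) /* tell the parent we are ready */
            _exit(1);
        close(fds[1]);
        for (;;)
            pause();                    /* wait to be hung up */
    }

    /* parent: wait until the child's process group exists */
    close(fds[1]);
    char c;
    ssize_t r = read(fds[0], &c, 1);
    close(fds[0]);

    if (r != 1 || killpg(pid, SIGHUP) < 0) {
        kill(pid, SIGKILL);             /* clean up on failure */
        waitpid(pid, NULL, 0);
        return -1;
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With the default SIGHUP disposition in the child, the function is expected to return SIGHUP. Like the patch, the sketch passes the child's pid to killpg(), which reaches the child only because the child is the leader of a group with that id.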

Hope this helps...

-RVP


Re: cannot start detached sessions (with -m -d) back to back

2021-12-30 Thread Valery Ushakov
On Fri, Dec 31, 2021 at 03:18:08 +0300, Valery Ushakov wrote:

> I think screen is racing against the child process in MakeWindow.

I filed https://savannah.gnu.org/bugs/index.php?61749 for this.

-uwe


Re: cannot start detached sessions (with -m -d) back to back

2021-12-30 Thread Valery Ushakov
On Thu, Dec 30, 2021 at 06:55:24 +0300, Valery Ushakov wrote:

> Building screen with debugging shows that a successful session start has
> for the first read from the window:
> 
>  + hit ev fd 5 type 1!
> going to read from window fd 5
>  -> 5 bytes
> 
> but failed attempt has
> 
>  + hit ev fd 5 type 1!
> going to read from window fd 5
> Window 0: EOF - killing window
> 
> where fd 5 is obtained from the cloning pty device /dev/ptmx (ptm(4))
> 
> The comment in ptcread says:
> 
>   /*
>    * We want to block until the slave
>    * is open, and there's something to read;
>    * but if we lost the slave or we're NBIO,
>    * then return the appropriate error instead.
>    */
> ...
>   if (!ISSET(tp->t_state, TS_CARR_ON)) {
>           error = 0;   /* EOF */
>           goto out;
>   }

I think screen is racing against the child process in MakeWindow.
From a quick look it seems that the parent opens the master side and
saves the slave name.  Then it calls ForkWindow and adds the master fd
to the list of descriptors to poll.  Now the race is on, b/c it takes
some time for the child to open the slave side, and if the parent wins
the race, it will get EOF from the master, as the slave is not open
yet.  E.g. a failed attempt:

$ kdump | sed -n -e '/select/p' -e '/EOF/p' \
 -e '/\/dev\/pts/{' -e 'N' -e '/open/p' -e '}'
  4831   4831 screen   CALL  __select50(0x100,0x7f7fff3739f0,0x7f7fff373a10,0,0)
  4831   4831 screen   RET   __select50 1
   "Window 0: EOF (errno 0) - killing window\n"
  5218   5218 screen   NAMI  "/dev/pts/7"
  5218   5218 screen   RET   open 0


vs. a successful one:


 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 26839  26839 screen   NAMI  "/dev/pts/7"
 26839  26839 screen   RET   open 0
 19165  19165 screen   RET   __select50 1
   "serv_select_fn called\n"
 19165  19165 screen   CALL  __select50(0x100,0x7f7fff30a910,0x7f7fff30a930,0,0)
 19165  19165 screen   RET   __select50 1


Using brute force to make screen pre-open the slave in the parent
process should take care of the race:

--- pty.c~  2021-12-29 23:50:37.231129335 +0300
+++ pty.c   2021-12-31 03:11:48.652558852 +0300
@@ -288,6 +288,7 @@ char **ttyn;
 }
   initmaster(f);
   *ttyn = TtyName;
+  pty_preopen = 1; /* XXX: uwe */
   return f;
 }
 #endif


This is the HAVE_SVR4_PTYS version of OpenPTY (yes, screen still uses
K&R function definitions :)
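
The idea behind pre-opening can be sketched outside of screen as follows. This is an illustration only, assuming the POSIX pty interface (posix_openpt, grantpt, unlockpt, ptsname); `open_pty_preopened` is a hypothetical helper and does not reproduce how screen itself acts on pty_preopen.

```c
#define _XOPEN_SOURCE 600   /* posix_openpt(), grantpt(), unlockpt(), ptsname() */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Open a pty master and, before any child is forked, also open the
 * slave side in the same process.  Since the slave is already open when
 * the parent starts polling the master, the parent cannot win a race
 * against a child that has not opened the slave yet.
 * Returns the master fd and stores the slave fd in *slave_out,
 * or returns -1 on error. */
int open_pty_preopened(int *slave_out)
{
    int master = posix_openpt(O_RDWR | O_NOCTTY);
    if (master < 0)
        return -1;

    if (grantpt(master) < 0 || unlockpt(master) < 0) {
        close(master);
        return -1;
    }

    const char *name = ptsname(master);   /* slave device path */
    if (name == NULL) {
        close(master);
        return -1;
    }

    int slave = open(name, O_RDWR | O_NOCTTY);
    if (slave < 0) {
        close(master);
        return -1;
    }

    *slave_out = slave;
    return master;
}
```

The caller would keep the slave descriptor open until the child has opened the slave itself (or hand the descriptor to the child) and close it afterwards, so that a real EOF can still be seen once the child exits.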

-uwe


Re: UEFI dual-boot with Windows

2021-12-30 Thread Pedro Pinho
Not adding anything here, but this is something users ask every now and then
on different forums.

It would be awesome if the wiki contained a guide on how to set up a
dual boot, Windows/NetBSD and Linux/NetBSD. Including setting up rEFInd would
be the icing on the cake.

Happy new year.


On Wed, 29 Dec 2021 at 21:28, Todd Gruhn wrote:

> My system was built a year ago. I boot NetBSD from one HD.
> I boot Windoze from another HD.
> It also allows me to boot either of 2 CD/DVDs.
>
> The 2 CDs come in handy when upgrading NetBSD.
>
> I can choose which device to boot when UEFI comes up.
>
> On Wed, Dec 29, 2021 at 6:19 PM Chavdar Ivanov  wrote:
> >
> > I boot my netbsd-current system in uefi mode from the second disk by
> > selecting its .efi file; I lost my default rEFInd setup when I downgraded
> > the first disk from W11 to W10 and haven’t tried to recover it yet, it also
> > can be started by selecting its .efi file. I have never copied the system
> > kernel on the efi partition; there are three systems on the second disk
> > with their own efi partitions. This is on an HP envy 17 laptop, 5 years old.
> >
> > On Wed, 29 Dec 2021 at 17:19, Tobias Nygren  wrote:
> >>
> >> On Wed, 29 Dec 2021 17:05:08 + (UTC)
> >> Benny Siegert  wrote:
> >>
> >> > Hi!
> >> >
> >> > I re-installed Windows 10 on my machine, and it insisted on UEFI boot,
> >> > which killed my previous dual-booting setup with GRUB and legacy boot.
> >> >
> >> > NetBSD is on the second NVMe drive, while the first one is all
> >> > Windows.
> >> >
> >> > After installing Windows, I manually installed rEFInd into the EFI
> >> > partition. For NetBSD, I copied bootx64.efi to /EFI/NetBSD (so as not
> >> > to overwrite the existing /EFI/Boot/bootx64.efi, which I assume is from
> >> > Windows). I also copied a GENERIC NetBSD-9.2 kernel to  /netbsd.gz on
> >> > the EFI partition.
> >> >
> >> > After selecting NetBSD in rEFInd (which it auto-detects), I see the
> >> > NetBSD/x86 EFI boot (x64) banner. It proceeds to load a kernel from
> >> > "NAME=EFI system partition:netbsd.gz (howto 0x2)".
> >> >
> >> > Unfortunately, after the initial loader line with the sizes, the boot
> >> > seems to hang with no further output.
> >> >
> >> > Any ideas, hints or tips?
> >>
> >> I have a similar problem when I have a 4k sector NVMe drive installed.
> >> I suspect in my case it is a Dell firmware bug but not sure.
> >> It hangs for me when tearing down UEFI stuff before jumping to kernel.
> >>
> >> To rule out issues with the EFI system partition itself you could
> >> install a /EFI/NetBSD/boot.cfg to instruct bootx64.efi to load the
> >> kernel from hd1a:netbsd or whatever your FFS partition is named.
> >>
> >> -Tobias