Re: confusion about root partition causes panic during startup

2023-07-22 Thread Mike Karels
Are you planning to commit the change to mountroot?

Mike

On 20 Jul 2023, at 21:37, Mike Karels wrote:

> Mateusz Guzik wrote:
>> On 7/20/23, Mike Karels wrote:
>>> I installed an additional NVMe drive on a system, and then booted.  It turns
>>> out that the new drive became nda0, renumbering the other drives.  The loader
>>> found the correct partition to boot (the only choice), and loaded the kernel
>>> correctly.  However, /etc/fstab still had the old name (nvd1p2), which is
>>> now drive 2.  I expected it to drop into single user, but instead the system
>>> panicked in vfs_mountroot_shuffle trying to switch root devices (see below).
>>> It doesn't seem that having the wrong root device in /etc/fstab should cause
>>> a panic; it makes it harder to patch the system.  I was unable to get the
>>> system to boot using boot-to-single-user or setting currdev, but I managed
>>> to remember doing "boot -a" from a loader prompt to get the system to ask for
>>> the root device before mounting it.  I can easily reproduce this to test.
>>> Probably the NDFREE_PNBUF() shouldn't happen if namei() returned an error.
>>>
>
>> ye, this should do it (untested):
>
>> diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
>> index 956d29e3f084..85398ff781e4 100644
>> --- a/sys/kern/vfs_mountroot.c
>> +++ b/sys/kern/vfs_mountroot.c
>> @@ -352,13 +352,13 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>>          NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspath);
>>          error = namei(&nd);
>>          if (error) {
>> -                NDFREE_PNBUF(&nd);
>>                  fspath = "/mnt";
>>                  NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE,
>>                      fspath);
>>                  error = namei(&nd);
>>          }
>>          if (!error) {
>> +                NDFREE_PNBUF(&nd);
>>                  vp = nd.ni_vp;
>>                  error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>>                  if (!error)
>> @@ -376,7 +376,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>>                  } else
>>                          vput(vp);
>>          }
>> -        NDFREE_PNBUF(&nd);
>>
>>          if (error)
>>                  printf("mountroot: unable to remount previous root "
>> @@ -387,6 +386,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>>          NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, "/dev");
>>          error = namei(&nd);
>>          if (!error) {
>> +                NDFREE_PNBUF(&nd);
>>                  vp = nd.ni_vp;
>>                  error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>>                  if (!error)
>
> That was missing the last change, and tabs were expanded.  I put it in
> by hand, and the patch works, at least to avoid this panic.  It still
> insisted on remounting root on nda1p2, which is not a root file system.
> Remounting /dev still failed without panicking, then it panicked because
> there was no /sbin/init.  Apparently it is necessary to use "boot -a"
> in this situation.  Too bad the loader option menu doesn't include that.
>
> Just to be clear what I tested, my patch follows.
>
>   Mike
>
> diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
> index 956d29e3f084..b08b2a3200f8 100644
> --- a/sys/kern/vfs_mountroot.c
> +++ b/sys/kern/vfs_mountroot.c
> @@ -352,13 +352,13 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>  	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspath);
>  	error = namei(&nd);
>  	if (error) {
> -		NDFREE_PNBUF(&nd);
>  		fspath = "/mnt";
>  		NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE,
>  		    fspath);
>  		error = namei(&nd);
>  	}
>  	if (!error) {
> +		NDFREE_PNBUF(&nd);
>  		vp = nd.ni_vp;
>  		error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>  		if (!error)
> @@ -376,7 +376,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>  		} else
>  			vput(vp);
>  	}
> -	NDFREE_PNBUF(&nd);
>
>  	if (error)
>  		printf("mountroot: unable to remount previous root "
> @@ -387,6 +386,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>  	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, "/dev");
>  	error = namei(&nd);
>  	if (!error) {
> +		NDFREE_PNBUF(&nd);
>  		vp = nd.ni_vp;
>  		error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>  		if (!error)
> @@ -413,7 +413,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount 
> *mpdevfs)
>   if 

Re: confusion about root partition causes panic during startup

2023-07-20 Thread Mike Karels
Mateusz Guzik wrote:
> On 7/20/23, Mike Karels wrote:
> > I installed an additional NVMe drive on a system, and then booted.  It turns
> > out that the new drive became nda0, renumbering the other drives.  The loader
> > found the correct partition to boot (the only choice), and loaded the kernel
> > correctly.  However, /etc/fstab still had the old name (nvd1p2), which is
> > now drive 2.  I expected it to drop into single user, but instead the system
> > panicked in vfs_mountroot_shuffle trying to switch root devices (see below).
> > It doesn't seem that having the wrong root device in /etc/fstab should cause
> > a panic; it makes it harder to patch the system.  I was unable to get the
> > system to boot using boot-to-single-user or setting currdev, but I managed
> > to remember doing "boot -a" from a loader prompt to get the system to ask for
> > the root device before mounting it.  I can easily reproduce this to test.
> > Probably the NDFREE_PNBUF() shouldn't happen if namei() returned an error.
> >

> ye, this should do it (untested):

> diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
> index 956d29e3f084..85398ff781e4 100644
> --- a/sys/kern/vfs_mountroot.c
> +++ b/sys/kern/vfs_mountroot.c
> @@ -352,13 +352,13 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>          NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspath);
>          error = namei(&nd);
>          if (error) {
> -                NDFREE_PNBUF(&nd);
>                  fspath = "/mnt";
>                  NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE,
>                      fspath);
>                  error = namei(&nd);
>          }
>          if (!error) {
> +                NDFREE_PNBUF(&nd);
>                  vp = nd.ni_vp;
>                  error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>                  if (!error)
> @@ -376,7 +376,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>                  } else
>                          vput(vp);
>          }
> -        NDFREE_PNBUF(&nd);
>
>          if (error)
>                  printf("mountroot: unable to remount previous root "
> @@ -387,6 +386,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
>          NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, "/dev");
>          error = namei(&nd);
>          if (!error) {
> +                NDFREE_PNBUF(&nd);
>                  vp = nd.ni_vp;
>                  error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
>                  if (!error)

That was missing the last change, and tabs were expanded.  I put it in
by hand, and the patch works, at least to avoid this panic.  It still
insisted on remounting root on nda1p2, which is not a root file system.
Remounting /dev still failed without panicking, then it panicked because
there was no /sbin/init.  Apparently it is necessary to use "boot -a"
in this situation.  Too bad the loader option menu doesn't include that.

Just to be clear what I tested, my patch follows.

Mike

diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
index 956d29e3f084..b08b2a3200f8 100644
--- a/sys/kern/vfs_mountroot.c
+++ b/sys/kern/vfs_mountroot.c
@@ -352,13 +352,13 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspath);
 	error = namei(&nd);
 	if (error) {
-		NDFREE_PNBUF(&nd);
 		fspath = "/mnt";
 		NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE,
 		    fspath);
 		error = namei(&nd);
 	}
 	if (!error) {
+		NDFREE_PNBUF(&nd);
 		vp = nd.ni_vp;
 		error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
 		if (!error)
@@ -376,7 +376,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
 		} else
 			vput(vp);
 	}
-	NDFREE_PNBUF(&nd);

 	if (error)
 		printf("mountroot: unable to remount previous root "
@@ -387,6 +386,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
 	NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, "/dev");
 	error = namei(&nd);
 	if (!error) {
+		NDFREE_PNBUF(&nd);
 		vp = nd.ni_vp;
 		error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
 		if (!error)
@@ -413,7 +413,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
 	if (error)
 		printf("mountroot: unable to remount devfs under /dev "
 		    "(error %d)\n", error);
-	NDFREE_PNBUF(&nd);

 	if (mporoot == mpdevfs) {
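
The ordering that the patch above establishes can be tried outside the kernel with a small user-space sketch. Everything named sketch_* below is hypothetical and only models the behaviour discussed in this thread: it assumes that a failed lookup has already given up its path buffer, so the buffer may be freed only after a lookup that succeeded. The "/mnt" fallback comes from the diff; the first path is passed in by the caller.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct nameidata; not the kernel type. */
struct sketch_nd {
	char *pnbuf;	/* models ni_cnd.cn_pnbuf */
	int bad_frees;	/* free attempts made while no buffer was held */
};

/* Hypothetical lookup: only the path equal to `present` resolves. */
static int
sketch_namei(struct sketch_nd *nd, const char *path, const char *present)
{
	size_t len;

	assert(nd != NULL && path != NULL && present != NULL);
	len = strlen(path) + 1;
	nd->pnbuf = malloc(len);
	if (nd->pnbuf == NULL)
		return (ENOMEM);
	memcpy(nd->pnbuf, path, len);
	if (strcmp(path, present) != 0) {
		/* Assumption: a failed lookup releases its own buffer. */
		free(nd->pnbuf);
		nd->pnbuf = NULL;
		return (ENOENT);
	}
	return (0);
}

/* Models NDFREE_PNBUF(); counts the calls that would trip the assertion. */
static void
sketch_free_pnbuf(struct sketch_nd *nd)
{
	if (nd->pnbuf == NULL) {
		nd->bad_frees++;
		return;
	}
	free(nd->pnbuf);
	nd->pnbuf = NULL;
}

/* Patched ordering: try `first`, fall back to "/mnt", free only on success. */
static int
sketch_remount_lookup(struct sketch_nd *nd, const char *first,
    const char *present)
{
	int error;

	error = sketch_namei(nd, first, present);
	if (error) {
		/* No free here: the failed lookup holds no buffer. */
		error = sketch_namei(nd, "/mnt", present);
	}
	if (!error)
		sketch_free_pnbuf(nd);
	return (error);
}
```

With this ordering, bad_frees stays at zero whether the first path resolves, only "/mnt" resolves, or neither does.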

Re: confusion about root partition causes panic during startup

2023-07-20 Thread Mateusz Guzik
On 7/20/23, Mike Karels wrote:
> I installed an additional NVMe drive on a system, and then booted.  It turns
> out that the new drive became nda0, renumbering the other drives.  The loader
> found the correct partition to boot (the only choice), and loaded the kernel
> correctly.  However, /etc/fstab still had the old name (nvd1p2), which is
> now drive 2.  I expected it to drop into single user, but instead the system
> panicked in vfs_mountroot_shuffle trying to switch root devices (see below).
> It doesn't seem that having the wrong root device in /etc/fstab should cause
> a panic; it makes it harder to patch the system.  I was unable to get the
> system to boot using boot-to-single-user or setting currdev, but I managed
> to remember doing "boot -a" from a loader prompt to get the system to ask for
> the root device before mounting it.  I can easily reproduce this to test.
> Probably the NDFREE_PNBUF() shouldn't happen if namei() returned an error.
>

ye, this should do it (untested):

diff --git a/sys/kern/vfs_mountroot.c b/sys/kern/vfs_mountroot.c
index 956d29e3f084..85398ff781e4 100644
--- a/sys/kern/vfs_mountroot.c
+++ b/sys/kern/vfs_mountroot.c
@@ -352,13 +352,13 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
         NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, fspath);
         error = namei(&nd);
         if (error) {
-                NDFREE_PNBUF(&nd);
                 fspath = "/mnt";
                 NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE,
                     fspath);
                 error = namei(&nd);
         }
         if (!error) {
+                NDFREE_PNBUF(&nd);
                 vp = nd.ni_vp;
                 error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
                 if (!error)
@@ -376,7 +376,6 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
                 } else
                         vput(vp);
         }
-        NDFREE_PNBUF(&nd);

         if (error)
                 printf("mountroot: unable to remount previous root "
@@ -387,6 +386,7 @@ vfs_mountroot_shuffle(struct thread *td, struct mount *mpdevfs)
         NDINIT(&nd, LOOKUP, FOLLOW | LOCKLEAF, UIO_SYSSPACE, "/dev");
         error = namei(&nd);
         if (!error) {
+                NDFREE_PNBUF(&nd);
                 vp = nd.ni_vp;
                 error = (vp->v_type == VDIR) ? 0 : ENOTDIR;
                 if (!error)



>   Mike
>
> Trying to mount root from ufs:/dev/nvd1p2 [rw]...
> WARNING: WITNESS option enabled, expect reduced performance.
> mountroot: unable to remount devfs under /dev (error 2)
> panic: Assertion _ndp->ni_cnd.cn_pnbuf != NULL failed at
> ../../../kern/vfs_mountroot.c:416
> cpuid = 19
> time = 11
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe006d3bac40
> vpanic() at vpanic+0x149/frame 0xfe006d3bac90
> panic() at panic+0x43/frame 0xfe006d3bacf0
> vfs_mountroot() at vfs_mountroot+0x1bf7/frame 0xfe006d3bae60
> start_init() at start_init+0x23/frame 0xfe006d3baef0
> fork_exit() at fork_exit+0x82/frame 0xfe006d3baf30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe006d3baf30
> --- trap 0x5c035c02, rip = 0x680c680c680c680c, rsp = 0x1b6b1f6b1b6b1b6b, rbp
> = 0x4eb54eb54eb54eb5 ---
> KDB: enter: panic
> [ thread pid 1 tid 12 ]
> Stopped at  kdb_enter+0x32: movq    $0,0xde7643(%rip)
>
>


-- 
Mateusz Guzik 



Re: confusion about root partition causes panic during startup

2023-07-20 Thread Warner Losh
On Thu, Jul 20, 2023, 1:27 PM Mike Karels wrote:

> I installed an additional NVMe drive on a system, and then booted.  It turns
> out that the new drive became nda0, renumbering the other drives.  The loader
> found the correct partition to boot (the only choice), and loaded the kernel
> correctly.  However, /etc/fstab still had the old name (nvd1p2), which is
> now drive 2.  I expected it to drop into single user, but instead the system
> panicked in vfs_mountroot_shuffle trying to switch root devices (see below).
> It doesn't seem that having the wrong root device in /etc/fstab should cause
> a panic; it makes it harder to patch the system.  I was unable to get the
> system to boot using boot-to-single-user or setting currdev, but I managed
> to remember doing "boot -a" from a loader prompt to get the system to ask for
> the root device before mounting it.  I can easily reproduce this to test.
> Probably the NDFREE_PNBUF() shouldn't happen if namei() returned an error.
>
> Mike
>
> Trying to mount root from ufs:/dev/nvd1p2 [rw]...
> WARNING: WITNESS option enabled, expect reduced performance.
> mountroot: unable to remount devfs under /dev (error 2)
> panic: Assertion _ndp->ni_cnd.cn_pnbuf != NULL failed at
> ../../../kern/vfs_mountroot.c:416
> cpuid = 19
> time = 11
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> 0xfe006d3bac40
> vpanic() at vpanic+0x149/frame 0xfe006d3bac90
> panic() at panic+0x43/frame 0xfe006d3bacf0
> vfs_mountroot() at vfs_mountroot+0x1bf7/frame 0xfe006d3bae60
> start_init() at start_init+0x23/frame 0xfe006d3baef0
> fork_exit() at fork_exit+0x82/frame 0xfe006d3baf30
> fork_trampoline() at fork_trampoline+0xe/frame 0xfe006d3baf30
> --- trap 0x5c035c02, rip = 0x680c680c680c680c, rsp = 0x1b6b1f6b1b6b1b6b,
> rbp = 0x4eb54eb54eb54eb5 ---
> KDB: enter: panic
> [ thread pid 1 tid 12 ]
> Stopped at  kdb_enter+0x32: movq    $0,0xde7643(%rip)
>


I'll have to see if I can recreate this. I've been running this way for a
long time...

Warner



confusion about root partition causes panic during startup

2023-07-20 Thread Mike Karels
I installed an additional NVMe drive on a system, and then booted.  It turns
out that the new drive became nda0, renumbering the other drives.  The loader
found the correct partition to boot (the only choice), and loaded the kernel
correctly.  However, /etc/fstab still had the old name (nvd1p2), which is
now drive 2.  I expected it to drop into single user, but instead the system
panicked in vfs_mountroot_shuffle trying to switch root devices (see below).
It doesn't seem that having the wrong root device in /etc/fstab should cause
a panic; it makes it harder to patch the system.  I was unable to get the
system to boot using boot-to-single-user or setting currdev, but I managed
to remember doing "boot -a" from a loader prompt to get the system to ask for
the root device before mounting it.  I can easily reproduce this to test.
Probably the NDFREE_PNBUF() shouldn't happen if namei() returned an error.

Mike

Trying to mount root from ufs:/dev/nvd1p2 [rw]...
WARNING: WITNESS option enabled, expect reduced performance.
mountroot: unable to remount devfs under /dev (error 2)
panic: Assertion _ndp->ni_cnd.cn_pnbuf != NULL failed at ../../../kern/vfs_mountroot.c:416
cpuid = 19
time = 11
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame 0xfe006d3bac40
vpanic() at vpanic+0x149/frame 0xfe006d3bac90
panic() at panic+0x43/frame 0xfe006d3bacf0
vfs_mountroot() at vfs_mountroot+0x1bf7/frame 0xfe006d3bae60
start_init() at start_init+0x23/frame 0xfe006d3baef0
fork_exit() at fork_exit+0x82/frame 0xfe006d3baf30
fork_trampoline() at fork_trampoline+0xe/frame 0xfe006d3baf30
--- trap 0x5c035c02, rip = 0x680c680c680c680c, rsp = 0x1b6b1f6b1b6b1b6b, rbp = 0x4eb54eb54eb54eb5 ---
KDB: enter: panic
[ thread pid 1 tid 12 ]
Stopped at  kdb_enter+0x32: movq    $0,0xde7643(%rip)
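
The failure reported above can be modelled in user space to show why the assertion fires. This is only a sketch under stated assumptions: the model_* names are hypothetical, and the model assumes, as the assertion message suggests, that a failed namei() has already released the path buffer, so that a later NDFREE_PNBUF() finds cn_pnbuf set to NULL. The "(error 2)" in the console output corresponds to ENOENT.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for struct nameidata; only the path buffer is kept. */
struct model_nd {
	char *pnbuf;	/* models ni_cnd.cn_pnbuf */
};

/*
 * Models namei(): allocates the path buffer, and releases it again
 * before returning when the lookup fails (an assumption of this sketch).
 */
static int
model_namei(struct model_nd *nd, const char *path, bool exists)
{
	size_t len;

	assert(nd != NULL && path != NULL);
	len = strlen(path) + 1;
	nd->pnbuf = malloc(len);
	if (nd->pnbuf == NULL)
		return (ENOMEM);
	memcpy(nd->pnbuf, path, len);
	if (!exists) {
		free(nd->pnbuf);
		nd->pnbuf = NULL;
		return (ENOENT);
	}
	return (0);
}

/*
 * Models NDFREE_PNBUF(): returns false in the case where the kernel
 * assertion "cn_pnbuf != NULL" would fire, true after a normal free.
 */
static bool
model_ndfree_pnbuf(struct model_nd *nd)
{
	assert(nd != NULL);
	if (nd->pnbuf == NULL)
		return (false);
	free(nd->pnbuf);
	nd->pnbuf = NULL;
	return (true);
}
```

Calling model_ndfree_pnbuf() after a failed model_namei() returns false, which is the situation at vfs_mountroot.c:416 in the trace; after a successful lookup it returns true.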