Date: Sat, 13 Aug 2022 14:44:46 +
From: Taylor R Campbell
Message-ID: <20220813144453.af03d60...@jupiter.mumble.net>
| When _userland_ opens the raw /dev/rdkN _character_ device, for a
| wedge on (say) raid0, the _kernel_ will do the equivalent of opening
|
> Date: Sat, 13 Aug 2022 19:47:54 +0700
> From: Robert Elz
>
> However, since we now know the issue we have been looking at does involve
> the raw devices, not the block ones, I'm not sure what is the point of
> reverting that specfs_vnode.c patch, which only affects the block device
> open. If
Turns out that I was misled by cvs diff ... I did that on the kernel
after the previous crash, just to verify that nothing unexpected had
happened, but forgot that it diffs the checked-out version against
the version it was checked out from. When I saw no diffs from
specfs_vnode.c I jumped to
OK, ignore the previous crash - whatever it was, either it was some
weird one-off, or it was something unrelated that has been fixed in
the past 18 hours (and introduced not long before that).
I did as I said, updated the src tree again, backed out your dk.c patch,
and rebuilt the
Date: Sat, 13 Aug 2022 12:10:46 - (UTC)
From: mlel...@serpens.de (Michael van Elst)
Message-ID:
| That panic should be fixed by now, it was an inverted assertion.
OK, thanks - I did see it gone in my latest test (the message about
which was delayed getting to
k...@munnari.oz.au (Robert Elz) writes:
>vpanic()
>kern_assert()
>_bus_dmamem_unmap.constprop.0() at +0x157
That panic should be fixed by now, it was an inverted assertion.
Date: Fri, 12 Aug 2022 23:35:26 +
From: Taylor R Campbell
Message-ID: <20220812233531.8c22560...@jupiter.mumble.net>
| Can you try _reverting_ specfs_blockopen.patch, and _applying_ the
| attached dkopenclose.patch, and see if you can reproduce any crash?
OK,
> Date: Sat, 13 Aug 2022 05:59:01 +0700
> From: Robert Elz
>
> Why is fsck running on the block device though? And devpubd too? Given
> the reference to cdev_close() I'd assumed it was a char (raw) device that
> was being used, which would be what fsck certainly should be using. But
> I see
Date: Fri, 12 Aug 2022 21:07:14 +
From: Taylor R Campbell
Message-ID: <20220812210719.ca66b60...@jupiter.mumble.net>
| Here's a hypothesis about what happened.
|
| - You have a RAID volume, say raid0, with a GPT partitioning it.
|
| - raid0 is configured
Here's a hypothesis about what happened.
- You have a RAID volume, say raid0, with a GPT partitioning it.
- raid0 is configured with an explicit /etc/raid0.conf file, rather
than with an autoconfigured label.
- You have devpubd=YES in /etc/rc.conf.
On boot, the following sequence of events
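A minimal sketch of the setup this hypothesis describes, assuming the usual
NetBSD tooling (only devpubd=YES and the explicit /etc/raid0.conf come from
the message; the raidctl and dkctl invocations are illustrative, not quoted
from it):

```shell
# /etc/rc.conf (as described in the message)
devpubd=YES

# raid0 assembled from an explicit config file, not an autoconfig label:
raidctl -c /etc/raid0.conf raid0

# The GPT partitioning raid0 then yields dk wedges, e.g.:
dkctl raid0 makewedges
```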
> Date: Sat, 13 Aug 2022 02:13:23 +0700
> From: Robert Elz
>
> Apart from dkctl() and the two fsck_ffs processes, there was
> the parent fsck (just in wait) and (of course) init, and a whole
> set of sh processes (many in pipe_rd or similar) - which will be
> the infrastructure used for running
Can you try the attached patch and see if (a) it still panics, or if
(b) there are any other adverse consequences like fsck failing?
From 7ade169092246e8870aaab2973d8bb252e625d89 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell
Date: Fri, 12 Aug 2022 20:01:50 +
Subject: [PATCH] specfs:
Date: Fri, 12 Aug 2022 17:10:47 +
From: Taylor R Campbell
Message-ID: <20220812171053.3e95260...@jupiter.mumble.net>
| I added a couple more assertions (spec_vnops.c 1.213).
OK, once again, crash on first boot - but this time (as I had only
one new kernel to
Date: Fri, 12 Aug 2022 16:40:13 +
From: Taylor R Campbell
Message-ID: <20220812164019.34ded60...@jupiter.mumble.net>
| This is extremely unlikely. You might try removing the assertion that
| it tripped on, KASSERT(!sd->sd_opened), so that it has the
I added a couple more assertions (spec_vnops.c 1.213).
Attached is an updated patch with the ABI change to record who was
closing when it shouldn't be possible.
From 4df657b9ca2112c556eb7aaf7b6ed5b6912603c1 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell
Date: Sat, 16 Apr 2022 11:45:06 +
> Date: Fri, 12 Aug 2022 23:28:13 +0700
> From: Robert Elz
>
> I think I am going to try backing out the patch, and just run with your
> (Taylor's) updated specfs_vnops.c in case there is something in the patch
> which is altering the behaviour (more than just the nature of the sd_closing
>
Date: Thu, 11 Aug 2022 13:01:34 +
From: Taylor R Campbell
Message-ID: <20220811130139.b310860...@jupiter.mumble.net>
| If that still doesn't help, you could try the attached patch (which
| I'm not committing at the moment because it changes the kernel ABI).
|
Date: Thu, 11 Aug 2022 13:01:34 +
From: Taylor R Campbell
Message-ID: <20220811130139.b310860...@jupiter.mumble.net>
| If you cvs up spec_vnops.c to 1.211, you may get a different assertion
| firing, or not; which assertion fires now might potentially help to
> Date: Thu, 11 Aug 2022 13:01:34 +
> From: Taylor R Campbell
>
> If that still doesn't help, you could try the attached patch (which
> I'm not committing at the moment because it changes the kernel ABI).
...patch actually attached now
From 11428c92e7bc51a35942c2024b95c756af8c9fc6 Mon Sep
I've seen something like that in syzkaller before:
https://syzkaller.appspot.com/bug?id=47c67ab6d3a87514d0707882a9ad6671beaa864
If you cvs up spec_vnops.c to 1.211, you may get a different assertion
firing, or not; which assertion fires now might potentially help to
diagnose the problem.
If
A few times recently, I have seen the following panic (from 9.99.99)
[36.426616] panic: kernel diagnostic assertion "sd->sd_closing" failed:
file "/readonly/release/testing/src/sys/miscfs/specfs/spec_vnops.c", line 1725
[36.426616] cpu0: Begin traceback...
[36.426616] vpanic() at