Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread David Holland
On Mon, Aug 08, 2022 at 07:18:05AM -0700, Paul Goyette wrote:
 > (personal note)
 > It really seems to me that the current module sub-systems is at
 > best a second-class capability.  I often get the feeling that
 > others don't really care about modules, until it's the only way
 > to provide something else (dtrace).  This proposal feels like
 > another nail in the modular coffin.  Rather than disabling (part
 > of) the module feature, we should find ways to improve testing
 > the feature.

I think there's a general recognition that it's a useful feature to
have, even though it has real costs (like LOCKDEBUG being so slow);
but on the other hand we're still dealing with the consequences of the
... missteps ... in the initial rollout, even though it was more than
a decade ago.

I can't even remember at this point what half of the technical
showstoppers that we were supposed to swallow were, and I know you've
fixed a lot of them, but there's still at least two(*) big ones left.
Meanwhile the absolute refusal of certain people to listen to concerns
or consider any plans but their own creates a lingering ... lack of
appetite for working on the topic, even though I think all or nearly
all those people have left the project and the circumstances have
changed considerably. This is certainly true for me, and I don't think
I'm the only one. So, mostly, nothing happens.

Note that despite all the calls to remove major pieces of
functionality in the intervening years, I don't think anyone's ever
proposed removing the module system.


(*) I'm thinking of the build scheme and resulting configuration
management problems, and lack of logging/audited of what gets loaded.
[One might think the latter would be simple but it isn't.]

-- 
David A. Holland
dholl...@netbsd.org


Re: mfii(4) and Dell PERC

2022-08-08 Thread Mark Davies




On 8/08/22 21:37, Edgar Fuß wrote:

Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode?


I have several NetBSD systems with PERC H730P's


mfii(4) says all configuration is done via the controller's BIOS.
Does that mean I need to shut down in case a drive fails an I need to rebuild?


no.
on my systems powerd will notify me of a state change on the disk array, 
then I can use bioctl to see the current state.  I can then hot swap the 
faulty disk and the array will automatically start rebuilding.




Can I monitor the RAID state?


yes.

green-mountain# bioctl mfii0 show
Volume Status   Size Device/LabelLevel Stripe
=
 0 Online   2.6TRAID 564K
   0:0 Online   894G 1:0.0 noencl DSF8>
   0:1 Online   894G 1:1.0 noencl DSF8>
   0:2 Online   894G 1:2.0 noencl DSF8>
   0:3 Online   894G 1:3.0 noencl DSF8>





Can I monitor the BBU Battery health?


yes

green-mountain# envstat -d mfii0
 Current  CritMax  WarnMax  WarnMin  CritMin  Unit
  mfii0 BBU state:  TRUE
mfii0 BBU voltage: 3.885 V
mfii0 BBU current: 0.000 A
mfii0 BBU temperature:26.000  degC
  mfii0:0:online


cheers
mark


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Paul Goyette

Thomas Klausner wrote:


To reply to this point - I'm very interested in modules, but it
doesn't meet a basic requirement for me - easily going back to a
previous kernel when the new one is broken for some reason - if they
share the same kernel version, they share the same
/stand/amd64/9.99.99/modules directory, and if the problem is there,
I'm stuck.





I think /netbsd/ as a directory including a kernel and modules would
be a solution. I think christos? proposed it at some point and but it
never happened.


Christos started on KERNEL_DIR option, but didn't finish it.  I did
some testing and filed several PRs (kern/55928, kern/55929, and
kern/55930), but so far no further work has occurred.


++--+--+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses:|
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com|
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org  |
| & Network Engineer |  | pgoyett...@gmail.com |
++--+--+


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Valery Ushakov
On Sun, Aug 07, 2022 at 23:08:47 +, Taylor R Campbell wrote:

> Currently there are many types of modules that are autoloaded from
> open-ended patterns:

Our current auto-load policy is a bit too enthusiastic, IMHO.  E.g. an
"unknown" ioctl used to (probably still does) trigger autoload of
compat module, so each time you run vi you have compat module loaded.
One one hand it might seem convenient, but on the other hand we are
getting into the "WAT?!" territory, like those dynamically typed
languages that go out of their way to interpret your program in *some*
way, using most vexing coercions, as if failure is not an option.
IMO, it most certainly is an option and often the most sensible one
too.

Auto-unload seems to me like a kludge to compensate for the blind
"let's try and see if this helps" auto-load policy.

-uwe


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Greg Troxel

Paul Goyette  writes:

> (personal note)
> It really seems to me that the current module sub-systems is at
> best a second-class capability.  I often get the feeling that
> others don't really care about modules, until it's the only way
> to provide something else (dtrace).  This proposal feels like
> another nail in the modular coffin.  Rather than disabling (part
> of) the module feature, we should find ways to improve testing
> the feature.

I'd just like to say that while I haven't gone down the "modules first"
path, I have been watching your commits and cheering you on.

I do use a few modules, and this is making me think I should try to run
MODULAR, especially on machines with less memory.
I'm a little scared of not even having UFS, but I can try it as the
low-memory machine is not important.


signature.asc
Description: PGP signature


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Thomas Klausner
On Mon, Aug 08, 2022 at 07:18:05AM -0700, Paul Goyette wrote:
> It really seems to me that the current module sub-systems is at
> best a second-class capability.  I often get the feeling that
> others don't really care about modules, until it's the only way
> to provide something else (dtrace).

To reply to this point - I'm very interested in modules, but it
doesn't meet a basic requirement for me - easily going back to a
previous kernel when the new one is broken for some reason - if they
share the same kernel version, they share the same
/stand/amd64/9.99.99/modules directory, and if the problem is there,
I'm stuck.

Also, I'm not clear on what the new workflow is for my old one:

Update kernel:

A.
   build.sh kernel=NAME
   ln -f /netbsd /netbsd.old
   install /netbsd /netbsd
   reboot

B.
   build.sh what?
   ln -f /netbsd /netbsd.old
   ? something for old modules ?
   install .../netbsd /netbsd
   ? something for new modules, are they even built? ?
   reboot

If the kernel is broken somehow

A. drop to boot prompt
   boot /netbsd.old

B. drop to bootprompt
   ? What do I do to tell the old kernel where its old modules are ?
   boot /netbsd.old

I think /netbsd/ as a directory including a kernel and modules would
be a solution. I think christos? proposed it at some point and but it
never happened.
 Thomas


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Paul Goyette

On Mon, 8 Aug 2022, Martin Husemann wrote:


On Mon, Aug 08, 2022 at 09:24:28AM -0400, Greg Troxel wrote:

Given where we are, do you really mean

  we should withdraw every module from autoload that does not have a
  documented test result, right now?

It seems far better to have them stay loaded than be unavailable.


I'm not sure (and given Taylor's reply it is not easily doable anyway).

However, I find the idea of offering auto-loadable modules which we do
not fully trust kind of disturbing.

On the other hand we have not heard[1] of many crash reports that are due
to auto-unload, so maybe the problem is not that bad after all and we could
fix the broken ones when we catch them.



Martin

1: which might be just because the feature not being exercised very often


I suspect that modules are not being used very much.  After all, we
include nearly everything in GENERIC kernels, so there's rarely any
need to load modules.  (The major exception being dtrace, IIUC, and
that's almost exclusively for developers.)  I'm probably one of the
most ardent proponents of modules, regularly running a completely
stripped-out kernel (minimal built-in module code).

I'd like to suggest an alternative proposal: have a new sysctl(8)
variable that controls whether the default is opt-in vs opt-out.
Setting opt-in would require a module to explicitly return 0 from
its modcmd(AUTOUNLOAD) routine, while setting opt-out would accept
either 0 or ENOTTY (the current behavior).  I'm not sure which
would be the best default (and won't argue, regardlesss of which
default is selected), but the ability to retain the current opt-
out behavior is important to me, and might lead to more/better
testing (albeit mostly ad-hoc vs formal testing).

(personal note)
It really seems to me that the current module sub-systems is at
best a second-class capability.  I often get the feeling that
others don't really care about modules, until it's the only way
to provide something else (dtrace).  This proposal feels like
another nail in the modular coffin.  Rather than disabling (part
of) the module feature, we should find ways to improve testing
the feature.


++--+--+
| Paul Goyette   | PGP Key fingerprint: | E-mail addresses:|
| (Retired)  | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com|
| Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org  |
| & Network Engineer |  | pgoyett...@gmail.com |
++--+--+



Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Martin Husemann
On Mon, Aug 08, 2022 at 09:24:28AM -0400, Greg Troxel wrote:
> Given where we are, do you really mean
> 
>   we should withdraw every module from autoload that does not have a
>   documented test result, right now?
> 
> It seems far better to have them stay loaded than be unavailable.

I'm not sure (and given Taylor's reply it is not easily doable anyway).

However, I find the idea of offering auto-loadable modules which we do
not fully trust kind of disturbing.

On the other hand we have not heard[1] of many crash reports that are due
to auto-unload, so maybe the problem is not that bad after all and we could
fix the broken ones when we catch them.



Martin

1: which might be just because the feature not being exercised very often


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Greg Troxel

Martin Husemann  writes:

> I think that all modules that we deliver and declare safe for *autoload*
> should require to be actually tested, and a basic test (for me) includes
> testing auto-unload. That does not cover races that slip through "casual"
> testing, but should have caught the worst bugs.

That's a reasonable position for adding modules, but

> So the error in the cases you stumbled in is the autoload and keeping the
> badly tested module autoloadable but forbid its unloading sounds a bit
> strange to me.

Given where we are, do you really mean

  we should withdraw every module from autoload that does not have a
  documented test result, right now?

It seems far better to have them stay loaded than be unavailable.


signature.asc
Description: PGP signature


Re: mfii(4) and Dell PERC

2022-08-08 Thread Masanobu SAITOH
Hi.

On 2022/08/08 19:41, Edgar Fuß wrote:
> Thanks for your answers.
> 
>> Some people reported that kern/56669 (and perhaps kern/55192) still exist 
>> on some systems :-(
> Hm.
> 
>> bioctl mfi(i)X show
> Ah, thanks.
> What do I do in case a drive fails? Will adding a hot spare automagically 
> start a reconstruction?

It will do.
(I've never done "bioctl mfii0 hotspare add". I usually add a hotspare
via BIOS setup menu.)

>> If your system has other number, please let me know.
> I don't have such a system yet. I wanted to find out about NetBSD 
> compatibility 
> before buying one.

For the BBU monitoring, it's easy to add a new device support.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


virtio(4) bus_dmamap_sync audit

2022-08-08 Thread Taylor R Campbell
I reviewed the bus_dmamap_sync calls in sys/dev/pci/virtio.c and found
several likely bugs by code inspection, including misuse of membar_*.
The attached patch fixes all the issues I saw, and removes the
membar_* calls in favour of bus_dmamap_sync.  Details in attached
commit message.

qemu virtio-blk and virtio-net still seem to work in cursory testing.
Are there any automated tests, ideally stress tests, of virtio(4)?

Review or testing welcome!
>From 39a2d4f14e7b7fe6d2ddb8bd621b459eb343f780 Mon Sep 17 00:00:00 2001
From: Taylor R Campbell 
Date: Mon, 8 Aug 2022 10:47:45 +
Subject: [PATCH] virtio(4): Membar and bus_dmamap_sync audit.

- Don't use membar_* for DMA.

- Sync only the header and payload of the rings separately as needed.
  If we bounce, this avoids large memcpy when we only care about the
  header.

- Sync uring with PREREAD before triggering anything that will return
  data in it.

  => Move uring PREREAD in virtio_enqueue_commit to _before_ updating
 vq->vq_avail->idx, which is the pointat which the DMA read is
 triggered in the `device' (host).

  => Omit needless membar_consumer in virtio_enqueue_commit -- may not
 work with DMA memory, and even if it does, redundant with
 bus_dmamap_sync uring PREREAD here.

  => XXX Does the device/host ever return unsolicited entries in the
 queue, or only solicited ones?  If only solicited ones, the
 PREREAD in virtio_init_vq is redundant.

- Sync uring with POSTREAD before we read from it.  This way the DMA
  read into our buffer has finished before we read from the buffer.

  => Add missing uring POSTREAD in virtio_vq_is_enqueued, between read of
 vq->vq_used_idx and return to caller, so that the caller can
 safely use virtio_dequeue.

  => Add missing uring POSTREADs in virtio_start_vq_intr:
 . between entry from caller and the read of vq->vq_used_idx
 . between the read of vq->vq_used_idx and return to caller,
   so that the caller can safely use virtio_dequeue, just like
   virtio_vq_is_enqueued

  => Move uring POSTREADs in virtio_enqueue_commit to _before_ reading
 vq->vq_used->flags or *vq->vq_avail_event, not after.

- After we write to aring, sync it with PREWRITE.  This way we finish
  writing to our buffer before the DMA write from it.

  => Omit needless PREWRITE in virtio_init_vq -- we do the appropriate
 PREWRITE in virtio_enqueue_commit now.

  => Convert membar_producer to bus_dmamap_sync PREWRITE in
 virtio_enqueue_commit.

  => Omit incorrect aring POSTWRITE in virtio_enqueue_commit -- no need
 because the DMA write may not have completed yet at this point,
 and we already do a POSTWRITE in virtio_vq_is_enqueued.

  => Omit needless membar_producer in virtio_postpone_intr -- may not
 work with DMA memory, and even if it does, redundant with
 bus_dmamap_sync PREWRITE here.

- After xfers to aring have completed, sync it with POSTWRITE.

  => Add missing aring POSTWRITE in virtio_free_vq, in case there are
 paths from virtio_enqueue_commit to here that don't go through
 virtio_is_enqueued.  (If there are no such paths, then maybe we
 should KASSERT(vq->vq_queued == 0) in virtio_free_vq.)
---
 sys/dev/pci/virtio.c | 160 +--
 1 file changed, 122 insertions(+), 38 deletions(-)

diff --git a/sys/dev/pci/virtio.c b/sys/dev/pci/virtio.c
index 4c2d80c28e9a..f7189903a0e3 100644
--- a/sys/dev/pci/virtio.c
+++ b/sys/dev/pci/virtio.c
@@ -422,47 +422,111 @@ virtio_soft_intr(void *arg)
 static inline void
 vq_sync_descs(struct virtio_softc *sc, struct virtqueue *vq, int ops)
 {
+
/* availoffset == sizeof(vring_desc)*vq_num */
bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap, 0, vq->vq_availoffset,
-   ops);
+   ops);
 }
 
 static inline void
-vq_sync_aring(struct virtio_softc *sc, struct virtqueue *vq, int ops)
+vq_sync_aring_all(struct virtio_softc *sc, struct virtqueue *vq, int ops)
 {
uint16_t hdrlen = offsetof(struct vring_avail, ring);
+   size_t payloadlen = sc->sc_nvqs * sizeof(uint16_t);
+   size_t usedlen = 0;
+
if (sc->sc_active_features & VIRTIO_F_RING_EVENT_IDX)
-   hdrlen += sizeof(uint16_t);
+   usedlen = sizeof(uint16_t);
+   bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap,
+   vq->vq_availoffset, hdrlen + payloadlen + usedlen, ops);
+}
+
+static inline void
+vq_sync_aring_header(struct virtio_softc *sc, struct virtqueue *vq, int ops)
+{
+   uint16_t hdrlen = offsetof(struct vring_avail, ring);
+
+   bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap,
+   vq->vq_availoffset, hdrlen, ops);
+}
+
+static inline void
+vq_sync_aring_payload(struct virtio_softc *sc, struct virtqueue *vq, int ops)
+{
+   uint16_t hdrlen = offsetof(struct vring_avail, ring);
+   size_t payloadlen = sc->sc_nvqs * sizeof(uint16_t);
 
bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap,
-   vq->vq_availo

Re: mfii(4) and Dell PERC

2022-08-08 Thread Edgar Fuß
Thanks for your answers.

> Some people reported that kern/56669 (and perhaps kern/55192) still exist 
> on some systems :-(
Hm.

> bioctl mfi(i)X show
Ah, thanks.
What do I do in case a drive fails? Will adding a hot spare automagically start 
a reconstruction?

> If your system has other number, please let me know.
I don't have such a system yet. I wanted to find out about NetBSD compatibility 
before buying one.


Re: mfii(4) and Dell PERC

2022-08-08 Thread Masanobu SAITOH



On 2022/08/08 18:37, Edgar Fuß wrote:
> I'm unsure whether this is the right list, is port-amd64 more appropriate?
> 
> Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode?

It seems PERC H730P's device ID is 0x005d.

https://pci-ids.ucw.cz/read/PC/1000/005d

It's the Invader series. I have Fujitsu PRAID EP400i. It's also Invader.
It works with one of my machines, but it doesn't guarantee that
your machine with PERC H730P works. Some people reported that kern/56669
(and perhaps kern/55192) still exist on some systems :-(


> mfii(4) says all configuration is done via the controller's BIOS.
> Does that mean I need to shut down in case a drive fails an I need to rebuild?
> 
> Can I monitor the RAID state?

bioctl mfi(i)X show

> Can I monitor the BBU Battery health?

Maybe. If your machine's battery_type is 1, 2, 5 or 6, you can do by envstat(8).

#define MFI_BBU_TYPE_IBBU   1
#define MFI_BBU_TYPE_BBU2
#define MFI_BBU_TYPE_IBBU09 5
#define MFI_BBU_TYPE_CVPM02 6

If your system has other number, please let me know.

> Thanks in advance.

-- 
---
SAITOH Masanobu (msai...@execsw.org
 msai...@netbsd.org)


mfii(4) and Dell PERC

2022-08-08 Thread Edgar Fuß
I'm unsure whether this is the right list, is port-amd64 more appropriate?

Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode?

mfii(4) says all configuration is done via the controller's BIOS.
Does that mean I need to shut down in case a drive fails an I need to rebuild?

Can I monitor the RAID state?
Can I monitor the BBU Battery health?

Thanks in advance.


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Taylor R Campbell
> Date: Mon, 8 Aug 2022 09:09:11 +0200
> From: Martin Husemann 
> 
> On Sun, Aug 07, 2022 at 11:08:47PM +, Taylor R Campbell wrote:
> > Some modules might still be unsafe to unload, which is a bug -- but at
> > least this way accidentally creating the wrong network interface or
> > opening the wrong device won't set a ten-second time-bomb for the
> > system like I occasionally stumble upon in the status quo.
> 
> I think that all modules that we deliver and declare safe for *autoload*
> should require to be actually tested, and a basic test (for me) includes
> testing auto-unload. That does not cover races that slip through "casual"
> testing, but should have caught the worst bugs.
> 
> So the error in the cases you stumbled in is the autoload and keeping the
> badly tested module autoloadable but forbid its unloading sounds a bit
> strange to me.

If you want to limit autoloadable modules to those that can be safely
auto-unloaded, then we also need to have predefined lists, like we
have for the exec modules now, for each of the following (and more)
categories of autoloadable modules:

- if_xyz.kmod, when you do `ifconfig xyzN create'
- xyzfs.kmod, when you do `mount -t xyzfs ...'
- xyzdev.kmod, when you open /dev/xyzdevN and there's an entry in
  sys/conf/majors for it
- npf_alg_xyz.kmod, when you configure npf with an ALG xyz

I'm OK with that but it's more work, and -- without auditing and
testing _all_ of those possibilities to put them into the predefined
safe lists -- will have a much bigger immediate negative impact on
autoloadable functionality than my incremental proposal.


Re: Module autounload proposal: opt-in, not opt-out

2022-08-08 Thread Martin Husemann
On Sun, Aug 07, 2022 at 11:08:47PM +, Taylor R Campbell wrote:
> Some modules might still be unsafe to unload, which is a bug -- but at
> least this way accidentally creating the wrong network interface or
> opening the wrong device won't set a ten-second time-bomb for the
> system like I occasionally stumble upon in the status quo.

I think that all modules that we deliver and declare safe for *autoload*
should require to be actually tested, and a basic test (for me) includes
testing auto-unload. That does not cover races that slip through "casual"
testing, but should have caught the worst bugs.

So the error in the cases you stumbled in is the autoload and keeping the
badly tested module autoloadable but forbid its unloading sounds a bit
strange to me.

Martin