Re: Module autounload proposal: opt-in, not opt-out
On Mon, Aug 08, 2022 at 07:18:05AM -0700, Paul Goyette wrote: > (personal note) > It really seems to me that the current module sub-systems is at > best a second-class capability. I often get the feeling that > others don't really care about modules, until it's the only way > to provide something else (dtrace). This proposal feels like > another nail in the modular coffin. Rather than disabling (part > of) the module feature, we should find ways to improve testing > the feature. I think there's a general recognition that it's a useful feature to have, even though it has real costs (like LOCKDEBUG being so slow); but on the other hand we're still dealing with the consequences of the ... missteps ... in the initial rollout, even though it was more than a decade ago. I can't even remember at this point what half of the technical showstoppers that we were supposed to swallow were, and I know you've fixed a lot of them, but there's still at least two(*) big ones left. Meanwhile the absolute refusal of certain people to listen to concerns or consider any plans but their own creates a lingering ... lack of appetite for working on the topic, even though I think all or nearly all those people have left the project and the circumstances have changed considerably. This is certainly true for me, and I don't think I'm the only one. So, mostly, nothing happens. Note that despite all the calls to remove major pieces of functionality in the intervening years, I don't think anyone's ever proposed removing the module system. (*) I'm thinking of the build scheme and resulting configuration management problems, and lack of logging/audited of what gets loaded. [One might think the latter would be simple but it isn't.] -- David A. Holland dholl...@netbsd.org
Re: mfii(4) and Dell PERC
On 8/08/22 21:37, Edgar Fuß wrote: Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode? I have several NetBSD systems with PERC H730P's mfii(4) says all configuration is done via the controller's BIOS. Does that mean I need to shut down in case a drive fails an I need to rebuild? no. on my systems powerd will notify me of a state change on the disk array, then I can use bioctl to see the current state. I can then hot swap the faulty disk and the array will automatically start rebuilding. Can I monitor the RAID state? yes. green-mountain# bioctl mfii0 show Volume Status Size Device/LabelLevel Stripe = 0 Online 2.6TRAID 564K 0:0 Online 894G 1:0.0 noencl DSF8> 0:1 Online 894G 1:1.0 noencl DSF8> 0:2 Online 894G 1:2.0 noencl DSF8> 0:3 Online 894G 1:3.0 noencl DSF8> Can I monitor the BBU Battery health? yes green-mountain# envstat -d mfii0 Current CritMax WarnMax WarnMin CritMin Unit mfii0 BBU state: TRUE mfii0 BBU voltage: 3.885 V mfii0 BBU current: 0.000 A mfii0 BBU temperature:26.000 degC mfii0:0:online cheers mark
Re: Module autounload proposal: opt-in, not opt-out
Thomas Klausner wrote: To reply to this point - I'm very interested in modules, but it doesn't meet a basic requirement for me - easily going back to a previous kernel when the new one is broken for some reason - if they share the same kernel version, they share the same /stand/amd64/9.99.99/modules directory, and if the problem is there, I'm stuck. I think /netbsd/ as a directory including a kernel and modules would be a solution. I think christos? proposed it at some point and but it never happened. Christos started on KERNEL_DIR option, but didn't finish it. I did some testing and filed several PRs (kern/55928, kern/55929, and kern/55930), but so far no further work has occurred. ++--+--+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses:| | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com| | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | | & Network Engineer | | pgoyett...@gmail.com | ++--+--+
Re: Module autounload proposal: opt-in, not opt-out
On Sun, Aug 07, 2022 at 23:08:47 +, Taylor R Campbell wrote: > Currently there are many types of modules that are autoloaded from > open-ended patterns: Our current auto-load policy is a bit too enthusiastic, IMHO. E.g. an "unknown" ioctl used to (probably still does) trigger autoload of compat module, so each time you run vi you have compat module loaded. One one hand it might seem convenient, but on the other hand we are getting into the "WAT?!" territory, like those dynamically typed languages that go out of their way to interpret your program in *some* way, using most vexing coercions, as if failure is not an option. IMO, it most certainly is an option and often the most sensible one too. Auto-unload seems to me like a kludge to compensate for the blind "let's try and see if this helps" auto-load policy. -uwe
Re: Module autounload proposal: opt-in, not opt-out
Paul Goyette writes: > (personal note) > It really seems to me that the current module sub-systems is at > best a second-class capability. I often get the feeling that > others don't really care about modules, until it's the only way > to provide something else (dtrace). This proposal feels like > another nail in the modular coffin. Rather than disabling (part > of) the module feature, we should find ways to improve testing > the feature. I'd just like to say that while I haven't gone down the "modules first" path, I have been watching your commits and cheering you on. I do use a few modules, and this is making me think I should try to run MODULAR, especially on machines with less memory. I'm a little scared of not even having UFS, but I can try it as the low-memory machine is not important. signature.asc Description: PGP signature
Re: Module autounload proposal: opt-in, not opt-out
On Mon, Aug 08, 2022 at 07:18:05AM -0700, Paul Goyette wrote: > It really seems to me that the current module sub-systems is at > best a second-class capability. I often get the feeling that > others don't really care about modules, until it's the only way > to provide something else (dtrace). To reply to this point - I'm very interested in modules, but it doesn't meet a basic requirement for me - easily going back to a previous kernel when the new one is broken for some reason - if they share the same kernel version, they share the same /stand/amd64/9.99.99/modules directory, and if the problem is there, I'm stuck. Also, I'm not clear on what the new workflow is for my old one: Update kernel: A. build.sh kernel=NAME ln -f /netbsd /netbsd.old install /netbsd /netbsd reboot B. build.sh what? ln -f /netbsd /netbsd.old ? something for old modules ? install .../netbsd /netbsd ? something for new modules, are they even built? ? reboot If the kernel is broken somehow A. drop to boot prompt boot /netbsd.old B. drop to bootprompt ? What do I do to tell the old kernel where its old modules are ? boot /netbsd.old I think /netbsd/ as a directory including a kernel and modules would be a solution. I think christos? proposed it at some point and but it never happened. Thomas
Re: Module autounload proposal: opt-in, not opt-out
On Mon, 8 Aug 2022, Martin Husemann wrote: On Mon, Aug 08, 2022 at 09:24:28AM -0400, Greg Troxel wrote: Given where we are, do you really mean we should withdraw every module from autoload that does not have a documented test result, right now? It seems far better to have them stay loaded than be unavailable. I'm not sure (and given Taylor's reply it is not easily doable anyway). However, I find the idea of offering auto-loadable modules which we do not fully trust kind of disturbing. On the other hand we have not heard[1] of many crash reports that are due to auto-unload, so maybe the problem is not that bad after all and we could fix the broken ones when we catch them. Martin 1: which might be just because the feature not being exercised very often I suspect that modules are not being used very much. After all, we include nearly everything in GENERIC kernels, so there's rarely any need to load modules. (The major exception being dtrace, IIUC, and that's almost exclusively for developers.) I'm probably one of the most ardent proponents of modules, regularly running a completely stripped-out kernel (minimal built-in module code). I'd like to suggest an alternative proposal: have a new sysctl(8) variable that controls whether the default is opt-in vs opt-out. Setting opt-in would require a module to explicitly return 0 from its modcmd(AUTOUNLOAD) routine, while setting opt-out would accept either 0 or ENOTTY (the current behavior). I'm not sure which would be the best default (and won't argue, regardlesss of which default is selected), but the ability to retain the current opt- out behavior is important to me, and might lead to more/better testing (albeit mostly ad-hoc vs formal testing). (personal note) It really seems to me that the current module sub-systems is at best a second-class capability. I often get the feeling that others don't really care about modules, until it's the only way to provide something else (dtrace). This proposal feels like another nail in the modular coffin. Rather than disabling (part of) the module feature, we should find ways to improve testing the feature. ++--+--+ | Paul Goyette | PGP Key fingerprint: | E-mail addresses:| | (Retired) | FA29 0E3B 35AF E8AE 6651 | p...@whooppee.com| | Software Developer | 0786 F758 55DE 53BA 7731 | pgoye...@netbsd.org | | & Network Engineer | | pgoyett...@gmail.com | ++--+--+
Re: Module autounload proposal: opt-in, not opt-out
On Mon, Aug 08, 2022 at 09:24:28AM -0400, Greg Troxel wrote: > Given where we are, do you really mean > > we should withdraw every module from autoload that does not have a > documented test result, right now? > > It seems far better to have them stay loaded than be unavailable. I'm not sure (and given Taylor's reply it is not easily doable anyway). However, I find the idea of offering auto-loadable modules which we do not fully trust kind of disturbing. On the other hand we have not heard[1] of many crash reports that are due to auto-unload, so maybe the problem is not that bad after all and we could fix the broken ones when we catch them. Martin 1: which might be just because the feature not being exercised very often
Re: Module autounload proposal: opt-in, not opt-out
Martin Husemann writes: > I think that all modules that we deliver and declare safe for *autoload* > should require to be actually tested, and a basic test (for me) includes > testing auto-unload. That does not cover races that slip through "casual" > testing, but should have caught the worst bugs. That's a reasonable position for adding modules, but > So the error in the cases you stumbled in is the autoload and keeping the > badly tested module autoloadable but forbid its unloading sounds a bit > strange to me. Given where we are, do you really mean we should withdraw every module from autoload that does not have a documented test result, right now? It seems far better to have them stay loaded than be unavailable. signature.asc Description: PGP signature
Re: mfii(4) and Dell PERC
Hi. On 2022/08/08 19:41, Edgar Fuß wrote: > Thanks for your answers. > >> Some people reported that kern/56669 (and perhaps kern/55192) still exist >> on some systems :-( > Hm. > >> bioctl mfi(i)X show > Ah, thanks. > What do I do in case a drive fails? Will adding a hot spare automagically > start a reconstruction? It will do. (I've never done "bioctl mfii0 hotspare add". I usually add a hotspare via BIOS setup menu.) >> If your system has other number, please let me know. > I don't have such a system yet. I wanted to find out about NetBSD > compatibility > before buying one. For the BBU monitoring, it's easy to add a new device support. -- --- SAITOH Masanobu (msai...@execsw.org msai...@netbsd.org)
virtio(4) bus_dmamap_sync audit
I reviewed the bus_dmamap_sync calls in sys/dev/pci/virtio.c and found several likely bugs by code inspection, including misuse of membar_*. The attached patch fixes all the issues I saw, and removes the membar_* calls in favour of bus_dmamap_sync. Details in attached commit message. qemu virtio-blk and virtio-net still seem to work in cursory testing. Are there any automated tests, ideally stress tests, of virtio(4)? Review or testing welcome! >From 39a2d4f14e7b7fe6d2ddb8bd621b459eb343f780 Mon Sep 17 00:00:00 2001 From: Taylor R Campbell Date: Mon, 8 Aug 2022 10:47:45 + Subject: [PATCH] virtio(4): Membar and bus_dmamap_sync audit. - Don't use membar_* for DMA. - Sync only the header and payload of the rings separately as needed. If we bounce, this avoids large memcpy when we only care about the header. - Sync uring with PREREAD before triggering anything that will return data in it. => Move uring PREREAD in virtio_enqueue_commit to _before_ updating vq->vq_avail->idx, which is the pointat which the DMA read is triggered in the `device' (host). => Omit needless membar_consumer in virtio_enqueue_commit -- may not work with DMA memory, and even if it does, redundant with bus_dmamap_sync uring PREREAD here. => XXX Does the device/host ever return unsolicited entries in the queue, or only solicited ones? If only solicited ones, the PREREAD in virtio_init_vq is redundant. - Sync uring with POSTREAD before we read from it. This way the DMA read into our buffer has finished before we read from the buffer. => Add missing uring POSTREAD in virtio_vq_is_enqueued, between read of vq->vq_used_idx and return to caller, so that the caller can safely use virtio_dequeue. => Add missing uring POSTREADs in virtio_start_vq_intr: . between entry from caller and the read of vq->vq_used_idx . between the read of vq->vq_used_idx and return to caller, so that the caller can safely use virtio_dequeue, just like virtio_vq_is_enqueued => Move uring POSTREADs in virtio_enqueue_commit to _before_ reading vq->vq_used->flags or *vq->vq_avail_event, not after. - After we write to aring, sync it with PREWRITE. This way we finish writing to our buffer before the DMA write from it. => Omit needless PREWRITE in virtio_init_vq -- we do the appropriate PREWRITE in virtio_enqueue_commit now. => Convert membar_producer to bus_dmamap_sync PREWRITE in virtio_enqueue_commit. => Omit incorrect aring POSTWRITE in virtio_enqueue_commit -- no need because the DMA write may not have completed yet at this point, and we already do a POSTWRITE in virtio_vq_is_enqueued. => Omit needless membar_producer in virtio_postpone_intr -- may not work with DMA memory, and even if it does, redundant with bus_dmamap_sync PREWRITE here. - After xfers to aring have completed, sync it with POSTWRITE. => Add missing aring POSTWRITE in virtio_free_vq, in case there are paths from virtio_enqueue_commit to here that don't go through virtio_is_enqueued. (If there are no such paths, then maybe we should KASSERT(vq->vq_queued == 0) in virtio_free_vq.) --- sys/dev/pci/virtio.c | 160 +-- 1 file changed, 122 insertions(+), 38 deletions(-) diff --git a/sys/dev/pci/virtio.c b/sys/dev/pci/virtio.c index 4c2d80c28e9a..f7189903a0e3 100644 --- a/sys/dev/pci/virtio.c +++ b/sys/dev/pci/virtio.c @@ -422,47 +422,111 @@ virtio_soft_intr(void *arg) static inline void vq_sync_descs(struct virtio_softc *sc, struct virtqueue *vq, int ops) { + /* availoffset == sizeof(vring_desc)*vq_num */ bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap, 0, vq->vq_availoffset, - ops); + ops); } static inline void -vq_sync_aring(struct virtio_softc *sc, struct virtqueue *vq, int ops) +vq_sync_aring_all(struct virtio_softc *sc, struct virtqueue *vq, int ops) { uint16_t hdrlen = offsetof(struct vring_avail, ring); + size_t payloadlen = sc->sc_nvqs * sizeof(uint16_t); + size_t usedlen = 0; + if (sc->sc_active_features & VIRTIO_F_RING_EVENT_IDX) - hdrlen += sizeof(uint16_t); + usedlen = sizeof(uint16_t); + bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap, + vq->vq_availoffset, hdrlen + payloadlen + usedlen, ops); +} + +static inline void +vq_sync_aring_header(struct virtio_softc *sc, struct virtqueue *vq, int ops) +{ + uint16_t hdrlen = offsetof(struct vring_avail, ring); + + bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap, + vq->vq_availoffset, hdrlen, ops); +} + +static inline void +vq_sync_aring_payload(struct virtio_softc *sc, struct virtqueue *vq, int ops) +{ + uint16_t hdrlen = offsetof(struct vring_avail, ring); + size_t payloadlen = sc->sc_nvqs * sizeof(uint16_t); bus_dmamap_sync(sc->sc_dmat, vq->vq_dmamap, - vq->vq_availo
Re: mfii(4) and Dell PERC
Thanks for your answers. > Some people reported that kern/56669 (and perhaps kern/55192) still exist > on some systems :-( Hm. > bioctl mfi(i)X show Ah, thanks. What do I do in case a drive fails? Will adding a hot spare automagically start a reconstruction? > If your system has other number, please let me know. I don't have such a system yet. I wanted to find out about NetBSD compatibility before buying one.
Re: mfii(4) and Dell PERC
On 2022/08/08 18:37, Edgar Fuß wrote: > I'm unsure whether this is the right list, is port-amd64 more appropriate? > > Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode? It seems PERC H730P's device ID is 0x005d. https://pci-ids.ucw.cz/read/PC/1000/005d It's the Invader series. I have Fujitsu PRAID EP400i. It's also Invader. It works with one of my machines, but it doesn't guarantee that your machine with PERC H730P works. Some people reported that kern/56669 (and perhaps kern/55192) still exist on some systems :-( > mfii(4) says all configuration is done via the controller's BIOS. > Does that mean I need to shut down in case a drive fails an I need to rebuild? > > Can I monitor the RAID state? bioctl mfi(i)X show > Can I monitor the BBU Battery health? Maybe. If your machine's battery_type is 1, 2, 5 or 6, you can do by envstat(8). #define MFI_BBU_TYPE_IBBU 1 #define MFI_BBU_TYPE_BBU2 #define MFI_BBU_TYPE_IBBU09 5 #define MFI_BBU_TYPE_CVPM02 6 If your system has other number, please let me know. > Thanks in advance. -- --- SAITOH Masanobu (msai...@execsw.org msai...@netbsd.org)
mfii(4) and Dell PERC
I'm unsure whether this is the right list, is port-amd64 more appropriate? Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode? mfii(4) says all configuration is done via the controller's BIOS. Does that mean I need to shut down in case a drive fails an I need to rebuild? Can I monitor the RAID state? Can I monitor the BBU Battery health? Thanks in advance.
Re: Module autounload proposal: opt-in, not opt-out
> Date: Mon, 8 Aug 2022 09:09:11 +0200 > From: Martin Husemann > > On Sun, Aug 07, 2022 at 11:08:47PM +, Taylor R Campbell wrote: > > Some modules might still be unsafe to unload, which is a bug -- but at > > least this way accidentally creating the wrong network interface or > > opening the wrong device won't set a ten-second time-bomb for the > > system like I occasionally stumble upon in the status quo. > > I think that all modules that we deliver and declare safe for *autoload* > should require to be actually tested, and a basic test (for me) includes > testing auto-unload. That does not cover races that slip through "casual" > testing, but should have caught the worst bugs. > > So the error in the cases you stumbled in is the autoload and keeping the > badly tested module autoloadable but forbid its unloading sounds a bit > strange to me. If you want to limit autoloadable modules to those that can be safely auto-unloaded, then we also need to have predefined lists, like we have for the exec modules now, for each of the following (and more) categories of autoloadable modules: - if_xyz.kmod, when you do `ifconfig xyzN create' - xyzfs.kmod, when you do `mount -t xyzfs ...' - xyzdev.kmod, when you open /dev/xyzdevN and there's an entry in sys/conf/majors for it - npf_alg_xyz.kmod, when you configure npf with an ALG xyz I'm OK with that but it's more work, and -- without auditing and testing _all_ of those possibilities to put them into the predefined safe lists -- will have a much bigger immediate negative impact on autoloadable functionality than my incremental proposal.
Re: Module autounload proposal: opt-in, not opt-out
On Sun, Aug 07, 2022 at 11:08:47PM +, Taylor R Campbell wrote: > Some modules might still be unsafe to unload, which is a bug -- but at > least this way accidentally creating the wrong network interface or > opening the wrong device won't set a ten-second time-bomb for the > system like I occasionally stumble upon in the status quo. I think that all modules that we deliver and declare safe for *autoload* should require to be actually tested, and a basic test (for me) includes testing auto-unload. That does not cover races that slip through "casual" testing, but should have caught the worst bugs. So the error in the cases you stumbled in is the autoload and keeping the badly tested module autoloadable but forbid its unloading sounds a bit strange to me. Martin