Re: MCLADDREFERENCE() incrementing the wrong ext_refcnt?
> So I think
>	atomic_inc_uint(&(o)->m_ext.ext_refcnt);	\
> should really be
>	atomic_inc_uint(&(o)->m_ext_ref->m_ext.ext_refcnt);	\
> which, of course, is the same thing if MEXT_ISEMBEDDED(o) is true.
> Am I getting something wrong?

Self-answer: Yes. m_ext is m_ext_ref->m_ext_storage, so the additional indirection is already performed.
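To illustrate the self-answer, here is a minimal user-space sketch. The struct and function names (mbuf_sketch, ext_storage, sketch_init, sketch_add_reference) are made up for illustration and are far simpler than the real mbuf definitions; only the idea that m_ext expands to m_ext_ref->m_ext_storage is taken from the above. The increment is not atomic here.

```c
/* Hypothetical, heavily simplified stand-ins for the real mbuf structures. */
struct ext_storage {
	unsigned int ext_refcnt;		/* shared reference count */
};

struct mbuf_sketch {
	struct mbuf_sketch *m_ext_ref;		/* mbuf owning the storage */
	struct ext_storage m_ext_storage;	/* only meaningful in the owner */
};

/* The point of the self-answer: m_ext already goes through m_ext_ref. */
#define m_ext	m_ext_ref->m_ext_storage

/* Make m the owner of its own external storage with one reference. */
static void
sketch_init(struct mbuf_sketch *m)
{
	m->m_ext_ref = m;
	m->m_ext_storage.ext_refcnt = 1;
}

/* Simplified analogue of MCLADDREFERENCE(o, n); plain ++ instead of atomics. */
static void
sketch_add_reference(struct mbuf_sketch *o, struct mbuf_sketch *n)
{
	o->m_ext.ext_refcnt++;		/* expands to o->m_ext_ref->m_ext_storage */
	n->m_ext_ref = o->m_ext_ref;	/* n now shares the owner's storage */
}
```

With m1 initialised via sketch_init(), calling sketch_add_reference(&m1, &m2) and then sketch_add_reference(&m2, &m3) bumps the counter stored in m1 both times, because m2's m_ext_ref already points at m1.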
MCLADDREFERENCE() incrementing the wrong ext_refcnt?
Hello. I'm under the impression that MCLADDREFERENCE() may increment the wrong ext_refcnt. If it's permitted (I can't find anything to the contrary) to call MCLADDREFERENCE(m1, m2) and then MCLADDREFERENCE(m2, m3), then the second call will increment m2's ext_refcnt where it should be incrementing m1's (i.e. the one that the m_ext_ref of m1, m2 and m3 all point to), no? So I think
	atomic_inc_uint(&(o)->m_ext.ext_refcnt);	\
should really be
	atomic_inc_uint(&(o)->m_ext_ref->m_ext.ext_refcnt);	\
which, of course, is the same thing if MEXT_ISEMBEDDED(o) is true. Am I getting something wrong?
_KERNEL_OPT and 0x6e074def
What's the point of #include'ing opt_foobar.h only if _KERNEL_OPT is defined and what's magic about 0x6e074def?
Re: Notes on kern/57133
> One change I can try is to put the diagnostic printf higher in the
> scsipi_request function

As the problem is so simple to reproduce, I'd put it just below xs = arg. Or set a breakpoint on scsipi_get_opcodeinfo(), then, when hitting it, one on mpii_scsipi_request() (provided you find its address) and then step through it. Maybe dump the whole xs to see whether other fields are corrupted, too.

But I'd bet the problem is the ccb_done routine being called prematurely. You could introduce a mpii_nothing_done() routine that prints something when called, replace ccb->ccb_done = mpii_scsi_cmd_done with = mpii_nothing_done and move the real assignment to just before the mpii_start() call. Or so I think.
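A rough user-space sketch of that placeholder-callback idea follows. Everything in it (ccb_sketch, sketch_nothing_done, sketch_cmd_done, sketch_prepare, sketch_start) is a hypothetical stand-in, not the driver's actual code; it only shows the pattern of installing a noisy dummy handler first and the real one right before the command is started.

```c
#include <stdio.h>

/* Hypothetical stand-in for the driver's command control block. */
struct ccb_sketch {
	void (*ccb_done)(struct ccb_sketch *);
	int premature_calls;	/* completions seen before the start */
	int completed;		/* set by the real completion handler */
};

/* Placeholder handler: only reports that it was called too early. */
static void
sketch_nothing_done(struct ccb_sketch *ccb)
{
	ccb->premature_calls++;
	fprintf(stderr, "ccb_done called before the command was started\n");
}

/* Stand-in for the real completion routine. */
static void
sketch_cmd_done(struct ccb_sketch *ccb)
{
	ccb->completed = 1;
}

/* While the request is being set up, only the placeholder is installed. */
static void
sketch_prepare(struct ccb_sketch *ccb)
{
	ccb->premature_calls = 0;
	ccb->completed = 0;
	ccb->ccb_done = sketch_nothing_done;
}

/* The real handler is assigned at the last moment before the start. */
static void
sketch_start(struct ccb_sketch *ccb)
{
	ccb->ccb_done = sketch_cmd_done;
	/* the call that hands the command to the hardware would follow here */
}
```

Any completion that arrives between sketch_prepare() and sketch_start() then shows up as a message instead of silently finishing a half-built request.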
Re: Notes on kern/57133
> provide details on what this command is?

A3h/0Ch is REPORT SUPPORTED OPERATION CODES. The call is most probably from dev/scsipi:scsipi_get_opcodeinfo().

I'm still unsure how resid can be 0 at that point. scsipi_enqueue_xs() sets resid to datalen (which is undocumented). Apart from the path interpreting sense info, nobody tampers with resid. Can you check for resid != datalen in mpii_scsipi_request() just before the xs->xs_control & XS_CTL_POLL test (and, if it fires, print whether it's a polled request)?

Otherwise, I suspect ccb_done is set to mpii_scsi_cmd_done where it shouldn't be (i.e. some race/error for an mpt command that's not plain SCSI).
Re: dumping on RAIDframe
> you dump a memory block that isn't a multiple of a disk sector > (according to disklabel) You mean this one (from disklabel raid0): bytes/sector: 512 ?
Re: dumping on RAIDframe
EF> dumping to dev 18,1 (offset=1090767, size=8252262):

GO> Dumping to a RAID 1 set is supported in -8. But yes, none of those
GO> values seem to align with each other. 18,1 is 'raid0b' though, so that
GO> part seems correct.

MvE> offset and size relate to the dump data (dumplo and dumpsize), not
MvE> the partition.

So here are the various configs (raid0.conf from raidctl -G raid0); I omit sd1's info (normally identical/analogous to that of sd0) because I just pulled the disk to simulate a RAID failure. I'm unsure whether the dump attempt was with a healthy or an (artificially) failed RAID; I think it was with a healthy one.

      start       size  index  contents
          0          1         PMBR
          1          1         Pri GPT header
          2         32         Pri GPT table
         34       2014         Unused
       2048     262144      1  GPT part - EFI System
     264192  930869248      2  GPT part - NetBSD RAIDFrame component
  931133440       2015         Unused
  931135455         32         Sec GPT table
  931135487          1         Sec GPT header

/dev/rsd0: 2 wedges:
dk0: efi0, 262144 blocks at 2048, type: msdos
dk1: raid0, 930869248 blocks at 264192, type: raidframe

# raidctl config file for /dev/rraid0

START array
# numRow numCol numSpare
1 2 0

START disks
/dev/dk1
/dev/dk3

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100

# /dev/rraid0:
type: RAID
disk: raid0
label: lahn
flags:
bytes/sector: 512
sectors/track: 128
tracks/cylinder: 8
sectors/cylinder: 1024
cylinders: 909051
total sectors: 930869120
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0		# microseconds
track-to-track seek: 0	# microseconds
drivedata: 0

6 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:   2097152         0     4.2BSD      0     0     0  # (Cyl.      0 -   2047)
 b:  67108864   2097152       swap                     # (Cyl.   2048 -  67583)
 d: 930869120         0     unused      0     0        # (Cyl.      0 - 909051*)
 e:  62914560  69206016     4.2BSD      0     0     0  # (Cyl.  67584 - 129023)
 f: 629145600 132120576     4.2BSD      0     0     0  # (Cyl. 129024 - 743423)
boot.cfg location (was: GPT attributes in dkwedge [PATCH])
> boot[].cfg is > searched in EFI par[tit]ion /EFI/NetBSD/boot.cfg > and root partition /boot.cfg. But how can EFI locate it on the root partition if it tells where the root partition lives?
Locating boot.cfg on ESP (was: GPT attributes in dkwedge [PATCH])
> | It's not obviously where efiboot finds boot.cfg, since that's in > | esp:/EFI/NetBSD/boot.cfg or, > > And we correctly interpret that, always? It works for me on four servers I recently set up if I put it into /EFI/NetBSD on the ESP. It also, for reasons unknown to me, works on one other identical server I set up earlier (the prototype for the others) if put into the root of the ESP, but that doesn't work on the four others. I didn't find out why.
Re: panic on mfii(4) vd removal
> I get a panic if I remove a virtual disk from an mfii(4) device.

That's another blunder in mfii(4). Patch (including the last) attached.

Index: sys/dev/pci/mfii.c
===
RCS file: /cvsroot/src/sys/dev/pci/mfii.c,v
retrieving revision 1.3.2.7
diff -u -p -r1.3.2.7 mfii.c
--- sys/dev/pci/mfii.c	29 Sep 2022 14:41:43 -	1.3.2.7
+++ sys/dev/pci/mfii.c	22 Sep 2023 12:16:42 -
@@ -503,6 +503,8 @@ static const char *mfi_bbu_indicators[]
 };
 #endif
 
+#define MFI_BBU_SENSORS 4
+
 static void	mfii_init_ld_sensor(struct mfii_softc *, envsys_data_t *, int);
 static void	mfii_refresh_ld_sensor(struct mfii_softc *, envsys_data_t *);
 static void	mfii_attach_sensor(struct mfii_softc *, envsys_data_t *);
@@ -1373,18 +1375,20 @@ mfii_aen_ld_update(struct mfii_softc *sc
 		if (old == -1 && nld != -1) {
 			printf("%s: logical drive %d added (target %d)\n",
 			    DEVNAME(sc), i, nld);
+			sc->sc_ld[i].ld_present = 1;
 
 			// XXX scsi_probe_target(sc->sc_scsibus, i);
 
-			mfii_init_ld_sensor(sc, &sc->sc_sensors[i], i);
-			mfii_attach_sensor(sc, &sc->sc_sensors[i]);
+			mfii_init_ld_sensor(sc, &sc->sc_sensors[i + MFI_BBU_SENSORS], i);
+			mfii_attach_sensor(sc, &sc->sc_sensors[i + MFI_BBU_SENSORS]);
 		} else if (nld == -1 && old != -1) {
 			printf("%s: logical drive %d removed (target %d)\n",
 			    DEVNAME(sc), i, old);
+			sc->sc_ld[i].ld_present = 0;
 
 			scsipi_target_detach(&sc->sc_chan, i, 0, DETACH_FORCE);
 			sysmon_envsys_sensor_detach(sc->sc_sme,
-			    &sc->sc_sensors[i]);
+			    &sc->sc_sensors[i + MFI_BBU_SENSORS]);
 		}
 	}
 
@@ -3716,8 +3720,6 @@ freeme:
 #endif /* NBIO > 0 */
 
-#define MFI_BBU_SENSORS 4
-
 static void
 mfii_bbu(struct mfii_softc *sc, envsys_data_t *edata)
 {
panic on mfii(4) vd removal
I get a panic if I remove a virtual disk from an mfii(4) device. What I found out is that mfii_aen_ld_update() calls sysmon_envsys_sensor_detach(), which (near the end of the routine) calls TAILQ_REMOVE(). In that, the last statement (minus QUEUEDEBUG_TAILQ_POSTREMOVE()), which is *(elm)->field.tqe_prev = (elm)->field.tqe_next; fails because (elm)->field.tqe_prev is NULL. I seem to be confused how tail queues work internally, because it appears to me that removing the first entry will fail, which is obviously not the case? Any hints what's wrong?
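For reference, a small user-space sketch of the tail queue mechanics in question, using the TAILQ macros from <sys/queue.h>. The struct sensor type and both function names are made up for illustration. It shows that the first element's tqe_prev is not NULL but points at the head's tqh_first, so removing the first entry works; a NULL tqe_prev is what a zeroed element that was never inserted looks like, which is one plausible reading of the panic, not a confirmed diagnosis.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical element type; the real code uses envsys_data_t. */
struct sensor {
	int id;
	TAILQ_ENTRY(sensor) link;
};
TAILQ_HEAD(sensor_list, sensor);

/*
 * Returns 1 if the first element's tqe_prev points at the head's
 * tqh_first and removing that first element leaves the second one
 * at the front of the list.
 */
static int
first_prev_points_at_head(void)
{
	struct sensor_list head = TAILQ_HEAD_INITIALIZER(head);
	struct sensor a = { .id = 1 }, b = { .id = 2 };
	int ok;

	TAILQ_INSERT_TAIL(&head, &a, link);
	TAILQ_INSERT_TAIL(&head, &b, link);
	ok = (a.link.tqe_prev == &head.tqh_first);
	TAILQ_REMOVE(&head, &a, link);	/* *tqe_prev = tqe_next updates the head */
	return ok && TAILQ_FIRST(&head) == &b;
}

/* Returns 1 if a zero-initialised, never-inserted element has tqe_prev == NULL. */
static int
never_inserted_has_null_prev(void)
{
	struct sensor c = { .id = 3 };

	return c.link.tqe_prev == NULL;
}
```

Calling TAILQ_REMOVE() on an element like c would dereference that NULL tqe_prev in exactly the statement quoted above.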
Adding a virtual disk to mpii(4)
After adding a virtual disk to an mfii(4) device (racadm createvirtualdisk, in my case), you get a nice

mfii0: logical drive 2 added (target 2)

message, but scsictl scsibus0 scan 2 0 doesn't find any drive. That's because sc_ld[i].ld_present is still unset from mfii_attach() and so mfii_scsipi_request() will return early (with xs->error = XS_SELTIMEOUT).

The attached fix seems pretty damn obvious and appears to work. Shall I file a PR?

Index: sys/dev/pci/mfii.c
===
RCS file: /cvsroot/src/sys/dev/pci/mfii.c,v
retrieving revision 1.3.2.7
diff -u -p -r1.3.2.7 mfii.c
--- sys/dev/pci/mfii.c	29 Sep 2022 14:41:43 -	1.3.2.7
+++ sys/dev/pci/mfii.c	21 Sep 2023 13:38:43 -
@@ -1373,6 +1373,7 @@ mfii_aen_ld_update(struct mfii_softc *sc
 		if (old == -1 && nld != -1) {
 			printf("%s: logical drive %d added (target %d)\n",
 			    DEVNAME(sc), i, nld);
+			sc->sc_ld[i].ld_present = 1;
 
 			// XXX scsi_probe_target(sc->sc_scsibus, i);
 
@@ -1381,6 +1382,7 @@ mfii_aen_ld_update(struct mfii_softc *sc
 		} else if (nld == -1 && old != -1) {
 			printf("%s: logical drive %d removed (target %d)\n",
 			    DEVNAME(sc), i, old);
+			sc->sc_ld[i].ld_present = 0;
 
 			scsipi_target_detach(&sc->sc_chan, i, 0, DETACH_FORCE);
 			sysmon_envsys_sensor_detach(sc->sc_sme,
dumping on RAIDframe
Didn't RAIDframe recently (for certain values of "recently") gain the function to dump on a level 1 set? Should this work in -8? swapctl -z says "dump device is raid0b" (and raid0 is a level 1 RAID), but reboot 0x100 in DDB says dumping to dev 18,1 (offset=1090767, size=8252262): dump device not ready What am I missing? The offset (as reported by disklabel) of raid0b within raid0 is 2097152 (1G), the partition size is 67108864 (32G), so maybe something's wrong with the offset and size values (whatever unit they are in) DDB reports.
typo in raidN.conf leading to allegedly failed component
I set up a server with a RAIDframe level 1 RAID and forgot raidctl -A softroot. So I booted an installation kernel via PXE, typed in a /tmp/raid0.conf and did raidctl -c /tmp/raid0.conf raid0, only I mistyped the name of the first component. That led to "hosed component", but worse, it failed that component and apparently marked the first component as failed in the label of the second. So after raidctl -u raid0, correcting my typo and raidctl -c /tmp/raid0.conf raid0, I ended up with a failed first component that didn't really fail. Can that be improved?
raidctl -A softroot and a failed component
I had a RAIDframe level 1 RAID with the first component marked as failed, e.g.

component0: failed
/dev/dkN: optimal

and although the set was configured -A softroot, the kernel didn't configure raid0a as the root file system, presumably because the dk numbers didn't match. I was sitting in front of the console, so I could easily type raid0a etc., but this would have prevented an automatic boot. I'm afraid little can be done about that weird situation?
Re: Hard link creation witout write access
> a likely source of security issues. Why, exactly? I hope you need search permission to the original file (you certainly need search and write permission to the destination directory), so what can you do after the link you couldn't have done before? What about rename instead of link, should that be permitted?
Re: Maxphys on -current?
Hasn't there been a tls-maxphys branch?
unable to create xfer table DMA map for drive 0, error=12
I attached a 2.5" SSD to a machine, did a drvctl -r ata_hl atabus1 and got

svwsata0:1: unable to create xfer table DMA map for drive 0, error=12
wd2(svwsata0:1:0): using PIO mode 4

Is this a problem with the -6 that machine runs, or what does it mean?
Re: compare kernel config
> Do we have a reliable way to compare kernel configations? config -x and diff?
Re: USB-related panic in 8.2_STABLE
> The same patch should apply just as well on netbsd-8. OK, I just did that. But we still don't know what led to the disconnect. Does the ohci0: 1 scheduling overruns give any clue?
Re: USB-related panic in 8.2_STABLE
> list *(ugen_get_cdesc+0xb1)

0x802f8f2e is in ugen_get_cdesc (/usr/src-8/sys/dev/usb/ugen.c:1376).
1371		usb_config_descriptor_t *cdesc, *tdesc, cdescr;
1372		int len;
1373		usbd_status err;
1374
1375		if (index == USB_CURRENT_CONFIG_INDEX) {
1376			tdesc = usbd_get_config_descriptor(sc->sc_udev);
1377			len = UGETW(tdesc->wTotalLength);
1378			if (lenp)
1379				*lenp = len;
1380			cdesc = kmem_alloc(len, KM_SLEEP);

> list *(ugenioctl+0x9a4)

0x802f99d1 is in ugenioctl (/usr/src-8/sys/dev/usb/ugen.c:1668).
1663			    *usbd_get_device_descriptor(sc->sc_udev);
1664			break;
1665		case USB_GET_CONFIG_DESC:
1666			cd = (struct usb_config_desc *)addr;
1667			cdesc = ugen_get_cdesc(sc, cd->ucd_config_index, &cdesclen);
1668			if (cdesc == NULL)
1669				return EINVAL;
1670			cd->ucd_desc = *cdesc;
1671			kmem_free(cdesc, cdesclen);
1672			break;

Does that help? What about the

ohci0: 1 scheduling overruns

that preceded the detach that preceded the panic?
Re: USB-related panic in 8.2_STABLE
> You didn't give timing. Unfortunately, we don't know the timing. We don't know when and why the UPS disconnected. > normally the UPS doesn't disconnect It doesn't. Why should it?
SEGV in mmap() when building lang/gcc8 with devel/binutils
Sorry for the cross-post, but the problem is so weird that I can't tell what kind of problem it is.

For complicated reasons (see below for details), I'm trying to build lang/gcc8 so that it uses gas/gld from devel/binutils instead of /usr/bin/{as,ld}. I put

DEPENDS+= binutils-[0-9]*:../../devel/binutils
CONFIGURE_ARGS.NetBSD+= --with-gnu-ld --with-ld=${PREFIX}/bin/gld
CONFIGURE_ARGS.NetBSD+= --with-gnu-as --with-as=${PREFIX}/bin/gas

in Makefile. After quite some build time, some intermediate step chokes with:

checking for x86_64--netbsd-gcc... /var/work/pkgsrc/lang/gcc8/work/build/./gcc/xgcc -B/var/work/pkgsrc/lang/gcc8/work/build/./gcc/ -B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/bin/ -B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/lib/ -isystem /usr/pkg.compiler_boot/gcc8/x86_64--netbsd/include -isystem /usr/pkg.compiler_boot/gcc8/x86_64--netbsd/sys-include
checking for suffix of object files... configure: error: in `/var/work/pkgsrc/lang/gcc8/work/build/x86_64--netbsd/libgcc':
configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details.
gmake[2]: *** [Makefile:20348: configure-stage2-target-libgcc] Error 1
gmake[2]: Leaving directory '/var/work/pkgsrc/lang/gcc8/work/build'
gmake[1]: *** [Makefile:26109: stage2-bubble] Error 2
gmake[1]: Leaving directory '/var/work/pkgsrc/lang/gcc8/work/build'
gmake: *** [Makefile:949: all] Error 2
*** Error code 2

Stop.
make[1]: stopped in /usr/pkgsrc/lang/gcc8
*** Error code 1

Stop.
make: stopped in /usr/pkgsrc/lang/gcc8

According to config.log, the failing command is

/var/work/pkgsrc/lang/gcc8/work/build/./gcc/xgcc -B/var/work/pkgsrc/lang/gcc8/work/build/./gcc/ -B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/bin/ -B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/lib/ -isystem /usr/pkg.compiler_boot/gcc8/x86_64--netbsd/include -isystem /usr/pkg.compiler_boot/gcc8/x86_64--netbsd/sys-include -c -g -O2 -D_FORTIFY_SOURCE=2 -I/usr/include -I/usr/pkg.compiler_boot/include/python3.10 conftest.c >&5

ktrace-ing that manually reveals

cc1 CALL mmap(0,0x10,PROT_READ|PROT_WRITE,0x14001002,0x,0,0)
cc1 RET mmap 137581005111296/0x7d2112f0
cc1 PSIG SIGSEGV caught handler=0xa2e336 mask=(11): code=SEGV_MAPERR, addr=0x8, trap=6)

I don't even know what xgcc is. Any hints?

That's on 8.2_STABLE. What I'm really trying to do is to build for 6.1 using a chroot and kver. Without the patch to make gcc8 use pkgsrc binutils, it builds and seems to work (it can build itself into the standard LOCALBASE), but it fails on archivers/zstd because gcc emits a .S file (no inline assembly involved) which /usr/bin/as can't assemble (tzcntl and shrx). Any hints on that are welcome, too!
Re: ATA TRIM?
> According to that PDF, dholland is wrong. I fail to see a behaviour that would be allowed due to dholland@'s definition, but not according to the one you cited, nor the other way round.
acpiwmibus at acpiwmi0 not configured
I notice a line acpiwmibus at acpiwmi0 not configured in the autoconf messages. Indeed, my kernel config has acpiwmi* at acpi? and wmidell* at acpiwmibus? but no attachment for any acpiwmibus, nor does any other kernel config. Is there something magic about acpiwmibus or are the configs simply missing an appropriate line?
mpii_start() vs. mfii_start(): bus_space_write_raw_8(), bus_space_barrier()
I'm investigating timeout problems with my mpii(4) device (after the driver has been converted to MSI(-X)). I'm trying to understand both sys/dev/pci/mpii.c and mfii.c since they address the same hardware with different firmware. Comparing mpii_start() with mfii_start(), I'm stumbling over a number of differences I don't understand (I've removed some debug statements from mpii_start()):

void
mpii_start(struct mpii_softc *sc, struct mpii_ccb *ccb)
{
	struct mpii_request_header	*rhp;
	struct mpii_request_descr	descr;
#if defined(__LP64__) && 0
	u_long				*rdp = (u_long *)&descr;
#else
	u_int32_t			*rdp = (u_int32_t *)&descr;
#endif

	[...]

	bus_dmamap_sync(sc->sc_dmat, MPII_DMA_MAP(sc->sc_requests),
	    ccb->ccb_offset, sc->sc_request_size,
	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);

	[...]

#if defined(__LP64__) && 0
	bus_space_write_raw_8(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, *rdp);
#else
	mutex_enter(&sc->sc_req_mtx);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, rdp[0]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, 8, BUS_SPACE_BARRIER_WRITE);

	bus_space_write_4(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_HIGH, rdp[1]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, 8, BUS_SPACE_BARRIER_WRITE);
	mutex_exit(&sc->sc_req_mtx);
#endif
}

static void
mfii_start(struct mfii_softc *sc, struct mfii_ccb *ccb)
{
	uint32_t *r = (uint32_t *)&ccb->ccb_req;
#if defined(__LP64__)
	uint64_t buf;
#endif

	bus_dmamap_sync(sc->sc_dmat, MFII_DMA_MAP(sc->sc_requests),
	    ccb->ccb_request_offset, MFII_REQUEST_SIZE,
	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);

#if defined(__LP64__)
	buf = ((uint64_t)r[1] << 32) | r[0];
	bus_space_write_8(sc->sc_iot, sc->sc_ioh, MFI_IQPL, buf);
#else
	mutex_enter(&sc->sc_post_mtx);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh, MFI_IQPL, r[0]);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh, MFI_IQPH, r[1]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MFI_IQPL, 8, BUS_SPACE_BARRIER_WRITE);
	mutex_exit(&sc->sc_post_mtx);
#endif
}

1. __LP64__ handling: Is the LP64 case simply an optimization or is it safer on the relevant platforms?

2. bus_space_write_raw_8(): I can't find any description or references for that function. Should that be bus_space_write_8()?

3. Single vs. double bus_space_barrier(): It strikes me as odd that mpii_start() has a barrier call between the two bus_space_write_4() calls while mfii_start() hasn't. It also looks suspicious to me that both barrier calls use MPII_REQ_DESCR_POST_LOW.

Can someone please enlighten me?
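As a small aside on the LP64 path of mfii_start(): the sketch below isolates the expression that builds the 64-bit value from the two 32-bit words. The function name combine_descr is made up for illustration. It only shows that r[0] ends up in the low half and r[1] in the high half, i.e. the same two words the non-LP64 path writes separately; whether a single 8-byte write is equivalent on the bus is exactly the open question above and is not answered by this.

```c
#include <stdint.h>

/*
 * Combine two 32-bit descriptor words into one 64-bit value,
 * low word first, as done in the LP64 branch of mfii_start().
 * The cast must come before the shift to avoid shifting a
 * 32-bit value by 32 bits.
 */
static uint64_t
combine_descr(const uint32_t r[2])
{
	return ((uint64_t)r[1] << 32) | r[0];
}
```

For example, the words 0x11223344 (low) and 0xaabbccdd (high) combine to 0xaabbccdd11223344.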
mfii0: cmd timeout
This is NOT kern/55192. So I thought I had mastered my PERC H330; set up two virtual volumes each containing a one-disc RAID 0, set up GPTs, EFI boot volumes, built a RAIDframe RAID 1, disklabeled that, newfs'd the partitions, the only remaining step being unpacking the sets (and a few config files). Unfortunately, the machine hangs unpacking the sets, uttering

mfii0: cmd timeout

... and stalls. This is NOT kern/55192. It happens both with 8.2_STABLE with the patch from mfii.c 1.16 applied and with -current. The strange thing is that dd'ing to the raw partition seems to work and initializing the RAID parity also worked. But unpacking any sets stalls sooner or later. Any idea how to debug this?

I looked at OpenBSD's CVS (that's where mfii(4) was ported from) and couldn't find any changes that look related. I looked at FreeBSD, but they have mrsas(4), which seems to be an entirely different beast provided by AVAGO/LSI. Any idea why OpenBSD wrote a new driver? Any chance to port mrsas(4) from FreeBSD?
Re: Dell PERC H330: no disks, no volumes
> There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
> (includes the H330) says you can boot it in HBA mode. Not sure if
> that means that you can chose the firmware.

When I set the H330 to HBA mode, it still attaches as mfii0, the only difference to RAID mode being that the attachment in HBA mode says

scsibus0 at mfii0: 0 targets, 8 luns per target

instead of

scsibus0 at mfii0: 32 targets, 8 luns per target

in RAID mode. I tried to force it to use mpii (by adding the PCI Id in mpii.c and disabling mfii in the kernel config), but that didn't work either (I had the faint hope the controller would use the MPT-2 protocol in HBA mode despite showing the RAID PCI Ids).

What /does/ work is setting the controller to RAID mode and creating two volumes, each a one-element RAID-0. But that feels crazy.
Re: Dell PERC H330: no disks, no volumes
> Yes, in the controller setup you can create "Non-RAID Disks" (aka
> JBOD) or "Virtual Disks" (aka RAID volumes)

Where exactly are those Non-RAID Disks hidden?

> In theory you could use bioctl to create and manage volumes, but the
> driver doesn't implement it.

Ah, interesting. That was the way I was trying to use.
Re: Dell PERC H330: no disks, no volumes
> I don't remember the details (and it depends on the controller version), > but you need to have physical disks assigned to one (or more) RAID volume, > and then the RAID volume has to be exported as one (or more) virtual disks. But what if I want to pass the bare discs to NetBSD for a RAIDframe use?
Re: panic in sysmon_envsys_unregister
> I need to build a new install image (since I have no discs). I applied your fix to -8 and the panic disappeared. Thanks for the quick fix. Maybe it should be pulled up?
Re: Dell PERC H330: no disks, no volumes
Oh, I wasn't aware the H330 and HBA330 are different devices!

> There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
> (includes the H330) says you can boot it in HBA mode. Not sure if
> that means that you can chose the firmware.

Oh well. So the HBA330 is a PowerEdge RAID Controller that isn't a RAID controller? Thanks, Dell marketing!

> -> This is attaching a H330 (RAID version) and it gets the mfii driver.
> mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001

OK, the question remains why I don't see any discs in bioctl. On startup, the machine utters the following:

PowerEdge Expandable RAID Controller BIOS
Copyright(c) 2016 Avago Technologies
Press to Run Configuration Utility
HA -0 (Bus 1 Dev 0) PERC H330 Mini
FW package: 25.5.0.0001
0 Non-RAID Disk(s) found on the host adapter.
0 Non-RAID Disk(s) handled by BIOS
0 Virtual Disk(s) found on the host adapter.
0 Virtual Disk(s) handled by BIOS

Is this normal? The only place I see discs being recognized is in the BIOS setup's controller setup.
Re: panic in sysmon_envsys_unregister
> This should be fixed by mfii.c rev. 1.26. Please update it and retry. Thanks. I need to build a new install image (since I have no discs). The other question is why the register call fails. According to the BIOS setup, the controller has no sensors. Could that be the problem?
panic in sysmon_envsys_unregister
I get a panic on shutdown:

netbsd:sysmon_envsys_unregister+0x128: cmpq 0(%rdx),%r12

sysmon_envsys_unregister
mfii_detach
config_detach
config_detach_all
cpu_reboot
kern_reboot
sys_reboot
syscall

ds	4da0
es	0
fs	1
gs	c632
rdi	818f0510	sme_global_mtx
rsi
rbp	9008514e4da0
rbx	90003d04c000
rdx	0
rcx	d9e26f07b700
rax	0
r8	0
r9	0
r10	0
r11	0
r12	d9e26d5b1c40
r13	d9e26d7a5a00
r14	1
r15	81802c60	mfii_ca
rip	80a8c41e	sysmon_envsys_unregister+0x128
cs	8
rflags	10246
rsp	9008514e4d90
ss	10

This is -current from around yesterday. I guess the problem is related to

mfii0: autoconfiguration error: unable to register with sysmon (rv = 86)
mfii0: autoconfiguration error: unable to create sensors

So probably someone is trying to unregister something that was never registered.
Re: Dell PERC H330: no disks, no volumes
> These controller chips can run two different kinds of firmware. > The mfii driver is for talking to the RAID firmware ("IR mode") > while the mpii driver is for talking to the vanilla SAS firmware > ("IT mode"). Ah, and how do I know which mode my card runs? mpii(4) explicitly mentions the Dell PERC HBA330, but the "R" in PERC is for RAID. The controller can be switched to RAID or HBA mode in the BIOS setup, so does it run both firmware versions?
Re: Dell PERC H330: no disks, no volumes
It appears to me we have two drivers for the SAS3008: mfii(4) and mpii(4). Why?
Dell PERC H330: no disks, no volumes
So after I managed to boot my new PowerEdge R6515, the next challenge is that I have no discs. The machine is equipped with a PERC H330 mini, a SCSI backplane and two SATA SSDs. I do see the discs in the BIOS's RAID controller configuration menu. Autoconfiguration says:

mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001
mfii0: interrupting at ioapic4 pin 26
scsibus0 at mfii0: 0 targets, 8 luns per target
mfii0: unable to register with sysmon (rv = 86)
mfii0: unable to create sensors
[...]
mfii0: physical disk inserted id 32 enclosure 32
mfii0: physical disk inserted id 0 enclosure 32
mfii0: physical disk inserted id 1 enclosure 32

(both with 8.2 and current), but bioctl mfii0 show says

bioctl: no volumes available

and bioctl show disks shows a header and then

bioctl: BIOCDISK_NOVOL: Inappropriate ioctl for device

The BIOS configuration lets me set the controller mode from RAID to HBA and I can mark individual discs as "RAID eligible", but that doesn't seem to make a difference. I suppose it's something stupid. Anyone using a H330?
Re: debugging a kernel that doesn't start
> I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being > loaded (PXE or USB) but then the machine hangs hard. I've made a giant step forward: booting the -current install image from a USB key /via UEFI/ works. Maybe it's a bug in the server's CSM. Thanks for all the helpful comments anyway.
Re: debugging a kernel that doesn't start
> then you can bypass all the worries of using BIOS routines or whatnot
> and just poke the hardware directly.

Probably a stupid question: I can switch the machine to UEFI. Is it easier to debug things from there than from a BIOS boot?
Re: debugging a kernel that doesn't start
> That could be a strong clue or it could be unrelated. OK, just in case that might be another clue: If I want to interrupt the boot countdown, the first keystroke gets lost, I need to press a second time.
Re: debugging a kernel that doesn't start
> If you can setup a serial console, it may make things much easier. I do have a serial port on the machine. > I almost always use serial consoles on dev machines; I don't remember the > details but doing the equivalent of a putchar very early was possible. Is the BIOS still available or how does that work?
Re: debugging a kernel that doesn't start
> Have you tried booting a custom kernel with some drivers removed? No. I wouldn't know which drivers to remove. The problem is the Kernel utters absolutely nothing, so it must hang very, very early. > have you tried an uncompressed one? No, but I guess the official install image (on a USB key) is supposed to work as-is, no? > The simplest way to debug something is using a serial port, do you have > access to the one on this machine? Yes, there is one. It seems to sort-of mirror the on-screen messages up to the point the NetBSD boot runs. I tried consdev com0,9600 from the boot prompt but that hung the machine.
debugging a kernel that doesn't start
I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being loaded (PXE or USB) but then the machine hangs hard. What's the way to debug a kernel that hangs so early that you can't printf or drop into ddb? I guess that's a phenomenon quite common for a new port or changes to locore.s (or whatever that's called today), but it's completely new to me. I have virtually no clue about PeCee hardware. At the point the kernel is started, are BIOS routines still available?
Re: mfii(4) and Dell PERC
Thanks for your answers. > Some people reported that kern/56669 (and perhaps kern/55192) still exist > on some systems :-( Hm. > bioctl mfi(i)X show Ah, thanks. What do I do in case a drive fails? Will adding a hot spare automagically start a reconstruction? > If your system has other number, please let me know. I don't have such a system yet. I wanted to find out about NetBSD compatibility before buying one.
mfii(4) and Dell PERC
I'm unsure whether this is the right list; is port-amd64 more appropriate? Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode? mfii(4) says all configuration is done via the controller's BIOS. Does that mean I need to shut down in case a drive fails and I need to rebuild? Can I monitor the RAID state? Can I monitor the BBU battery health? Thanks in advance.
Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster
> the request count on the mclpl line is incrementing at a pretty fast rate Maybe you're running into the same problem as me (see the "mbuf cluster leak?" thread on tech-net). Try a kernel with MBUFTRACE. If that shows you (via netstat -mss) a large number of tx bufs on a particular vlan interface, try destroy-ing and re-creating that interface (and reloading ipfilter in case you're using it). For me, that stops the allocations from rising (for a while). I still don't know what triggers it, though.
Re: mfii hanging on boot
> I committed the change yesterday. I don't get what the #if defined(__LP64__) && 0 is for.
Re: killed: out of swap
> Perhaps my understanding is wrong No.
Re: killed: out of swap
> I assume my impression is completely wrong (today). OK, thanks for all the explanations and insights.
Re: killed: out of swap
> So what should the kernel do?

I don't know how things work under the hood today (I might have partially known in the times of sbrk()), but I would suppose that malloc() will ultimately result in some system call enlarging the heap/data segment/whatever. That system call could simply fail. I assume my impression is completely wrong (today). But then, how can a malloc() fail before the process gets killed?
killed: out of swap
I have a program that keeps malloc()ing (and scribbling a bit into the allocated memory) until malloc() fails. The intention is to put pressure on the VM system to find out how much pool cache memory it can reclaim. When I run that program (with swap space unconfigured), it doesn't terminate normally, but gets killed by the kernel with "out of swap". Unfortunately, other processes that happen to malloc() during that time may get killed, too. I don't quite get what the rationale for that is (or maybe I'm doing something stupidly wrong). If I malloc(), and that fails, the call should fail and not kill me, no? I'm surely missing something.
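For reference, the kind of program meant here looks roughly like the sketch below. It is not the original program: the function name, the chunk list and in particular the max_chunks cap are additions for illustration, so that the sketch can be run without actually exhausting memory. Without the cap, the loop only ends when malloc() returns NULL, or when the kernel kills the process first.

```c
#include <stdlib.h>
#include <string.h>

/* Allocated chunks are linked so they can be freed afterwards. */
struct chunk {
	struct chunk *next;
};

/*
 * Allocate chunks of chunk_size bytes and write to every byte until
 * malloc() fails or max_chunks have been allocated (the cap is an
 * assumption added for this sketch). Returns the number of chunks
 * that were successfully allocated; everything is freed again.
 */
static size_t
alloc_until_failure(size_t chunk_size, size_t max_chunks)
{
	struct chunk *head = NULL, *c;
	size_t n = 0;

	if (chunk_size < sizeof(struct chunk))
		chunk_size = sizeof(struct chunk);

	while (n < max_chunks && (c = malloc(chunk_size)) != NULL) {
		memset(c, 0xa5, chunk_size);	/* touch the memory */
		c->next = head;
		head = c;
		n++;
	}

	while (head != NULL) {
		c = head->next;
		free(head);
		head = c;
	}
	return n;
}
```

Writing to the memory matters: pages that are merely allocated but never touched need not be backed by anything, so the pressure on the VM system only arises once they are written.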
Re: membar_enter semantics
I know close to nothing about the subject in question, but maybe thoughts from a non-expert may be useful:

If there's a widely adopted terminology, one should probably stick to it even if the wording is counter-intuitive or misleading (but note that fact in the documentation). After all, Simple Groups are not easy at all and you need to know about Galois Theory to understand why Solvable Groups are named that way.

If the operations are called foo-before-bar, I would have to look up documentation on every instance to understand what the intended usage is. So for me, naming the operations after what they do, but having aliases for the intended usage, would make sense.

When I read frozz_enter() and frozz_exit() in code, my expectation is that every call to enter is paired with a call to exit _in the control flow_, i.e., there's no (other than panic) code path that goes through one of them, but not the other.

Would it make sense to call the intended-usage aliases something like push/pull, provide/consume or publish/whatever?
findroot: double match for boot device (was: Autoconfigured RAIDframe raid* numbering)
I do know that, but the warning seems to be new. It didn't appear before, but I had -A root (which now is force) before.
Re: Autoconfigured RAIDframe raid* numbering
> > Additinally, I got > > WARNING: findroot: double match for boot device (sd4, sd5) > > (where sd4a/5a are raid2's components) before > > boot device: raid2 > > root on raid2a dumps on raid2b > > What does that mean? > > Is this with -current newer than > >https://mail-index.netbsd.org/source-changes/2021/08/28/msg131862.html > > ? No, 8.2_STABLE.
Autoconfigured RAIDframe raid* numbering
If I have a number of autoconfigured RAIDframe sets on one machine, is there any guarantee which raid* number a set gets assigned? Is that numbering stable even if I remove one set (in the sense of physically un-plugging the drives) so the components will get different sd* numbers? I had raid0 (-A soft) for the system and raid1 (-A yes) for data. I added raid2 (also -A soft), transferred everything (volatile data in single user mode) and then booted (single user, to be safe) off one of the new components. To my surprise, the raid* numbering was exactly like before, i.e. my root was now on raid2a. Additionally, I got WARNING: findroot: double match for boot device (sd4, sd5) (where sd4a/5a are raid2's components) before boot device: raid2 root on raid2a dumps on raid2b What does that mean?
RAIDframe: reconstructing a temporarily lost drive (was: SATA rescan)
> drvctl -r -a ata_hl atabusX OK, that (after moving to a different slot) brought the drive back again. However, the raid had failed the missing drive (whether upon booting with the missing drive or shortly before the crash I can't tell). I had /dev/wd0a optimal plus component1 failed. I guess there's no way to teach RAIDframe the missing drive is back (short of rebooting, which is out of the question)? I added /dev/wd1a as a spare and failed component1, which worked, but maybe there's a more elegant way?
SATA rescan?
Is there a way (short of re-booting) to re-scan a SATA port for a disc absent (or dysfunctional) during the boot? I.e., something like scsictl rescan?
Re: panic in iic_search()
This is another place where I have local patches in my tree that haven't been integrated (see kern/55745). This is a regression in all "supported" versions of NetBSD (until -11 is released) rendering I2C inoperable on popular hardware.
8.x pmap fixes (was: boot -d)
> Here they are (for netbsd-8). I can boot -d with them [...] I just noticed that I still have these patches locally. Any chances to get them into -8? Should I file a PR?
Re: timeouts connecting to pgsql database
> What filesystem options are you using for wherever the database files > are located ? Back in the day I experienced that LFS was incredibly fast for a (MySQL) database. There were problems with the cleaner crashing, though.
Re: partial failures in write(2) (and read(2))
> I suppose libc could set a default handler for the new signal, and do some > extra work to set errno. Then the libc routine could better use a new syscall, no?
Re: X vs serial console?
> Is there any way I can test for it? Connect something to the HDMI outputs?
Re: X vs serial console?
Could it be the case that the X server expects some aspects of the video hardware to be initialized by the video console driver that are uninitialized in the serial console case? E.g., as you say outputs are shared between HDMI and VGA, the X video simply goes to the HDMI output?
Re: USB lockup (probably solved)
Looks like I'm making progress after all. > The change [nick] referred to was > > Revision 1.254.2.76 / (download) - annotate - [select for diffs], Mon > May 30 06:46:50 2016 UTC (4 years, 5 months ago) by skrll > Branch: nick-nhusb [...] > > Restructure the abort code for TD based transfers (ctrl, bulk, intr). > [...] I (hopefully) adapted that to -8 and it seems to work! I attach my adaption of nick's work plus some additional debugging code I added while analyzing the issue. So what next? File a PR? --- ohcivar.h.orig 2020-11-30 15:31:45.755906264 +0100 +++ ohcivar.h 2020-12-01 12:12:58.463657450 +0100 @@ -1,4 +1,4 @@ -/* $NetBSD: ohcivar.h,v 1.58.10.1 2018/08/25 11:29:52 martin Exp $ */ +/* $NetBSD: ohcivar.h,v 1.55.6.15 2016/05/30 06:46:50 skrll Exp $ */ /* * Copyright (c) 1998 The NetBSD Foundation, Inc. @@ -50,6 +50,7 @@ ohci_td_t td; struct ohci_soft_td *nexttd;/* mirrors nexttd in TD */ struct ohci_soft_td *dnext; /* next in done list */ + struct ohci_soft_td **held; /* where the ref to this std is held */ ohci_physaddr_t physaddr; usb_dma_t dma; int offs; @@ -71,6 +72,7 @@ ohci_itd_t itd; struct ohci_soft_itd *nextitd; /* mirrors nexttd in ITD */ struct ohci_soft_itd *dnext;/* next in done list */ + struct ohci_soft_itd **held;/* where the ref to this sitd is held */ ohci_physaddr_t physaddr; usb_dma_t dma; int offs; @@ -114,6 +116,8 @@ LIST_HEAD(, ohci_soft_td) sc_hash_tds[OHCI_HASH_SIZE]; LIST_HEAD(, ohci_soft_itd) sc_hash_itds[OHCI_HASH_SIZE]; + TAILQ_HEAD(, ohci_xfer) sc_abortingxfers; + int sc_noport; int sc_endian; @@ -128,6 +128,8 @@ int sc_flags; #define OHCIF_SUPERIO 0x0001 + kcondvar_t sc_softwake_cv; + ohci_soft_ed_t *sc_freeeds; ohci_soft_td_t *sc_freetds; ohci_soft_itd_t *sc_freeitds; @@ -148,6 +152,8 @@ struct ohci_xfer { struct usbd_xfer xfer; + uint32_t ox_abintrs; + TAILQ_ENTRY(ohci_xfer) ox_abnext; /* ctrl */ ohci_soft_td_t *ox_setup; ohci_soft_td_t *ox_stat; --- ohci.c 2020-11-23 18:30:07.0 +0100 +++ /tmp/ohci.c 2020-11-30 18:02:27.0 
+0100 @@ -1,4 +1,4 @@ -/* $NetBSD: ohci.c,v 1.273.6.6 2020/02/25 18:52:44 martin Exp $*/ +/* $NetBSD: ohci.c,v 1.254.2.76 2016/05/30 06:46:50 skrll Exp $*/ /* * Copyright (c) 1998, 2004, 2005, 2012 The NetBSD Foundation, Inc. @@ -41,7 +41,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: ohci.c,v 1.273.6.6 2020/02/25 18:52:44 martin Exp $"); +__KERNEL_RCSID(0, "$NetBSD: ohci.c,v 1.254.2.76 2016/05/30 06:46:50 skrll Exp $"); #ifdef _KERNEL_OPT #include "opt_usb.h" @@ -384,6 +384,8 @@ ohci_detach(struct ohci_softc *sc, int f softint_disestablish(sc->sc_rhsc_si); + cv_destroy(>sc_softwake_cv); + mutex_destroy(>sc_lock); mutex_destroy(>sc_intr_lock); @@ -492,6 +494,7 @@ ohci_alloc_std(ohci_softc_t *sc) memset(>td, 0, sizeof(ohci_td_t)); std->nexttd = NULL; std->xfer = NULL; + std->held = NULL; return std; } @@ -539,14 +542,17 @@ ohci_alloc_std_chain(ohci_softc_t *sc, s DPRINTFN(8, "xfer %#jx nstd %jd", (uintptr_t)xfer, nstd, 0, 0); - for (size_t j = 0; j < ox->ox_nstd;) { + for (size_t j = 0; j < ox->ox_nstd; j++) { ohci_soft_td_t *cur = ohci_alloc_std(sc); if (cur == NULL) goto nomem; - ox->ox_stds[j++] = cur; + ox->ox_stds[j] = cur; + cur->held = >ox_stds[j]; cur->xfer = xfer; cur->flags = 0; + DPRINTFN(10, "xfer=%#jx new std=%#jx held at %#jx", ox, cur, + cur->held, 0); } return 0; @@ -788,6 +794,7 @@ ohci_init(ohci_softc_t *sc) mutex_init(>sc_lock, MUTEX_DEFAULT, IPL_SOFTUSB); mutex_init(>sc_intr_lock, MUTEX_DEFAULT, IPL_USB); + cv_init(>sc_softwake_cv, "ohciab"); sc->sc_rhsc_si = softint_establish(SOFTINT_USB | SOFTINT_MPSAFE, ohci_rhsc_softint, sc); @@ -797,6 +804,8 @@ ohci_init(ohci_softc_t *sc) for (i = 0; i < OHCI_HASH_SIZE; i++) LIST_INIT(>sc_hash_itds[i]); + TAILQ_INIT(>sc_abortingxfers); + sc->sc_xferpool = pool_cache_init(sizeof(struct ohci_xfer), 0, 0, 0, "ohcixfer", NULL, IPL_USB, NULL, NULL, NULL); @@ -1334,12 +1343,26 @@ ohci_intr1(ohci_softc_t *sc) */ softint_schedule(sc->sc_rhsc_si); } + if (eintrs & OHCI_SF) { + struct ohci_xfer *ox, *tmp; + 
TAILQ_FOREACH_SAFE(ox, >sc_abortingxfers, ox_abnext, tmp) { + DPRINTFN(10, "SF %#jx xfer %#jx", (uintptr_t)sc, (uintptr_t)ox, 0, 0); + ox->ox_abintrs &=
Re: USB lockup
I looked into the usbhist now. > is something being aborted? Yes. > I guess the E20 TD got written out with incorrect next_td, or some other > error condition caused the mixup. I think the only sane explanation absent a controller bug is that, at the time the HC finished E20, HcDoneHead was FA0 (41088FA0, really; note there are other TDs xxxFA0, which I ignore here). Since the "real" FA0 (which is the "tail" part of the transfer just initiated) comes after E20, the HC must have finished a "different" FA0. As I added checks for a second TD being queued with the same physaddr that didn't fire, the HCD must have previously dequeued the FA0 TD. So my guess is that the HC and the HCD, before the E20->EE0->FA0->F40->0 chain is queued, disagree on whether FA0 is still active (i.e. under the HC's control) or not (i.e. under the HCD's control). > The change I referred to was Would you expect that change (maybe after some munging) to work with -8? > In PR/22646 some TDs can be on the done queue when the abort start and, If that "done queue" is in the HcDoneHead sense, not the HccaDoneHead sense, then, I guess, it would exactly fit what I think is going on. > if this is the case, they need to processed after the WDH interrupt. > Instead of waiting for WDH we release TDs that have been touched by the > HC and replace them with new ones. Once WDH happens the floating TDs > will be returned to the free list. I still need to understand how exactly that works. Could you give me a hint what these "referenced by" pointers are needed for? My first idea would be, when aborting, not to actually de-queue the std's, but just mark them as aborted (in the std outside the td, of course) and, when encountering such a std in the softint, just de-queue and otherwise skip them. I'm afraid I'm missing something and that won't work.
Re: USB lockup
> Really hard to help without seeing the full ohcidebug usbhist log. I replaced the panic with abreak out of the done loop. Find attached my diff plus the usbhist from where I first started the offending command (which locks up the second time called). I didn't look into the log myself yet. Index: ohcivar.h === RCS file: /cvsroot/src/sys/dev/usb/ohcivar.h,v retrieving revision 1.58.10.1 diff -u -r1.58.10.1 ohcivar.h --- ohcivar.h 25 Aug 2018 11:29:52 - 1.58.10.1 +++ ohcivar.h 27 Nov 2020 12:08:49 - @@ -59,6 +59,9 @@ uint16_t flags; #define OHCI_CALL_DONE 0x0001 #define OHCI_ADD_LEN 0x0002 +#ifdef OHCI_DEBUG + int beenthere; /* loop detection */ +#endif } ohci_soft_td_t; #define OHCI_STD_SIZE ((sizeof(struct ohci_soft_td) + OHCI_TD_ALIGN - 1) / OHCI_TD_ALIGN * OHCI_TD_ALIGN) #define OHCI_STD_CHUNK 128 @@ -75,6 +78,9 @@ struct usbd_xfer *xfer; uint16_t flags; bool isdone;/* used only when DIAGNOSTIC is defined */ +#ifdef OHCI_DEBUG + int beenthere; /* loop detection */ +#endif } ohci_soft_itd_t; #define OHCI_SITD_SIZE ((sizeof(struct ohci_soft_itd) + OHCI_ITD_ALIGN - 1) / OHCI_ITD_ALIGN * OHCI_ITD_ALIGN) #define OHCI_SITD_CHUNK 64 Index: ohci.c === RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v retrieving revision 1.273.6.6 diff -u -r1.273.6.6 ohci.c --- ohci.c 25 Feb 2020 18:52:44 - 1.273.6.6 +++ ohci.c 27 Nov 2020 12:09:01 - @@ -230,6 +230,8 @@ Static voidohci_dump_ed(ohci_softc_t *, ohci_soft_ed_t *); Static voidohci_dump_itd(ohci_softc_t *, ohci_soft_itd_t *); Static voidohci_dump_itds(ohci_softc_t *, ohci_soft_itd_t *); + +static int ohci_beenthere = 0; /* td list loop detection */ #endif #define OBARR(sc) bus_space_barrier((sc)->iot, (sc)->ioh, 0, (sc)->sc_size, \ @@ -693,6 +695,13 @@ DPRINTFN(2, "add 0 xfer", 0, 0, 0, 0); } +#ifdef OHCI_DEBUG + DPRINTFN(10, "--- dump start ---", 0, 0, 0, 0); + if (ohcidebug >= 10) + ohci_dump_td(sc, sp); + DPRINTFN(10, "--- dump end ---", 0, 0, 0, 0); +#endif + /* Last TD gets usb_syncmem'ed by caller */ *ep = cur; } @@ -1410,9 
+1419,25 @@ OWRITE4(sc, OHCI_INTERRUPT_ENABLE, OHCI_WDH); /* Reverse the done list. */ +#ifdef OHCI_DEBUG + ohci_beenthere++; +#endif for (sdone = NULL, sidone = NULL; done != 0; ) { + DPRINTFN(10, "done=%#jx", (uintptr_t)done, 0, 0, 0); std = ohci_hash_find_td(sc, done); if (std != NULL) { +#ifdef OHCI_DEBUG + if (ohcidebug >= 10) + ohci_dump_td(sc, std); + if (std->beenthere == ohci_beenthere) { + DPRINTFN(1, "circular sdone: %#jx->%#jx", (uintptr_t)sdone, (uintptr_t)std, 0, 0); +#if 0 + panic("circular sdone"); +#endif + break; + } + std->beenthere = ohci_beenthere; +#endif usb_syncmem(>dma, std->offs, sizeof(std->td), BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD); std->dnext = sdone; @@ -1423,6 +1448,20 @@ } sitd = ohci_hash_find_itd(sc, done); if (sitd != NULL) { +#ifdef OHCI_DEBUG +/* XXX no ohci_dump_itd() yet + if (ohcidebug >= 10) + ohci_dump_itd(sc, sitd); +*/ + if (sitd->beenthere == ohci_beenthere) { + DPRINTFN(1, "circular sidone: %#jx->%#jx", (uintptr_t)sidone, (uintptr_t)sitd, 0, 0); +#if 0 + panic("circular sidone"); +#endif + break; + } + sitd->beenthere = ohci_beenthere; +#endif usb_syncmem(>dma, sitd->offs, sizeof(sitd->itd), BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD); sitd->dnext = sidone; @@ -1445,6 +1484,7 @@ for (std = sdone; std; std = std->dnext) ohci_dump_td(sc, std); } +/* XXX dump sidone list */ #endif DPRINTFN(10, "--- TD dump end ---", 0, 0, 0, 0); @@ -1838,6 +1878,15 @@ KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(>sc_lock)); +#ifdef OHCI_DEBUG + for (ohci_soft_td_t *std2 = LIST_FIRST(>sc_hash_tds[h]); +std2 != NULL; +std2 = LIST_NEXT(std2, hnext)) { + if (std2->physaddr == std->physaddr) + panic("OHCI: duplicate physaddr"); + } +#endif + LIST_INSERT_HEAD(>sc_hash_tds[h], std, hnext); } @@ -1945,7
Re: USB lockup
Thanks a lot for looking into this! > Really hard to help without seeing the full ohcidebug usbhist log. The problem is that file system (or block I/O) seems to lock up so the usbhist is hard to get out of the machine other than by camera. I guess dump-ing will take ages to complete (16G RAM). I could try to replace my panic with simply writing something to usbhist and aborting the loop. > I guess the E20 TD got written out with incorrect next_td, or some other > error condition caused the mixup. You mean nexttd or td_nexttd? As far as I can tell, neither field is touched by the driver without being ohci_dump_td()'d afterwards, and, as I wrote, minus the loopback td_nexttd, everything is exactly as one would expect. > The change I referred to was I'll have a look into that one tomorrow. > is something being aborted? May well be. I haven't checked yet. My feeling is that this is either a controller error or some sort of DMA/cache/barrier/whatever race during the HccaDoneHead manipulation. But I'm steadily confused by the writing-a-1-clears-the-bit or writing-a-1-sets-the-bit semantics of the registers and know nothing about all these cache/barrier/re-ordering issues other than that they may exist. The one nice thing is that the lock-up is easily and 100% reproducible. If only these PeCee boxes wouldn't take ages to reboot.
Re: USB lockup
> Add a check to ohci_softintr to see if the list goes circular and enter > ddb / dump usbhist when it does... I already did add a panic and it fired. I'm still trying to find out how that happens. What I'm seeing (dumped by device_ctrl_start()) is a chain of four TDs (named here after their addresses' three least significant nybbles): E20->EE0->FA0->F40->0 which are linked in that sense by both nexttd and td.td_nexttd. Then, in ohci_softintr(), the done queue is (as linked by td.nexttd): FA0->EE0->E20->FA0->... and, as expected, the nexttd links are as before. Absent the E20->FA0 link, that's exactly what one would expect if the first three TDs have been handled (the done list is most recently done first); the big question is where that additional link comes from. I've added code to ohci_hash_add_td() to catch a TD being added with a physical address already present in the hash list, but that didn't fire.
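For illustration, the kind of check described above can be sketched outside the kernel as follows. This is not the driver code: struct soft_td, hw_next (standing in for the link the controller writes), pass_counter and reverse_done_list() are simplified names invented for this example. The idea is to stamp every TD with a per-pass counter while reversing the done list and to stop as soon as a TD carrying the current stamp is seen again.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for a software TD; not the real ohci_soft_td. */
struct soft_td {
	struct soft_td *hw_next;	/* link as written by the controller */
	struct soft_td *dnext;		/* next in the reversed done list */
	unsigned beenthere;		/* pass in which this TD was last seen */
};

static unsigned pass_counter;		/* bumped once per reversal pass */

/*
 * Reverse the done list starting at done, storing the head of the
 * reversed list in *reversed.  Returns true if the list turned out to be
 * circular, in which case the reversal stops at the first repeated TD.
 */
static bool
reverse_done_list(struct soft_td *done, struct soft_td **reversed)
{
	struct soft_td *sdone = NULL;
	bool circular = false;

	pass_counter++;
	while (done != NULL) {
		if (done->beenthere == pass_counter) {
			circular = true;	/* seen in this pass already */
			break;
		}
		done->beenthere = pass_counter;

		struct soft_td *next = done->hw_next;
		done->dnext = sdone;
		sdone = done;
		done = next;
	}
	*reversed = sdone;
	return circular;
}
```

With the done queue FA0->EE0->E20->FA0 from above, the loop stamps FA0, EE0 and E20, then meets FA0 again with the current stamp and bails out instead of spinning forever.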
Re: USB lockup
I guess there's something different going on. Unless I'm mistaken, the list is circular in the td_nexttd sense, but not in the nexttd sense.
Re: USB lockup
> so the td list must have gone circular, no? It's indeed circular (in the td_nexttd sense), as additionally inserted debugging output revealed. It also happens in uniprocessor (boot -1) mode.
Re: USB lockup
> So, during the partial lockup, I see > ohci_softintr#63@0: add TD 0x80013ec2de20 > ohci_softintr#63@0: add TD 0x80013ec2dea0 that's ohci_softintr#63@0: add TD 0x80013ec2dfa0 > ohci_softintr#63@0: add TD 0x80013ec2dee0 So I think it's endlessly looping in the "Reverse the done list." loop in ohci_softintr(), so the td list must have gone circular, no?
Re: USB lockup
> The ddb backtrace usually is > bus_space_read_4() > bintime() > ohci_softintr() > usb_soft_intr() > softint_dispatch() > > The system call causing the lock-up is a USB_DEVICEINFO ioctl on /dev/usb0 > with udi_addr=2, which corresponds to ugen0. I tried a -current kernel from nyftp today, and it locks up the same way.
USB lockup (was: ktrace-ing a command that locks up the machine)
> Hmmm, this was usb, right? Yes. > Maybe turn on options USBHIST (and/or EHCIHIST, OHCIHIST, UHCIHIST, > XHCIHIST). None of these seem to be described in options(4) man > page, but you can dump the debug data using ``vmstat -u histname''. > And get a list of the actual histname's with ``vmstat -l'' Oh, thanks, I didn't know about that. I don't even need any further options. So, during the partial lockup, I see ohci_softintr#63@0: add TD 0x80013ec2de20 ohci_softintr#63@0: add TD 0x80013ec2dea0 ohci_softintr#63@0: add TD 0x80013ec2dee0 at .01 intervals. The ddb backtrace usually is bus_space_read_4() bintime() ohci_softintr() usb_soft_intr() softint_dispatch() The system call causing the lock-up is a USB_DEVICEINFO ioctl on /dev/usb0 with udi_addr=2, which corresponds to ugen0. Any hints how to debug this further? I tried a DIAGNOSTIC+DEBUG+LOCKDEBUG kernel, but it didn't complain. The strange thing is that not only USB locks up, but any file system operation seems to stall, too. No, these are not USB discs.
USB debugging (was: ktrace-ing a command that locks up the machine)
On Wed, Nov 18, 2020 at 09:05:47AM -0500, Greg Troxel wrote: > another suggestion is to enable USB debugging in the kernel and use a serial > console (or even just framebuffer) to see the last message before crash. I set options {USB,OHCI,EHCI}_DEBUG and sysctl -w hw.{usb,ohci,ehci}.debug=20 and get zero output. What the hell am I missing?
Re: ktrace-ing a command that locks up the machine
> ktrace over NFS. That would be -- eh -- somewhat involved. I doubt it will work given that writing to an FS mounted -o sync gives an empty file.
Re: ktrace-ing a command that locks up the machine
> Suggestion: put the ktrace file on a filesystem mounted -o sync. That (with ktrace -s) gave me an empty file.
ktrace-ing a command that locks up the machine
So after fixing kern/53311 and kern/55745 on -8, I'm back to one nesting level down my original task. I have a command that (when run the second time and with certain USB devices connected) will irreversibly (to me) partly (no console switching) lock up the machine. I need to enter DDB and reboot. I would like to ktrace/ktruss the command to see which USB transfer exactly is the one that hangs. However, even with ktrace -s, there is no trace file after the re-boot (on FFS/WAPBL); I can't tell whether it exists before the reboot. Using ktruss, the last trace output to the console is way behind the execution. I would like to avoid GDB single stepping through libusb. Any ideas? The process is somewhat tedious because these wonderful great-grandson-of-IBM-PCs take some 85 seconds from the reboot command to the primary boot.
USB lock-ups
Hello again. So after backporting the -current pmap fixes to -8 in order to be able to boot -d in order to be able to examine I2C panics and after fixing them I have an operational -8 machine again only to find that the USB problems that made me update are still there. The simplest libusb program (I tried to get myself acquainted with libusb) will lock up the machine if run the second time. The only trace I have is (once) ohci0: WARNING: addr 0x41088dc0 not found. The machine becomes (at least) unresponsive to virtual console switches, most times, entering DDB works; backtrace is x86_memfence() usb_soft_intr() softint_dispatch() or bus_dmamap_sync() ohci_softintr() usb_soft_intr() softint_dispatch() When I looked, I had most processes in tstile. Any hints? Another broken pull-up?

#include <stdio.h>
#include <err.h>
#include <usb.h>

struct usb_bus *bus;
struct usb_device *dev;
usb_dev_handle *udev;

int
main(int argc, char *argv[])
{
	puts("init");
	usb_init();
	puts("find_busses");
	usb_find_busses();
	puts("find_devices");
	usb_find_devices();
	for (bus = usb_busses; bus; bus = bus->next) {
		puts(bus->dirname);
		for (dev = bus->devices; dev; dev = dev->next) {
			printf("%d: %s\n", dev->devnum, dev->filename);
			udev = usb_open(dev);
			if (!udev) {
				warnx("usb_open: %s", usb_strerror());
				continue;
			}
			printf("%0x %0x %0x\n", dev->descriptor.idVendor,
			    dev->descriptor.idProduct,
			    dev->descriptor.bcdDevice);
#if 0
			if (usb_claim_interface(udev, 0) < 0) {
				errx(1, "usb_claim: %s", usb_strerror());
			}
#endif
			usb_close(udev);
		}
	}
	return 0;
}
Re: boot -d
> So there seems to be something seriously amiss with I2C on -8 (and -9). After fixing that, it boots again (with the adapted pmap changes). Nevertheless, someone should review them, of course.
Re: boot -d
> Why not take spdmem out of your kernel config for now and test the > pmap patches ? It then panics in dbcool_chip_ident(). So there seems to be something seriously amiss with I2C on -8 (and -9).
Re: boot -d
> Why not take spdmem out of your kernel config for now and test the > pmap patches ? Yes, could do that next week (ENOTIME currently). Anything special to test? I've no idea what the code does resp. when it gets used.
Re: boot -d
> I‘ve backported the fixes, will post them later. Here they are (for netbsd-8). I can boot -d with them, but because of the spdmem panics, I can't tell whether the machine would run with them. Someone(TM) should review them and request a pullup, please. Not sure what to do with the __KERNEL_RCSID strings. Index: sys/arch/x86/include/pmap.h === RCS file: /cvsroot/src/sys/arch/x86/include/pmap.h,v retrieving revision 1.64.6.2 diff -u -r1.64.6.2 pmap.h --- sys/arch/x86/include/pmap.h 22 Mar 2018 16:59:04 - 1.64.6.2 +++ sys/arch/x86/include/pmap.h 13 Nov 2020 14:59:01 - @@ -1,4 +1,4 @@ -/* $NetBSD: pmap.h,v 1.64.6.2 2018/03/22 16:59:04 martin Exp $ */ +/* $NetBSD: pmap.h,v 1.100 2019/03/10 16:30:01 maxv Exp $ */ /* * Copyright (c) 1997 Charles D. Cranor and Washington University. @@ -291,7 +291,8 @@ pd_entry_t * const **); void pmap_unmap_ptes(struct pmap *, struct pmap *); -intpmap_pdes_invalid(vaddr_t, pd_entry_t * const *, pd_entry_t *); +bool pmap_pdes_valid(vaddr_t, pd_entry_t * const *, pd_entry_t *, + int *lastlvl); u_int x86_mmap_flags(paddr_t); @@ -342,12 +343,6 @@ * inline functions */ -__inline static bool __unused -pmap_pdes_valid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde) -{ - return pmap_pdes_invalid(va, pdes, lastpde) == 0; -} - /* * pmap_update_pg: flush one page from the TLB (or flush the whole thing * if hardware doesn't support one-page flushing) Index: sys/arch/x86/x86/pmap.c === RCS file: /cvsroot/src/sys/arch/x86/x86/pmap.c,v retrieving revision 1.245.6.6 diff -u -r1.245.6.6 pmap.c --- sys/arch/x86/x86/pmap.c 22 Mar 2018 16:59:04 - 1.245.6.6 +++ sys/arch/x86/x86/pmap.c 13 Nov 2020 15:37:49 - @@ -28,6 +28,7 @@ * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE * POSSIBILITY OF SUCH DAMAGE. */ +/* $NetBSD: pmap.c,v 1.330 2019/03/10 16:30:01 maxv Exp $ */ /* * Copyright (c) 2007 Manuel Bouyer. 
@@ -171,7 +172,7 @@ */ #include -__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.245.6.6 2018/03/22 16:59:04 martin Exp $"); +__KERNEL_RCSID(0, "$NetBSD: pmap.c,v XXX $"); #include "opt_user_ldt.h" #include "opt_lockdebug.h" @@ -3059,22 +3060,28 @@ * some misc. functions */ -int -pmap_pdes_invalid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde) +bool +pmap_pdes_valid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde, +int *lastlvl) { - int i; unsigned long index; pd_entry_t pde; + int i; for (i = PTP_LEVELS; i > 1; i--) { index = pl_i(va, i); pde = pdes[i - 2][index]; - if ((pde & PG_V) == 0) - return i; + if ((pde & PG_V) == 0) { + *lastlvl = i; + return false; + } + if (pde & PG_PS) + break; } if (lastpde != NULL) *lastpde = pde; - return 0; + *lastlvl = i; + return true; } /* @@ -3092,6 +3099,7 @@ paddr_t pa; lwp_t *l; bool hard, rv; + int lvl; #ifdef __HAVE_DIRECT_MAP if (va >= PMAP_DIRECT_BASE && va < PMAP_DIRECT_END) { @@ -3108,8 +3116,8 @@ kpreempt_disable(); ci = l->l_cpu; - if (__predict_true(!ci->ci_want_pmapload && ci->ci_pmap == pmap) || - pmap == pmap_kernel()) { + if (pmap == pmap_kernel() || + __predict_true(!ci->ci_want_pmapload && ci->ci_pmap == pmap)) { /* * no need to lock, because it's pmap_kernel() or our * own pmap and is active. if a user pmap, the caller @@ -3126,14 +3134,17 @@ hard = true; pmap_map_ptes(pmap, , , ); } - if (pmap_pdes_valid(va, pdes, )) { - pte = ptes[pl1_i(va)]; - if (pde & PG_PS) { + if (pmap_pdes_valid(va, pdes, , )) { + if (lvl == 2) { pa = (pde & PG_LGFRAME) | (va & (NBPD_L2 - 1)); rv = true; - } else if (__predict_true((pte & PG_V) != 0)) { - pa = pmap_pte2pa(pte) | (va & (NBPD_L1 - 1)); - rv = true; + } else { + KASSERT(lvl == 1); + pte = ptes[pl1_i(va)]; + if (__predict_true((pte & PG_V) != 0)) { + pa = pmap_pte2pa(pte) | (va & (NBPD_L1 - 1)); + rv = true; + } } } if (__predict_false(hard)) { @@ -3552,6 +3563,7 @@ vaddr_t blkendva, va = sva; struct vm_page *ptp; struct
Re: boot -d
> On 12.11.2020 at 20:41, Andreas Gustafsson wrote: > > It's probably easier to revert src/sys/arch/x86/x86/db_memrw.c 1.6 I've backported the fixes, will post them later.
Re: boot -d
> It's probably easier to revert src/sys/arch/x86/x86/db_memrw.c 1.6. As far as I understood (which may well be wrong) the fixes fixed a real problem that only surfaced on that change by chance and might have other consequences?
Re: boot -d
> This looks like PR 53311. Ah, thanks! > The commit where that problem started (src/sys/arch/x86/x86/db_memrw.c 1.6) > was pulled up to the -8 branch, and apparently the commits that fixed it > were not. I currently seem to attract pull-ups that mess up things. I had a look at the relevant commits src/sys/arch/x86/include/pmap.h 1.100 src/sys/arch/x86/x86/pmap.c 1.330 src/sys/arch/xen/x86/xen_pmap.c 1.31 but unfortunately am unable to back-port the second one to -8. I know nothing about pmap, and the -current version uses PTE_P and PTE_PS while the -8 version uses PG_V/nothing. Could someone in the know port these fixes to -8, please? Or guide me?
boot -d
Hello again. In about the third nesting level of what I wanted to do in the first place, I tried "boot netbsd -d" in the secondary boot. It loads the kernel, then complains about the ffs module missing (I don't use modules and don't have an 8.2 directory on that machine), clears the screen, displays "fatal breakpoint in supervisor mode" and re-boots. The problem is that the interesting messages are displayed only for a fraction of a second. In one out of three tries, I was able to catch them (partly) using the "slomo" (i.e. high speed) video recording mode of my iPhone, but of the line after the "fatal breakpoint" message, only the top half is displayed before it is cleared, so it's very hard to read the interesting parts. Any hints?
panic in iic_search()
I have an AMD64 server running 8/amd64, which ran happily (other than USB issues, which is another story) with 8.1_STABLE from September 2019. I updated to netbsd-8 from yesterday (so that's 8.2_STABLE) and a newly compiled kernel crashes in iic_search(). The last line printed before that is: iic0 at piixpm0: I2C bus With the working kernel, the next line is: spdmem0 at iic0 addr 0x50: NT4GC72B4NA1NL-CG Obviously, I have the spdmem* at iic? addr 0xxx lines uncommented in my config. The panic is: uvm_fault(0x90afec40, 0x0, 4) -> e fatal page fault in supervisor mode trap type 6 code 0x10 rip 0 cs 0x8 rflags 0x10246 cr2 0 ilevel 0x8 rsp 0x80d4f485 curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0 kernel: page fault trap, code=0 Stopped in pid 0.1 (system) at 0:uvm_fault(0x80afec40, 0x7fbfc000, 1) -> e fatal page fault in supervisor mode trap type 6 code 0 rip 0x80d4f070 cs 0x8 rflags 0x10216 cr2 0x7fbfc000 ilevel 0x8 rsp 0x80d4f070 curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0 kernel: page fault trap, code=0 Stopped in pid 0.1 (system) at netbsd:db_disasm+0x65: testb $0x1,0(%rdx,%rcx,8) Backtrace: db_disasm() at netbsd:db_disasm+0x65 db_trap() at netbsd:db_trap+0xf4 kpd_trap() at netbsd:kpd_trap+0xe2 trap() at netbsd:trap+0x5d6 -- trap (number 6) --- ?() at 0 iic_search() at netbsd:iic_search+0x92 mapply() at netbsd:mapply+0x39 config_search_loc() at netbsd:config_search_loc+0xaf iic_attach() at netbsd:iic_attach+0x4cd config_attach_loc() at netbsd:config_attach_loc+0x19c config_found_sm_loc() at netbsd:config_found_sm_loc+0x48 piixpm_rescan() at netbsd:piixpm_rescan+0xed piixpm_attach() at netbsd:piixpm_attach+0x1e7 config_attach_loc() at netbsd:config_attach_loc+0x19c config_found_sm_loc() at netbsd:config_found_sm_loc+0x48 pci_probe_device() at netbsd:pci_probe_device+0x57e pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x198 pciattach() at netbsd:pciattach+0x198 config_attach_loc() at netbsd:config_attach_loc+0x19c config_found_sm_loc() at 
netbsd:config_found_sm_loc+0x48 mp_pci_scan() at netbsd:mp_pci_scan+0x9c mainbus_attach() at netbsd:mainbus_attach+0x2ce config_attach_loc() at netbsd:config_attach_loc+0x19c cpu_configure() at netbsd:cpu_configure+0x2b main() at netbsd:main+0x2a8 Where to go from here?
Re: RAIDframe: what if a disc fails during copyback
> it locks out all other non-copyback IO in order to finish the job! Oops! > Locking out all other IO is very poor... but if it's a small enough RAID set > you might be able to get away with the downtime for the copyback... Certainly not. > You shouldn't need to reboot for this... the 'failing spared disk' and > 'reconstruct to previous second disk' should work fine without reboot. I still don't get this. What I have is: Components: /dev/sd5a: spared /dev/sd6a: optimal Spares: /dev/sd7a: used_spare So what am I supposed to do from here?
Re: RAIDframe: what if a disc fails during copyback
Thanks for the detailed answer. > it's still there, and it does work, That's reassuring to know. > but it's not at all performant or system-friendly. Just how bad is it? > If you want the components labelled nicely, give the system a reboot Re-booting our file server is something I like to avoid. > and behaves very poorly. Depending on how poorly, I could probably live with it (the RAID in question is the small system one, not the large user data one). > In your case, what I'd do is just fail the spare, and initiate a reconstruct > to the original failed component. (You still have the data on the spare if > something goes back with the original good component.) Hm, I guess I would need to re-boot and intervene manually in that case. Just using the slow copyback looks preferable if it doesn't take more than a day. Probably I need to test this on another machine before. I guess there's no way to initiate a reconstruction to a spare and failing the specified component only /after/ the reconstruction has completed, not before?
Re: RAIDframe: what if a disc fails during copyback
There still seems to be confusion on what I did. Let A and B be the two original components, C a spare (in the cupboard) and B' be B with the new firmware. I start with A and B as the two components of a RAID-1. Now B fails. I have a degraded RAID with A alone. I plug in C, scsictl scsibus0 scan all all it, add it as a hot spare (raidctl -a C) and initiate a reconstruction (raidctl -F B). Now I'm redundant again with A and C. Since I didn't re-boot, RAIDframe knows that B has failed and C is a used spare. I now actually un-plug B, plug it into another machine, do some testing (verifying that it may reset on writes), install new firmware, do further testing (verifying it now doesn't reset on writes) and am about to re-plug it into the original server (which won't notice it ever disappeared or that B has turned into B'---as far as this question is concerned, I could have done all this in the original server). What I'm now intending to do is to raidctl -B (with A, B' and C installed, of course). After that, I intend to raidctl -r C, then scsictl scsibus0 detach C and finally un-plug C and put it back into the cupboard again. My question was about 1. B', 2. C or 3. A failing during the copyback. > there was a crop of bad Seagate 500GB disks for a while and they had > a tendency to fail in mass at the same time. My working hypothesis for some five years has been that all Seagate discs are bad and bound to fail. We had a series of SATA 250G (the example above is about SAS 146K) drives that failed the same way (dozens of them), got most of them replaced on warranty and had the replacements failing the same way again.
Re: RAIDframe: what if a disc fails during copyback
> So you have drives A, B, and C. A and B were live. Let's say B is the
> one that failed. You reconstructed onto C and have been running with A
> and C.
Yes.
> Now you have a new B (which in this case is the same hardware with new
> firmware) and want to put it back into service. I'm not sure whether
> you want to put it into service in place of A or in place of C. I'm
> going to assume C.
Yes.
> So, you'd pull C, replace it with B
No. I don't pull C. I re-add B (I have lots of empty slots).
> and initiate a reconstruct
No, a copyback (raidctl -B).
> which for RAID 1 means copying from A to B.
I don't know. I would expect it to copy from C to B.
> > 1. The replaced component fails
>
> Is this B? Or C? Because it sounds to me as though C would be out of
> service at this point.
I mean B.
> > 2. The spare fails
>
> Which is "the spare"?
C.
> Are you running with a hot spare?
Yes. I added C as a hot spare when B failed and started a reconstruction.
> I think a hot spare failing means nothing until/unless RAIDframe
> tries to fall back on it.
Yes.
> > 3. The other, non-replaced component fails?
>
> That would be A?
Yes.
> Based on the assumption that RAIDframe RAID 1 cannot handle more than
> two drives (always true as far as I know, and the 9.0 raidctl(8) manpage
> says it's still true as of 9.0)
The RAID-1 I'm speaking of has only two components, but I did operate a RAIDframe RAID-1 on three components with 5.1 or something.
RAIDframe: what if a disc fails during copyback
(I could probably direct this question to oster@ instead of tech-kern@)
In a RAIDframe RAID-1, a disc failed and I reconstructed on a spare. Now I want to replace the failed component (actually by the same disc, which needed a firmware update) and want to copy back to it. How will RAIDframe behave if, during the copyback:
1. The replaced component fails
2. The spare fails
3. The other, non-replaced component fails?
Specifically: Is there any scenario (other than more than one disc failing) that will put the RAID into a non-redundant state? I guess 3. may?
Re: fsck updating but not fixing filesystem
> I have a reasonably large ffs filesystem (7.4GB, 35,459,874 files)
I guess you mean 7.4TB?
I remember (shudder) something similar, where the file server would panic (bad dir), fsck would fix some dirs (missing . or ..), the file server would panic ... rinse and repeat. Slightly short of me performing dump-newfs-restore, the problem disappeared. I never found out what was wrong.
I think the general consensus is that ffs can be inconsistent in ways fsck is unable to detect.
Re: SIGCHLD and sigaction()
> I don't understand what problem queued SIGCHLD was invented to address.
My impression is that it allows you to get notified of state changes of your child processes. If one signal could announce several state changes, how would you know what these state changes are?
SIGCHLD and sigaction()
Another question in the context of SIGCHLD: When I install a SIGCHLD handler via sigaction() using SA_SIGINFO, is it guaranteed that my handler is called (at least) once per death-of-a-child? There is a sentence in SUS
    If SA_SIGINFO is set in sa_flags, then subsequent occurrences of sig generated by sigqueue() or as a result of any signal-generating function that supports the specification of an application-defined value (when sig is already pending) shall be queued in FIFO order until delivered or accepted;
that may cover this, but I don't understand it.
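Whatever the answer on queueing turns out to be, a handler does not have to depend on being called once per child. A minimal sketch, assuming only POSIX sigaction(), waitpid() and sigsuspend(); the names on_sigchld and reap_children are made up for illustration. The handler drains every exited child with waitpid(-1, ..., WNOHANG), so all children get collected even if several deaths collapse into a single delivery.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX.1-2008 interfaces */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_calls;	/* how often the handler ran */
static volatile sig_atomic_t children_reaped;	/* how many children it collected */

/* SA_SIGINFO handler: collect every child that has exited so far. */
static void
on_sigchld(int sig, siginfo_t *info, void *ctx)
{
	int saved_errno = errno;	/* waitpid() may clobber errno */

	(void)sig; (void)info; (void)ctx;
	handler_calls++;
	while (waitpid(-1, NULL, WNOHANG) > 0)
		children_reaped++;
	errno = saved_errno;
}

/*
 * Hypothetical helper: fork n children that exit immediately and wait
 * until the handler has collected all of them.  Returns the number of
 * children reaped (or -1 on error) and stores the number of handler
 * invocations in *calls_out.
 */
int
reap_children(int n, int *calls_out)
{
	struct sigaction sa, old_sa;
	sigset_t block, old_mask, wait_mask;
	int i, failed = 0;

	handler_calls = 0;
	children_reaped = 0;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigchld;
	sa.sa_flags = SA_SIGINFO | SA_RESTART;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
		return -1;

	/* Block SIGCHLD so that it can only arrive inside sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	if (sigprocmask(SIG_BLOCK, &block, &old_mask) == -1) {
		sigaction(SIGCHLD, &old_sa, NULL);
		return -1;
	}
	wait_mask = old_mask;
	sigdelset(&wait_mask, SIGCHLD);

	for (i = 0; i < n; i++) {
		pid_t pid = fork();
		if (pid == -1) {
			failed = 1;
			n = i;		/* only wait for those that started */
			break;
		}
		if (pid == 0)
			_exit(0);	/* child: die at once */
	}

	/* Atomically unblock SIGCHLD and sleep until the handler has run. */
	while (children_reaped < n)
		sigsuspend(&wait_mask);

	sigprocmask(SIG_SETMASK, &old_mask, NULL);
	sigaction(SIGCHLD, &old_sa, NULL);
	if (calls_out != NULL)
		*calls_out = handler_calls;
	return failed ? -1 : children_reaped;
}
```

Calling reap_children(5, &calls) from a main() should return 5. The value left in calls is the interesting part: if it is smaller than 5, several child deaths were merged into fewer handler invocations on that system.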
Re: wait(2) and SIGCHLD
1. Sample program attached. Change SIG_IGN to SIG_DFL to see the difference.
2. macOS seems to behave the same way, as does Linux.
3. I don't see where POSIX defines or allows this, but given 2., I'm surely missing something.
4. The wording in wait(2) could be improved to clarify this is only about SIG_IGN, not SIG_DFL. At least the NetBSD manpage mentions this at all.
5. Every time I think I knew Unix, I learn otherwise.

#include <sys/wait.h>
#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int stat = 0;
int ret;

int main(int argc, char * argv[]) {
	/* Explicitly ignore SIGCHLD; use SIG_DFL here to compare. */
	signal(SIGCHLD, SIG_IGN);
	if (fork()) {
		/* Parent: with SIG_IGN this wait() fails with ECHILD. */
		if ((ret = wait(&stat)) < 0) err(1, "wait");
		printf("ret %d, stat %d\n", ret, stat);
	} else {
		/* Child: exit with a recognisable status. */
		exit(42);
	}
	return 0;
}
Re: wait(2) and SIGCHLD
> I'm not sure I've completely understood your question
Probably not. Or I don't get what you are trying to say.
What I observe is that a process that explicitly ignores SIGCHLD (SIG_IGN) and then forks a child which exits gets ECHILD when wait()ing for the child (i.e., wait returns -1 and errno is ECHILD).
Re: wait(2) and SIGCHLD
The second question (that I forgot in the original mail) is whether wait(2) returning ECHILD for whatever handling of SIGCHLD is covered by POSIX.
wait(2) and SIGCHLD
I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling. The wait(2) manpage says:
    wait() will fail and return immediately if:
    [ECHILD]  The calling process has no existing unwaited-for child processes; or no status from the terminated child process is available because the calling process has asked the system to discard such status by ignoring the signal SIGCHLD or setting the flag SA_NOCLDWAIT for that signal.
However, ignore is the default handler for SIGCHLD. So does the
    because the calling process has asked the system to discard such status by ignoring the signal SIGCHLD
mean that explicitly ignoring SIGCHLD is different from ignoring it by default?
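The three cases the manpage text distinguishes can be compared side by side. A minimal sketch, assuming POSIX sigaction() and waitpid(); the enum chld_mode and the function wait_under are names invented for this example. It uses waitpid() on the specific child instead of wait(), purely so that unrelated children of the caller cannot interfere; the ECHILD behaviour in question should be the same for both calls.

```c
#define _XOPEN_SOURCE 700	/* ask for the POSIX.1-2008 + XSI interfaces */
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names for the three SIGCHLD setups under discussion. */
enum chld_mode {
	CHLD_DEFAULT,	/* SIG_DFL: ignored by default */
	CHLD_IGNORE,	/* SIG_IGN: ignored explicitly */
	CHLD_NOCLDWAIT	/* SIG_DFL plus the SA_NOCLDWAIT flag */
};

/*
 * Set up SIGCHLD according to mode, fork a child that exits with
 * status 42 and wait for it.  Returns the child's exit status, or
 * -errno if fork() or waitpid() fails (so -ECHILD if the status was
 * discarded).
 */
int
wait_under(enum chld_mode mode)
{
	struct sigaction sa, old_sa;
	int status = 0, result;
	pid_t pid;

	memset(&sa, 0, sizeof(sa));
	sigemptyset(&sa.sa_mask);
	sa.sa_handler = (mode == CHLD_IGNORE) ? SIG_IGN : SIG_DFL;
	if (mode == CHLD_NOCLDWAIT)
		sa.sa_flags = SA_NOCLDWAIT;
	if (sigaction(SIGCHLD, &sa, &old_sa) == -1)
		return -errno;

	pid = fork();
	if (pid == -1) {
		result = -errno;
	} else if (pid == 0) {
		_exit(42);			/* child */
	} else if (waitpid(pid, &status, 0) == -1) {
		result = -errno;		/* e.g. -ECHILD */
	} else if (WIFEXITED(status)) {
		result = WEXITSTATUS(status);
	} else {
		result = -EINVAL;		/* child did not exit normally */
	}

	/* Put the previous disposition back before returning. */
	sigaction(SIGCHLD, &old_sa, NULL);
	return result;
}
```

Going by what was observed on NetBSD, macOS and Linux, I would expect wait_under(CHLD_DEFAULT) to return 42 and wait_under(CHLD_IGNORE) to return -ECHILD. wait_under(CHLD_NOCLDWAIT) should also give -ECHILD if the flag behaves as the manpage describes.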
Re: Horrendous RAIDframe reconstruction performance
> That's the reconstruction algorithm. It reads each stripe and if it
> has a bad parity, the parity data gets rewritten.
That's the way parity re-write works. I thought reconstruction worked differently. oster@?