Re: MCLADDREFERENCE() incrementing the wrong ext_refcnt?

2024-03-31 Thread Edgar Fuß
> So I think
> atomic_inc_uint(&(o)->m_ext.ext_refcnt);\
> should really be
> atomic_inc_uint(&(o)->m_ext_ref->m_ext.ext_refcnt); \
> which, of course, is the same thing if MEXT_ISEMBEDDED(o) is true.

> Am I getting something wrong?
Self-answer: Yes.
m_ext is m_ext_ref->m_ext_storage, so the additional indirection is 
already performed.
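
To illustrate the self-answer, here is a minimal sketch, assuming a much simplified mbuf layout. The struct names and the helpers ext_init() and ext_add_reference() are made up for illustration and the increment is not atomic; only the m_ext shorthand is taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real mbuf structures (hypothetical layout). */
struct ext_storage {
	unsigned int ext_refcnt;		/* shared reference count */
};

struct mbuf_sketch {
	struct mbuf_sketch *m_ext_ref;		/* mbuf that owns the storage */
	struct ext_storage  m_ext_storage;	/* only meaningful in the owner */
};

/* As stated in the thread: m_ext is shorthand for m_ext_ref->m_ext_storage. */
#define m_ext	m_ext_ref->m_ext_storage

/* The owner refers to itself and starts with one reference. */
static void
ext_init(struct mbuf_sketch *m)
{
	m->m_ext_ref = m;
	m->m_ext.ext_refcnt = 1;
}

/* Model of MCLADDREFERENCE(o, n): n starts sharing o's storage. */
static void
ext_add_reference(struct mbuf_sketch *o, struct mbuf_sketch *n)
{
	assert(o->m_ext_ref != NULL);
	/* Expands to o->m_ext_ref->m_ext_storage.ext_refcnt, i.e. the owner's count. */
	o->m_ext.ext_refcnt++;
	n->m_ext_ref = o->m_ext_ref;
}
```

With m1 as owner, adding references m1 to m2 and then m2 to m3 leaves the owner's count at 3 and never touches m2's own storage, because the indirection is hidden inside the m_ext shorthand.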


MCLADDREFERENCE() incrementing the wrong ext_refcnt?

2024-03-22 Thread Edgar Fuß
Hello.

I'm under the impression that MCLADDREFERENCE() may increment the wrong 
ext_refcnt.

In case it's permitted (I can't find anything to the contrary) to 
call MCLADDREFERENCE(m1, m2) and then MCLADDREFERENCE(m2, m3), then the 
second call will increment m2's ext_refcnt where it should be incrementing 
m1's (i.e. the one that the m_ext_ref of m1, m2 and m3 all point to), no?

So I think
atomic_inc_uint(&(o)->m_ext.ext_refcnt);\
should really be
atomic_inc_uint(&(o)->m_ext_ref->m_ext.ext_refcnt); \
which, of course, is the same thing if MEXT_ISEMBEDDED(o) is true.

Am I getting something wrong?


_KERNEL_OPT and 0x6e074def

2023-12-19 Thread Edgar Fuß
What's the point of #include'ing opt_foobar.h only if _KERNEL_OPT is defined 
and what's magic about 0x6e074def?


Re: Notes on kern/57133

2023-10-06 Thread Edgar Fuß
> One change I can try is to put the diagnostic printf higher in the
> scsipi_request function
As the problem is so simple to reproduce, I'd put it just below xs = arg.

Or set a breakpoint on scsipi_get_opcodeinfo(), then, when hitting it, one on 
mpii_scsipi_request() (provided you find its address) and then step through it.

Maybe dump the whole xs to see whether other fields are corrupted, too.

But I'd bet the problem is the ccb_done routine being called prematurely.

You could introduce a mpii_nothing_done() routine that prints something when 
called, replace ccb->ccb_done = mpii_scsi_cmd_done with = mpii_nothing_done 
and move the real assignment just before the mpii_start() call.
Or so I think.
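
A minimal sketch of that placeholder-callback idea, outside the driver. struct ccb_sketch, its counters and the helpers prepare() and start() are hypothetical stand-ins; only the idea of parking ccb_done on a "nothing done" routine until just before the start call is from the suggestion above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

struct ccb_sketch {
	void (*ccb_done)(struct ccb_sketch *);
	int premature_calls;	/* times the placeholder fired */
	int real_calls;		/* times the real handler ran */
};

/* Placeholder: should never run; reports a completion that came too early. */
static void
nothing_done(struct ccb_sketch *ccb)
{
	assert(ccb != NULL);
	ccb->premature_calls++;
	fprintf(stderr, "ccb %p completed before it was started\n", (void *)ccb);
}

/* Stand-in for the real completion handler. */
static void
real_done(struct ccb_sketch *ccb)
{
	ccb->real_calls++;
}

/* While the request is being set up, completions go to the placeholder. */
static void
prepare(struct ccb_sketch *ccb)
{
	ccb->ccb_done = nothing_done;
}

/* The real handler is installed only immediately before the start call. */
static void
start(struct ccb_sketch *ccb)
{
	ccb->ccb_done = real_done;
	/* ... here the request would be handed to the hardware ... */
}
```

If the placeholder ever prints, something invoked the done routine between setup and start, which is the race suspected above.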


Re: Notes on kern/57133

2023-10-04 Thread Edgar Fuß
> provide details on what this command is?
A3h/0Ch is REPORT SUPPORTED OPERATION CODES

The call is most probably from dev/scsipi:scsipi_get_opcodeinfo().

I'm still unsure how resid can be 0 at that point. scsipi_enqueue_xs() 
sets resid to datalen (which is undocumented). Apart from the path 
interpreting sense info nobody tampers with resid.

Can you check for resid != datalen in mpii_scsi_request() just before the
xs->xs_control & XS_CTL_POLL test (and, if it fires, print whether it's 
a polled request)?

Otherwise, I suspect ccb_done set to mpii_scsi_cmd_done where it shouldn't 
(i.e. some race/error for a mpt command that's not plain SCSI).


Re: dumping on RAIDframe

2023-09-25 Thread Edgar Fuß
> you dump a memory block that isn't a multiple of a disk sector 
> (according to disklabel)
You mean this one (from disklabel raid0):
bytes/sector: 512
?


Re: dumping on RAIDframe

2023-09-25 Thread Edgar Fuß
EF> dumping to dev 18,1 (offset=1090767, size=8252262):

GO>Dumping to a RAID 1 set is supported in -8.  But yes, none of those 
GO>values seem to align with each other.  18,1 is 'raid0b' though, so that 
GO>part seems correct.

MvE> offset and size relate to the dump data (dumplo and dumpsize), not
MvE> the partition.

So here are the various configs (raid0.conf from raidctl -G raid0); I omit
sd1's info (normally identical/analogous to that of sd0) because I just 
pulled the disk to simulate a RAID failure. I'm unsure about whether the dump 
attempt was with a healthy or (artificially) failed RAID, I think it was 
with a healthy one.
      start       size  index  contents
          0          1         PMBR
          1          1         Pri GPT header
          2         32         Pri GPT table
         34       2014         Unused
       2048     262144      1  GPT part - EFI System
     264192  930869248      2  GPT part - NetBSD RAIDFrame component
  931133440       2015         Unused
  931135455         32         Sec GPT table
  931135487          1         Sec GPT header
/dev/rsd0: 2 wedges:
dk0: efi0, 262144 blocks at 2048, type: msdos
dk1: raid0, 930869248 blocks at 264192, type: raidframe
# raidctl config file for /dev/rraid0

START array
# numRow numCol numSpare
1 2 0

START disks
/dev/dk1
/dev/dk3

START layout
# sectPerSU SUsPerParityUnit SUsPerReconUnit RAID_level_1
128 1 1 1

START queue
fifo 100
# /dev/rraid0:
type: RAID
disk: raid0
label: lahn
flags:
bytes/sector: 512
sectors/track: 128
tracks/cylinder: 8
sectors/cylinder: 1024
cylinders: 909051
total sectors: 930869120
rpm: 3600
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0 

6 partitions:
#        size    offset     fstype [fsize bsize cpg/sgs]
 a:   2097152         0     4.2BSD      0     0     0  # (Cyl.      0 -   2047)
 b:  67108864   2097152       swap                     # (Cyl.   2048 -  67583)
 d: 930869120         0     unused      0     0        # (Cyl.      0 - 909051*)
 e:  62914560  69206016     4.2BSD      0     0     0  # (Cyl.  67584 - 129023)
 f: 629145600 132120576     4.2BSD      0     0     0  # (Cyl. 129024 - 743423)
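
As a sanity check on the disklabel above, a small sketch that verifies that partitions a, b, e and f are laid out back to back and fit into the 930869120 sectors of raid0. The numbers are copied from the label; struct part and layout_is_contiguous() are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct part {
	char name;
	uint64_t size;		/* in 512-byte sectors */
	uint64_t offset;	/* in 512-byte sectors */
};

/* Values copied from the disklabel output (whole-disk partition d omitted). */
static const struct part parts[] = {
	{ 'a',   2097152,         0 },
	{ 'b',  67108864,   2097152 },
	{ 'e',  62914560,  69206016 },
	{ 'f', 629145600, 132120576 },
};
static const uint64_t total_sectors = 930869120;

/* Returns 1 if every partition starts where the previous one ends
 * and the last one ends within the device, 0 otherwise. */
static int
layout_is_contiguous(const struct part *p, size_t n, uint64_t total)
{
	uint64_t next = 0;

	assert(p != NULL);
	for (size_t i = 0; i < n; i++) {
		if (p[i].offset != next)
			return 0;
		next = p[i].offset + p[i].size;
	}
	return next <= total;
}
```

So the label itself is consistent; whatever does not line up in the dump message is not the partition table.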


boot.cfg location (was: GPT attributes in dkwedge [PATCH])

2023-09-25 Thread Edgar Fuß
> boot[].cfg is searched in EFI par[tit]ion /EFI/NetBSD/boot.cfg 
> and root partition /boot.cfg. 
But how can EFI locate it on the root partition if it tells where the root 
partition lives?


Locating boot.cfg on ESP (was: GPT attributes in dkwedge [PATCH])

2023-09-25 Thread Edgar Fuß
>   | It's not obviously where efiboot finds boot.cfg, since that's in
>   | esp:/EFI/NetBSD/boot.cfg or,
> 
> And we correctly interpret that, always?
It works for me on four servers I recently set up if I put it into /EFI/NetBSD 
on the ESP. It also, for reasons unknown to me, works on one other identical 
server I set up earlier (the prototype for the others) if put into the root 
of the ESP, but that doesn't work on the four others. I didn't find out why.


Re: panic on mfii(4) vd removal

2023-09-22 Thread Edgar Fuß
> I get a panic if I remove a virtual disk from an mfii(4) device.
That's another blunder in mfii(4). Patch (including the last) attached.
Index: sys/dev/pci/mfii.c
===
RCS file: /cvsroot/src/sys/dev/pci/mfii.c,v
retrieving revision 1.3.2.7
diff -u -p -r1.3.2.7 mfii.c
--- sys/dev/pci/mfii.c  29 Sep 2022 14:41:43 -  1.3.2.7
+++ sys/dev/pci/mfii.c  22 Sep 2023 12:16:42 -
@@ -503,6 +503,8 @@ static const char *mfi_bbu_indicators[] 
 };
 #endif
 
+#define MFI_BBU_SENSORS 4
+
 static void	mfii_init_ld_sensor(struct mfii_softc *, envsys_data_t *, int);
 static void	mfii_refresh_ld_sensor(struct mfii_softc *, envsys_data_t *);
 static void	mfii_attach_sensor(struct mfii_softc *, envsys_data_t *);
@@ -1373,18 +1375,20 @@ mfii_aen_ld_update(struct mfii_softc *sc
if (old == -1 && nld != -1) {
printf("%s: logical drive %d added (target %d)\n",
DEVNAME(sc), i, nld);
+   sc->sc_ld[i].ld_present = 1;
 
// XXX scsi_probe_target(sc->sc_scsibus, i);
 
-   mfii_init_ld_sensor(sc, &sc->sc_sensors[i], i);
-   mfii_attach_sensor(sc, &sc->sc_sensors[i]);
+   mfii_init_ld_sensor(sc, &sc->sc_sensors[i + MFI_BBU_SENSORS], i);
+   mfii_attach_sensor(sc, &sc->sc_sensors[i + MFI_BBU_SENSORS]);
} else if (nld == -1 && old != -1) {
printf("%s: logical drive %d removed (target %d)\n",
DEVNAME(sc), i, old);
+   sc->sc_ld[i].ld_present = 0;
 
scsipi_target_detach(&sc->sc_chan, i, 0, DETACH_FORCE);
sysmon_envsys_sensor_detach(sc->sc_sme,
-   &sc->sc_sensors[i]);
+   &sc->sc_sensors[i + MFI_BBU_SENSORS]);
}
}
 
@@ -3716,8 +3720,6 @@ freeme:
 
 #endif /* NBIO > 0 */
 
-#define MFI_BBU_SENSORS 4
-
 static void
 mfii_bbu(struct mfii_softc *sc, envsys_data_t *edata)
 {
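
The patch shifts every logical-drive sensor index by MFI_BBU_SENSORS. A minimal sketch of that indexing, assuming (my reading of the patch, not something stated in the thread) that the first MFI_BBU_SENSORS slots of sc_sensors are used for the BBU sensors; MAX_LD_SKETCH and ld_sensor_index() are hypothetical names.

```c
#include <assert.h>

#define MFI_BBU_SENSORS	4	/* value from the patch */
#define MAX_LD_SKETCH	64	/* hypothetical upper bound on logical drives */

/* Map a logical drive number to its slot in a sensor array whose first
 * MFI_BBU_SENSORS entries are taken by the BBU sensors (assumption). */
static int
ld_sensor_index(int ld)
{
	assert(ld >= 0 && ld < MAX_LD_SKETCH);
	return ld + MFI_BBU_SENSORS;
}
```

Under that assumption, using sc_sensors[i] for logical drive i would attach or detach one of the BBU slots instead of the drive's own sensor.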


panic on mfii(4) vd removal

2023-09-21 Thread Edgar Fuß
I get a panic if I remove a virtual disk from an mfii(4) device.

What I found out is that mfii_aen_ld_update() calls 
sysmon_envsys_sensor_detach(), which (near the end of the routine) calls 
TAILQ_REMOVE(). In that, the last statement (minus 
QUEUEDEBUG_TAILQ_POSTREMOVE()), which is
*(elm)->field.tqe_prev = (elm)->field.tqe_next;
fails because (elm)->field.tqe_prev is NULL.

I seem to be confused how tail queues work internally, because it appears 
to me that removing the first entry will fail, which is obviously not 
the case?

Any hints what's wrong?
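
On the tail queue question: tqe_prev does not point at the previous element but at the previous element's tqe_next field, or at the head's tqh_first for the first entry, so removing the first entry works fine. A NULL tqe_prev rather suggests the entry is not on any list, for example because it was never inserted. A minimal sketch using <sys/queue.h>; struct sensor and the two helpers are made up for illustration, and the NULL test assumes entries are zeroed while off-list.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

struct sensor {
	int id;
	TAILQ_ENTRY(sensor) link;
};
TAILQ_HEAD(sensor_list, sensor);

/* A zero-initialised entry that was never inserted has tqe_prev == NULL;
 * TAILQ_REMOVE() on it would store through that NULL pointer. */
static int
looks_unlinked(const struct sensor *s)
{
	return s->link.tqe_prev == NULL;
}

/* Remove s only if it appears to be on a list. Returns 1 if removed. */
static int
remove_if_linked(struct sensor_list *head, struct sensor *s)
{
	assert(head != NULL && s != NULL);
	if (looks_unlinked(s))
		return 0;
	TAILQ_REMOVE(head, s, link);
	/* Mark the entry as off-list again so a second removal is caught. */
	s->link.tqe_next = NULL;
	s->link.tqe_prev = NULL;
	return 1;
}
```

After inserting two entries, the first one's tqe_prev equals &head.tqh_first, which is why the final assignment in TAILQ_REMOVE() updates the head when the first entry goes away.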


Adding a virtual disk to mfii(4)

2023-09-21 Thread Edgar Fuß
After adding a virtual disk to an mfii(4) device (racadm createvirtualdisk, 
in my case), you get a nice
mfii0: logical drive 2 added (target 2)
message, but
scsictl scsibus0 scan 2 0
doesn't find any drive.

That's because sc_ld[i].ld_present is still unset from mfii_attach() and so 
mfii_scsipi_request() will return early (with xs->error = XS_SELTIMEOUT).
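
The early return can be modelled in a few lines. This is only a sketch under simplifying assumptions: the structs, the enum and the function names are stand-ins, not the driver's definitions; only the ld_present flag and its effect are taken from the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_LD_SKETCH	64	/* hypothetical table size */

enum xs_result { RESULT_OK, RESULT_SELTIMEOUT };

struct ld_sketch {
	bool ld_present;
};

struct softc_sketch {
	struct ld_sketch sc_ld[MAX_LD_SKETCH];
};

/* Attach time: no logical drive is marked present yet. */
static void
softc_init(struct softc_sketch *sc)
{
	memset(sc, 0, sizeof(*sc));
}

/* Model of the request path: absent targets are rejected right away. */
static enum xs_result
request(const struct softc_sketch *sc, int target)
{
	assert(target >= 0 && target < MAX_LD_SKETCH);
	if (!sc->sc_ld[target].ld_present)
		return RESULT_SELTIMEOUT;
	return RESULT_OK;
}

/* Model of the event handler with the flag updated on add and remove. */
static void
ld_update(struct softc_sketch *sc, int target, bool added)
{
	assert(target >= 0 && target < MAX_LD_SKETCH);
	sc->sc_ld[target].ld_present = added;
}
```

Without the update in the event handler, a scan of the newly added target keeps running into the select timeout branch.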

The attached fix seems pretty damn obvious and appears to work.

Shall I file a PR?
Index: sys/dev/pci/mfii.c
===
RCS file: /cvsroot/src/sys/dev/pci/mfii.c,v
retrieving revision 1.3.2.7
diff -u -p -r1.3.2.7 mfii.c
--- sys/dev/pci/mfii.c  29 Sep 2022 14:41:43 -  1.3.2.7
+++ sys/dev/pci/mfii.c  21 Sep 2023 13:38:43 -
@@ -1373,6 +1373,7 @@ mfii_aen_ld_update(struct mfii_softc *sc
if (old == -1 && nld != -1) {
printf("%s: logical drive %d added (target %d)\n",
DEVNAME(sc), i, nld);
+   sc->sc_ld[i].ld_present = 1;
 
// XXX scsi_probe_target(sc->sc_scsibus, i);
 
@@ -1381,6 +1382,7 @@ mfii_aen_ld_update(struct mfii_softc *sc
} else if (nld == -1 && old != -1) {
printf("%s: logical drive %d removed (target %d)\n",
DEVNAME(sc), i, old);
+   sc->sc_ld[i].ld_present = 0;
 
scsipi_target_detach(&sc->sc_chan, i, 0, DETACH_FORCE);
sysmon_envsys_sensor_detach(sc->sc_sme,


dumping on RAIDframe

2023-09-20 Thread Edgar Fuß
Didn't RAIDframe recently (for certain values of "recently") gain the function 
to dump on a level 1 set? Should this work in -8?
swapctl -z says "dump device is raid0b" (and raid0 is a level 1 RAID), but 
reboot 0x100 in DDB says
dumping to dev 18,1 (offset=1090767, size=8252262):

dump device not ready

What am I missing?

The offset (as reported by disklabel) of raid0b within raid0 is 2097152 (1G), 
the partition size is 67108864 (32G), so maybe something's wrong with the 
offset and size values (whatever unit they are in) DDB reports.
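
The 1G and 32G figures follow from the sector counts if one assumes 512-byte sectors (the bytes/sector value disklabel reports for raid0). A small worked conversion; the helper names are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define SECTOR_SIZE	512u	/* assumption: bytes/sector of raid0 */

/* Convert a sector count to bytes. */
static uint64_t
sectors_to_bytes(uint64_t sectors)
{
	return sectors * SECTOR_SIZE;
}

/* Convert a byte count that is a whole number of GiB to GiB. */
static uint64_t
bytes_to_gib(uint64_t bytes)
{
	assert(bytes % (UINT64_C(1) << 30) == 0);
	return bytes >> 30;
}
```

2097152 sectors are 2^21 * 2^9 = 2^30 bytes (1 GiB) and 67108864 sectors are 2^26 * 2^9 = 2^35 bytes (32 GiB), so the disklabel values are in sectors; the units of the offset and size in the dump message are a separate question.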


typo in raidN.conf leading to allegedly failed component

2023-09-12 Thread Edgar Fuß
I set up a server with a RAIDframe level 1 RAID and forgot raidctl -A softroot.
So I booted an installation kernel via PXE, typed in a /tmp/raid0.conf and did
raidctl -c /tmp/raid0.conf raid0, only I mistyped the name of the first 
component. That led to "hosed component", but worse, failed that component and 
apparently marked the first component failed on the label of the second. 
So after raidctl -u raid0, correcting my typo and raidctl -c /tmp/raid0.conf 
raid0, I ended up with a failed first component that didn't really fail.

Can that be improved?


raidctl -A softroot and a failed component

2023-09-12 Thread Edgar Fuß
I had a RAIDframe level 1 RAID with the first component marked as failed, e.g.
component0: failed
/dev/dkN: optimal
and although the set was configured -A softroot, the kernel didn't configure 
raid0a as the root file system, presumably because the dk numbers didn't match.
I was sitting in front of the console, so I could easily type raid0a etc.,
but this would have prevented an automatic boot.

I'm afraid little can be done about that weird situation?


Re: Hard link creation without write access

2023-09-07 Thread Edgar Fuß
> a likely source of security issues.
Why, exactly? I hope you need search permission to the original file 
(you certainly need search and write permission to the destination directory),
so what can you do after the link you couldn't have done before?

What about rename instead of link, should that be permitted?


Re: Maxphys on -current?

2023-08-04 Thread Edgar Fuß
Hasn't there been a tls-maxphys branch?


unable to create xfer table DMA map for drive 0, error=12

2023-08-03 Thread Edgar Fuß
I attached a 2.5" SSD to a machine, did a drvctl -r ata_hl atabus1 and got
svwsata0:1: unable to create xfer table DMA map for drive 0, error=12
wd2(svwsata0:1:0): using PIO mode 4
Is this a problem with -6 that machine runs or what does it mean?


Re: compare kernel config

2023-05-31 Thread Edgar Fuß
> Do we have a reliable way to compare kernel configurations? 
config -x and diff?


Re: USB-related panic in 8.2_STABLE

2023-04-28 Thread Edgar Fuß
> The same patch should apply just as well on netbsd-8.
OK, I just did that.

But we still don't know what led to the disconnect. Does the
ohci0: 1 scheduling overruns
give any clue?


Re: USB-related panic in 8.2_STABLE

2023-04-27 Thread Edgar Fuß
> list *(ugen_get_cdesc+0xb1)
0x802f8f2e is in ugen_get_cdesc (/usr/src-8/sys/dev/usb/ugen.c:1376).
1371    usb_config_descriptor_t *cdesc, *tdesc, cdescr;
1372    int len;
1373    usbd_status err;
1374
1375    if (index == USB_CURRENT_CONFIG_INDEX) {
1376        tdesc = usbd_get_config_descriptor(sc->sc_udev);
1377        len = UGETW(tdesc->wTotalLength);
1378        if (lenp)
1379            *lenp = len;
1380        cdesc = kmem_alloc(len, KM_SLEEP);

> list *(ugenioctl+0x9a4)
0x802f99d1 is in ugenioctl (/usr/src-8/sys/dev/usb/ugen.c:1668).
1663            *usbd_get_device_descriptor(sc->sc_udev);
1664        break;
1665    case USB_GET_CONFIG_DESC:
1666        cd = (struct usb_config_desc *)addr;
1667        cdesc = ugen_get_cdesc(sc, cd->ucd_config_index, &cdesclen);
1668        if (cdesc == NULL)
1669            return EINVAL;
1670        cd->ucd_desc = *cdesc;
1671        kmem_free(cdesc, cdesclen);
1672        break;

Does that help?

What about the
ohci0: 1 scheduling overruns
that preceded the detach that preceded the panic?


Re: USB-related panic in 8.2_STABLE

2023-04-27 Thread Edgar Fuß
> You didn't give timing.
Unfortunately, we don't know the timing.
We don't know when and why the UPS disconnected.

> normally the UPS doesn't disconnect
It doesn't. Why should it?


SEGV in mmap() when building lang/gcc8 with devel/binutils

2023-03-10 Thread Edgar Fuß
Sorry for the cross-post, but the problem is so weird that I'm confused 
what nature it is.

For complicated reasons (see below for details), I'm trying to build lang/gcc8 
so that it uses gas/gld from devel/binutils instead of /usr/bin/{as,ld}. I put
DEPENDS+=   binutils-[0-9]*:../../devel/binutils
CONFIGURE_ARGS.NetBSD+= --with-gnu-ld --with-ld=${PREFIX}/bin/gld
CONFIGURE_ARGS.NetBSD+= --with-gnu-as --with-as=${PREFIX}/bin/gas
in Makefile.

After quite some build time, some intermediate step chokes with:
checking for x86_64--netbsd-gcc... 
/var/work/pkgsrc/lang/gcc8/work/build/./gcc/xgcc 
-B/var/work/pkgsrc/lang/gcc8/work/build/./gcc/ 
-B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/bin/ 
-B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/lib/ -isystem 
/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/include -isystem 
/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/sys-include   
checking for suffix of object files... configure: error: in 
`/var/work/pkgsrc/lang/gcc8/work/build/x86_64--netbsd/libgcc':
configure: error: cannot compute suffix of object files: cannot compile
See `config.log' for more details.
gmake[2]: *** [Makefile:20348: configure-stage2-target-libgcc] Error 1
gmake[2]: Leaving directory '/var/work/pkgsrc/lang/gcc8/work/build'
gmake[1]: *** [Makefile:26109: stage2-bubble] Error 2
gmake[1]: Leaving directory '/var/work/pkgsrc/lang/gcc8/work/build'
gmake: *** [Makefile:949: all] Error 2
*** Error code 2

Stop.
make[1]: stopped in /usr/pkgsrc/lang/gcc8
*** Error code 1

Stop.
make: stopped in /usr/pkgsrc/lang/gcc8

According to config.log, the failing command is
/var/work/pkgsrc/lang/gcc8/work/build/./gcc/xgcc 
-B/var/work/pkgsrc/lang/gcc8/work/build/./gcc/ 
-B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/bin/ 
-B/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/lib/ -isystem 
/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/include -isystem 
/usr/pkg.compiler_boot/gcc8/x86_64--netbsd/sys-include -c -g -O2 
-D_FORTIFY_SOURCE=2 -I/usr/include -I/usr/pkg.compiler_boot/include/python3.10  
conftest.c >&5

ktrace-ing that manually reveals
cc1  CALL  
mmap(0,0x10,PROT_READ|PROT_WRITE,0x14001002,0x,0,0)
cc1  RET   mmap 137581005111296/0x7d2112f0
cc1  PSIG  SIGSEGV caught handler=0xa2e336 mask=(11): 
code=SEGV_MAPERR, addr=0x8, trap=6)

I even have no idea what xgcc is.

Any hints?


That's on 8.2_STABLE.

What I'm really trying to do is to build for 6.1 using a chroot and kver.
Without the patch to make gcc8 use pkgsrc binutils, it builds and seems to work 
(it can build itself into the standard LOCALBASE), but it fails on 
archivers/zstd because gcc emits a .S file (no inline involved) which 
/usr/bin/as can't assemble (tzcntl and shrx).

Any hints on that welcome, too!


Re: ATA TRIM?

2022-12-25 Thread Edgar Fuß
> According to that PDF, dholland is wrong.
I fail to see a behaviour that would be allowed due to dholland@'s definition, 
but not according to the one you cited, nor the other way round.


acpiwmibus at acpiwmi0 not configured

2022-12-19 Thread Edgar Fuß
I notice a line
acpiwmibus at acpiwmi0 not configured
in the autoconf messages.
Indeed, my kernel config has
acpiwmi* at acpi?
and
wmidell* at acpiwmibus?
but no attachment for any acpiwmibus, nor does any other kernel config.

Is there something magic about acpiwmibus or are the configs simply missing 
an appropriate line?


mpii_start() vs. mfii_start(): bus_space_write_raw_8(), bus_space_barrier()

2022-10-11 Thread Edgar Fuß
I'm investigating timeout problems with my mpii(4) device (after the driver 
has been converted to MSI(-X)). I'm trying to understand both 
sys/dev/pci/mpii.c and mfii.c since they address the same hardware with 
different firmware.

Comparing mpii_start() with mfii_start(), I'm stumbling over a number of 
differences I don't understand (I've removed some debug statements from 
mpii_start()):


void
mpii_start(struct mpii_softc *sc, struct mpii_ccb *ccb)
{
	struct mpii_request_header	*rhp;
	struct mpii_request_descr	descr;
#if defined(__LP64__) && 0
	u_long				*rdp = (u_long *)&descr;
#else
	u_int32_t			*rdp = (u_int32_t *)&descr;
#endif

[...]
	bus_dmamap_sync(sc->sc_dmat, MPII_DMA_MAP(sc->sc_requests),
	    ccb->ccb_offset, sc->sc_request_size,
	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);
[...]

#if defined(__LP64__) && 0
	bus_space_write_raw_8(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, *rdp);
#else
	mutex_enter(&sc->sc_req_mtx);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, rdp[0]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, 8, BUS_SPACE_BARRIER_WRITE);

	bus_space_write_4(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_HIGH, rdp[1]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MPII_REQ_DESCR_POST_LOW, 8, BUS_SPACE_BARRIER_WRITE);
	mutex_exit(&sc->sc_req_mtx);
#endif
}


static void
mfii_start(struct mfii_softc *sc, struct mfii_ccb *ccb)
{
	uint32_t *r = (uint32_t *)&ccb->ccb_req;
#if defined(__LP64__)
	uint64_t buf;
#endif

	bus_dmamap_sync(sc->sc_dmat, MFII_DMA_MAP(sc->sc_requests),
	    ccb->ccb_request_offset, MFII_REQUEST_SIZE,
	    BUS_DMASYNC_PREREAD | BUS_DMASYNC_PREWRITE);

#if defined(__LP64__)
	buf = ((uint64_t)r[1] << 32) | r[0];
	bus_space_write_8(sc->sc_iot, sc->sc_ioh, MFI_IQPL, buf);
#else
	mutex_enter(&sc->sc_post_mtx);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh, MFI_IQPL, r[0]);
	bus_space_write_4(sc->sc_iot, sc->sc_ioh, MFI_IQPH, r[1]);
	bus_space_barrier(sc->sc_iot, sc->sc_ioh,
	    MFI_IQPL, 8, BUS_SPACE_BARRIER_WRITE);
	mutex_exit(&sc->sc_post_mtx);
#endif
}


1. __LP64__ handling: Is the LP64 case simply an optimization or is it 
   safer on the relevant platforms?

2. bus_space_write_raw_8(): I can't find any description or references for 
   that function. Should that be bus_space_write_8()?

3. Single vs. double bus_space_barrier(): It strikes me as odd that 
   mpii_start() has a barrier call between the two bus_space_write_4() calls 
   while mfii_start() hasn't. It also looks suspicious to me that both 
   barrier calls in mpii_start() use MPII_REQ_DESCR_POST_LOW.

Can someone please enlighten me?
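
For what it's worth, the LP64 branch of mfii_start() builds the 64-bit value from the two 32-bit words explicitly, while the disabled branch of mpii_start() reads the descriptor through a u_long pointer. A minimal user-space sketch of the difference, assuming nothing about the hardware: the two ways only agree on a little-endian host. The function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Combine two 32-bit words (low word first) into one 64-bit value,
 * the way the LP64 branch of mfii_start() does. */
static uint64_t
combine_words(const uint32_t r[2])
{
	assert(r != NULL);
	return ((uint64_t)r[1] << 32) | r[0];
}

/* Read the same 8 bytes as a single 64-bit load, which is what
 * dereferencing a 64-bit pointer to the descriptor amounts to. */
static uint64_t
load_raw64(const uint32_t r[2])
{
	uint64_t v;

	memcpy(&v, r, sizeof(v));
	return v;
}

/* Returns 1 on a little-endian host, 0 otherwise. */
static int
host_is_little_endian(void)
{
	const uint16_t probe = 1;
	uint8_t first;

	memcpy(&first, &probe, 1);
	return first == 1;
}
```

Whether a single 64-bit register write is also safer than two 32-bit writes under a mutex is a property of the device and the bus, which this sketch cannot answer.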


mfii0: cmd timeout

2022-09-19 Thread Edgar Fuß
This is NOT kern/55192.

So I thought I had mastered my PERC H330; set up two virtual volumes 
containing a one-disc RAID 0, set up GPTs, EFI boot volumes, built a 
RAIDframe RAID 1, disklabeled that, newfs'd the partitions, only 
remaining step being unpacking the sets (and a few config files).

Unfortunately, the machine hangs unpacking the sets, uttering
mfii0: cmd timeout ...
and stalls.

This is NOT kern/55192. It happens both with 8.2_STABLE with the patch 
from mfii.c 1.16 applied and -current.

The strange thing is that dd'ing to the raw partition seems to work 
and initializing the RAID parity also worked. But unpacking any sets 
stalls sooner or later.

Any idea how to debug this?

I looked at OpenBSD's (where mfii(4) was ported from) CVS and couldn't 
find any changes that look related.
I looked at FreeBSD, but they have mrsas(4), which seems to be an 
entirely different beast provided by AVAGO/LSI.
Any idea why OpenBSD wrote a new driver? Any chance to port mrsas(4) 
from FreeBSD?


Re: Dell PERC H330: no disks, no volumes

2022-09-15 Thread Edgar Fuß
> There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
> (includes the H330) says you can boot it in HBA mode. Not sure if
> that means that you can choose the firmware.
When I set the H330 to HBA mode, it still attaches as mfii0, the only 
difference to RAID mode being that the attachment in HBA mode says
scsibus0 at mfii0: 0 targets, 8 luns per target
instead of
scsibus0 at mfii0: 32 targets, 8 luns per target
in RAID mode.

I tried to force it to use mpii (by adding the PCI Id in mpii.c and 
disabling mfii in the kernel config), but that didn't work either 
(I had the faint hope the controller would use the MPT-2 protocol in 
HBA mode despite showing the RAID PCI Ids).

What /does/ work is setting the controller to RAID mode and creating two 
volumes with a one-element RAID-0 each. But that feels crazy.


Re: Dell PERC H330: no disks, no volumes

2022-09-14 Thread Edgar Fuß
> Yes, in the controller setup you can create "Non-RAID Disks" (aka
> JBOD) or "Virtual Disks" (aka RAID volumes)
Where exactly are those Non-RAID Disks hidden?

> In theory you could use bioctl to create and manage volumes, but the
> driver doesn't implement it.
Ah, interesting. That was the way I was trying to use.


Re: Dell PERC H330: no disks, no volumes

2022-09-14 Thread Edgar Fuß
> I don't remember the details (and it depends on the controller version),
> but you need to have physical disks assigned to one (or more) RAID volume,
> and then the RAID volume has to be exported as one (or more) virtual disks.
But what if I want to pass the bare discs to NetBSD for a RAIDframe use?


Re: panic in sysmon_envsys_unregister

2022-09-14 Thread Edgar Fuß
> I need to build a new install image (since I have no discs).
I applied your fix to -8 and the panic disappeared.
Thanks for the quick fix. Maybe it should be pulled up?


Re: Dell PERC H330: no disks, no volumes

2022-09-14 Thread Edgar Fuß
Oh, I wasn't aware the H330 and HBA330 are different devices!

> There is a PERC H330 and a PERC HBA330 and the Dell PERC9 user manual
> (includes the H330) says you can boot it in HBA mode. Not sure if
> that means that you can choose the firmware.
Oh well. So the HBA330 is a PowerEdge RAID Controller that isn't a RAID 
controller? Thanks, Dell marketing!

> -> This is attaching a H330 (RAID version) and it gets the mfii driver.
> mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001
OK, remains the question why I don't see any discs in bioctl.

On startup, the machine utters the following:

PowerEdge Expandable RAID Controller BIOS
Copyright(c) 2016 Avago Technologies
Press  to Run Configuration Utility
HA -0 (Bus 1 Dev 0) PERC H330 Mini
FW package: 25.5.0.0001


0 Non-RAID Disk(s) found on the host adapter.
0 Non-RAID Disk(s) handled by BIOS

0 Virtual Disk(s) found on the host adapter.

0 Virtual Disk(s) handled by BIOS

Is this normal? The only place I see discs being recognized is in the BIOS 
setup's controller setup.


Re: panic in sysmon_envsys_unregister

2022-09-14 Thread Edgar Fuß
> This should be fixed by mfii.c rev. 1.26. Please update it and retry.
Thanks. I need to build a new install image (since I have no discs).

The other question is why the register call fails.
According to the BIOS setup, the controller has no sensors. Could that be 
the problem?


panic in sysmon_envsys_unregister

2022-09-13 Thread Edgar Fuß
I get a panic on shutdown:

netbsd:sysmon_envsys_unregister+0x128:  cmpq    0(%rdx),%r12
sysmon_envsys_unregister
mfii_detach
config_detach
config_detach_all
cpu_reboot
kern_reboot
sys_reboot
syscall
ds  4da0
es  0
fs  1
gs  c632
rdi 818f0510    sme_global_mtx
rsi 
rbp 9008514e4da0
rbx 90003d04c000
rdx 0
rcx d9e26f07b700
rax 0
r8  0
r9  0
r10 0
r11 0
r12 d9e26d5b1c40
r13 d9e26d7a5a00
r14 1
r15 81802c60    mfii_ca
rip 80a8c41e    sysmon_envsys_unregister+0x128
cs  8
rflags  10246
rsp 9008514e4d90
ss  10

This is -current from around yesterday.
I guess the problem is related to
mfii0: autoconfiguration error: unable to register with sysmon (rv = 86)
mfii0: autoconfiguration error: unable to create sensors
So probably someone is trying to unregister something not registered.


Re: Dell PERC H330: no disks, no volumes

2022-09-13 Thread Edgar Fuß
> These controller chips can run two different kinds of firmware.
> The mfii driver is for talking to the RAID firmware ("IR mode")
> while the mpii driver is for talking to the vanilla SAS firmware
> ("IT mode").
Ah, and how do I know which mode my card runs?
mpii(4) explicitly mentions the Dell PERC HBA330, but the "R" in PERC 
is for RAID.
The controller can be switched to RAID or HBA mode in the BIOS setup, 
so does it run both firmware versions?


Re: Dell PERC H330: no disks, no volumes

2022-09-13 Thread Edgar Fuß
It appears to me we have two drivers for the SAS3008: mfii(4) and mpii(4).
Why?


Dell PERC H330: no disks, no volumes

2022-09-13 Thread Edgar Fuß
So after I managed to boot my new PowerEdge R6515, the next challenge is that 
I have no discs.

The machine is equipped with a PERC H330 mini, a SCSI backplane and two 
SATA SSDs.

I do see the discs in the BIOS's RAID controller configuration menu.
Autoconfiguration says:
mfii0 at pci1 dev 0 function 0: "PERC H330 Mini", firmware 25.5.9.0001
mfii0: interrupting at ioapic4 pin 26
scsibus0 at mfii0: 0 targets, 8 luns per target
mfii0: unable to register with sysmon (rv = 86)
mfii0: unable to create sensors
[...]
mfii0: physical disk inserted id 32 enclosure 32
mfii0: physical disk inserted id 0 enclosure 32
mfii0: physical disk inserted id 1 enclosure 32
(both with 8.2 and current), but bioctl mfii0 show says
bioctl: no volumes available
and bioctl show disks shows a header and then
bioctl: BIOCDISK_NOVOL: Inappropriate ioctl for device

The BIOS configuration lets me set the controller mode from RAID to HBA and 
I can mark individual discs as "RAID eligible", but that doesn't seem to 
make a difference.

I suppose it's something stupid. Anyone using an H330?


Re: debugging a kernel that doesn't start

2022-09-13 Thread Edgar Fuß
> I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being 
> loaded (PXE or USB) but then the machine hangs hard.
I've made a giant step forward: booting the -current install image from a 
USB key /via UEFI/ works.
Maybe it's a bug in the server's CSM.

Thanks for all the helpful comments anyway.


Re: debugging a kernel that doesn't start

2022-09-12 Thread Edgar Fuß
> then you can bypass all the worries of using BIOS routines or whatnot 
> and just poke the hardware directly.
Probably stupid question: I can switch the machine to UEFI. Is it easier 
to debug things from there than from a BIOS boot?


Re: debugging a kernel that doesn't start

2022-09-12 Thread Edgar Fuß
> That could be a strong clue or it could be unrelated.
OK, just in case that might be another clue: If I want to interrupt the 
boot countdown, the first keystroke gets lost, I need to press  
a second time.


Re: debugging a kernel that doesn't start

2022-09-12 Thread Edgar Fuß
> If you can setup a serial console, it may make things much easier.
I do have a serial port on the machine.

> I almost always use serial consoles on dev machines; I don't remember the
> details but doing the equivalent of a putchar very early was possible.
Is the BIOS still available or how does that work?


Re: debugging a kernel that doesn't start

2022-09-12 Thread Edgar Fuß
> Have you tried booting a custom kernel with some drivers removed?
No. I wouldn't know which drivers to remove.
The problem is the kernel utters absolutely nothing, so it must hang very, 
very early.

> have you tried an uncompressed one?
No, but I guess the official install image (on a USB key) is supposed to 
work as-is, no?

> The simplest way to debug something is using a serial port, do you have
> access to the one on this machine?
Yes, there is one. It seems to sort-of mirror the on-screen messages up to 
the point the NetBSD boot runs. I tried
consdev com0,9600
from the boot prompt but that hung the machine.


debugging a kernel that doesn't start

2022-09-12 Thread Edgar Fuß
I'm trying to run NetBSD on a Dell PowerEdge R6515, and the kernel is being 
loaded (PXE or USB) but then the machine hangs hard.

What's the way to debug a kernel that hangs so early that you can't printf 
or drop into ddb? I guess that's a phenomenon quite common for a new port 
or changes to locore.s (or whatever that's called today), but it's completely 
new to me.

I have virtually no clue about PeCee hardware. At the point the kernel is 
started, are BIOS routines still available?


Re: mfii(4) and Dell PERC

2022-08-08 Thread Edgar Fuß
Thanks for your answers.

> Some people reported that kern/56669 (and perhaps kern/55192) still exist 
> on some systems :-(
Hm.

> bioctl mfi(i)X show
Ah, thanks.
What do I do in case a drive fails? Will adding a hot spare automagically start 
a reconstruction?

> If your system has other number, please let me know.
I don't have such a system yet. I wanted to find out about NetBSD compatibility 
before buying one.


mfii(4) and Dell PERC

2022-08-08 Thread Edgar Fuß
I'm unsure whether this is the right list, is port-amd64 more appropriate?

Does anyone use a Dell PERC H730P or similar RAID controller in RAID mode?

mfii(4) says all configuration is done via the controller's BIOS.
Does that mean I need to shut down in case a drive fails and I need to rebuild?

Can I monitor the RAID state?
Can I monitor the BBU Battery health?

Thanks in advance.


Re: Periodic messages on NetBSD-9 and -current: xennet0: rx no cluster

2022-06-24 Thread Edgar Fuß
> the request count on the mclpl line is incrementing at a pretty fast rate
Maybe you're running into the same problem as me (see the "mbuf cluster leak?" 
thread on tech-net).
Try a kernel with MBUFTRACE. If that shows you (via netstat -mss) a large 
number of tx bufs on a particular vlan interface, try destroy-ing and 
re-creating that interface (and reloading ipfilter in case you're using it).
For me, that stops the allocations from rising (for a while).
I still don't know what triggers it, though.


Re: mfii hanging on boot

2022-06-23 Thread Edgar Fuß
> I committed the change yesterday.
I don't get what the #if defined(__LP64__) && 0 is for.


Re: killed: out of swap

2022-06-15 Thread Edgar Fuß
> Perhaps my understanding is wrong
No.


Re: killed: out of swap

2022-06-14 Thread Edgar Fuß
> I assume my impression is completely wrong (today).
OK, thanks for all the explanations and insights.


Re: killed: out of swap

2022-06-14 Thread Edgar Fuß
> So what should the kernel do?
I don't know how things work under the hood today (I might have partially 
known in the times of sbrk()), but I would suppose that malloc() will 
ultimately result in some system call enlarging the heap/data 
segment/whatever. That system call could simply fail.

I assume my impression is completely wrong (today). But then, how can 
a malloc() fail before the process gets killed?


killed: out of swap

2022-06-14 Thread Edgar Fuß
I have a program that keeps malloc()ing (and scribbling a bit into the 
allocated memory) until malloc() fails. The intention is to put pressure 
on the VM system to find out how much pool cache memory it can reclaim.

When I run that program (with swap space unconfigured), it doesn't terminate 
normally, but gets killed by the kernel with "out of swap". Unfortunately, 
other processes happening to malloc() during that time may get killed, too.

I don't quite get what the rationale for that is (or maybe I'm doing 
something stupidly wrong). If I malloc(), and that fails, that should fail 
and not kill me, no?

I'm surely missing something.
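
A minimal sketch of the kind of test program described above, for anyone who wants to reproduce the behaviour. The function name grab_memory() and the max_chunks safety cap are assumptions added for illustration; the original program had no cap and simply ran until malloc() failed. On a system that hands out address space lazily, malloc() may keep succeeding and the shortage only shows up when the pages are touched, which would be consistent with the process being killed instead of seeing NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* One allocated chunk; chunks are chained so they can be freed later. */
struct chunk {
	struct chunk *next;
};

/*
 * Allocate chunks of chunk_size bytes and write to each of them so the
 * pages really get touched. Stop when malloc() returns NULL or when
 * max_chunks (a safety cap, not part of the original program) is hit.
 * Frees everything again and returns the number of chunks obtained.
 */
size_t
grab_memory(size_t chunk_size, size_t max_chunks)
{
	struct chunk *head = NULL;
	size_t n = 0;

	assert(chunk_size >= sizeof(struct chunk));

	while (n < max_chunks) {
		struct chunk *c = malloc(chunk_size);
		if (c == NULL)
			break;	/* the "normal" failure the program waits for */
		memset(c, 0xa5, chunk_size);	/* scribble into the memory */
		c->next = head;			/* link only after scribbling */
		head = c;
		n++;
	}

	while (head != NULL) {
		struct chunk *next = head->next;
		free(head);
		head = next;
	}
	return n;
}
```

Calling grab_memory(1024 * 1024, SIZE_MAX) would approximate the uncapped original; with swap unconfigured that is exactly the case that may end in "out of swap" rather than in a NULL return.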


Re: membar_enter semantics

2022-02-15 Thread Edgar Fuß
I know close to nothing about the subject in question, but maybe thoughts from 
a non-expert may be useful:

If there's a widely adopted terminology, one should probably stick to it even 
if the wording is counter-intuitive or misleading (but note that fact in the 
documentation). After all, Simple Groups are not easy at all and you need to
 know about Galois Theory to understand why Solvable Groups are named that way.

If the operations are called foo-before-bar, I would have to look up 
documentation on every instance to understand what the intended usage is. 
So for me, naming the operations after what they do, but have aliases for 
intended usage would make sense.

When I read frozz_enter() and frozz_exit() in code, my expectation is that 
every call to enter is paired with a call to exit _in the control flow_, i.e., 
there's no (other than panic) code path that goes through one of them, but 
not the other.

Would it make sense to call the intended-usage aliases something like 
push/pull, provide/consume or publish/whatever?
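
To illustrate the naming idea, here is a small sketch of what intended-usage aliases could look like in portable C11. The names membar_publish() and membar_consume() are hypothetical and not an existing NetBSD interface; they are thin aliases over the C11 release and acquire fences, which are named after what they do.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical intended-usage aliases over the C11 fences. */
#define membar_publish()	atomic_thread_fence(memory_order_release)
#define membar_consume()	atomic_thread_fence(memory_order_acquire)

static int payload;		/* data handed from producer to consumer */
static atomic_bool ready;	/* flag announcing that payload is valid */

/* Producer side: write the data, then publish it by setting the flag. */
void
producer_publish(int value)
{
	payload = value;
	membar_publish();	/* payload store ordered before flag store */
	atomic_store_explicit(&ready, true, memory_order_relaxed);
}

/* Consumer side: only read the data after having seen the flag. */
bool
consumer_try_consume(int *out)
{
	assert(out != NULL);
	if (!atomic_load_explicit(&ready, memory_order_relaxed))
		return false;
	membar_consume();	/* flag load ordered before payload load */
	*out = payload;
	return true;
}
```

Note that the two calls are paired across threads through the flag, not in the control flow of one thread, which is why enter/exit reads misleadingly here.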


findroot: double match for boot device (was: Autoconfigured RAIDframe raid* numbering)

2021-09-02 Thread Edgar Fuß
I do know that, but the warning seems to be new.
It didn't appear before, but I had -A root (which now is force) before.


Re: Autoconfigured RAIDframe raid* numbering

2021-09-02 Thread Edgar Fuß
> > Additionally, I got
> > WARNING: findroot: double match for boot device (sd4, sd5)
> > (where sd4a/5a are raid2's components) before
> > boot device: raid2
> > root on raid2a dumps on raid2b
> > What does that mean?
> 
> Is this with -current newer than
> 
>https://mail-index.netbsd.org/source-changes/2021/08/28/msg131862.html
> 
> ?
No, 8.2_STABLE.


Autoconfigured RAIDframe raid* numbering

2021-09-02 Thread Edgar Fuß
If I have a number of autoconfigured RAIDframe sets on one machine, is there 
any guarantee which raid* number a set gets assigned? Is that numbering 
stable even if I remove one set (in the sense of physically un-plugging the 
drives) so the components will get different sd* numbers?

I had raid0 (-A soft) for the system and raid1 (-A yes) for data. I added 
raid2 (also -A soft), transferred everything (volatile data in single user 
mode) and then booted (single user, to be safe) off one of the new components. 
To my surprise, the raid* numbering was exactly like before, i.e. my root was 
now on raid2a.

Additionally, I got
WARNING: findroot: double match for boot device (sd4, sd5)
(where sd4a/5a are raid2's components) before
boot device: raid2
root on raid2a dumps on raid2b
What does that mean?


RAIDframe: reconstructing a temporarily lost drive (was: SATA rescan)

2021-06-16 Thread Edgar Fuß
> drvctl -r -a ata_hl atabusX
OK, that (after moving to a different slot) brought the drive back again.

However, the raid had failed the missing drive (whether upon booting with 
the missing drive or shortly before the crash I can't tell). I had /dev/wd0a 
optimal plus component1 failed. I guess there's no way to teach RAIDframe 
the missing drive is back (short of rebooting, which is out of the question)?

I added /dev/wd1a as a spare and failed component1, which worked, but maybe 
there's a more elegant way?


SATA rescan?

2021-06-15 Thread Edgar Fuß
Is there a way (short of re-booting) to re-scan a SATA port for a disc absent 
(or dysfunctional) during the boot? I.e., something like scsictl rescan?


Re: panic in iic_search()

2021-06-15 Thread Edgar Fuß
This is another place where I have local patches in my tree that haven't been 
integrated (see kern/55745).
This is a regression in all "supported" versions of NetBSD (until -11 is 
released) rendering I2C inoperable on popular hardware.


8.x pmap fixes (was: boot -d)

2021-06-15 Thread Edgar Fuß
> Here they are (for netbsd-8). I can boot -d with them [...]
I just noticed that I still have these patches locally.
Any chances to get them into -8? Should I file a PR?


Re: timeouts connecting to pgsql database

2021-02-20 Thread Edgar Fuß
> What filesystem options are you using for wherever the database files
> are located ?
Back in the day I experienced that LFS was incredibly fast for a (MySQL) 
database.
There were problems with the cleaner crashing, though.


Re: partial failures in write(2) (and read(2))

2021-02-11 Thread Edgar Fuß
> I suppose libc could set a default handler for the new signal, and do some 
> extra work to set errno.
Then the libc routine could better use a new syscall, no?


Re: X vs serial console?

2021-02-09 Thread Edgar Fuß
> Is there any way I can test for it?
Connect something to the HDMI outputs?


Re: X vs serial console?

2021-02-09 Thread Edgar Fuß
Could it be the case that the X server expects some aspects of the video 
hardware to be initialized by the video console driver that are uninitialized 
in the serial console case? E.g., as you say outputs are shared between HDMI 
and VGA, the X video simply goes to the HDMI output?


Re: USB lockup (probably solved)

2020-12-01 Thread Edgar Fuß
Looks like I'm making progress after all.

> The change [nick] referred to was
> 
> Revision 1.254.2.76 / (download) - annotate - [select for diffs], Mon
> May 30 06:46:50 2016 UTC (4 years, 5 months ago) by skrll
> Branch: nick-nhusb
[...]
> 
> Restructure the abort code for TD based transfers (ctrl, bulk, intr).
> 
[...]

I (hopefully) adapted that to -8 and it seems to work!

I attach my adaptation of nick's work plus some additional debugging code 
I added while analyzing the issue.

So what next? File a PR?
--- ohcivar.h.orig  2020-11-30 15:31:45.755906264 +0100
+++ ohcivar.h   2020-12-01 12:12:58.463657450 +0100
@@ -1,4 +1,4 @@
-/* $NetBSD: ohcivar.h,v 1.58.10.1 2018/08/25 11:29:52 martin Exp $ */
+/* $NetBSD: ohcivar.h,v 1.55.6.15 2016/05/30 06:46:50 skrll Exp $  */
 
 /*
  * Copyright (c) 1998 The NetBSD Foundation, Inc.
@@ -50,6 +50,7 @@
ohci_td_t td;
struct ohci_soft_td *nexttd;/* mirrors nexttd in TD */
struct ohci_soft_td *dnext; /* next in done list */
+   struct ohci_soft_td **held; /* where the ref to this std is held */
ohci_physaddr_t physaddr;
usb_dma_t dma;
int offs;
@@ -71,6 +72,7 @@
ohci_itd_t itd;
struct ohci_soft_itd *nextitd;  /* mirrors nexttd in ITD */
struct ohci_soft_itd *dnext;/* next in done list */
+   struct ohci_soft_itd **held;/* where the ref to this sitd is held */
ohci_physaddr_t physaddr;
usb_dma_t dma;
int offs;
@@ -114,6 +116,8 @@
LIST_HEAD(, ohci_soft_td)  sc_hash_tds[OHCI_HASH_SIZE];
LIST_HEAD(, ohci_soft_itd) sc_hash_itds[OHCI_HASH_SIZE];
 
+   TAILQ_HEAD(, ohci_xfer) sc_abortingxfers;
+
int sc_noport;
 
int sc_endian;
@@ -128,6 +128,8 @@
int sc_flags;
 #define OHCIF_SUPERIO  0x0001
 
+   kcondvar_t sc_softwake_cv;
+
ohci_soft_ed_t *sc_freeeds;
ohci_soft_td_t *sc_freetds;
ohci_soft_itd_t *sc_freeitds;
@@ -148,6 +152,8 @@
 
 struct ohci_xfer {
struct usbd_xfer xfer;
+   uint32_t ox_abintrs;
+   TAILQ_ENTRY(ohci_xfer) ox_abnext;
/* ctrl */
ohci_soft_td_t *ox_setup;
ohci_soft_td_t *ox_stat;
--- ohci.c  2020-11-23 18:30:07.0 +0100
+++ /tmp/ohci.c 2020-11-30 18:02:27.0 +0100
@@ -1,4 +1,4 @@
-/* $NetBSD: ohci.c,v 1.273.6.6 2020/02/25 18:52:44 martin Exp $*/
+/* $NetBSD: ohci.c,v 1.254.2.76 2016/05/30 06:46:50 skrll Exp $*/
 
 /*
  * Copyright (c) 1998, 2004, 2005, 2012 The NetBSD Foundation, Inc.
@@ -41,7 +41,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: ohci.c,v 1.273.6.6 2020/02/25 18:52:44 martin Exp 
$");
+__KERNEL_RCSID(0, "$NetBSD: ohci.c,v 1.254.2.76 2016/05/30 06:46:50 skrll Exp 
$");
 
 #ifdef _KERNEL_OPT
 #include "opt_usb.h"
@@ -384,6 +384,8 @@ ohci_detach(struct ohci_softc *sc, int f
 
softint_disestablish(sc->sc_rhsc_si);
 
+   cv_destroy(&sc->sc_softwake_cv);
+
mutex_destroy(&sc->sc_lock);
mutex_destroy(&sc->sc_intr_lock);
 
@@ -492,6 +494,7 @@ ohci_alloc_std(ohci_softc_t *sc)
memset(&std->td, 0, sizeof(ohci_td_t));
std->nexttd = NULL;
std->xfer = NULL;
+   std->held = NULL;
 
return std;
 }
@@ -539,14 +542,17 @@ ohci_alloc_std_chain(ohci_softc_t *sc, s
 
DPRINTFN(8, "xfer %#jx nstd %jd", (uintptr_t)xfer, nstd, 0, 0);
 
-   for (size_t j = 0; j < ox->ox_nstd;) {
+   for (size_t j = 0; j < ox->ox_nstd; j++) {
ohci_soft_td_t *cur = ohci_alloc_std(sc);
if (cur == NULL)
goto nomem;
 
-   ox->ox_stds[j++] = cur;
+   ox->ox_stds[j] = cur;
+   cur->held = &ox->ox_stds[j];
cur->xfer = xfer;
cur->flags = 0;
+   DPRINTFN(10, "xfer=%#jx new std=%#jx held at %#jx", ox, cur,
+   cur->held, 0);
}
 
return 0;
@@ -788,6 +794,7 @@ ohci_init(ohci_softc_t *sc)
 
mutex_init(&sc->sc_lock, MUTEX_DEFAULT, IPL_SOFTUSB);
mutex_init(&sc->sc_intr_lock, MUTEX_DEFAULT, IPL_USB);
+   cv_init(&sc->sc_softwake_cv, "ohciab");
 
sc->sc_rhsc_si = softint_establish(SOFTINT_USB | SOFTINT_MPSAFE,
ohci_rhsc_softint, sc);
@@ -797,6 +804,8 @@ ohci_init(ohci_softc_t *sc)
for (i = 0; i < OHCI_HASH_SIZE; i++)
LIST_INIT(&sc->sc_hash_itds[i]);
 
+   TAILQ_INIT(&sc->sc_abortingxfers);
+
sc->sc_xferpool = pool_cache_init(sizeof(struct ohci_xfer), 0, 0, 0,
"ohcixfer", NULL, IPL_USB, NULL, NULL, NULL);
 
@@ -1334,12 +1343,26 @@ ohci_intr1(ohci_softc_t *sc)
 */
softint_schedule(sc->sc_rhsc_si);
}
+   if (eintrs & OHCI_SF) {
+   struct ohci_xfer *ox, *tmp;
+   TAILQ_FOREACH_SAFE(ox, &sc->sc_abortingxfers, ox_abnext, tmp) {
+   DPRINTFN(10, "SF %#jx xfer %#jx", (uintptr_t)sc, 
(uintptr_t)ox, 0, 0);
+   ox->ox_abintrs &= 

Re: USB lockup

2020-11-28 Thread Edgar Fuß
I looked into the usbhist now.

> is something being aborted?
Yes.

> I guess the E20 TD got written out with incorrect next_td, or some other
> error condition caused the mixup.
I think the only sane explanation absent a controller bug is that, at the 
time the HC finished E20, HcDoneHead was FA0 (41088FA0, really; note there 
are other TDs xxxFA0, which I ignore here). Since the "real" FA0 (which is 
the "tail" part of the transfer just initiated comes after E20, the HC must 
have finished a "different" FA0. As I added checks for a second TD being 
queued with the same physaddr that didn't fire, the HCD must have previously 
dequeued the FA0 TD. So my guess is that the HC and the HCD, prior before 
the E20->EE0->FA0->F40->0 chain is queued, disagree on whether FA0 is still 
active (i.e. under the HCs control) or not (i.e. under the HCDs control).

> The change I referred to was
Would you expect that change (maybe after some munging) to work with -8?

> In PR/22646 some TDs can be on the done queue when the abort start and,
If that "done queue" is in the HcDoneHead sense, not the HccaDoneHead sense, 
then, I guess, it would exactly fit what I think is going on.

> if this is the case, they need to processed after the WDH interrupt.
> Instead of waiting for WDH we release TDs that have been touched by the
> HC and replace them with new ones.  Once WDH happens the floating TDs
> will be returned to the free list.
I still need to understand how exactly that works. Could you give me a hint 
what these "referenced by" pointers are needed for?
My first idea would be to, when aborting, not to actually de-queue the std's, 
but just mark them as aborted (in the std outside the td, of course) and, 
when encountering such a std in the softint, just de-queue and otherwise 
skip them. I'm afraid I'm missing something and that won't work.


Re: USB lockup

2020-11-27 Thread Edgar Fuß
> Really hard to help without seeing the full ohcidebug usbhist log.
I replaced the panic with a break out of the done loop.

Find attached my diff plus the usbhist from where I first started the 
offending command (which locks up the second time called).

I didn't look into the log myself yet.
Index: ohcivar.h
===
RCS file: /cvsroot/src/sys/dev/usb/ohcivar.h,v
retrieving revision 1.58.10.1
diff -u -r1.58.10.1 ohcivar.h
--- ohcivar.h   25 Aug 2018 11:29:52 -  1.58.10.1
+++ ohcivar.h   27 Nov 2020 12:08:49 -
@@ -59,6 +59,9 @@
uint16_t flags;
 #define OHCI_CALL_DONE 0x0001
 #define OHCI_ADD_LEN   0x0002
+#ifdef OHCI_DEBUG
+   int beenthere;  /* loop detection */
+#endif
 } ohci_soft_td_t;
 #define OHCI_STD_SIZE ((sizeof(struct ohci_soft_td) + OHCI_TD_ALIGN - 1) / 
OHCI_TD_ALIGN * OHCI_TD_ALIGN)
 #define OHCI_STD_CHUNK 128
@@ -75,6 +78,9 @@
struct usbd_xfer *xfer;
uint16_t flags;
bool isdone;/* used only when DIAGNOSTIC is defined */
+#ifdef OHCI_DEBUG
+   int beenthere;  /* loop detection */
+#endif
 } ohci_soft_itd_t;
 #define OHCI_SITD_SIZE ((sizeof(struct ohci_soft_itd) + OHCI_ITD_ALIGN - 1) / 
OHCI_ITD_ALIGN * OHCI_ITD_ALIGN)
 #define OHCI_SITD_CHUNK 64
Index: ohci.c
===
RCS file: /cvsroot/src/sys/dev/usb/ohci.c,v
retrieving revision 1.273.6.6
diff -u -r1.273.6.6 ohci.c
--- ohci.c  25 Feb 2020 18:52:44 -  1.273.6.6
+++ ohci.c  27 Nov 2020 12:09:01 -
@@ -230,6 +230,8 @@
 Static voidohci_dump_ed(ohci_softc_t *, ohci_soft_ed_t *);
 Static voidohci_dump_itd(ohci_softc_t *, ohci_soft_itd_t *);
 Static voidohci_dump_itds(ohci_softc_t *, ohci_soft_itd_t *);
+
+static int ohci_beenthere = 0; /* td list loop detection */
 #endif
 
 #define OBARR(sc) bus_space_barrier((sc)->iot, (sc)->ioh, 0, (sc)->sc_size, \
@@ -693,6 +695,13 @@
DPRINTFN(2, "add 0 xfer", 0, 0, 0, 0);
}
 
+#ifdef OHCI_DEBUG
+   DPRINTFN(10, "--- dump start ---", 0, 0, 0, 0);
+   if (ohcidebug >= 10)
+   ohci_dump_td(sc, sp);
+   DPRINTFN(10, "--- dump end ---", 0, 0, 0, 0);
+#endif
+
/* Last TD gets usb_syncmem'ed by caller */
*ep = cur;
 }
@@ -1410,9 +1419,25 @@
OWRITE4(sc, OHCI_INTERRUPT_ENABLE, OHCI_WDH);
 
/* Reverse the done list. */
+#ifdef OHCI_DEBUG
+   ohci_beenthere++;
+#endif
for (sdone = NULL, sidone = NULL; done != 0; ) {
+   DPRINTFN(10, "done=%#jx", (uintptr_t)done, 0, 0, 0);
std = ohci_hash_find_td(sc, done);
if (std != NULL) {
+#ifdef OHCI_DEBUG
+   if (ohcidebug >= 10) 
+   ohci_dump_td(sc, std);
+   if (std->beenthere == ohci_beenthere) {
+   DPRINTFN(1, "circular sdone: %#jx->%#jx", 
(uintptr_t)sdone, (uintptr_t)std, 0, 0);
+#if 0
+   panic("circular sdone");
+#endif
+   break;
+   }
+   std->beenthere = ohci_beenthere;
+#endif
usb_syncmem(&std->dma, std->offs, sizeof(std->td),
BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD);
std->dnext = sdone;
@@ -1423,6 +1448,20 @@
}
sitd = ohci_hash_find_itd(sc, done);
if (sitd != NULL) {
+#ifdef OHCI_DEBUG
+/* XXX no ohci_dump_itd() yet
+   if (ohcidebug >= 10) 
+   ohci_dump_itd(sc, sitd);
+*/
+   if (sitd->beenthere == ohci_beenthere) {
+   DPRINTFN(1, "circular sidone: %#jx->%#jx", 
(uintptr_t)sidone, (uintptr_t)sitd, 0, 0);
+#if 0
+   panic("circular sidone");
+#endif
+   break;
+   }
+   sitd->beenthere = ohci_beenthere;
+#endif
usb_syncmem(&sitd->dma, sitd->offs, sizeof(sitd->itd),
BUS_DMASYNC_POSTWRITE | BUS_DMASYNC_POSTREAD);
sitd->dnext = sidone;
@@ -1445,6 +1484,7 @@
for (std = sdone; std; std = std->dnext)
ohci_dump_td(sc, std);
}
+/* XXX dump sidone list */
 #endif
DPRINTFN(10, "--- TD dump end ---", 0, 0, 0, 0);
 
@@ -1838,6 +1878,15 @@
 
KASSERT(sc->sc_bus.ub_usepolling || mutex_owned(&sc->sc_lock));
 
+#ifdef OHCI_DEBUG
+   for (ohci_soft_td_t *std2 = LIST_FIRST(&sc->sc_hash_tds[h]);
+std2 != NULL;
+std2 = LIST_NEXT(std2, hnext)) {
+   if (std2->physaddr == std->physaddr)
+   panic("OHCI: duplicate physaddr");
+   }
+#endif
+
LIST_INSERT_HEAD(&sc->sc_hash_tds[h], std, hnext);
 }
 
@@ -1945,7 

Re: USB lockup

2020-11-26 Thread Edgar Fuß
Thanks a lot for looking into this!

> Really hard to help without seeing the full ohcidebug usbhist log.
The problem is that file system (or block I/O) seems to lock up so the 
usbhist is hard to get out of the machine other than by camera. 
I guess dump-ing will take ages to complete (16G RAM).
I could try to replace my panic with simply writing something to usbhist 
and aborting the loop.

> I guess the E20 TD got written out with incorrect next_td, or some other
> error condition caused the mixup.
You mean nexttd or td_nexttd? As far as I can tell, neither field is touched 
by the driver without being ohci_dump_td()'d afterwards, and, as I wrote, 
minus the loopback td_nexttd, everything is exactly as one would expect.

> The change I referred to was
I'll have a look into that one tomorrow.

> is something being aborted?
May well be. I haven't checked yet.

My feeling is that this is either a controller error or some sort of 
DMA/cache/barrier/whatever race during the HccaDoneHead manipulation. 
But I'm steadily confused by the writing-a-1-clears-the-bit or 
writing-a-1-sets-the-bit semantics of the registers and know nothing about 
all these cache/barrier/re-ordering issues other than that they may exist.
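
For the register semantics, a toy model may help. This is only a sketch, assuming the usual OHCI arrangement as I understand it: the interrupt status register is write-1-to-clear, while the enable state is changed through a pair of registers, one where writing a 1 sets the bit and one where writing a 1 clears it. The struct and function names are made up for illustration and do not come from ohci.c.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model of the host controller's interrupt state. */
struct toy_hc {
	uint32_t status;	/* pending interrupt bits */
	uint32_t enable;	/* enabled interrupt bits */
};

/* Status register: writing a 1 CLEARS (acknowledges) that bit. */
void
toy_write_status(struct toy_hc *hc, uint32_t v)
{
	assert(hc != 0);
	hc->status &= ~v;
}

/* Enable register: writing a 1 SETS that enable bit; 0 bits do nothing. */
void
toy_write_enable(struct toy_hc *hc, uint32_t v)
{
	assert(hc != 0);
	hc->enable |= v;
}

/* Disable register: writing a 1 CLEARS that enable bit; 0 bits do nothing. */
void
toy_write_disable(struct toy_hc *hc, uint32_t v)
{
	assert(hc != 0);
	hc->enable &= ~v;
}
```

The practical consequence is that none of these registers should be updated with a read-modify-write: writing back the value just read from the status register would acknowledge every pending interrupt at once.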

The one nice thing is that the lock-up is easily and 100% reproducible. 
If only these PeCee boxes wouldn't take ages to reboot.


Re: USB lockup

2020-11-26 Thread Edgar Fuß
> Add a check to ohci_softintr to see if the list goes circular and enter
> ddb / dump usbhist when it does...
I already did add a panic and it fired.

I'm still trying to find out how that happens.

What I'm seeing (dumped by device_ctrl_start()) is a chain of four TDs 
(named here after their addresses' three least significant nybbles):
E20->EE0->FA0->F40->0
which are linked in that sense by both nexttd and td.td_nexttd.

Then, in ohci_softint(), the done queue is (as linked by td.nexttd):
FA0->EE0->E20->FA0->...
and, as expected, the nexttd links are as before.
Absent the E20->FA0 link, that's exactly what one would expect if the first 
three TDs have been handled (the done list is most recently done first); 
the big question is where that additional link comes from.

I've added code to ohci_hash_add_td() to catch a TD being added with a 
physical address already present in the hash list, but that didn't fire.
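
The duplicate check described above can be sketched in user space roughly as follows. This is a simplified model, not the code from ohci.c: the toy_ names, the bucket count and the hash function are assumptions, and it reports a duplicate through its return value where the kernel version panics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_HASH_SIZE 128

struct toy_td {
	uint32_t physaddr;	/* address the controller knows the TD by */
	struct toy_td *hnext;	/* next TD in the same hash bucket */
};

struct toy_hash {
	struct toy_td *bucket[TOY_HASH_SIZE];
};

/* Hypothetical hash function; only needs to be deterministic here. */
static unsigned
toy_hash_index(uint32_t physaddr)
{
	return (physaddr / 64) % TOY_HASH_SIZE;
}

/*
 * Add td to the hash, refusing a second TD with the same physical
 * address. Returns false on a duplicate instead of panicking.
 */
bool
toy_hash_add_td(struct toy_hash *h, struct toy_td *td)
{
	unsigned i;

	assert(h != NULL && td != NULL);
	i = toy_hash_index(td->physaddr);

	for (struct toy_td *p = h->bucket[i]; p != NULL; p = p->hnext) {
		if (p->physaddr == td->physaddr)
			return false;
	}
	td->hnext = h->bucket[i];
	h->bucket[i] = td;
	return true;
}
```

A check like this only catches two TDs being in the hash at the same time; it says nothing about a TD that was removed from the hash while the controller still held a reference to it.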


Re: USB lockup

2020-11-24 Thread Edgar Fuß
I guess there's something different going on. Unless I'm mistaken, 
the list is circular in the td_nexttd sense, but not in the nexttd sense.


Re: USB lockup

2020-11-24 Thread Edgar Fuß
> so the td list must have gone circular, no?
It's indeed circular (in the td_nexttd sense), as additionally inserted 
debugging output revealed. It also happens in uniprocessor (boot -1) mode.


Re: USB lockup

2020-11-23 Thread Edgar Fuß
> So, during the partial lockup, I see
>   ohci_softintr#63@0: add TD 0x80013ec2de20
>   ohci_softintr#63@0: add TD 0x80013ec2dea0
that's  ohci_softintr#63@0: add TD 0x80013ec2dfa0
>   ohci_softintr#63@0: add TD 0x80013ec2dee0
So I think it's endlessly looping in the "Reverse the done list." loop 
in ohci_softintr(), so the td list must have gone circular, no?
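
One cheap way to confirm a circular list while reversing it is to stamp every element with a per-pass generation number and stop when an element already carries the current stamp. The following is a user-space sketch of that idea only; the toy_ names and the struct layout are assumptions and not the driver's data structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct toy_td {
	struct toy_td *hw_next;	/* link as written by the controller */
	struct toy_td *dnext;	/* link in the reversed software list */
	unsigned beenthere;	/* pass in which this TD was last seen */
};

static unsigned toy_generation;	/* bumped once per reversal pass */

/*
 * Reverse the list starting at done and store the new head in
 * *reversed. Returns true if the end of the list was reached, false
 * if some TD was encountered twice, i.e. the list is circular.
 * Wrap-around of the generation counter is ignored in this sketch.
 */
bool
toy_reverse_done(struct toy_td *done, struct toy_td **reversed)
{
	struct toy_td *sdone = NULL;
	bool ok = true;

	assert(reversed != NULL);
	toy_generation++;

	while (done != NULL) {
		struct toy_td *next;

		if (done->beenthere == toy_generation) {
			ok = false;	/* seen in this pass already */
			break;
		}
		done->beenthere = toy_generation;

		next = done->hw_next;
		done->dnext = sdone;
		sdone = done;
		done = next;
	}
	*reversed = sdone;
	return ok;
}
```

Compared with clearing a "visited" flag on every element before each pass, the generation counter needs no second walk over the list.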


Re: USB lockup

2020-11-23 Thread Edgar Fuß
> The ddb backtrace usually is
> bus_space_read_4()
> bintime()
> ohci_softintr()
> usb_soft_intr()
> softint_dispatch()
> 
> The system call causing the lock-up is a USB_DEVICEINFO ioctl on /dev/usb0 
> with udi_addr=2, which corresponds to ugen0.
I tried a -current kernel from nyftp today, and it locks up the same way.


USB lockup (was: ktrace-ing a command that locks up the machine)

2020-11-20 Thread Edgar Fuß
> Hmmm, this was usb, right?
Yes.

> Maybe turn on options USBHIST (and/or EHCIHIST, OHCIHIST, UHCIHIST,
> XHCIHIST).  None of these seem to be described in options(4) man
> page, but you can dump the debug data using ``vmstat -u histname''.
> And get a list of the actual histname's with ``vmstat -l''
Oh, thanks, I didn't know of that. I don't even need any further options.

So, during the partial lockup, I see
ohci_softintr#63@0: add TD 0x80013ec2de20
ohci_softintr#63@0: add TD 0x80013ec2dea0
ohci_softintr#63@0: add TD 0x80013ec2dee0
at .01 intervals.

The ddb backtrace usually is
bus_space_read_4()
bintime()
ohci_softintr()
usb_soft_intr()
softint_dispatch()

The system call causing the lock-up is a USB_DEVICEINFO ioctl on /dev/usb0 
with udi_addr=2, which corresponds to ugen0.

Any hints how to debug this further? I tried a DIAGNOSTIC+DEBUG+LOCKDEBUG 
kernel, but it didn't complain. The strange thing is that not only USB 
locks up, but any file system operation seems to stall, too. No, these 
are not USB discs.


USB debugging (was: ktrace-ing a command that locks up the machine)

2020-11-18 Thread Edgar Fuß
On Wed, Nov 18, 2020 at 09:05:47AM -0500, Greg Troxel wrote:
> another suggestion is to enable USB debugging in the kernel and use a serial 
> console (or even just framebuffer) to see the last message before crash.
I set options {USB,OHCI,EHCI}_DEBUG and sysctl -w hw.{usb,ohci,ehci}.debug=20 
and get zero output. What the hell am I missing?


Re: ktrace-ing a command that locks up the machine

2020-11-18 Thread Edgar Fuß
> ktrace over NFS.
That would be -- eh -- somewhat involved. I doubt it will work given that 
writing to an FS mounted -o sync gives an empty file.


Re: ktrace-ing a command that locks up the machine

2020-11-18 Thread Edgar Fuß
> Suggestion: put the ktrace file on a filesystem mounted -o sync.
That (with ktrace -s) gave me an empty file.


ktrace-ing a command that locks up the machine

2020-11-18 Thread Edgar Fuß
So after fixing kern/53311 and kern/55745 on -8, I'm back to one nesting 
level down my original task.

I have a command that (when run the second time and with certain USB devices 
connected) will irreversibly (to me) partly (no console switching) lock up 
the machine. I need to enter DDB and reboot.

I would like to ktrace/ktruss the command to see which USB transfer exactly 
is the one that hangs. However, even with ktrace -s, there is no trace file 
after the re-boot (on FFS/WAPBL); I can't tell whether it exists before the 
reboot. Using ktruss, the last trace output to the console is way behind the 
execution.

I would like to avoid GDB single stepping through libusb.

Any ideas?

The process is somewhat tedious because these wonderful 
grandgrandson-of-IBM-PCs take some 85 seconds from the reboot command to 
the primary boot.


USB lock-ups

2020-11-16 Thread Edgar Fuß
Hello again.

So after backporting the -current pmap fixes to -8 in order to be able to 
boot -d, in order to be able to examine I2C panics, and after fixing those, 
I have an operational -8 machine again, only to find that the USB problems 
that made me update are still there.

The simplest libusb program (I tried to get myself acquainted with libusb) will 
lock up the machine if run the second time. The only trace I have is (once)
ohci0: WARNING: addr 0x41088dc0 not found.
The machine becomes (at least) unresponsive to virtual console switches, 
most times, entering DDB works; backtrace is
x86_memfence()
usb_soft_intr()
softint_dispatch()
or
bus_dmamap_sync()
ohci_softintr()
usb_soft_intr()
softint_dispatch()

When I looked, I had most processes in tstile.

Any hints? Another broken pull-up?
#include <err.h>
#include <stdio.h>
#include <usb.h>

struct usb_bus *bus;
struct usb_device *dev;
usb_dev_handle *udev;

int main(int argc, char *argv[]) {
	puts("init");
	usb_init();
	puts("find_busses");
	usb_find_busses();
	puts("find_devices");
	usb_find_devices();

	/* walk all busses and open/close every device found on them */
	for (bus = usb_busses; bus; bus = bus->next) {
		puts(bus->dirname);
		for (dev = bus->devices; dev; dev = dev->next) {
			printf("%d: %s\n", dev->devnum, dev->filename);
			udev = usb_open(dev);
			if (!udev) {
				warnx("usb_open: %s", usb_strerror());
				continue;
			}
			printf("%0x %0x %0x\n", dev->descriptor.idVendor,
			    dev->descriptor.idProduct, dev->descriptor.bcdDevice);
#if 0
			if (usb_claim_interface(udev, 0) < 0) {
				errx(1, "usb_claim: %s", usb_strerror());
			}
#endif
			usb_close(udev);
		}
	}
	return 0;
}


Re: boot -d

2020-11-16 Thread Edgar Fuß
> So there seems to be something seriously amiss with I2C on -8 (and -9).
After fixing that, it boots again (with the adopted pmap changes).
Nevertheless, someone should review them, of course.


Re: boot -d

2020-11-16 Thread Edgar Fuß
> Why not take spdmem out of your kernel config for now and test the
> pmap patches ?
It then panics in dbcool_chip_ident(). So there seems to be something seriously 
amiss with I2C on -8 (and -9).


Re: boot -d

2020-11-13 Thread Edgar Fuß
> Why not take spdmem out of your kernel config for now and test the
> pmap patches ?
Yes, could do that next week (ENOTIME currently). Anything special to test? 
I've no idea what the code does resp. when it gets used.


Re: boot -d

2020-11-13 Thread Edgar Fuß
> I've backported the fixes, will post them later.
Here they are (for netbsd-8). I can boot -d with them, but because of the 
spdmem panics, I can't tell whether the machine would run with them.
Someone(TM) should review them and request a pullup, please.
Not sure what to do with the __KERNEL_RCSID strings.
Index: sys/arch/x86/include/pmap.h
===
RCS file: /cvsroot/src/sys/arch/x86/include/pmap.h,v
retrieving revision 1.64.6.2
diff -u -r1.64.6.2 pmap.h
--- sys/arch/x86/include/pmap.h 22 Mar 2018 16:59:04 -  1.64.6.2
+++ sys/arch/x86/include/pmap.h 13 Nov 2020 14:59:01 -
@@ -1,4 +1,4 @@
-/* $NetBSD: pmap.h,v 1.64.6.2 2018/03/22 16:59:04 martin Exp $ */
+/* $NetBSD: pmap.h,v 1.100 2019/03/10 16:30:01 maxv Exp $  */
 
 /*
  * Copyright (c) 1997 Charles D. Cranor and Washington University.
@@ -291,7 +291,8 @@
pd_entry_t * const **);
 void   pmap_unmap_ptes(struct pmap *, struct pmap *);
 
-intpmap_pdes_invalid(vaddr_t, pd_entry_t * const *, pd_entry_t *);
+bool   pmap_pdes_valid(vaddr_t, pd_entry_t * const *, pd_entry_t *,
+   int *lastlvl);
 
 u_int  x86_mmap_flags(paddr_t);
 
@@ -342,12 +343,6 @@
  * inline functions
  */
 
-__inline static bool __unused
-pmap_pdes_valid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde)
-{
-   return pmap_pdes_invalid(va, pdes, lastpde) == 0;
-}
-
 /*
  * pmap_update_pg: flush one page from the TLB (or flush the whole thing
  * if hardware doesn't support one-page flushing)
Index: sys/arch/x86/x86/pmap.c
===
RCS file: /cvsroot/src/sys/arch/x86/x86/pmap.c,v
retrieving revision 1.245.6.6
diff -u -r1.245.6.6 pmap.c
--- sys/arch/x86/x86/pmap.c 22 Mar 2018 16:59:04 -  1.245.6.6
+++ sys/arch/x86/x86/pmap.c 13 Nov 2020 15:37:49 -
@@ -28,6 +28,7 @@
  * ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
  * POSSIBILITY OF SUCH DAMAGE.
  */
+/* $NetBSD: pmap.c,v 1.330 2019/03/10 16:30:01 maxv Exp $  */
 
 /*
  * Copyright (c) 2007 Manuel Bouyer.
@@ -171,7 +172,7 @@
  */
 
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: pmap.c,v 1.245.6.6 2018/03/22 16:59:04 martin Exp 
$");
+__KERNEL_RCSID(0, "$NetBSD: pmap.c,v XXX $");
 
 #include "opt_user_ldt.h"
 #include "opt_lockdebug.h"
@@ -3059,22 +3060,28 @@
  * some misc. functions
  */
 
-int
-pmap_pdes_invalid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde)
+bool
+pmap_pdes_valid(vaddr_t va, pd_entry_t * const *pdes, pd_entry_t *lastpde,
+int *lastlvl)
 {
-   int i;
unsigned long index;
pd_entry_t pde;
+   int i;
 
for (i = PTP_LEVELS; i > 1; i--) {
index = pl_i(va, i);
pde = pdes[i - 2][index];
-   if ((pde & PG_V) == 0)
-   return i;
+   if ((pde & PG_V) == 0) {
+   *lastlvl = i;
+   return false;
+   }
+   if (pde & PG_PS)
+   break;
}
if (lastpde != NULL)
*lastpde = pde;
-   return 0;
+   *lastlvl = i;
+   return true;
 }
 
 /*
@@ -3092,6 +3099,7 @@
paddr_t pa;
lwp_t *l;
bool hard, rv;
+   int lvl;
 
 #ifdef __HAVE_DIRECT_MAP
if (va >= PMAP_DIRECT_BASE && va < PMAP_DIRECT_END) {
@@ -3108,8 +3116,8 @@
 
kpreempt_disable();
ci = l->l_cpu;
-   if (__predict_true(!ci->ci_want_pmapload && ci->ci_pmap == pmap) ||
-   pmap == pmap_kernel()) {
+   if (pmap == pmap_kernel() ||
+   __predict_true(!ci->ci_want_pmapload && ci->ci_pmap == pmap)) {
/*
 * no need to lock, because it's pmap_kernel() or our
 * own pmap and is active.  if a user pmap, the caller
@@ -3126,14 +3134,17 @@
hard = true;
pmap_map_ptes(pmap, &pmap2, &ptes, &pdes);
}
-   if (pmap_pdes_valid(va, pdes, &pde)) {
-   pte = ptes[pl1_i(va)];
-   if (pde & PG_PS) {
+   if (pmap_pdes_valid(va, pdes, &pde, &lvl)) {
+   if (lvl == 2) {
pa = (pde & PG_LGFRAME) | (va & (NBPD_L2 - 1));
rv = true;
-   } else if (__predict_true((pte & PG_V) != 0)) {
-   pa = pmap_pte2pa(pte) | (va & (NBPD_L1 - 1));
-   rv = true;
+   } else {
+   KASSERT(lvl == 1);
+   pte = ptes[pl1_i(va)];
+   if (__predict_true((pte & PG_V) != 0)) {
+   pa = pmap_pte2pa(pte) | (va & (NBPD_L1 - 1));
+   rv = true;
+   }
}
}
if (__predict_false(hard)) {
@@ -3552,6 +3563,7 @@
vaddr_t blkendva, va = sva;
struct vm_page *ptp;
struct 

Re: boot -d

2020-11-13 Thread Edgar Fuß


> On 12.11.2020 at 20:41, Andreas Gustafsson wrote:
> 
> It's probably easier to revert src/sys/arch/x86/x86/db_memrw.c 1.6
I've backported the fixes, will post them later.


Re: boot -d

2020-11-12 Thread Edgar Fuß
> It's probably easier to revert src/sys/arch/x86/x86/db_memrw.c 1.6.
As far as I understood (which may well be wrong) the fixes fixed a real 
problem that only surfaced on that change by chance and might have other 
consequences?


Re: boot -d

2020-11-12 Thread Edgar Fuß
> This looks like PR 53311.
Ah, thanks!

> The commit where that problem started (src/sys/arch/x86/x86/db_memrw.c 1.6) 
> was pulled up to the -8 branch, and apparently the commits that fixed it 
> were not.
I currently seem to attract pull-ups that mess up things.

I had a look at the relevant commits
src/sys/arch/x86/include/pmap.h 1.100
src/sys/arch/x86/x86/pmap.c 1.330
src/sys/arch/xen/x86/xen_pmap.c 1.31
but unfortunately am unable to back-port the second one to -8.
I know nothing about pmap, and the -current version uses PTE_P and PTE_PS 
while the -8 version uses PG_V/nothing.

Could someone in the know port these fixes to -8, please? Or guide me?
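
For orientation, the shape of the change in the backported pmap_pdes_valid() (as it appears in the netbsd-8 diff posted in this thread) can be modelled in user space. The directory walk stops early when it meets a large-page entry and reports the level at which it stopped. This is a toy sketch only: the toy_ names are invented, the array stands in for the already-indexed directory entries, and the bit values are assumptions chosen to resemble the x86 valid and page-size bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_LEVELS	4
#define TOY_PG_V	0x01u	/* entry is valid (assumed bit value) */
#define TOY_PG_PS	0x80u	/* entry maps a large page (assumed) */

/*
 * entry[i - 2] is the directory entry covering the address at level i,
 * for i = TOY_LEVELS down to 2. On success, *lastlvl is the level of
 * the mapping: 1 for an ordinary page, higher if the walk stopped at a
 * large-page entry. On failure, *lastlvl is the level of the first
 * invalid entry.
 */
bool
toy_pdes_valid(const uint64_t entry[TOY_LEVELS - 1], uint64_t *lastpde,
    int *lastlvl)
{
	uint64_t pde = 0;
	int i;

	assert(entry != NULL && lastlvl != NULL);

	for (i = TOY_LEVELS; i > 1; i--) {
		pde = entry[i - 2];
		if ((pde & TOY_PG_V) == 0) {
			*lastlvl = i;
			return false;
		}
		if (pde & TOY_PG_PS)
			break;	/* large page: nothing below this level */
	}
	if (lastpde != NULL)
		*lastpde = pde;
	*lastlvl = i;
	return true;
}
```

A caller can then only look at the level-1 page table when the reported level is 1, instead of reading a PTE first and checking for a large page afterwards.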


boot -d

2020-11-12 Thread Edgar Fuß
Hello again.

In about the third nesting level of what I wanted to do in the first place, 
I tried "boot netbsd -d" in the secondary boot. It loads the kernel, then 
complains about the ffs module missing (I don't use modules and don't have 
an 8.2 directory on that machine), clears the screen, displays 
"fatal breakpoint in supervisor mode" and re-boots.

The problem is that the interesting messages are displayed only for a 
fraction of a second. In one out of three tries, I was able to catch them 
(partly) using the "slomo" (i.e. high speed) video recording mode of my 
iPhone, but of the line after the "fatal breakpoint" message, only the 
top half is displayed before it is cleared, so it's very hard to read the 
interesting parts.

Any hints?


panic in iic_search()

2020-11-11 Thread Edgar Fuß
I have an AMD64 server running 8/amd64, which ran happily (other than USB 
issues, which is another story) with 8.1_STABLE from September 2019.
I updated to netbsd-8 from yesterday (so that's 8.2_STABLE) and a newly 
compiled kernel crashes in iic_search(). The last line printed before that is:
iic0 at piixpm0: I2C bus
With the working kernel, the next line is:
spdmem0 at iic0 addr 0x50: NT4GC72B4NA1NL-CG

Obviously, I have the spdmem* at iic? addr 0xxx lines uncommented in my config.

The panic is:
uvm_fault(0x90afec40, 0x0, 4) -> e
fatal page fault in supervisor mode
trap type 6 code 0x10 rip 0 cs 0x8 rflags 0x10246 cr2 0 ilevel 0x8 rsp 
0x80d4f485
curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at  0:uvm_fault(0x80afec40, 0x7fbfc000, 
1) -> e
fatal page fault in supervisor mode
trap type 6 code 0 rip 0x80d4f070 cs 0x8 rflags 0x10216 cr2 
0x7fbfc000 ilevel 0x8 rsp 0x80d4f070
curlwp 0x80a1b600 pid 0.1 lowest kstack 0x80d4c2c0
kernel: page fault trap, code=0
Stopped in pid 0.1 (system) at  netbsd:db_disasm+0x65:  testb   
$0x1,0(%rdx,%rcx,8)

Backtrace:
db_disasm() at netbsd:db_disasm+0x65
db_trap() at netbsd:db_trap+0xf4
kpd_trap() at netbsd:kpd_trap+0xe2
trap() at netbsd:trap+0x5d6
-- trap (number 6) ---
?() at 0
iic_search() at netbsd:iic_search+0x92
mapply() at netbsd:mapply+0x39
config_search_loc() at netbsd:config_search_loc+0xaf
iic_attach() at netbsd:iic_attach+0x4cd
config_attach_loc() at netbsd:config_attach_loc+0x19c
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
piixpm_rescan() at netbsd:piixpm_rescan+0xed
piixpm_attach() at netbsd:piixpm_attach+0x1e7
config_attach_loc() at netbsd:config_attach_loc+0x19c
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
pci_probe_device() at netbsd:pci_probe_device+0x57e
pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x198
pciattach() at netbsd:pciattach+0x198
config_attach_loc() at netbsd:config_attach_loc+0x19c
config_found_sm_loc() at netbsd:config_found_sm_loc+0x48
mp_pci_scan() at netbsd:mp_pci_scan+0x9c
mainbus_attach() at netbsd:mainbus_attach+0x2ce
config_attach_loc() at netbsd:config_attach_loc+0x19c
cpu_configure() at netbsd:cpu_configure+0x2b
main() at netbsd:main+0x2a8

Where to go from here?


Re: RAIDframe: what if a disc fails during copyback

2020-10-30 Thread Edgar Fuß
> it locks out all other non-copyback IO in order to finish the job!
Oops!

> Locking out all other IO is very poor... but if it's a small enough RAID set
> you might be able to get away with the downtime for the copyback...
Certainly not.

> You shouldn't need to reboot for this... the 'failing spared disk' and
> 'reconstruct to previous second disk' should work fine without reboot.
I still don't get this. What I have is:

Components:
   /dev/sd5a: spared
   /dev/sd6a: optimal
Spares:
   /dev/sd7a: used_spare

So what am I supposed to do from here?


Re: RAIDframe: what if a disc fails during copyback

2020-10-30 Thread Edgar Fuß
Thanks for the detailed answer.

> it's still there, and it does work, 
That's reassuring to know.

> but it's not at all performant or system-friendly.
Just how bad is it?

> If you want the components labelled nicely, give the system a reboot
Re-booting our file server is something I like to avoid.

> and behaves very poorly.
Depending on how poorly, I could probably live with it (the RAID in question 
is the small system one, not the large user data one).

> In your case, what I'd do is just fail the spare, and initiate a reconstruct
> to the original failed component.  (You still have the data on the spare if
> something goes back with the original good component.)
Hm, I guess I would need to re-boot and intervene manually in that case.
Just using the slow copyback looks preferable if it doesn't take more than 
a day.

I probably need to test this on another machine first.
I guess there's no way to initiate a reconstruction to a spare and fail 
the specified component only /after/ the reconstruction has completed, 
not before?


Re: RAIDframe: what if a disc fails during copyback

2020-10-29 Thread Edgar Fuß
There still seems to be confusion on what I did.

Let A and B be the two original components, C a spare (in the cupboard) 
and B' be B with the new firmware.

I start with A and B as the two components of a RAID-1.
Now B fails. I have a degraded RAID with A alone.
I plug in C, scan for it (scsictl scsibus0 scan all all), add it as a hot 
spare (raidctl -a C) and initiate a reconstruction (raidctl -F B).
Now I'm redundant again with A and C. Since I didn't re-boot, RAIDframe 
knows that B has failed and C is a used spare.
I now actually un-plug B, plug it into another machine, do some testing 
(verifying that it may reset on writes), install new firmware, do further 
testing (verifying it now doesn't reset on writes) and am about to 
re-plug it into the original server (which won't notice it ever disappeared 
or that B has turned into B'---as far as this question is concerned, 
I could have done all this in the original server).
What I'm now intending to do is to raidctl -B (with A, B' and C installed, 
of course). After that, I intend to raidctl -r C, then 
scsictl scsibus0 detach C and finally un-plug C and put it back into the 
cupboard again.

My question was about 1. B', 2. C or 3. A failing during the copyback.

> there was a crop of bad Seagate 500GB disks for a while and they had 
> a tendancy to fail in mass at the same time.
My working hypothesis for some five years has been that all Seagate discs 
are bad and bound to fail. We had a series of SATA 250G (the example above 
is about SAS 146K) drives that failed the same way (dozens of them), 
got most of them replaced on warranty and had the replacements failing 
the same way again.


Re: RAIDframe: what if a disc fails during copyback

2020-10-29 Thread Edgar Fuß
> So you have drives A, B, and C.  A and B were live.  Let's say B is the
> one that failed.  You reconstructed onto C and have been running with A
> and C.
Yes.

> Now you have a new B (which in this case is the same hardware with new
> firmware) and want to put it back into service.  I'm not sure whether
> you want to put it into service in place of A or in place of C.  I'm
> going to assume C.
Yes.

> So, you'd pull C, replace it with B
No. I don't pull C. I re-add B (I have lots of empty slots).

> and initiate a reconstruct
No, a copyback (raidctl -B).

> which for RAID 1 means copying from A to B.
I don't know. I would expect it to copy from C to B.

> > 1. The replaced component fails
> 
> Is this B?  Or C?  Because it sounds to me as though C would be out of
> service at this point.
I mean B.

> > 2. The spare fails
> 
> Which is "the spare"?
C.

> Are you running with a hot spare?
Yes. I added C as a hot spare when B failed and started a reconstruction.

> I think a hot spare failing means nothing until/unless RAIDframe 
> tries to fall back on it.
Yes.

> > 3. The other, non-replaced component fails?
> 
> That would be A?
Yes.

> Based on the assumption that RAIDframe RAID 1 cannot handle more than 
> two drives (always true as far as I know, and the 9.0 raidctl(8) manpage 
> says it's still true as of 9.0)
The RAID-1 I'm speaking of does only have two components, but I did operate 
a RAIDframe RAID-1 on three components with 5.1 or something.


RAIDframe: what if a disc fails during copyback

2020-10-29 Thread Edgar Fuß
(I could probably direct this question to oster@ instead of tech-kern@)

In a RAIDframe RAID-1, a disc failed and I reconstructed on a spare.
Now I want to replace the failed component (actually by the same disc, 
which needed a firmware update) and want to copyback to it.
How will RAIDframe behave if, during the copyback:
1. The replaced component fails
2. The spare fails
3. The other, non-replaced component fails?

Specifically: Is there any scenario (other than more than one disc failing) 
that will put the RAID into a non-redundant state? I guess 3. may?


Re: fsck updating but not fixing filesystem

2020-08-24 Thread Edgar Fuß
> I have a reasonably large ffs filesystem (7.4GB, 35,459,874 files)
I guess you mean 7.4TB?

I remember (shudder) something similar, where the file server would panic 
(bad dir), fsck would fix some dirs (missing . or ..), the file server 
would panic ... rinse and repeat.

Slightly short of me performing dump-newfs-restore, the problem disappeared.
I never found out what was wrong.

I think the general consensus is that ffs can be inconsistent in ways fsck 
is unable to detect.


Re: SIGCHLD and sigaction()

2020-08-16 Thread Edgar Fuß
> I don't understand what problem queued SIGCHLD was invented to address.
My impression is that it allows you to get notified of state changes of your 
child processes. If one signal could announce several state changes, how 
would you know what these state changes are?
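
To illustrate what "being notified of a state change" looks like in practice, 
here is a minimal sketch, not taken from the thread: a SIGCHLD handler 
installed with SA_SIGINFO records which child changed state and how. The 
names on_sigchld, observe_one_child and the seen_* variables are made up 
for this example. With a single child there is exactly one state change, 
so coalescing of signals does not come into play here.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical bookkeeping, filled in by the handler. */
static volatile sig_atomic_t got_signal;
static volatile sig_atomic_t seen_pid, seen_code, seen_status;

/* SA_SIGINFO-style handler: siginfo_t describes the state change. */
static void on_sigchld(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	seen_pid = info->si_pid;	/* which child */
	seen_code = info->si_code;	/* CLD_EXITED, CLD_KILLED, ... */
	seen_status = info->si_status;	/* exit code or signal number */
	got_signal = 1;
}

/*
 * Hypothetical helper: fork one child that exits with exit_code, wait
 * for the SIGCHLD it causes, reap it. Returns the child's pid or -1.
 */
pid_t observe_one_child(int exit_code)
{
	struct sigaction sa;
	sigset_t block, old, waitmask;
	pid_t child;

	assert(exit_code >= 0 && exit_code <= 255);

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_sigchld;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Block SIGCHLD so it cannot be delivered before sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &old);
	waitmask = old;
	sigdelset(&waitmask, SIGCHLD);

	got_signal = 0;
	child = fork();
	if (child == 0)
		_exit(exit_code);
	if (child > 0) {
		while (!got_signal)
			sigsuspend(&waitmask);	/* handler runs in here */
		waitpid(child, NULL, 0);	/* reap the zombie */
	}
	sigprocmask(SIG_SETMASK, &old, NULL);
	return child;
}
```

After observe_one_child(7) returns, seen_pid holds the child's pid, 
seen_code is CLD_EXITED and seen_status is 7.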


SIGCHLD and sigaction()

2020-08-15 Thread Edgar Fuß
Another question in the context of SIGCHLD:

When I install a SIGCHLD handler via sigaction() using SA_SIGINFO, 
is it guaranteed that my handler is called (at least) once per 
death-of-a-child? There is a sentence in SUS

If SA_SIGINFO is set in sa_flags, then subsequent occurrences of sig 
generated by sigqueue() or as a result of any signal-generating function 
that supports the specification of an application-defined value (when sig 
is already pending) shall be queued in FIFO order until delivered or accepted;

that may cover this, but I don't understand it.
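
Whatever the answer is, a common defensive pattern does not rely on one 
handler call per child: the handler loops on waitpid() with WNOHANG until 
no terminated child is left. The following is a minimal sketch of that 
pattern, not a statement about what SUS guarantees; reap_children, 
spawn_and_reap and the two counters are names made up for this example.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t handler_calls;	/* SIGCHLD deliveries seen */
static volatile sig_atomic_t children_reaped;	/* children collected */

/* Reap every child that has terminated so far, not just one. */
static void reap_children(int sig)
{
	int saved_errno = errno;	/* waitpid() may clobber errno */

	(void)sig;
	handler_calls++;
	while (waitpid(-1, NULL, WNOHANG) > 0)
		children_reaped++;
	errno = saved_errno;
}

/*
 * Hypothetical helper: fork n children that exit immediately and return
 * once all of them have been reaped by the handler.
 * Returns the number of children reaped, or -1 on error.
 */
int spawn_and_reap(int n)
{
	struct sigaction sa;
	sigset_t block, old, waitmask;
	int i;

	assert(n > 0);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = reap_children;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return -1;

	/* Keep SIGCHLD blocked except while inside sigsuspend(). */
	sigemptyset(&block);
	sigaddset(&block, SIGCHLD);
	sigprocmask(SIG_BLOCK, &block, &old);
	waitmask = old;
	sigdelset(&waitmask, SIGCHLD);

	handler_calls = 0;
	children_reaped = 0;
	for (i = 0; i < n; i++) {
		pid_t pid = fork();

		if (pid == 0)
			_exit(0);
		if (pid < 0) {		/* fork failed: wait for fewer */
			n = i;
			break;
		}
	}
	while (children_reaped < n)
		sigsuspend(&waitmask);

	sigprocmask(SIG_SETMASK, &old, NULL);
	return children_reaped;
}
```

With this loop, the number of handler invocations no longer matters: 
spawn_and_reap(5) returns 5 whether the handler ran once or five times.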


Re: wait(2) and SIGCHLD

2020-08-14 Thread Edgar Fuß
1. Sample program attached. Change SIG_IGN to SIG_DFL to see the difference.

2. macOS seems to behave the same way, as does Linux.

3. I don't see where POSIX defines or allows this, but given 2., I'm surely
   missing something.

4. The wording in wait(2) could be improved to clarify this is only about 
   SIG_IGN, not SIG_DFL. At least, the NetBSD manpage mentions this at all.

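5. Every time I think I knew Unix, I learn otherwise.

#include <err.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int stat = 0;
int ret;

int main(int argc, char * argv[]) {
	/* Explicitly ignore SIGCHLD; change to SIG_DFL to see the difference. */
	signal(SIGCHLD, SIG_IGN);
	if (fork()) {
		/* Parent: try to collect the child's exit status. */
		if ((ret = wait(&stat)) < 0) err(1, "wait");
		printf("ret %d, stat %d\n", ret, stat);
	} else {
		/* Child: exit immediately. */
		exit(42);
	}
	return 0;
}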


Re: wait(2) and SIGCHLD

2020-08-14 Thread Edgar Fuß
> I'm not sure I've completely understood your question
Probably not. Or I don't get what you are trying to say.

What I observe is that a process that explicitly ignores SIGCHLD (SIG_IGN), 
then forks a child which exits, when wait()ing for the child, gets ECHILD 
(i.e., wait returns -1 and errno is ECHILD).
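
The observation can be condensed into one function that runs the same 
fork/wait sequence under either disposition. This is a minimal sketch; 
wait_with_disposition is a name made up for this example, and it uses 
sigaction() instead of signal() only to avoid differences between 
signal() implementations.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: set SIGCHLD to the given disposition (SIG_IGN or
 * SIG_DFL), fork a child that exits with code 42, then wait() for it.
 * Returns 0 if wait() succeeded (exit code stored in *code),
 * otherwise the errno value left behind by the failing call.
 */
int wait_with_disposition(void (*disposition)(int), int *code)
{
	struct sigaction sa;
	int status = 0;
	pid_t child;

	assert(disposition == SIG_IGN || disposition == SIG_DFL);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = disposition;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGCHLD, &sa, NULL) < 0)
		return errno;

	child = fork();
	if (child < 0)
		return errno;
	if (child == 0)
		_exit(42);		/* child: terminate immediately */

	if (wait(&status) < 0)		/* parent */
		return errno;		/* ECHILD when SIGCHLD is SIG_IGN */
	if (WIFEXITED(status))
		*code = WEXITSTATUS(status);
	return 0;
}
```

Called with SIG_IGN, the function returns ECHILD and leaves *code 
untouched; called with SIG_DFL, it returns 0 and stores 42 in *code.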


Re: wait(2) and SIGCHLD

2020-08-14 Thread Edgar Fuß
The second question (that I forgot in the original mail) is whether 
wait(2) returning ECHILD for whatever handling of SIGCHLD is covered by POSIX.


wait(2) and SIGCHLD

2020-08-14 Thread Edgar Fuß
I'm confused regarding the behaviour of wait(2) wrt. SIGCHLD handling.

The wait(2) manpage says:

     wait() will fail and return immediately if:

     [ECHILD]    The calling process has no existing unwaited-for child
                 processes; or no status from the terminated child
                 process is available because the calling process has
                 asked the system to discard such status by ignoring
                 the signal SIGCHLD or setting the flag SA_NOCLDWAIT
                 for that signal.

However, ignore is the default handler for SIGCHLD.

So does the
because the calling process has asked the system
to discard such status by ignoring the signal SIGCHLD
mean that explicitly ignoring SIGCHLD is different from ignoring it per default?


Re: Horrendous RAIDframe reconstruction performance

2020-06-28 Thread Edgar Fuß
> That's the reconstruction algorithm. It reads each stripe and if it
> has a bad parity, the parity data gets rewritten.
That's the way parity re-write works. I thought reconstruction worked 
differently. oster@?

