> Date: Mon, 27 Feb 2023 10:00:25 +1000
> From: David Gwynne <[email protected]>
>
> On Sun, Feb 26, 2023 at 01:28:04PM +0100, Mark Kettenis wrote:
> > > Date: Sun, 26 Feb 2023 18:13:18 +1000
> > > From: David Gwynne <[email protected]>
>
> yeesh, i should have proofread my email before i sent it. sorry about
> making it harder to read than it should have been.
>
> > > i picked a couple of Dell Wyse 3040 boxes, which are very cute, i
> > > like them a lot. however, i have to disable acpitz to be able to
> > > use them because the driver gets stuck during attach.
> > >
> > > during apcitz_attach does a read of all the temperatures. the read
> > > of _TMP ends up talking to tipmic(4) via tipmic_thermal_opreg_handler().
> > > tipmic_thermal_opreg_handler has a loop on line 335 waiting for
> > > sc->sc_stat_adc to change, but that value is only set from tipmic_intr.
> > > acpitz_attach is running while the kernel is code, and it appears that
> > > the interrupt handler never runs, so that value never changes, and
> > > acpitz blocks. also because it's cold, the timeout on the tsleep doesn't
> > > do anything. thanks to patrick for helping me on the acpi side of things
> > > so we could figure this out.
> >
> > A better approach might be to make sure that while we're cold,
> > tipmic_thermal_opreg_handler() polls for completion. Something like:
> >
> > while (sc->sc_stat_adc == 0) {
> > if (cold) {
> > delay(1000);
> > tpmic_intr();
> > } else {
> > if (tsleep(&sc->sc_stat_adc, PRIBIO, "tipmic",
> > SEC_TO_NSEC(1))) {
> > ...
> > }
> > }
> > }
> >
> >
> > > i tried deferring basically all of acpitz_attach to when kthreads are
> > > running, and that works well enough to get to userland.
> > >
> > > is that reasonable?
> >
> > The problem is that you can't really know whether AML accesses the
> > opregion while cold.
>
> good point. the diff below works in this situation and is less
> intrusive.
ok kettenis@
> > > also, shortly after dwiic complains about short reads and the kernel
> > > locks up again. i'll have to plug it in and transcribe the exact
> > > errors. i think that's a separate problem though.
> >
> > Yes, dwiic(4) has always been somewhat problematic. Transactions seem
> > to fail randomly on some platforms like the atom system you're looking
> > at but also on my Ampere eMAG system.
>
> fun. i managed to catch some of the dwiic stuff via dmesg before
> it locked up:
>
> dwiic0: timed out waiting for tx_empty intr
> dwiic0: timed out waiting for rx_full intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x5b
> dwiic0: timed out waiting for tx_empty intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x01
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x01
> dwiic0: timed out waiting for rx_full intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x5a
> dwiic0: timed out waiting for rx_full intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x50
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x01
> dwiic0: timed out waiting for bus idle
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out waiting for stop intr
> dwiic0: timed out reading remaining 1
> tipmic0: can't read register 0x01
> dwiic0: timed out waiting for bus idle
>
> Index: tipmic.c
> ===================================================================
> RCS file: /cvs/src/sys/dev/acpi/tipmic.c,v
> retrieving revision 1.7
> diff -u -p -r1.7 tipmic.c
> --- tipmic.c 6 Apr 2022 18:59:27 -0000 1.7
> +++ tipmic.c 26 Feb 2023 23:56:04 -0000
> @@ -276,6 +276,25 @@ struct tipmic_regmap tipmic_thermal_regm
> { 0x18, TIPMIC_SYSTEMP_HI, TIPMIC_SYSTEMP_LO }
> };
>
> +static int
> +tipmic_wait_adc(struct tipmic_softc *sc)
> +{
> + int i;
> +
> + if (!cold) {
> + return (tsleep_nsec(&sc->sc_stat_adc, PRIBIO, "tipmic",
> + SEC_TO_NSEC(1)));
> + }
> +
> + for (i = 0; i < 1000; i++) {
> + delay(1000);
> + if (tipmic_intr(sc) == 1)
> + return (0);
> + }
> +
> + return (EWOULDBLOCK);
> +}
> +
> int
> tipmic_thermal_opreg_handler(void *cookie, int iodir, uint64_t address,
> int size, uint64_t *value)
> @@ -333,8 +352,7 @@ tipmic_thermal_opreg_handler(void *cooki
> splx(s);
>
> while (sc->sc_stat_adc == 0) {
> - if (tsleep_nsec(&sc->sc_stat_adc, PRIBIO, "tipmic",
> - SEC_TO_NSEC(1))) {
> + if (tipmic_wait_adc(sc)) {
> printf("%s: ADC timeout\n", sc->sc_dev.dv_xname);
> break;
> }
>