Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter
On Fri, Sep 09, 2022 at 07:32:58AM -0500, Scott Cheloha wrote: > On Fri, Sep 09, 2022 at 03:59:01PM +1000, Jonathan Gray wrote: > > On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote: > > > On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote: > > > > Whoops, forgot about the split read problem. My mistake. > > > > > > > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and > > > > i386 can use acpihpet(4), we can only safely use the lower 32 bits of > > > > the counter in acpihpet_delay() (unless we want two versions of > > > > acpihpet_delay()... which I don't). > > > > > > > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles > > > > as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a > > > > 64-bit counter so we don't need to mask the difference between val1 > > > > and val2. > > > > > > > > [...] > > > > > > 12 day ping. > > > > > > This needs fixing before it causes problems. > > > > the hpet spec says to set a bit to force a 32-bit counter on > > 32-bit platforms > > > > see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in > > https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf > > I don't follow your meaning. Putting the HPET in 32-bit mode doesn't > help us here, it would just break acpihpet_delay() in a different way. > > The problem is that acpihpet_delay() is written as a 64-bit delay(9) > and there's no way to do that safely on i386 without introducing extra > overhead. > > The easiest and cheapest fix is to rewrite acpihpet_delay() as a > 32-bit delay(9), i.e. we count cycles until we pass a threshold. > acpitimer_delay() in acpi/acpitimer.c is a 32-bit delay(9) and it > works great, let's just do the same thing again here. > > ok? Seems to handle wrapping. ok jsg@ > > Index: acpihpet.c > === > RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v > retrieving revision 1.28 > diff -u -p -r1.28 acpihpet.c > --- acpihpet.c25 Aug 2022 18:01:54 - 1.28 > +++ acpihpet.c9 Sep 2022 12:29:41 - > @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s > void > acpihpet_delay(int usecs) > { > - uint64_t c, s; > + uint64_t count = 0, cycles; > struct acpihpet_softc *sc = hpet_timecounter.tc_priv; > + uint32_t val1, val2; > > - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); > - c = usecs * hpet_timecounter.tc_frequency / 100; > - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) > + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); > + cycles = usecs * hpet_timecounter.tc_frequency / 100; > + while (count < cycles) { > CPU_BUSY_CYCLE(); > + val1 = val2; > + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, > + HPET_MAIN_COUNTER); > + count += val2 - val1; > + } > } > > u_int > >
Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter
On Fri, Sep 09, 2022 at 03:59:01PM +1000, Jonathan Gray wrote: > On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote: > > On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote: > > > Whoops, forgot about the split read problem. My mistake. > > > > > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and > > > i386 can use acpihpet(4), we can only safely use the lower 32 bits of > > > the counter in acpihpet_delay() (unless we want two versions of > > > acpihpet_delay()... which I don't). > > > > > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles > > > as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a > > > 64-bit counter so we don't need to mask the difference between val1 > > > and val2. > > > > > > [...] > > > > 12 day ping. > > > > This needs fixing before it causes problems. > > the hpet spec says to set a bit to force a 32-bit counter on > 32-bit platforms > > see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in > https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf I don't follow your meaning. Putting the HPET in 32-bit mode doesn't help us here, it would just break acpihpet_delay() in a different way. The problem is that acpihpet_delay() is written as a 64-bit delay(9) and there's no way to do that safely on i386 without introducing extra overhead. The easiest and cheapest fix is to rewrite acpihpet_delay() as a 32-bit delay(9), i.e. we count cycles until we pass a threshold. acpitimer_delay() in acpi/acpitimer.c is a 32-bit delay(9) and it works great, let's just do the same thing again here. ok? Index: acpihpet.c === RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v retrieving revision 1.28 diff -u -p -r1.28 acpihpet.c --- acpihpet.c 25 Aug 2022 18:01:54 - 1.28 +++ acpihpet.c 9 Sep 2022 12:29:41 - @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s void acpihpet_delay(int usecs) { - uint64_t c, s; + uint64_t count = 0, cycles; struct acpihpet_softc *sc = hpet_timecounter.tc_priv; + uint32_t val1, val2; - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); - c = usecs * hpet_timecounter.tc_frequency / 100; - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); + cycles = usecs * hpet_timecounter.tc_frequency / 100; + while (count < cycles) { CPU_BUSY_CYCLE(); + val1 = val2; + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, + HPET_MAIN_COUNTER); + count += val2 - val1; + } } u_int
Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter
On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote: > On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote: > > Whoops, forgot about the split read problem. My mistake. > > > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and > > i386 can use acpihpet(4), we can only safely use the lower 32 bits of > > the counter in acpihpet_delay() (unless we want two versions of > > acpihpet_delay()... which I don't). > > > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles > > as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a > > 64-bit counter so we don't need to mask the difference between val1 > > and val2. > > > > [...] > > 12 day ping. > > This needs fixing before it causes problems. the hpet spec says to set a bit to force a 32-bit counter on 32-bit platforms see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf > > ok? > > Index: acpihpet.c > === > RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v > retrieving revision 1.28 > diff -u -p -r1.28 acpihpet.c > --- acpihpet.c25 Aug 2022 18:01:54 - 1.28 > +++ acpihpet.c9 Sep 2022 01:30:26 - > @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s > void > acpihpet_delay(int usecs) > { > - uint64_t c, s; > + uint64_t count = 0, cycles; > struct acpihpet_softc *sc = hpet_timecounter.tc_priv; > + uint32_t val1, val2; > > - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); > - c = usecs * hpet_timecounter.tc_frequency / 100; > - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) > + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); > + cycles = usecs * hpet_timecounter.tc_frequency / 100; > + while (count < cycles) { > CPU_BUSY_CYCLE(); > + val1 = val2; > + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, > + HPET_MAIN_COUNTER); > + count += val2 - val1; > + } > } > > u_int > >
Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter
On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote: > Whoops, forgot about the split read problem. My mistake. > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and > i386 can use acpihpet(4), we can only safely use the lower 32 bits of > the counter in acpihpet_delay() (unless we want two versions of > acpihpet_delay()... which I don't). > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles > as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a > 64-bit counter so we don't need to mask the difference between val1 > and val2. > > [...] 12 day ping. This needs fixing before it causes problems. ok? Index: acpihpet.c === RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v retrieving revision 1.28 diff -u -p -r1.28 acpihpet.c --- acpihpet.c 25 Aug 2022 18:01:54 - 1.28 +++ acpihpet.c 9 Sep 2022 01:30:26 - @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s void acpihpet_delay(int usecs) { - uint64_t c, s; + uint64_t count = 0, cycles; struct acpihpet_softc *sc = hpet_timecounter.tc_priv; + uint32_t val1, val2; - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); - c = usecs * hpet_timecounter.tc_frequency / 100; - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); + cycles = usecs * hpet_timecounter.tc_frequency / 100; + while (count < cycles) { CPU_BUSY_CYCLE(); + val1 = val2; + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, + HPET_MAIN_COUNTER); + count += val2 - val1; + } } u_int
acpihpet(4): acpihpet_delay: only use lower 32 bits of counter
Whoops, forgot about the split read problem. My mistake. Because 32-bit platforms cannot do bus_space_read_8 atomically, and i386 can use acpihpet(4), we can only safely use the lower 32 bits of the counter in acpihpet_delay() (unless we want two versions of acpihpet_delay()... which I don't). Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles as we do in acpihpet_delay(). Unlike acpitimer(4), the HPET is a 64-bit counter so we don't need to mask the difference between val1 and val2. ok? Index: acpihpet.c === RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v retrieving revision 1.28 diff -u -p -r1.28 acpihpet.c --- acpihpet.c 25 Aug 2022 18:01:54 - 1.28 +++ acpihpet.c 28 Aug 2022 02:26:07 - @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s void acpihpet_delay(int usecs) { - uint64_t c, s; + uint64_t count = 0, cycles; struct acpihpet_softc *sc = hpet_timecounter.tc_priv; + uint32_t val1, val2; - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); - c = usecs * hpet_timecounter.tc_frequency / 100; - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c) + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER); + cycles = usecs * hpet_timecounter.tc_frequency / 100; + while (count < cycles) { CPU_BUSY_CYCLE(); + val1 = val2; + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, + HPET_MAIN_COUNTER); + count += val2 - val1; + } } u_int