Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter

2022-09-12 Thread Jonathan Gray
On Fri, Sep 09, 2022 at 07:32:58AM -0500, Scott Cheloha wrote:
> On Fri, Sep 09, 2022 at 03:59:01PM +1000, Jonathan Gray wrote:
> > On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote:
> > > On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote:
> > > > Whoops, forgot about the split read problem.  My mistake.
> > > > 
> > > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and
> > > > i386 can use acpihpet(4), we can only safely use the lower 32 bits of
> > > > the counter in acpihpet_delay() (unless we want two versions of
> > > > acpihpet_delay()... which I don't).
> > > > 
> > > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles
> > > > as we do in acpihpet_delay().  Unlike acpitimer(4), the HPET is a
> > > > 64-bit counter so we don't need to mask the difference between val1
> > > > and val2.
> > > > 
> > > > [...]
> > > 
> > > 12 day ping.
> > > 
> > > This needs fixing before it causes problems.
> > 
> > the hpet spec says to set a bit to force a 32-bit counter on
> > 32-bit platforms
> > 
> > see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in
> > https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf
> 
> I don't follow your meaning.  Putting the HPET in 32-bit mode doesn't
> help us here, it would just break acpihpet_delay() in a different way.
> 
> The problem is that acpihpet_delay() is written as a 64-bit delay(9)
> and there's no way to do that safely on i386 without introducing extra
> overhead.
> 
> The easiest and cheapest fix is to rewrite acpihpet_delay() as a
> 32-bit delay(9), i.e. we count cycles until we pass a threshold.
> acpitimer_delay() in acpi/acpitimer.c is a 32-bit delay(9) and it
> works great, let's just do the same thing again here.
> 
> ok?

Seems to handle wrapping.

ok jsg@

> 
> Index: acpihpet.c
> ===
> RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
> retrieving revision 1.28
> diff -u -p -r1.28 acpihpet.c
> --- acpihpet.c25 Aug 2022 18:01:54 -  1.28
> +++ acpihpet.c9 Sep 2022 12:29:41 -
> @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
>  void
>  acpihpet_delay(int usecs)
>  {
> - uint64_t c, s;
> + uint64_t count = 0, cycles;
>   struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
> + uint32_t val1, val2;
>  
> - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
> - c = usecs * hpet_timecounter.tc_frequency / 100;
> - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
> + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
> + cycles = usecs * hpet_timecounter.tc_frequency / 100;
> + while (count < cycles) {
>   CPU_BUSY_CYCLE();
> + val1 = val2;
> + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
> + HPET_MAIN_COUNTER);
> + count += val2 - val1;
> + }
>  }
>  
>  u_int
> 
> 



Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter

2022-09-09 Thread Scott Cheloha
On Fri, Sep 09, 2022 at 03:59:01PM +1000, Jonathan Gray wrote:
> On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote:
> > On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote:
> > > Whoops, forgot about the split read problem.  My mistake.
> > > 
> > > Because 32-bit platforms cannot do bus_space_read_8 atomically, and
> > > i386 can use acpihpet(4), we can only safely use the lower 32 bits of
> > > the counter in acpihpet_delay() (unless we want two versions of
> > > acpihpet_delay()... which I don't).
> > > 
> > > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles
> > > as we do in acpihpet_delay().  Unlike acpitimer(4), the HPET is a
> > > 64-bit counter so we don't need to mask the difference between val1
> > > and val2.
> > > 
> > > [...]
> > 
> > 12 day ping.
> > 
> > This needs fixing before it causes problems.
> 
> the hpet spec says to set a bit to force a 32-bit counter on
> 32-bit platforms
> 
> see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in
> https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf

I don't follow your meaning.  Putting the HPET in 32-bit mode doesn't
help us here, it would just break acpihpet_delay() in a different way.

The problem is that acpihpet_delay() is written as a 64-bit delay(9)
and there's no way to do that safely on i386 without introducing extra
overhead.

The easiest and cheapest fix is to rewrite acpihpet_delay() as a
32-bit delay(9), i.e. we count cycles until we pass a threshold.
acpitimer_delay() in acpi/acpitimer.c is a 32-bit delay(9) and it
works great, let's just do the same thing again here.

ok?

Index: acpihpet.c
===
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.28
diff -u -p -r1.28 acpihpet.c
--- acpihpet.c  25 Aug 2022 18:01:54 -  1.28
+++ acpihpet.c  9 Sep 2022 12:29:41 -
@@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
 void
 acpihpet_delay(int usecs)
 {
-   uint64_t c, s;
+   uint64_t count = 0, cycles;
struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
+   uint32_t val1, val2;
 
-   s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
-   c = usecs * hpet_timecounter.tc_frequency / 100;
-   while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
+   cycles = usecs * hpet_timecounter.tc_frequency / 100;
+   while (count < cycles) {
CPU_BUSY_CYCLE();
+   val1 = val2;
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+   HPET_MAIN_COUNTER);
+   count += val2 - val1;
+   }
 }
 
 u_int



Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter

2022-09-08 Thread Jonathan Gray
On Thu, Sep 08, 2022 at 08:31:21PM -0500, Scott Cheloha wrote:
> On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote:
> > Whoops, forgot about the split read problem.  My mistake.
> > 
> > Because 32-bit platforms cannot do bus_space_read_8 atomically, and
> > i386 can use acpihpet(4), we can only safely use the lower 32 bits of
> > the counter in acpihpet_delay() (unless we want two versions of
> > acpihpet_delay()... which I don't).
> > 
> > Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles
> > as we do in acpihpet_delay().  Unlike acpitimer(4), the HPET is a
> > 64-bit counter so we don't need to mask the difference between val1
> > and val2.
> > 
> > [...]
> 
> 12 day ping.
> 
> This needs fixing before it causes problems.

the hpet spec says to set a bit to force a 32-bit counter on
32-bit platforms

see 2.4.7 Issues related to 64-bit Timers with 32-bit CPUs, in
https://www.intel.com/content/dam/www/public/us/en/documents/technical-specifications/software-developers-hpet-spec-1-0a.pdf

> 
> ok?
> 
> Index: acpihpet.c
> ===
> RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
> retrieving revision 1.28
> diff -u -p -r1.28 acpihpet.c
> --- acpihpet.c25 Aug 2022 18:01:54 -  1.28
> +++ acpihpet.c9 Sep 2022 01:30:26 -
> @@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
>  void
>  acpihpet_delay(int usecs)
>  {
> - uint64_t c, s;
> + uint64_t count = 0, cycles;
>   struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
> + uint32_t val1, val2;
>  
> - s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
> - c = usecs * hpet_timecounter.tc_frequency / 100;
> - while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
> + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
> + cycles = usecs * hpet_timecounter.tc_frequency / 100;
> + while (count < cycles) {
>   CPU_BUSY_CYCLE();
> + val1 = val2;
> + val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
> + HPET_MAIN_COUNTER);
> + count += val2 - val1;
> + }
>  }
>  
>  u_int
> 
> 



Re: acpihpet(4): acpihpet_delay: only use lower 32 bits of counter

2022-09-08 Thread Scott Cheloha
On Sat, Aug 27, 2022 at 09:28:06PM -0500, Scott Cheloha wrote:
> Whoops, forgot about the split read problem.  My mistake.
> 
> Because 32-bit platforms cannot do bus_space_read_8 atomically, and
> i386 can use acpihpet(4), we can only safely use the lower 32 bits of
> the counter in acpihpet_delay() (unless we want two versions of
> acpihpet_delay()... which I don't).
> 
> Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles
> as we do in acpihpet_delay().  Unlike acpitimer(4), the HPET is a
> 64-bit counter so we don't need to mask the difference between val1
> and val2.
> 
> [...]

12 day ping.

This needs fixing before it causes problems.

ok?

Index: acpihpet.c
===
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.28
diff -u -p -r1.28 acpihpet.c
--- acpihpet.c  25 Aug 2022 18:01:54 -  1.28
+++ acpihpet.c  9 Sep 2022 01:30:26 -
@@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
 void
 acpihpet_delay(int usecs)
 {
-   uint64_t c, s;
+   uint64_t count = 0, cycles;
struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
+   uint32_t val1, val2;
 
-   s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
-   c = usecs * hpet_timecounter.tc_frequency / 100;
-   while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
+   cycles = usecs * hpet_timecounter.tc_frequency / 100;
+   while (count < cycles) {
CPU_BUSY_CYCLE();
+   val1 = val2;
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+   HPET_MAIN_COUNTER);
+   count += val2 - val1;
+   }
 }
 
 u_int



acpihpet(4): acpihpet_delay: only use lower 32 bits of counter

2022-08-27 Thread Scott Cheloha
Whoops, forgot about the split read problem.  My mistake.

Because 32-bit platforms cannot do bus_space_read_8 atomically, and
i386 can use acpihpet(4), we can only safely use the lower 32 bits of
the counter in acpihpet_delay() (unless we want two versions of
acpihpet_delay()... which I don't).

Switch from acpihpet_r() to bus_space_read_4(9) and accumulate cycles
as we do in acpihpet_delay().  Unlike acpitimer(4), the HPET is a
64-bit counter so we don't need to mask the difference between val1
and val2.

ok?

Index: acpihpet.c
===
RCS file: /cvs/src/sys/dev/acpi/acpihpet.c,v
retrieving revision 1.28
diff -u -p -r1.28 acpihpet.c
--- acpihpet.c  25 Aug 2022 18:01:54 -  1.28
+++ acpihpet.c  28 Aug 2022 02:26:07 -
@@ -281,13 +281,19 @@ acpihpet_attach(struct device *parent, s
 void
 acpihpet_delay(int usecs)
 {
-   uint64_t c, s;
+   uint64_t count = 0, cycles;
struct acpihpet_softc *sc = hpet_timecounter.tc_priv;
+   uint32_t val1, val2;
 
-   s = acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
-   c = usecs * hpet_timecounter.tc_frequency / 100;
-   while (acpihpet_r(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER) - s < c)
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh, HPET_MAIN_COUNTER);
+   cycles = usecs * hpet_timecounter.tc_frequency / 100;
+   while (count < cycles) {
CPU_BUSY_CYCLE();
+   val1 = val2;
+   val2 = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+   HPET_MAIN_COUNTER);
+   count += val2 - val1;
+   }
 }
 
 u_int