Re: delays in sensors thread

2020-12-11 Thread Marcus Glocker
On Fri, Dec 11, 2020 at 10:01:35AM +0100, Alexandre Ratchov wrote:

> On Fri, Dec 11, 2020 at 09:07:45AM +0100, Marcus Glocker wrote:
> > 
> > After doing some deeper analyzes in to asmc_wait() I agree to that.
> > Something seems to go fundamental wrong there.  In every asmc_update()
> > execution, I can see asmc_wait() timeout 9 times, always on the
> > ASMC_ACCEPT check.  That means out of ~48ms to execute asmc_update(),
> > we spend 45ms waiting for timeouts!
> > 
> > When this can be fixed, I guess we should be fine with the remaining
> > ~3ms to execute asmc_update().
> > 
> 
> Even 3ms seems the wrong order of magnitude; that's ~300 CPU
> cycles on a 1GHz processor. This is 3 usb frames, so this will still
> hurt audio in low-latency corner cases.

Right.  I'm happy to continue the discussion on what could be the right
approach to get there.  But first I would like to remove the unnecessary
delays/timeouts we currently see in asmc(4).  Therefore ...

... with this diff, which is basically doing the same as the Linux
driver, I can get the asmc_update() function down to 2-3ms, without
getting any timeout in the update code.  This allows me to stream
uaudio(4), run uvideo(4), and compile a kernel with '-j 4' in
parallel on my iMac11,3 without seeing freezes.

Testers, comments, OKs?


Index: sys/dev/acpi/asmc.c
===
RCS file: /cvs/src/sys/dev/acpi/asmc.c,v
retrieving revision 1.2
diff -u -p -u -p -r1.2 asmc.c
--- sys/dev/acpi/asmc.c 13 Sep 2020 14:11:28 -  1.2
+++ sys/dev/acpi/asmc.c 11 Dec 2020 19:10:22 -
@@ -58,6 +58,8 @@
 #define ASMC_MAXLIGHT  2   /* left and right light sensor */
 #define ASMC_MAXMOTION 3   /* x y z axis motion sensors */
 
+#define ASMC_DELAY_LOOP200 /* ASMC_DELAY_LOOP * 10us = 2ms */
+
 struct asmc_prod {
const char  *pr_name;
uint8_t  pr_light;
@@ -434,36 +436,41 @@ asmc_status(struct asmc_softc *sc)
 }
 
 static int
-asmc_wait(struct asmc_softc *sc, uint8_t mask, uint8_t val)
+asmc_write(struct asmc_softc *sc, uint8_t off, uint8_t val)
 {
int i;
+   uint8_t status;
 
-   for (i = 0; i < 500; i++) { /* wait up to 5 ms */
-   if ((bus_space_read_1(sc->sc_iot, sc->sc_ioh, ASMC_COMMAND) &
-   mask) == val)
+   bus_space_write_1(sc->sc_iot, sc->sc_ioh, off, val);
+   for (i = 0; i < ASMC_DELAY_LOOP; i++) {
+   status = bus_space_read_1(sc->sc_iot, sc->sc_ioh, ASMC_COMMAND);
+   if (status & ASMC_IBF)
+   continue;
+   if (status & ASMC_ACCEPT)
return 0;
delay(10);
+   bus_space_write_1(sc->sc_iot, sc->sc_ioh, off, val);
}
-   return ETIMEDOUT;
-}
 
-static int
-asmc_write(struct asmc_softc *sc, uint8_t off, uint8_t val)
-{
-   if (asmc_wait(sc, ASMC_IBF, 0))
-   return 1;
-   bus_space_write_1(sc->sc_iot, sc->sc_ioh, off, val);
-   if (asmc_wait(sc, ASMC_ACCEPT, ASMC_ACCEPT))
-   return 1;
-   return 0;
+   return ETIMEDOUT;
 }
 
 static int
 asmc_read(struct asmc_softc *sc, uint8_t off, uint8_t *buf)
 {
-   if (asmc_wait(sc, ASMC_OBF, ASMC_OBF))
-   return 1;
+   int i;
+   uint8_t status;
+
+   for (i = 0; i < ASMC_DELAY_LOOP; i++) {
+   status = bus_space_read_1(sc->sc_iot, sc->sc_ioh, ASMC_COMMAND);
+   if (status & ASMC_OBF)
+   break;
+   delay(10);
+   }
+   if (i == ASMC_DELAY_LOOP)
+   return ETIMEDOUT;
*buf = bus_space_read_1(sc->sc_iot, sc->sc_ioh, off);
+
return 0;
 }
 



Re: delays in sensors thread

2020-12-11 Thread Alexandre Ratchov
On Fri, Dec 11, 2020 at 09:07:45AM +0100, Marcus Glocker wrote:
> 
> After doing some deeper analyzes in to asmc_wait() I agree to that.
> Something seems to go fundamental wrong there.  In every asmc_update()
> execution, I can see asmc_wait() timeout 9 times, always on the
> ASMC_ACCEPT check.  That means out of ~48ms to execute asmc_update(),
> we spend 45ms waiting for timeouts!
> 
> When this can be fixed, I guess we should be fine with the remaining
> ~3ms to execute asmc_update().
> 

Even 3ms seems the wrong order of magnitude; that's ~300 CPU
cycles on a 1GHz processor. This is 3 usb frames, so this will still
hurt audio in low-latency corner cases.



Re: delays in sensors thread

2020-12-11 Thread Marcus Glocker
On Thu, Dec 10, 2020 at 06:06:26PM -0300, Martin Pieuchot wrote:

> On 10/12/20(Thu) 21:40, Alexandre Ratchov wrote:
> > On Thu, Dec 10, 2020 at 05:27:16PM +0100, Marcus Glocker wrote:
> > > Hi All,
> > > 
> > > I recently started to play around with uvideo(4) and uaudio(4) on my
> > > amd64 iMacs.  There I quickly noticed regular freezes when streaming
> > > USB video or audio.  On some of those machines it was very frequent,
> > > like every few seconds the video or audio stream did freeze for ~1s,
> > > then resume, while the rest of the system did continue to operate fine.
> > > 
> > > First I found that when running the machine with an SP kernel, the issue
> > > disappears.
> > 
> > On SP kernels, interrupts are still working while the CPU is spinning
> > in kernel mode (as long as current IPL permits it). That's why audio
> > works better.
> > 
> > > Secondly some debugging hours, and quite some e-mail
> > > exchanges with mpi@ later, I found that the freeze is getting triggered
> > > by the asmc(4) driver, specifically by the sensor_task_register()
> > > update function.  My first intention was to change
> > > sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
> > > flag for a test, to remove the kernel lock, which also resolved the
> > > freezing with an MP kernel. [1]
> > > 
> > > In the end I found that the asmc(4) sensor update code is calling a
> > > busy loop in asmc_wait(), where the delay call is spending ~50ms in
> > > average.  Doing that during the KERNEL_LOCK() is resulting in
> > > noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
> > > with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]
> > > 
> > > I'm not sure if just applying diff [2] to the driver is the right
> > > approach finally or if we need to take a more generic path to address
> > > this problem.  Any feedback, help, comments appreciated.
> > > 
> > 
> > Would asmc(4) work if we sleep for very long (tsleep_nsec() has no
> > upper bound)? Spinning during 50ms in kernel mode doesn't look right,
> > so using tsleep() looks as a step forward as long as sleeping doesn't
> > break asmc(4).

I did test this quickly, but it makes the asmc_wait() entirely fail.
 
> Another way to look at the issue is:  Can this delay be reduced?
> 
> Considering that asmc_wait() is called twice per asmc_write() and once
> per amsc_read() that means it is called at least 14 times in asmc_command()
> and up to 20 times when reading fan informations,
> 
> If any of the operations times out the whole command is restarted,
> possibly twice.
> 
> So it could be interesting to know if a particular query fail and is
> restarted?  In this case, is it fixable and/or why do we need to
> restart the command 3 times?
> 
> Or is the loop in asmc_wait() doing too big steps?  Or too smalls which
> would imply that some commands fail?
> 
> Improving/fixing the delays might be easier than re-structuring the
> driver and could fix both issues raised in this thread: holding the
> KERNEL_LOCK() for too long and delaying other sensors. 

After doing some deeper analyzes in to asmc_wait() I agree to that.
Something seems to go fundamental wrong there.  In every asmc_update()
execution, I can see asmc_wait() timeout 9 times, always on the
ASMC_ACCEPT check.  That means out of ~48ms to execute asmc_update(),
we spend 45ms waiting for timeouts!

When this can be fixed, I guess we should be fine with the remaining
~3ms to execute asmc_update().

I'll have a look.



Re: delays in sensors thread

2020-12-10 Thread Martin Pieuchot
On 10/12/20(Thu) 21:40, Alexandre Ratchov wrote:
> On Thu, Dec 10, 2020 at 05:27:16PM +0100, Marcus Glocker wrote:
> > Hi All,
> > 
> > I recently started to play around with uvideo(4) and uaudio(4) on my
> > amd64 iMacs.  There I quickly noticed regular freezes when streaming
> > USB video or audio.  On some of those machines it was very frequent,
> > like every few seconds the video or audio stream did freeze for ~1s,
> > then resume, while the rest of the system did continue to operate fine.
> > 
> > First I found that when running the machine with an SP kernel, the issue
> > disappears.
> 
> On SP kernels, interrupts are still working while the CPU is spinning
> in kernel mode (as long as current IPL permits it). That's why audio
> works better.
> 
> > Secondly some debugging hours, and quite some e-mail
> > exchanges with mpi@ later, I found that the freeze is getting triggered
> > by the asmc(4) driver, specifically by the sensor_task_register()
> > update function.  My first intention was to change
> > sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
> > flag for a test, to remove the kernel lock, which also resolved the
> > freezing with an MP kernel. [1]
> > 
> > In the end I found that the asmc(4) sensor update code is calling a
> > busy loop in asmc_wait(), where the delay call is spending ~50ms in
> > average.  Doing that during the KERNEL_LOCK() is resulting in
> > noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
> > with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]
> > 
> > I'm not sure if just applying diff [2] to the driver is the right
> > approach finally or if we need to take a more generic path to address
> > this problem.  Any feedback, help, comments appreciated.
> > 
> 
> Would asmc(4) work if we sleep for very long (tsleep_nsec() has no
> upper bound)? Spinning during 50ms in kernel mode doesn't look right,
> so using tsleep() looks as a step forward as long as sleeping doesn't
> break asmc(4).

Another way to look at the issue is:  Can this delay be reduced?

Considering that asmc_wait() is called twice per asmc_write() and once
per amsc_read() that means it is called at least 14 times in asmc_command()
and up to 20 times when reading fan informations,

If any of the operations times out the whole command is restarted,
possibly twice.

So it could be interesting to know if a particular query fail and is
restarted?  In this case, is it fixable and/or why do we need to
restart the command 3 times?

Or is the loop in asmc_wait() doing too big steps?  Or too smalls which
would imply that some commands fail?

Improving/fixing the delays might be easier than re-structuring the
driver and could fix both issues raised in this thread: holding the
KERNEL_LOCK() for too long and delaying other sensors. 



Re: delays in sensors thread

2020-12-10 Thread Alexandre Ratchov
On Thu, Dec 10, 2020 at 05:27:16PM +0100, Marcus Glocker wrote:
> Hi All,
> 
> I recently started to play around with uvideo(4) and uaudio(4) on my
> amd64 iMacs.  There I quickly noticed regular freezes when streaming
> USB video or audio.  On some of those machines it was very frequent,
> like every few seconds the video or audio stream did freeze for ~1s,
> then resume, while the rest of the system did continue to operate fine.
> 
> First I found that when running the machine with an SP kernel, the issue
> disappears.

On SP kernels, interrupts are still working while the CPU is spinning
in kernel mode (as long as current IPL permits it). That's why audio
works better.

> Secondly some debugging hours, and quite some e-mail
> exchanges with mpi@ later, I found that the freeze is getting triggered
> by the asmc(4) driver, specifically by the sensor_task_register()
> update function.  My first intention was to change
> sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
> flag for a test, to remove the kernel lock, which also resolved the
> freezing with an MP kernel. [1]
> 
> In the end I found that the asmc(4) sensor update code is calling a
> busy loop in asmc_wait(), where the delay call is spending ~50ms in
> average.  Doing that during the KERNEL_LOCK() is resulting in
> noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
> with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]
> 
> I'm not sure if just applying diff [2] to the driver is the right
> approach finally or if we need to take a more generic path to address
> this problem.  Any feedback, help, comments appreciated.
> 

Would asmc(4) work if we sleep for very long (tsleep_nsec() has no
upper bound)? Spinning during 50ms in kernel mode doesn't look right,
so using tsleep() looks as a step forward as long as sleeping doesn't
break asmc(4).

AFAIU, calling delay(10) + yield() would have a similar effect.



Re: delays in sensors thread

2020-12-10 Thread Martin Pieuchot
On 10/12/20(Thu) 18:47, Mark Kettenis wrote:
> [...] 
> Sensor drivers that are "good citizens" should probably continue to
> use the sensor thread.  Butfor things like asmc(4) that handle a
> largish bundle of sensors, a separate thread would be fine.  And I
> suspect that thread could run unlocked.

Using a separate thread won't help as long as it is KERNEL_LOCK()'d.

What allows the rest of the kernel to make progress in this case is the
fact that tsleep(9) releases the lock.

If one is going to do some work to make sure asmc_update() is safe to be
executed w/o KERNEL_LOCK() then it would be easier to start by releasing
the lock in this function.



Re: delays in sensors thread

2020-12-10 Thread Theo de Raadt
Mark Kettenis  wrote:

> Well, we deliberately chose not to do that since that would lead to a
> gazillion of kernel threads if you have lots of sensor drivers.
> 
> Sensor drivers that are "good citizens" should probably continue to
> use the sensor thread.  Butfor things like asmc(4) that handle a
> largish bundle of sensors, a separate thread would be fine.  And I
> suspect that thread could run unlocked.

Of course.  I meant this should be done in drivers dealing with
hardware which is like this.

Some sensor drivers, not all.

The proposed solution of replacing delay() with tsleep() is
unsatisfactory, since rather than sleeping a private thread, it will be
sleeping the sensor thread and delay *all* the other sensors as well.



Re: delays in sensors thread

2020-12-10 Thread Mark Kettenis
> From: "Theo de Raadt" 
> Date: Thu, 10 Dec 2020 10:13:52 -0700
> 
> Marcus Glocker  wrote:
> 
> > Hi All,
> > 
> > I recently started to play around with uvideo(4) and uaudio(4) on my
> > amd64 iMacs.  There I quickly noticed regular freezes when streaming
> > USB video or audio.  On some of those machines it was very frequent,
> > like every few seconds the video or audio stream did freeze for ~1s,
> > then resume, while the rest of the system did continue to operate fine.
> > 
> > First I found that when running the machine with an SP kernel, the issue
> > disappears.  Secondly some debugging hours, and quite some e-mail
> > exchanges with mpi@ later, I found that the freeze is getting triggered
> > by the asmc(4) driver, specifically by the sensor_task_register()
> > update function.  My first intention was to change
> > sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
> > flag for a test, to remove the kernel lock, which also resolved the
> > freezing with an MP kernel. [1]
> > 
> > In the end I found that the asmc(4) sensor update code is calling a
> > busy loop in asmc_wait(), where the delay call is spending ~50ms in
> > average.  Doing that during the KERNEL_LOCK() is resulting in
> > noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
> > with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]
> > 
> > I'm not sure if just applying diff [2] to the driver is the right
> > approach finally or if we need to take a more generic path to address
> > this problem.  Any feedback, help, comments appreciated.
> 
> I've discussed with Marcus that there is a better way to solve these
> drivers.
> 
> They should install their own timeout or thread, which collects the
> information.  Then, when the sensor thread comes around it can instantly
> copy the information.

Well, we deliberately chose not to do that since that would lead to a
gazillion of kernel threads if you have lots of sensor drivers.

Sensor drivers that are "good citizens" should probably continue to
use the sensor thread.  Butfor things like asmc(4) that handle a
largish bundle of sensors, a separate thread would be fine.  And I
suspect that thread could run unlocked.

> Of course, there should be no delay() in the information collector.
> 
> In an ideal world, delay() whould only be used in low-IPL or
> bootstrapping code.

It depends a bit.  Small delays (of the order of a few microseconds)
are probably fine as the overhead of setting up a sleep will be large.
Longer delays should be done using sleeps.  The problem we have right
now is that the granularity of sleeps is so high.  Which drives us to
using delay(9) in cases where we need to sleep for hundreds of
microseconds.



Re: delays in sensors thread

2020-12-10 Thread Theo de Raadt
Marcus Glocker  wrote:

> Hi All,
> 
> I recently started to play around with uvideo(4) and uaudio(4) on my
> amd64 iMacs.  There I quickly noticed regular freezes when streaming
> USB video or audio.  On some of those machines it was very frequent,
> like every few seconds the video or audio stream did freeze for ~1s,
> then resume, while the rest of the system did continue to operate fine.
> 
> First I found that when running the machine with an SP kernel, the issue
> disappears.  Secondly some debugging hours, and quite some e-mail
> exchanges with mpi@ later, I found that the freeze is getting triggered
> by the asmc(4) driver, specifically by the sensor_task_register()
> update function.  My first intention was to change
> sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
> flag for a test, to remove the kernel lock, which also resolved the
> freezing with an MP kernel. [1]
> 
> In the end I found that the asmc(4) sensor update code is calling a
> busy loop in asmc_wait(), where the delay call is spending ~50ms in
> average.  Doing that during the KERNEL_LOCK() is resulting in
> noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
> with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]
> 
> I'm not sure if just applying diff [2] to the driver is the right
> approach finally or if we need to take a more generic path to address
> this problem.  Any feedback, help, comments appreciated.

I've discussed with Marcus that there is a better way to solve these
drivers.

They should install their own timeout or thread, which collects the
information.  Then, when the sensor thread comes around it can instantly
copy the information.

Of course, there should be no delay() in the information collector.

In an ideal world, delay() whould only be used in low-IPL or
bootstrapping code.



delays in sensors thread

2020-12-10 Thread Marcus Glocker
Hi All,

I recently started to play around with uvideo(4) and uaudio(4) on my
amd64 iMacs.  There I quickly noticed regular freezes when streaming
USB video or audio.  On some of those machines it was very frequent,
like every few seconds the video or audio stream did freeze for ~1s,
then resume, while the rest of the system did continue to operate fine.

First I found that when running the machine with an SP kernel, the issue
disappears.  Secondly some debugging hours, and quite some e-mail
exchanges with mpi@ later, I found that the freeze is getting triggered
by the asmc(4) driver, specifically by the sensor_task_register()
update function.  My first intention was to change
sensor_task_register() to call taskq_create() with the TASKQ_MPSAFE
flag for a test, to remove the kernel lock, which also resolved the
freezing with an MP kernel. [1]

In the end I found that the asmc(4) sensor update code is calling a
busy loop in asmc_wait(), where the delay call is spending ~50ms in
average.  Doing that during the KERNEL_LOCK() is resulting in
noticeable USB ISOC transfer delays.  Obviously replacing the delay(9)
with tsleep_nsec(9) in asmc(4) did fix the issue as well. [2]

I'm not sure if just applying diff [2] to the driver is the right
approach finally or if we need to take a more generic path to address
this problem.  Any feedback, help, comments appreciated.


[1]

Index: kern/kern_sensors.c
===
RCS file: /cvs/src/sys/kern/kern_sensors.c,v
retrieving revision 1.39
diff -u -p -u -p -r1.39 kern_sensors.c
--- kern/kern_sensors.c 19 Dec 2019 17:40:11 -  1.39
+++ kern/kern_sensors.c 9 Dec 2020 12:13:10 -
@@ -193,7 +193,8 @@ sensor_task_register(void *arg, void (*f
 #endif
 
if (sensors_taskq == NULL &&
-   (sensors_taskq = taskq_create("sensors", 1, IPL_HIGH, 0))
== NULL)
+   (sensors_taskq = taskq_create("sensors", 1, IPL_HIGH,
+   TASKQ_MPSAFE)) == NULL)
sensors_taskq = systq;
 
st = malloc(sizeof(*st), M_DEVBUF, M_NOWAIT);


[2]

Index: dev/acpi/asmc.c
===
RCS file: /cvs/src/sys/dev/acpi/asmc.c,v
retrieving revision 1.2
diff -u -p -u -p -r1.2 asmc.c
--- dev/acpi/asmc.c 13 Sep 2020 14:11:28 -  1.2
+++ dev/acpi/asmc.c 9 Dec 2020 22:06:50 -
@@ -442,7 +442,7 @@ asmc_wait(struct asmc_softc *sc, uint8_t
if ((bus_space_read_1(sc->sc_iot, sc->sc_ioh,
ASMC_COMMAND) & mask) == val)
return 0;
-   delay(10);
+   tsleep_nsec(sc, PWAIT, "asmc_wait", USEC_TO_NSEC(10));
}
return ETIMEDOUT;
 }