clockintr(9): clock interrupt scheduler

2022-10-28 Thread Scott Cheloha
Hi,

This patch adds a machine-independent clock interrupt scheduler to the
kernel.  It is called clockintr(9).  The code is featureful enough to
emulate most of what the machine-dependent clock interrupt code is
doing across all platforms.  This is a first step toward what some
systems call "tickless" kernel operation.

I am looking for OKs to commit this piece and continue working on it
in the tree.  The next step will entail switching every platform over
to using clockintr(9) while refining it to address problems found in
the process.  The attached patch is only the MI portion.  The code is
conditionally compiled when __HAVE_CLOCKINTR is set, so this patch
will not affect behavior on any platform yet.

The rest of this mail explains why we want this and briefly describes
how it works.

First, "why do we want this?":

1. An MI clock interrupt scheduler will make it possible to provide
   clock interrupt control to other MI parts of the kernel.

   For example, if the timeout(9) layer can steer the clock interrupt
   it can dramatically reduce timeout expiration latency, which will
   allow userspace to block for less than 1 tick.  This will fix some
   well-known vmm(4)/vmd(8) time problems.

   We could also provide a very fast profiling tick for dt(4) on
   demand, and disable it when btrace(8) exits.  There wouldn't
   be a need to run the profiling trace at the fixed hardclock(9)
   rate, hz(9).

   In general, parts of the kernel could benefit from having direct
   control of the clock interrupt.

2. An MI clock interrupt scheduler will allow us to reduce the
   clock interrupt rate or disable the clock interrupt entirely,
   saving power.

   For example, acpicpu(4) could temporarily stop the clock interrupt
   on very idle CPUs, allowing them to remain in low-power states for
   longer periods of time.

   In general, if there is no work to do, most CPUs in the system
   don't need to tick.

3. Parts of the MD clock interrupt code are very repetitive.  We have
   11 different hardclock/statclock dispatch loops.  armv7 has 4 such
   loops.  It would be pragmatic to consolidate these into a single
   dispatch loop and use it on every platform.

Second, "how does it work?"  The high-level clockintr(9) life-cycle
proceeds like this:

1. The primary CPU calls clockintr_init(9) to initialize global
   clockintr(9) state.  Usually this is done from cpu_initclocks().

2. The primary CPU calls clockintr_cpu_init(9) to initialize its
   local clockintr(9) state.  Every CPU has a 'clockintr_queue'
   struct kept in its cpu_info struct.  The struct contains a work
   schedule.

   During clockintr_cpu_init(9), an 'intrclock' struct may also be
   installed on the calling CPU.  An intrclock provides a uniform
   interface to a CPU for manipulating its interrupt clock.  If no
   intrclock is installed, the platform code is responsible for
   rearming the clock interrupt.  Most platforms have a suitable
   hardware clock, though.

3. During cpu_hatch(), the secondary CPUs all initialize their local
   state with clockintr_cpu_init(9).

4. When a clock interrupt arrives, the CPU calls clockintr_dispatch(9)
   from the MD clock interrupt handler, e.g. lapic_clockintr().

   clockintr_dispatch() runs hardclock(), statclock(), and schedclock().
   It might run them more than once if the interrupt is late.  It then
   rearms the installed intrclock to fire when the next event is
   scheduled to expire.

5. Repeat step 4 indefinitely until the system shuts down, suspends,
   or hibernates.

6. During resume, the primary CPU calls inittodr(9) to advance the
   system uptime clock across the duration the system was down.

7. Go to step 2. This time, clockintr_cpu_init(9) skips past any
   downtime on the caller's work schedule.  This prevents a "thundering
   herd" of useless work during the first clock interrupt after the
   system resumes.
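
As a rough illustration of steps 4 and 7, here is a minimal userland
sketch of how a periodic event might be caught up after a late
interrupt and how downtime might be skipped after resume.  It is not
the code from the patch: the struct and every sketch_* name are
hypothetical, time is a plain nanosecond counter, and a run counter
stands in for calling hardclock().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical periodic event; not the struct used by the patch. */
struct sketch_event {
	uint64_t expiration;	/* absolute time of next expiry (ns) */
	uint64_t period;	/* interval between expirations (ns) */
	uint64_t runs;		/* stand-in for "callback was run" */
};

/*
 * Move the expiration forward in whole periods until it lies after
 * "now".  Return how many periods have elapsed.
 */
static uint64_t
sketch_advance(struct sketch_event *ev, uint64_t now)
{
	uint64_t elapsed;

	assert(ev->period > 0);
	if (now < ev->expiration)
		return 0;
	elapsed = (now - ev->expiration) / ev->period + 1;
	ev->expiration += elapsed * ev->period;
	return elapsed;
}

/*
 * Step 4: run the event once per elapsed period (more than once if
 * the interrupt is late), then return the time at which the
 * interrupt clock should be rearmed.
 */
static uint64_t
sketch_dispatch(struct sketch_event *ev, uint64_t now)
{
	ev->runs += sketch_advance(ev, now);
	return ev->expiration;
}

/*
 * Step 7: after resume, skip the downtime without running anything,
 * so the first interrupt does not replay every missed period.
 */
static void
sketch_resume(struct sketch_event *ev, uint64_t now)
{
	(void)sketch_advance(ev, now);
}
```

With a 10 ns period and an interrupt that arrives 25 ns late, the
sketch counts three runs in one dispatch; after sketch_resume() the
run count is unchanged no matter how long the system was down.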

There is a manpage in the patch with additional detail on what these
interfaces do.

-Scott

Index: distrib/sets/lists/comp/mi
===
RCS file: /cvs/src/distrib/sets/lists/comp/mi,v
retrieving revision 1.1610
diff -u -p -r1.1610 mi
--- distrib/sets/lists/comp/mi  7 Oct 2022 15:43:41 -   1.1610
+++ distrib/sets/lists/comp/mi  28 Oct 2022 16:39:43 -
@@ -1286,6 +1286,7 @@
 ./usr/include/sys/cdefs.h
 ./usr/include/sys/cdio.h
 ./usr/include/sys/chio.h
+./usr/include/sys/clockintr.h
 ./usr/include/sys/conf.h
 ./usr/include/sys/core.h
 ./usr/include/sys/ctf.h
@@ -3071,6 +3072,7 @@
 ./usr/share/man/man9/bufq_init.9
 ./usr/share/man/man9/bus_dma.9
 ./usr/share/man/man9/bus_space.9
+./usr/share/man/man9/clockintr.9
 ./usr/share/man/man9/cond_init.9
 ./usr/share/man/man9/config_attach.9
 ./usr/share/man/man9/config_defer.9
Index: share/man/man9/Makefile
===
RCS file: /cvs/src/share/man/man9/Makefile,v
retrieving revision 1.307
diff -u -p -r1.307 Mak

Fwd: Gmux Driver for MacBook Retina Display

2022-10-28 Thread Leonardo Moreno Urbieta
Hi,

I'm resending this email in case adding this driver is useful for the
project.

If it is not useful, no problem. I will work on other issues that may
arise on this hardware.

Thanks,

Leonardo

-- Forwarded message -
From: leomoreno 
Date: Fri, 9 Sep 2022 at 11:43
Subject: Gmux Driver for MacBook Retina Display
To:


Hello,

I wrote this driver to control the backlight in my
MacBook Pro 2015 with Retina Display.

I had to refer to some of the code in the Linux
apple-gmux.c driver to learn how to communicate
with the chip. Could this cause an issue with the
license? That's why I haven't included any license
text in the file.

I also based this driver heavily on parts of asmc.c and abl.c.

Built it today with current and it works correctly using
wsconsctl.

Does it stand any chance of getting committed, after
any necessary corrections?

Thanks,

Leonardo

Index: sys/arch/amd64/conf/GENERIC
===
RCS file: /cvs/src/sys/arch/amd64/conf/GENERIC,v
retrieving revision 1.512
diff -u -p -u -p -r1.512 GENERIC
--- sys/arch/amd64/conf/GENERIC 8 Mar 2022 15:08:01 -   1.512
+++ sys/arch/amd64/conf/GENERIC 9 Sep 2022 16:30:21 -
@@ -82,6 +82,7 @@ tpm*  at acpi?
 acpihve*   at acpi?
 acpisurface*   at acpi?
 acpihid*   at acpi?
+gmux*  at acpi?# Apple MacBookPro Retina Backlight
 ipmi0  at acpi? disable
 ccpmic*at iic?
 tipmic*at iic?
Index: sys/dev/acpi/files.acpi
===
RCS file: /cvs/src/sys/dev/acpi/files.acpi,v
retrieving revision 1.65
diff -u -p -u -p -r1.65 files.acpi
--- sys/dev/acpi/files.acpi 31 Aug 2022 16:10:59 -  1.65
+++ sys/dev/acpi/files.acpi 9 Sep 2022 16:30:22 -
@@ -90,6 +90,11 @@ device   abl
 attach abl at acpi
 file   dev/acpi/abl.c  abl

+# Apple MacBookPro Retina Display Backlight
+device  gmux
+attach  gmux at acpi
+file   dev/acpi/gmux.c gmux
+
 # Apple System Management Controller (SMC)
 device asmc
 attach asmc at acpi
Index: sys/dev/acpi/gmux.c
===
RCS file: sys/dev/acpi/gmux.c
diff -N sys/dev/acpi/gmux.c
--- /dev/null   1 Jan 1970 00:00:00 -
+++ sys/dev/acpi/gmux.c 9 Sep 2022 16:30:22 -
@@ -0,0 +1,244 @@
+/*
+ * Driver to control backlight in MacBook Retina Display.
+ * Address numbers and inner workings heavily derived from
+ * apple-gmux Linux driver.
+ *
+ * */
+
+
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+
+#ifdef GMUX_DEBUG
+#define DPRINTF(x) printf x
+#else
+#define DPRINTF(x)
+#endif
+
+#define GMUX_PORT_BRIGHTNESS   0x74
+#define GMUX_PORT_VALUE        0xc2
+#define GMUX_PORT_READ         0xd0
+#define GMUX_PORT_WRITE        0xd4
+
+#define GMUX_MIN_BRIGHTNESS    0
+#define GMUX_MAX_BRIGHTNESS    1023
+
+struct gmux_softc {
+   struct device   sc_dev;
+
+   struct acpi_softc   *sc_acpi;
+   struct aml_node *sc_devnode;
+
+   bus_space_tag_t  sc_iot;
+   bus_space_handle_t   sc_ioh;
+
+   uint16_t sc_brightness;
+};
+
+int    gmux_match(struct device *, void *, void *);
+void   gmux_attach(struct device *, struct device *, void *);
+void   gmux_complete(struct gmux_softc *);
+void   gmux_ready(struct gmux_softc *);
+bool   gmux_confirm_retina_display(struct gmux_softc *);
+int    gmux_get_brightness(struct gmux_softc *);
+int    gmux_set_brightness(struct gmux_softc *, uint16_t);
+
+/* Hooks for wsconsole brightness control. */
+int    gmux_get_param(struct wsdisplay_param *);
+int    gmux_set_param(struct wsdisplay_param *);
+
+const struct cfattach gmux_ca = {
+   sizeof(struct gmux_softc), gmux_match, gmux_attach, NULL, NULL
+};
+
+struct cfdriver gmux_cd = {
+   NULL, "gmux", DV_DULL
+};
+
+const char *gmux_hids[] = {
+   "APP000B", NULL
+};
+
+int
+gmux_match(struct device *parent, void *match, void *aux)
+{
+   struct acpi_attach_args *aa = aux;
+   struct cfdata *cf = match;
+
+   return acpi_matchhids(aa, gmux_hids, cf->cf_driver->cd_name);
+
+}
+
+void
+gmux_attach(struct device *parent, struct device *self, void *aux)
+{
+   struct gmux_softc *sc = (struct gmux_softc *)self;
+   struct acpi_attach_args *aaa = aux;
+   struct aml_value res;
+   int64_t sta;
+
+   sc->sc_acpi = (struct acpi_softc *)parent;
+   sc->sc_devnode = aaa->aaa_node;
+
+   printf(": %s", sc->sc_devnode->name);
+
+sta = acpi_getsta(sc->sc_acpi, sc->sc_devnode);
+   if ((sta & (STA_PRESENT | STA_ENABLED | STA_DEV_OK)) !=
+   (STA_PRESENT | STA_ENABLED | STA_DEV_OK)) {
+   printf(": not enabled\n");
+   return;
+   }
+
+   if (!(aml_evalname(sc->sc_acpi, sc->sc_devnode, "_CID", 0, NUL

Re: Towards unlocking mmap(2) & munmap(2)

2022-10-28 Thread Martin Pieuchot
On 20/10/22(Thu) 16:17, Martin Pieuchot wrote:
> On 11/09/22(Sun) 12:26, Martin Pieuchot wrote:
> > Diff below adds a minimalist set of assertions to ensure proper locks
> > are held in uvm_mapanon() and uvm_unmap_remove() which are the guts of
> > mmap(2) for anons and munmap(2).
> > 
> > Please test it with WITNESS enabled and report back.
> 
> New version of the diff that includes a lock/unlock dance in
> uvm_map_teardown().  While grabbing this lock should not be strictly
> necessary because no other reference to the map should exist when the
> reaper is holding it, it helps make progress with asserts.  Grabbing
> the lock is easy and it can also save us a lot of time if there are any
> reference counting bugs (like we've discovered w/ vnode and swapping).

Here's an updated version that adds a lock/unlock dance in
uvm_map_deallocate() to satisfy the assert in uvm_unmap_remove().
Thanks to tb@ for pointing this out.

I received a lot of positive feedback and test reports, so I'm now
asking for oks.
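
To make the "lock/unlock dance" concrete, here is a minimal,
single-threaded userland sketch of the pattern.  It is not the UVM
code: the toy_* names and the enum-based lock are hypothetical
stand-ins, and a plain assert() plays the role of the lock assertion.
The point is only that a teardown path which has no concurrent users
still takes the write lock so the assertion in the shared removal
routine holds for every caller.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical lock states for a toy map; not a real rwlock. */
enum toy_lock_state { TOY_UNLOCKED, TOY_RDLOCKED, TOY_WRLOCKED };

struct toy_map {
	enum toy_lock_state lock;
	int nentries;
};

/* True if the map is held exclusively. */
static bool
toy_map_wrlocked(const struct toy_map *m)
{
	return m->lock == TOY_WRLOCKED;
}

/* True if the map is held shared or exclusively. */
static bool
toy_map_anylocked(const struct toy_map *m)
{
	return m->lock != TOY_UNLOCKED;
}

/* Shared removal routine: every caller must hold the write lock. */
static void
toy_unmap_remove(struct toy_map *m)
{
	assert(toy_map_wrlocked(m));
	m->nentries = 0;
}

/*
 * Teardown path.  No other reference to the map should exist here,
 * but the lock is taken anyway to satisfy the assertion above.
 */
static void
toy_map_deallocate(struct toy_map *m)
{
	m->lock = TOY_WRLOCKED;
	toy_unmap_remove(m);
	m->lock = TOY_UNLOCKED;
}
```

Calling toy_unmap_remove() directly on an unlocked map would trip the
assertion, which is the kind of missing-lock bug the assertions in the
diff are meant to expose.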


Index: uvm/uvm_addr.c
===
RCS file: /cvs/src/sys/uvm/uvm_addr.c,v
retrieving revision 1.31
diff -u -p -r1.31 uvm_addr.c
--- uvm/uvm_addr.c  21 Feb 2022 10:26:20 -  1.31
+++ uvm/uvm_addr.c  28 Oct 2022 08:41:30 -
@@ -416,6 +416,8 @@ uvm_addr_invoke(struct vm_map *map, stru
!(hint >= uaddr->uaddr_minaddr && hint < uaddr->uaddr_maxaddr))
return ENOMEM;
 
+   vm_map_assert_anylock(map);
+
error = (*uaddr->uaddr_functions->uaddr_select)(map, uaddr,
entry_out, addr_out, sz, align, offset, prot, hint);
 
Index: uvm/uvm_fault.c
===
RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
retrieving revision 1.132
diff -u -p -r1.132 uvm_fault.c
--- uvm/uvm_fault.c 31 Aug 2022 01:27:04 -  1.132
+++ uvm/uvm_fault.c 28 Oct 2022 08:41:30 -
@@ -1626,6 +1626,7 @@ uvm_fault_unwire_locked(vm_map_t map, va
struct vm_page *pg;
 
KASSERT((map->flags & VM_MAP_INTRSAFE) == 0);
+   vm_map_assert_anylock(map);
 
/*
 * we assume that the area we are unwiring has actually been wired
Index: uvm/uvm_map.c
===
RCS file: /cvs/src/sys/uvm/uvm_map.c,v
retrieving revision 1.301
diff -u -p -r1.301 uvm_map.c
--- uvm/uvm_map.c   24 Oct 2022 15:11:56 -  1.301
+++ uvm/uvm_map.c   28 Oct 2022 08:46:28 -
@@ -491,6 +491,8 @@ uvmspace_dused(struct vm_map *map, vaddr
vaddr_t stack_begin, stack_end; /* Position of stack. */
 
KASSERT(map->flags & VM_MAP_ISVMSPACE);
+   vm_map_assert_anylock(map);
+
vm = (struct vmspace *)map;
stack_begin = MIN((vaddr_t)vm->vm_maxsaddr, (vaddr_t)vm->vm_minsaddr);
stack_end = MAX((vaddr_t)vm->vm_maxsaddr, (vaddr_t)vm->vm_minsaddr);
@@ -570,6 +572,8 @@ uvm_map_isavail(struct vm_map *map, stru
if (addr + sz < addr)
return 0;
 
+   vm_map_assert_anylock(map);
+
/*
 * Kernel memory above uvm_maxkaddr is considered unavailable.
 */
@@ -1457,6 +1461,8 @@ uvm_map_mkentry(struct vm_map *map, stru
entry->guard = 0;
entry->fspace = 0;
 
+   vm_map_assert_wrlock(map);
+
/* Reset free space in first. */
free = uvm_map_uaddr_e(map, first);
uvm_mapent_free_remove(map, free, first);
@@ -1584,6 +1590,8 @@ boolean_t
 uvm_map_lookup_entry(struct vm_map *map, vaddr_t address,
 struct vm_map_entry **entry)
 {
+   vm_map_assert_anylock(map);
+
*entry = uvm_map_entrybyaddr(&map->addr, address);
return *entry != NULL && !UVM_ET_ISHOLE(*entry) &&
(*entry)->start <= address && (*entry)->end > address;
@@ -1704,6 +1712,8 @@ uvm_map_is_stack_remappable(struct vm_ma
vaddr_t end = addr + sz;
struct vm_map_entry *first, *iter, *prev = NULL;
 
+   vm_map_assert_anylock(map);
+
if (!uvm_map_lookup_entry(map, addr, &first)) {
printf("map stack 0x%lx-0x%lx of map %p failed: no mapping\n",
addr, end, map);
@@ -1868,6 +1878,8 @@ uvm_mapent_mkfree(struct vm_map *map, st
vaddr_t  addr;  /* Start of freed range. */
vaddr_t  end;   /* End of freed range. */
 
+   UVM_MAP_REQ_WRITE(map);
+
prev = *prev_ptr;
if (prev == entry)
*prev_ptr = prev = NULL;
@@ -1996,10 +2008,7 @@ uvm_unmap_remove(struct vm_map *map, vad
if (start >= end)
return 0;
 
-   if ((map->flags & VM_MAP_INTRSAFE) == 0)
-   splassert(IPL_NONE);
-   else
-   splassert(IPL_VM);
+   vm_map_assert_wrlock(map);
 
/* Find first affected entry. */
entry = uvm_map_entrybyaddr(&map->addr, start);
@@ -2531,6 +2540,8 @@ uvm_map_teardown(struct vm_map *map)
 
KASSERT((map->flags & VM_MAP_IN