On Sun, Aug 07, 2022 at 11:05:37AM +0000, Visa Hankala wrote:
> On Sun, Jul 31, 2022 at 01:28:18PM -0500, Scott Cheloha wrote:
> > Apparently mips64, i.e. octeon and loongson, has the same problem as
> > powerpc/macppc and powerpc64.  The timer interrupt is normally only
> > logically masked, not physically masked in the hardware, when we're
> > running at or above IPL_CLOCK.  If we arrive at cp0_int5() when the
> > clock interrupt is logically masked we postpone all work until the
> > next tick.  This is a problem for my WIP clock interrupt work.
> 
> I think the use of logical masking has been a design choice, not
> something dictated by the hardware. Physical masking should be possible,
> but some extra care would be needed to implement it, as the mips64
> interrupt code is a bit clunky.

That would be cleaner, but from the sound of it, it's easier to start
with this.

> > So, this patch is basically the same as what I did for macppc and what
> > I have proposed for powerpc64.
> > 
> > - Add a new member, ci_timer_deferred, to mips64's cpu_info struct.
> > 
> >   While here, remove ci_pendingticks.  We don't need it anymore.
> > 
> > - If we get to cp0_int5() and our IPL is too high, set
> >   cpu_info.ci_timer_deferred and return.
> > 
> > - If we get to cp0_int5() and our IPL is low enough, clear
> >   cpu_info.ci_timer_deferred and do clock interrupt work.
> > 
> > - In splx(9), if the new IPL is low enough and cpu_info.ci_timer_deferred
> >   is set, trigger the clock interrupt.
> > 
> > The only big difference is that mips64 uses an equality comparison
> > when deciding whether to arm the timer interrupt, so it's really easy
> > to "miss" CP0.count when you're setting CP0.compare.
> > 
> > To address this I've written a function, cp0_raise_int5(), that spins
> > until it is sure the timer interrupt will go off.  The function needed
> > a bit of new code for reading CP0.cause, which I've added to
> > cp0access.S.  I am using an initial offset of 16 cycles based on
> > experimentation with the machine I have access to, a 500 MHz CN50xx.
> > Values lower than 16 require more than one loop to arm the timer.  If
> > that value is insufficient for other machines we can try passing the
> > initial offset as an argument to the function.
> 
> It should not be necessary to make the initial offset variable. The
> offset is primarily a function of the length and content of the
> instruction sequence. Some unpredictability comes from cache misses
> and maybe branch prediction failures.

Gotcha.  So it mostly depends on the number of instructions between
loading CP0.count and storing CP0.compare.
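
Just to make sure I'm picturing the race correctly, here is a minimal,
hypothetical sketch (illustration only, not part of the patch; the
helper name is made up, only cp0_get_count() and cp0_set_compare() are
real) of the window in question:

	/*
	 * Arm CP0.compare "offset" cycles ahead of CP0.count and
	 * report whether the write landed in time.  INT5 fires only
	 * on an exact count == compare match, so if count has already
	 * passed compare by the time the store completes, the
	 * interrupt is lost until the counter wraps around.
	 */
	static int
	arm_compare_ahead(uint32_t offset)
	{
		uint32_t compare;

		compare = cp0_get_count() + offset;	/* read count */
		cp0_set_compare(compare);		/* cycles elapse here */

		/* Positive iff compare is still ahead of count. */
		return (int32_t)(compare - cp0_get_count()) > 0;
	}

So the offset has to cover every cycle spent between the cp0_get_count()
read and the cp0_set_compare() write, plus whatever the hazards and
cache misses add, which would explain why 16 is enough on the CN50xx.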

> > I wasn't sure where to put the prototype for cp0_raise_int5() so I
> > stuck it in mips64/cpu.h.  If there's a better place for it, just say
> > so.
> 
> Currently, mips64 clock.c is formulated as a proper driver. I think
> callers should not invoke its functions directly but use a hook instead.
> The MI mips64 code starts the clock through the md_startclock function
> pointer. Maybe there could be md_triggerclock.
> 
> To reduce risk of confusion, I would rename cp0_raise_int5 to
> cp0_trigger_int5, as `raise' overlaps with the spl API. Also,
> ci_clock_deferred instead of ci_timer_deferred would look more
> consistent with the surrounding code.

Okay, I took all these suggestions and incorporated them.  Updated
patch attached.

One thing I'm still uncertain about is how glxclk fits into the
loongson picture.  It's an interrupt clock that runs hardclock() and
statclock(), but the code doesn't do any logical masking, so I don't
know whether or not I need to adjust anything in that code or account
for it at all.  If there's no logical masking there's no deferral, so
it would never need to call md_triggerclock() from splx(9).

Also:

My EdgeRouter PoE just finished a serial `make build`.  Took almost 12
days.  Which is a good sign!  Lots of opportunity for the patch to
fail and the clock to die.

In that time, under what I assume is relatively heavy load, the clock
interrupt deferral counters look like this:

cp0_raise_calls at 0xffffffff81701308: 133049
cp0_raise_miss at 0xffffffff81701300: 0

So 16 cycles as the initial offset works great.  We never ran the loop
more than once, i.e. we never "missed" CP0.count.

The machine has been up a little more than a million seconds.  So, at
100 Hz, with no separate statclock, and 2 CPUs, we'd expect ~200 clock
interrupts per second, or ~200 million in total.

In ~200,000,000 cp0_int5() calls, we deferred ~133,000 of them, or
~0.0665%.

Index: mips64/mips64/clock.c
===================================================================
RCS file: /cvs/src/sys/arch/mips64/mips64/clock.c,v
retrieving revision 1.45
diff -u -p -r1.45 clock.c
--- mips64/mips64/clock.c       6 Apr 2022 18:59:26 -0000       1.45
+++ mips64/mips64/clock.c       8 Aug 2022 07:41:44 -0000
@@ -60,6 +60,7 @@ const struct cfattach clock_ca = {
 };
 
 void   cp0_startclock(struct cpu_info *);
+void   cp0_trigger_int5(void);
 uint32_t cp0_int5(uint32_t, struct trapframe *);
 
 int
@@ -86,19 +87,20 @@ clockattach(struct device *parent, struc
        cp0_set_compare(cp0_get_count() - 1);
 
        md_startclock = cp0_startclock;
+       md_triggerclock = cp0_trigger_int5;
 }
 
 /*
  *  Interrupt handler for targets using the internal count register
  *  as interval clock. Normally the system is run with the clock
  *  interrupt always enabled. Masking is done here and if the clock
- *  can not be run the tick is just counted and handled later when
- *  the clock is logically unmasked again.
+ *  cannot be run the tick is handled later when the clock is logically
+ *  unmasked again.
  */
 uint32_t
 cp0_int5(uint32_t mask, struct trapframe *tf)
 {
-       u_int32_t clkdiff;
+       u_int32_t clkdiff, pendingticks = 0;
        struct cpu_info *ci = curcpu();
 
        /*
@@ -113,15 +115,26 @@ cp0_int5(uint32_t mask, struct trapframe
        }
 
        /*
+        * If the clock interrupt is masked, defer any work until it
+        * is unmasked from splx(9).
+        */
+       if (tf->ipl >= IPL_CLOCK) {
+               ci->ci_clock_deferred = 1;
+               cp0_set_compare(cp0_get_count() - 1);
+               return CR_INT_5;
+       }
+       ci->ci_clock_deferred = 0;
+
+       /*
         * Count how many ticks have passed since the last clock interrupt...
         */
        clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
        while (clkdiff >= ci->ci_cpu_counter_interval) {
                ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
                clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
-               ci->ci_pendingticks++;
+               pendingticks++;
        }
-       ci->ci_pendingticks++;
+       pendingticks++;
        ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
 
        /*
@@ -132,32 +145,64 @@ cp0_int5(uint32_t mask, struct trapframe
        clkdiff = cp0_get_count() - ci->ci_cpu_counter_last;
        if ((int)clkdiff >= 0) {
                ci->ci_cpu_counter_last += ci->ci_cpu_counter_interval;
-               ci->ci_pendingticks++;
+               pendingticks++;
                cp0_set_compare(ci->ci_cpu_counter_last);
        }
 
        /*
-        * Process clock interrupt unless it is currently masked.
+        * Process clock interrupt.
         */
-       if (tf->ipl < IPL_CLOCK) {
 #ifdef MULTIPROCESSOR
-               register_t sr;
+       register_t sr;
 
-               sr = getsr();
-               ENABLEIPI();
+       sr = getsr();
+       ENABLEIPI();
 #endif
-               while (ci->ci_pendingticks) {
-                       atomic_inc_long(
-                           (unsigned long *)&cp0_clock_count.ec_count);
-                       hardclock(tf);
-                       ci->ci_pendingticks--;
-               }
+       while (pendingticks) {
+               atomic_inc_long((unsigned long *)&cp0_clock_count.ec_count);
+               hardclock(tf);
+               pendingticks--;
+       }
 #ifdef MULTIPROCESSOR
-               setsr(sr);
+       setsr(sr);
 #endif
-       }
 
        return CR_INT_5;        /* Clock is always on 5 */
+}
+
+unsigned long cp0_raise_calls, cp0_raise_miss;
+
+/*
+ * Trigger the clock interrupt.
+ * 
+ * We need to spin until either (a) INT5 is pending or (b) the compare
+ * register leads the count register, i.e. we know INT5 will be pending
+ * very soon.
+ *
+ * To ensure we don't spin forever, double the compensatory offset
+ * added to the compare value every time we miss the count register.
+ */
+void
+cp0_trigger_int5(void)
+{
+       uint32_t compare, offset = 16;
+       int leading = 0;
+       register_t sr;
+
+       sr = disableintr();
+       while (!leading && !ISSET(cp0_get_cause(), CR_INT_5)) {
+               compare = cp0_get_count() + offset;
+               cp0_set_compare(compare);
+               leading = (int32_t)(compare - cp0_get_count()) > 0;
+               offset *= 2;
+       }
+       setsr(sr);
+
+       unsigned long misses = 0;
+       for (; offset > 32; offset /= 2)
+               misses++;
+       atomic_add_long(&cp0_raise_miss, misses);
+       atomic_inc_long(&cp0_raise_calls);
 }
 
 /*
Index: mips64/mips64/cp0access.S
===================================================================
RCS file: /cvs/src/sys/arch/mips64/mips64/cp0access.S,v
retrieving revision 1.23
diff -u -p -r1.23 cp0access.S
--- mips64/mips64/cp0access.S   1 May 2021 16:11:11 -0000       1.23
+++ mips64/mips64/cp0access.S   8 Aug 2022 07:41:44 -0000
@@ -198,3 +198,10 @@ LEAF(cpu_rnd_messybits, 0)
        j       ra
         NOP
 END(cpu_rnd_messybits)
+
+LEAF(cp0_get_cause, 0)
+       MFC0    v0, COP_0_CAUSE_REG
+       MFC0_HAZARD
+       j       ra
+        NOP
+END(cp0_get_cause)
Index: mips64/include/cpu.h
===================================================================
RCS file: /cvs/src/sys/arch/mips64/include/cpu.h,v
retrieving revision 1.138
diff -u -p -r1.138 cpu.h
--- mips64/include/cpu.h        28 Jan 2022 16:20:09 -0000      1.138
+++ mips64/include/cpu.h        8 Aug 2022 07:41:44 -0000
@@ -178,11 +178,10 @@ struct cpu_info {
        volatile int    ci_ipl;                 /* software IPL */
        uint32_t        ci_softpending;         /* pending soft interrupts */
        int             ci_clock_started;
+       volatile int    ci_clock_deferred;      /* clock interrupt postponed */
        u_int32_t       ci_cpu_counter_last;    /* last compare value loaded */
        u_int32_t       ci_cpu_counter_interval; /* # of counter ticks/tick */
 
-       u_int32_t       ci_pendingticks;
-
        struct pmap     *ci_curpmap;
        uint            ci_intrdepth;           /* interrupt depth */
 #ifdef MULTIPROCESSOR
@@ -258,6 +257,7 @@ void        smp_rendezvous_cpus(unsigned long, 
 #define CPU_BUSY_CYCLE()       do {} while (0)
 
 extern void (*md_startclock)(struct cpu_info *);
+extern void (*md_triggerclock)(void);
 void   cp0_calibrate(struct cpu_info *);
 
 unsigned int cpu_rnd_messybits(void);
@@ -447,6 +447,7 @@ register_t disableintr(void);
 register_t getsr(void);
 register_t setsr(register_t);
 
+uint32_t cp0_get_cause(void);
 u_int  cp0_get_count(void);
 register_t cp0_get_config(void);
 uint32_t cp0_get_config_1(void);
Index: octeon/dev/octcit.c
===================================================================
RCS file: /cvs/src/sys/arch/octeon/dev/octcit.c,v
retrieving revision 1.12
diff -u -p -r1.12 octcit.c
--- octeon/dev/octcit.c 1 Sep 2019 12:16:01 -0000       1.12
+++ octeon/dev/octcit.c 8 Aug 2022 07:41:45 -0000
@@ -489,6 +489,10 @@ octcit_splx(int newipl)
                (void)CIU3_RD_8(sc, CIU3_IDT_PP(CIU3_IDT(core, 0)));
        }
 
+       /* Trigger deferred clock interrupt if it is now unmasked. */
+       if (ci->ci_clock_deferred && newipl < IPL_CLOCK)
+               md_triggerclock();
+
        /* If we still have softints pending trigger processing. */
        if (ci->ci_softpending != 0 && newipl < IPL_SOFTINT)
                setsoftintr0();
Index: octeon/dev/octciu.c
===================================================================
RCS file: /cvs/src/sys/arch/octeon/dev/octciu.c,v
retrieving revision 1.17
diff -u -p -r1.17 octciu.c
--- octeon/dev/octciu.c 1 Sep 2019 12:16:01 -0000       1.17
+++ octeon/dev/octciu.c 8 Aug 2022 07:41:45 -0000
@@ -588,6 +588,10 @@ octciu_splx(int newipl)
                    scpu->scpu_ibank[2].en,
                    scpu->scpu_intem[2] & ~scpu->scpu_imask[newipl][2]);
 
+       /* Trigger deferred clock interrupt if it is now unmasked. */
+       if (ci->ci_clock_deferred && newipl < IPL_CLOCK)
+               md_triggerclock();
+
        /* If we still have softints pending trigger processing. */
        if (ci->ci_softpending != 0 && newipl < IPL_SOFTINT)
                setsoftintr0();
Index: loongson/dev/bonito.c
===================================================================
RCS file: /cvs/src/sys/arch/loongson/dev/bonito.c,v
retrieving revision 1.35
diff -u -p -r1.35 bonito.c
--- loongson/dev/bonito.c       11 Mar 2021 11:16:57 -0000      1.35
+++ loongson/dev/bonito.c       8 Aug 2022 07:41:45 -0000
@@ -485,6 +485,11 @@ bonito_splx(int newipl)
        /* Update masks to new ipl. Order highly important! */
        ci->ci_ipl = newipl;
        bonito_setintrmask(newipl);
+
+       /* Trigger deferred clock interrupt if it is now unmasked. */
+       if (ci->ci_clock_deferred && newipl < IPL_CLOCK)
+               md_triggerclock();
+
        /* If we still have softints pending trigger processing. */
        if (ci->ci_softpending != 0 && newipl < IPL_SOFTINT)
                setsoftintr0();
Index: loongson/loongson/loongson3_intr.c
===================================================================
RCS file: /cvs/src/sys/arch/loongson/loongson/loongson3_intr.c,v
retrieving revision 1.7
diff -u -p -r1.7 loongson3_intr.c
--- loongson/loongson/loongson3_intr.c  24 Feb 2018 11:42:31 -0000      1.7
+++ loongson/loongson/loongson3_intr.c  8 Aug 2022 07:41:45 -0000
@@ -355,6 +355,10 @@ loongson3_splx(int newipl)
                REGVAL(LS3_IRT_INTENSET(0)) =
                    loongson3_intem & ~loongson3_imask[newipl];
 
+       /* Trigger deferred clock interrupt if it is now unmasked. */
+       if (ci->ci_clock_deferred && newipl < IPL_CLOCK)
+               md_triggerclock();
+
        if (ci->ci_softpending != 0 && newipl < IPL_SOFTINT)
                setsoftintr0();
 }
