Excerpts from Nicholas Piggin's message of June 5, 2021 10:17 am:
> Excerpts from Andy Lutomirski's message of June 5, 2021 3:05 am:
>> On 6/4/21 9:54 AM, Andy Lutomirski wrote:
>>> On 5/31/21 11:22 PM, Nicholas Piggin wrote:
>>>> There haven't been objections to the series since the last posting; this
>>>> is just a rebase that tidies up a few comments and some minor patch rearranging.
>>>>
>>> 
>>> I continue to object to having too many modes.  I like my more generic
>>> improvements better.  Let me try to find some time to email again.
>>> 
>> 
>> Specifically, this:
>> 
>> https://git.kernel.org/pub/scm/linux/kernel/git/luto/linux.git/commit/?h=x86/mm
> 
> That's worse than what powerpc does with the shoot-lazies code, so 
> we wouldn't use it anyway.
> 
> The fact is that mm-cpumask and lazy mm handling are very architecture 
> specific, so I don't really see that another "mode" is such a problem; 
> it's for the most part "this is what powerpc does" -> "this is what 
> powerpc does". The only mode in the context switch is just "take a ref 
> on the lazy mm" or "don't take a ref". Surely that's not too onerous to add!?
> 
> Actually the bigger part of it is the no-lazy mmu mode, which is not
> yet used. I thought it was a neat little demonstrator of how the code
> works with/without lazy, but I will get rid of that for submission.

I admit that does add a bit more churn than necessary; maybe that was
the main objection.

Here is the entire kernel/sched/core.c change after that is removed.
Pretty simple now. I'll resubmit.

Thanks,
Nick


diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index e359c76ea2e2..1be0b97e12ec 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -4171,7 +4171,7 @@ static struct rq *finish_task_switch(struct task_struct *prev)
        __releases(rq->lock)
 {
        struct rq *rq = this_rq();
-       struct mm_struct *mm = rq->prev_mm;
+       struct mm_struct *mm = NULL;
        long prev_state;
 
        /*
@@ -4190,7 +4190,10 @@ static struct rq *finish_task_switch(struct task_struct *prev)
                      current->comm, current->pid, preempt_count()))
                preempt_count_set(FORK_PREEMPT_COUNT);
 
-       rq->prev_mm = NULL;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+       mm = rq->prev_lazy_mm;
+       rq->prev_lazy_mm = NULL;
+#endif
 
        /*
         * A task struct has one reference for the use as "current".
@@ -4326,9 +4329,21 @@ context_switch(struct rq *rq, struct task_struct *prev,
                switch_mm_irqs_off(prev->active_mm, next->mm, next);
 
                if (!prev->mm) {                        // from kernel
-                       /* will mmdrop_lazy_tlb() in finish_task_switch(). */
-                       rq->prev_mm = prev->active_mm;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+                       /* Will mmdrop_lazy_tlb() in finish_task_switch(). */
+                       rq->prev_lazy_mm = prev->active_mm;
                        prev->active_mm = NULL;
+#else
+                       /*
+                        * Without MMU_LAZY_TLB_REFCOUNT there is no lazy
+                        * tracking (because no rq->prev_lazy_mm) in
+                        * finish_task_switch, so no mmdrop_lazy_tlb(),
+                        * so no memory barrier for membarrier (see the
+                        * membarrier comment in finish_task_switch()).
+                        * Do it here.
+                        */
+                       smp_mb();
+#endif
                }
        }
 
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index a189bec13729..0729cf19a987 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -961,7 +961,9 @@ struct rq {
        struct task_struct      *idle;
        struct task_struct      *stop;
        unsigned long           next_balance;
-       struct mm_struct        *prev_mm;
+#ifdef CONFIG_MMU_LAZY_TLB_REFCOUNT
+       struct mm_struct        *prev_lazy_mm;
+#endif
 
        unsigned int            clock_update_flags;
        u64                     clock;
