Re: [PATCH RFC/RFT v2 4/4] riscv: Stop emitting preventive sfence.vma for new userspace mappings with Svvptc

2024-02-01 Thread Andrea Parri
On Wed, Jan 31, 2024 at 04:59:29PM +0100, Alexandre Ghiti wrote:
> The preventive sfence.vma instructions were emitted because new mappings must
> be made visible to the page table walker, but Svvptc guarantees that xRET acts
> as a fence, so there is no need for an sfence.vma on the uarchs that implement
> this extension.

AFAIU, your first submission shows that you don't need that xRET property.
Similarly for other archs.  What was the rationale behind this Svvptc change?


> This allows us to drastically reduce the number of sfence.vma emitted:
> 
> * Ubuntu boot to login:
> Before: ~630k sfence.vma
> After:  ~200k sfence.vma
> 
> * ltp - mmapstress01
> Before: ~45k
> After:  ~6.3k
> 
> * lmbench - lat_pagefault
> Before: ~665k
> After:   832 (!)
> 
> * lmbench - lat_mmap
> Before: ~546k
> After:   718 (!)

This Svvptc extension seems to move/add the "burden" of the synchronization to
xRET.  Perhaps integrate the above counts w/ the perf gains in the cover letter?
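
FWIW, a minimal sketch of the pattern I understand this series to implement,
just to make sure we are talking about the same mechanism (illustrative only:
RISCV_ISA_EXT_SVVPTC is assumed from this series, and the real change is in
the fault/mapping paths rather than in a helper like this):

	/* Called after a new user PTE has been written. */
	static inline void flush_new_user_mapping(unsigned long addr)
	{
		/*
		 * With Svvptc, a stale "invalid" translation can at
		 * worst trigger a gratuitous page fault, which the
		 * handler resolves once the walker re-reads the now
		 * valid PTE; the preventive fence can thus be elided.
		 */
		if (riscv_has_extension_unlikely(RISCV_ISA_EXT_SVVPTC))
			return;

		local_flush_tlb_page(addr);	/* preventive sfence.vma */
	}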

  Andrea


Re: [PATCH v3 1/7] dump_stack: Support adding to the dump stack arch description

2019-02-20 Thread Andrea Parri
> >> > > +   * Order the stores above in vsnprintf() vs the store of the
> >> > > +   * space below which joins the two strings. Note this doesn't
> >> > > +   * make the code truly race free because there is no barrier on
> >> > > +   * the read side. ie. Another CPU might load the uninitialised
> >> > > +   * tail of the buffer first and then the space below (rather
> >> > > +   * than the NULL that was there previously), and so print the
> >> > > +   * uninitialised tail. But the whole string lives in BSS so in
> >> > > +   * practice it should just see NULLs.
> >> > 
> >> > The comment doesn't say _why_ we need to order these stores: IOW, what
> >> > will or can go wrong without this order?  This isn't clear to me.
> >> >
> >> > Another good practice when adding smp_*-constructs (as discussed, e.g.,
> >> > at KS'18) is to indicate the matching construct/synch. mechanism.
> >> 
> >> Yes, one barrier without a counter-part is suspicious.
> >
> > As is this silence...,
> >
> > Michael, what happened to this patch?  Did you submit a new version?
> 
> No, I'm just busy, it's the merge window next week :)

Got it.


> 
> I thought the comment was pretty clear: if the stores are observed out
> of order we might print the uninitialised tail.
> 
> And the barrier on the read side would need to be in printk somewhere,
> which is obviously unpleasant.

Indeed.


> 
> >> If the parallel access is really needed then we could define the
> >> current length as atomic_t and use:
> >> 
> >>+ atomic_cmpxchg() to reserve the space for the string
> >>+ %.*s to limit the printed length
> >> 
> >> In the worst case, we would print an incomplete string.
> >> See below for a sample code.
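
(Petr's sample code did not survive in this archive; purely for illustration,
with made-up names, the reservation scheme could look like the following --
the key point being that a reader can at worst see an incomplete string:)

	static char desc_buf[128];		/* lives in BSS, so zeroed */
	static atomic_t desc_len = ATOMIC_INIT(0);

	/* Writer: reserve [pos, pos + len) with cmpxchg, then fill it in. */
	static void desc_add(const char *src)
	{
		int len = strlen(src), old, pos;

		do {
			old = atomic_read(&desc_len);
			pos = old ? old + 1 : 0;	/* +1: joining space */
			if (pos + len >= sizeof(desc_buf))
				return;			/* out of space */
		} while (atomic_cmpxchg(&desc_len, old, pos + len) != old);

		if (old)
			desc_buf[old] = ' ';
		memcpy(&desc_buf[pos], src, len);
	}

	/*
	 * Reader: the precision caps the output at the committed length;
	 * a reserved-but-unfilled range still reads as NULs (BSS), so at
	 * worst an incomplete -- never uninitialised -- string is printed.
	 */
	static void desc_print(void)
	{
		printk("Hardware: %.*s\n", atomic_read(&desc_len), desc_buf);
	}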
> >
> > Seems worth exploring, IMO; but I'd like to first be _clear_ about
> > the _intended_ semantics (before digging into alternatives)...
> 
> It is not my intention to support concurrent updates of the string. The
> idea is you setup the string early in boot.

Understood, thanks for the clarification.


> 
> The concern with a concurrent reader is simply that the string is dumped
> in the panic path, and you never really know when you're going to panic.
> Even if you only write to the string before doing SMP bringup you might
> still have another CPU go rogue and panic before then.
> 
> But I probably should have just not added the barrier, it's over-paranoid
> and will almost certainly never matter in practice.

Oh, well, I can only echo you: if you don't care about the stores being
_observed_ out of order, you could simply remove the barrier; if you do
care, then you need "more paranoid" on the reader's side.  ;-)
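
For instance (a sketch only, with hypothetical names -- printk does nothing
of the sort today):

	static bool tail_is_visible(const char *str, int len)
	{
		char sep = READ_ONCE(str[len]);	/* joining space, or '\0' */

		/*
		 * Pairs with the writer's smp_wmb(): if we observed the
		 * joining space, the tail stored before it is guaranteed
		 * to be visible to the loads following this barrier.
		 */
		smp_rmb();

		return sep == ' ';
	}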

  Andrea


> 
> cheers


Re: [PATCH v3 1/7] dump_stack: Support adding to the dump stack arch description

2019-02-19 Thread Andrea Parri
On Mon, Feb 11, 2019 at 03:38:59PM +0100, Petr Mladek wrote:
> On Mon 2019-02-11 13:50:35, Andrea Parri wrote:
> > Hi Michael,
> > 
> > 
> > On Thu, Feb 07, 2019 at 11:46:29PM +1100, Michael Ellerman wrote:
> > > Arch code can set a "dump stack arch description string" which is
> > > displayed with oops output to describe the hardware platform.
> > > 
> > > It is useful to initialise this as early as possible, so that an early
> > > oops will have the hardware description.
> > > 
> > > However in practice we discover the hardware platform in stages, so it
> > > would be useful to be able to incrementally fill in the hardware
> > > description as we discover it.
> > > 
> > > This patch adds that ability, by creating dump_stack_add_arch_desc().
> > > 
> > > If there is no existing string it behaves exactly like
> > > dump_stack_set_arch_desc(). However if there is an existing string it
> > > appends to it, with a leading space.
> > > 
> > > This makes it easy to call it multiple times from different parts of the
> > > code and get a reasonable looking result.
> > > 
> > > Signed-off-by: Michael Ellerman 
> > > ---
> > >  include/linux/printk.h |  5 
> > >  lib/dump_stack.c   | 58 ++
> > >  2 files changed, 63 insertions(+)
> > > 
> > > v3: No change, just widened Cc list.
> > > 
> > > v2: Add a smp_wmb() and comment.
> > > 
> > > v1 is here for reference 
> > > https://lore.kernel.org/lkml/1430824337-15339-1-git-send-email-...@ellerman.id.au/
> > > 
> > > I'll take this series via the powerpc tree if no one minds?
> > > 
> > > 
> > > diff --git a/include/linux/printk.h b/include/linux/printk.h
> > > index 77740a506ebb..d5fb4f960271 100644
> > > --- a/include/linux/printk.h
> > > +++ b/include/linux/printk.h
> > > @@ -198,6 +198,7 @@ u32 log_buf_len_get(void);
> > >  void log_buf_vmcoreinfo_setup(void);
> > >  void __init setup_log_buf(int early);
> > >  __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
> > > +__printf(1, 2) void dump_stack_add_arch_desc(const char *fmt, ...);
> > >  void dump_stack_print_info(const char *log_lvl);
> > >  void show_regs_print_info(const char *log_lvl);
> > >  extern asmlinkage void dump_stack(void) __cold;
> > > @@ -256,6 +257,10 @@ static inline __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...)
> > >  {
> > >  }
> > >  
> > > +static inline __printf(1, 2) void dump_stack_add_arch_desc(const char *fmt, ...)
> > > +{
> > > +}
> > > +
> > >  static inline void dump_stack_print_info(const char *log_lvl)
> > >  {
> > >  }
> > > diff --git a/lib/dump_stack.c b/lib/dump_stack.c
> > > index 5cff72f18c4a..69b710ff92b5 100644
> > > --- a/lib/dump_stack.c
> > > +++ b/lib/dump_stack.c
> > > @@ -35,6 +35,64 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
> > >   va_end(args);
> > >  }
> > >  
> > > +/**
> > > + * dump_stack_add_arch_desc - add arch-specific info to show with task dumps
> > > + * @fmt: printf-style format string
> > > + * @...: arguments for the format string
> > > + *
> > > + * See dump_stack_set_arch_desc() for why you'd want to use this.
> > > + *
> > > + * This version adds to any existing string already created with either
> > > + * dump_stack_set_arch_desc() or dump_stack_add_arch_desc(). If there is an
> > > + * existing string a space will be prepended to the passed string.
> > > + */
> > > +void __init dump_stack_add_arch_desc(const char *fmt, ...)
> > > +{
> > > + va_list args;
> > > + int pos, len;
> > > + char *p;
> > > +
> > > + /*
> > > +  * If there's an existing string we snprintf() past the end of it, and
> > > +  * then turn the terminating NULL of the existing string into a space
> > > +  * to create one string separated by a space.
> > > +  *
> > > +  * If there's no existing string we just snprintf() to the buffer, like
> > > +  * dump_stack_set_arch_desc(), but without calling it because we'd need
> > > +  * a varargs version.
> > > +  */
> > > + len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));

Re: [PATCH v3 1/7] dump_stack: Support adding to the dump stack arch description

2019-02-11 Thread Andrea Parri
Hi Michael,


On Thu, Feb 07, 2019 at 11:46:29PM +1100, Michael Ellerman wrote:
> Arch code can set a "dump stack arch description string" which is
> displayed with oops output to describe the hardware platform.
> 
> It is useful to initialise this as early as possible, so that an early
> oops will have the hardware description.
> 
> However in practice we discover the hardware platform in stages, so it
> would be useful to be able to incrementally fill in the hardware
> description as we discover it.
> 
> This patch adds that ability, by creating dump_stack_add_arch_desc().
> 
> If there is no existing string it behaves exactly like
> dump_stack_set_arch_desc(). However if there is an existing string it
> appends to it, with a leading space.
> 
> This makes it easy to call it multiple times from different parts of the
> code and get a reasonable looking result.
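
(Aside, just to illustrate the intended usage -- the call sites and values
here are made up:)

	dump_stack_set_arch_desc("PowerNV");
	/* ... later, once the firmware version is discovered ... */
	dump_stack_add_arch_desc("opal:v%s", fw_version);

	/*
	 * An oops would then show, via dump_stack_print_info():
	 *   Hardware name: PowerNV opal:v1.2.3
	 */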
> 
> Signed-off-by: Michael Ellerman 
> ---
>  include/linux/printk.h |  5 
>  lib/dump_stack.c   | 58 ++
>  2 files changed, 63 insertions(+)
> 
> v3: No change, just widened Cc list.
> 
> v2: Add a smp_wmb() and comment.
> 
> v1 is here for reference 
> https://lore.kernel.org/lkml/1430824337-15339-1-git-send-email-...@ellerman.id.au/
> 
> I'll take this series via the powerpc tree if no one minds?
> 
> 
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 77740a506ebb..d5fb4f960271 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -198,6 +198,7 @@ u32 log_buf_len_get(void);
>  void log_buf_vmcoreinfo_setup(void);
>  void __init setup_log_buf(int early);
>  __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...);
> +__printf(1, 2) void dump_stack_add_arch_desc(const char *fmt, ...);
>  void dump_stack_print_info(const char *log_lvl);
>  void show_regs_print_info(const char *log_lvl);
>  extern asmlinkage void dump_stack(void) __cold;
> @@ -256,6 +257,10 @@ static inline __printf(1, 2) void dump_stack_set_arch_desc(const char *fmt, ...)
>  {
>  }
>  
> +static inline __printf(1, 2) void dump_stack_add_arch_desc(const char *fmt, ...)
> +{
> +}
> +
>  static inline void dump_stack_print_info(const char *log_lvl)
>  {
>  }
> diff --git a/lib/dump_stack.c b/lib/dump_stack.c
> index 5cff72f18c4a..69b710ff92b5 100644
> --- a/lib/dump_stack.c
> +++ b/lib/dump_stack.c
> @@ -35,6 +35,64 @@ void __init dump_stack_set_arch_desc(const char *fmt, ...)
>   va_end(args);
>  }
>  
> +/**
> + * dump_stack_add_arch_desc - add arch-specific info to show with task dumps
> + * @fmt: printf-style format string
> + * @...: arguments for the format string
> + *
> + * See dump_stack_set_arch_desc() for why you'd want to use this.
> + *
> + * This version adds to any existing string already created with either
> + * dump_stack_set_arch_desc() or dump_stack_add_arch_desc(). If there is an
> + * existing string a space will be prepended to the passed string.
> + */
> +void __init dump_stack_add_arch_desc(const char *fmt, ...)
> +{
> + va_list args;
> + int pos, len;
> + char *p;
> +
> + /*
> +  * If there's an existing string we snprintf() past the end of it, and
> +  * then turn the terminating NULL of the existing string into a space
> +  * to create one string separated by a space.
> +  *
> +  * If there's no existing string we just snprintf() to the buffer, like
> +  * dump_stack_set_arch_desc(), but without calling it because we'd need
> +  * a varargs version.
> +  */
> + len = strnlen(dump_stack_arch_desc_str, sizeof(dump_stack_arch_desc_str));
> + pos = len;
> +
> + if (len)
> + pos++;
> +
> + if (pos >= sizeof(dump_stack_arch_desc_str))
> + return; /* Ran out of space */
> +
> + p = &dump_stack_arch_desc_str[pos];
> +
> + va_start(args, fmt);
> + vsnprintf(p, sizeof(dump_stack_arch_desc_str) - pos, fmt, args);
> + va_end(args);
> +
> + if (len) {
> + /*
> +  * Order the stores above in vsnprintf() vs the store of the
> +  * space below which joins the two strings. Note this doesn't
> +  * make the code truly race free because there is no barrier on
> +  * the read side. ie. Another CPU might load the uninitialised
> +  * tail of the buffer first and then the space below (rather
> +  * than the NULL that was there previously), and so print the
> +  * uninitialised tail. But the whole string lives in BSS so in
> +  * practice it should just see NULLs.

The comment doesn't say _why_ we need to order these stores: IOW, what
will or can go wrong without this order?  This isn't clear to me.

Another good practice when adding smp_*-constructs (as discussed, e.g.,
at KS'18) is to indicate the matching construct/synch. mechanism.

  Andrea


> +  */
> + smp_wmb();
> +
> + dump_stack_arch_desc_str[len] = ' ';
> + }
> 

Re: [PATCH 2/2] smp: introduce kick_active_cpus_sync()

2018-03-26 Thread Andrea Parri
Hi Yury,

On Sun, Mar 25, 2018 at 08:50:04PM +0300, Yury Norov wrote:
> kick_all_cpus_sync() forces all CPUs to sync caches by sending a broadcast IPI.
> If a CPU is in an extended quiescent state (idle task or nohz_full userspace),
> this work may be done at the exit of that state. Delaying the synchronization
> helps to save power if the CPU is idle and decreases latency for real-time
> tasks.
> 
> This patch introduces kick_active_cpus_sync() and uses it in mm/slab and
> arm64 code to delay synchronization.
> 
> For task isolation (https://lkml.org/lkml/2017/11/3/589), an IPI to the CPU
> running an isolated task would be fatal, as it breaks isolation. The approach
> with delaying of synchronization work helps to maintain the isolated state.
> 
> I've tested it with the test from the task isolation series on ThunderX2 for
> more than 10 hours (10k giga-ticks) without breaking isolation.
> 
> Signed-off-by: Yury Norov 
> ---
>  arch/arm64/kernel/insn.c |  2 +-
>  include/linux/smp.h  |  2 ++
>  kernel/smp.c | 24 
>  mm/slab.c|  2 +-
>  4 files changed, 28 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/arm64/kernel/insn.c b/arch/arm64/kernel/insn.c
> index 2718a77da165..9d7c492e920e 100644
> --- a/arch/arm64/kernel/insn.c
> +++ b/arch/arm64/kernel/insn.c
> @@ -291,7 +291,7 @@ int __kprobes aarch64_insn_patch_text(void *addrs[], u32 insns[], int cnt)
>* synchronization.
>*/
>   ret = aarch64_insn_patch_text_nosync(addrs[0], insns[0]);
> - kick_all_cpus_sync();
> + kick_active_cpus_sync();
>   return ret;
>   }
>   }
> diff --git a/include/linux/smp.h b/include/linux/smp.h
> index 9fb239e12b82..27215e22240d 100644
> --- a/include/linux/smp.h
> +++ b/include/linux/smp.h
> @@ -105,6 +105,7 @@ int smp_call_function_any(const struct cpumask *mask,
> smp_call_func_t func, void *info, int wait);
>  
>  void kick_all_cpus_sync(void);
> +void kick_active_cpus_sync(void);
>  void wake_up_all_idle_cpus(void);
>  
>  /*
> @@ -161,6 +162,7 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
>  }
>  
>  static inline void kick_all_cpus_sync(void) {  }
> +static inline void kick_active_cpus_sync(void) {  }
>  static inline void wake_up_all_idle_cpus(void) {  }
>  
>  #ifdef CONFIG_UP_LATE_INIT
> diff --git a/kernel/smp.c b/kernel/smp.c
> index 084c8b3a2681..0358d6673850 100644
> --- a/kernel/smp.c
> +++ b/kernel/smp.c
> @@ -724,6 +724,30 @@ void kick_all_cpus_sync(void)
>  }
>  EXPORT_SYMBOL_GPL(kick_all_cpus_sync);
>  
> +/**
> + * kick_active_cpus_sync - Force CPUs that are not in extended
> + * quiescent state (idle or nohz_full userspace) sync by sending
> + * IPI. Extended quiescent state CPUs will sync at the exit of
> + * that state.
> + */
> +void kick_active_cpus_sync(void)
> +{
> + int cpu;
> + struct cpumask kernel_cpus;
> +
> + smp_mb();

(A general remark only:)

checkpatch.pl should have warned about the fact that this barrier is
missing an accompanying comment (which accesses are being "ordered",
what is the pairing barrier, etc.).

Moreover if, as your reply above suggested, your patch is relying on
"implicit barriers" (something I would not recommend) then even more
so you should comment on these requirements.

This could: (a) force you to reason about the memory ordering stuff,
(b) ease the task of reviewing and adopting your patch, (c) ease the
task of preserving those requirements (as implementations change).
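
Something along these lines (the wording is hypothetical -- the actual
pairing has to be derived from the rcu_eqs_special_set() contract):

	/*
	 * Order the caller's prior stores against the execution of
	 * do_nothing() on the active CPUs, respectively against the
	 * EQS exit on the idle/nohz_full CPUs.  Pairs with the full
	 * barrier implied by the IPI entry, respectively with the
	 * barrier in the EQS-exit path.
	 */
	smp_mb();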

  Andrea


> +
> + cpumask_clear(&kernel_cpus);
> + preempt_disable();
> + for_each_online_cpu(cpu) {
> + if (!rcu_eqs_special_set(cpu))
> + cpumask_set_cpu(cpu, &kernel_cpus);
> + }
> + smp_call_function_many(&kernel_cpus, do_nothing, NULL, 1);
> + preempt_enable();
> +}
> +EXPORT_SYMBOL_GPL(kick_active_cpus_sync);
> +
>  /**
>   * wake_up_all_idle_cpus - break all cpus out of idle
>   * wake_up_all_idle_cpus try to break all cpus which is in idle state even
> diff --git a/mm/slab.c b/mm/slab.c
> index 324446621b3e..678d5dbd6f46 100644
> --- a/mm/slab.c
> +++ b/mm/slab.c
> @@ -3856,7 +3856,7 @@ static int __do_tune_cpucache(struct kmem_cache *cachep, int limit,
>* cpus, so skip the IPIs.
>*/
>   if (prev)
> - kick_all_cpus_sync();
> + kick_active_cpus_sync();
>  
>   check_irq_on();
>   cachep->batchcount = batchcount;
> -- 
> 2.14.1
> 


Re: [PATCH tip/core/rcu 1/3] membarrier: Provide register expedited private command

2017-10-08 Thread Andrea Parri
On Fri, Oct 06, 2017 at 10:32:19AM +0200, Peter Zijlstra wrote:
> > AFAIU the scheduler rq->lock is held while preemption is disabled.
> > synchronize_sched() is used here to ensure that all pre-existing
> > preempt-off critical sections have completed.
> > 
> > So saying that we use synchronize_sched() to synchronize with rq->lock
> > would be stretching the truth a bit. It's actually only true because the
> > scheduler holding the rq->lock is surrounded by a preempt-off
> > critical section.
> 
> No, rq->lock is sufficient, note that rq->lock is a raw_spinlock_t which
> implies !preempt. Yes, we also surround the rq->lock usage with a
> slightly larger preempt_disable() section but that's not in fact
> required for this.

That's what it is, according to the current sources: we seemed to agree that
a preempt-off critical section is what we rely on here, and that the start
of this critical section is not marked by that raw_spin_lock().
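
To spell out the property being relied upon (a sketch; the grace period on
CPU 1 cannot complete while CPU 0 is inside the region on the left):

	/*
	 * CPU 0 (scheduler)                 CPU 1 (membarrier registration)
	 *
	 * raw_spin_lock(&rq->lock);         set_ti_thread_flag(t, FLAG);
	 *   // implies preempt_disable()    synchronize_sched();
	 * flag = test_ti_thread_flag(...);    // returns only after every
	 * raw_spin_unlock(&rq->lock);         // preempt-off section that
	 *   // implies preempt_enable()       // began before it, such as
	 *                                     // CPU 0's, has completed
	 */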

  Andrea


Re: [PATCH tip/core/rcu 1/3] membarrier: Provide register expedited private command

2017-10-08 Thread Andrea Parri
On Fri, Oct 6, 2017 at 12:02 AM, Andrea Parri <parri.and...@gmail.com> wrote:
> On Thu, Oct 05, 2017 at 04:02:06PM +, Mathieu Desnoyers wrote:
>> - On Oct 5, 2017, at 8:12 AM, Peter Zijlstra pet...@infradead.org wrote:
>>
>> > On Wed, Oct 04, 2017 at 02:37:53PM -0700, Paul E. McKenney wrote:
>> >> diff --git a/arch/powerpc/kernel/membarrier.c b/arch/powerpc/kernel/membarrier.c
>> >> new file mode 100644
>> >> index ..b0d79a5f5981
>> >> --- /dev/null
>> >> +++ b/arch/powerpc/kernel/membarrier.c
>> >> @@ -0,0 +1,45 @@
>> >
>> >> +void membarrier_arch_register_private_expedited(struct task_struct *p)
>> >> +{
>> >> +  struct task_struct *t;
>> >> +
>> >> +  if (get_nr_threads(p) == 1) {
>> >> +  set_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED);
>> >> +  return;
>> >> +  }
>> >> +  /*
>> >> +   * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread
>> >> +   * fork is protected by siglock.
>> >> +   */
>> >> +  spin_lock(&p->sighand->siglock);
>> >> +  for_each_thread(p, t)
>> >> +  set_ti_thread_flag(task_thread_info(t),
>> >> +  TIF_MEMBARRIER_PRIVATE_EXPEDITED);
>> >
>> > I'm not sure this works correctly vs CLONE_VM without CLONE_THREAD.
>>
>> The intent here is to hold the sighand siglock to provide mutual
>> exclusion against invocation of membarrier_fork(p, clone_flags)
>> by copy_process().
>>
>> copy_process() grabs spin_lock(&current->sighand->siglock) for both
>> CLONE_THREAD and not CLONE_THREAD flags.
>>
>> What am I missing here ?
>>
>> >
>> >> +  spin_unlock(&p->sighand->siglock);
>> >> +  /*
>> >> +   * Ensure all future scheduler executions will observe the new
>> >> +   * thread flag state for this process.
>> >> +   */
>> >> +  synchronize_sched();
>> >
>> > This relies on the flag being read inside rq->lock, right?
>>
>> Yes. The flag is read by membarrier_arch_switch_mm(), invoked
>> within switch_mm_irqs_off(), called by context_switch() before
>> finish_task_switch() releases the rq lock.
>
> I fail to graps the relation between this synchronize_sched() and rq->lock.

s/graps/grasp

  Andrea


>
> (Besides, we made no reference to rq->lock while discussing:
>
>   
> https://github.com/paulmckrcu/litmus/commit/47039df324b60ace0cf7b2403299580be730119b
>   replace membarrier_arch_sched_in with switch_mm_irqs_off )
>
> Could you elaborate?
>
>   Andrea
>
>
>>
>> Is the comment clear enough, or do you have suggestions for
>> improvements ?
>
>
>
>>
>> Thanks,
>>
>> Mathieu
>>
>> >
>> > > +}
>>
>> --
>> Mathieu Desnoyers
>> EfficiOS Inc.
>> http://www.efficios.com


Re: [PATCH tip/core/rcu 1/3] membarrier: Provide register expedited private command

2017-10-08 Thread Andrea Parri
On Thu, Oct 05, 2017 at 04:02:06PM +, Mathieu Desnoyers wrote:
> - On Oct 5, 2017, at 8:12 AM, Peter Zijlstra pet...@infradead.org wrote:
> 
> > On Wed, Oct 04, 2017 at 02:37:53PM -0700, Paul E. McKenney wrote:
> >> diff --git a/arch/powerpc/kernel/membarrier.c b/arch/powerpc/kernel/membarrier.c
> >> new file mode 100644
> >> index ..b0d79a5f5981
> >> --- /dev/null
> >> +++ b/arch/powerpc/kernel/membarrier.c
> >> @@ -0,0 +1,45 @@
> > 
> >> +void membarrier_arch_register_private_expedited(struct task_struct *p)
> >> +{
> >> +  struct task_struct *t;
> >> +
> >> +  if (get_nr_threads(p) == 1) {
> >> +  set_thread_flag(TIF_MEMBARRIER_PRIVATE_EXPEDITED);
> >> +  return;
> >> +  }
> >> +  /*
> >> +   * Coherence of TIF_MEMBARRIER_PRIVATE_EXPEDITED against thread
> >> +   * fork is protected by siglock.
> >> +   */
> >> +  spin_lock(&p->sighand->siglock);
> >> +  for_each_thread(p, t)
> >> +  set_ti_thread_flag(task_thread_info(t),
> >> +  TIF_MEMBARRIER_PRIVATE_EXPEDITED);
> > 
> > I'm not sure this works correctly vs CLONE_VM without CLONE_THREAD.
> 
> The intent here is to hold the sighand siglock to provide mutual
> exclusion against invocation of membarrier_fork(p, clone_flags)
> by copy_process().
> 
> copy_process() grabs spin_lock(&current->sighand->siglock) for both
> CLONE_THREAD and not CLONE_THREAD flags.
> 
> What am I missing here ?
> 
> > 
> >> +  spin_unlock(&p->sighand->siglock);
> >> +  /*
> >> +   * Ensure all future scheduler executions will observe the new
> >> +   * thread flag state for this process.
> >> +   */
> >> +  synchronize_sched();
> > 
> > This relies on the flag being read inside rq->lock, right?
> 
> Yes. The flag is read by membarrier_arch_switch_mm(), invoked
> within switch_mm_irqs_off(), called by context_switch() before
> finish_task_switch() releases the rq lock.

I fail to graps the relation between this synchronize_sched() and rq->lock.

(Besides, we made no reference to rq->lock while discussing:

  
https://github.com/paulmckrcu/litmus/commit/47039df324b60ace0cf7b2403299580be730119b
  replace membarrier_arch_sched_in with switch_mm_irqs_off )

Could you elaborate?

  Andrea


> 
> Is the comment clear enough, or do you have suggestions for
> improvements ?



> 
> Thanks,
> 
> Mathieu
> 
> > 
> > > +}
> 
> -- 
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com