On Fri, Mar 20, 2026 at 09:31:57AM -0400, Mathieu Desnoyers wrote:
> On 2026-03-20 09:21, Harry Yoo (Oracle) wrote:
> > On Fri, Mar 20, 2026 at 08:35:46AM -0400, Mathieu Desnoyers wrote:
> > > On 2026-03-20 00:17, Harry Yoo wrote:
> > > [...]
> > > > > [1]: 
> > > > > https://lore.kernel.org/[email protected]/
> > > > 
> > > > @Mathieu: In patch 1/3 description,
> > > > > Changes since v7:
> > > > > - Explicitly initialize the subsystem from start_kernel() right
> > > > >     after mm_core_init() so it is up and running before the creation 
> > > > > of
> > > > >     the first mm at boot.
> > > > 
> > > > But how does this work when someone calls mm_cpumask() on init_mm early?
> > > > Looks like it will behave incorrectly because get_rss_stat_items_size()
> > > > returns zero?
> > > 
> > > It doesn't work as expected at all. I missed that all users of 
> > > mm_cpumask()
> > > end up relying on get_rss_stat_items_size(), which now calls
> > > percpu_counter_tree_items_size(), which depends on initialization from
> > > percpu_counter_tree_subsystem_init().
> > > 
> > > If you add a call to percpu_counter_tree_subsystem_init in
> > > arch/powerpc/kernel/setup_arch() just before:
> > > 
> > >          VM_WARN_ON(cpumask_test_cpu(smp_processor_id(), 
> > > mm_cpumask(&init_mm)));
> > >          cpumask_set_cpu(smp_processor_id(), mm_cpumask(&init_mm));
> > > 
> > > Does the warning go away ?
> > 
> > Hmm it goes away, but I'm not sure if it is it okay to use nr_cpu_ids
> > before setup_nr_cpu_ids() is called?
> 
> AFAIU on powerpc setup_nr_cpu_ids() is called near the end of
> smp_setup_cpu_maps(), which is called early in setup_arch,
> at least before the two lines which use mm_cpumask.

Right.

> > > Alternatively, would could use a lazy initialization invoking
> > > percpu_counter_tree_subsystem_init from percpu_counter_tree_items_size
> > > when the initialization is not already done.
> > 
> > So this probably isn't a way to go?
> 
> I'd favor explicit initialization, so the inter-dependencies are clear.

Ack.

> > Hmm perhaps we should treat init_mm as a special case in
> > mm_cpus_allowed() and mm_cpumask().
> 
> I'd prefer not to go there if boot sequence permits and keep things
> simple.
> 
> I think we're in a situation very similar to tree RCU, here is what
> is done in rcu_init_geometry:
> 
>         static bool initialized;
> 
>         if (initialized) {
>                 /*
>                  * Warn if setup_nr_cpu_ids() had not yet been invoked,
>                  * unless nr_cpus_ids == NR_CPUS, in which case who cares?
>                  */
>                 WARN_ON_ONCE(old_nr_cpu_ids != nr_cpu_ids);
>                 return;
>         }
> 
>         old_nr_cpu_ids = nr_cpu_ids;
>         initialized = true;

Yeah, as long as nr_cpus_order doesn't change after init,
that will work for HPCC. powerpc seems to be a special case that calls
mm_cpumask() very early in the boot process, so explicitly calling the
init function seems to be fair.

By the way, thinking about it differently - it would probably be simpler
to just eliminate mm_cpumask's dependency on HPCC init dependency by
placing those cpumasks before percpu counter tree items... (but yeah,
that would make mm_struct a bit larger due to alignment requirements)

-- 
Cheers,
Harry / Hyeonggon

Reply via email to