Re: [PATCH v03] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update

2019-02-08 Thread Michael Bringmann
On 2/7/19 11:44 PM, Srikar Dronamraju wrote:
>>
>>  int arch_update_cpu_topology(void)
>>  {
>> -	return numa_update_cpu_topology(true);
>> +	int changed = topology_changed;
>> +
>> +	topology_changed = 0;
>> +	return changed;
>>  }
>>
> 
> Do we need the powerpc override of arch_update_cpu_topology() now?  The
> topology_changed flag did not seem to help when it existed some time back,
> and the scheduler now ignores whether the topology changed or not.

I was dealing with a concurrency problem.  Revisiting.
> 
> Also we can do away with the new topology_changed.
> 
>>  static void topology_work_fn(struct work_struct *work)
>>  {
>> -	rebuild_sched_domains();
>> +	lock_device_hotplug();
>> +	if (numa_update_cpu_topology(true))
>> +		rebuild_sched_domains();
>> +	unlock_device_hotplug();
>>  }
> 
> Should this hunk be a separate patch, with a changelog that explains why
> rebuild_sched_domains() needs to be called under lock_device_hotplug()?
> rebuild_sched_domains() already takes cpuset_mutex, so I am not sure we
> need to take device_hotplug_lock as well.

topology_work_fn() runs in its own thread, just as the DLPAR operations do.
This patch has topology_work_fn() call Nathan's 'dlpar_cpu_readd', so the
lock_device_hotplug()/unlock_device_hotplug() pair guards against racing
with those DLPAR operations; taking the lock here keeps the readd from
overlapping them.  This change depends on using dlpar_cpu_readd.
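
For reference, a rough sketch of what the readd amounts to, assuming
dlpar_cpu_readd is layered on the existing dlpar_cpu_remove_by_index()/
dlpar_cpu_add() helpers in hotplug-cpu.c (details may differ in Nathan's
patch):

	/* Sketch only: remove and re-add the CPU by its drc-index while the
	 * caller (topology_work_fn) holds device_hotplug_lock.
	 */
	int dlpar_cpu_readd(int cpu)
	{
		struct device_node *dn;
		struct device *dev;
		u32 drc_index;
		int rc;

		dev = get_cpu_device(cpu);
		dn = dev->of_node;

		rc = of_property_read_u32(dn, "ibm,my-drc-index", &drc_index);
		if (rc)
			return rc;

		rc = dlpar_cpu_remove_by_index(drc_index);
		if (!rc)
			rc = dlpar_cpu_add(drc_index);

		return rc;
	}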

> 
>>  static DECLARE_WORK(topology_work, topology_work_fn);
>>
>> -static void topology_schedule_update(void)
>> +void topology_schedule_update(void)
>>  {
>> -	schedule_work(&topology_work);
>> +	if (!topology_update_in_progress)
>> +		schedule_work(&topology_work);
>>  }
>>
>>  static void topology_timer_fn(struct timer_list *unused)
>>  {
>> +	bool sdo = false;
> 
> Is sdo an abbreviation for something?

It stands for 'do the schedule update'.  Will remove it, per the note below.

> 
>> +
>> +	if (topology_scans < 1)
>> +		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
>> +			    nr_cpumask_bits);
> 
> Why do we need topology_scans?  Is it just to make sure
> cpu_associativity_changes_mask is populated only once?  Can't we use a
> static bool inside the function for that?

I was running into a race condition.  On one of my test systems,
start_topology_update() was invoked via shared_proc_topology_init() before
PHYP had provided any change information about the CPUs that early in boot.
The first run cleared the cpu bits in cpu_associativity_changes_mask, and
subsequent runs then ignored the reported updates.
Taking another look.
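
For reference, the variant I understand you to be suggesting would look
roughly like this (illustrative only; the 'first_pass' name is made up):

	static void topology_timer_fn(struct timer_list *unused)
	{
		/* Illustrative: seed the change mask once, on the first timer
		 * pass, using a function-local flag instead of the file-scope
		 * topology_scans counter.
		 */
		static bool first_pass = true;

		if (first_pass) {
			bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
				    nr_cpumask_bits);
			first_pass = false;
		}

		/* ... rest of the function unchanged from the patch ... */
	}
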
> 
> 
>> +
>>  	if (prrn_enabled && cpumask_weight(&cpu_associativity_changes_mask))
>> -		topology_schedule_update();
>> -	else if (vphn_enabled) {
>> +		sdo =  true;
>> +	if (vphn_enabled) {
> 
> Any reason to remove the else above?

When both vphn_enabled and prrn_enabled were set, the 'else' meant that
'update_cpu_associativity_changes_mask()' was never called, so the timer
function did not collect the change information it needed.

>>  		if (update_cpu_associativity_changes_mask() > 0)
>> -			topology_schedule_update();
>> +			sdo =  true;
>>  		reset_topology_timer();
>>  	}
>> +	if (sdo)
>> +		topology_schedule_update();
>> +	topology_scans++;
>>  }
> 
> Are the above two hunks necessary?  I am not seeing how the current changes
> differ from the previous behaviour.

Not important.  Will undo.
> 

-- 
Michael W. Bringmann
Linux Technology Center
IBM Corporation
Tie-Line  363-5196
External: (512) 286-5196
Cell:   (512) 466-0650
m...@linux.vnet.ibm.com



Re: [PATCH v03] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update

2019-02-07 Thread Srikar Dronamraju
> 
>  int arch_update_cpu_topology(void)
>  {
> -	return numa_update_cpu_topology(true);
> +	int changed = topology_changed;
> +
> +	topology_changed = 0;
> +	return changed;
>  }
> 

Do we need the powerpc override of arch_update_cpu_topology() now?  The
topology_changed flag did not seem to help when it existed some time back,
and the scheduler now ignores whether the topology changed or not.

Also we can do away with the new topology_changed.

>  static void topology_work_fn(struct work_struct *work)
>  {
> -	rebuild_sched_domains();
> +	lock_device_hotplug();
> +	if (numa_update_cpu_topology(true))
> +		rebuild_sched_domains();
> +	unlock_device_hotplug();
>  }

Should this hunk be a separate patch, with a changelog that explains why
rebuild_sched_domains() needs to be called under lock_device_hotplug()?
rebuild_sched_domains() already takes cpuset_mutex, so I am not sure we
need to take device_hotplug_lock as well.

>  static DECLARE_WORK(topology_work, topology_work_fn);
> 
> -static void topology_schedule_update(void)
> +void topology_schedule_update(void)
>  {
> -	schedule_work(&topology_work);
> +	if (!topology_update_in_progress)
> +		schedule_work(&topology_work);
>  }
> 
>  static void topology_timer_fn(struct timer_list *unused)
>  {
> +	bool sdo = false;

Is sdo an abbreviation for something?

> +
> +	if (topology_scans < 1)
> +		bitmap_fill(cpumask_bits(&cpu_associativity_changes_mask),
> +			    nr_cpumask_bits);

Why do we need topology_scans?  Is it just to make sure
cpu_associativity_changes_mask is populated only once?  Can't we use a
static bool inside the function for that?


> +
>  	if (prrn_enabled && cpumask_weight(&cpu_associativity_changes_mask))
> -		topology_schedule_update();
> -	else if (vphn_enabled) {
> +		sdo =  true;
> +	if (vphn_enabled) {

Any reason to remove the else above?

>  		if (update_cpu_associativity_changes_mask() > 0)
> -			topology_schedule_update();
> +			sdo =  true;
>  		reset_topology_timer();
>  	}
> +	if (sdo)
> +		topology_schedule_update();
> +	topology_scans++;
>  }

Are the above two hunks necessary?  I am not seeing how the current changes
differ from the previous behaviour.

-- 
Thanks and Regards
Srikar Dronamraju



Re: [PATCH v03] powerpc/numa: Perform full re-add of CPU for PRRN/VPHN topology update

2019-02-06 Thread kbuild test robot
Hi Michael,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on v5.0-rc4 next-20190206]
[if your patch is applied to the wrong git tree, please drop us a note to help 
improve the system]

url:
https://github.com/0day-ci/linux/commits/Michael-Bringmann/powerpc-numa-Perform-full-re-add-of-CPU-for-PRRN-VPHN-topology-update/20190207-101545
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gnu-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.2.0 make.cross ARCH=powerpc

All errors (new ones prefixed by >>):

   arch/powerpc/mm/numa.c: In function 'numa_update_cpu_topology':
>> arch/powerpc/mm/numa.c:1361:4: error: implicit declaration of function 'dlpar_cpu_readd'; did you mean 'raw_cpu_read'? [-Werror=implicit-function-declaration]
     dlpar_cpu_readd(cpu);
     ^~~
     raw_cpu_read
   cc1: some warnings being treated as errors
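
The error indicates that numa.c cannot see a declaration for the new helper.
A minimal fix would be a prototype visible to numa.c, for example (the header
placement shown here is an assumption, not part of the posted patch):

    /* Assumed placement: alongside the other pseries CPU hotplug hooks,
     * e.g. in arch/powerpc/include/asm/topology.h.
     */
    int dlpar_cpu_readd(int cpu);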

vim +1361 arch/powerpc/mm/numa.c

  1298	
  1299	/*
  1300	 * Update the node maps and sysfs entries for each cpu whose home node
  1301	 * has changed. Returns 1 when the topology has changed, and 0 otherwise.
  1302	 *
  1303	 * readd_cpus: Also readd any CPUs that have changed affinity
  1304	 */
  1305	static int numa_update_cpu_topology(bool readd_cpus)
  1306	{
  1307		unsigned int cpu, sibling, changed = 0;
  1308		struct topology_update_data *updates, *ud;
  1309		cpumask_t updated_cpus;
  1310		struct device *dev;
  1311		int weight, new_nid, i = 0;
  1312	
  1313		if ((!prrn_enabled && !vphn_enabled && topology_inited) ||
  1314		    topology_update_in_progress)
  1315			return 0;
  1316	
  1317		weight = cpumask_weight(&cpu_associativity_changes_mask);
  1318		if (!weight)
  1319			return 0;
  1320	
  1321		updates = kcalloc(weight, sizeof(*updates), GFP_KERNEL);
  1322		if (!updates)
  1323			return 0;
  1324	
  1325		topology_update_in_progress = 1;
  1326	
  1327		cpumask_clear(&updated_cpus);
  1328	
  1329		for_each_cpu(cpu, &cpu_associativity_changes_mask) {
  1330			/*
  1331			 * If siblings aren't flagged for changes, updates list
  1332			 * will be too short. Skip on this update and set for next
  1333			 * update.
  1334			 */
  1335			if (!cpumask_subset(cpu_sibling_mask(cpu),
  1336					&cpu_associativity_changes_mask)) {
  1337				pr_info("Sibling bits not set for associativity "
  1338					"change, cpu%d\n", cpu);
  1339				cpumask_or(&cpu_associativity_changes_mask,
  1340					&cpu_associativity_changes_mask,
  1341					cpu_sibling_mask(cpu));
  1342				cpu = cpu_last_thread_sibling(cpu);
  1343				continue;
  1344			}
  1345	
  1346			new_nid = find_and_online_cpu_nid(cpu);
  1347	
  1348			if ((new_nid == numa_cpu_lookup_table[cpu]) ||
  1349			    !cpu_present(cpu)) {
  1350				cpumask_andnot(&cpu_associativity_changes_mask,
  1351					&cpu_associativity_changes_mask,
  1352					cpu_sibling_mask(cpu));
  1353				if (cpu_present(cpu))
  1354					dbg("Assoc chg gives same node %d for cpu%d\n",
  1355						new_nid, cpu);
  1356				cpu = cpu_last_thread_sibling(cpu);
  1357				continue;
  1358			}
  1359	
  1360			if (readd_cpus)
> 1361				dlpar_cpu_readd(cpu);
  1362	
  1363			for_each_cpu(sibling, cpu_sibling_mask(cpu)) {
  1364				ud = &updates[i++];
  1365				ud->next = &updates[i];
  1366				ud->cpu = sibling;
  1367				ud->new_nid = new_nid;
  1368				ud->old_nid = numa_cpu_lookup_table[sibling];
  1369				cpumask_set_cpu(sibling, &updated_cpus);
  1370			}
  1371			cpu = cpu_last_thread_sibling(cpu);
  1372		}
  1373	
  1374		/*
  1375		 * Prevent processing of 'updates' from overflowing array
  1376		 * where last entry filled in a 'next' pointer.
  1377	