* Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com> [2020-08-17 17:04:24]:
> On 8/17/20 4:29 PM, Srikar Dronamraju wrote:
> > * Aneesh Kumar K.V <aneesh.ku...@linux.ibm.com> [2020-08-17 16:02:36]:
> >
> > > We use ibm,associativity and ibm,associativity-lookup-arrays to derive
> > > the numa node numbers. These device tree properties are firmware
> > > indicated grouping of resources based on their hierarchy in the
> > > platform. These numbers (group id) are not sequential and
> > > hypervisor/firmware can follow different numbering schemes.
> > > For ex: on powernv platforms, we group them in the below order.
> > >
> > >  *     - CCM node ID
> > >  *     - HW card ID
> > >  *     - HW module ID
> > >  *     - Chip ID
> > >  *     - Core ID
> > >
> > > Based on ibm,associativity-reference-points we use one of the above
> > > group ids as Linux NUMA node id. (On PowerNV platform Chip ID is used).
> > > This results in Linux reporting non-linear NUMA node id and which also
> > > results in Linux reporting empty node 0 NUMA nodes.
> > >
> > > This can be resolved by mapping the firmware provided group id to a
> > > logical Linux NUMA id. In this patch, we do this only for pseries
> > > platforms considering the firmware group id is a virtualized entity and
> > > users would not have drawn any conclusion based on the Linux Numa Node
> > > id.
> > >
> > > On PowerNV platform since we have historically mapped Chip ID as Linux
> > > NUMA node id, we keep the existing Linux NUMA node id numbering.
> >
> > I still dont understand how you are going to handle numa distances.
> > With your patch, have you tried dlpar add/remove on a sparsely noded
> > machine?
> >
>
> We follow the same steps when fetching distance information. Instead of
> using affinity domain id, we now use the mapped node id. The relevant hunk
> in the patch is
>
> +	nid = affinity_domain_to_nid(&domain);
>
>  	if (nid > 0 &&
> -	    of_read_number(associativity, 1) >= distance_ref_points_depth) {
> +		of_read_number(associativity, 1) >= distance_ref_points_depth) {
>  		/*
>  		 * Skip the length field and send start of associativity array
>  		 */
>
> I haven't tried dlpar add/remove. I don't have a setup to try that. Do you
> see a problem there?
>

Yes, I think there can be 2 problems.

1. distance table may be filled with incorrect data.
2. numactl -H distance table shows symmetric data, the symmetric nature may
   be lost.

> -aneesh
>
>

-- 
Thanks and Regards
Srikar Dronamraju
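
For readers following the thread, here is a minimal userspace sketch of the
idea under discussion: sparse firmware group ids are mapped to dense logical
node ids in first-seen order, and a distance table indexed by logical id is
checked for symmetry. This is not the code from the patch. The names
(struct nid_map, group_id_to_nid, distance_table_is_symmetric), the MAX_NODES
limit and the first-seen allocation policy are all assumptions made for
illustration only.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NODES 16	/* assumed limit, for this sketch only */
#define NO_NODE   (-1)

/* Hypothetical table: slot i holds the firmware group id of logical node i. */
struct nid_map {
	int group_id[MAX_NODES];
	int nr_nodes;
};

static void nid_map_init(struct nid_map *map)
{
	assert(map != NULL);
	map->nr_nodes = 0;
}

/*
 * Return the logical node id for a firmware group id. An id seen for the
 * first time gets the next free logical id; a known id returns the same
 * logical id as before. Returns NO_NODE for a negative group id or when
 * the table is full.
 */
static int group_id_to_nid(struct nid_map *map, int group_id)
{
	int i;

	assert(map != NULL);
	if (group_id < 0)
		return NO_NODE;

	for (i = 0; i < map->nr_nodes; i++)
		if (map->group_id[i] == group_id)
			return i;

	if (map->nr_nodes == MAX_NODES)
		return NO_NODE;

	map->group_id[map->nr_nodes] = group_id;
	return map->nr_nodes++;
}

/* Return 1 if dist[a][b] == dist[b][a] for all logical ids below nr_nodes. */
static int distance_table_is_symmetric(int dist[][MAX_NODES], int nr_nodes)
{
	int a, b;

	for (a = 0; a < nr_nodes; a++)
		for (b = a + 1; b < nr_nodes; b++)
			if (dist[a][b] != dist[b][a])
				return 0;
	return 1;
}
```

With this scheme, group ids 8 and 2 seen in that order become logical nodes
0 and 1. Both concerns raised above come down to whether every consumer of
the distance information indexes it by the logical id rather than by the
firmware group id, including after nodes are added or removed at runtime.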