Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-12 Thread Prarit Bhargava


On 11/12/2017 08:36 AM, Thomas Gleixner wrote:
> On Fri, 10 Nov 2017, Andi Kleen wrote:
>>>> All of that works. There is no way to make sure that a lookup is fully
>>>> serialized against a concurrent update. Even if the lookup holds
>>>> cpu_read_lock() the new package might arrive right after the unlock.

>>>
>>> Thanks Thomas.
>>>
>>> Andi, do you want to take a look at this?
>>
>> I was originally worried about races, that is why I tried to put
>> everything into cpu_data. But that didn't work out because something
>> clears it. Perhaps the right solution would be some extra per_cpu
>> data variables, and search for the first match. I suspect that would
>> be simpler. But if that doesn't work I guess something like Thomas'
>> example will work.
> 
> Sure, we can use a separate per cpu variable. The race for looking up phys
> -> logical will always be there if that handles stuff like the uncore PCI
> physid one. There is not much which can prevent that.
> 
> The other option is to figure out what clears cpu_data on online and just
> preserve the logical/physical translation across that clear.
> 
> One thing you need to be careful about (in both cases) is the value. The
> data is zeroed on boot, so we either need to fill that with UINT_MAX at
> boot time in one of the functions which does a for_each_possible_cpu() loop
> anyway or just leave 0 as the 'not initialized' value and make the first
> logical package be '1'. The readout functions (percpu, translation,
> etc.) can just subtract 1.

I'll look into this (code & test) and get back with a v6.

Thanks Thomas,

P.

> 
> Thanks,
> 
>   tglx
> 


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-12 Thread Thomas Gleixner
On Fri, 10 Nov 2017, Andi Kleen wrote:
> > > All of that works. There is no way to make sure that a lookup is fully
> > > serialized against a concurrent update. Even if the lookup holds
> > > cpu_read_lock() the new package might arrive right after the unlock.
> > > 
> > 
> > Thanks Thomas.
> > 
> > Andi, do you want to take a look at this?
> 
> I was originally worried about races, that is why I tried to put
> everything into cpu_data. But that didn't work out because something
> clears it. Perhaps the right solution would be some extra per_cpu
> data variables, and search for the first match. I suspect that would
> be simpler. But if that doesn't work I guess something like Thomas'
> example will work.

Sure, we can use a separate per cpu variable. The race for looking up phys
-> logical will always be there if that handles stuff like the uncore PCI
physid one. There is not much which can prevent that.

The other option is to figure out what clears cpu_data on online and just
preserve the logical/physical translation across that clear.

One thing you need to be careful about (in both cases) is the value. The
data is zeroed on boot, so we either need to fill that with UINT_MAX at
boot time in one of the functions which does a for_each_possible_cpu() loop
anyway or just leave 0 as the 'not initialized' value and make the first
logical package be '1'. The readout functions (percpu, translation,
etc.) can just subtract 1.

Thanks,

tglx



Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Andi Kleen
> Andi's idea would work and the code would become much cleaner, if
> smp_store_cpu_info() only overwrote cpu_data for new physically hotplugged 
> cpus.

Can always just use a separate per_cpu variable for this that is not cleared.

-Andi


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Prarit Bhargava


On 11/10/2017 10:01 AM, Andi Kleen wrote:
>>> All of that works. There is no way to make sure that a lookup is fully
>>> serialized against a concurrent update. Even if the lookup holds
>>> cpu_read_lock() the new package might arrive right after the unlock.
>>>
>>
>> Thanks Thomas.
>>
>> Andi, do you want to take a look at this?
> 
> I was originally worried about races, that is why I tried to put
> everything into cpu_data. But that didn't work out because something
> clears it. Perhaps the right solution would be some extra per_cpu
> data variables, and search for the first match. I suspect that would
> be simpler. But if that doesn't work I guess something like Thomas'
> example will work.
> 
> I assume you will handle it, Prarit?

I can, but let's discuss your paragraph above.  I'd like to get Thomas' thoughts
on the problem and see if he thinks your original idea is valid, or if he has a
better suggestion.

Thomas, what Andi was originally attempting to do (way back in v1) is store the
logical_proc_id in cpu_data and drop the array completely.

The problem is that on cpu up, smp_store_cpu_info() overwrites the cpu's
cpu_data with boot_cpu_data (arch/x86/kernel/smpboot.c:402).  AFAICT this is
done because *new* cpus (i.e., physical hotplug) should be initialized with
some basic data from boot_cpu_data; for an already existing cpu this looks
unnecessary as the data has already been initialized.

Andi's idea would work and the code would become much cleaner, if
smp_store_cpu_info() only overwrote cpu_data for new physically hotplugged cpus.

Admittedly, I might be missing some path to that code so I'm looking for
verification.  What do you think?

P.

> 
> -Andi
> 


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Andi Kleen
> > All of that works. There is no way to make sure that a lookup is fully
> > serialized against a concurrent update. Even if the lookup holds
> > cpu_read_lock() the new package might arrive right after the unlock.
> > 
> 
> Thanks Thomas.
> 
> Andi, do you want to take a look at this?

I was originally worried about races, that is why I tried to put
everything into cpu_data. But that didn't work out because something
clears it. Perhaps the right solution would be some extra per_cpu
data variables, and search for the first match. I suspect that would
be simpler. But if that doesn't work I guess something like Thomas'
example will work.

I assume you will handle it, Prarit?

-Andi


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Thomas Gleixner
On Fri, 10 Nov 2017, Thomas Gleixner wrote:

> On Fri, 10 Nov 2017, Prarit Bhargava wrote:
> > On 11/09/2017 07:43 PM, Thomas Gleixner wrote:
> > > On Sun, 5 Nov 2017, Prarit Bhargava wrote:
> > >> [v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
> > >> virtual machines.
> > > 
> > > What has this to do with virtual machines? The very same issue is on
> > > physical hardware because this is called from the early CPU bringup code
> > > with interrupts and preemption disabled.
> > 
> > There was an Intel test bot report of a failure during boot on virtual
> > systems with Andi's patch.
> 
> Sure, but the problem has nothing to do with virtual machines at all.

The same is true for the KASAN report of an out-of-bounds access:

new = logical_packages++;
-   if (new != pkg) {
-   pr_info("CPU %u Converting physical %u to logical package %u\n",
-   cpu, pkg, new);
+
+   /* Allocate and copy a new array */
+   ltp_pkg_map_new = kmalloc(logical_packages * sizeof(u16), GFP_KERNEL);
+   BUG_ON(!ltp_pkg_map_new);
+   if (logical_to_physical_pkg_map) {
+   memcpy(ltp_pkg_map_new, logical_to_physical_pkg_map,
+  logical_packages * sizeof(u16));

That's caused by incrementing logical_packages _before_ the memcpy(): the
copy uses the new count, but the old array is one u16 shorter than the newly
allocated one.

That's reported on a VM as well, but is a genuine code bug.

Thanks,

tglx


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Thomas Gleixner
On Fri, 10 Nov 2017, Prarit Bhargava wrote:
> On 11/09/2017 07:43 PM, Thomas Gleixner wrote:
> > On Sun, 5 Nov 2017, Prarit Bhargava wrote:
> >> [v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
> >> virtual machines.
> > 
> > What has this to do with virtual machines? The very same issue is on
> > physical hardware because this is called from the early CPU bringup code
> > with interrupts and preemption disabled.
> 
> There was an Intel test bot report of a failure during boot on virtual systems
> with Andi's patch.

Sure, but the problem has nothing to do with virtual machines at all.

Thanks,

tglx


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-10 Thread Prarit Bhargava


On 11/09/2017 07:43 PM, Thomas Gleixner wrote:
> On Sun, 5 Nov 2017, Prarit Bhargava wrote:
>> [v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
>> virtual machines.
> 
> What has this to do with virtual machines? The very same issue is on
> physical hardware because this is called from the early CPU bringup code
> with interrupts and preemption disabled.

There was an Intel test bot report of a failure during boot on virtual systems
with Andi's patch.

> 
>> +/* Allocate and copy a new array */
>> +ltp_pkg_map_new = kmalloc(logical_packages * sizeof(u16), GFP_ATOMIC);
>> +BUG_ON(!ltp_pkg_map_new);
> 
> Having an allocation in that code path is a bad idea. First of all the
> error handling in there is just crap, because the only thing you can do is
> panic. Aside of that atomic allocations should be avoided when we can and
> we can.
> 
> Sorry I missed that when looking at the patch earlier. Something along this
> makes it work proper:
> 
> struct pkg_map {
> 	unsigned int	size;
> 	unsigned int	used;
> 	unsigned int	map[0];
> };
> 
> static struct pkg_map *logical_to_physical_pkg_map __read_mostly;
> 
> static int resize_pkg_map(void)
> {
>   struct pkg_map *newmap, *oldmap = logical_to_physical_pkg_map;
>   int size;
> 
>   if (oldmap->size > oldmap->used)
>   return 0;
> 
>   size = sizeof(*oldmap) + sizeof(unsigned int) * oldmap->size;
> 	newmap = kzalloc(size + sizeof(unsigned int), GFP_KERNEL);
>   if (!newmap)
>   return -ENOMEM;
> 
>   memcpy(newmap, oldmap, size);
>   newmap->size++;
>   logical_to_physical_pkg_map = newmap;
>   kfree(oldmap);
>   return 0;
> }
> 
> int __cpu_up()
> {
>   if (resize_pkg_map())
>   return -ENOMEM;
>   return smp_ops.cpu_up();
> }
> 
> static void update_map()
> {
>   if (find_map())
>   return;
>   map->map[map->used] = physid;
>   map->used++;
> }
> 
> static void smp_init_package_map()
> {
> 	struct pkg_map *map;
> 
> 	map = kzalloc(sizeof(*map) + sizeof(unsigned int), GFP_KERNEL);
> 	map->size = 1;
> 	logical_to_physical_pkg_map = map;
> }
> 
> See? No BUG_ON() in the early secondary cpu boot code. If memory allocation
> fails the thing goes back gracefully.
> 
> Locking/barriers omitted as you have choices here:
> 
>    1) RCU
> 
>       Needs the proper RCU magic for the lookup and the pointer swap.
> 
>       That also requires a proper barrier between the assignment of the
>       new id and the increment of the used count, plus the corresponding
>       one on the read side.
> 
>    2) mutex
> 
>       Must be held when swapping the pointers and across the lookup.
> 
>       Same barrier requirement as RCU.
> 
>    3) raw_spinlock
> 
>       Must be held when swapping the pointers and across the lookup.
> 
>       No barriers as long as you hold the lock across the assignment and
>       increment.
> 
> All of that works. There is no way to make sure that a lookup is fully
> serialized against a concurrent update. Even if the lookup holds
> cpu_read_lock() the new package might arrive right after the unlock.
> 

Thanks Thomas.

Andi, do you want to take a look at this?

P.

> Thanks,
> 
>   tglx
> 


Re: [PATCH v5 2/3] x86/topology: Avoid wasting 128k for package id array

2017-11-09 Thread Thomas Gleixner
On Sun, 5 Nov 2017, Prarit Bhargava wrote:
> [v5]: Change kmalloc to GFP_ATOMIC to fix "sleeping function" warning on
> virtual machines.

What has this to do with virtual machines? The very same issue is on
physical hardware because this is called from the early CPU bringup code
with interrupts and preemption disabled.

> + /* Allocate and copy a new array */
> + ltp_pkg_map_new = kmalloc(logical_packages * sizeof(u16), GFP_ATOMIC);
> + BUG_ON(!ltp_pkg_map_new);

Having an allocation in that code path is a bad idea. First of all the
error handling in there is just crap, because the only thing you can do is
panic. Aside of that atomic allocations should be avoided when we can and
we can.

Sorry I missed that when looking at the patch earlier. Something along this
makes it work proper:

struct pkg_map {
	unsigned int	size;
	unsigned int	used;
	unsigned int	map[0];
};

static struct pkg_map *logical_to_physical_pkg_map __read_mostly;

static int resize_pkg_map(void)
{
	struct pkg_map *newmap, *oldmap = logical_to_physical_pkg_map;
	int size;

	if (oldmap->size > oldmap->used)
		return 0;

	size = sizeof(*oldmap) + sizeof(unsigned int) * oldmap->size;
	newmap = kzalloc(size + sizeof(unsigned int), GFP_KERNEL);
	if (!newmap)
		return -ENOMEM;

	memcpy(newmap, oldmap, size);
	newmap->size++;
	logical_to_physical_pkg_map = newmap;
	kfree(oldmap);
	return 0;
}

int __cpu_up()
{
	if (resize_pkg_map())
		return -ENOMEM;
	return smp_ops.cpu_up();
}

static void update_map()
{
	if (find_map())
		return;
	map->map[map->used] = physid;
	map->used++;
}

static void smp_init_package_map()
{
	struct pkg_map *map;

	map = kzalloc(sizeof(*map) + sizeof(unsigned int), GFP_KERNEL);
	map->size = 1;
	logical_to_physical_pkg_map = map;
}

See? No BUG_ON() in the early secondary cpu boot code. If memory allocation
fails the thing goes back gracefully.

Locking/barriers omitted as you have choices here:

   1) RCU

      Needs the proper RCU magic for the lookup and the pointer swap.

      That also requires a proper barrier between the assignment of the
      new id and the increment of the used count, plus the corresponding
      one on the read side.

   2) mutex

      Must be held when swapping the pointers and across the lookup.

      Same barrier requirement as RCU.

   3) raw_spinlock

      Must be held when swapping the pointers and across the lookup.

      No barriers as long as you hold the lock across the assignment and
      increment.

All of that works. There is no way to make sure that a lookup is fully
serialized against a concurrent update. Even if the lookup holds
cpu_read_lock() the new package might arrive right after the unlock.

Thanks,

tglx

