[Xen-devel] [PATCH v4 01/17] cpu: Add new {add, remove}_cpu() functions

2020-03-23 Thread Qais Yousef
The new functions use device_{online,offline}() which are userspace
safe.

This is in preparation to move cpu_{up, down} kernel users to use
a safer interface that is not racy with userspace.

Suggested-by: "Paul E. McKenney" 
Signed-off-by: Qais Yousef 
CC: Thomas Gleixner 
CC: "Paul E. McKenney" 
CC: Helge Deller 
CC: Michael Ellerman 
CC: "David S. Miller" 
CC: Juergen Gross 
CC: Mark Rutland 
CC: Lorenzo Pieralisi 
CC: xen-devel@lists.xenproject.org
CC: linux-par...@vger.kernel.org
CC: sparcli...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: linux-arm-ker...@lists.infradead.org
CC: x...@kernel.org
CC: linux-ker...@vger.kernel.org
---
 include/linux/cpu.h |  2 ++
 kernel/cpu.c| 24 
 2 files changed, 26 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 1ca2baf817ed..cf8cf38dca43 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -89,6 +89,7 @@ extern ssize_t arch_cpu_release(const char *, size_t);
 #ifdef CONFIG_SMP
 extern bool cpuhp_tasks_frozen;
 int cpu_up(unsigned int cpu);
+int add_cpu(unsigned int cpu);
 void notify_cpu_starting(unsigned int cpu);
 extern void cpu_maps_update_begin(void);
 extern void cpu_maps_update_done(void);
@@ -118,6 +119,7 @@ extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
 void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);
+int remove_cpu(unsigned int cpu);
 
 #else /* CONFIG_HOTPLUG_CPU */
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9c706af713fb..069802f7010f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1057,6 +1057,18 @@ int cpu_down(unsigned int cpu)
 }
 EXPORT_SYMBOL(cpu_down);
 
+int remove_cpu(unsigned int cpu)
+{
+   int ret;
+
+   lock_device_hotplug();
+   ret = device_offline(get_cpu_device(cpu));
+   unlock_device_hotplug();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(remove_cpu);
+
 #else
 #define takedown_cpu   NULL
 #endif /*CONFIG_HOTPLUG_CPU*/
@@ -1209,6 +1221,18 @@ int cpu_up(unsigned int cpu)
 }
 EXPORT_SYMBOL_GPL(cpu_up);
 
+int add_cpu(unsigned int cpu)
+{
+   int ret;
+
+   lock_device_hotplug();
+   ret = device_online(get_cpu_device(cpu));
+   unlock_device_hotplug();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(add_cpu);
+
 #ifdef CONFIG_PM_SLEEP_SMP
 static cpumask_var_t frozen_cpus;
 
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v4 00/17] Convert cpu_up/down to device_online/offline

2020-03-23 Thread Qais Yousef
=
Changes in v4
=

* Split arm and arm64 patches so that the change to use reboot_cpu goes
  into its own separate patch (Russell)
* Collected new Acked-by
* Rebased on top of v5.6-rc6
* Trimmed the CC list on the cover letter as lists were rejecting it


git clone git://linux-arm.org/linux-qy.git -b cpu-hp-cleanup-v4


Older post can be found here


https://lore.kernel.org/lkml/20200223192942.18420-2-qais.you...@arm.com/


=
Test Coverage
=

All tests ran with LOCKDEP enabled.

Platform: Juno-r2: arm64


* Overnight rcutorture
* Overnight locktorture
* kexec -f Image --command="$(cat /proc/cmdline) reboot=s[0-5]"
* Hibernate to disk (using suspend option)
* Userspace hotplug via sysfs
* PSCI firemware checker

Notes:

* Couldn't convince Juno to hibernate using [reboot] or [shutdown]
  options.

Platform: qemu (8 vCPUs) and VM (2 vCPUs): x86_64
-

* Overnight rcutorture
* Overnight locktorture
* Userspace hotplug via sysfs
* echo mmiotrace > /sys/kernel/debug/tracing/current_tracer &&
  echo nop > /sys/kernel/debug/tracing/current_tracer
* Ran with CONFIG_DEBUG_HOTPLUG_CPU0 and CONFIG_BOOTPARAM_HOTPLUG_CPU0

Notes:

* qemu failed to bring cpu0 after offlining. Same behavior observed on
  vanilla v5.6-rc6. Worked fine on the VM.

* mmiotrace successfully brought down all cpus when enabled,
  then back online again when disabled. Including when cpu0 was
  offline.

* My xen shenanigans are too 'humble' too create environment to test
  the change in xen yet..


=
Original Cover Letter
=

Using cpu_up/down directly to bring cpus online/offline loses synchronization
with sysfs and could suffer from a race similar to what is described in
commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
during LPM").

cpu_up/down seem to be more of a internal implementation detail for the cpu
subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
operations. Users outside of the cpu subsystem would be better using the device
core API to bring a cpu online/offline which is the interface used to hotplug
memory and other system devices.

Several users have already migrated to use the device core API, this series
converts the remaining users and hides cpu_up/down from internal users at the
end.

I noticed this problem while working on a hack to disable offlining
a particular CPU but noticed that setting the offline_disabled attribute in the
device struct isn't enough because users can easily bypass the device core.
While my hack isn't a valid use case but it did highlight the inconsistency in
the way cpus are being onlined/offlined and this attempt hopefully improves on
this.

The first patch introduces new API to {add,remove}_cpu() using device_{online,
offline}() with correct locks held and export it.

The following 10 patches fix arch users.

The remaining 6 patches fix generic code users. Particularly creating a new
special exported API for the device core to use instead of cpu_up/down.

The last patch removes cpu_up/down from cpu.h and unexport the functions.

In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
the logic in a higher level - special purposed function; and converted the code
to use that instead.


CC: Thomas Gleixner 
CC: Tony Luck 
CC: Fenghua Yu 
CC: Russell King 
CC: Catalin Marinas 
CC: Michael Ellerman 
CC: "David S. Miller" 
CC: Helge Deller 
CC: Juergen Gross 
CC: Mark Rutland 
CC: Lorenzo Pieralisi 
CC: "Paul E. McKenney" 
CC: Greg Kroah-Hartman 
CC: xen-devel@lists.xenproject.org
CC: linux-par...@vger.kernel.org
CC: sparcli...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: x...@kernel.org
CC: linux-arm-ker...@lists.infradead.org
CC: linux-i...@vger.kernel.org
CC: linux-ker...@vger.kernel.org

Qais Yousef (17):
  cpu: Add new {add,remove}_cpu() functions
  smp: Create a new function to shutdown nonboot cpus
  ia64: Replace cpu_down with smp_shutdown_nonboot_cpus()
  arm: Don't use disable_nonboot_cpus()
  arm: Use reboot_cpu instead of hardcoding it to 0
  arm64: Don't use disable_nonboot_cpus()
  arm64: Use reboot_cpu instead of hardconding it to 0
  arm64: hibernate.c: Create a new function to handle cpu_up(sleep_cpu)
  x86: Replace cpu_up/down with add/remove_cpu
  powerpc: Replace cpu_up/down with add/remove_cpu
  sparc: Replace cpu_up/down with add/remove_cpu
  parisc: Replace cpu_up/down with add/remove_cpu
  driver: xen: Replace cpu_up/down with device_online/offline
  firmware: psci: Replace cpu_up/down with add/remove_cpu
  torture: Replace cpu_up

[Xen-devel] [PATCH v4 13/17] driver: xen: Replace cpu_up/down with device_online/offline

2020-03-23 Thread Qais Yousef
The core device API performs extra housekeeping bits that are missing
from directly calling cpu_up/down.

See commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and
serialization during LPM") for an example description of what might go
wrong.

This also prepares to make cpu_up/down a private interface for anything
but the cpu subsystem.

Reviewed-by: Juergen Gross 
Signed-off-by: Qais Yousef 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---
 drivers/xen/cpu_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index f192b6f42da9..ec975decb5de 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -94,7 +94,7 @@ static int setup_cpu_watcher(struct notifier_block *notifier,
 
for_each_possible_cpu(cpu) {
if (vcpu_online(cpu) == 0) {
-   (void)cpu_down(cpu);
+   device_offline(get_cpu_device(cpu));
set_cpu_present(cpu, false);
}
}
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 01/15] cpu: Add new {add, remove}_cpu() functions

2020-02-23 Thread Qais Yousef
The new functions use device_{online,offline}() which are userspace
safe.

This is in preparation to move cpu_{up, down} kernel users to use
a safer interface that is not racy with userspace.

Suggested-by: "Paul E. McKenney" 
Signed-off-by: Qais Yousef 
CC: Thomas Gleixner 
CC: "Paul E. McKenney" 
CC: Helge Deller 
CC: Michael Ellerman 
CC: "David S. Miller" 
CC: Juergen Gross 
CC: Mark Rutland 
CC: Lorenzo Pieralisi 
CC: xen-devel@lists.xenproject.org
CC: linux-par...@vger.kernel.org
CC: sparcli...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: linux-arm-ker...@lists.infradead.org
CC: x...@kernel.org
CC: linux-ker...@vger.kernel.org
---
 include/linux/cpu.h |  2 ++
 kernel/cpu.c| 24 
 2 files changed, 26 insertions(+)

diff --git a/include/linux/cpu.h b/include/linux/cpu.h
index 1ca2baf817ed..cf8cf38dca43 100644
--- a/include/linux/cpu.h
+++ b/include/linux/cpu.h
@@ -89,6 +89,7 @@ extern ssize_t arch_cpu_release(const char *, size_t);
 #ifdef CONFIG_SMP
 extern bool cpuhp_tasks_frozen;
 int cpu_up(unsigned int cpu);
+int add_cpu(unsigned int cpu);
 void notify_cpu_starting(unsigned int cpu);
 extern void cpu_maps_update_begin(void);
 extern void cpu_maps_update_done(void);
@@ -118,6 +119,7 @@ extern void cpu_hotplug_disable(void);
 extern void cpu_hotplug_enable(void);
 void clear_tasks_mm_cpumask(int cpu);
 int cpu_down(unsigned int cpu);
+int remove_cpu(unsigned int cpu);
 
 #else /* CONFIG_HOTPLUG_CPU */
 
diff --git a/kernel/cpu.c b/kernel/cpu.c
index 9c706af713fb..069802f7010f 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -1057,6 +1057,18 @@ int cpu_down(unsigned int cpu)
 }
 EXPORT_SYMBOL(cpu_down);
 
+int remove_cpu(unsigned int cpu)
+{
+   int ret;
+
+   lock_device_hotplug();
+   ret = device_offline(get_cpu_device(cpu));
+   unlock_device_hotplug();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(remove_cpu);
+
 #else
 #define takedown_cpu   NULL
 #endif /*CONFIG_HOTPLUG_CPU*/
@@ -1209,6 +1221,18 @@ int cpu_up(unsigned int cpu)
 }
 EXPORT_SYMBOL_GPL(cpu_up);
 
+int add_cpu(unsigned int cpu)
+{
+   int ret;
+
+   lock_device_hotplug();
+   ret = device_online(get_cpu_device(cpu));
+   unlock_device_hotplug();
+
+   return ret;
+}
+EXPORT_SYMBOL_GPL(add_cpu);
+
 #ifdef CONFIG_PM_SLEEP_SMP
 static cpumask_var_t frozen_cpus;
 
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 11/15] driver: xen: Replace cpu_up/down with device_online/offline

2020-02-23 Thread Qais Yousef
The core device API performs extra housekeeping bits that are missing
from directly calling cpu_up/down.

See commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and
serialization during LPM") for an example description of what might go
wrong.

This also prepares to make cpu_up/down a private interface for anything
but the cpu subsystem.

Reviewed-by: Juergen Gross 
Signed-off-by: Qais Yousef 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---

Juergen, there's a new add_cpu() which you should be CCed on. I wasn't sure if
I could safely convert this to use it since I couldn't find whether the
notifier already hold the lock or not. If you think remove_cpu() is safe, let
me know and I can send an updated patch.

 drivers/xen/cpu_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index f192b6f42da9..ec975decb5de 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -94,7 +94,7 @@ static int setup_cpu_watcher(struct notifier_block *notifier,
 
for_each_possible_cpu(cpu) {
if (vcpu_online(cpu) == 0) {
-   (void)cpu_down(cpu);
+   device_offline(get_cpu_device(cpu));
set_cpu_present(cpu, false);
}
}
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH v3 00/15] Convert cpu_up/down to device_online/offline

2020-02-23 Thread Qais Yousef
Changes in v3:
* Fixup smp_shutdown_nonboot_cpus() to hold the right lock as suggested
  by Russel King.
* Split the combined arm/arm64 patch into 2 separate patches.
* Add new add/remove_cpu() functions that wraps lock,
  device_online/offline, unlock as suggested by Paul McKenney
* Use the new add/remove instead of device_online/offline where
  appropriate
* Dropped a patch that exported device_online/offline() since
  the new add/remove_cpu() are exported and used by torture test which
  could build as a module
* Rebsed on top 5.6-rc1 from linus/master


git clone git://linux-arm.org/linux-qy.git -b cpu-hp-cleanup-v3


Older post can be found here:

https://lore.kernel.org/linuxppc-dev/20191125112754.25223-1-qais.you...@arm.com/
https://lore.kernel.org/lkml/20191125112754.25223-2-qais.you...@arm.com/


Using cpu_up/down directly to bring cpus online/offline loses synchronization
with sysfs and could suffer from a race similar to what is described in
commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
during LPM").

cpu_up/down seem to be more of a internal implementation detail for the cpu
subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
operations. Users outside of the cpu subsystem would be better using the device
core API to bring a cpu online/offline which is the interface used to hotplug
memory and other system devices.

Several users have already migrated to use the device core API, this series
converts the remaining users and hides cpu_up/down from internal users at the
end.

I noticed this problem while working on a hack to disable offlining
a particular CPU but noticed that setting the offline_disabled attribute in the
device struct isn't enough because users can easily bypass the device core.
While my hack isn't a valid use case but it did highlight the inconsistency in
the way cpus are being onlined/offlined and this attempt hopefully improves on
this.

The first 9 patches fix arch users.

The remaining 6 patches fix generic code users. Particularly creating a new
special exported API for the device core to use instead of cpu_up/down.

The last patch removes cpu_up/down from cpu.h and unexport the functions.

In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
the logic in a higher level - special purposed function; and converted the code
to use that instead.

I did re-run the rcu torture, lock torture and psci checker tests and no
problem was noticed. I did perform build tests on all arch affected except for
parisc.

Hopefully I got the CC list right for all the patches. Apologies in advance if
some people were omitted from some patches but they should have been CCed.

CC: Armijn Hemel 
CC: Benjamin Herrenschmidt 
CC: Bjorn Helgaas 
CC: Borislav Petkov 
CC: Boris Ostrovsky 
CC: Catalin Marinas 
CC: Christophe Leroy 
CC: Daniel Lezcano 
CC: Davidlohr Bueso 
CC: "David S. Miller" 
CC: Eiichi Tsukata 
CC: Enrico Weigelt 
CC: Fenghua Yu 
CC: Greg Kroah-Hartman 
CC: Helge Deller 
CC: "H. Peter Anvin" 
CC: Ingo Molnar 
CC: "James E.J. Bottomley" 
CC: James Morse 
CC: Jiri Kosina 
CC: Josh Poimboeuf 
CC: Josh Triplett 
CC: Juergen Gross 
CC: Lorenzo Pieralisi 
CC: Mark Rutland 
CC: Michael Ellerman 
CC: Nadav Amit 
CC: Nicholas Piggin 
CC: "Paul E. McKenney" 
CC: Paul Mackerras 
CC: Pavankumar Kondeti 
CC: "Peter Zijlstra (Intel)" 
CC: "Rafael J. Wysocki" 
CC: Ram Pai 
CC: Richard Fontana 
CC: Russell King 
CC: Sakari Ailus 
CC: Stefano Stabellini 
CC: Steve Capper 
CC: Thiago Jung Bauermann 
CC: Thomas Gleixner 
CC: Tony Luck 
CC: Will Deacon 
CC: Zhenzhong Duan 
CC: linux-arm-ker...@lists.infradead.org
CC: linux-i...@vger.kernel.org
CC: linux-ker...@vger.kernel.org
CC: linux-par...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: sparcli...@vger.kernel.org
CC: x...@kernel.org
CC: xen-devel@lists.xenproject.org


Qais Yousef (15):
  cpu: Add new {add,remove}_cpu() functions
  smp: Create a new function to shutdown nonboot cpus
  ia64: Replace cpu_down with smp_shutdown_nonboot_cpus()
  arm: Don't use disable_nonboot_cpus()
  arm64: Don't use disable_nonboot_cpus()
  arm64: hibernate.c: Create a new function to handle cpu_up(sleep_cpu)
  x86: Replace cpu_up/down with add/remove_cpu
  powerpc: Replace cpu_up/down with add/remove_cpu
  sparc: Replace cpu_up/down with add/remove_cpu
  parisc: Replace cpu_up/down with add/remove_cpu
  driver: xen: Replace cpu_up/down with device_online/offline
  firmware: psci: Replace cpu_up/down with add/remove_cpu
  torture: Replace cpu_up/down with add/remove_cpu
  smp: Create a new function to bringup nonboot cpus online
  cpu: Hide cpu_up/down

 arch/arm/kernel/reboot.c |   4 +-
 arch/arm64/kernel/hibernate.c|  13 +--
 arch/arm64/kernel/process.c  |   4 +-
 arch/ia

Re: [Xen-devel] [PATCH v2 00/14] Convert cpu_up/down to device_online/offline

2020-02-05 Thread Qais Yousef
Hi Thomas

On 11/25/19 11:27, Qais Yousef wrote:
> Changes in v2:
>   * Add 2 new patches that create smp_shutdown_nonboot_cpus() to be used
> in machine_shutdown() in ia64, arm and arm64
>   * Use proper kernel-doc for the newly introduced functions
>   * Renamed a function
>   * Removed a stale comment in a function
>   * Rebased on top of 5.4-rc8
> 
>   git clone git://linux-arm.org/linux-qy.git -b cpu-hp-cleanup-v2

I want to spin v3 to address Russel's comments. If you have any feedback it'd
be great to have them before I spin v3.

Thanks

--
Qais Yousef

> 
> Using cpu_up/down directly to bring cpus online/offline loses synchronization
> with sysfs and could suffer from a race similar to what is described in
> commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
> during LPM").
> 
> cpu_up/down seem to be more of a internal implementation detail for the cpu
> subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
> operations. Users outside of the cpu subsystem would be better using the 
> device
> core API to bring a cpu online/offline which is the interface used to hotplug
> memory and other system devices.
> 
> Several users have already migrated to use the device core API, this series
> converts the remaining users and hides cpu_up/down from internal users at the
> end.
> 
> I noticed this problem while working on a hack to disable offlining
> a particular CPU but noticed that setting the offline_disabled attribute in 
> the
> device struct isn't enough because users can easily bypass the device core.
> While my hack isn't a valid use case but it did highlight the inconsistency in
> the way cpus are being onlined/offlined and this attempt hopefully improves on
> this.
> 
> The first 8 patches fix arch users.
> 
> The remaining 6 patches fix generic code users. Particularly creating a new
> special exported API for the device core to use instead of cpu_up/down.
> 
> The last patch removes cpu_up/down from cpu.h and unexport the functions.
> 
> In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
> the logic in a higher level - special purposed function; and converted the 
> code
> to use that instead.
> 
> I did re-run the rcu torture, lock torture and psci checker tests and no
> problem was noticed. I did perform build tests on all arch affected except for
> parisc.
> 
> Hopefully I got the CC list right for all the patches. Apologies in advance if
> some people were omitted from some patches but they should have been CCed.
> 
> CC: Armijn Hemel 
> CC: Benjamin Herrenschmidt 
> CC: Bjorn Helgaas 
> CC: Borislav Petkov 
> CC: Boris Ostrovsky 
> CC: Catalin Marinas 
> CC: Christophe Leroy 
> CC: Daniel Lezcano 
> CC: Davidlohr Bueso 
> CC: "David S. Miller" 
> CC: Eiichi Tsukata 
> CC: Enrico Weigelt 
> CC: Fenghua Yu 
> CC: Greg Kroah-Hartman 
> CC: Helge Deller 
> CC: "H. Peter Anvin" 
> CC: Ingo Molnar 
> CC: "James E.J. Bottomley" 
> CC: James Morse 
> CC: Jiri Kosina 
> CC: Josh Poimboeuf 
> CC: Josh Triplett 
> CC: Juergen Gross 
> CC: Lorenzo Pieralisi 
> CC: Mark Rutland 
> CC: Michael Ellerman 
> CC: Nadav Amit 
> CC: Nicholas Piggin 
> CC: "Paul E. McKenney" 
> CC: Paul Mackerras 
> CC: Pavankumar Kondeti 
> CC: "Peter Zijlstra (Intel)" 
> CC: "Rafael J. Wysocki" 
> CC: Ram Pai 
> CC: Richard Fontana 
> CC: Russell King 
> CC: Sakari Ailus 
> CC: Stefano Stabellini 
> CC: Steve Capper 
> CC: Thiago Jung Bauermann 
> CC: Thomas Gleixner 
> CC: Tony Luck 
> CC: Will Deacon 
> CC: Zhenzhong Duan 
> CC: linux-arm-ker...@lists.infradead.org
> CC: linux-i...@vger.kernel.org
> CC: linux-ker...@vger.kernel.org
> CC: linux-par...@vger.kernel.org
> CC: linuxppc-...@lists.ozlabs.org
> CC: sparcli...@vger.kernel.org
> CC: x...@kernel.org
> CC: xen-devel@lists.xenproject.org
> 
> 
> Qais Yousef (14):
>   smp: create a new function to shutdown nonboot cpus
>   ia64: Replace cpu_down with smp_shutdown_nonboot_cpus()
>   arm: arm64: Don't use disable_nonboot_cpus()
>   arm64: hibernate.c: create a new function to handle cpu_up(sleep_cpu)
>   x86: Replace cpu_up/down with devcie_online/offline
>   powerpc: Replace cpu_up/down with device_online/offline
>   sparc: Replace cpu_up/down with device_online/offline
>   parisc: Replace cpu_up/down with device_online/offline
>   driver: base: cpu: export device_online/offline
>   driver: xen: Replace cpu_up/down with device_online/offline
>   firmware: psci: Replace cpu_up/down with device_online/offline
>   torture: Replace cpu_up/dow

[Xen-devel] [PATCH v2 00/14] Convert cpu_up/down to device_online/offline

2019-11-25 Thread Qais Yousef
Changes in v2:
* Add 2 new patches that create smp_shutdown_nonboot_cpus() to be used
  in machine_shutdown() in ia64, arm and arm64
* Use proper kernel-doc for the newly introduced functions
* Renamed a function
* Removed a stale comment in a function
* Rebased on top of 5.4-rc8

git clone git://linux-arm.org/linux-qy.git -b cpu-hp-cleanup-v2

Using cpu_up/down directly to bring cpus online/offline loses synchronization
with sysfs and could suffer from a race similar to what is described in
commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
during LPM").

cpu_up/down seem to be more of a internal implementation detail for the cpu
subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
operations. Users outside of the cpu subsystem would be better using the device
core API to bring a cpu online/offline which is the interface used to hotplug
memory and other system devices.

Several users have already migrated to use the device core API, this series
converts the remaining users and hides cpu_up/down from internal users at the
end.

I noticed this problem while working on a hack to disable offlining
a particular CPU but noticed that setting the offline_disabled attribute in the
device struct isn't enough because users can easily bypass the device core.
While my hack isn't a valid use case but it did highlight the inconsistency in
the way cpus are being onlined/offlined and this attempt hopefully improves on
this.

The first 8 patches fix arch users.

The remaining 6 patches fix generic code users. Particularly creating a new
special exported API for the device core to use instead of cpu_up/down.

The last patch removes cpu_up/down from cpu.h and unexport the functions.

In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
the logic in a higher level - special purposed function; and converted the code
to use that instead.

I did re-run the rcu torture, lock torture and psci checker tests and no
problem was noticed. I did perform build tests on all arch affected except for
parisc.

Hopefully I got the CC list right for all the patches. Apologies in advance if
some people were omitted from some patches but they should have been CCed.

CC: Armijn Hemel 
CC: Benjamin Herrenschmidt 
CC: Bjorn Helgaas 
CC: Borislav Petkov 
CC: Boris Ostrovsky 
CC: Catalin Marinas 
CC: Christophe Leroy 
CC: Daniel Lezcano 
CC: Davidlohr Bueso 
CC: "David S. Miller" 
CC: Eiichi Tsukata 
CC: Enrico Weigelt 
CC: Fenghua Yu 
CC: Greg Kroah-Hartman 
CC: Helge Deller 
CC: "H. Peter Anvin" 
CC: Ingo Molnar 
CC: "James E.J. Bottomley" 
CC: James Morse 
CC: Jiri Kosina 
CC: Josh Poimboeuf 
CC: Josh Triplett 
CC: Juergen Gross 
CC: Lorenzo Pieralisi 
CC: Mark Rutland 
CC: Michael Ellerman 
CC: Nadav Amit 
CC: Nicholas Piggin 
CC: "Paul E. McKenney" 
CC: Paul Mackerras 
CC: Pavankumar Kondeti 
CC: "Peter Zijlstra (Intel)" 
CC: "Rafael J. Wysocki" 
CC: Ram Pai 
CC: Richard Fontana 
CC: Russell King 
CC: Sakari Ailus 
CC: Stefano Stabellini 
CC: Steve Capper 
CC: Thiago Jung Bauermann 
CC: Thomas Gleixner 
CC: Tony Luck 
CC: Will Deacon 
CC: Zhenzhong Duan 
CC: linux-arm-ker...@lists.infradead.org
CC: linux-i...@vger.kernel.org
CC: linux-ker...@vger.kernel.org
CC: linux-par...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: sparcli...@vger.kernel.org
CC: x...@kernel.org
CC: xen-devel@lists.xenproject.org


Qais Yousef (14):
  smp: create a new function to shutdown nonboot cpus
  ia64: Replace cpu_down with smp_shutdown_nonboot_cpus()
  arm: arm64: Don't use disable_nonboot_cpus()
  arm64: hibernate.c: create a new function to handle cpu_up(sleep_cpu)
  x86: Replace cpu_up/down with devcie_online/offline
  powerpc: Replace cpu_up/down with device_online/offline
  sparc: Replace cpu_up/down with device_online/offline
  parisc: Replace cpu_up/down with device_online/offline
  driver: base: cpu: export device_online/offline
  driver: xen: Replace cpu_up/down with device_online/offline
  firmware: psci: Replace cpu_up/down with device_online/offline
  torture: Replace cpu_up/down with device_online/offline
  smp: Create a new function to bringup nonboot cpus online
  cpu: Hide cpu_up/down

 arch/arm/kernel/reboot.c   |  4 +-
 arch/arm64/kernel/hibernate.c  | 13 ++--
 arch/arm64/kernel/process.c|  4 +-
 arch/ia64/kernel/process.c |  8 +--
 arch/parisc/kernel/processor.c |  4 +-
 arch/powerpc/kernel/machine_kexec_64.c |  4 +-
 arch/sparc/kernel/ds.c |  8 ++-
 arch/x86/kernel/topology.c |  4 +-
 arch/x86/mm/mmio-mod.c |  8 ++-
 arch/x86/xen/smp.c |  4 +-
 drivers/base/core.c|  4 ++
 drivers/base/cpu.c |  4 +-
 drivers/firmware/psci/psci_checker.c   |  6 +-
 drivers/xen/cpu_hotplug.c  

[Xen-devel] [PATCH v2 10/14] driver: xen: Replace cpu_up/down with device_online/offline

2019-11-25 Thread Qais Yousef
The core device API performs extra housekeeping bits that are missing
from directly calling cpu_up/down.

See commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and
serialization during LPM") for an example description of what might go
wrong.

This also prepares to make cpu_up/down a private interface for anything
but the cpu subsystem.

Signed-off-by: Qais Yousef 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---
 drivers/xen/cpu_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index f192b6f42da9..ec975decb5de 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -94,7 +94,7 @@ static int setup_cpu_watcher(struct notifier_block *notifier,
 
for_each_possible_cpu(cpu) {
if (vcpu_online(cpu) == 0) {
-   (void)cpu_down(cpu);
+   device_offline(get_cpu_device(cpu));
set_cpu_present(cpu, false);
}
}
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

Re: [Xen-devel] [PATCH 00/12] Convert cpu_up/down to device_online/offline

2019-11-18 Thread Qais Yousef
Hi Thomas

On 10/30/19 15:38, Qais Yousef wrote:
> Using cpu_up/down directly to bring cpus online/offline loses synchronization
> with sysfs and could suffer from a race similar to what is described in
> commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
> during LPM").
> 
> cpu_up/down seem to be more of a internal implementation detail for the cpu
> subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
> operations. Users outside of the cpu subsystem would be better using the 
> device
> core API to bring a cpu online/offline which is the interface used to hotplug
> memory and other system devices.
> 
> Several users have already migrated to use the device core API, this series
> converts the remaining users and hides cpu_up/down from internal users at the
> end.
> 
> I still need to update the documentation to remove references to cpu_up/down
> and advocate for device_online/offline instead if this series will make its 
> way
> through.
> 
> I noticed this problem while working on a hack to disable offlining
> a particular CPU but noticed that setting the offline_disabled attribute in 
> the
> device struct isn't enough because users can easily bypass the device core.
> While my hack isn't a valid use case but it did highlight the inconsistency in
> the way cpus are being onlined/offlined and this attempt hopefully improves on
> this.
> 
> The first 6 patches fixes arch users.
> 
> The next 5 patches fixes generic code users. Particularly creating a new
> special exported API for the device core to use instead of cpu_up/down.
> Maybe we can do something more restrictive than that.
> 
> The last patch removes cpu_up/down from cpu.h and unexport the functions.
> 
> In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
> the logic in a higher level - special purposed function; and converted the 
> code
> to use that instead.
> 
> I did run the rcu torture, lock torture and psci checker tests and no problem
> was noticed. I did perform build tests on all arch affected except for parisc.
> 
> Hopefully I got the CC list right for all the patches. Apologies in advance if
> some people were omitted from some patches but they should have been CCed.

I had to make an educated guess that you're probably the 'maintainer' of cpu
hotplug - but there's no explicit entry that says that. Please let me know if
I need to bring the attention of others too.

The series do have few rough edges to address, but it's relatively
straightforward and I think does offer a nice improvement in the form of
consolidating the API for bringing up/down cpus from external
subsystems/drivers. Beside fix the inconsistency of device's core view of the
state of the cpu which can happen when cpu_{up/down} are called directly.

The downside I see is that the external API to bring cpus up/down for
suspend/resume and at boot seem to have grown a bit organically (I've added
a couple in this series to address 2 direct users of cpu_{up,down}). We might
need to rethink this API, but I think this is outside the scope of this series.

Any thoughts/feedback would be appreciated.

Thanks

--
Qais Yousef

___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 00/12] Convert cpu_up/down to device_online/offline

2019-10-30 Thread Qais Yousef
Using cpu_up/down directly to bring cpus online/offline loses synchronization
with sysfs and could suffer from a race similar to what is described in
commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and serialization
during LPM").

cpu_up/down seem to be more of a internal implementation detail for the cpu
subsystem to use to boot up cpus, perform suspend/resume and low level hotplug
operations. Users outside of the cpu subsystem would be better using the device
core API to bring a cpu online/offline which is the interface used to hotplug
memory and other system devices.

Several users have already migrated to use the device core API, this series
converts the remaining users and hides cpu_up/down from internal users at the
end.

I still need to update the documentation to remove references to cpu_up/down
and advocate for device_online/offline instead if this series will make its way
through.

I noticed this problem while working on a hack to disable offlining
a particular CPU but noticed that setting the offline_disabled attribute in the
device struct isn't enough because users can easily bypass the device core.
While my hack isn't a valid use case but it did highlight the inconsistency in
the way cpus are being onlined/offlined and this attempt hopefully improves on
this.

The first 6 patches fixes arch users.

The next 5 patches fixes generic code users. Particularly creating a new
special exported API for the device core to use instead of cpu_up/down.
Maybe we can do something more restrictive than that.

The last patch removes cpu_up/down from cpu.h and unexport the functions.

In some cases where the use of cpu_up/down seemed legitimate, I encapsulated
the logic in a higher level - special purposed function; and converted the code
to use that instead.

I did run the rcu torture, lock torture and psci checker tests and no problem
was noticed. I did perform build tests on all arch affected except for parisc.

Hopefully I got the CC list right for all the patches. Apologies in advance if
some people were omitted from some patches but they should have been CCed.

CC: Armijn Hemel 
CC: Benjamin Herrenschmidt 
CC: Bjorn Helgaas 
CC: Borislav Petkov 
CC: Boris Ostrovsky 
CC: Catalin Marinas 
CC: Christophe Leroy 
CC: Daniel Lezcano 
CC: Davidlohr Bueso 
CC: "David S. Miller" 
CC: Eiichi Tsukata 
CC: Enrico Weigelt 
CC: Fenghua Yu 
CC: Greg Kroah-Hartman 
CC: Helge Deller 
CC: "H. Peter Anvin" 
CC: Ingo Molnar 
CC: "James E.J. Bottomley" 
CC: James Morse 
CC: Jiri Kosina 
CC: Josh Poimboeuf 
CC: Josh Triplett 
CC: Juergen Gross 
CC: Lorenzo Pieralisi 
CC: Mark Rutland 
CC: Michael Ellerman 
CC: Nadav Amit 
CC: Nicholas Piggin 
CC: "Paul E. McKenney" 
CC: Paul Mackerras 
CC: Pavankumar Kondeti 
CC: "Peter Zijlstra (Intel)" 
CC: "Rafael J. Wysocki" 
CC: Ram Pai 
CC: Richard Fontana 
CC: Sakari Ailus 
CC: Stefano Stabellini 
CC: Steve Capper 
CC: Thiago Jung Bauermann 
CC: Thomas Gleixner 
CC: Tony Luck 
CC: Will Deacon 
CC: Zhenzhong Duan 
CC: linux-arm-ker...@lists.infradead.org
CC: linux-i...@vger.kernel.org
CC: linux-ker...@vger.kernel.org
CC: linux-par...@vger.kernel.org
CC: linuxppc-...@lists.ozlabs.org
CC: sparcli...@vger.kernel.org
CC: x...@kernel.org
CC: xen-devel@lists.xenproject.org


Qais Yousef (12):
  arm64: hibernate.c: create a new function to handle cpu_up(sleep_cpu)
  x86: Replace cpu_up/down with devcie_online/offline
  powerpc: Replace cpu_up/down with device_online/offline
  ia64: Replace cpu_down with freeze_secondary_cpus
  sparc: Replace cpu_up/down with device_online/offline
  parisc: Replace cpu_up/down with device_online/offline
  driver: base: cpu: export device_online/offline
  driver: xen: Replace cpu_up/down with device_online/offline
  firmware: psci: Replace cpu_up/down with device_online/offline
  torture: Replace cpu_up/down with device_online/offline
  smp: Create a new function to bringup nonboot cpus online
  cpu: Hide cpu_up/down

 arch/arm64/kernel/hibernate.c  | 13 +++
 arch/ia64/kernel/process.c |  8 +---
 arch/parisc/kernel/processor.c |  4 +-
 arch/powerpc/kernel/machine_kexec_64.c |  4 +-
 arch/sparc/kernel/ds.c |  8 +++-
 arch/x86/kernel/topology.c |  4 +-
 arch/x86/mm/mmio-mod.c |  8 +++-
 arch/x86/xen/smp.c |  4 +-
 drivers/base/core.c|  4 ++
 drivers/base/cpu.c |  4 +-
 drivers/firmware/psci/psci_checker.c   |  6 ++-
 drivers/xen/cpu_hotplug.c  |  2 +-
 include/linux/cpu.h|  6 ++-
 kernel/cpu.c   | 53 --
 kernel/smp.c   |  9 +
 kernel/torture.c   | 15 ++--
 16 files changed, 106 insertions(+), 46 deletions(-)

-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel

[Xen-devel] [PATCH 08/12] driver: xen: Replace cpu_up/down with device_online/offline

2019-10-30 Thread Qais Yousef
The core device API performs extra housekeeping bits that are missing
from directly calling cpu_up/down.

See commit a6717c01ddc2 ("powerpc/rtas: use device model APIs and
serialization during LPM") for an example description of what might go
wrong.

This also prepares to make cpu_up/down a private interface for anything
but the cpu subsystem.

Signed-off-by: Qais Yousef 
CC: Boris Ostrovsky 
CC: Juergen Gross 
CC: Stefano Stabellini 
CC: xen-devel@lists.xenproject.org
CC: linux-ker...@vger.kernel.org
---
 drivers/xen/cpu_hotplug.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/xen/cpu_hotplug.c b/drivers/xen/cpu_hotplug.c
index f192b6f42da9..ec975decb5de 100644
--- a/drivers/xen/cpu_hotplug.c
+++ b/drivers/xen/cpu_hotplug.c
@@ -94,7 +94,7 @@ static int setup_cpu_watcher(struct notifier_block *notifier,
 
for_each_possible_cpu(cpu) {
if (vcpu_online(cpu) == 0) {
-   (void)cpu_down(cpu);
+   device_offline(get_cpu_device(cpu));
set_cpu_present(cpu, false);
}
}
-- 
2.17.1


___
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel