[PATCH] pinctrl: core: Fix unused variable build warnings

2020-11-23 Thread Pawan Gupta
A recent commit f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs
file") introduced build warnings when CONFIG_GPIOLIB=n. As a result, the
kernel fails to build when warnings are treated as errors. Below is the
error message:

  $ make CFLAGS_KERNEL+=-Werror

  drivers/pinctrl/core.c: In function ‘pinctrl_pins_show’:
  drivers/pinctrl/core.c:1607:20: error: unused variable ‘chip’ [-Werror=unused-variable]
   1607 |  struct gpio_chip *chip;
        |                    ^~~~
  drivers/pinctrl/core.c:1606:15: error: unused variable ‘gpio_num’ [-Werror=unused-variable]
   1606 |  unsigned int gpio_num;
        |               ^~~~
  drivers/pinctrl/core.c:1605:29: error: unused variable ‘range’ [-Werror=unused-variable]
   1605 |  struct pinctrl_gpio_range *range;
        |                            ^
  cc1: all warnings being treated as errors

These variables are only used inside #ifdef CONFIG_GPIOLIB. Fix the
build warnings by wrapping their definitions in the same #ifdef.

Fixes: f1b206cf7c57 ("pinctrl: core: print gpio in pins debugfs file")
Signed-off-by: Pawan Gupta 
---
 drivers/pinctrl/core.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/pinctrl/core.c b/drivers/pinctrl/core.c
index 3663d87f51a0..1bb371a5cf8d 100644
--- a/drivers/pinctrl/core.c
+++ b/drivers/pinctrl/core.c
@@ -1602,10 +1602,11 @@ static int pinctrl_pins_show(struct seq_file *s, void *what)
struct pinctrl_dev *pctldev = s->private;
const struct pinctrl_ops *ops = pctldev->desc->pctlops;
unsigned i, pin;
+#ifdef CONFIG_GPIOLIB
struct pinctrl_gpio_range *range;
unsigned int gpio_num;
struct gpio_chip *chip;
-
+#endif
seq_printf(s, "registered pins: %d\n", pctldev->desc->npins);
 
mutex_lock(&pctldev->mutex);
-- 
2.21.3
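
The idea behind the fix can be shown outside the kernel. The following is a minimal userspace sketch, not the kernel code: DEMO_GPIOLIB and demo_pins_show() are hypothetical names standing in for CONFIG_GPIOLIB and pinctrl_pins_show(). The local variable is declared under the same #ifdef as its only use, so building with the macro undefined produces no unused-variable warning.

```c
#include <assert.h>
#include <stdio.h>

/*
 * DEMO_GPIOLIB is a hypothetical stand-in for CONFIG_GPIOLIB.
 * Leave it undefined to mimic CONFIG_GPIOLIB=n.
 */
/* #define DEMO_GPIOLIB 1 */

/* Prints a pins summary and returns the number of characters written. */
int demo_pins_show(FILE *out, unsigned int npins)
{
	int n;
#ifdef DEMO_GPIOLIB
	/* Declared only when the code that uses it is compiled in. */
	unsigned int gpio_num = 0;
#endif

	assert(out != NULL);
	n = fprintf(out, "registered pins: %u\n", npins);
#ifdef DEMO_GPIOLIB
	n += fprintf(out, "gpio: %u\n", gpio_num);
#endif
	return n;
}
```

Compiling this with -Wall -Werror should succeed whether or not DEMO_GPIOLIB is defined, which is the property the patch restores for core.c.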



[tip: x86/urgent] x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

2020-08-06 Thread tip-bot2 for Pawan Gupta
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     f29dfa53cc8ae6ad93bae619bcc0bf45cab344f7
Gitweb:        https://git.kernel.org/tip/f29dfa53cc8ae6ad93bae619bcc0bf45cab344f7
Author:        Pawan Gupta
AuthorDate:    Thu, 16 Jul 2020 12:23:59 -07:00
Committer:     Ingo Molnar
CommitterDate: Fri, 07 Aug 2020 01:32:00 +02:00

x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

On systems that have virtualization disabled or unsupported, sysfs
mitigation for X86_BUG_ITLB_MULTIHIT is reported incorrectly as:

  $ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
  KVM: Vulnerable

The system is not vulnerable to a DoS attack from a rogue guest when
virtualization is disabled or unsupported in the hardware. Change the
mitigation reporting for these cases.

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reported-by: Nelson Dsouza 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Pawan Gupta 
Signed-off-by: Ingo Molnar 
Reviewed-by: Tony Luck 
Acked-by: Thomas Gleixner 
Link: https://lore.kernel.org/r/0ba029932a816179b9d14a30db38f0f11ef1f166.1594925782.git.pawan.kumar.gu...@linux.intel.com
---
 Documentation/admin-guide/hw-vuln/multihit.rst | 4 
 arch/x86/kernel/cpu/bugs.c | 8 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d..140e4ce 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -80,6 +80,10 @@ The possible values in this file are:
- The processor is not vulnerable.
  * - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
+ * - KVM: Mitigation: VMX unsupported
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is not supported.
+ * - KVM: Mitigation: VMX disabled
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is disabled.
  * - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f0b743a..d3f0db4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu.h"
 
@@ -1549,7 +1550,12 @@ static ssize_t l1tf_show_state(char *buf)
 
 static ssize_t itlb_multihit_show_state(char *buf)
 {
-   if (itlb_multihit_kvm_mitigation)
+   if (!boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+   !boot_cpu_has(X86_FEATURE_VMX))
+   return sprintf(buf, "KVM: Mitigation: VMX unsupported\n");
+   else if (!(cr4_read_shadow() & X86_CR4_VMXE))
+   return sprintf(buf, "KVM: Mitigation: VMX disabled\n");
+   else if (itlb_multihit_kvm_mitigation)
return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
else
return sprintf(buf, "KVM: Vulnerable\n");
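
The order of the checks in the patched itlb_multihit_show_state() matters: hardware support is tested first, then whether VMX is enabled, and only then the KVM mitigation flag. A minimal userspace sketch of that decision chain follows. The function and enum names are hypothetical, and the four boolean parameters are assumptions standing in for boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL), boot_cpu_has(X86_FEATURE_VMX), cr4_read_shadow() & X86_CR4_VMXE, and itlb_multihit_kvm_mitigation.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_report {
	DEMO_VMX_UNSUPPORTED,
	DEMO_VMX_DISABLED,
	DEMO_SPLIT_HUGE_PAGES,
	DEMO_VULNERABLE,
};

/* Applies the checks in the same order as the patch. */
enum demo_report demo_classify(bool has_feat_ctl, bool has_vmx,
			       bool cr4_vmxe, bool kvm_mitigation)
{
	if (!has_feat_ctl || !has_vmx)
		return DEMO_VMX_UNSUPPORTED;
	if (!cr4_vmxe)
		return DEMO_VMX_DISABLED;
	if (kvm_mitigation)
		return DEMO_SPLIT_HUGE_PAGES;
	return DEMO_VULNERABLE;
}

/* Maps each outcome to the string the patch prints. */
const char *demo_report_string(enum demo_report r)
{
	switch (r) {
	case DEMO_VMX_UNSUPPORTED:
		return "KVM: Mitigation: VMX unsupported";
	case DEMO_VMX_DISABLED:
		return "KVM: Mitigation: VMX disabled";
	case DEMO_SPLIT_HUGE_PAGES:
		return "KVM: Mitigation: Split huge pages";
	case DEMO_VULNERABLE:
		return "KVM: Vulnerable";
	}
	assert(0 && "unhandled report value");
	return "unknown";
}
```

Because the hardware checks come first, a system without VMX reports "VMX unsupported" regardless of the state of the mitigation flag.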


[tip: x86/urgent] x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

2020-08-06 Thread tip-bot2 for Pawan Gupta
The following commit has been merged into the x86/urgent branch of tip:

Commit-ID:     31dccf7df081364045bff196bdc060ddda97f2e9
Gitweb:        https://git.kernel.org/tip/31dccf7df081364045bff196bdc060ddda97f2e9
Author:        Pawan Gupta
AuthorDate:    Thu, 16 Jul 2020 12:23:59 -07:00
Committer:     Ingo Molnar
CommitterDate: Thu, 06 Aug 2020 19:04:51 +02:00

x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

On systems that have virtualization disabled or unsupported, sysfs
mitigation for X86_BUG_ITLB_MULTIHIT is reported incorrectly as:

  $ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
  KVM: Vulnerable

The system is not vulnerable to a DoS attack from a rogue guest when
virtualization is disabled or unsupported in the hardware. Change the
mitigation reporting for these cases.

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reported-by: Nelson Dsouza 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Pawan Gupta 
Signed-off-by: Ingo Molnar 
Reviewed-by: Tony Luck 
Acked-by: Thomas Gleixner 
Link: https://lore.kernel.org/r/0ba029932a816179b9d14a30db38f0f11ef1f166.1594925782.git.pawan.kumar.gu...@linux.intel.com
---
 Documentation/admin-guide/hw-vuln/multihit.rst | 4 
 arch/x86/kernel/cpu/bugs.c | 8 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d..140e4ce 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -80,6 +80,10 @@ The possible values in this file are:
- The processor is not vulnerable.
  * - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
+ * - KVM: Mitigation: VMX unsupported
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is not supported.
+ * - KVM: Mitigation: VMX disabled
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is disabled.
  * - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index f0b743a..d3f0db4 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu.h"
 
@@ -1549,7 +1550,12 @@ static ssize_t l1tf_show_state(char *buf)
 
 static ssize_t itlb_multihit_show_state(char *buf)
 {
-   if (itlb_multihit_kvm_mitigation)
+   if (!boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+   !boot_cpu_has(X86_FEATURE_VMX))
+   return sprintf(buf, "KVM: Mitigation: VMX unsupported\n");
+   else if (!(cr4_read_shadow() & X86_CR4_VMXE))
+   return sprintf(buf, "KVM: Mitigation: VMX disabled\n");
+   else if (itlb_multihit_kvm_mitigation)
return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
else
return sprintf(buf, "KVM: Vulnerable\n");


[PATCH v2] x86/bugs/multihit: Fix mitigation reporting when VMX is not in use

2020-07-16 Thread Pawan Gupta
On systems that have virtualization disabled or unsupported, sysfs
mitigation for X86_BUG_ITLB_MULTIHIT is reported incorrectly as:

  $ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
  KVM: Vulnerable

The system is not vulnerable to a DoS attack from a rogue guest when
virtualization is disabled or unsupported in the hardware. Change the
mitigation reporting for these cases.

Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Reported-by: Nelson Dsouza 
Co-developed-by: Sean Christopherson 
Signed-off-by: Sean Christopherson 
Signed-off-by: Pawan Gupta 
Reviewed-by: Tony Luck 
---
v2:
 - Change mitigation reporting as per the state on VMX feature.

v1: https://lore.kernel.org/lkml/267631f4db4fd7e9f7ca789c2efaeab44103f68e.1594689154.git.pawan.kumar.gu...@linux.intel.com/

 Documentation/admin-guide/hw-vuln/multihit.rst | 4 
 arch/x86/kernel/cpu/bugs.c | 8 +++-
 2 files changed, 11 insertions(+), 1 deletion(-)

diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d8bce5..140e4cec38c3 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -80,6 +80,10 @@ The possible values in this file are:
- The processor is not vulnerable.
  * - KVM: Mitigation: Split huge pages
- Software changes mitigate this issue.
+ * - KVM: Mitigation: VMX unsupported
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is not supported.
+ * - KVM: Mitigation: VMX disabled
+   - KVM is not vulnerable because Virtual Machine Extensions (VMX) is disabled.
  * - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
 
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0b71970d2d3d..b0802d45abd3 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -31,6 +31,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "cpu.h"
 
@@ -1556,7 +1557,12 @@ static ssize_t l1tf_show_state(char *buf)
 
 static ssize_t itlb_multihit_show_state(char *buf)
 {
-   if (itlb_multihit_kvm_mitigation)
+   if (!boot_cpu_has(X86_FEATURE_MSR_IA32_FEAT_CTL) ||
+   !boot_cpu_has(X86_FEATURE_VMX))
+   return sprintf(buf, "KVM: Mitigation: VMX unsupported\n");
+   else if (!(cr4_read_shadow() & X86_CR4_VMXE))
+   return sprintf(buf, "KVM: Mitigation: VMX disabled\n");
+   else if (itlb_multihit_kvm_mitigation)
return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
else
return sprintf(buf, "KVM: Vulnerable\n");
-- 
2.21.3



Re: [PATCH] x86/bugs/multihit: Fix mitigation reporting when KVM is not in use

2020-07-15 Thread Pawan Gupta
On Tue, Jul 14, 2020 at 05:51:30PM -0700, Sean Christopherson wrote:
> On Tue, Jul 14, 2020 at 02:20:59PM -0700, Dave Hansen wrote:
> > On 7/14/20 2:04 PM, Pawan Gupta wrote:
> > >> I see three inputs and four possible states (sorry for the ugly table,
> > >> it was this or a spreadsheet :):
> > >>
> > >> X86_FEATURE_VMX  CONFIG_KVM_*  hpage split  Result        Reason
> > >>        N              x             x       Not Affected  No VMX
> > >>        Y              N             x       Not affected  No KVM
> 
> This line item is pointless, the relevant itlb_multihit_show_state()
> implementation depends on CONFIG_KVM_INTEL.  The !KVM_INTEL version simply
> prints "Processor vulnerable".

While we are at it, for CONFIG_KVM_INTEL=n, would it make sense to report "Not
affected(No KVM)"? "Processor vulnerable" does not say much about the
mitigation.

Thanks,
Pawan


Re: [PATCH] x86/bugs/multihit: Fix mitigation reporting when KVM is not in use

2020-07-14 Thread Pawan Gupta
On Tue, Jul 14, 2020 at 12:54:26PM -0700, Dave Hansen wrote:
> On 7/14/20 12:17 PM, Pawan Gupta wrote:
> > On Tue, Jul 14, 2020 at 07:57:53AM -0700, Dave Hansen wrote:
> >> Let's stick to things which are at least static per reboot.  Checking
> >> for X86_FEATURE_VMX or even CONFIG_KVM_INTEL seems like a good stopping
> >> point.  "Could this kernel run a naughty guest?"  If so, report
> >> "Vulnerable".  It's the same as Meltdown: "Could this kernel run
> >> untrusted code?"  If so, report "Vulnerable".
> > 
> > Thanks, These are good inputs. So what I need to add is a boot time
> > check for VMX feature and report "Vulnerable" or "Not
> > affected(VMX disabled)".
> > 
> > Are you suggesting to not change the reporting when KVM deploys the
> > "Split huge pages" mitigation? Is this because VMX can still be used by
> > other VMMs?
> > 
> > The current mitigation reporting is very specific to KVM:
> > 
> > - "KVM: Vulnerable"
> > - "KVM: Mitigation: Split huge pages"
> > 
> > As the kernel doesn't know about the mitigation state of out-of-tree
> > VMMs can we add VMX reporting to always say vulnerable when VMX is
> > enabled:
> > 
> > - "VMX: Vulnerable, KVM: Vulnerable"
> > - "VMX: Vulnerable, KVM: Mitigation: Split huge pages"
> > 
> > And if VMX is disabled report:
> > 
> > - "VMX: Not affected(VMX disabled)"
> 
> I see three inputs and four possible states (sorry for the ugly table,
> it was this or a spreadsheet :):
> 
> X86_FEATURE_VMX  CONFIG_KVM_*  hpage split  Result        Reason
>        N              x             x       Not Affected  No VMX
>        Y              N             x       Not affected  No KVM
>        Y              Y             Y       Mitigated     hpage split
>        Y              Y             N       Vulnerable

Thank you.

Just a note: for the last two cases, the kernel won't know about the "hpage
split" mitigation until KVM is loaded. So for these cases the reporting at
boot will be "Vulnerable" and will change to "Mitigated" once KVM is loaded
and deploys the mitigation. This is the current behavior.

Thanks,
Pawan
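
The table quoted above, together with the note about boot-time reporting, can be written as a small function. This is only a userspace sketch under stated assumptions: all names are hypothetical, and demo_hpage_split stands in for whatever flag KVM sets when it deploys the "hpage split" mitigation.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_result {
	DEMO_NOT_AFFECTED_NO_VMX,
	DEMO_NOT_AFFECTED_NO_KVM,
	DEMO_MITIGATED,
	DEMO_VULNERABLE,
};

/* One table row per branch; "x" entries are inputs that are ignored. */
enum demo_result demo_table(bool has_vmx, bool kvm_configured, bool hpage_split)
{
	if (!has_vmx)
		return DEMO_NOT_AFFECTED_NO_VMX;
	if (!kvm_configured)
		return DEMO_NOT_AFFECTED_NO_KVM;
	return hpage_split ? DEMO_MITIGATED : DEMO_VULNERABLE;
}

/* Sanity check: the "x" columns really are don't-cares. */
void demo_check_dont_cares(void)
{
	assert(demo_table(false, false, false) == demo_table(false, true, true));
	assert(demo_table(true, false, false) == demo_table(true, false, true));
}

/* Hypothetical flag that a KVM-like module would set when it loads. */
static bool demo_hpage_split;

void demo_kvm_load(void)
{
	demo_hpage_split = true;
}

/* Result as it would be reported at the time of the call. */
enum demo_result demo_current(bool has_vmx, bool kvm_configured)
{
	return demo_table(has_vmx, kvm_configured, demo_hpage_split);
}
```

With VMX and KVM both present, demo_current() returns DEMO_VULNERABLE until demo_kvm_load() runs and DEMO_MITIGATED afterwards, which mirrors the boot-time behavior described in the note.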


Re: [PATCH] x86/bugs/multihit: Fix mitigation reporting when KVM is not in use

2020-07-14 Thread Pawan Gupta
On Tue, Jul 14, 2020 at 07:57:53AM -0700, Dave Hansen wrote:
> Let's stick to things which are at least static per reboot.  Checking
> for X86_FEATURE_VMX or even CONFIG_KVM_INTEL seems like a good stopping
> point.  "Could this kernel run a naughty guest?"  If so, report
> "Vulnerable".  It's the same as Meltdown: "Could this kernel run
> untrusted code?"  If so, report "Vulnerable".

Thanks, these are good inputs. So what I need to add is a boot-time
check for the VMX feature that reports "Vulnerable" or "Not
affected(VMX disabled)".

Are you suggesting to not change the reporting when KVM deploys the
"Split huge pages" mitigation? Is this because VMX can still be used by
other VMMs?

The current mitigation reporting is very specific to KVM:

- "KVM: Vulnerable"
- "KVM: Mitigation: Split huge pages"

As the kernel doesn't know about the mitigation state of out-of-tree
VMMs, can we add VMX reporting to always say vulnerable when VMX is
enabled:

- "VMX: Vulnerable, KVM: Vulnerable"
- "VMX: Vulnerable, KVM: Mitigation: Split huge pages"

And if VMX is disabled report:

- "VMX: Not affected(VMX disabled)"

or something like that.

Thanks,
Pawan


[PATCH] x86/bugs/multihit: Fix mitigation reporting when KVM is not in use

2020-07-13 Thread Pawan Gupta
On systems where virtualization is disabled or the KVM module is not
loaded, the sysfs mitigation state of X86_BUG_ITLB_MULTIHIT is reported
incorrectly as:

  $ cat /sys/devices/system/cpu/vulnerabilities/itlb_multihit
  KVM: Vulnerable

The system is not vulnerable to a DoS attack from a rogue guest when:
 - the KVM module is not loaded, or
 - virtualization is disabled in the hardware, or
 - the kernel was configured without support for KVM

Change the reporting to "Currently not affected (KVM not in use)" for
such cases.

Reported-by: Nelson Dsouza 
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation")
Signed-off-by: Pawan Gupta 
Reviewed-by: Tony Luck 
---
 .../admin-guide/hw-vuln/multihit.rst  |  5 +++-
 arch/x86/include/asm/processor.h  |  6 +
 arch/x86/kernel/cpu/bugs.c| 24 +--
 arch/x86/kvm/mmu/mmu.c|  9 +--
 4 files changed, 29 insertions(+), 15 deletions(-)

diff --git a/Documentation/admin-guide/hw-vuln/multihit.rst b/Documentation/admin-guide/hw-vuln/multihit.rst
index ba9988d8bce5..842961419f3e 100644
--- a/Documentation/admin-guide/hw-vuln/multihit.rst
+++ b/Documentation/admin-guide/hw-vuln/multihit.rst
@@ -82,7 +82,10 @@ The possible values in this file are:
- Software changes mitigate this issue.
  * - KVM: Vulnerable
- The processor is vulnerable, but no mitigation enabled
-
+ * - Currently not affected (KVM not in use)
+   - The processor is vulnerable but no mitigation is required because
+ KVM module is not loaded or virtualization is disabled in the hardware or
+ kernel was configured without support for KVM.
 
 Enumeration of the erratum
 
diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 03b7c4ca425a..830a3e7725af 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -989,4 +989,10 @@ enum mds_mitigations {
MDS_MITIGATION_VMWERV,
 };
 
+enum itlb_multihit_mitigations {
+   ITLB_MULTIHIT_MITIGATION_OFF,
+   ITLB_MULTIHIT_MITIGATION_FULL,
+   ITLB_MULTIHIT_MITIGATION_NO_KVM,
+};
+
 #endif /* _ASM_X86_PROCESSOR_H */
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 0b71970d2d3d..97f66a93f2be 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -1395,8 +1395,15 @@ void x86_spec_ctrl_setup_ap(void)
x86_amd_ssb_disable();
 }
 
-bool itlb_multihit_kvm_mitigation;
-EXPORT_SYMBOL_GPL(itlb_multihit_kvm_mitigation);
+/* Default to KVM not in use, KVM module changes this later */
+enum itlb_multihit_mitigations itlb_multihit_mitigation = ITLB_MULTIHIT_MITIGATION_NO_KVM;
+EXPORT_SYMBOL_GPL(itlb_multihit_mitigation);
+
+static const char * const itlb_multihit_strings[] = {
+   [ITLB_MULTIHIT_MITIGATION_OFF]  = "KVM: Vulnerable",
+   [ITLB_MULTIHIT_MITIGATION_FULL] = "KVM: Mitigation: Split huge pages",
+   [ITLB_MULTIHIT_MITIGATION_NO_KVM]   = "Currently not affected (KVM not in use)",
+};
 
 #undef pr_fmt
 #define pr_fmt(fmt)"L1TF: " fmt
@@ -1553,25 +1560,18 @@ static ssize_t l1tf_show_state(char *buf)
   l1tf_vmx_states[l1tf_vmx_mitigation],
   sched_smt_active() ? "vulnerable" : "disabled");
 }
-
-static ssize_t itlb_multihit_show_state(char *buf)
-{
-   if (itlb_multihit_kvm_mitigation)
-   return sprintf(buf, "KVM: Mitigation: Split huge pages\n");
-   else
-   return sprintf(buf, "KVM: Vulnerable\n");
-}
 #else
 static ssize_t l1tf_show_state(char *buf)
 {
return sprintf(buf, "%s\n", L1TF_DEFAULT_MSG);
 }
+#endif
 
 static ssize_t itlb_multihit_show_state(char *buf)
 {
-   return sprintf(buf, "Processor vulnerable\n");
+   return sprintf(buf, "%s\n",
+  itlb_multihit_strings[itlb_multihit_mitigation]);
 }
-#endif
 
 static ssize_t mds_show_state(char *buf)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 6d6a0ae7800c..e089b9e565a5 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -50,7 +50,7 @@
 #include 
 #include "trace.h"
 
-extern bool itlb_multihit_kvm_mitigation;
+extern enum itlb_multihit_mitigations itlb_multihit_mitigation;
 
 static int __read_mostly nx_huge_pages = -1;
 #ifdef CONFIG_PREEMPT_RT
@@ -6158,7 +6158,12 @@ static bool get_nx_auto_mode(void)
 
 static void __set_nx_huge_pages(bool val)
 {
-   nx_huge_pages = itlb_multihit_kvm_mitigation = val;
+   nx_huge_pages = val;
+
+   if (val)
+   itlb_multihit_mitigation = ITLB_MULTIHIT_MITIGATION_FULL;
+   else
+   itlb_multihit_mitigation = ITLB_MULTIHIT_MITIGATION_OFF;
 }
 
 static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
-- 
2.21.3
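
The v1 approach replaces a boolean with a three-state enum and looks the report string up in a table indexed by that enum. A minimal userspace sketch of the pattern follows. The enum and the strings are taken from the patch; demo_set_nx_huge_pages() and demo_show_state() are hypothetical stand-ins for __set_nx_huge_pages() and itlb_multihit_show_state().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum itlb_multihit_mitigations {
	ITLB_MULTIHIT_MITIGATION_OFF,
	ITLB_MULTIHIT_MITIGATION_FULL,
	ITLB_MULTIHIT_MITIGATION_NO_KVM,
};

/* Designated initializers keep each string tied to its enum value. */
static const char * const itlb_multihit_strings[] = {
	[ITLB_MULTIHIT_MITIGATION_OFF]    = "KVM: Vulnerable",
	[ITLB_MULTIHIT_MITIGATION_FULL]   = "KVM: Mitigation: Split huge pages",
	[ITLB_MULTIHIT_MITIGATION_NO_KVM] = "Currently not affected (KVM not in use)",
};

/* Starts as "KVM not in use"; demo_set_nx_huge_pages() changes it later. */
enum itlb_multihit_mitigations itlb_multihit_mitigation =
	ITLB_MULTIHIT_MITIGATION_NO_KVM;

/* Hypothetical userspace stand-in for __set_nx_huge_pages() in the patch. */
void demo_set_nx_huge_pages(bool val)
{
	itlb_multihit_mitigation = val ? ITLB_MULTIHIT_MITIGATION_FULL
				       : ITLB_MULTIHIT_MITIGATION_OFF;
}

/* Looks up the report string for the current state. */
const char *demo_show_state(void)
{
	size_t n = sizeof(itlb_multihit_strings) /
		   sizeof(itlb_multihit_strings[0]);

	assert((size_t)itlb_multihit_mitigation < n);
	return itlb_multihit_strings[itlb_multihit_mitigation];
}
```

Note that once the setter has run, the state never returns to ITLB_MULTIHIT_MITIGATION_NO_KVM in this sketch, so the reported value depends on whether KVM has been loaded at some point. The replies in this thread argue for reporting only things that are static per boot.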



Re: [RFC PATCH v3 13/16] sched: Add core wide task selection and scheduling.

2019-06-07 Thread Pawan Gupta
On Wed, May 29, 2019 at 08:36:49PM +0000, Vineeth Remanan Pillai wrote:
> From: Peter Zijlstra 
> 
> Instead of only selecting a local task, select a task for all SMT
> siblings for every reschedule on the core (irrespective which logical
> CPU does the reschedule).
> 
> NOTE: there is still potential for siblings rivalry.
> NOTE: this is far too complicated; but thus far I've failed to
>   simplify it further.

It looks like there are still some race conditions while bringing CPUs
online/offline. I am seeing an easy-to-reproduce panic when turning SMT on/off
in a loop with core scheduling ON. I don't see the panic with core scheduling
OFF.

Steps to reproduce:

mkdir /sys/fs/cgroup/cpu/group1
mkdir /sys/fs/cgroup/cpu/group2
echo 1 > /sys/fs/cgroup/cpu/group1/cpu.tag
echo 1 > /sys/fs/cgroup/cpu/group2/cpu.tag

echo $$ > /sys/fs/cgroup/cpu/group1/tasks

while [ 1 ];  do
echo on  > /sys/devices/system/cpu/smt/control
echo off > /sys/devices/system/cpu/smt/control
done

Panic logs:
[  274.629437] BUG: unable to handle kernel NULL pointer dereference at 0024
[  274.630366] #PF error: [normal kernel read fault]
[  274.630933] PGD 80003e52c067 P4D 80003e52c067 PUD 0
[  274.631613] Oops:  [#1] SMP PTI
[  274.632016] CPU: 0 PID: 1470 Comm: bash Tainted: GW 5.1.4+ #33
[  274.632854] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[  274.634248] RIP: 0010:__schedule+0x9d4/0x1350
[  274.634699] Code: da 0f 83 21 04 00 00 48 8b 35 70 f3 ab 00 48 c7 c7 51 1c a8 81 e8 4c 4e 6b ff 49 8b 85 b8 0b 00 00 48 85 c0 0f 84 2f 09 00 01
[  274.636648] RSP: 0018:c98f3ca8 EFLAGS: 00010046
[  274.637197] RAX:  RBX: 0001 RCX: 0040
[  274.637941] RDX:  RSI:  RDI: 82544890
[  274.638691] RBP: c98f3d40 R08: 04c7 R09: 0030
[  274.639449] R10: 0001 R11: c98f3b28 R12: 88803d2d0e80
[  274.640172] R13: 88803eaa0a40 R14: 88803ea20a40 R15: 88803d2d0e80
[  274.640915] FS:  () GS:88803ea0(0063) knlGS:f7f8b780
[  274.641755] CS:  0010 DS: 002b ES: 002b CR0: 80050033
[  274.642355] CR2: 0024 CR3: 3c01a005 CR4: 00360ef0
[  274.643135] DR0:  DR1:  DR2:
[  274.643995] DR3:  DR6: fffe0ff0 DR7: 0400
[  274.645023] Call Trace:
[  274.645336]  schedule+0x28/0x70
[  274.645621]  native_cpu_up+0x271/0x6d0
[  274.645959]  ? cpus_read_trylock+0x40/0x40
[  274.646324]  bringup_cpu+0x2d/0xe0
[  274.646631]  cpuhp_invoke_callback+0x94/0x550
[  274.647032]  ? ring_buffer_record_is_set_on+0x10/0x10
[  274.647478]  _cpu_up+0xa9/0x140
[  274.647763]  store_smt_control+0x1cb/0x260
[  274.648132]  kernfs_fop_write+0x108/0x190
[  274.648498]  vfs_write+0xa5/0x1a0
[  274.648794]  ksys_write+0x57/0xd0
[  274.649100]  do_fast_syscall_32+0x92/0x220
[  274.649468]  entry_SYSENTER_compat+0x7c/0x8e


The NULL pointer exception is triggered when a sibling is offline during the
core task pick in pick_next_task(), which leaves rq_i->core_pick = NULL. If the
sibling comes online before the "Reschedule siblings" block in the same
function, this causes a panic in is_idle_task(rq_i->core_pick).

Traces for the scenario:
[ 274.599567] bash-1470 0d... 273921815us : __schedule: cpu(0) is online during core_pick
[ 274.600339] bash-1470 0d... 273921816us : __schedule: cpu(1) is offline during core_pick
[ 274.601106] bash-1470 0d... 273921816us : __schedule: picked: bash/1470 88803cb9c000
[ 274.602106] bash-1470 0d... 273921816us : __schedule: cpu(0) is online.. during Reschedule siblings
[ 274.603219] bash-1470 0d... 273921816us : __schedule: cpu(1) is online.. during Reschedule siblings
[ 274.604333] <idle>-0 1d... 273921816us : start_secondary: cpu(1) is online now
[ 274.605239] bash-1470 0d... 273922148us : __schedule: rq_i->core_pick on cpu(1) is NULL

With the change below, I am no longer able to reproduce the panic. I am not
sure if this is the right fix. Maybe CPUs should not be allowed to go
online/offline while pick_next_task() is executing.

-- 8< ---
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 90655c9ad937..b230b095772a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3874,7 +3874,7 @@ next_class:;
for_each_cpu(i, smt_mask) {
struct rq *rq_i = cpu_rq(i);
 
-   if (cpu_is_offline(i))
+   if (cpu_is_offline(i) || !rq_i->core_pick)
continue;
 
WARN_ON_ONCE(!rq_i->core_pick);
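
The effect of the extra NULL test can be illustrated with a small userspace model of the "Reschedule siblings" loop. This is a sketch under stated assumptions, not the scheduler code: struct demo_rq, struct demo_task, and both functions are hypothetical, and the online flag stands in for !cpu_is_offline(i).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_task {
	bool is_idle;
};

struct demo_rq {
	bool online;                       /* stand-in for !cpu_is_offline(i) */
	const struct demo_task *core_pick; /* NULL if no pick was made */
};

/* Counts siblings with a non-idle pick, skipping CPUs that must be ignored. */
int demo_reschedule_siblings(const struct demo_rq *rqs, size_t n)
{
	int busy = 0;

	assert(rqs != NULL);
	for (size_t i = 0; i < n; i++) {
		const struct demo_rq *rq_i = &rqs[i];

		/* The extra NULL test is the guard proposed in the diff. */
		if (!rq_i->online || !rq_i->core_pick)
			continue;
		/* Without the guard, this dereference would hit NULL. */
		if (!rq_i->core_pick->is_idle)
			busy++;
	}
	return busy;
}

/* Replays the traced race: cpu(1) came online after the pick was made. */
int demo_race_scenario(void)
{
	static const struct demo_task bash = { .is_idle = false };
	const struct demo_rq rqs[2] = {
		{ .online = true, .core_pick = &bash },
		{ .online = true, .core_pick = NULL },
	};

	return demo_reschedule_siblings(rqs, 2);
}
```

In this model the second run queue is online but has no pick, which matches the trace where cpu(1) comes online between the pick and the reschedule step. The guard skips it instead of dereferencing a NULL pointer.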


Re: [RFC][PATCH 00/16] sched: Core scheduling

2019-03-12 Thread Pawan Gupta
Hi,

With core scheduling, LTP reports two new failures related to cgroups
(memcg_stat_rss and memcg_move_charge_at_immigrate). I will try to debug
them.

Also, "perf sched map" indicates there might be a small window when two
processes in different cgroups run together on one core. In the case below,
B0 and D0 (stress-ng-cpu and sysbench) belong to two different cgroups with
cpu.tag enabled.

$ perf sched map

  *A0 382.266600 secs A0 => kworker/0:1-eve:51
  *B0 382.266612 secs B0 => stress-ng-cpu:7956
  *A0 382.394597 secs 
  *B0 382.394609 secs 
   B0 *C0 382.494459 secs C0 => i915/signal:0:450
   B0 *D0 382.494468 secs D0 => sysbench:8088
  *.   D0 382.494472 secs .  => swapper:0
   .  *C0 383.095787 secs 
  *B0  C0 383.095792 secs 
   B0 *D0 383.095820 secs
  *A0  D0 383.096587 secs

In some cases I don't see an IPI being sent to the sibling CPU when two
incompatible processes are picked. For example, in the logs below at timestamp
382.146250, "stress-ng-cpu" is picked while "sysbench" is running on the
sibling CPU.

  kworker/0:1-51[000] d...   382.146246: __schedule: cpu(0): selected: stress-ng-cpu/7956 9945bad29200
  kworker/0:1-51[000] d...   382.146246: __schedule: max: stress-ng-cpu/7956 9945bad29200
  kworker/0:1-51[000] d...   382.146247: __prio_less: (swapper/4/0;140,0,0) ?< (sysbench/8088;140,34783671987,0)
  kworker/0:1-51[000] d...   382.146248: __prio_less: (stress-ng-cpu/7956;119,34817170203,0) ?< (sysbench/8088;119,34783671987,0)
  kworker/0:1-51[000] d...   382.146249: __schedule: cpu(4): selected: sysbench/8088 9945a7405200
  kworker/0:1-51[000] d...   382.146249: __prio_less: (stress-ng-cpu/7956;119,34817170203,0) ?< (sysbench/8088;119,34783671987,0)
  kworker/0:1-51[000] d...   382.146250: __schedule: picked: stress-ng-cpu/7956 9945bad29200
  kworker/0:1-51[000] d...   382.146251: __switch_to: Pawan: cpu(0) switching to stress-ng-cpu
  kworker/0:1-51[000] d...   382.146251: __switch_to: Pawan: cpu(4) running sysbench
stress-ng-cpu-7956  [000] dN..   382.274234: __schedule: cpu(0): selected: kworker/0:1/51 0
stress-ng-cpu-7956  [000] dN..   382.274235: __schedule: max: kworker/0:1/51 0
stress-ng-cpu-7956  [000] dN..   382.274235: __schedule: cpu(4): selected: sysbench/8088 9945a7405200
stress-ng-cpu-7956  [000] dN..   382.274237: __prio_less: (kworker/0:1/51;119,50744489595,0) ?< (sysbench/8088;119,34911643157,0)
stress-ng-cpu-7956  [000] dN..   382.274237: __schedule: picked: kworker/0:1/51 0
stress-ng-cpu-7956  [000] d...   382.274239: __switch_to: Pawan: cpu(0) switching to kworker/0:1
stress-ng-cpu-7956  [000] d...   382.274239: __switch_to: Pawan: cpu(4) running sysbench

-Pawan