Re: [PATCH v3 21/22] arm64: Delay enabling hardware DBM feature

2018-03-08 Thread Mark Rutland
On Thu, Mar 08, 2018 at 01:53:13PM +, Dave Martin wrote:
> In practice, we don't race booting of two CPUs against each other
> IIUC.

Yup. Today, hotplugs are strictly serialized.

Several things would have to change for parallel hotplug to be feasible.

Thanks,
Mark.


Re: [PATCH v3 21/22] arm64: Delay enabling hardware DBM feature

2018-03-08 Thread Dave Martin
On Wed, Mar 07, 2018 at 05:39:09PM +, Suzuki K Poulose wrote:
> On 09/02/18 18:58, Dave Martin wrote:
> >On Fri, Feb 09, 2018 at 05:55:12PM +, Suzuki K Poulose wrote:
> >>We enable hardware DBM bit in a capable CPU, very early in the
> >>boot via __cpu_setup. This doesn't give us a flexibility of
> >>optionally disable the feature, as the clearing the bit
> >>is a bit costly as the TLB can cache the settings. Instead,
> >>we delay enabling the feature until the CPU is brought up
> >>into the kernel. We use the feature capability mechanism
> >>to handle it.
> >>
> >>The hardware DBM is a non-conflicting feature. i.e, the kernel
> >>can safely run with a mix of CPUs with some using the feature
> >>and the others don't. So, it is safe for a late CPU to have
> >>this capability and enable it, even if the active CPUs don't.
> >>
> >>To get this handled properly by the infrastructure, we
> >>unconditionally set the capability and only enable it
> >>on CPUs which really have the feature. Also, we print the
> >>feature detection from the "matches" call back to make sure
> >>we don't mislead the user when none of the CPUs could use the
> >>feature.
> >>
> >>Cc: Catalin Marinas 
> >>Cc: Dave Martin 
> >>Signed-off-by: Suzuki K Poulose 
> >>---
> >>Changes since V2
> >>  - Print the feature detection message only when at least one CPU
> >>is actually using it.
> 
> 
> >>+static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
> >>+  int __unused)
> >>+{
> >>+   static bool detected = false;
> >>+   /*
> >>+* DBM is a non-conflicting feature. i.e, the kernel can safely
> >>+* run a mix of CPUs with and without the feature. So, we
> >>+* unconditionally enable the capability to allow any late CPU
> >>+* to use the feature. We only enable the control bits on the
> >>+* CPU, if it actually supports.
> >>+*
> >>+* We have to make sure we print the "feature" detection only
> >>+* when at least one CPU actually uses it. So check if this CPU
> >>+* can actually use it and print the message exactly once.
> >>+*
> >>+* This is safe as all CPUs (including secondary CPUs - due to the
> >>+* LOCAL_CPU scope - and the hotplugged CPUs - via verification)
> >>+* goes through the "matches" check exactly once. Also if a CPU
> >>+* matches the criteria, it is guaranteed that the CPU will turn
> >>+* the DBM on, as the capability is unconditionally enabled.
> >>+*/
> >>+   if (!detected && cpu_can_use_dbm(cap)) {
> >>+   detected = true;
> >>+   pr_info("detected feature: Hardware dirty bit management\n");
> >>+   }
> >
> >Can we just do
> >
> > if (cpu_can_use_dbm(cap))
> > pr_info_once(...);
> >
> >Then we can get rid of "detected".
> 
> The reason for open coding is the cost of cpu_can_use_dbm() with
> addition of black listed CPUs in the next patch in the series.

Oh, I see.  Yes, that makes sense.

There's obvious raciness here, but I guess pr_info_once() doesn't defend
against that either.  In practice, we don't race booting of two CPUs
against each other IIUC.

If you really like you could make detected __read_mostly like
printk_once() does, but that's no big deal if this is not on a hot path
(and probably not even then).

Cheers
---Dave


Re: [PATCH v3 21/22] arm64: Delay enabling hardware DBM feature

2018-03-07 Thread Suzuki K Poulose

On 09/02/18 18:58, Dave Martin wrote:

On Fri, Feb 09, 2018 at 05:55:12PM +, Suzuki K Poulose wrote:

We enable hardware DBM bit in a capable CPU, very early in the
boot via __cpu_setup. This doesn't give us a flexibility of
optionally disable the feature, as the clearing the bit
is a bit costly as the TLB can cache the settings. Instead,
we delay enabling the feature until the CPU is brought up
into the kernel. We use the feature capability mechanism
to handle it.

The hardware DBM is a non-conflicting feature. i.e, the kernel
can safely run with a mix of CPUs with some using the feature
and the others don't. So, it is safe for a late CPU to have
this capability and enable it, even if the active CPUs don't.

To get this handled properly by the infrastructure, we
unconditionally set the capability and only enable it
on CPUs which really have the feature. Also, we print the
feature detection from the "matches" call back to make sure
we don't mislead the user when none of the CPUs could use the
feature.

Cc: Catalin Marinas 
Cc: Dave Martin 
Signed-off-by: Suzuki K Poulose 
---
Changes since V2
  - Print the feature detection message only when at least one CPU
is actually using it.




+static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
+  int __unused)
+{
+   static bool detected = false;
+   /*
+* DBM is a non-conflicting feature. i.e, the kernel can safely
+* run a mix of CPUs with and without the feature. So, we
+* unconditionally enable the capability to allow any late CPU
+* to use the feature. We only enable the control bits on the
+* CPU, if it actually supports.
+*
+* We have to make sure we print the "feature" detection only
+* when at least one CPU actually uses it. So check if this CPU
+* can actually use it and print the message exactly once.
+*
+* This is safe as all CPUs (including secondary CPUs - due to the
+* LOCAL_CPU scope - and the hotplugged CPUs - via verification)
+* goes through the "matches" check exactly once. Also if a CPU
+* matches the criteria, it is guaranteed that the CPU will turn
+* the DBM on, as the capability is unconditionally enabled.
+*/
+   if (!detected && cpu_can_use_dbm(cap)) {
+   detected = true;
+   pr_info("detected feature: Hardware dirty bit management\n");
+   }


Can we just do

if (cpu_can_use_dbm(cap))
pr_info_once(...);

Then we can get rid of "detected".


The reason for open coding is the cost of cpu_can_use_dbm() with
addition of black listed CPUs in the next patch in the series.

Cheers
Suzuki


Re: [PATCH v3 21/22] arm64: Delay enabling hardware DBM feature

2018-02-09 Thread Dave Martin
On Fri, Feb 09, 2018 at 05:55:12PM +, Suzuki K Poulose wrote:
> We enable hardware DBM bit in a capable CPU, very early in the
> boot via __cpu_setup. This doesn't give us a flexibility of
> optionally disable the feature, as the clearing the bit
> is a bit costly as the TLB can cache the settings. Instead,
> we delay enabling the feature until the CPU is brought up
> into the kernel. We use the feature capability mechanism
> to handle it.
> 
> The hardware DBM is a non-conflicting feature. i.e, the kernel
> can safely run with a mix of CPUs with some using the feature
> and the others don't. So, it is safe for a late CPU to have
> this capability and enable it, even if the active CPUs don't.
> 
> To get this handled properly by the infrastructure, we
> unconditionally set the capability and only enable it
> on CPUs which really have the feature. Also, we print the
> feature detection from the "matches" call back to make sure
> we don't mislead the user when none of the CPUs could use the
> feature.
> 
> Cc: Catalin Marinas 
> Cc: Dave Martin 
> Signed-off-by: Suzuki K Poulose 
> ---
> Changes since V2
>  - Print the feature detection message only when at least one CPU
>is actually using it.
> ---
>  arch/arm64/include/asm/cpucaps.h |  3 +-
>  arch/arm64/kernel/cpufeature.c   | 71 
> 
>  arch/arm64/mm/proc.S | 13 
>  3 files changed, 79 insertions(+), 8 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/cpucaps.h 
> b/arch/arm64/include/asm/cpucaps.h
> index bb263820de13..8df80cc828ac 100644
> --- a/arch/arm64/include/asm/cpucaps.h
> +++ b/arch/arm64/include/asm/cpucaps.h
> @@ -45,7 +45,8 @@
>  #define ARM64_HARDEN_BRANCH_PREDICTOR24
>  #define ARM64_HARDEN_BP_POST_GUEST_EXIT  25
>  #define ARM64_HAS_RAS_EXTN   26
> +#define ARM64_HW_DBM 27
>  
> -#define ARM64_NCAPS  27
> +#define ARM64_NCAPS  28
>  
>  #endif /* __ASM_CPUCAPS_H */
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index d8663822c604..a96a1f94f427 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -939,6 +939,57 @@ static int __init parse_kpti(char *str)
>  __setup("kpti=", parse_kpti);
>  #endif   /* CONFIG_UNMAP_KERNEL_AT_EL0 */
>  
> +#ifdef CONFIG_ARM64_HW_AFDBM
> +static inline void __cpu_enable_hw_dbm(void)
> +{
> + u64 tcr = read_sysreg(tcr_el1) | TCR_HD;
> +
> + write_sysreg(tcr, tcr_el1);
> + isb();
> +}
> +
> +static bool cpu_can_use_dbm(const struct arm64_cpu_capabilities *cap)
> +{
> + return has_cpuid_feature(cap, SCOPE_LOCAL_CPU);
> +}
> +
> +static void cpu_enable_hw_dbm(struct arm64_cpu_capabilities const *cap)
> +{
> + if (cpu_can_use_dbm(cap))
> + __cpu_enable_hw_dbm();
> +}
> +
> +static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
> +int __unused)
> +{
> + static bool detected = false;
> + /*
> +  * DBM is a non-conflicting feature. i.e, the kernel can safely
> +  * run a mix of CPUs with and without the feature. So, we
> +  * unconditionally enable the capability to allow any late CPU
> +  * to use the feature. We only enable the control bits on the
> +  * CPU, if it actually supports.
> +  *
> +  * We have to make sure we print the "feature" detection only
> +  * when at least one CPU actually uses it. So check if this CPU
> +  * can actually use it and print the message exactly once.
> +  *
> +  * This is safe as all CPUs (including secondary CPUs - due to the
> +  * LOCAL_CPU scope - and the hotplugged CPUs - via verification)
> +  * goes through the "matches" check exactly once. Also if a CPU
> +  * matches the criteria, it is guaranteed that the CPU will turn
> +  * the DBM on, as the capability is unconditionally enabled.
> +  */
> + if (!detected && cpu_can_use_dbm(cap)) {
> + detected = true;
> + pr_info("detected feature: Hardware dirty bit management\n");
> + }

Can we just do

if (cpu_can_use_dbm(cap))
pr_info_once(...);

Then we can get rid of "detected".

[...]

Cheers
---Dave


[PATCH v3 21/22] arm64: Delay enabling hardware DBM feature

2018-02-09 Thread Suzuki K Poulose
We enable hardware DBM bit in a capable CPU, very early in the
boot via __cpu_setup. This doesn't give us a flexibility of
optionally disable the feature, as the clearing the bit
is a bit costly as the TLB can cache the settings. Instead,
we delay enabling the feature until the CPU is brought up
into the kernel. We use the feature capability mechanism
to handle it.

The hardware DBM is a non-conflicting feature. i.e, the kernel
can safely run with a mix of CPUs with some using the feature
and the others don't. So, it is safe for a late CPU to have
this capability and enable it, even if the active CPUs don't.

To get this handled properly by the infrastructure, we
unconditionally set the capability and only enable it
on CPUs which really have the feature. Also, we print the
feature detection from the "matches" call back to make sure
we don't mislead the user when none of the CPUs could use the
feature.

Cc: Catalin Marinas 
Cc: Dave Martin 
Signed-off-by: Suzuki K Poulose 
---
Changes since V2
 - Print the feature detection message only when at least one CPU
   is actually using it.
---
 arch/arm64/include/asm/cpucaps.h |  3 +-
 arch/arm64/kernel/cpufeature.c   | 71 
 arch/arm64/mm/proc.S | 13 
 3 files changed, 79 insertions(+), 8 deletions(-)

diff --git a/arch/arm64/include/asm/cpucaps.h b/arch/arm64/include/asm/cpucaps.h
index bb263820de13..8df80cc828ac 100644
--- a/arch/arm64/include/asm/cpucaps.h
+++ b/arch/arm64/include/asm/cpucaps.h
@@ -45,7 +45,8 @@
 #define ARM64_HARDEN_BRANCH_PREDICTOR  24
 #define ARM64_HARDEN_BP_POST_GUEST_EXIT25
 #define ARM64_HAS_RAS_EXTN 26
+#define ARM64_HW_DBM   27
 
-#define ARM64_NCAPS27
+#define ARM64_NCAPS28
 
 #endif /* __ASM_CPUCAPS_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d8663822c604..a96a1f94f427 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -939,6 +939,57 @@ static int __init parse_kpti(char *str)
 __setup("kpti=", parse_kpti);
 #endif /* CONFIG_UNMAP_KERNEL_AT_EL0 */
 
+#ifdef CONFIG_ARM64_HW_AFDBM
+static inline void __cpu_enable_hw_dbm(void)
+{
+   u64 tcr = read_sysreg(tcr_el1) | TCR_HD;
+
+   write_sysreg(tcr, tcr_el1);
+   isb();
+}
+
+static bool cpu_can_use_dbm(const struct arm64_cpu_capabilities *cap)
+{
+   return has_cpuid_feature(cap, SCOPE_LOCAL_CPU);
+}
+
+static void cpu_enable_hw_dbm(struct arm64_cpu_capabilities const *cap)
+{
+   if (cpu_can_use_dbm(cap))
+   __cpu_enable_hw_dbm();
+}
+
+static bool has_hw_dbm(const struct arm64_cpu_capabilities *cap,
+  int __unused)
+{
+   static bool detected = false;
+   /*
+* DBM is a non-conflicting feature. i.e, the kernel can safely
+* run a mix of CPUs with and without the feature. So, we
+* unconditionally enable the capability to allow any late CPU
+* to use the feature. We only enable the control bits on the
+* CPU, if it actually supports.
+*
+* We have to make sure we print the "feature" detection only
+* when at least one CPU actually uses it. So check if this CPU
+* can actually use it and print the message exactly once.
+*
+* This is safe as all CPUs (including secondary CPUs - due to the
+* LOCAL_CPU scope - and the hotplugged CPUs - via verification)
+* goes through the "matches" check exactly once. Also if a CPU
+* matches the criteria, it is guaranteed that the CPU will turn
+* the DBM on, as the capability is unconditionally enabled.
+*/
+   if (!detected && cpu_can_use_dbm(cap)) {
+   detected = true;
+   pr_info("detected feature: Hardware dirty bit management\n");
+   }
+
+   return true;
+}
+
+#endif
+
 static void cpu_copy_el2regs(const struct arm64_cpu_capabilities *__unused)
 {
/*
@@ -1103,6 +1154,26 @@ static const struct arm64_cpu_capabilities 
arm64_features[] = {
.cpu_enable = cpu_clear_disr,
},
 #endif /* CONFIG_ARM64_RAS_EXTN */
+#ifdef CONFIG_ARM64_HW_AFDBM
+   {
+   /*
+* Since we turn this on always, we don't want the user to
+* think that the feature is available when it may not be.
+* So hide the description.
+*
+* .desc = "Hardware pagetable Dirty Bit Management",
+*
+*/
+   .type = ARM64_CPUCAP_WEAK_LOCAL_CPU_FEATURE,
+   .capability = ARM64_HW_DBM,
+   .sys_reg = SYS_ID_AA64MMFR1_EL1,
+   .sign = FTR_UNSIGNED,
+   .field_pos = ID_AA64MMFR1_HADBS_SHIFT,
+   .min_field_value = 2,
+   .matches = has_hw_dbm,
+   .cpu_enable = cpu_enable_hw