[Bug 203571] New: may_use_simd() returns false in kworkers

2019-05-10 Thread bugzilla-daemon
https://bugzilla.kernel.org/show_bug.cgi?id=203571

Bug ID: 203571
   Summary: may_use_simd() returns false in kworkers
   Product: Platform Specific/Hardware
   Version: 2.5
Kernel Version: 4.19
  Hardware: All
OS: Linux
  Tree: Mainline
Status: NEW
  Severity: normal
  Priority: P1
 Component: PPC-64
  Assignee: platform_ppc...@kernel-bugs.osdl.org
  Reporter: sland...@gmail.com
Regression: No

This patch, which optimizes ChaCha20 for WireGuard:
https://github.com/shawnl/WireGuard/commit/3e02fce92a14cba7b7d1e2733def3a51bec97498

doesn't work because may_use_simd() always returns false in kworkers. If I
remove the check, everything seems to work fine.
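
For reference, the usual shape of such a guard, as a minimal sketch (the
chacha20_vsx()/chacha20_generic() names are illustrative stand-ins rather
than the actual WireGuard code; may_use_simd() comes from <asm/simd.h> /
<asm-generic/simd.h>, enable_kernel_vsx()/disable_kernel_vsx() from
<asm/switch_to.h>):

	static void chacha20_crypt(u8 *dst, const u8 *src, unsigned int len)
	{
		if (!may_use_simd()) {
			/* scalar fallback; per this bug, always taken in kworkers */
			chacha20_generic(dst, src, len);
			return;
		}
		enable_kernel_vsx();	/* make VSX usable in kernel context */
		chacha20_vsx(dst, src, len);
		disable_kernel_vsx();
	}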

-- 
You are receiving this mail because:
You are watching the assignee of the bug.

Re: [PATCH 2/2] powerpc/perf: Fix mmcra corruption by bhrb_filter

2019-05-10 Thread Ravi Bangoria



On 5/11/19 8:12 AM, Ravi Bangoria wrote:
> Consider a scenario where a user creates two events:
> 
>   1st event:
> attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
> attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
> fd = perf_event_open(attr, 0, 1, -1, 0);
> 
>   This sets cpuhw->bhrb_filter to 0 and returns a valid fd.
> 
>   2nd event:
> attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
> attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
> fd = perf_event_open(attr, 0, 1, -1, 0);
> 
>   It overwrites cpuhw->bhrb_filter with -1 and returns an error.
> 
> Now if power_pmu_enable() gets called by any path other than
> power_pmu_add(), ppmu->config_bhrb(-1) will set mmcra to -1.
> 
> Signed-off-by: Ravi Bangoria 

Fixes: 3925f46bb590 ("powerpc/perf: Enable branch stack sampling framework")
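
For completeness, a minimal userspace reproducer for the scenario above could
look like the following sketch (error handling omitted; the event type/config
and sample_period values are arbitrary):

	#include <string.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/perf_event.h>

	static int open_branch_event(__u64 branch_sample_type)
	{
		struct perf_event_attr attr;

		memset(&attr, 0, sizeof(attr));
		attr.size = sizeof(attr);
		attr.type = PERF_TYPE_HARDWARE;
		attr.config = PERF_COUNT_HW_CPU_CYCLES;
		attr.sample_period = 100000;
		attr.sample_type = PERF_SAMPLE_IP | PERF_SAMPLE_BRANCH_STACK;
		attr.branch_sample_type = branch_sample_type;

		/* pid = 0, cpu = 1, group_fd = -1, flags = 0, as above */
		return syscall(SYS_perf_event_open, &attr, 0, 1, -1, 0);
	}

	int main(void)
	{
		int fd1 = open_branch_event(PERF_SAMPLE_BRANCH_ANY);  /* valid fd */
		int fd2 = open_branch_event(PERF_SAMPLE_BRANCH_CALL); /* fails, but
				clobbers cpuhw->bhrb_filter without the fix */
		(void)fd1;
		(void)fd2;
		return 0;
	}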



[PATCH 2/2] powerpc/perf: Fix mmcra corruption by bhrb_filter

2019-05-10 Thread Ravi Bangoria
Consider a scenario where a user creates two events:

  1st event:
attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
attr.branch_sample_type = PERF_SAMPLE_BRANCH_ANY;
fd = perf_event_open(attr, 0, 1, -1, 0);

  This sets cpuhw->bhrb_filter to 0 and returns a valid fd.

  2nd event:
attr.sample_type |= PERF_SAMPLE_BRANCH_STACK;
attr.branch_sample_type = PERF_SAMPLE_BRANCH_CALL;
fd = perf_event_open(attr, 0, 1, -1, 0);

  It overwrites cpuhw->bhrb_filter with -1 and returns an error.

Now if power_pmu_enable() gets called by any path other than
power_pmu_add(), ppmu->config_bhrb(-1) will set mmcra to -1.

Signed-off-by: Ravi Bangoria 
---
 arch/powerpc/perf/core-book3s.c | 6 --
 arch/powerpc/perf/power8-pmu.c  | 3 +++
 arch/powerpc/perf/power9-pmu.c  | 3 +++
 3 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/perf/core-book3s.c b/arch/powerpc/perf/core-book3s.c
index b0723002a396..8eb5dc5df62b 100644
--- a/arch/powerpc/perf/core-book3s.c
+++ b/arch/powerpc/perf/core-book3s.c
@@ -1846,6 +1846,7 @@ static int power_pmu_event_init(struct perf_event *event)
int n;
int err;
struct cpu_hw_events *cpuhw;
+   u64 bhrb_filter;
 
if (!ppmu)
return -ENOENT;
@@ -1951,13 +1952,14 @@ static int power_pmu_event_init(struct perf_event *event)
err = power_check_constraints(cpuhw, events, cflags, n + 1);
 
if (has_branch_stack(event)) {
-   cpuhw->bhrb_filter = ppmu->bhrb_filter_map(
+   bhrb_filter = ppmu->bhrb_filter_map(
event->attr.branch_sample_type);
 
-   if (cpuhw->bhrb_filter == -1) {
+   if (bhrb_filter == -1) {
put_cpu_var(cpu_hw_events);
return -EOPNOTSUPP;
}
+   cpuhw->bhrb_filter = bhrb_filter;
}
 
put_cpu_var(cpu_hw_events);
diff --git a/arch/powerpc/perf/power8-pmu.c b/arch/powerpc/perf/power8-pmu.c
index d12a2db26353..d10feef93b6b 100644
--- a/arch/powerpc/perf/power8-pmu.c
+++ b/arch/powerpc/perf/power8-pmu.c
@@ -29,6 +29,7 @@ enum {
 #define POWER8_MMCRA_IFM1		0x0000000040000000UL
 #define POWER8_MMCRA_IFM2		0x0000000080000000UL
 #define POWER8_MMCRA_IFM3		0x00000000C0000000UL
+#define POWER8_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /*
  * Raw event encoding for PowerISA v2.07 (Power8):
@@ -243,6 +244,8 @@ static u64 power8_bhrb_filter_map(u64 branch_sample_type)
 
 static void power8_config_bhrb(u64 pmu_bhrb_filter)
 {
+   pmu_bhrb_filter &= POWER8_MMCRA_BHRB_MASK;
+
/* Enable BHRB filter in PMU */
mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
 }
diff --git a/arch/powerpc/perf/power9-pmu.c b/arch/powerpc/perf/power9-pmu.c
index 030544e35959..f3987915cadc 100644
--- a/arch/powerpc/perf/power9-pmu.c
+++ b/arch/powerpc/perf/power9-pmu.c
@@ -92,6 +92,7 @@ enum {
 #define POWER9_MMCRA_IFM1		0x0000000040000000UL
 #define POWER9_MMCRA_IFM2		0x0000000080000000UL
 #define POWER9_MMCRA_IFM3		0x00000000C0000000UL
+#define POWER9_MMCRA_BHRB_MASK		0x00000000C0000000UL
 
 /* Nasty Power9 specific hack */
 #define PVR_POWER9_CUMULUS		0x00002000
@@ -300,6 +301,8 @@ static u64 power9_bhrb_filter_map(u64 branch_sample_type)
 
 static void power9_config_bhrb(u64 pmu_bhrb_filter)
 {
+   pmu_bhrb_filter &= POWER9_MMCRA_BHRB_MASK;
+
/* Enable BHRB filter in PMU */
mtspr(SPRN_MMCRA, (mfspr(SPRN_MMCRA) | pmu_bhrb_filter));
 }
-- 
2.20.1



[PATCH 1/2] perf ioctl: Add check for the sample_period value

2019-05-10 Thread Ravi Bangoria
Add a check for the sample_period value sent from userspace. A negative
value does not make sense. Moreover, in powerpc arch code it can cause a
recursive PMI, leading to a hang (reported when running perf-fuzzer).

Signed-off-by: Ravi Bangoria 
---
 kernel/events/core.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index abbd4b3b96c2..e44c90378940 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5005,6 +5005,9 @@ static int perf_event_period(struct perf_event *event, u64 __user *arg)
if (perf_event_check_period(event, value))
return -EINVAL;
 
+   if (!event->attr.freq && (value & (1ULL << 63)))
+   return -EINVAL;
+
	event_function_call(event, __perf_event_period, &value);
 
return 0;
-- 
2.20.1
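
To illustrate the new check from userspace (a sketch; fd is assumed to be a
perf event opened with attr.freq == 0):

	#include <sys/ioctl.h>
	#include <linux/perf_event.h>

	__u64 period = 1ULL << 63;	/* negative when interpreted as s64 */

	/* With the check above, this now fails with EINVAL instead of
	 * potentially triggering a recursive PMI on powerpc: */
	int rc = ioctl(fd, PERF_EVENT_IOC_PERIOD, &period);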



Re: [PATCH, RFC] byteorder: sanity check toolchain vs kernel endianess

2019-05-10 Thread Arnd Bergmann
On Fri, May 10, 2019 at 6:53 AM Dmitry Vyukov  wrote:
> >
> > I think it's good to have a sanity check in-place for consistency.
>
>
> Hi,
>
> This broke our cross-builds from x86. I am using:
>
> $ powerpc64le-linux-gnu-gcc --version
> powerpc64le-linux-gnu-gcc (Debian 7.2.0-7) 7.2.0
>
> and it says that it's little-endian somehow:
>
> $ powerpc64le-linux-gnu-gcc -dM -E - < /dev/null | grep BYTE_ORDER
> #define __BYTE_ORDER__ __ORDER_LITTLE_ENDIAN__
>
> Is the compiler broken? Or am I just holding it wrong? Is there some
> additional flag I need to add?

It looks like a bug in the kernel Makefiles to me. powerpc32 is always
big-endian, powerpc64 used to be big-endian but is now usually
little-endian. There are often three separate toolchains that default to
the respective user space targets (ppc32be, ppc64be, ppc64le), but
generally you should be able to build any of the three kernel
configurations with any of those compilers, and have the Makefile pass
the correct -m32/-m64/-mbig-endian/-mlittle-endian command line options
depending on the kernel configuration. It seems that this is not
happening here. I have not checked why, but if this is the problem, it
should be easy enough to figure out.

   Arnd
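
As a quick illustration of Arnd's point that one toolchain can target either
endianness once the right flag is passed (hypothetical transcript with the
same compiler as above):

$ powerpc64le-linux-gnu-gcc -mbig-endian -dM -E - < /dev/null | grep BYTE_ORDER
#define __BYTE_ORDER__ __ORDER_BIG_ENDIAN__

So the sanity check should be satisfied once the kernel Makefile passes
-mbig-endian/-mlittle-endian according to the kernel configuration instead of
relying on the toolchain's default.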


Re: [PATCH kernel 2/2] powerpc/pseries/dma: Enable swiotlb

2019-05-10 Thread Thiago Jung Bauermann


Hello Alexey,

Thanks!

I have similar changes in my "Secure Virtual Machine Enablement"
patches, which I am currently preparing for posting again real soon now.

This is the last version:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-1-bauer...@linux.ibm.com/

Alexey Kardashevskiy  writes:

> So far the pseries platform has always been using an IOMMU, making SWIOTLB
> unnecessary. Now we want secure guests, which means devices can only
> access certain areas of guest physical memory; we are going to use
> SWIOTLB for this purpose.
>
> This allows SWIOTLB for pseries. By default there is no change in behavior.
>
> This enables SWIOTLB when the "swiotlb" kernel parameter is set to "force".
>
> With the SWIOTLB enabled, the kernel creates a directly mapped DMA window
> (using the usual DDW mechanism) and implements SWIOTLB on top of that.
>
> Signed-off-by: Alexey Kardashevskiy 
> ---
>  arch/powerpc/platforms/pseries/setup.c | 5 +
>  arch/powerpc/platforms/pseries/Kconfig | 1 +
>  2 files changed, 6 insertions(+)
>
> diff --git a/arch/powerpc/platforms/pseries/setup.c 
> b/arch/powerpc/platforms/pseries/setup.c
> index e4f0dfd4ae33..30d72b587ac5 100644
> --- a/arch/powerpc/platforms/pseries/setup.c
> +++ b/arch/powerpc/platforms/pseries/setup.c
> @@ -42,6 +42,7 @@
>  #include 
>  #include 
>  #include 
> +#include <linux/swiotlb.h>
>
>  #include 
>  #include 
> @@ -71,6 +72,7 @@
>  #include 
>  #include 
>  #include 
> +#include <asm/swiotlb.h>
>
>  #include "pseries.h"
>  #include "../../../../drivers/pci/pci.h"
> @@ -797,6 +799,9 @@ static void __init pSeries_setup_arch(void)
>   }
>
>   ppc_md.pcibios_root_bridge_prepare = pseries_root_bridge_prepare;
> +
> + if (swiotlb_force == SWIOTLB_FORCE)
> + ppc_swiotlb_enable = 1;
>  }

Yep! I have this here, enabled when booting as a secure guest:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-6-bauer...@linux.ibm.com/

And also another patch which makes it so that if booting as a secure
guest it acts as if the swiotlb kernel parameter was set to force:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-11-bauer...@linux.ibm.com/

>  static void pseries_panic(char *str)
> diff --git a/arch/powerpc/platforms/pseries/Kconfig 
> b/arch/powerpc/platforms/pseries/Kconfig
> index 9c6b3d860518..b9e8b608de01 100644
> --- a/arch/powerpc/platforms/pseries/Kconfig
> +++ b/arch/powerpc/platforms/pseries/Kconfig
> @@ -23,6 +23,7 @@ config PPC_PSERIES
>   select ARCH_RANDOM
>   select PPC_DOORBELL
>   select FORCE_SMP
> + select SWIOTLB
>   default y
>
>  config PPC_SPLPAR

I put this in a PPC_SVM config option:

https://lore.kernel.org/linuxppc-dev/20180824162535.22798-3-bauer...@linux.ibm.com/

--
Thiago Jung Bauermann
IBM Linux Technology Center



Re: [PATCH kernel 1/2] powerpc/pseries/dma: Allow swiotlb

2019-05-10 Thread Thiago Jung Bauermann


Alexey Kardashevskiy  writes:

> The commit 8617a5c5bc00 ("powerpc/dma: handle iommu bypass in
> dma_iommu_ops") merged direct DMA ops into the IOMMU DMA ops allowing
> SWIOTLB as well but only for mapping; the unmapping and bouncing parts
> were left unmodified.
>
> This adds missing direct unmapping calls to .unmap_page() and .unmap_sg().
>
> This adds missing sync callbacks and directs them to the direct DMA hooks.
>
> Fixes: 8617a5c5bc00 (powerpc/dma: handle iommu bypass in dma_iommu_ops)
> Signed-off-by: Alexey Kardashevskiy 

Nice! Thanks for working on this. I have the patch at the end of this
email to get virtio-scsi-pci and virtio-blk-pci working in a secure
guest.

I applied your patch and reverted my patch and unfortunately the guest
hangs right after mounting the disk:

[0.185659] virtio-pci 0000:00:04.0: enabling device (0100 -> 0102)
[0.187082] virtio-pci 0000:00:04.0: ibm,query-pe-dma-windows(2026) 2000 800 2000 returned 0
[0.187497] virtio-pci 0000:00:04.0: ibm,create-pe-dma-window(2027) 2000 800 2000 10 20 returned 0 (liobn = 0x8001 starting addr = 800 0)
[0.226654] Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
[0.227094] Non-volatile memory driver v1.3
[0.228950] brd: module loaded
[0.230666] loop: module loaded
[0.230773] ipr: IBM Power RAID SCSI Device Driver version: 2.6.4 (March 14, 2017)
[0.233323] scsi host0: Virtio SCSI HBA
[0.235439] scsi 0:0:0:0: Direct-Access QEMU QEMU HARDDISK 2.5+ PQ: 0 ANSI: 5
[0.369009] random: fast init done
[0.370819] sd 0:0:0:0: Attached scsi generic sg0 type 0
[0.371320] sd 0:0:0:0: Power-on or device reset occurred



[0.380378] sd 0:0:0:0: [sda] 31457280 512-byte logical blocks: (16.1 GB/15.0 GiB)
[0.381102] sd 0:0:0:0: [sda] Write Protect is off
[0.381195] sd 0:0:0:0: [sda] Mode Sense: 63 00 00 08
[0.382436] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[0.383630] sd 0:0:0:0: [sda] Optimal transfer size 0 bytes < PAGE_SIZE (65536 bytes)
[0.391562]  sda: sda1 sda2
[0.398101] sd 0:0:0:0: [sda] Attached SCSI disk
[0.398205] md: Waiting for all devices to be available before autodetect
[0.398318] md: If you don't use raid, use raid=noautodetect
[0.398515] md: Autodetecting RAID arrays.
[0.398585] md: autorun ...
[0.398631] md: ... autorun DONE.
[0.403552] EXT4-fs (sda2): mounted filesystem with ordered data mode. Opts: (null)
[0.403700] VFS: Mounted root (ext4 filesystem) readonly on device 8:2.
[0.405258] devtmpfs: mounted
[0.406427] Freeing unused kernel memory: 4224K
[0.406519] This architecture does not have kernel memory protection.
[0.406633] Run /sbin/init as init process

Sorry, I don't have any information on where the guest is stuck. I tried
sysrq+l, sysrq+t and sysrq+w but nothing out of the ordinary showed up.
Will try something else later.

--
Thiago Jung Bauermann
IBM Linux Technology Center



From 70d2fba809119ae2d35c9ca4269405bb5c28413a Mon Sep 17 00:00:00 2001
From: Thiago Jung Bauermann 
Date: Thu, 24 Jan 2019 22:40:16 -0200
Subject: [PATCH 1/1] powerpc/pseries/iommu: Don't use dma_iommu_ops on secure
 guests

Secure guest memory is inaccessible to devices, so regular DMA isn't
possible.

In that case set devices' dma_map_ops to NULL so that the generic
DMA code path will use SWIOTLB and DMA to bounce buffers.

Signed-off-by: Thiago Jung Bauermann 
---
 arch/powerpc/platforms/pseries/iommu.c | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c
index 36eb1ddbac69..1636306007eb 100644
--- a/arch/powerpc/platforms/pseries/iommu.c
+++ b/arch/powerpc/platforms/pseries/iommu.c
@@ -50,6 +50,7 @@
 #include 
 #include 
 #include 
+#include <asm/svm.h>

 #include "pseries.h"

@@ -1335,7 +1336,10 @@ void iommu_init_early_pSeries(void)
	of_reconfig_notifier_register(&iommu_reconfig_nb);
	register_memory_notifier(&iommu_mem_nb);

-	set_pci_dma_ops(&dma_iommu_ops);
+	if (is_secure_guest())
+		set_pci_dma_ops(NULL);
+	else
+		set_pci_dma_ops(&dma_iommu_ops);
 }

 static int __init disable_multitce(char *str)



Re: [PATCH 0/4] Enabling secure boot on PowerNV systems

2019-05-10 Thread Claudio Carvalho
Hi Matthew,

Thanks for the feedback and sorry for the delay in responding.


On 4/10/19 2:36 PM, Matthew Garrett wrote:
> (Cc:ing Peter Jones)
>
> On Tue, Apr 9, 2019 at 3:55 PM Claudio Carvalho  
> wrote:
>>
>> On 4/5/19 7:19 PM, Matthew Garrett wrote:
>>> Based on our experience doing this in UEFI, that's insufficient - you
>>> want to be able to block individual binaries or leaf certificates
>>> without dropping trust in an intermediate certificate entirely.
>>
>> We agree that a dbx would be useful for blacklisting particular kernels
>> signed with a given certificate. However, we have been avoiding doing so for
>> the initial release of secure boot on OpenPOWER. We don't have individual
>> firmware binaries in OpenPOWER. Kernels are currently the only concern for
>> the OS secure boot certificates we're discussing here. Also, we have a very
>> limited keystore space in POWER9.
>>
>> Petitboot doesn't have standardized OS kernel verification at all right
>> now.  Having the capability even without dbx seems valuable.
> I don't see the benefit in attempting to maintain compatibility with
> existing tooling unless you're going to be *completely* compatible
> with existing tooling. That means supporting dbx and dbt.


Before addressing that, I'd like to share some of the current OpenPOWER
secure boot design.
Unlike UEFI, secure boot on OpenPOWER systems has two distinct domains.
Each one has its own key hierarchy and its own signing and signature
verification mechanisms.

In the firmware secure boot domain (work already upstream):
 - Every image loaded up to skiroot is wrapped in a secure boot container.
Skiroot is a Linux zImage with petitboot (a kexec bootloader) embedded in the
initramfs.
 - Within the secure boot container, the payload image is protected by a
chain of signatures anchored in the root ECDSA keys, also known as hardware
keys.
 - All public keys required to verify the container are stored in the
container itself, but a hash of the trusted public hardware keys is stored
in a protected SEEPROM region outside of the container. Firmware uses it to
check if the container is anchored in the trusted hardware keys. If not,
the container payload is not executed and the boot is aborted.
 - The hash of the hardware keys is set by the firmware supplier, for
instance, the platform manufacturer.

In OS secure boot domain (work in progress):
- The skiroot container is verified as part of firmware secure boot.
- Skiroot uses UEFI-like secure variables (PK, KEK and db) to verify OS
kernels. Only X.509 certificates will be supported for these secure variables.
- OS kernels are signed using the Linux kernel sign-file tool, as if they
were kernel modules (see the example below).
- In the skiroot kernel, if secure boot is enabled, the db certificates
will be loaded into the platform keyring and IMA-appraisal will verify the
kexec image against the platform keyring.
- The PK is set by whoever controls the platform, for instance, the
manufacturer or the end customer.
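
As an example of the signing step mentioned above (a hypothetical invocation;
the hash algorithm, key, certificate and image paths are illustrative):

$ scripts/sign-file sha256 certs/signing_key.pem certs/signing_key.x509 vmlinuz

sign-file appends a PKCS#7 signature plus a magic marker to the image, the
same format used for signed kernel modules.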

How about dbx and dbt?

The db keys will be used to verify only OS kernels via kexecs initiated by
petitboot. So we only need the dbx to revoke kernel images, either via
certs or hashes. Currently, the kernel loads certs and hashes from the dbx
to the system blacklist keyring. The revoked certs are checked during pkcs7
signature verification and loading of keys. However, there doesn't appear
to be any verification against blacklisted hashes. Should kernel images be
revoked only by keys and not hashes? We tried to find published revoked
kernel lists but couldn't find any. How is kernel image revocation handled
in practice?

Also, we didn't see the shim or kernel loading anything from dbt.

In general, how do you think the kernel ought to support blacklists?


>
 The API is still a work in progress.  We are planning to publish a document
 describing the current API and overall design shortly.
>>> Ok. How are the attributes interpreted by the API?
>>
>> We support a subset of standard EFI variable attributes, and we only use
>> EFI variables that relate to secure boot. Our goal is not to implement
>> UEFI.  However, we do seek to be compatible with user space tooling and
>> reuse as much existing infrastructure as possible. We don’t support the
>> following: EFI_VARIABLE_HARDWARE_ERROR_RECORD,
>> EFI_VARIABLE_AUTHENTICATED_WRITE_ACCESS and
>> EFI_VARIABLE_ENHANCED_AUTHENTICATED_ACCESS.
> Ok. I think that's realistically fine.
>
 Perhaps the biggest departure is that the secure variables are stored in
 flash memory that is not lockable.  In order to protect the secure
 variables, hashes of the flash regions where they're stored are written to
 TPM NVRAM indices.  The TPM NVRAM indices we use are write locked at
 runtime.  The sysadmin enqueues update commands in flash.  During the next
 boot, the firmware verifies and processes the commands to update the
 certificate store and accompanying integrity hashes in the TPM NVRAM
 indices and 

Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-10 Thread Borislav Petkov
On Fri, May 10, 2019 at 04:13:20PM +0200, Borislav Petkov wrote:
> On Fri, May 10, 2019 at 08:50:52PM +1000, Michael Ellerman wrote:
> > Yeah that looks better to me. I didn't think about the case where EDAC
> > core is modular.
> > 
> > Do you want me to send a new patch?
> 
> Nah, I'll fix it up.

I've pushed it here:

https://git.kernel.org/pub/scm/linux/kernel/git/bp/bp.git/commit/?h=edac-fix-for-5.2

in case you wanna throw your build tests on it. My dingy cross-compiler
can't do much really.

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread christophe leroy




On 10/05/2019 at 18:24, Steven Rostedt wrote:

On Fri, 10 May 2019 10:42:13 +0200
Petr Mladek  wrote:


  static const char *check_pointer_msg(const void *ptr)
  {
-	char byte;
-
 	if (!ptr)
 		return "(null)";
 
-	if (probe_kernel_address(ptr, byte))
+	if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
 		return "(efault)";



< PAGE_SIZE ?

do you mean: < TASK_SIZE ?


I guess not.

Usually, < PAGE_SIZE means a NULL pointer dereference (via the member of a struct)
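
Concretely, a minimal sketch of that case:

	struct foo {
		long a;
		char name[8];
	};
	struct foo *p = NULL;

	/* p->name is at (char *)0 + offsetof(struct foo, name), i.e. address 8:
	 * a NULL dereference via a struct member lands well below PAGE_SIZE,
	 * which is exactly what the (unsigned long)ptr < PAGE_SIZE test catches. */
	printk("%s\n", p->name);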


Christophe




Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Martin Schwidefsky
On Fri, 10 May 2019 12:40:58 -0400
Steven Rostedt  wrote:

> On Fri, 10 May 2019 18:32:58 +0200
> Martin Schwidefsky  wrote:
> 
> > On Fri, 10 May 2019 12:24:01 -0400
> > Steven Rostedt  wrote:
> >   
> > > On Fri, 10 May 2019 10:42:13 +0200
> > > Petr Mladek  wrote:
> > > 
> > > >  static const char *check_pointer_msg(const void *ptr)
> > > >  {
> > > > -   char byte;
> > > > -
> > > > if (!ptr)
> > > > return "(null)";
> > > >  
> > > > -   if (probe_kernel_address(ptr, byte))
> > > > +   if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > > > return "(efault)";
> > > >
> > > 
> > > 
> > >   < PAGE_SIZE ?
> > > 
> > > do you mean: < TASK_SIZE ?
> > 
> > The check with < TASK_SIZE would break on s390. The 'ptr' is
> > in the kernel address space, *not* in the user address space.
> > Remember s390 has two separate address spaces for kernel/user
> > the check < TASK_SIZE only makes sense with a __user pointer.
> >   
> 
> So we allow this to read user addresses? Can't that cause a fault?
> 
> If the condition is true, we return "(efault)".

On x86 this would allow a user space access, as kernel and user live
in the same address space; on s390 it would not.
-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Andy Shevchenko
On Fri, May 10, 2019 at 12:24:01PM -0400, Steven Rostedt wrote:
> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek  wrote:
> 
> >  static const char *check_pointer_msg(const void *ptr)
> >  {
> > -   char byte;
> > -
> > if (!ptr)
> > return "(null)";
> >  
> > -   if (probe_kernel_address(ptr, byte))
> > +   if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > return "(efault)";
> >  
> 
> 
>   < PAGE_SIZE ?
> 
> do you mean: < TASK_SIZE ?

The original code used PAGE_SIZE. If it needs to be changed, that might be a
separate explanation / patch.

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Steven Rostedt
On Fri, 10 May 2019 18:32:58 +0200
Martin Schwidefsky  wrote:

> On Fri, 10 May 2019 12:24:01 -0400
> Steven Rostedt  wrote:
> 
> > On Fri, 10 May 2019 10:42:13 +0200
> > Petr Mladek  wrote:
> >   
> > >  static const char *check_pointer_msg(const void *ptr)
> > >  {
> > > - char byte;
> > > -
> > >   if (!ptr)
> > >   return "(null)";
> > >  
> > > - if (probe_kernel_address(ptr, byte))
> > > + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > >   return "(efault)";
> > >  
> > 
> > 
> > < PAGE_SIZE ?
> > 
> > do you mean: < TASK_SIZE ?  
> 
> The check with < TASK_SIZE would break on s390. The 'ptr' is
> in the kernel address space, *not* in the user address space.
> Remember s390 has two separate address spaces for kernel/user
> the check < TASK_SIZE only makes sense with a __user pointer.
> 

So we allow this to read user addresses? Can't that cause a fault?

If the condition is true, we return "(efault)".

-- Steve


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Martin Schwidefsky
On Fri, 10 May 2019 12:24:01 -0400
Steven Rostedt  wrote:

> On Fri, 10 May 2019 10:42:13 +0200
> Petr Mladek  wrote:
> 
> >  static const char *check_pointer_msg(const void *ptr)
> >  {
> > -   char byte;
> > -
> > if (!ptr)
> > return "(null)";
> >  
> > -   if (probe_kernel_address(ptr, byte))
> > +   if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
> > return "(efault)";
> >
> 
> 
>   < PAGE_SIZE ?
> 
> do you mean: < TASK_SIZE ?

The check with < TASK_SIZE would break on s390. The 'ptr' is
in the kernel address space, *not* in the user address space.
Remember s390 has two separate address spaces for kernel/user
the check < TASK_SIZE only makes sense with a __user pointer.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.



[PATCH 8/8] powerpc/pseries: Add documentation for vcpudispatch_stats

2019-05-10 Thread Naveen N. Rao
Add a document describing the fields provided by
/proc/powerpc/vcpudispatch_stats.

Signed-off-by: Naveen N. Rao 
---
 Documentation/powerpc/vcpudispatch_stats.txt | 68 
 1 file changed, 68 insertions(+)
 create mode 100644 Documentation/powerpc/vcpudispatch_stats.txt

diff --git a/Documentation/powerpc/vcpudispatch_stats.txt b/Documentation/powerpc/vcpudispatch_stats.txt
new file mode 100644
index ..e21476bfd78c
--- /dev/null
+++ b/Documentation/powerpc/vcpudispatch_stats.txt
@@ -0,0 +1,68 @@
+VCPU Dispatch Statistics:
+=========================
+
+For Shared Processor LPARs, the POWER Hypervisor maintains a relatively
+static mapping of the LPAR processors (vcpus) to physical processor
+chips (representing the "home" node) and tries to always dispatch vcpus
+on their associated physical processor chip. However, under certain
+scenarios, vcpus may be dispatched on a different processor chip (away
+from their home node).
+
+/proc/powerpc/vcpudispatch_stats can be used to obtain statistics
+related to the vcpu dispatch behavior. Writing '1' to this file enables
+collecting the statistics, while writing '0' disables the statistics.
+By default, the DTLB log for each vcpu is processed 50 times a second so
+as not to miss any entries. This processing frequency can be changed
+through /proc/powerpc/vcpudispatch_stats_freq.
+
+The statistics themselves are available by reading the procfs file
+/proc/powerpc/vcpudispatch_stats. Each line in the output corresponds to
+a vcpu as represented by the first field, followed by 8 numbers.
+
+The first number corresponds to:
+1. total vcpu dispatches since the beginning of statistics collection
+
+The next 4 numbers represent vcpu dispatch dispersions:
+2. number of times this vcpu was dispatched on the same processor as last
+   time
+3. number of times this vcpu was dispatched on a different processor core
+   as last time, but within the same chip
+4. number of times this vcpu was dispatched on a different chip
+5. number of times this vcpu was dispatched on a different socket/drawer
+   (next numa boundary)
+
+The final 3 numbers represent statistics in relation to the home node of
+the vcpu:
+6. number of times this vcpu was dispatched in its home node (chip)
+7. number of times this vcpu was dispatched in a different node
+8. number of times this vcpu was dispatched in a node further away (numa
+distance)
+
+An example output:
+$ sudo cat /proc/powerpc/vcpudispatch_stats
+cpu0 6839 4126 2683 30 0 6821 18 0
+cpu1 2515 1274 1229 12 0 2509 6 0
+cpu2 2317 1198 1109 10 0 2312 5 0
+cpu3 2259 1165 1088 6 0 2256 3 0
+cpu4 2205 1143 1056 6 0 2202 3 0
+cpu5 2165 1121 1038 6 0 2162 3 0
+cpu6 2183 1127 1050 6 0 2180 3 0
+cpu7 2193 1133 1052 8 0 2187 6 0
+cpu8 2165 1115 1032 18 0 2156 9 0
+cpu9 2301 1252 1033 16 0 2293 8 0
+cpu10 2197 1138 1041 18 0 2187 10 0
+cpu11 2273 1185 1062 26 0 2260 13 0
+cpu12 2186 1125 1043 18 0 2177 9 0
+cpu13 2161 1115 1030 16 0 2153 8 0
+cpu14 2206 1153 1033 20 0 2196 10 0
+cpu15 2163 1115 1032 16 0 2155 8 0
+
+In the output above, for vcpu0, there have been 6839 dispatches since
+statistics were enabled. 4126 of those dispatches were on the same
+physical cpu as the last time. 2683 were on a different core, but within
+the same chip, while 30 dispatches were on a different chip compared to
+its last dispatch.
+
+Also, out of the total of 6839 dispatches, we see that there have been
+6821 dispatches on the vcpu's home node, while 18 dispatches were
+outside its home node, on a neighbouring chip.
-- 
2.21.0
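
A small consumer of this file might look like the following sketch (field
numbering as in the document above; not part of the patch):

	#include <stdio.h>

	int main(void)
	{
		FILE *f = fopen("/proc/powerpc/vcpudispatch_stats", "r");
		char cpu[16];
		unsigned long n[8];

		if (!f)
			return 1;
		/* one line per vcpu: name followed by the 8 counters */
		while (fscanf(f, "%15s %lu %lu %lu %lu %lu %lu %lu %lu", cpu,
			      &n[0], &n[1], &n[2], &n[3],
			      &n[4], &n[5], &n[6], &n[7]) == 9)
			printf("%s: %.1f%% dispatches on home node\n", cpu,
			       n[0] ? 100.0 * n[5] / n[0] : 0.0);
		fclose(f);
		return 0;
	}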



[PATCH 4/8] powerpc/pseries: Generalize hcall_vphn()

2019-05-10 Thread Naveen N. Rao
The H_HOME_NODE_ASSOCIATIVITY hcall can take two different flags and return
different associativity information in each case. Generalize the
existing hcall_vphn() function to take flags as an argument and to
return the result. Update the only existing user to pass the proper
arguments.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/mm/book3s64/vphn.h |  8 
 arch/powerpc/mm/numa.c  | 27 +--
 2 files changed, 21 insertions(+), 14 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/vphn.h b/arch/powerpc/mm/book3s64/vphn.h
index f0b93c2dd578..f7ff1e0c3801 100644
--- a/arch/powerpc/mm/book3s64/vphn.h
+++ b/arch/powerpc/mm/book3s64/vphn.h
@@ -11,6 +11,14 @@
  */
 #define VPHN_ASSOC_BUFSIZE (VPHN_REGISTER_COUNT*sizeof(u64)/sizeof(u16) + 1)
 
+/*
+ * The H_HOME_NODE_ASSOCIATIVITY hcall takes two values for flags:
+ * 1 for retrieving associativity information for a guest cpu
+ * 2 for retrieving associativity information for a host/hypervisor cpu
+ */
+#define VPHN_FLAG_VCPU 1
+#define VPHN_FLAG_PCPU 2
+
 extern int vphn_unpack_associativity(const long *packed, __be32 *unpacked);
 
 #endif
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 57e64273cb33..57f006b6214b 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1087,6 +1087,17 @@ static void reset_topology_timer(void);
 static int topology_timer_secs = 1;
 static int topology_inited;
 
+static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
+{
+   long rc;
+   long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
+
+   rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, cpu);
+   vphn_unpack_associativity(retbuf, associativity);
+
+   return rc;
+}
+
 /*
  * Change polling interval for associativity changes.
  */
@@ -1165,25 +1176,13 @@ static int update_cpu_associativity_changes_mask(void)
  * Retrieve the new associativity information for a virtual processor's
  * home node.
  */
-static long hcall_vphn(unsigned long cpu, __be32 *associativity)
-{
-   long rc;
-   long retbuf[PLPAR_HCALL9_BUFSIZE] = {0};
-   u64 flags = 1;
-   int hwcpu = get_hard_smp_processor_id(cpu);
-
-   rc = plpar_hcall9(H_HOME_NODE_ASSOCIATIVITY, retbuf, flags, hwcpu);
-   vphn_unpack_associativity(retbuf, associativity);
-
-   return rc;
-}
-
 static long vphn_get_associativity(unsigned long cpu,
__be32 *associativity)
 {
long rc;
 
-   rc = hcall_vphn(cpu, associativity);
+   rc = hcall_vphn(get_hard_smp_processor_id(cpu),
+   VPHN_FLAG_VCPU, associativity);
 
switch (rc) {
case H_FUNCTION:
-- 
2.21.0
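
With this change, retrieving the associativity of a physical (hypervisor) cpu
becomes possible as well; a usage sketch anticipating patch 6/8:

	__be32 assoc[VPHN_ASSOC_BUFSIZE];
	long rc;

	/* flags = 2: associativity of a host/hypervisor cpu */
	rc = hcall_vphn(pcpu, VPHN_FLAG_PCPU, &assoc[0]);
	if (rc != H_SUCCESS)
		pr_err("vphn hcall failed: %ld\n", rc);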



[PATCH 6/8] powerpc/pseries: Provide vcpu dispatch statistics

2019-05-10 Thread Naveen N. Rao
For Shared Processor LPARs, the POWER Hypervisor maintains a relatively
static mapping of the LPAR processors (vcpus) to physical processor
chips (representing the "home" node) and tries to always dispatch vcpus
on their associated physical processor chip. However, under certain
scenarios, vcpus may be dispatched on a different processor chip (away
from their home node). The actual physical processor number on which a
certain vcpu is dispatched is available to the guest in the
'processor_id' field of each DTL entry.

The guest can discover the home node of each vcpu through the
H_HOME_NODE_ASSOCIATIVITY(flags=1) hcall. The guest can also discover
the associativity of physical processors, as represented in the DTL
entry, through the H_HOME_NODE_ASSOCIATIVITY(flags=2) hcall.

These can then be compared to determine if the vcpu was dispatched on
its home node or not. If the vcpu was not dispatched on the home node,
it is possible to determine if the vcpu was dispatched in a different
chip, socket or drawer.

Introduce a procfs file /proc/powerpc/vcpudispatch_stats that can be
used to obtain these statistics. Writing '1' to this file enables
collecting the statistics, while writing '0' disables the statistics.
The statistics themselves are available by reading the procfs file. By
default, the DTL log for each vcpu is processed 50 times a second so as
not to miss any entries. This processing frequency can be changed
through /proc/powerpc/vcpudispatch_stats_freq.
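
A typical session, as a hypothetical transcript using the interfaces
introduced here:

$ echo 1 | sudo tee /proc/powerpc/vcpudispatch_stats    # start collecting
$ sleep 10                                              # let the workload run
$ sudo cat /proc/powerpc/vcpudispatch_stats             # read the counters
$ echo 0 | sudo tee /proc/powerpc/vcpudispatch_stats    # stop collecting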

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/topology.h   |   4 +
 arch/powerpc/mm/numa.c| 107 +++
 arch/powerpc/platforms/pseries/lpar.c | 441 +-
 3 files changed, 550 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/topology.h b/arch/powerpc/include/asm/topology.h
index f85e2b01c3df..7c064731a0f2 100644
--- a/arch/powerpc/include/asm/topology.h
+++ b/arch/powerpc/include/asm/topology.h
@@ -93,6 +93,10 @@ extern int prrn_is_enabled(void);
 extern int find_and_online_cpu_nid(int cpu);
 extern int timed_topology_update(int nsecs);
 extern void __init shared_proc_topology_init(void);
+extern int init_cpu_associativity(void);
+extern void destroy_cpu_associativity(void);
+extern int cpu_relative_dispatch_distance(int last_disp_cpu, int cur_disp_cpu);
+extern int cpu_home_node_dispatch_distance(int disp_cpu);
 #else
 static inline int start_topology_update(void)
 {
diff --git a/arch/powerpc/mm/numa.c b/arch/powerpc/mm/numa.c
index 57f006b6214b..c0828d5e12e0 100644
--- a/arch/powerpc/mm/numa.c
+++ b/arch/powerpc/mm/numa.c
@@ -1086,6 +1086,16 @@ static int prrn_enabled;
 static void reset_topology_timer(void);
 static int topology_timer_secs = 1;
 static int topology_inited;
+static __be32 *vcpu_associativity, *pcpu_associativity;
+
+/*
+ * This represents the number of cpus in the hypervisor. Since there is no
+ * architected way to discover the number of processors in the host, we
+ * provision for dealing with NR_CPUS. This is currently 2048 by default, and
+ * is sufficient for our purposes. This will need to be tweaked if
+ * CONFIG_NR_CPUS is changed.
+ */
+#define NR_CPUS_H  NR_CPUS
 
 static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
 {
@@ -1098,6 +1108,103 @@ static long hcall_vphn(unsigned long cpu, u64 flags, __be32 *associativity)
return rc;
 }
 
+int init_cpu_associativity(void)
+{
+   vcpu_associativity = kcalloc(num_possible_cpus() / threads_per_core,
+   VPHN_ASSOC_BUFSIZE * sizeof(__be32), GFP_KERNEL);
+   pcpu_associativity = kcalloc(NR_CPUS_H / threads_per_core,
+   VPHN_ASSOC_BUFSIZE * sizeof(__be32), GFP_KERNEL);
+
+   if (!vcpu_associativity || !pcpu_associativity) {
+   pr_err("error allocating memory for associativity 
information\n");
+   return -ENOMEM;
+   }
+
+   return 0;
+}
+
+void destroy_cpu_associativity(void)
+{
+   kfree(vcpu_associativity);
+   kfree(pcpu_associativity);
+   vcpu_associativity = pcpu_associativity = 0;
+}
+
+static __be32 *__get_cpu_associativity(int cpu, __be32 *cpu_assoc, int flag)
+{
+   __be32 *assoc;
+   int rc = 0;
+
+	assoc = &cpu_assoc[(int)(cpu / threads_per_core) * VPHN_ASSOC_BUFSIZE];
+   if (!assoc[0]) {
+		rc = hcall_vphn(cpu, flag, &assoc[0]);
+   if (rc)
+   return NULL;
+   }
+
+   return assoc;
+}
+
+static __be32 *get_pcpu_associativity(int cpu)
+{
+   return __get_cpu_associativity(cpu, pcpu_associativity, VPHN_FLAG_PCPU);
+}
+
+static __be32 *get_vcpu_associativity(int cpu)
+{
+   return __get_cpu_associativity(cpu, vcpu_associativity, VPHN_FLAG_VCPU);
+}
+
+static int calc_dispatch_distance(__be32 *cpu1_assoc, __be32 *cpu2_assoc)
+{
+   int i, index, dist;
+
+   for (i = 0, dist = 0; i < distance_ref_points_depth; i++) {
+   index = be32_to_cpu(distance_ref_points[i]);
+   if 

[PATCH 5/8] powerpc/pseries: Introduce helpers to gatekeep DTLB usage

2019-05-10 Thread Naveen N. Rao
Since we would be introducing a new user of the DTL buffer in a
subsequent patch, add helpers to gatekeep use of the DTL buffer. The
current usage of the DTL buffer from debugfs is at a per-cpu level
(corresponding to the cpu debugfs file that is opened). Subsequently, we
will have users enabling/accessing the DTL buffers of all online cpus. These
helpers allow any number of per-cpu users, or a single global user
exclusively.
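
In other words, the intended semantics, sketched against the helpers added
below (comments describe the expected outcome):

	register_dtl_buffer_access(false);   /* per-cpu user: ok */
	register_dtl_buffer_access(false);   /* another per-cpu user: ok */
	register_dtl_buffer_access(true);    /* global user: fails while any
						per-cpu user is registered */
	unregister_dtl_buffer_access(false);
	unregister_dtl_buffer_access(false);
	register_dtl_buffer_access(true);    /* global user: now ok */
	register_dtl_buffer_access(false);   /* per-cpu user: fails while the
						global user holds access */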

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/plpar_wrappers.h |  2 ++
 arch/powerpc/platforms/pseries/dtl.c  | 10 ++-
 arch/powerpc/platforms/pseries/lpar.c | 36 +++
 3 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index d08feb1bc2bd..ab7dd454b6eb 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -88,6 +88,8 @@ static inline long register_dtl(unsigned long cpu, unsigned long vpa)
return vpa_call(H_VPA_REG_DTL, cpu, vpa);
 }
 
+extern bool register_dtl_buffer_access(bool global);
+extern void unregister_dtl_buffer_access(bool global);
 extern void register_dtl_buffer(int cpu);
 extern void alloc_dtl_buffers(void);
 extern void vpa_init(int cpu);
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index fb05804adb2f..dd28296c9903 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -193,11 +193,15 @@ static int dtl_enable(struct dtl *dtl)
if (dtl->buf)
return -EBUSY;
 
+   if (register_dtl_buffer_access(false))
+   return -EBUSY;
+
n_entries = dtl_buf_entries;
buf = kmem_cache_alloc_node(dtl_cache, GFP_KERNEL, 
cpu_to_node(dtl->cpu));
if (!buf) {
printk(KERN_WARNING "%s: buffer alloc failed for cpu %d\n",
__func__, dtl->cpu);
+   unregister_dtl_buffer_access(false);
return -ENOMEM;
}
 
@@ -214,8 +218,11 @@ static int dtl_enable(struct dtl *dtl)
}
	spin_unlock(&dtl->lock);
 
-   if (rc)
+   if (rc) {
+   unregister_dtl_buffer_access(false);
kmem_cache_free(dtl_cache, buf);
+   }
+
return rc;
 }
 
@@ -227,6 +234,7 @@ static void dtl_disable(struct dtl *dtl)
dtl->buf = NULL;
dtl->buf_entries = 0;
	spin_unlock(&dtl->lock);
+   unregister_dtl_buffer_access(false);
 }
 
 /* file interface */
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 3375ca8cefb5..6af5a2a11deb 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -65,6 +65,42 @@ EXPORT_SYMBOL(plpar_hcall);
 EXPORT_SYMBOL(plpar_hcall9);
 EXPORT_SYMBOL(plpar_hcall_norets);
 
+static DEFINE_SPINLOCK(dtl_buffer_refctr_lock);
+static unsigned int dtl_buffer_global_refctr, dtl_buffer_percpu_refctr;
+
+bool register_dtl_buffer_access(bool global)
+{
+   int rc = 0;
+
+	spin_lock(&dtl_buffer_refctr_lock);
+
+   if ((global && (dtl_buffer_global_refctr || dtl_buffer_percpu_refctr))
+   || (!global && dtl_buffer_global_refctr)) {
+   rc = -1;
+   } else {
+   if (global)
+   dtl_buffer_global_refctr++;
+   else
+   dtl_buffer_percpu_refctr++;
+   }
+
+	spin_unlock(&dtl_buffer_refctr_lock);
+
+   return rc;
+}
+
+void unregister_dtl_buffer_access(bool global)
+{
+	spin_lock(&dtl_buffer_refctr_lock);
+
+   if (global)
+   dtl_buffer_global_refctr--;
+   else
+   dtl_buffer_percpu_refctr--;
+
+	spin_unlock(&dtl_buffer_refctr_lock);
+}
+
 void alloc_dtl_buffers(void)
 {
int cpu;
-- 
2.21.0



[PATCH 7/8] powerpc/pseries: Protect against hogging the cpu while setting up the stats

2019-05-10 Thread Naveen N. Rao
When enabling or disabling the vcpu dispatch statistics, we do a lot of
work including allocating/deallocating memory across all possible cpus
for the DTL buffer. In order to guard against hogging the cpu for too
long, track the time we're taking and yield the processor if necessary.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/plpar_wrappers.h |  2 +-
 arch/powerpc/platforms/pseries/lpar.c | 29 ---
 arch/powerpc/platforms/pseries/setup.c|  2 +-
 3 files changed, 22 insertions(+), 11 deletions(-)

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index ab7dd454b6eb..d01bf0036e4d 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -91,7 +91,7 @@ static inline long register_dtl(unsigned long cpu, unsigned long vpa)
 extern bool register_dtl_buffer_access(bool global);
 extern void unregister_dtl_buffer_access(bool global);
 extern void register_dtl_buffer(int cpu);
-extern void alloc_dtl_buffers(void);
+extern void alloc_dtl_buffers(unsigned long *time_limit);
 extern void vpa_init(int cpu);
 
 static inline long plpar_pte_enter(unsigned long flags,
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index b808a70cc253..8bc1b950cfd0 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -257,7 +257,7 @@ static void process_dtl_buffer(struct work_struct *work)
HZ / vcpudispatch_stats_freq);
 }
 
-void alloc_dtl_buffers(void)
+void alloc_dtl_buffers(unsigned long *time_limit)
 {
int cpu;
struct paca_struct *pp;
@@ -281,10 +281,15 @@ void alloc_dtl_buffers(void)
pp->dispatch_log = dtl;
pp->dispatch_log_end = dtl + N_DISPATCH_LOG;
pp->dtl_curr = dtl;
+
+   if (time_limit && time_after(jiffies, *time_limit)) {
+   cond_resched();
+   *time_limit = jiffies + HZ;
+   }
}
 }
 
-static void free_dtl_buffers(void)
+static void free_dtl_buffers(unsigned long *time_limit)
 {
 #ifndef CONFIG_VIRT_CPU_ACCOUNTING_NATIVE
int cpu;
@@ -299,6 +304,11 @@ static void free_dtl_buffers(void)
pp->dispatch_log = 0;
pp->dispatch_log_end = 0;
pp->dtl_curr = 0;
+
+   if (time_limit && time_after(jiffies, *time_limit)) {
+   cond_resched();
+   *time_limit = jiffies + HZ;
+   }
}
 #endif
 }
@@ -381,7 +391,7 @@ static void reset_global_dtl_mask(void)
lppaca_of(cpu).dtl_enable_mask = dtl_mask;
 }
 
-static int dtl_worker_enable(void)
+static int dtl_worker_enable(unsigned long *time_limit)
 {
int rc = 0, state;
 
@@ -400,13 +410,13 @@ static int dtl_worker_enable(void)
set_global_dtl_mask(DTL_LOG_ALL);
 
/* Setup dtl buffers and register those */
-   alloc_dtl_buffers();
+   alloc_dtl_buffers(time_limit);
 
state = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "powerpc/dtl:online",
dtl_worker_online, dtl_worker_offline);
if (state < 0) {
pr_err("vcpudispatch_stats: unable to setup workqueue for DTL 
processing\n");
-   free_dtl_buffers();
+   free_dtl_buffers(time_limit);
reset_global_dtl_mask();
unregister_dtl_buffer_access(1);
rc = -EINVAL;
@@ -420,14 +430,14 @@ static int dtl_worker_enable(void)
return rc;
 }
 
-static void dtl_worker_disable(void)
+static void dtl_worker_disable(unsigned long *time_limit)
 {
mutex_lock(_worker_mutex);
dtl_worker_refctr--;
if (!dtl_worker_refctr) {
cpuhp_remove_state(dtl_worker_state);
dtl_worker_state = 0;
-   free_dtl_buffers();
+   free_dtl_buffers(time_limit);
reset_global_dtl_mask();
unregister_dtl_buffer_access(1);
}
@@ -437,6 +447,7 @@ static void dtl_worker_disable(void)
static ssize_t vcpudispatch_stats_write(struct file *file, const char __user *p,
size_t count, loff_t *ppos)
 {
+   unsigned long time_limit = jiffies + HZ;
struct vcpu_dispatch_data *disp;
int rc, cmd, cpu;
char buf[16];
@@ -469,13 +480,13 @@ static ssize_t vcpudispatch_stats_write(struct file *file, const char __user *p,
disp->last_disp_cpu = -1;
}
 
-   rc = dtl_worker_enable();
+	rc = dtl_worker_enable(&time_limit);
if (rc) {
destroy_cpu_associativity();
return rc;
}
} else {
-   dtl_worker_disable();
+	dtl_worker_disable(&time_limit);

[PATCH 2/8] powerpc/pseries: Do not save the previous DTL mask value

2019-05-10 Thread Naveen N. Rao
When CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is enabled, we always initialize
DTL enable mask to DTL_LOG_PREEMPT (0x2). There are no other places
where the mask is changed. As such, when reading the DTL log buffer
through debugfs, there is no need to save and restore the previous mask
value.

We don't need to save and restore the earlier mask value if
CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not enabled. So, remove the field
from the structure as well.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/platforms/pseries/dtl.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index 051ea2de1e1a..fb05804adb2f 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -55,7 +55,6 @@ struct dtl_ring {
struct dtl_entry *write_ptr;
struct dtl_entry *buf;
struct dtl_entry *buf_end;
-   u8  saved_dtl_mask;
 };
 
 static DEFINE_PER_CPU(struct dtl_ring, dtl_rings);
@@ -105,7 +104,6 @@ static int dtl_start(struct dtl *dtl)
dtlr->write_ptr = dtl->buf;
 
/* enable event logging */
-   dtlr->saved_dtl_mask = lppaca_of(dtl->cpu).dtl_enable_mask;
lppaca_of(dtl->cpu).dtl_enable_mask |= dtl_event_mask;
 
dtl_consumer = consume_dtle;
@@ -123,7 +121,7 @@ static void dtl_stop(struct dtl *dtl)
dtlr->buf = NULL;
 
/* restore dtl_enable_mask */
-   lppaca_of(dtl->cpu).dtl_enable_mask = dtlr->saved_dtl_mask;
+   lppaca_of(dtl->cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
 
if (atomic_dec_and_test(_count))
dtl_consumer = NULL;
-- 
2.21.0



[PATCH 3/8] powerpc/pseries: Factor out DTL buffer allocation and registration routines

2019-05-10 Thread Naveen N. Rao
Introduce new helpers for DTL buffer allocation and registration and
have the existing code use those.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/plpar_wrappers.h |  2 +
 arch/powerpc/platforms/pseries/lpar.c | 66 ---
 arch/powerpc/platforms/pseries/setup.c| 34 +---
 3 files changed, 52 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/include/asm/plpar_wrappers.h b/arch/powerpc/include/asm/plpar_wrappers.h
index cff5a411e595..d08feb1bc2bd 100644
--- a/arch/powerpc/include/asm/plpar_wrappers.h
+++ b/arch/powerpc/include/asm/plpar_wrappers.h
@@ -88,6 +88,8 @@ static inline long register_dtl(unsigned long cpu, unsigned 
long vpa)
return vpa_call(H_VPA_REG_DTL, cpu, vpa);
 }
 
+extern void register_dtl_buffer(int cpu);
+extern void alloc_dtl_buffers(void);
 extern void vpa_init(int cpu);
 
 static inline long plpar_pte_enter(unsigned long flags,
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 23f2ac6793b7..3375ca8cefb5 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -65,13 +65,58 @@ EXPORT_SYMBOL(plpar_hcall);
 EXPORT_SYMBOL(plpar_hcall9);
 EXPORT_SYMBOL(plpar_hcall_norets);
 
+void alloc_dtl_buffers(void)
+{
+   int cpu;
+   struct paca_struct *pp;
+   struct dtl_entry *dtl;
+
+   for_each_possible_cpu(cpu) {
+   pp = paca_ptrs[cpu];
+   dtl = kmem_cache_alloc(dtl_cache, GFP_KERNEL);
+   if (!dtl) {
+   pr_warn("Failed to allocate dispatch trace log for cpu 
%d\n",
+   cpu);
+   pr_warn("Stolen time statistics will be unreliable\n");
+   break;
+   }
+
+   pp->dtl_ridx = 0;
+   pp->dispatch_log = dtl;
+   pp->dispatch_log_end = dtl + N_DISPATCH_LOG;
+   pp->dtl_curr = dtl;
+   }
+}
+
+void register_dtl_buffer(int cpu)
+{
+   long ret;
+   struct paca_struct *pp;
+   struct dtl_entry *dtl;
+   int hwcpu = get_hard_smp_processor_id(cpu);
+
+   pp = paca_ptrs[cpu];
+   dtl = pp->dispatch_log;
+   if (dtl) {
+   pp->dtl_ridx = 0;
+   pp->dtl_curr = dtl;
+   lppaca_of(cpu).dtl_idx = 0;
+
+   /* hypervisor reads buffer length from this field */
+   dtl->enqueue_to_dispatch_time = cpu_to_be32(DISPATCH_LOG_BYTES);
+   ret = register_dtl(hwcpu, __pa(dtl));
+   if (ret)
+   pr_err("WARNING: DTL registration of cpu %d (hw %d) "
+  "failed with %ld\n", cpu, hwcpu, ret);
+   lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
+   }
+}
+
 void vpa_init(int cpu)
 {
int hwcpu = get_hard_smp_processor_id(cpu);
unsigned long addr;
long ret;
-   struct paca_struct *pp;
-   struct dtl_entry *dtl;
 
/*
 * The spec says it "may be problematic" if CPU x registers the VPA of
@@ -112,22 +157,7 @@ void vpa_init(int cpu)
/*
 * Register dispatch trace log, if one has been allocated.
 */
-   pp = paca_ptrs[cpu];
-   dtl = pp->dispatch_log;
-   if (dtl) {
-   pp->dtl_ridx = 0;
-   pp->dtl_curr = dtl;
-   lppaca_of(cpu).dtl_idx = 0;
-
-   /* hypervisor reads buffer length from this field */
-   dtl->enqueue_to_dispatch_time = cpu_to_be32(DISPATCH_LOG_BYTES);
-   ret = register_dtl(hwcpu, __pa(dtl));
-   if (ret)
-   pr_err("WARNING: DTL registration of cpu %d (hw %d) "
-  "failed with %ld\n", smp_processor_id(),
-  hwcpu, ret);
-   lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
-   }
+   register_dtl_buffer(cpu);
 }
 
 #ifdef CONFIG_PPC_BOOK3S_64
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index fabaefff8399..b6995e5cc5c9 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -277,46 +277,16 @@ struct kmem_cache *dtl_cache;
  */
 static int alloc_dispatch_logs(void)
 {
-   int cpu, ret;
-   struct paca_struct *pp;
-   struct dtl_entry *dtl;
-
if (!firmware_has_feature(FW_FEATURE_SPLPAR))
return 0;
 
if (!dtl_cache)
return 0;
 
-   for_each_possible_cpu(cpu) {
-   pp = paca_ptrs[cpu];
-   dtl = kmem_cache_alloc(dtl_cache, GFP_KERNEL);
-   if (!dtl) {
-   pr_warn("Failed to allocate dispatch trace log for cpu 
%d\n",
-   cpu);
-   pr_warn("Stolen time statistics will be unreliable\n");
-   break;
-   }
-
-   pp->dtl_ridx = 0;
-

[PATCH 1/8] powerpc/pseries: Use macros for referring to the DTL enable mask

2019-05-10 Thread Naveen N. Rao
Introduce macros to encode the DTL enable mask fields and use those
instead of hardcoding numbers.

Signed-off-by: Naveen N. Rao 
---
 arch/powerpc/include/asm/lppaca.h  | 11 +++
 arch/powerpc/platforms/pseries/dtl.c   |  8 +---
 arch/powerpc/platforms/pseries/lpar.c  |  2 +-
 arch/powerpc/platforms/pseries/setup.c |  2 +-
 4 files changed, 14 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/lppaca.h b/arch/powerpc/include/asm/lppaca.h
index 7c23ce8a5a4c..2c7e31187726 100644
--- a/arch/powerpc/include/asm/lppaca.h
+++ b/arch/powerpc/include/asm/lppaca.h
@@ -154,6 +154,17 @@ struct dtl_entry {
 #define DISPATCH_LOG_BYTES 4096/* bytes per cpu */
 #define N_DISPATCH_LOG (DISPATCH_LOG_BYTES / sizeof(struct dtl_entry))
 
+/*
+ * Dispatch trace log event enable mask:
+ *   0x1: voluntary virtual processor waits
+ *   0x2: time-slice preempts
+ *   0x4: virtual partition memory page faults
+ */
+#define DTL_LOG_CEDE   0x1
+#define DTL_LOG_PREEMPT0x2
+#define DTL_LOG_FAULT  0x4
+#define DTL_LOG_ALL(DTL_LOG_CEDE | DTL_LOG_PREEMPT | DTL_LOG_FAULT)
+
 extern struct kmem_cache *dtl_cache;
 
 /*
diff --git a/arch/powerpc/platforms/pseries/dtl.c b/arch/powerpc/platforms/pseries/dtl.c
index ef6595153642..051ea2de1e1a 100644
--- a/arch/powerpc/platforms/pseries/dtl.c
+++ b/arch/powerpc/platforms/pseries/dtl.c
@@ -40,13 +40,7 @@ struct dtl {
 };
 static DEFINE_PER_CPU(struct dtl, cpu_dtl);
 
-/*
- * Dispatch trace log event mask:
- * 0x7: 0x1: voluntary virtual processor waits
- *  0x2: time-slice preempts
- *  0x4: virtual partition memory page faults
- */
-static u8 dtl_event_mask = 0x7;
+static u8 dtl_event_mask = DTL_LOG_ALL;
 
 
 /*
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index 1034ef1fe2b4..23f2ac6793b7 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -126,7 +126,7 @@ void vpa_init(int cpu)
pr_err("WARNING: DTL registration of cpu %d (hw %d) "
   "failed with %ld\n", smp_processor_id(),
   hwcpu, ret);
-   lppaca_of(cpu).dtl_enable_mask = 2;
+   lppaca_of(cpu).dtl_enable_mask = DTL_LOG_PREEMPT;
}
 }
 
diff --git a/arch/powerpc/platforms/pseries/setup.c b/arch/powerpc/platforms/pseries/setup.c
index e4f0dfd4ae33..fabaefff8399 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -316,7 +316,7 @@ static int alloc_dispatch_logs(void)
pr_err("WARNING: DTL registration of cpu %d (hw %d) failed "
   "with %d\n", smp_processor_id(),
   hard_smp_processor_id(), ret);
-   get_paca()->lppaca_ptr->dtl_enable_mask = 2;
+   get_paca()->lppaca_ptr->dtl_enable_mask = DTL_LOG_PREEMPT;
 
return 0;
 }
-- 
2.21.0



[PATCH 0/8] Provide vcpu dispatch statistics

2019-05-10 Thread Naveen N. Rao
This series adds a new procfs file /proc/powerpc/vcpudispatch_stats for 
providing statistics around how the LPAR processors are dispatched by 
the POWER Hypervisor, in a shared LPAR environment. Patch 6/8 has more 
details on how the statistics are gathered.

An example output:
$ sudo cat /proc/powerpc/vcpudispatch_stats
cpu0 6839 4126 2683 30 0 6821 18 0
cpu1 2515 1274 1229 12 0 2509 6 0
cpu2 2317 1198 1109 10 0 2312 5 0
cpu3 2259 1165 1088 6 0 2256 3 0
cpu4 2205 1143 1056 6 0 2202 3 0
cpu5 2165 1121 1038 6 0 2162 3 0
cpu6 2183 1127 1050 6 0 2180 3 0
cpu7 2193 1133 1052 8 0 2187 6 0
cpu8 2165 1115 1032 18 0 2156 9 0
cpu9 2301 1252 1033 16 0 2293 8 0
cpu10 2197 1138 1041 18 0 2187 10 0
cpu11 2273 1185 1062 26 0 2260 13 0
cpu12 2186 1125 1043 18 0 2177 9 0
cpu13 2161 1115 1030 16 0 2153 8 0
cpu14 2206 1153 1033 20 0 2196 10 0
cpu15 2163 1115 1032 16 0 2155 8 0

In the output above, for vcpu0, there have been 6839 dispatches since
statistics were enabled. 4126 of those dispatches were on the same
physical cpu as the last time. 2683 were on a different core, but within
the same chip, while 30 dispatches were on a different chip compared to
its last dispatch.

Also, out of the total of 6839 dispatches, we see that there have been
6821 dispatches on the vcpu's home node, while 18 dispatches were
outside its home node, on a neighbouring chip.

Changes since RFC:
- Patches 1/8 to 5/8: no changes, except rebase to powerpc/merge
- Patch 6/8: The mutex guarding the vphn hcall has been dropped. It was 
  only meant to serialize hcalls issued when stats are initially 
  enabled.  However, in reality, the various per-cpu workers will be 
  scheduled at slightly different times, and the chances of hcalls
  retrieving the same associativity information at the same time are very
  low. Even in that case, there are no other side effects.
- Patch 6/8: The third column for vcpu dispatches on the same core, but 
  different thread has been dropped and merged with the second column.  
- Patch 7/8: new patch to ensure we don't take too much time while 
  enabling/disabling statistics on large systems with heavy workload.
- Patch 8/8: new patch adding a document describing the fields in the 
  procfs file.


- Naveen

Naveen N. Rao (8):
  powerpc/pseries: Use macros for referring to the DTL enable mask
  powerpc/pseries: Do not save the previous DTL mask value
  powerpc/pseries: Factor out DTL buffer allocation and registration
routines
  powerpc/pseries: Generalize hcall_vphn()
  powerpc/pseries: Introduce helpers to gatekeep DTLB usage
  powerpc/pseries: Provide vcpu dispatch statistics
  powerpc/pseries: Protect against hogging the cpu while setting up the
stats
  powerpc/pseries: Add documentation for vcpudispatch_stats

 Documentation/powerpc/vcpudispatch_stats.txt |  68 +++
 arch/powerpc/include/asm/lppaca.h|  11 +
 arch/powerpc/include/asm/plpar_wrappers.h|   4 +
 arch/powerpc/include/asm/topology.h  |   4 +
 arch/powerpc/mm/book3s64/vphn.h  |   8 +
 arch/powerpc/mm/numa.c   | 134 -
 arch/powerpc/platforms/pseries/dtl.c |  22 +-
 arch/powerpc/platforms/pseries/lpar.c| 550 ++-
 arch/powerpc/platforms/pseries/setup.c   |  34 +-
 9 files changed, 760 insertions(+), 75 deletions(-)
 create mode 100644 Documentation/powerpc/vcpudispatch_stats.txt

-- 
2.21.0



Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Steven Rostedt
On Fri, 10 May 2019 10:42:13 +0200
Petr Mladek  wrote:

>  static const char *check_pointer_msg(const void *ptr)
>  {
> - char byte;
> -
>   if (!ptr)
>   return "(null)";
>  
> - if (probe_kernel_address(ptr, byte))
> + if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>   return "(efault)";
>  


< PAGE_SIZE ?

do you mean: < TASK_SIZE ?

-- Steve


[PATCH v11 5/7] powerpc: define syscall_get_error()

2019-05-10 Thread Dmitry V. Levin
syscall_get_error() is required to be implemented on this
architecture in addition to already implemented syscall_get_nr(),
syscall_get_arguments(), syscall_get_return_value(), and
syscall_get_arch() functions in order to extend the generic
ptrace API with PTRACE_GET_SYSCALL_INFO request.

Acked-by: Michael Ellerman 
Cc: Elvira Khabirova 
Cc: Eugene Syromyatnikov 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Oleg Nesterov 
Cc: Andy Lutomirski 
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Dmitry V. Levin 
---

Notes:
v11: added Acked-by from 
https://lore.kernel.org/lkml/87woj3wwmf@concordia.ellerman.id.au/
v10: unchanged
v9: unchanged
v8: unchanged
v7: unchanged
v6: unchanged
v5: initial revision

This change has been tested with
tools/testing/selftests/ptrace/get_syscall_info.c and strace,
so it's correct from PTRACE_GET_SYSCALL_INFO point of view.

This casts doubt on commit v4.3-rc1~86^2~81 that changed
syscall_set_return_value() in a way that doesn't quite match
syscall_get_error(), but syscall_set_return_value() is out
of scope of this series, so I'll just let you know my concerns.

See also 
https://lore.kernel.org/lkml/874lbbt3k6@concordia.ellerman.id.au/
and https://lore.kernel.org/lkml/87woj3wwmf@concordia.ellerman.id.au/
for more details on powerpc syscall_set_return_value() confusion.

 arch/powerpc/include/asm/syscall.h | 10 ++
 1 file changed, 10 insertions(+)

diff --git a/arch/powerpc/include/asm/syscall.h 
b/arch/powerpc/include/asm/syscall.h
index a048fed0722f..bd9663137d57 100644
--- a/arch/powerpc/include/asm/syscall.h
+++ b/arch/powerpc/include/asm/syscall.h
@@ -38,6 +38,16 @@ static inline void syscall_rollback(struct task_struct *task,
regs->gpr[3] = regs->orig_gpr3;
 }
 
+static inline long syscall_get_error(struct task_struct *task,
+struct pt_regs *regs)
+{
+   /*
+* If the system call failed,
+* regs->gpr[3] contains a positive ERRORCODE.
+*/
+   return (regs->ccr & 0x10000000UL) ? -regs->gpr[3] : 0;
+}
+
 static inline long syscall_get_return_value(struct task_struct *task,
struct pt_regs *regs)
 {
-- 
ldv


[PATCH v11 0/7] ptrace: add PTRACE_GET_SYSCALL_INFO request

2019-05-10 Thread Dmitry V. Levin
[Andrew, could you take this patchset into your tree, please?

Besides the patch for hexagon, all patches in this series have
Acked-by or Reviewed-by tags already.

I have been waiting and pinging the hexagon maintainer since November
without any visible effect.  The last Acked-by from the hexagon maintainer
in linux.git was in October and the last Signed-off-by was in July.  Since
that time not a single change affecting hexagon was able to attract
attention of the hexagon maintainer, so I don't think it's worth waiting
any longer.]

PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain
details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when the ptracer
cannot retrieve the necessary information about syscalls.  Some examples
include:
* The notorious int-0x80-from-64-bit-task issue.  See [1] for details.
In short, if a 64-bit task performs a syscall through int 0x80, its tracer
has no reliable means to find out that the syscall was, in fact,
a compat syscall, and misidentifies it.
* Syscall-enter-stop and syscall-exit-stop look the same for the tracer.
Common practice is to keep track of the sequence of ptrace-stops in order
not to mix the two syscall-stops up.  But it is not as simple as it looks;
for example, strace had a (just recently fixed) long-standing bug where
attaching strace to a tracee that is performing the execve system call
led to the tracer identifying the following syscall-exit-stop as
syscall-enter-stop, which messed up all the state tracking.
* Since the introduction of commit 84d77d3f06e7e8dea057d10e8ec77ad71f721be3
("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA
and process_vm_readv become unavailable when the process dumpable flag
is cleared.  On such architectures as ia64 this results in all syscall
arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for
obtaining information about the tracee.  For some architectures, this
requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall
argument and return value.

PTRACE_GET_SYSCALL_INFO returns the following structure:

struct ptrace_syscall_info {
__u8 op;   /* PTRACE_SYSCALL_INFO_* */
__u32 arch __attribute__((__aligned__(sizeof(__u32))));
__u64 instruction_pointer;
__u64 stack_pointer;
union {
struct {
__u64 nr;
__u64 args[6];
} entry;
struct {
__s64 rval;
__u8 is_error;
} exit;
struct {
__u64 nr;
__u64 args[6];
__u32 ret_data;
} seccomp;
};
};

The structure was chosen according to [2], except for the following
changes:
* seccomp substructure was added as a superset of entry substructure;
* the type of nr field was changed from int to __u64 because syscall
numbers are, as a practical matter, 64 bits;
* stack_pointer field was added along with instruction_pointer field
since it is readily available and can save the tracer from extra
PTRACE_GETREGS/PTRACE_GETREGSET calls;
* arch is always initialized to aid with tracing system calls
such as execve();
* instruction_pointer and stack_pointer are always initialized
so they could be easily obtained for non-syscall stops;
* a boolean is_error field was added along with rval field, this way
the tracer can more reliably distinguish a return value
from an error value.

strace has been ported to PTRACE_GET_SYSCALL_INFO.
Starting with release 4.26, strace uses PTRACE_GET_SYSCALL_INFO API
as the preferred mechanism of obtaining syscall information.
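
A minimal sketch of a tracer using the request (assuming headers that
expose PTRACE_GET_SYSCALL_INFO and struct ptrace_syscall_info, and a
tracee already stopped in a syscall stop; error handling omitted):

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/ptrace.h>

	static void show_syscall_stop(pid_t pid)
	{
		struct ptrace_syscall_info info;

		/* addr carries the buffer size; the kernel fills the
		 * buffer and returns the number of bytes written. */
		if (ptrace(PTRACE_GET_SYSCALL_INFO, pid,
			   (void *)sizeof(info), &info) < 0)
			return;

		if (info.op == PTRACE_SYSCALL_INFO_ENTRY)
			printf("entry: nr=%llu\n",
			       (unsigned long long)info.entry.nr);
		else if (info.op == PTRACE_SYSCALL_INFO_EXIT)
			printf("exit: rval=%lld is_error=%u\n",
			       (long long)info.exit.rval,
			       (unsigned)info.exit.is_error);
	}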

[1] 
https://lore.kernel.org/lkml/ca+55afzcsvmddj9lh_gdbz1ozhyem6zrgpbdajnywm2lf_e...@mail.gmail.com/
[2] 
https://lore.kernel.org/lkml/caobl_7gm0n80n7j_dfw_eqyflyzq+sf4y2avsccv88tb3aw...@mail.gmail.com/

---

Notes:
v11:
* Added more Acked-by.
* Rebased back to mainline as the prerequisite syscall_get_arch patchset
  has already been merged via audit tree.

v10:
* Added more Acked-by.

v9:
* Rebased to linux-next again due to syscall_get_arguments() signature 
change.

v8:
* Moved syscall_get_arch() specific patches to a separate patchset
  which is now merged into audit/next tree.
* Rebased to linux-next.
* Moved ptrace_get_syscall_info code under #ifdef 
CONFIG_HAVE_ARCH_TRACEHOOK,
  narrowing down the set of architectures supported by this implementation
  back to those 19 that enable CONFIG_HAVE_ARCH_TRACEHOOK because
  I failed to get all syscall_get_*(), instruction_pointer(),
  and user_stack_pointer() functions implemented on some niche
  architectures.  This leaves the following architectures out:
  alpha, h8300, m68k, microblaze, 

Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Petr Mladek
On Fri 2019-05-10 10:42:13, Petr Mladek wrote:
> The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
> invalid pointers") broke boot on several architectures. The common
> pattern is that probe_kernel_read() is not working during early
> boot because the userspace access framework is not ready.
> 
> It is a generic problem. We have to avoid any complex external
> functions in vsprintf() code, especially in the common path.
> They might break printk() easily and are hard to debug.
> 
> Replace probe_kernel_read() with some simple checks for obvious
> problems.

JFYI, I have sent a pull request with this patch, see
https://lkml.kernel.org/r/20190510144718.riyy72g4cy5nk...@pathway.suse.cz

Best Regards,
Petr


Re: [PATCH 03/16] lib,treewide: add new match_string() helper/macro

2019-05-10 Thread andriy.shevche...@linux.intel.com
On Fri, May 10, 2019 at 09:15:27AM +, Ardelean, Alexandru wrote:
> On Wed, 2019-05-08 at 16:22 +0300, Alexandru Ardelean wrote:
> > On Wed, 2019-05-08 at 15:18 +0200, Greg KH wrote:
> > > On Wed, May 08, 2019 at 04:11:28PM +0300, Andy Shevchenko wrote:
> > > > On Wed, May 08, 2019 at 02:28:29PM +0300, Alexandru Ardelean wrote:

> > > > Can you split include/linux/ change from the rest?
> > > 
> > > That would break the build, why do you want it split out?  This makes
> > > sense all as a single patch to me.
> > > 
> > 
> > Not really.
> > It would just be the new match_string() helper/macro in a new commit.
> > And the conversions of the simple users of match_string() (the ones using
> > ARRAY_SIZE()) in another commit.
> > 
> 
> I should have asked in my previous reply.
> Leave this as-is or re-formulate in 2 patches ?

Depends on what you would like to spend your time on: collecting Acks for
all pieces in a treewide patch, or sending the new API first followed by
per-driver / module updates in the next cycle.

I also have no strong preference.
And I think it's good to add Heikki Krogerus to the Cc list for both patch
series, since he is the author of the sysfs variant and may have something
to comment on the rest.

-- 
With Best Regards,
Andy Shevchenko




[PATCH] powerpc/imc: Add documentation for IMC and trace-mode

2019-05-10 Thread Anju T Sudhakar
Documentation for IMC (In-Memory Collection Counters) infrastructure
and trace-mode of IMC.

Signed-off-by: Anju T Sudhakar 
---
 Documentation/powerpc/imc.txt | 195 ++
 1 file changed, 195 insertions(+)
 create mode 100644 Documentation/powerpc/imc.txt

diff --git a/Documentation/powerpc/imc.txt b/Documentation/powerpc/imc.txt
new file mode 100644
index ..9c32e059f3be
--- /dev/null
+++ b/Documentation/powerpc/imc.txt
@@ -0,0 +1,195 @@
+   ===================================
+   IMC (In-Memory Collection Counters)
+   ===================================
+   Date created: 10 May 2019
+
+Table of Contents:
+------------------
+   - Basic overview
+   - IMC example Usage
+   - IMC Trace Mode
+   - LDBAR Register Layout
+   - TRACE_IMC_SCOM bit representation
+   - Trace IMC example usage
+   - Benefits of using IMC trace-mode
+
+
+Basic overview
+==============
+
+IMC (In-Memory Collection Counters) is a hardware monitoring facility
+that collects a large number of hardware performance events at the
+Nest level (on-chip but off-core), the Core level and the Thread level.
+
+The Nest PMU counters are handled by a Nest IMC microcode which runs
+in the OCC (On-Chip Controller) complex. The microcode collects the
+counter data and moves the nest IMC counter data to memory.
+
+The Core and Thread IMC PMU counters are handled in the core. Core
+level PMU counters give us the IMC counters' data per core, and thread
+level PMU counters give us the IMC counters' data per CPU thread.
+
+OPAL obtains the IMC PMU and supported events information from the
+IMC Catalog and passes it on to the kernel via the device tree. The
+event information contains:
+ - Event name
+ - Event Offset
+ - Event description
+and, optionally:
+ - Event scale
+ - Event unit
+
+Some PMUs may have common scale and unit values for all their
+supported events. For those cases, the scale and unit properties for
+those events must be inherited from the PMU.
+
+The event offset is the location in memory where the counter data
+gets accumulated.
+
+IMC catalog is available at:
+   https://github.com/open-power/ima-catalog
+
+The kernel discovers the IMC counters information in the device tree
+at the "imc-counters" device node, which has a compatible field
+"ibm,opal-in-memory-counters". From the device tree, the kernel parses
+the PMUs and their event information and registers each PMU and its
+attributes in the kernel.
+
+IMC example usage
+=================
+
+# perf list
+
+  [...]
+  nest_mcs01/PM_MCS01_64B_RD_DISP_PORT01/[Kernel PMU event]
+  nest_mcs01/PM_MCS01_64B_RD_DISP_PORT23/[Kernel PMU event]
+
+  [...]
+  core_imc/CPM_0THRD_NON_IDLE_PCYC/  [Kernel PMU event]
+  core_imc/CPM_1THRD_NON_IDLE_INST/  [Kernel PMU event]
+
+  [...]
+  thread_imc/CPM_0THRD_NON_IDLE_PCYC/[Kernel PMU event]
+  thread_imc/CPM_1THRD_NON_IDLE_INST/[Kernel PMU event]
+
+To see per-chip data for nest_mcs01/PM_MCS01_64B_WR_DISP_PORT01/ :
+ # ./perf stat -e "nest_mcs01/PM_MCS01_64B_WR_DISP_PORT01/" -a --per-socket
+
+To see non-idle instructions for core 0 :
+ # ./perf stat -e "core_imc/CPM_NON_IDLE_INST/" -C 0 -I 1000
+
+To see non-idle instructions for a "make" :
+ # ./perf stat -e "thread_imc/CPM_NON_IDLE_PCYC/" make
+
+
+IMC Trace-mode
+==============
+
+POWER9 supports two modes for IMC: Accumulation mode and Trace mode.
+In Accumulation mode, event counts are accumulated in system memory.
+The hypervisor then reads the posted counts periodically or when
+requested. In IMC Trace mode, the 64 bit trace scom value is initialized
+with the event information. The CPMC*SEL and CPMC_LOAD fields in the
+trace scom specify the event to be monitored and the sampling duration.
+On each overflow of the selected CPMC, hardware snapshots the program
+counter along with the event counts and writes them into the memory
+pointed to by LDBAR.
+
+LDBAR is a 64 bit special purpose per-thread register; it has bits to
+indicate whether the hardware is configured for accumulation or trace
+mode.
+
+* LDBAR Register Layout:
+   0 : Enable/Disable
+   1 : 0 -> Accumulation Mode
+   1 -> Trace Mode
+   2:3   : Reserved
+   4:6   : PB scope
+   7 : Reserved
+   8:50  : Counter Address
+   51:63 : Reserved
+
+* TRACE_IMC_SCOM bit representation:
+
+   0:1 : SAMPSEL
+   2:33: CPMC_LOAD
+   34:40   : CPMC1SEL
+   41:47   : CPMC2SEL
+   48:50   : BUFFERSIZE
+   51:63   : RESERVED
+
+CPMC_LOAD contains the sampling duration. SAMPSEL and CPMC*SEL determine
+the event to count. BUFFERSIZE indicates the memory range. On each overflow,
+hardware snapshots the program counter along with the event counts, updates
+the memory, and reloads the CPMC_LOAD value for the next sampling duration.
+IMC hardware does not support exceptions, so it 

Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-10 Thread Borislav Petkov
On Fri, May 10, 2019 at 08:50:52PM +1000, Michael Ellerman wrote:
> Yeah that looks better to me. I didn't think about the case where EDAC
> core is modular.
> 
> Do you want me to send a new patch?

Nah, I'll fix it up.

Thx.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.


Re: [GIT PULL] Please pull powerpc/linux.git powerpc-5.2-1 tag

2019-05-10 Thread pr-tracker-bot
The pull request you sent on Fri, 10 May 2019 22:20:55 +1000:

> https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
> tags/powerpc-5.2-1

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b970afcfcabd63cd3832e95db096439c177c3592

Thank you!

-- 
Deet-doot-dot, I am a bot.
https://korg.wiki.kernel.org/userdoc/prtracker


[GIT PULL] Please pull powerpc/linux.git powerpc-5.2-1 tag

2019-05-10 Thread Michael Ellerman
-BEGIN PGP SIGNED MESSAGE-
Hash: SHA1

Hi Linus,

Please pull powerpc updates for 5.2.

Slightly delayed due to the issue with printk() calling probe_kernel_read()
interacting with our new user access prevention stuff, but all fixed now.

The only out-of-area changes are the addition of a cpuhp_state, small additions
to Documentation and MAINTAINERS updates.

No conflicts that I'm aware of.

cheers


The following changes since commit 79a3aaa7b82e3106be97842dedfd8429248896e6:

  Linux 5.1-rc3 (2019-03-31 14:39:29 -0700)

are available in the git repository at:

  https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git 
tags/powerpc-5.2-1

for you to fetch changes up to 8150a153c013aa2dd1ffae43370b89ac1347a7fb:

  powerpc/64s: Use early_mmu_has_feature() in set_kuap() (2019-05-09 14:28:56 
+1000)

- --
powerpc updates for 5.2

Highlights:

 - Support for Kernel Userspace Access/Execution Prevention (like
   SMAP/SMEP/PAN/PXN) on some 64-bit and 32-bit CPUs. This prevents the kernel
   from accidentally accessing userspace outside copy_to/from_user(), or
   ever executing userspace.

 - KASAN support on 32-bit.

 - Rework of where we map the kernel, vmalloc, etc. on 64-bit hash to use the
   same address ranges we use with the Radix MMU.

 - A rewrite into C of large parts of our idle handling code for 64-bit Book3S
   (ie. power8 & power9).

 - A fast path entry for syscalls on 32-bit CPUs, for a 12-17% speedup in the
   null_syscall benchmark.

 - On 64-bit bare metal we have support for recovering from errors with the time
   base (our clocksource), however if that fails currently we hang in __delay()
   and never crash. We now have support for detecting that case and short
   circuiting __delay() so we at least panic() and reboot.

 - Add support for optionally enabling the DAWR on Power9, which had to be
   disabled by default due to a hardware erratum. This has the effect of
   enabling hardware breakpoints for GDB, the downside is a badly behaved
   program could crash the machine by pointing the DAWR at cache inhibited
   memory. This is opt-in obviously.

 - xmon, our crash handler, gets support for a read only mode where operations
   that could change memory or otherwise disturb the system are disabled.

Plus many clean-ups, reworks and minor fixes etc.

Thanks to:
  Christophe Leroy, Akshay Adiga, Alastair D'Silva, Alexey Kardashevskiy, Andrew
  Donnellan, Aneesh Kumar K.V, Anju T Sudhakar, Anton Blanchard, Ben Hutchings,
  Bo YU, Breno Leitao, Cédric Le Goater, Christopher M. Riedl, Christoph
  Hellwig, Colin Ian King, David Gibson, Ganesh Goudar, Gautham R. Shenoy,
  George Spelvin, Greg Kroah-Hartman, Greg Kurz, Horia Geantă, Jagadeesh
  Pagadala, Joel Stanley, Joe Perches, Julia Lawall, Laurentiu Tudor, Laurent
  Vivier, Lukas Bulwahn, Madhavan Srinivasan, Mahesh Salgaonkar, Mathieu
  Malaterre, Michael Neuling, Mukesh Ojha, Nathan Fontenot, Nathan Lynch,
  Nicholas Piggin, Nick Desaulniers, Oliver O'Halloran, Peng Hao, Qian Cai, Ravi
  Bangoria, Rick Lindsley, Russell Currey, Sachin Sant, Stewart Smith, Sukadev
  Bhattiprolu, Thomas Huth, Tobin C. Harding, Tyrel Datwyler, Valentin
  Schneider, Wei Yongjun, Wen Yang, YueHaibing.

- --
Alastair D'Silva (11):
  ocxl: Rename struct link to ocxl_link
  ocxl: read_pasid never returns an error, so make it void
  ocxl: Remove superfluous 'extern' from headers
  ocxl: Remove some unused exported symbols
  ocxl: Split pci.c
  ocxl: Don't pass pci_dev around
  ocxl: Create a clear delineation between ocxl backend & frontend
  ocxl: Allow external drivers to use OpenCAPI contexts
  ocxl: afu_irq only deals with IRQ IDs, not offsets
  ocxl: move event_fd handling to frontend
  ocxl: Provide global MMIO accessors for external drivers

Alexey Kardashevskiy (1):
  powerpc/powernv/ioda: Handle failures correctly in 
pnv_pci_ioda_iommu_bypass_supported()

Andrew Donnellan (2):
  powerpc/powernv: Squash sparse warnings in opal-call.c
  MAINTAINERS: Update cxl/ocxl email address

Aneesh Kumar K.V (16):
  powerpc/mm/radix: Don't do SLB preload when using the radix MMU
  powerpc/mm: Fix build error with FLATMEM book3s64 config
  powerpc/mm: Remove PPC_MM_SLICES #ifdef for book3s64
  powerpc/mm: Add helpers for accessing hash translation related variables
  powerpc/mm: Move slb_addr_linit to early_init_mmu
  powerpc/mm: Reduce memory usage for mm_context_t for radix
  powerc/mm/hash: Reduce hash_mm_context size
  powerpc/mm/hash64: Add a variable to track the end of IO mapping
  powerpc/mm/hash64: Map all the kernel regions in the same 0xc range
  powerpc/mm: Validate address values against different region limits
  powerpc/mm: Drop the unnecessary region check
  powerpc/mm/hash: Simplify the region id calculation.
   

[PATCH 1/2] powerpc/8xx: move CPM1 related files from sysdev/ to platforms/8xx

2019-05-10 Thread Christophe Leroy
Only 8xx selects CPM1 and related CONFIG options are already
in platforms/8xx/Kconfig

This patch moves the related C files to platforms/8xx/.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/platforms/8xx/Makefile | 3 +++
 arch/powerpc/{sysdev => platforms/8xx}/cpm1.c   | 0
 arch/powerpc/{sysdev => platforms/8xx}/cpm_gpio.c   | 0
 arch/powerpc/{sysdev => platforms/8xx}/micropatch.c | 0
 arch/powerpc/sysdev/Makefile| 3 ---
 5 files changed, 3 insertions(+), 3 deletions(-)
 rename arch/powerpc/{sysdev => platforms/8xx}/cpm1.c (100%)
 rename arch/powerpc/{sysdev => platforms/8xx}/cpm_gpio.c (100%)
 rename arch/powerpc/{sysdev => platforms/8xx}/micropatch.c (100%)

diff --git a/arch/powerpc/platforms/8xx/Makefile 
b/arch/powerpc/platforms/8xx/Makefile
index 708ab099e886..10b338436655 100644
--- a/arch/powerpc/platforms/8xx/Makefile
+++ b/arch/powerpc/platforms/8xx/Makefile
@@ -3,6 +3,9 @@
 # Makefile for the PowerPC 8xx linux kernel.
 #
 obj-y  += m8xx_setup.o machine_check.o pic.o
+obj-$(CONFIG_CPM1) += cpm1.o
+obj-$(CONFIG_UCODE_PATCH)  += micropatch.o
+obj-$(CONFIG_8xx_GPIO) += cpm_gpio.o
 obj-$(CONFIG_MPC885ADS)   += mpc885ads_setup.o
 obj-$(CONFIG_MPC86XADS)   += mpc86xads_setup.o
 obj-$(CONFIG_PPC_EP88XC)  += ep88xc.o
diff --git a/arch/powerpc/sysdev/cpm1.c b/arch/powerpc/platforms/8xx/cpm1.c
similarity index 100%
rename from arch/powerpc/sysdev/cpm1.c
rename to arch/powerpc/platforms/8xx/cpm1.c
diff --git a/arch/powerpc/sysdev/cpm_gpio.c 
b/arch/powerpc/platforms/8xx/cpm_gpio.c
similarity index 100%
rename from arch/powerpc/sysdev/cpm_gpio.c
rename to arch/powerpc/platforms/8xx/cpm_gpio.c
diff --git a/arch/powerpc/sysdev/micropatch.c 
b/arch/powerpc/platforms/8xx/micropatch.c
similarity index 100%
rename from arch/powerpc/sysdev/micropatch.c
rename to arch/powerpc/platforms/8xx/micropatch.c
diff --git a/arch/powerpc/sysdev/Makefile b/arch/powerpc/sysdev/Makefile
index aaf23283ba0c..cfcade8270a9 100644
--- a/arch/powerpc/sysdev/Makefile
+++ b/arch/powerpc/sysdev/Makefile
@@ -37,12 +37,9 @@ obj-$(CONFIG_XILINX_PCI) += xilinx_pci.o
 obj-$(CONFIG_OF_RTC)   += of_rtc.o
 
 obj-$(CONFIG_CPM)  += cpm_common.o
-obj-$(CONFIG_CPM1) += cpm1.o
 obj-$(CONFIG_CPM2) += cpm2.o cpm2_pic.o cpm_gpio.o
-obj-$(CONFIG_8xx_GPIO) += cpm_gpio.o
 obj-$(CONFIG_QUICC_ENGINE) += cpm_common.o
 obj-$(CONFIG_PPC_DCR)  += dcr.o
-obj-$(CONFIG_UCODE_PATCH)  += micropatch.o
 
 obj-$(CONFIG_PPC_MPC512x)  += mpc5xxx_clocks.o
 obj-$(CONFIG_PPC_MPC52xx)  += mpc5xxx_clocks.o
-- 
2.13.3



[PATCH 2/2] powerpc/8xx: Add microcode patch to move SMC parameter RAM.

2019-05-10 Thread Christophe Leroy
Some SCC functions, like the QMC, require an extended parameter RAM.
On modern 8xx (ie 866 and 885), the SPI area can already be relocated,
allowing the use of those functions on SCC2. But the SCC3 and SCC4
parameter RAMs collide with the SMC1 and SMC2 parameter RAMs.

This patch adds microcode to allow the relocation of both SMC1 and
SMC2, and relocates them to offsets 0x1ec0 and 0x1fc0.
Those offsets are used by default by the CPM1 DSP1 and DSP2, but there
is no kernel driver using them at the moment, so this area can be
reused.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/platforms/8xx/Kconfig  |   7 ++
 arch/powerpc/platforms/8xx/micropatch.c | 109 +++-
 2 files changed, 114 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/8xx/Kconfig 
b/arch/powerpc/platforms/8xx/Kconfig
index d408162d5af4..e0fe670f06f6 100644
--- a/arch/powerpc/platforms/8xx/Kconfig
+++ b/arch/powerpc/platforms/8xx/Kconfig
@@ -157,6 +157,13 @@ config I2C_SPI_SMC1_UCODE_PATCH
help
  Help not implemented yet, coming soon.
 
+config SMC_UCODE_PATCH
+   bool "SMC relocation patch"
+   help
+ This microcode relocates the SMC1 and SMC2 parameter RAMs to
+ offsets 0x1ec0 and 0x1fc0 to allow extended parameter RAM
+ for SCC3 and SCC4.
+
 endchoice
 
 config UCODE_PATCH
diff --git a/arch/powerpc/platforms/8xx/micropatch.c 
b/arch/powerpc/platforms/8xx/micropatch.c
index 33a9042fca80..dc4423daf7d4 100644
--- a/arch/powerpc/platforms/8xx/micropatch.c
+++ b/arch/powerpc/platforms/8xx/micropatch.c
@@ -622,6 +622,86 @@ static uint patch_2f00[] __initdata = {
 };
 #endif
 
+/*
+ * SMC relocation patch arrays.
+ */
+
+#ifdef CONFIG_SMC_UCODE_PATCH
+
+static uint patch_2000[] __initdata = {
+   0x3fff, 0x3ffd, 0x3ffb, 0x3ff9,
+   0x5fefeff8, 0x5f91eff8, 0x3ff3, 0x3ff1,
+   0x3a11e710, 0xedf0ccb9, 0xf318ed66, 0x7f0e5fe2,
+   0x7fedbb38, 0x3afe7468, 0x7fedf4d8, 0x8ffbb92d,
+   0xb83b77fd, 0xb0bb5eb9, 0xdfda7fed, 0x90bde74d,
+   0x6f0dcbd3, 0xe7decfed, 0xcb50cfed, 0xcfeddf6d,
+   0x914d4f74, 0x5eaedfcb, 0x9ee0e7df, 0xefbb6ffb,
+   0xe7ef7f0e, 0x9ee57fed, 0xebb7effa, 0xeb30affb,
+   0x7fea90b3, 0x7e0cf09f, 0xb318, 0x5fffdfff,
+   0xac35efea, 0x7fce1fc1, 0xe2ff5fbd, 0xaffbe2ff,
+   0x5fbfaffb, 0xf9a87d0f, 0xaef8770f, 0x7d0fb0a2,
+   0xeffbbfff, 0xcfef5fba, 0x7d0fbfff, 0x5fba4cf8,
+   0x7fddd09b, 0x49f847fd, 0x7efdf097, 0x7fedfffd,
+   0x7dfdf093, 0xef7e7e1e, 0x5fba7f0e, 0x3a11e710,
+   0xedf0cc87, 0xfb18ad0a, 0x1f85bbb8, 0x74283b7e,
+   0x7375e4bb, 0x2ab64fb8, 0x5c7de4bb, 0x32fdffbf,
+   0x5f0843f8, 0x7ce3e1bb, 0xe74f7ded, 0x6f0f4fe8,
+   0xc7ba32be, 0x73f2efeb, 0x600b4f78, 0xe5bb760b,
+   0x5388aef8, 0x4ef80b6a, 0xcfef9ee5, 0xabf8751f,
+   0xefef5b88, 0x741f4fe8, 0x751e760d, 0x7fdb70dd,
+   0x741cafce, 0xefcc7fce, 0x751e7088, 0x741ce7bb,
+   0x334ecfed, 0xafdbefeb, 0xe5bb760b, 0x53ceaef8,
+   0xafe8e7eb, 0x4bf8771e, 0x7e007fed, 0x4fcbe2cc,
+   0x7fbc3085, 0x7b0f7a0f, 0x34b177fd, 0xb0e75e93,
+   0xdf313e3b, 0xaf78741f, 0x741f30cc, 0xcfef5f08,
+   0x741f3e88, 0xafb8771e, 0x5f437fed, 0x0bafe2cc,
+   0x741ccfec, 0xe5ca53a9, 0x6fcb4f74, 0x5e89df27,
+   0x2a923d14, 0x4b8fdf0c, 0x751f741c, 0x6c1eeffa,
+   0xefea7fce, 0x6ffc309a, 0xefec3fca, 0x308fdf0a,
+   0xadf85e7a, 0xaf7daefd, 0x5e7adf0a, 0x5e7aafdd,
+   0x761f1088, 0x1e7c7efd, 0x3089fffe, 0x4908fb18,
+   0x5fffdfff, 0xafbbf0f7, 0x4ef85f43, 0xadf81489,
+   0x7a0f7089, 0xcfef5089, 0x7a0fdf0c, 0x5e7cafed,
+   0xbc6e780f, 0xefef780f, 0xefef790f, 0xa7f85eeb,
+   0xffef790f, 0xefef790f, 0x1489df0a, 0x5e7aadfd,
+   0x5f09fffb, 0xe79aded9, 0xeff96079, 0x607ae79a,
+   0xded8eff9, 0x60795edb, 0x607acfef, 0xefefefdf,
+   0xefbfef7f, 0xeeffedff, 0xebffe7ff, 0xafefafdf,
+   0xafbfaf7f, 0xaeffadff, 0xabffa7ff, 0x6fef6fdf,
+   0x6fbf6f7f, 0x6eff6dff, 0x6bff67ff, 0x2fef2fdf,
+   0x2fbf2f7f, 0x2eff2dff, 0x2bff27ff, 0x4e08fd1f,
+   0xe5ff6e0f, 0xaff87eef, 0x7e0ffdef, 0xf11f6079,
+   0xabf8f51e, 0x7e0af11c, 0x37cfae16, 0x7fec909a,
+   0xadf8efdc, 0xcfeae52f, 0x7d0fe12b, 0xf11c6079,
+   0x7e0a4df8, 0xcfea5ea0, 0x7d0befec, 0xcfea5ea2,
+   0xe522efdc, 0x5ea2cfda, 0x4e08fd1f, 0x6e0faff8,
+   0x7c1f761f, 0xfdeff91f, 0x6079abf8, 0x761cee00,
+   0xf91f2bfb, 0xefefcfec, 0xf91f6079, 0x761c27fb,
+   0xefdf5e83, 0xcfdc7fdd, 0x50f84bf8, 0x47fd7c1f,
+   0x761ccfcf, 0x7eef7fed, 0x7dfd70ef, 0xef7e7f1e,
+   0x771efb18, 0x6079e722, 0xe6bbe5bb, 0x2e66e5bb,
+   0x600b2ee1, 0xe2bbe2bb, 0xe2bbe2bb, 0x2f5ee2bb,
+   0xe2bb2ff9, 0x6079e2bb,
+};
+
+static uint patch_2f00[] __initdata = {
+   0x30303030, 0x3e3e3030, 0xaf79b9b3, 0xbaa3b979,
+   0x9693369f, 0x79f79777, 0x97333fff, 0xfb3b9e9f,
+   0x79b91d11, 0x9e13f3ff, 0x3f9b6bd9, 0xe173d136,
+   0x695669d1, 0x697b3daf, 0x79b93a3a, 0x3f979f91,
+   0x379ff976, 

Re: [PATCH 09/16] mmc: sdhci-xenon: use new match_string() helper/macro

2019-05-10 Thread Dan Carpenter
On Fri, May 10, 2019 at 09:13:26AM +, Ardelean, Alexandru wrote:
> On Wed, 2019-05-08 at 16:26 +0300, Alexandru Ardelean wrote:
> > On Wed, 2019-05-08 at 15:20 +0300, Dan Carpenter wrote:
> > > 
> > > 
> > > On Wed, May 08, 2019 at 02:28:35PM +0300, Alexandru Ardelean wrote:
> > > > -static const char * const phy_types[] = {
> > > > - "emmc 5.0 phy",
> > > > - "emmc 5.1 phy"
> > > > -};
> > > > -
> > > >  enum xenon_phy_type_enum {
> > > >   EMMC_5_0_PHY,
> > > >   EMMC_5_1_PHY,
> > > >   NR_PHY_TYPES
> > > 
> > > There is no need for NR_PHY_TYPES now so you could remove that as well.
> > > 
> > 
> > I thought the same.
> > The only reason to keep NR_PHY_TYPES, is for potential future patches,
> > where it would be just 1 addition
> > 
> >  enum xenon_phy_type_enum {
> >   EMMC_5_0_PHY,
> >   EMMC_5_1_PHY,
> > +  EMMC_5_2_PHY,
> >   NR_PHY_TYPES
> >   }
> > 
> > Depending on style/preference of how to do enums (allow comma on last
> > enum or not allow comma on last enum value), adding new enum values
> > would be 2 additions + 1 deletion lines.
> > 
> >  enum xenon_phy_type_enum {
> >   EMMC_5_0_PHY,
> > -  EMMC_5_1_PHY
> > +  EMMC_5_1_PHY,
> > +  EMMC_5_2_PHY
> >  }
> > 
> > Either way (leave NR_PHY_TYPES or remove NR_PHY_TYPES) is fine from my
> > side.
> > 
> 
> Preference on this ?
> If no objection [nobody insists] I would keep.
> 
> I don't feel strongly about it [dropping NR_PHY_TYPES or not].

If you end up resending the series could you remove it, but if not then
it's not worth it.

regards,
dan carpenter



Re: [PATCH 09/16] mmc: sdhci-xenon: use new match_string() helper/macro

2019-05-10 Thread Ardelean, Alexandru
On Fri, 2019-05-10 at 14:01 +0300, Dan Carpenter wrote:
> [External]
> 
> 
> On Fri, May 10, 2019 at 09:13:26AM +, Ardelean, Alexandru wrote:
> > On Wed, 2019-05-08 at 16:26 +0300, Alexandru Ardelean wrote:
> > > On Wed, 2019-05-08 at 15:20 +0300, Dan Carpenter wrote:
> > > > 
> > > > 
> > > > On Wed, May 08, 2019 at 02:28:35PM +0300, Alexandru Ardelean wrote:
> > > > > -static const char * const phy_types[] = {
> > > > > - "emmc 5.0 phy",
> > > > > - "emmc 5.1 phy"
> > > > > -};
> > > > > -
> > > > >  enum xenon_phy_type_enum {
> > > > >   EMMC_5_0_PHY,
> > > > >   EMMC_5_1_PHY,
> > > > >   NR_PHY_TYPES
> > > > 
> > > > There is no need for NR_PHY_TYPES now so you could remove that as
> > > > well.
> > > > 
> > > 
> > > I thought the same.
> > > The only reason to keep NR_PHY_TYPES, is for potential future
> > > patches,
> > > where it would be just 1 addition
> > > 
> > >  enum xenon_phy_type_enum {
> > >   EMMC_5_0_PHY,
> > >   EMMC_5_1_PHY,
> > > +  EMMC_5_2_PHY,
> > >   NR_PHY_TYPES
> > >   }
> > > 
> > > Depending on style/preference of how to do enums (allow comma on last
> > > enum or not allow comma on last enum value), adding new enum values
> > > would be 2 additions + 1 deletion lines.
> > > 
> > >  enum xenon_phy_type_enum {
> > >   EMMC_5_0_PHY,
> > > -  EMMC_5_1_PHY
> > > +  EMMC_5_1_PHY,
> > > +  EMMC_5_2_PHY
> > >  }
> > > 
> > > Either way (leave NR_PHY_TYPES or remove NR_PHY_TYPES) is fine from
> > > my
> > > side.
> > > 
> > 
> > Preference on this ?
> > If no objection [nobody insists] I would keep.
> > 
> > I don't feel strongly about it [dropping NR_PHY_TYPES or not].
> 
> If you end up resending the series could you remove it, but if not then
> it's not worth it.

ack

thanks
Alex

> 
> regards,
> dan carpenter
> 


Re: [PATCH] EDAC, mpc85xx: Prevent building as a module

2019-05-10 Thread Michael Ellerman
Borislav Petkov  writes:

> On Thu, May 09, 2019 at 04:55:34PM +0200, Borislav Petkov wrote:
>> On Fri, May 10, 2019 at 12:52:05AM +1000, Michael Ellerman wrote:
>> > Thanks. It would be nice if you could send it as a fix for 5.2, it's the
>> > last thing blocking one of my allmodconfig builds. But if you don't
>> > think it qualifies as a fix that's fine too, it can wait.
>> 
>> Sure, no problem. Will do a pull request later.
>
> Hmm, so looking at this more, I was able to produce this config with my
> ancient cross-compiler:
>
> CONFIG_EDAC_SUPPORT=y
> CONFIG_EDAC=m
> CONFIG_EDAC_LEGACY_SYSFS=y
> CONFIG_EDAC_MPC85XX=y

Oh yeah good point.

> Now, mpc85xx_edac is built-in and edac_core.ko is a module
> (CONFIG_EDAC=m) and that should not work - i.e., builtin code calling
> module functions. But my cross-compiler is happily building this without
> complaint. Or maybe I'm missing something.

That's weird.

> In any case, I *think* the proper fix should be to do:
>
> config EDAC_MPC85XX
> bool "Freescale MPC83xx / MPC85xx"
> depends on FSL_SOC && EDAC=y
>
> so that you can't even produce the above invalid .config snippet.
>
> Hmmm?

Yeah that looks better to me. I didn't think about the case where EDAC
core is modular.

Do you want me to send a new patch?

cheers


RE: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Michael Ellerman
David Laight  writes:
> From: Michal Suchánek
>> Sent: 09 May 2019 14:38
> ...
>> > The problem is the combination of some new code called via printk(),
>> > check_pointer() which calls probe_kernel_read(). That then calls
>> > allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
>> > (before we've patched features).
>> 
>> There is early_mmu_has_feature for this case. mmu_has_feature does not
>> work before patching so parts of kernel that can run before patching
>> must use the early_ variant which actually runs code reading the
>> feature bitmap to determine the answer.
>
> Does the early_ variant get patched so that it is reasonably
> efficient after the 'patching' is done?

No they don't get patched ever. The name is a bit misleading I guess.

> Or should there be a third version which gets patched across?

For a case like this it's entirely safe to just skip the code early in
boot, so if it was a static_key_false everything would just work.

Unfortunately the way the code is currently written we would have to
change all MMU features to static_key_false and that risks breaking
something else.
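
(For reference, a minimal sketch of that idea — hypothetical names, not
the current powerpc code. A false-by-default static key takes the "off"
branch until the key is explicitly enabled, so calling it before
patching is harmless:)

	static DEFINE_STATIC_KEY_FALSE(kuap_enabled);

	static inline void allow_user_access(void)
	{
		if (!static_branch_unlikely(&kuap_enabled))
			return;	/* early boot: key not enabled yet */
		/* ... write the user-access SPR and isync here ... */
	}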

We have a long standing TODO to rework all our feature logic and unify
CPU/MMU/firmware/etc. features. Possibly as part of that we can come up
with a scheme where the default value is per-feature bit.

Having said all that, in this case the overhead of the test and branch
is small compared to the cost of writing to the SPR which controls user
access and then doing an isync, so it's all somewhat premature
optimisation.

cheers


[PATCH net 4/5] net: ethernet: fix similar warning reported by kbuild test robot

2019-05-10 Thread Petr Štetiar
This patch fixes the following (similar) warning reported by the kbuild
test robot:

 In function ‘memcpy’,
  inlined from ‘smsc75xx_init_mac_address’ at drivers/net/usb/smsc75xx.c:778:3,
  inlined from ‘smsc75xx_bind’ at drivers/net/usb/smsc75xx.c:1501:2:
  ./include/linux/string.h:355:9: warning: argument 2 null where non-null 
expected [-Wnonnull]
  return __builtin_memcpy(p, q, size);
 ^~~~
  drivers/net/usb/smsc75xx.c: In function ‘smsc75xx_bind’:
  ./include/linux/string.h:355:9: note: in a call to built-in function 
‘__builtin_memcpy’

I've replaced the offending memcpy with ether_addr_copy, because I'm
100% sure that of_get_mac_address can't return NULL: it returns a valid
pointer or an ERR_PTR encoded value, nothing else.

I'm hesitant to just change the IS_ERR check into an IS_ERR_OR_NULL
check; that would also make the warning disappear, but it would be
confusing to check for an impossible return value just to make the
compiler happy.

I'm now changing all occurrences of memcpy to ether_addr_copy after the
of_get_mac_address call, as it's very likely that we're going to get
similar reports from the kbuild test robot in the future.
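
For reference, ether_addr_copy() has roughly this shape (an abridged
sketch of include/linux/etherdevice.h):

	static inline void ether_addr_copy(u8 *dst, const u8 *src)
	{
		/* copies exactly ETH_ALEN (6) bytes; the real helper
		 * uses word-sized accesses where alignment allows */
		...
	}

so the compiler no longer sees a raw __builtin_memcpy() whose second
argument it believes may be NULL.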

Fixes: a51645f70f63 ("net: ethernet: support of_get_mac_address new ERR_PTR 
error")
Reported-by: kbuild test robot 
Signed-off-by: Petr Štetiar 
---
 drivers/net/ethernet/allwinner/sun4i-emac.c   | 2 +-
 drivers/net/ethernet/arc/emac_main.c  | 2 +-
 drivers/net/ethernet/cavium/octeon/octeon_mgmt.c  | 2 +-
 drivers/net/ethernet/davicom/dm9000.c | 2 +-
 drivers/net/ethernet/freescale/fec_mpc52xx.c  | 2 +-
 drivers/net/ethernet/freescale/fman/mac.c | 2 +-
 drivers/net/ethernet/freescale/fs_enet/fs_enet-main.c | 2 +-
 drivers/net/ethernet/freescale/gianfar.c  | 2 +-
 drivers/net/ethernet/freescale/ucc_geth.c | 2 +-
 drivers/net/ethernet/marvell/mv643xx_eth.c| 2 +-
 drivers/net/ethernet/marvell/mvneta.c | 2 +-
 drivers/net/ethernet/marvell/sky2.c   | 2 +-
 drivers/net/ethernet/micrel/ks8851.c  | 2 +-
 drivers/net/ethernet/micrel/ks8851_mll.c  | 2 +-
 drivers/net/ethernet/nxp/lpc_eth.c| 2 +-
 drivers/net/ethernet/renesas/sh_eth.c | 2 +-
 drivers/net/ethernet/ti/cpsw.c| 2 +-
 drivers/net/ethernet/xilinx/ll_temac_main.c   | 2 +-
 drivers/net/ethernet/xilinx/xilinx_emaclite.c | 2 +-
 19 files changed, 19 insertions(+), 19 deletions(-)

diff --git a/drivers/net/ethernet/allwinner/sun4i-emac.c 
b/drivers/net/ethernet/allwinner/sun4i-emac.c
index 37ebd890ef51..9e06dff619c3 100644
--- a/drivers/net/ethernet/allwinner/sun4i-emac.c
+++ b/drivers/net/ethernet/allwinner/sun4i-emac.c
@@ -871,7 +871,7 @@ static int emac_probe(struct platform_device *pdev)
/* Read MAC-address from DT */
mac_addr = of_get_mac_address(np);
if (!IS_ERR(mac_addr))
-   memcpy(ndev->dev_addr, mac_addr, ETH_ALEN);
+   ether_addr_copy(ndev->dev_addr, mac_addr);
 
/* Check if the MAC address is valid, if not get a random one */
if (!is_valid_ether_addr(ndev->dev_addr)) {
diff --git a/drivers/net/ethernet/arc/emac_main.c 
b/drivers/net/ethernet/arc/emac_main.c
index 7f89ad5c336d..13a1d99b29c6 100644
--- a/drivers/net/ethernet/arc/emac_main.c
+++ b/drivers/net/ethernet/arc/emac_main.c
@@ -961,7 +961,7 @@ int arc_emac_probe(struct net_device *ndev, int interface)
mac_addr = of_get_mac_address(dev->of_node);
 
if (!IS_ERR(mac_addr))
-   memcpy(ndev->dev_addr, mac_addr, ETH_ALEN);
+   ether_addr_copy(ndev->dev_addr, mac_addr);
else
eth_hw_addr_random(ndev);
 
diff --git a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c 
b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
index 15b1130aa4ae..0e5de88fd6e8 100644
--- a/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
+++ b/drivers/net/ethernet/cavium/octeon/octeon_mgmt.c
@@ -1504,7 +1504,7 @@ static int octeon_mgmt_probe(struct platform_device *pdev)
mac = of_get_mac_address(pdev->dev.of_node);
 
if (!IS_ERR(mac))
-   memcpy(netdev->dev_addr, mac, ETH_ALEN);
+   ether_addr_copy(netdev->dev_addr, mac);
else
eth_hw_addr_random(netdev);
 
diff --git a/drivers/net/ethernet/davicom/dm9000.c 
b/drivers/net/ethernet/davicom/dm9000.c
index 953ee5616801..5e1aff9a5fd6 100644
--- a/drivers/net/ethernet/davicom/dm9000.c
+++ b/drivers/net/ethernet/davicom/dm9000.c
@@ -1413,7 +1413,7 @@ static struct dm9000_plat_data *dm9000_parse_dt(struct 
device *dev)
 
mac_addr = of_get_mac_address(np);
if (!IS_ERR(mac_addr))
-   memcpy(pdata->dev_addr, mac_addr, sizeof(pdata->dev_addr));
+   ether_addr_copy(pdata->dev_addr, mac_addr);
 
return pdata;
 }
diff --git a/drivers/net/ethernet/freescale/fec_mpc52xx.c 

[PATCH net 3/5] powerpc: tsi108: fix similar warning reported by kbuild test robot

2019-05-10 Thread Petr Štetiar
This patch fixes the following (similar) warning reported by the kbuild
test robot:

 In function ‘memcpy’,
  inlined from ‘smsc75xx_init_mac_address’ at drivers/net/usb/smsc75xx.c:778:3,
  inlined from ‘smsc75xx_bind’ at drivers/net/usb/smsc75xx.c:1501:2:
  ./include/linux/string.h:355:9: warning: argument 2 null where non-null 
expected [-Wnonnull]
  return __builtin_memcpy(p, q, size);
 ^~~~
  drivers/net/usb/smsc75xx.c: In function ‘smsc75xx_bind’:
  ./include/linux/string.h:355:9: note: in a call to built-in function 
‘__builtin_memcpy’

I've replaced the offending memcpy with ether_addr_copy, because I'm
100% sure that of_get_mac_address can't return NULL: it returns a valid
pointer or an ERR_PTR encoded value, nothing else.

I'm hesitant to just change the IS_ERR check into an IS_ERR_OR_NULL
check; that would also make the warning disappear, but it would be
confusing to check for an impossible return value just to make the
compiler happy.

I'm now changing all occurrences of memcpy to ether_addr_copy after the
of_get_mac_address call, as it's very likely that we're going to get
similar reports from the kbuild test robot in the future.

Fixes: ea168cdf1299 ("powerpc: tsi108: support of_get_mac_address new ERR_PTR 
error")
Reported-by: kbuild test robot 
Signed-off-by: Petr Štetiar 
---
 arch/powerpc/sysdev/tsi108_dev.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/sysdev/tsi108_dev.c b/arch/powerpc/sysdev/tsi108_dev.c
index c92dcac85231..026619c9a8cb 100644
--- a/arch/powerpc/sysdev/tsi108_dev.c
+++ b/arch/powerpc/sysdev/tsi108_dev.c
@@ -18,6 +18,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -106,7 +107,7 @@ static int __init tsi108_eth_of_init(void)
 
mac_addr = of_get_mac_address(np);
if (!IS_ERR(mac_addr))
-   memcpy(tsi_eth_data.mac_addr, mac_addr, 6);
+   ether_addr_copy(tsi_eth_data.mac_addr, mac_addr);
 
ph = of_get_property(np, "mdio-handle", NULL);
mdio = of_find_node_by_phandle(*ph);
-- 
1.9.1



[RESEND PATCH] powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt

2019-05-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

During memory hotplug operations involving resizing of the HPT, we
invoke a stop_machine() to perform the resizing. In this code path, we
end up recursively taking the cpu_hotplug_lock, first in
memory_hotplug_begin() and then subsequently in stop_machine(). Since
cpu_hotplug_lock is a percpu rwsem, the recursive read-acquisition can
deadlock once a writer queues up in between, and the system hangs. With
lockdep enabled we get the following
error message before the hang.

  swapper/0/1 is trying to acquire lock:
  (ptrval) (cpu_hotplug_lock.rw_sem){}, at: stop_machine+0x2c/0x60

  but task is already holding lock:
  (ptrval) (cpu_hotplug_lock.rw_sem){}, at: 
mem_hotplug_begin+0x20/0x50

  other info that might help us debug this:
   Possible unsafe locking scenario:

 CPU0
 
lock(cpu_hotplug_lock.rw_sem);
lock(cpu_hotplug_lock.rw_sem);

   *** DEADLOCK ***

Fix this issue by
  1) Requiring all the calls to pseries_lpar_resize_hpt() be made
 with cpu_hotplug_lock held.

  2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked()
 as a consequence of 1)

  3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt()
 with cpu_hotplug_lock held.

Reported-by: Aneesh Kumar K.V 
Signed-off-by: Gautham R. Shenoy 
---

Rebased this one against powerpc/next instead of linux/master.

 arch/powerpc/mm/book3s64/hash_utils.c | 9 -
 arch/powerpc/platforms/pseries/lpar.c | 8 ++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/hash_utils.c 
b/arch/powerpc/mm/book3s64/hash_utils.c
index 919a861..d07fcafd 100644
--- a/arch/powerpc/mm/book3s64/hash_utils.c
+++ b/arch/powerpc/mm/book3s64/hash_utils.c
@@ -38,6 +38,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1928,10 +1929,16 @@ static int hpt_order_get(void *data, u64 *val)
 
 static int hpt_order_set(void *data, u64 val)
 {
+   int ret;
+
if (!mmu_hash_ops.resize_hpt)
return -ENODEV;
 
-   return mmu_hash_ops.resize_hpt(val);
+   cpus_read_lock();
+   ret = mmu_hash_ops.resize_hpt(val);
+   cpus_read_unlock();
+
+   return ret;
 }
 
 DEFINE_DEBUGFS_ATTRIBUTE(fops_hpt_order, hpt_order_get, hpt_order_set, 
"%llu\n");
diff --git a/arch/powerpc/platforms/pseries/lpar.c 
b/arch/powerpc/platforms/pseries/lpar.c
index 1034ef1..2fc9756 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -859,7 +859,10 @@ static int pseries_lpar_resize_hpt_commit(void *data)
return 0;
 }
 
-/* Must be called in user context */
+/*
+ * Must be called in user context. The caller should hold the
+ * cpus_lock.
+ */
 static int pseries_lpar_resize_hpt(unsigned long shift)
 {
struct hpt_resize_state state = {
@@ -913,7 +916,8 @@ static int pseries_lpar_resize_hpt(unsigned long shift)
 
t1 = ktime_get();
 
-   rc = stop_machine(pseries_lpar_resize_hpt_commit, , NULL);
+   rc = stop_machine_cpuslocked(pseries_lpar_resize_hpt_commit,
+, NULL);
 
t2 = ktime_get();
 
-- 
1.9.4



[PATCH v2] powerpc: slightly improve cache helpers

2019-05-10 Thread Christophe Leroy
Cache instructions (dcbz, dcbi, dcbf and dcbst) take two registers
that are summed to obtain the target address. Using the 'Z' constraint
and the '%y0' operand modifier gives GCC the opportunity to use both
registers instead of only one, with the second being forced to 0.
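
As an illustration (hand-written, not actual compiler output), clearing
a line at base + offset previously required materialising the sum in a
scratch register first:

	add	r9, r3, r4
	dcbz	0, r9

whereas with the 'Z' constraint GCC can emit the indexed form directly:

	dcbz	r3, r4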

Suggested-by: Segher Boessenkool 
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/cache.h | 8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 40ea5b3781c6..df8e4c407366 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -85,22 +85,22 @@ extern void _set_L3CR(unsigned long);
 
 static inline void dcbz(void *addr)
 {
-   __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
+   __asm__ __volatile__ ("dcbz %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbi(void *addr)
 {
-   __asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
+   __asm__ __volatile__ ("dcbi %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbf(void *addr)
 {
-   __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
+   __asm__ __volatile__ ("dcbf %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
 
 static inline void dcbst(void *addr)
 {
-   __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
+   __asm__ __volatile__ ("dcbst %y0" : : "Z"(*(u8 *)addr) : "memory");
 }
 #endif /* !__ASSEMBLY__ */
 #endif /* __KERNEL__ */
-- 
2.13.3



Re: [PATCH 03/16] lib,treewide: add new match_string() helper/macro

2019-05-10 Thread Ardelean, Alexandru
On Wed, 2019-05-08 at 16:22 +0300, Alexandru Ardelean wrote:
> On Wed, 2019-05-08 at 15:18 +0200, Greg KH wrote:
> > 
> > 
> > On Wed, May 08, 2019 at 04:11:28PM +0300, Andy Shevchenko wrote:
> > > On Wed, May 08, 2019 at 02:28:29PM +0300, Alexandru Ardelean wrote:
> > > > This change re-introduces `match_string()` as a macro that uses
> > > > ARRAY_SIZE() to compute the size of the array.
> > > > The macro is added in all the places that do
> > > > `match_string(_a, ARRAY_SIZE(_a), s)`, since the change is pretty
> > > > straightforward.
> > > 
> > > Can you split include/linux/ change from the rest?
> > 
> > That would break the build, why do you want it split out?  This makes
> > sense all as a single patch to me.
> > 
> 
> Not really.
> It would just be the new match_string() helper/macro in a new commit.
> And the conversions of the simple users of match_string() (the ones using
> ARRAY_SIZE()) in another commit.
> 

I should have asked in my previous reply.
Leave this as-is or re-formulate in 2 patches ?

No strong preference from my side.

Thanks
Alex

> Thanks
> Alex
> 
> > thanks,
> > 
> > greg k-h


Re: [PATCH 09/16] mmc: sdhci-xenon: use new match_string() helper/macro

2019-05-10 Thread Ardelean, Alexandru
On Wed, 2019-05-08 at 16:26 +0300, Alexandru Ardelean wrote:
> On Wed, 2019-05-08 at 15:20 +0300, Dan Carpenter wrote:
> > 
> > 
> > On Wed, May 08, 2019 at 02:28:35PM +0300, Alexandru Ardelean wrote:
> > > -static const char * const phy_types[] = {
> > > - "emmc 5.0 phy",
> > > - "emmc 5.1 phy"
> > > -};
> > > -
> > >  enum xenon_phy_type_enum {
> > >   EMMC_5_0_PHY,
> > >   EMMC_5_1_PHY,
> > >   NR_PHY_TYPES
> > 
> > There is no need for NR_PHY_TYPES now so you could remove that as well.
> > 
> 
> I thought the same.
> The only reason to keep NR_PHY_TYPES, is for potential future patches,
> where it would be just 1 addition
> 
>  enum xenon_phy_type_enum {
>   EMMC_5_0_PHY,
>   EMMC_5_1_PHY,
> +  EMMC_5_2_PHY,
>   NR_PHY_TYPES
>   }
> 
> Depending on style/preference of how to do enums (allow comma on last
> enum or not allow comma on last enum value), adding new enum values
> would be 2 additions + 1 deletion lines.
> 
>  enum xenon_phy_type_enum {
>   EMMC_5_0_PHY,
> -  EMMC_5_1_PHY
> +  EMMC_5_1_PHY,
> +  EMMC_5_2_PHY
>  }
> 
> Either way (leave NR_PHY_TYPES or remove NR_PHY_TYPES) is fine from my
> side.
> 

Preference on this ?
If no objection [nobody insists] I would keep.

I don't feel strongly about it [dropping NR_PHY_TYPES or not].

Thanks
Alex

> Thanks
> Alex
> 
> > regards,
> > dan carpenter
> > 


[PATCH] powerpc/pseries: Fix cpu_hotplug_lock acquisition in resize_hpt

2019-05-10 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

During memory hotplug operations involving resizing of the HPT, we
invoke a stop_machine() to perform the resizing. In this code path, we
end up recursively taking the cpu_hotplug_lock, first in
memory_hotplug_begin() and then subsequently in stop_machine(). Since
cpu_hotplug_lock is a percpu rwsem, the recursive read-acquisition can
deadlock once a writer queues up in between, and the system hangs. With
lockdep enabled we get the following
error message before the hang.

  swapper/0/1 is trying to acquire lock:
  (ptrval) (cpu_hotplug_lock.rw_sem){}, at: stop_machine+0x2c/0x60

  but task is already holding lock:
  (ptrval) (cpu_hotplug_lock.rw_sem){}, at: 
mem_hotplug_begin+0x20/0x50

  other info that might help us debug this:
   Possible unsafe locking scenario:

 CPU0
 
lock(cpu_hotplug_lock.rw_sem);
lock(cpu_hotplug_lock.rw_sem);

   *** DEADLOCK ***

Fix this issue by
  1) Requiring all the calls to pseries_lpar_resize_hpt() be made
 with cpu_hotplug_lock held.

  2) In pseries_lpar_resize_hpt() invoke stop_machine_cpuslocked()
 as a consequence of 1)

  3) To satisfy 1), in hpt_order_set(), call mmu_hash_ops.resize_hpt()
 with cpu_hotplug_lock held.

Reported-by: Aneesh Kumar K.V 
Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/mm/hash_utils_64.c   | 9 -
 arch/powerpc/platforms/pseries/lpar.c | 8 ++--
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hash_utils_64.c b/arch/powerpc/mm/hash_utils_64.c
index 0a4f939..b05c79c 100644
--- a/arch/powerpc/mm/hash_utils_64.c
+++ b/arch/powerpc/mm/hash_utils_64.c
@@ -37,6 +37,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
@@ -1890,10 +1891,16 @@ static int hpt_order_get(void *data, u64 *val)
 
 static int hpt_order_set(void *data, u64 val)
 {
+   int ret;
+
if (!mmu_hash_ops.resize_hpt)
return -ENODEV;
 
-   return mmu_hash_ops.resize_hpt(val);
+   cpus_read_lock();
+   ret = mmu_hash_ops.resize_hpt(val);
+   cpus_read_unlock();
+
+   return ret;
 }
 
 DEFINE_DEBUGFS_ATTRIBUTE(fops_hpt_order, hpt_order_get, hpt_order_set, 
"%llu\n");
diff --git a/arch/powerpc/platforms/pseries/lpar.c 
b/arch/powerpc/platforms/pseries/lpar.c
index f2a9f0a..65df95b 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -859,7 +859,10 @@ static int pseries_lpar_resize_hpt_commit(void *data)
return 0;
 }
 
-/* Must be called in user context */
+/*
+ * Must be called in user context. The caller should hold the
+ * cpus_lock.
+ */
 static int pseries_lpar_resize_hpt(unsigned long shift)
 {
struct hpt_resize_state state = {
@@ -911,7 +914,8 @@ static int pseries_lpar_resize_hpt(unsigned long shift)
 
t1 = ktime_get();
 
-   rc = stop_machine(pseries_lpar_resize_hpt_commit, , NULL);
+   rc = stop_machine_cpuslocked(pseries_lpar_resize_hpt_commit,
+, NULL);
 
t2 = ktime_get();
 
-- 
1.9.4



Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Sergey Senozhatsky
On (05/10/19 10:42), Petr Mladek wrote:
[..]
> Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid 
> pointers")
> Signed-off-by: Petr Mladek 

FWIW
Reviewed-by: Sergey Senozhatsky 

-ss


[PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Petr Mladek
The commit 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing
invalid pointers") broke boot on several architectures. The common
pattern is that probe_kernel_read() is not working during early
boot because the userspace access framework is not ready.

It is a generic problem. We have to avoid any complex external
functions in vsprintf() code, especially in the common path.
They might break printk() easily and are hard to debug.

Replace probe_kernel_read() with some simple checks for obvious
problems.

Details:

1. Report on Power:

Kernel crashes very early during boot with with CONFIG_PPC_KUAP and
CONFIG_JUMP_LABEL_FEATURE_CHECK_DEBUG

The problem is the combination of some new code called via printk(),
check_pointer() which calls probe_kernel_read(). That then calls
allow_user_access() (PPC_KUAP) and that uses mmu_has_feature() too early
(before we've patched features). With the JUMP_LABEL debug enabled that
causes us to call printk() & dump_stack() and we end up recursing and
overflowing the stack.

Because it happens so early you don't get any output, just an apparently
dead system.

The stack trace (which you don't see) is something like:

  ...
  dump_stack+0xdc
  probe_kernel_read+0x1a4
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  printk_safe_log_store+0x7c
  printk+0x40
  dump_stack_print_info+0xbc
  dump_stack+0x8
  probe_kernel_read+0x1a4
  probe_kernel_read+0x19c
  check_pointer+0x58
  string+0x3c
  vsnprintf+0x1bc
  vscnprintf+0x20
  vprintk_store+0x6c
  vprintk_emit+0xec
  vprintk_func+0xd4
  printk+0x40
  cpufeatures_process_feature+0xc8
  scan_cpufeatures_subnodes+0x380
  of_scan_flat_dt_subnodes+0xb4
  dt_cpu_ftrs_scan_callback+0x158
  of_scan_flat_dt+0xf0
  dt_cpu_ftrs_scan+0x3c
  early_init_devtree+0x360
  early_setup+0x9c

2. Report on s390:

vsnprintf invocations are broken on s390. For example, the early boot
output now looks like this where the first (efault) should be
the linux_banner:

[0.099985] (efault)
[0.099985] setup: Linux is running as a z/VM guest operating system in 
64-bit mode
[0.100066] setup: The maximum memory size is 8192MB
[0.100070] cma: Reserved 4 MiB at (efault)
[0.100100] numa: NUMA mode: (efault)

The reason for this is that the code assumes that
probe_kernel_address() works very early. This however is not true on
at least s390. Uaccess on KERNEL_DS works only after page tables have
been setup on s390, which happens with setup_arch()->paging_init().

Any probe_kernel_address() invocation before that will return -EFAULT.

Fixes: 3e5903eb9cff70730 ("vsprintf: Prevent crash when dereferencing invalid 
pointers")
Signed-off-by: Petr Mladek 
---
 lib/vsprintf.c | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index 7b0a6140bfad..2f003cfe340e 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -628,19 +628,16 @@ static char *error_string(char *buf, char *end, const 
char *s,
 }
 
 /*
- * This is not a fool-proof test. 99% of the time that this will fault is
- * due to a bad pointer, not one that crosses into bad memory. Just test
- * the address to make sure it doesn't fault due to a poorly added printk
- * during debugging.
+ * Do not call any complex external code here. Nested printk()/vsprintf()
+ * might cause infinite loops. Failures might break printk() and would
+ * be hard to debug.
  */
 static const char *check_pointer_msg(const void *ptr)
 {
-   char byte;
-
if (!ptr)
return "(null)";
 
-   if (probe_kernel_address(ptr, byte))
+   if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
return "(efault)";
 
return NULL;
-- 
2.16.4



Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Sergey Senozhatsky
On (05/10/19 10:06), Petr Mladek wrote:
[..]
> I am going to send a patch replacing probe_kernel_address() with
> a simple check:
> 
>   if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
>   return "(efault)";

I'm OK with this.
Probing ptrs was a good idea, it just didn't work out.

-ss


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Petr Mladek
On Fri 2019-05-10 14:07:09, Sergey Senozhatsky wrote:
> On (05/09/19 21:47), Linus Torvalds wrote:
> >[ Sorry about html and mobile crud, I'm not at the computer right now ]
> >How about we just undo the whole misguided probe_kernel_address() thing?
> 
> But the problem will remain - %pS/%pF on PPC (and some other arch-s)
> do dereference_function_descriptor(), which calls probe_kernel_address().
> So if probe_kernel_address() starts to dump_stack(), then we are heading
> towards stack overflow. Unless I'm totally missing something.

That is true. On the other hand, %pS/%pF uses
dereference_function_descriptor() only on three architectures.
And these modifiers are used only rarely (ok, in dump_stack()
but still).

On the other hand, any infinite loop in vsprintf() via
probe_kernel_address() would break any printk(). And would be
hard to debug.

I tend to agree with Linus. probe_kernel_address() is too complicated.
It is prone to these infinite loops and should not be used in
the default printk() path.

It would be nice to have a lightweight and safe alternative. But
I can't find any. And I think that it is not worth any huge
complexity. It was just a nice to have idea...


I am going to send a patch replacing probe_kernel_address() with
a simple check:

if ((unsigned long)ptr < PAGE_SIZE || IS_ERR_VALUE(ptr))
return "(efault)";

The original patch still makes sense because it adds the check
into more locations and replaces some custom variants.

Best Regards,
Petr


Re: [PATCH] vsprintf: Do not break early boot with probing addresses

2019-05-10 Thread Michael Ellerman
Sergey Senozhatsky  writes:
> On (05/09/19 21:47), Linus Torvalds wrote:
>>[ Sorry about html and mobile crud, I'm not at the computer right now ]
>>How about we just undo the whole misguided probe_kernel_address() thing?
>
> But the problem will remain - %pS/%pF on PPC (and some other arch-s)
> do dereference_function_descriptor(), which calls probe_kernel_address().

(Only on 64-bit big endian, and we may even change that one day)

> So if probe_kernel_address() starts to dump_stack(), then we are heading
> towards stack overflow. Unless I'm totally missing something.

We only ended up calling dump_stack() from probe_kernel_address() due to
a combination of things:
  1. probe_kernel_address() actually uses __copy_from_user_inatomic()
 which is silly because it's not doing a user access.
  2. our user access code uses mmu_has_feature() which uses jump labels,
 and so isn't safe to call until we've initialised those jump labels.
 This is unnecessarily fragile, we can easily make the user access
 code safe to call before the jump labels are initialised.
  3. we had extra debug code enabled in mmu_has_feature() which calls
 dump_stack().

I've fixed 2, and plan to fix 1 as well at some point. And 3 is behind a
CONFIG option that no one except me is going to have enabled in
practice.
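
(For reference on point 1, an abridged sketch of mm/maccess.c as it
looked around this time — the kernel probe reuses the user-access
primitive, which is what drags the powerpc user-access code into this
path:)

	long __probe_kernel_read(void *dst, const void *src, size_t size)
	{
		long ret;
		mm_segment_t old_fs = get_fs();

		set_fs(KERNEL_DS);
		pagefault_disable();
		ret = __copy_from_user_inatomic(dst,
				(__force const void __user *)src, size);
		pagefault_enable();
		set_fs(old_fs);

		return ret ? -EFAULT : 0;
	}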

So in future we shouldn't be calling dump_stack() in that path.

cheers


[PATCH] powerpc/64: mark start_here_multiplatform as __ref

2019-05-10 Thread Christophe Leroy
Otherwise, the following warning is encountered:

WARNING: vmlinux.o(.text+0x3dc6): Section mismatch in reference from the 
variable start_here_multiplatform to the function .init.text:.early_setup()
The function start_here_multiplatform() references
the function __init .early_setup().
This is often because start_here_multiplatform lacks a __init
annotation or the annotation of .early_setup is wrong.
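
For reference, __REF switches the assembler into the .ref.text section,
which modpost deliberately exempts from init-section reference checks
(an abridged sketch of include/linux/init.h):

	/* include/linux/init.h (abridged) */
	#define __REF	.section	".ref.text", "ax"

and the .previous directive in the patch below restores the section
that was in effect before.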

Fixes: 56c46bba9bbf ("powerpc/64: Fix booting large kernels with 
STRICT_KERNEL_RWX")
Cc: Russell Currey 
Signed-off-by: Christophe Leroy 
---
 arch/powerpc/kernel/head_64.S | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 5321a11c2835..259be7f6d551 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -904,6 +904,7 @@ p_toc:  .8byte  __toc_start + 0x8000 - 0b
 /*
  * This is where the main kernel code starts.
  */
+__REF
 start_here_multiplatform:
/* set up the TOC */
bl  relative_toc
@@ -979,6 +980,7 @@ start_here_multiplatform:
RFI
b   .   /* prevent speculative execution */
 
+   .previous
/* This is where all platforms converge execution */
 
 start_here_common:
-- 
2.13.3