Re: [PATCH v5 01/13] powerpc: Remove Xilinx PPC405/PPC440 support

2020-06-17 Thread Nathan Chancellor
On Thu, Jun 18, 2020 at 10:48:21AM +1000, Michael Ellerman wrote:
> Nick Desaulniers  writes:
> > On Wed, Jun 17, 2020 at 3:20 AM Michael Ellerman  
> > wrote:
> >> Michael Ellerman  writes:
> >> > Michal Simek  writes:
> >> 
> >>
> >> >> Or if bamboo requires uImage to be built by default you can do it via
> >> >> Kconfig.
> >> >>
> >> >> diff --git a/arch/powerpc/platforms/44x/Kconfig
> >> >> b/arch/powerpc/platforms/44x/Kconfig
> >> >> index 39e93d23fb38..300864d7b8c9 100644
> >> >> --- a/arch/powerpc/platforms/44x/Kconfig
> >> >> +++ b/arch/powerpc/platforms/44x/Kconfig
> >> >> @@ -13,6 +13,7 @@ config BAMBOO
> >> >> select PPC44x_SIMPLE
> >> >> select 440EP
> >> >> select FORCE_PCI
> >> >> +   select DEFAULT_UIMAGE
> >> >> help
> >> >>   This option enables support for the IBM PPC440EP evaluation 
> >> >> board.
> >> >
> >> > Who knows what the actual bamboo board used. But I'd be happy to take a
> >> > SOB'ed patch to do the above, because these days the qemu emulation is
> >> > much more likely to be used than the actual board.
> >>
> >> I just went to see why my CI boot of 44x didn't catch this, and it's
> >> because I don't use the uImage, I just boot the vmlinux directly:
> >>
> >>   $ qemu-system-ppc -M bamboo -m 128m -display none -kernel build~/vmlinux 
> >> -append "console=ttyS0" -display none -nodefaults -serial mon:stdio
> >>   Linux version 5.8.0-rc1-00118-g69119673bd50 (michael@alpine1-p1) (gcc 
> >> (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #4 
> >> Wed Jun 17 20:19:22 AEST 2020
> >>   Using PowerPC 44x Platform machine description
> >>   ioremap() called early from find_legacy_serial_ports+0x690/0x770. Use 
> >> early_ioremap() instead
> >>   printk: bootconsole [udbg0] enabled
> >>
> >>
> >> So that's probably the simplest solution?
> >
> > If the uImage or zImage self decompresses, I would prefer to test that as 
> > well.
> 
> The uImage is decompressed by qemu AIUI.
> 
> >> That means previously arch/powerpc/boot/zImage was just a hardlink to
> >> the uImage:
> >
> > It sounds like we can just boot the zImage, or is that no longer
> > created with the uImage?
> 
> The zImage won't boot on bamboo.
> 
> Because of the vagaries of the arch/powerpc/boot/Makefile the zImage
> ends up pointing to treeImage.ebony, which is for a different board.
> 
> The zImage link is made to the first item in $(image-y):
> 
> $(obj)/zImage:$(addprefix $(obj)/, $(image-y))
>   $(Q)rm -f $@; ln $< $@
>  ^
>  first prerequisite
> 
> Which for this defconfig happens to be:
> 
> image-$(CONFIG_EBONY) += treeImage.ebony cuImage.ebony
> 
> If you turn off CONFIG_EBONY then the zImage will be a link to
> treeImage.bamboo, but qemu can't boot that either.
> 
> It's kind of nuts that the zImage points to some arbitrary image
> depending on what's configured and the order of things in the Makefile.
> But I'm not sure how we make it less nuts without risking breaking
> people's existing setups.
> 
> cheers

Hi Michael,

For what it's worth, we have squared this away in our CI by just
building and booting the uImage directly, rather than implicitly
using the zImage:

https://github.com/ClangBuiltLinux/continuous-integration/pull/282
https://github.com/ClangBuiltLinux/boot-utils/pull/22

We were only using the zImage because that is what Joel Stanley initially
set us up with when PowerPC 32-bit was added to our CI:

https://github.com/ClangBuiltLinux/continuous-integration/pull/100

Admittedly, we do not have many PowerPC experts in our organization, so
we support it on a "best effort" basis, which often means relying on
whatever knowledge is floating around or can be gained from interactions
such as this one :) so thank you for that!

Cheers,
Nathan


[powerpc:next-test] BUILD SUCCESS 9d7d80a443b962d67db4fbd4b832081f4d7df4a8

2020-06-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  
next-test
branch HEAD: 9d7d80a443b962d67db4fbd4b832081f4d7df4a8  powerpc/powernv/ioda: 
Return correct error if TCE level allocation failed

elapsed time: 723m

configs tested: 5
configs skipped: 105

The following configs have been built successfully.
More configs may be tested in the coming days.

powerpc defconfig
powerpc  allyesconfig
powerpc  rhel-kconfig
powerpc  allmodconfig
powerpc   allnoconfig

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


[powerpc:fixes-test] BUILD SUCCESS b55129f97aeefd265314e12d98935330e011a14a

2020-06-17 Thread kernel test robot
tree/branch: https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git  
fixes-test
branch HEAD: b55129f97aeefd265314e12d98935330e011a14a  powerpc/8xx: Provide 
ptep_get() with 16k pages

elapsed time: 723m

configs tested: 94
configs skipped: 4

The following configs have been built successfully.
More configs may be tested in the coming days.

arm defconfig
arm  allyesconfig
arm  allmodconfig
arm   allnoconfig
arm64allyesconfig
arm64   defconfig
arm64allmodconfig
arm64 allnoconfig
armclps711x_defconfig
powerpc  ep88xc_defconfig
armxcep_defconfig
arm  iop32x_defconfig
mips   xway_defconfig
sparcalldefconfig
arm mv78xx0_defconfig
powerpc mpc83xx_defconfig
i386 allyesconfig
i386defconfig
i386  debian-10.3
i386  allnoconfig
ia64 allmodconfig
ia64defconfig
ia64  allnoconfig
ia64 allyesconfig
m68k allmodconfig
m68k  allnoconfig
m68k   sun3_defconfig
m68kdefconfig
m68k allyesconfig
nds32   defconfig
nds32 allnoconfig
csky allyesconfig
cskydefconfig
alpha   defconfig
alphaallyesconfig
xtensa  defconfig
xtensa   allyesconfig
h8300allyesconfig
h8300allmodconfig
arc defconfig
sh   allmodconfig
shallnoconfig
microblazeallnoconfig
arc  allyesconfig
nios2   defconfig
nios2allyesconfig
openriscdefconfig
c6x  allyesconfig
c6x   allnoconfig
openrisc allyesconfig
mips allyesconfig
mips  allnoconfig
mips allmodconfig
pariscallnoconfig
parisc  defconfig
parisc   allyesconfig
parisc   allmodconfig
powerpc  rhel-kconfig
powerpc defconfig
powerpc  allyesconfig
powerpc  allmodconfig
powerpc   allnoconfig
i386 randconfig-a015-20200617
i386 randconfig-a011-20200617
i386 randconfig-a014-20200617
i386 randconfig-a016-20200617
i386 randconfig-a013-20200617
i386 randconfig-a012-20200617
riscvallyesconfig
riscv allnoconfig
riscv   defconfig
riscvallmodconfig
s390 allyesconfig
s390  allnoconfig
s390 allmodconfig
s390defconfig
sparcallyesconfig
sparc   defconfig
sparc64 defconfig
sparc64   allnoconfig
sparc64  allyesconfig
sparc64  allmodconfig
um   allmodconfig
umallnoconfig
um   allyesconfig
um  defconfig
x86_64   rhel
x86_64 rhel-7.2-clear
x86_64lkp
x86_64  fedora-25
x86_64   rhel-8.3
x86_64   rhel-7.6
x86_64rhel-7.6-kselftests
x86_64  kexec

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


linux-next: manual merge of the pidfd tree with the powerpc-fixes tree

2020-06-17 Thread Stephen Rothwell
Hi all,

Today's linux-next merge of the pidfd tree got a conflict in:

  arch/powerpc/kernel/syscalls/syscall.tbl

between commit:

  35e32a6cb5f6 ("powerpc/syscalls: Split SPU-ness out of ABI")

from the powerpc-fixes tree and commit:

  9b4feb630e8e ("arch: wire-up close_range()")

from the pidfd tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc arch/powerpc/kernel/syscalls/syscall.tbl
index c0cdaacd770e,dd87a782d80e..
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@@ -480,6 -524,8 +480,7 @@@
  434   common  pidfd_open  sys_pidfd_open
  435   32  clone3  ppc_clone3  sys_clone3
  435   64  clone3  sys_clone3
 -435   spu clone3  sys_ni_syscall
+ 436   common  close_range sys_close_range
  437   common  openat2 sys_openat2
  438   common  pidfd_getfd sys_pidfd_getfd
  439   common  faccessat2  sys_faccessat2




[PATCH AUTOSEL 4.4 52/60] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the Back-End fails,
the function forgets to decrease the refcnt increased by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when configuring the DMA
channel fails.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index ffc000bc1f15..56a873ba08e4 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -243,6 +243,7 @@ static int fsl_asrc_dma_hw_params(struct snd_pcm_substream 
*substream,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 4.4 38/60] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area can not
be used for a 16 byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.
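
For illustration, a stand-alone user-space sketch of the boundary case
(the rtas base and size values below are made up, not real firmware
data): the old check rejects the start of the last 16-byte slot, the
new one accepts it.

  /* Sketch only: old vs. new VALID_FWNMI_BUFFER() at the boundary. */
  #include <stdio.h>

  #define OLD_VALID(A, base, size) \
          ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
           (((A) >= (base)) && ((A) < ((base) + (size) - 16))))

  #define NEW_VALID(A, base, size) \
          ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
           (((A) >= (base)) && ((A) <= ((base) + (size) - 16))))

  int main(void)
  {
          unsigned long base = 0x1f000000, size = 0x10000; /* example only */
          unsigned long last_slot = base + size - 16;

          /* prints "old=0 new=1": the old macro rejects the last slot */
          printf("old=%d new=%d\n",
                 OLD_VALID(last_slot, base, size),
                 NEW_VALID(last_slot, base, size));
          return 0;
  }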

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 9795e52bab3d..9e817c1b7808 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -265,10 +265,11 @@ static irqreturn_t ras_error_interrupt(int irq, void 
*dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 /*
  * Get the error information for errors coming through the
-- 
2.25.1



[PATCH AUTOSEL 4.4 39/60] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change builds with 'DEBUG' defined will not cause kexec
reboots to hang, and builds with 'DEBUG' defined or not will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index b0f34663b1ae..19bae78b1f25 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -212,13 +212,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -316,19 +317,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
+   ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.4 26/60] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, where it sets the tty->driver_data to
NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open and check if tty->private_data is NULL before
proceeding ahead.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
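
The simple_open_close program is not included in this mail; a
hypothetical sketch of such a reproducer (not the original test
program) could be as small as:

  /* Hypothetical reproducer sketch: run two instances on the same hvc node. */
  #include <fcntl.h>
  #include <stdio.h>
  #include <unistd.h>

  int main(int argc, char **argv)
  {
          const char *dev = argc > 1 ? argv[1] : "/dev/hvc0";

          for (;;) {
                  int fd = open(dev, O_RDWR);

                  if (fd < 0) {
                          perror("open");
                          return 1;
                  }
                  close(fd);
          }
  }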

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index acf6d143c753..81f23af8beca 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -89,6 +89,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_SPINLOCK(hvc_structs_lock);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -333,16 +335,24 @@ static int hvc_install(struct tty_driver *driver, struct 
tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -371,6 +381,8 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.4 21/60] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index e26747a1b35a..e7075aae15da 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -427,6 +427,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 4.4 16/60] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

The "mem=" option is an easy way to put high pressure on memory during
testing. Hence, after applying the memory limit, the actual usable
memory rather than the total memory should be considered when reserving
memory for crashkernel. Otherwise boot-up may run into OOM issues.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine (see the sketch below).

This issue is powerpc specific because it puts a higher priority on
fadump and kdump reservation than on "mem=". Refer to the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

On other arches, by contrast, "mem=" takes effect first and is already
reflected in memblock_phys_mem_size() before reserve_crashkernel() is
called.
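
As a rough illustration of those numbers, a stand-alone sketch of the
crashkernel= range selection (simplified; not the kernel's
parse_crashkernel() implementation):

  #include <stdio.h>

  struct range { unsigned long long start, end, size; };

  /* pick the reservation size for the range containing "mem" */
  static unsigned long long pick(unsigned long long mem,
                                 const struct range *r, int n)
  {
          for (int i = 0; i < n; i++)
                  if (mem >= r[i].start && (r[i].end == 0 || mem < r[i].end))
                          return r[i].size;
          return 0;
  }

  #define GiB (1024ULL * 1024 * 1024)
  #define MiB (1024ULL * 1024)

  int main(void)
  {
          /* crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G" */
          struct range r[] = {
                  {   2 * GiB,   4 * GiB, 384 * MiB },
                  {   4 * GiB,  16 * GiB, 512 * MiB },
                  {  16 * GiB,  64 * GiB,   1 * GiB },
                  {  64 * GiB, 128 * GiB,   2 * GiB },
                  { 128 * GiB,         0,   4 * GiB },  /* open-ended */
          };

          /* total RAM vs. usable RAM after mem=5G on the 256G machine above */
          printf("reserve with total mem: %lluM\n", pick(256 * GiB, r, 5) >> 20);
          printf("reserve with mem=5G:    %lluM\n", pick(5 * GiB, r, 5) >> 20);
          return 0;
  }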

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/machine_kexec.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c 
b/arch/powerpc/kernel/machine_kexec.c
index 8dff2b371219..a14d9b008f74 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -113,11 +113,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -176,6 +177,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -184,7 +186,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 4.4 14/60] powerpc/pseries: Update hv-24x7 information after migration

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit 373b373053384f12951ae9f916043d955501d482 ]

Function 'read_sys_info_pseries()' is added to get system parameter
values like the number of sockets and chips per socket; it gets these
details via rtas_call with the "PROCESSOR_MODULE_INFO" token.

In case an LPAR migrates from one system to another, system parameter
details like chips per socket or the number of sockets might change, so
they need to be re-initialized; otherwise these values still correspond
to the previous system.
This patch adds a call to 'read_sys_info_pseries()' from
'post_mobility_fixup()' to re-init the physsockets and physchips values.

Signed-off-by: Kajol Jain 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-6-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/mobility.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index 8d30a425a88a..58ddc4389a51 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -326,6 +326,9 @@ void post_mobility_fixup(void)
/* Possibly switch to a new RFI flush type */
pseries_setup_rfi_flush();
 
+   /* Reinitialise system information for hv-24x7 */
+   read_24x7_sys_info();
+
return;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.4 10/60] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.
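
A tiny stand-alone sketch of that arithmetic (not the block-layer code
itself) shows the wrap:

  #include <stdio.h>

  int main(void)
  {
          unsigned long mask = -1UL;   /* segment boundary as set by ps3disk */
          unsigned long offset = 0;

          /* same expression as in get_max_segment_size(): wraps to 0 */
          unsigned long max = mask - (mask & offset) + 1;

          printf("max segment size = %lu\n", max);   /* prints 0 */
          return 0;
  }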

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index c120d70d3fb3..fc7a20286090 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -464,7 +464,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
blk_queue_bounce_limit(queue, BLK_BOUNCE_HIGH);
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 4.9 69/80] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the Back-End fails,
the function forgets to decrease the refcnt increased by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when configuring the DMA
channel fails.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index dc30d780f874..3fcf174b99d3 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -243,6 +243,7 @@ static int fsl_asrc_dma_hw_params(struct snd_pcm_substream 
*substream,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 4.9 55/80] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c2e929b18cea6cbf71364f22d742d9aad7f4677a ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.
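
For illustration, a stand-alone sketch of the difference (not the
kernel headers; the shift count just mirrors the 34 in the report): the
macro shifts whatever type the caller passes, so a literal 0 is shifted
as an int by 34, which is undefined, while the inline function converts
the argument to unsigned long first.

  #define EXAMPLE_PUD_SHIFT 34   /* 16 + 8 + 10, as in the report above */

  #define pud_index_macro(address) ((address) >> EXAMPLE_PUD_SHIFT)

  static inline unsigned long pud_index_inline(unsigned long address)
  {
          return address >> EXAMPLE_PUD_SHIFT;
  }

  int main(void)
  {
          unsigned long a = pud_index_macro(0);   /* UB: int 0 shifted by 34 */
          unsigned long b = pud_index_inline(0);  /* fine: converted to unsigned long first */

          return (int)(a + b);
  }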

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74

Suggested-by: Christophe Leroy 
Signed-off-by: Qian Cai 
Signed-off-by: Michael Ellerman 
Reviewed-by: Christophe Leroy 
Link: https://lore.kernel.org/r/20200306044852.3236-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 9fd77f8794a0..315758c84187 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -754,10 +754,25 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pud_page_vaddr(pud)__va(pud_val(pud) & ~PUD_MASKED_BITS)
 #define pgd_page_vaddr(pgd)__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
-#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
-#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+static inline unsigned long pgd_index(unsigned long address)
+{
+   return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
+}
+
+static inline unsigned long pud_index(unsigned long address)
+{
+   return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
+}
+
+static inline unsigned long pmd_index(unsigned long address)
+{
+   return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
+}
+
+static inline unsigned long pte_index(unsigned long address)
+{
+   return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
-- 
2.25.1



[PATCH AUTOSEL 4.9 48/80] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change builds with 'DEBUG' defined will not cause kexec
reboots to hang, and builds with 'DEBUG' defined or not will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index b0f34663b1ae..19bae78b1f25 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -212,13 +212,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -316,19 +317,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
+   ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.9 47/80] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area can not
be used for a 16 byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 8799d8a83d56..0af19aa1df57 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -311,10 +311,11 @@ static irqreturn_t ras_error_interrupt(int irq, void 
*dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 /*
  * Get the error information for errors coming through the
-- 
2.25.1



[PATCH AUTOSEL 4.9 33/80] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, where it sets the tty->driver_data to
NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open and check if tty->private_data is NULL before
proceeding ahead.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++++++++++++++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index 985f49a65906..35d591287734 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -89,6 +89,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_SPINLOCK(hvc_structs_lock);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -333,16 +335,24 @@ static int hvc_install(struct tty_driver *driver, struct 
tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -370,6 +380,8 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.9 28/80] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index e1730227b448..f299839698a3 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -425,6 +425,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 4.9 23/80] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

The "mem=" option is an easy way to put high pressure on memory during
testing. Hence, after applying the memory limit, the actual usable
memory rather than the total memory should be considered when reserving
memory for crashkernel. Otherwise boot-up may run into OOM issues.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.

This issue is powerpc specific because it puts a higher priority on
fadump and kdump reservation than on "mem=". Refer to the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

On other arches, by contrast, "mem=" takes effect first and is already
reflected in memblock_phys_mem_size() before reserve_crashkernel() is
called.

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/machine_kexec.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c 
b/arch/powerpc/kernel/machine_kexec.c
index 9dafd7af39b8..cb4d6cd949fc 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -113,11 +113,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -176,6 +177,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -184,7 +186,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 4.9 21/80] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit b4ac18eead28611ff470d0f47a35c4e0ac080d9c ]

Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values") changed the code to print the _change_ in the counter value
rather than the raw value for 24x7 counters. In case of transactions,
the event count is set to 0 at the beginning of the transaction. It
also sets the event's prev_count to the raw value at the time of
initialization. Because the event count is set to 0, we see some weird
behaviour whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704          120  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704            5  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733            8  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733           10  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215  18,446,744,073,709,551,616  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215  18,446,744,073,709,551,616  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884           56  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884  18,446,744,073,709,551,616  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887  18,446,744,073,709,551,616  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

We get these large values when running with -I (interval mode).

Because we set event_count to 0, in the interval case the overall
event_count does not increase monotonically; the new delta can be
smaller than the previous count. When printing intervals we then end up
with a negative value, which shows up as these huge numbers.
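
As a stand-alone sketch of that symptom (example counts only, not real
24x7 data): once the count has been reset, the per-interval delta can
go negative, and handled as an unsigned 64-bit value it shows up as a
number near 2^64.

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint64_t prev = 120;           /* count shown in the last interval */
          uint64_t curr = 56;            /* count after the reset to 0 */
          uint64_t delta = curr - prev;  /* wraps to 2^64 - 64 */

          printf("delta = %" PRIu64 "\n", delta);
          return 0;
  }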

This patch removes the part where we set event_count to 0 in
'h_24x7_event_read()'. There won't be much impact, as we already set
event->hw.prev_count to the raw value at initialization time in order
to print the change in value.

With this patch, on a power9 platform:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900          131  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920          204  hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu 
Signed-off-by: Kajol Jain 
Tested-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-2-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 ----------
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 991c6a517ddc..2456522583c2 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1306,16 +1306,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = &get_cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(&event->count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.25.1



[PATCH AUTOSEL 4.9 16/80] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index 76f33c84ce3d..7ec5e8f0cbe5 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -464,7 +464,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
blk_queue_bounce_limit(queue, BLK_BOUNCE_HIGH);
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 4.14 086/108] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the Back-End fails,
the function forgets to decrease the refcnt increased by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when configuring the DMA
channel fails.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index e1b97e59275a..15d7e6da0555 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -243,6 +243,7 @@ static int fsl_asrc_dma_hw_params(struct snd_pcm_substream 
*substream,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 4.14 084/108] powerpc/4xx: Don't unmap NULL mbase

2020-06-17 Thread Sasha Levin
From: huhai 

[ Upstream commit bcec081ecc940fc38730b29c743bbee661164161 ]

Signed-off-by: huhai 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200521072648.1254699-1-...@ellerman.id.au
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/4xx/pci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/pci.c b/arch/powerpc/platforms/4xx/pci.c
index 73e6b36bcd51..256943af58aa 100644
--- a/arch/powerpc/platforms/4xx/pci.c
+++ b/arch/powerpc/platforms/4xx/pci.c
@@ -1242,7 +1242,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
if (mbase == NULL) {
printk(KERN_ERR "%pOF: Can't map internal config space !",
port->node);
-   goto done;
+   return;
}
 
while (attempt && (0 == (in_le32(mbase + PECFG_460SX_DLLSTA)
@@ -1252,9 +1252,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
}
if (attempt)
port->link = 1;
-done:
iounmap(mbase);
-
 }
 
 static struct ppc4xx_pciex_hwops ppc460sx_pcie_hwops __initdata = {
-- 
2.25.1



[PATCH AUTOSEL 4.14 068/108] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c2e929b18cea6cbf71364f22d742d9aad7f4677a ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74

Suggested-by: Christophe Leroy 
Signed-off-by: Qian Cai 
Signed-off-by: Michael Ellerman 
Reviewed-by: Christophe Leroy 
Link: https://lore.kernel.org/r/20200306044852.3236-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 23 +++++++++++++++++++----
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index bcb79a96a6c8..618ee2c0ed53 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -898,10 +898,25 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pud_page_vaddr(pud)__va(pud_val(pud) & ~PUD_MASKED_BITS)
 #define pgd_page_vaddr(pgd)__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
-#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
-#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+static inline unsigned long pgd_index(unsigned long address)
+{
+   return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
+}
+
+static inline unsigned long pud_index(unsigned long address)
+{
+   return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
+}
+
+static inline unsigned long pmd_index(unsigned long address)
+{
+   return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
+}
+
+static inline unsigned long pte_index(unsigned long address)
+{
+   return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
-- 
2.25.1



[PATCH AUTOSEL 4.14 060/108] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area can not
be used for a 16 byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 99d1152ae224..5ec935521204 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -325,10 +325,11 @@ static irqreturn_t ras_error_interrupt(int irq, void 
*dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 /*
  * Get the error information for errors coming through the
-- 
2.25.1



[PATCH AUTOSEL 4.14 061/108] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change builds with 'DEBUG' defined will not cause kexec
reboots to hang, and builds with 'DEBUG' defined or not will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index b0f34663b1ae..19bae78b1f25 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -212,13 +212,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -316,19 +317,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
+   ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.14 043/108] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, it sets tty->driver_data to NULL, and the
parallel hvc_open() can see this NULL and cause a memory abort. Hence,
serialize hvc_open() and check whether tty->driver_data is NULL before
proceeding.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
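
A minimal sketch of what such a reproducer could look like (the real
simple_open_close tool is not part of this patch, so this is only an
illustration of the open()/close() loop described above):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        if (argc < 2) {
                fprintf(stderr, "usage: %s /dev/hvcX\n", argv[0]);
                return 1;
        }

        /* open and close the console device as fast as possible */
        for (;;) {
                int fd = open(argv[1], O_RDWR);

                if (fd >= 0)
                        close(fd);
        }
}

Running two instances against the same /dev/hvcX races the open paths and,
without the serialization added below, can hit the window where
tty->driver_data is NULL.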

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index d52221ae1b85..663cbe3669e1 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -88,6 +88,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_SPINLOCK(hvc_structs_lock);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -332,16 +334,24 @@ static int hvc_install(struct tty_driver *driver, struct 
tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -369,6 +379,8 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.14 036/108] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 83645a1c6f82..aff868afe68d 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -429,6 +429,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 4.14 029/108] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

'mem=" option is an easy way to put high pressure on memory during
some test. Hence after applying the memory limit, instead of total
mem, the actual usable memory should be considered when reserving mem
for crashkernel. Otherwise the boot up may experience OOM issue.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.

This issue is powerpc specific because it puts higher priority on
fadump and kdump reservation than on "mem=". Refer to the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

On other arches, by contrast, "mem=" takes effect with higher priority
and is already reflected in memblock_phys_mem_size() before
reserve_crashkernel() is called.
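
A toy model of the effect, written for this note in plain C (it is not
kernel code; the range table is copied from the example above and the
range matching is simplified):

#include <stdio.h>

struct ck_range { unsigned long long start, end, size; };

/* Return the reservation size of the first range containing "mem";
 * end == 0 marks an open-ended range such as "128G-:4G". */
static unsigned long long pick(const struct ck_range *r, int n,
                               unsigned long long mem)
{
        for (int i = 0; i < n; i++)
                if (mem >= r[i].start && (r[i].end == 0 || mem < r[i].end))
                        return r[i].size;
        return 0;
}

int main(void)
{
        const unsigned long long G = 1ULL << 30, M = 1ULL << 20;
        const struct ck_range ranges[] = {
                {   2 * G,   4 * G, 384 * M },
                {   4 * G,  16 * G, 512 * M },
                {  16 * G,  64 * G,   1 * G },
                {  64 * G, 128 * G,   2 * G },
                { 128 * G,       0,   4 * G },
        };

        printf("256G total -> %lluM\n", pick(ranges, 5, 256 * G) / M);
        printf("mem=5G     -> %lluM\n", pick(ranges, 5, 5 * G) / M);
        return 0;
}

Matching against the full 256G picks the 4G reservation, while matching
against the 5G limit picks 512M, which is the difference described in the
example above.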

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/machine_kexec.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c 
b/arch/powerpc/kernel/machine_kexec.c
index 9dafd7af39b8..cb4d6cd949fc 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -113,11 +113,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -176,6 +177,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -184,7 +186,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 4.14 027/108] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit b4ac18eead28611ff470d0f47a35c4e0ac080d9c ]

Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values") changed the code to print the _change_ in the counter value
rather than the raw value for 24x7 counters. In the case of transactions,
the event count is set to 0 at the beginning of the transaction. It also
sets the event's prev_count to the raw value at the time of
initialization. Because the event count is set to 0, we see some weird
behaviour whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704    120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704  5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733  8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733 10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884 56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

These large values show up when using -I.

Because event_count is set to 0, in the interval case the overall
event_count does not increase monotonically: a new delta can be smaller
than the previous count. When the intervals are printed, the subtraction
goes negative, which shows up as these huge values.

This patch removes the part of 'h_24x7_event_read' that sets event_count
to 0. There won't be much impact, since event->hw.prev_count is already
set to the raw value at initialization time in order to print the change
value.
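
A one-line illustration, written for this note in plain C (not perf code),
of where the huge numbers come from: perf prints each interval as the
current total minus the previous total, so a running count that was reset
to 0 mid-stream makes the unsigned subtraction wrap to a value close to
2^64:

#include <stdio.h>

int main(void)
{
        /* total printed last interval vs. total after a reset to 0 */
        unsigned long long prev_total = 120, reset_total = 8;

        printf("%llu\n", reset_total - prev_total);
        return 0;
}

This prints 18446744073709551504, the same order of magnitude as the
values in the broken output above.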

With this patch, on a Power9 platform:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900    131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920    204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu 
Signed-off-by: Kajol Jain 
Tested-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-2-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 72238eedc360..2bb798918483 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1413,16 +1413,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = &get_cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(&event->count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.25.1



[PATCH AUTOSEL 4.14 021/108] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.
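
A standalone snippet, written for this note and modelled loosely on the
get_max_segment_size() arithmetic described above (it is not copied from
the block layer), showing the wrap-around:

#include <stdio.h>

static unsigned long max_seg_bytes(unsigned long mask, unsigned long offset,
                                   unsigned long max_segment_size)
{
        /* bytes left before the segment boundary; wraps to 0 when mask == -1UL */
        unsigned long to_boundary = mask - (mask & offset) + 1;

        return to_boundary < max_segment_size ? to_boundary : max_segment_size;
}

int main(void)
{
        printf("%lu\n", max_seg_bytes(-1UL, 0, 65536));     /* 0: the bug */
        printf("%lu\n", max_seg_bytes(0xffffUL, 0, 65536)); /* 65536: a sane 64K boundary */
        return 0;
}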

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index 075662f2cf46..d20f66d57804 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -468,7 +468,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
blk_queue_bounce_limit(queue, BLK_BOUNCE_HIGH);
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 4.19 133/172] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the
Back-End fails, the function forgets to decrease the refcnt taken by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when configuring the DMA
channel fails.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index 1033ac6631b0..b9ac448989ed 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -241,6 +241,7 @@ static int fsl_asrc_dma_hw_params(struct snd_pcm_substream 
*substream,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 4.19 131/172] powerpc/4xx: Don't unmap NULL mbase

2020-06-17 Thread Sasha Levin
From: huhai 

[ Upstream commit bcec081ecc940fc38730b29c743bbee661164161 ]

Signed-off-by: huhai 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200521072648.1254699-1-...@ellerman.id.au
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/4xx/pci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/pci.c b/arch/powerpc/platforms/4xx/pci.c
index 5aca523551ae..2f237027fdcc 100644
--- a/arch/powerpc/platforms/4xx/pci.c
+++ b/arch/powerpc/platforms/4xx/pci.c
@@ -1242,7 +1242,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
if (mbase == NULL) {
printk(KERN_ERR "%pOF: Can't map internal config space !",
port->node);
-   goto done;
+   return;
}
 
while (attempt && (0 == (in_le32(mbase + PECFG_460SX_DLLSTA)
@@ -1252,9 +1252,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
}
if (attempt)
port->link = 1;
-done:
iounmap(mbase);
-
 }
 
 static struct ppc4xx_pciex_hwops ppc460sx_pcie_hwops __initdata = {
-- 
2.25.1



[PATCH AUTOSEL 4.19 124/172] KVM: PPC: Book3S HV: Ignore kmemleak false positives

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit 0aca8a5575544bd21b3363058afb8f1e81505150 ]

kvmppc_pmd_alloc() and kvmppc_pte_alloc() allocate some memory but then
pud_populate() and pmd_populate() will use __pa() to reference the newly
allocated memory.

Since kmemleak cannot track references held as physical addresses, it
reports these objects as leaked; silence those false positives by using
kmemleak_ignore().
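
The reason kmemleak cannot see these objects is that it scans memory for
the objects' virtual addresses, while pud_populate()/pmd_populate() only
store the physical address obtained from __pa(). A kernel-style sketch of
the resulting allocation pattern (the function name is made up for this
note; the calls are the ones used in the diff below):

#include <linux/kmemleak.h>
#include <linux/slab.h>

/* Allocate an object that will only ever be referenced via __pa() and
 * tell kmemleak not to report it as leaked. */
static void *alloc_pa_referenced(struct kmem_cache *cache)
{
        void *obj = kmem_cache_alloc(cache, GFP_KERNEL);

        if (obj)
                kmemleak_ignore(obj);

        return obj;
}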

unreferenced object 0xc000201c382a1000 (size 4096):
 comm "qemu-kvm", pid 124828, jiffies 4295733767 (age 341.250s)
 hex dump (first 32 bytes):
   c0 00 20 09 f4 60 03 87 c0 00 20 10 72 a0 03 87  .. ..` .r...
   c0 00 20 0e 13 a0 03 87 c0 00 20 1b dc c0 03 87  .. ... .
 backtrace:
   [<4cc2790f>] kvmppc_create_pte+0x838/0xd20 [kvm_hv]
   kvmppc_pmd_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:366
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:590
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278
unreferenced object 0xc0002001f0c03900 (size 256):
 comm "qemu-kvm", pid 124830, jiffies 4295735235 (age 326.570s)
 hex dump (first 32 bytes):
   c0 00 20 10 fa a0 03 87 c0 00 20 10 fa a1 03 87  .. ... .
   c0 00 20 10 fa a2 03 87 c0 00 20 10 fa a3 03 87  .. ... .
 backtrace:
   [<23f675b8>] kvmppc_create_pte+0x854/0xd20 [kvm_hv]
   kvmppc_pte_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:356
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:593
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278

Signed-off-by: Qian Cai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 998f8d089ac7..df0f08cb821b 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -171,7 +171,13 @@ static struct kmem_cache *kvm_pmd_cache;
 
 static pte_t *kvmppc_pte_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   pte_t *pte;
+
+   pte = kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   /* pmd_populate() will only reference _pa(pte). */
+   kmemleak_ignore(pte);
+
+   return pte;
 }
 
 static void kvmppc_pte_free(pte_t *ptep)
@@ -187,7 +193,13 @@ static inline int pmd_is_leaf(pmd_t pmd)
 
 static pmd_t *kvmppc_pmd_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   pmd_t *pmd;
+
+   pmd = kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   /* pud_populate() will only reference _pa(pmd). */
+   kmemleak_ignore(pmd);
+
+   return pmd;
 }
 
 static void kvmppc_pmd_free(pmd_t *pmdp)
-- 
2.25.1



[PATCH AUTOSEL 4.19 106/172] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c2e929b18cea6cbf71364f22d742d9aad7f4677a ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.
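
A standalone illustration, written for this note (the 0x1ff mask is made
up; build with -fsanitize=shift to get a report like the one below), of
why the macro form is undefined while the static inline is not: the
literal 0 has type int, and shifting a 32-bit int by 34 is undefined,
whereas the function parameter is an unsigned long:

#include <stdio.h>

#define INDEX_MACRO(address, shift)     (((address) >> (shift)) & 0x1ff)

static inline unsigned long index_fn(unsigned long address, int shift)
{
        return (address >> shift) & 0x1ff;
}

int main(void)
{
        int shift = 34;

        /* undefined: 0 is an int, so the shift exceeds the type width */
        printf("%lu\n", (unsigned long)INDEX_MACRO(0, shift));
        /* well defined: 0 converts to unsigned long before the shift */
        printf("%lu\n", index_fn(0, shift));
        return 0;
}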

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74

Suggested-by: Christophe Leroy 
Signed-off-by: Qian Cai 
Signed-off-by: Michael Ellerman 
Reviewed-by: Christophe Leroy 
Link: https://lore.kernel.org/r/20200306044852.3236-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 23 
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 2aea6efc2e63..008eb63fa851 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -985,10 +985,25 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pud_page_vaddr(pud)__va(pud_val(pud) & ~PUD_MASKED_BITS)
 #define pgd_page_vaddr(pgd)__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
-#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
-#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+static inline unsigned long pgd_index(unsigned long address)
+{
+   return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
+}
+
+static inline unsigned long pud_index(unsigned long address)
+{
+   return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
+}
+
+static inline unsigned long pmd_index(unsigned long address)
+{
+   return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
+}
+
+static inline unsigned long pte_index(unsigned long address)
+{
+   return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
-- 
2.25.1



[PATCH AUTOSEL 4.19 094/172] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change builds with 'DEBUG' defined will not cause kexec
reboots to hang, and builds with 'DEBUG' defined or not will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index 8c7009d001d9..894f62d77a77 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -212,13 +212,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -316,19 +317,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.19 093/172] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area can not
be used for a 16 byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 851ce326874a..e81a285f3a6c 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -328,10 +328,11 @@ static irqreturn_t ras_error_interrupt(int irq, void 
*dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 static inline struct rtas_error_log *fwnmi_get_errlog(void)
 {
-- 
2.25.1



[PATCH AUTOSEL 4.19 088/172] powerpc/64: Don't initialise init_task->thread.regs

2020-06-17 Thread Sasha Levin
From: Michael Ellerman 

[ Upstream commit 7ffa8b7dc11752827329e4e84a574ea6aaf24716 ]

Aneesh increased the size of struct pt_regs by 16 bytes and started
seeing this WARN_ON:

  smp: Bringing up secondary CPUs ...
  [ cut here ]
  WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/process.c:455 
giveup_all+0xb4/0x110
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+ #318
  NIP:  c001a2b4 LR: c001a29c CTR: c31d
  REGS: c26d3980 TRAP: 0700   Not tainted  
(5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+)
  MSR:  8282b033   CR: 48048224  XER: 

  CFAR: c0019cc8 IRQMASK: 1
  GPR00: c001a264 c26d3c20 c26d7200 8280b033
  GPR04: 0001  0077 30206d7372203164
  GPR08: 2000 02002000 8280b033 3230303030303030
  GPR12: 8800 c31d 00800050 0266
  GPR16: 0309a1a0 0309a4b0 0309a2d8 0309a890
  GPR20: 030d0098 c264da40 fd62 c000ff798080
  GPR24: c264edf0 c001007469f0 fd62 c20e5e90
  GPR28: c264edf0 c264d200 1db6 c264d200
  NIP [c001a2b4] giveup_all+0xb4/0x110
  LR [c001a29c] giveup_all+0x9c/0x110
  Call Trace:
  [c26d3c20] [c001a264] giveup_all+0x64/0x110 (unreliable)
  [c26d3c90] [c001ae34] __switch_to+0x104/0x480
  [c26d3cf0] [c0e0b8a0] __schedule+0x320/0x970
  [c26d3dd0] [c0e0c518] schedule_idle+0x38/0x70
  [c26d3df0] [c019c7c8] do_idle+0x248/0x3f0
  [c26d3e70] [c019cbb8] cpu_startup_entry+0x38/0x40
  [c26d3ea0] [c0011bb0] rest_init+0xe0/0xf8
  [c26d3ed0] [c2004820] start_kernel+0x990/0x9e0
  [c26d3f90] [c000c49c] start_here_common+0x1c/0x400

Which was unexpected. The warning is checking the thread.regs->msr
value of the task we are switching from:

  usermsr = tsk->thread.regs->msr;
  ...
  WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));

ie. if MSR_VSX is set then both of MSR_FP and MSR_VEC are also set.

Dumping tsk->thread.regs->msr we see that it's: 0x1db6

Which is not a normal looking MSR, in fact the only valid bit is
MSR_VSX, all the other bits are reserved in the current definition of
the MSR.

We can see from the oops that it was swapper/0 that we were switching
from when we hit the warning, ie. init_task. So its thread.regs points
to the base (high addresses) in init_stack.

Dumping the content of init_task->thread.regs, with the members of
pt_regs annotated (the 16 bytes larger version), we see:

   c2780080gpr[0] gpr[1]
   c2666008gpr[2] gpr[3]
  c26d3ed0 0078gpr[4] gpr[5]
  c0011b68 c2780080gpr[6] gpr[7]
   gpr[8] gpr[9]
  c26d3f90 80002200gpr[10]gpr[11]
  c2004820 c26d7200gpr[12]gpr[13]
  1db6 c10aabe8gpr[14]gpr[15]
  c10aabe8 c10aabe8gpr[16]gpr[17]
  c294d598 gpr[18]gpr[19]
   1ff8gpr[20]gpr[21]
   c206d608gpr[22]gpr[23]
  c278e0cc gpr[24]gpr[25]
  2fff c000gpr[26]gpr[27]
  0200 0028gpr[28]gpr[29]
  1db6 0475gpr[30]gpr[31]
  0200 1db6nipmsr
   orig_r3ctr
  c000c49c link   xer
   ccrsofte
   trap   dar
   dsisr  result
   pprkuap
   pad[2] pad[3]

This looks suspiciously like stack frames, not a pt_regs. If we look
closely we can see return addresses from the stack trace above,
c2004820 (start_kernel) and c000c49c (start_here_common).

init_task->thread.regs is setup at build time in processor.h:

  #define INIT_THREAD  { \
.ksp = INIT_SP, \
.regs = (struct pt_regs *)INIT_SP - 1, /* XXX bogus, I think */ \

The early boot code where we setup the initial stack is:

  LOAD_REG_ADDR(r3,init_thread_union)

  /* set up a stack pointer */
  LOAD_REG_IMMEDIATE(r1,THREAD_SIZE)
  add   r1,r3,r1
  lir0,0
  stdu  r0,-STACK_FRAME_OVERHEAD(r1)

Which creates a stack frame of size 112 bytes (STACK_FRAME_OVERHEAD).
Which is far too small to contain a pt_regs.

So the result is init_task->thread.regs is pointing at

[PATCH AUTOSEL 4.19 068/172] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, it sets tty->driver_data to NULL, and the
parallel hvc_open() can see this NULL and cause a memory abort. Hence,
serialize hvc_open() and check whether tty->driver_data is NULL before
proceeding.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index cdcc64ea2554..f8e43a6faea9 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_MUTEX(hvc_structs_mutex);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct 
tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -383,6 +393,8 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 4.19 049/172] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index b99ded6b9e0b..036508a4bd5a 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -429,6 +429,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 4.19 038/172] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

'mem=" option is an easy way to put high pressure on memory during
some test. Hence after applying the memory limit, instead of total
mem, the actual usable memory should be considered when reserving mem
for crashkernel. Otherwise the boot up may experience OOM issue.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.

This issue is powerpc specific because it puts higher priority on
fadump and kdump reservation than on "mem=". Refer to the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

On other arches, by contrast, "mem=" takes effect with higher priority
and is already reflected in memblock_phys_mem_size() before
reserve_crashkernel() is called.

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/machine_kexec.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c 
b/arch/powerpc/kernel/machine_kexec.c
index 63f5a9311a29..094c37fb07a9 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -116,11 +116,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -179,6 +180,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -187,7 +189,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 4.19 035/172] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit b4ac18eead28611ff470d0f47a35c4e0ac080d9c ]

Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values") changed the code to print the _change_ in the counter value
rather than the raw value for 24x7 counters. In the case of transactions,
the event count is set to 0 at the beginning of the transaction. It also
sets the event's prev_count to the raw value at the time of
initialization. Because the event count is set to 0, we see some weird
behaviour whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704    120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704  5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733  8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733 10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884 56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

These large values show up when using -I.

Because event_count is set to 0, in the interval case the overall
event_count does not increase monotonically: a new delta can be smaller
than the previous count. When the intervals are printed, the subtraction
goes negative, which shows up as these huge values.

This patch removes the part of 'h_24x7_event_read' that sets event_count
to 0. There won't be much impact, since event->hw.prev_count is already
set to the raw value at initialization time in order to print the change
value.

With this patch, on a Power9 platform:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900    131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920    204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu 
Signed-off-by: Kajol Jain 
Tested-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-2-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 72238eedc360..2bb798918483 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1413,16 +1413,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = &get_cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(&event->count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.25.1



[PATCH AUTOSEL 4.19 027/172] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index afe1508d82c6..bd1c66f5631a 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -466,7 +466,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
queue->queuedata = dev;
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 209/266] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the
Back-End fails, the function forgets to decrease the refcnt taken by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when configuring the DMA
channel fails.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index 01052a0808b0..5aee6b8366d2 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -241,6 +241,7 @@ static int fsl_asrc_dma_hw_params(struct snd_pcm_substream 
*substream,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 207/266] powerpc/4xx: Don't unmap NULL mbase

2020-06-17 Thread Sasha Levin
From: huhai 

[ Upstream commit bcec081ecc940fc38730b29c743bbee661164161 ]

Signed-off-by: huhai 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200521072648.1254699-1-...@ellerman.id.au
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/4xx/pci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/pci.c b/arch/powerpc/platforms/4xx/pci.c
index e6e2adcc7b64..c13d64c3b019 100644
--- a/arch/powerpc/platforms/4xx/pci.c
+++ b/arch/powerpc/platforms/4xx/pci.c
@@ -1242,7 +1242,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
if (mbase == NULL) {
printk(KERN_ERR "%pOF: Can't map internal config space !",
port->node);
-   goto done;
+   return;
}
 
while (attempt && (0 == (in_le32(mbase + PECFG_460SX_DLLSTA)
@@ -1252,9 +1252,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
}
if (attempt)
port->link = 1;
-done:
iounmap(mbase);
-
 }
 
 static struct ppc4xx_pciex_hwops ppc460sx_pcie_hwops __initdata = {
-- 
2.25.1



[PATCH AUTOSEL 5.4 196/266] KVM: PPC: Book3S: Fix some RCU-list locks

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit ab8b65be183180c3eef405d449163964ecc4b571 ]

It is unsafe to traverse kvm->arch.spapr_tce_tables and
stt->iommu_tables without the RCU read lock held. Also, add
cond_resched_rcu() in places with the RCU read lock held that could take
a while to finish.
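
The locking pattern the patch adds, reduced to a sketch (the struct and
list here are placeholders invented for this note, not the KVM types; the
RCU primitives are the ones used in the diff below):

#include <linux/rculist.h>
#include <linux/sched.h>

struct item {
        struct list_head list;
};

static void walk_items(struct list_head *head)
{
        struct item *it;

        rcu_read_lock();
        list_for_each_entry_rcu(it, head, list) {
                /* ... potentially slow per-entry work ... */
                cond_resched_rcu();     /* briefly drops the read-side lock */
        }
        rcu_read_unlock();
}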

 arch/powerpc/kvm/book3s_64_vio.c:76 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 no locks held by qemu-kvm/4265.

 stack backtrace:
 CPU: 96 PID: 4265 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508+ #2
 Call Trace:
 [c000201a8690f720] [c0715948] dump_stack+0xfc/0x174 (unreliable)
 [c000201a8690f770] [c01d9470] lockdep_rcu_suspicious+0x140/0x164
 [c000201a8690f7f0] [c00810b9fb48] 
kvm_spapr_tce_release_iommu_group+0x1f0/0x220 [kvm]
 [c000201a8690f870] [c00810b8462c] 
kvm_spapr_tce_release_vfio_group+0x54/0xb0 [kvm]
 [c000201a8690f8a0] [c00810b84710] kvm_vfio_destroy+0x88/0x140 [kvm]
 [c000201a8690f8f0] [c00810b7d488] kvm_put_kvm+0x370/0x600 [kvm]
 [c000201a8690f990] [c00810b7e3c0] kvm_vm_release+0x38/0x60 [kvm]
 [c000201a8690f9c0] [c05223f4] __fput+0x124/0x330
 [c000201a8690fa20] [c0151cd8] task_work_run+0xb8/0x130
 [c000201a8690fa70] [c01197e8] do_exit+0x4e8/0xfa0
 [c000201a8690fb70] [c011a374] do_group_exit+0x64/0xd0
 [c000201a8690fbb0] [c0132c90] get_signal+0x1f0/0x1200
 [c000201a8690fcc0] [c0020690] do_notify_resume+0x130/0x3c0
 [c000201a8690fda0] [c0038d64] syscall_exit_prepare+0x1a4/0x280
 [c000201a8690fe20] [c000c8f8] system_call_common+0xf8/0x278

 
 arch/powerpc/kvm/book3s_64_vio.c:368 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 2 locks held by qemu-kvm/4264:
  #0: c000201ae2d000d8 (&vcpu->mutex){+.+.}-{3:3}, at: 
kvm_vcpu_ioctl+0xdc/0x950 [kvm]
  #1: c000200c9ed0c468 (&kvm->srcu){}-{0:0}, at: 
kvmppc_h_put_tce+0x88/0x340 [kvm]

 
 arch/powerpc/kvm/book3s_64_vio.c:108 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by qemu-kvm/4257:
  #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: 
kvm_vfio_set_attr+0x598/0x6c0 [kvm]

 
 arch/powerpc/kvm/book3s_64_vio.c:146 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by qemu-kvm/4257:
  #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: 
kvm_vfio_set_attr+0x598/0x6c0 [kvm]

Signed-off-by: Qian Cai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_64_vio.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 5834db0a54c6..03b947429e4d 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -74,6 +74,7 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
struct kvmppc_spapr_tce_iommu_table *stit, *tmp;
struct iommu_table_group *table_group = NULL;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
 
table_group = iommu_group_get_iommudata(grp);
@@ -88,7 +89,9 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
kref_put(&stit->kref, kvm_spapr_tce_liobn_put);
}
}
+   cond_resched_rcu();
}
+   rcu_read_unlock();
 }
 
 extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
@@ -106,12 +109,14 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!f.file)
return -EBADF;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
if (stt == f.file->private_data) {
found = true;
break;
}
}
+   rcu_read_unlock();
 
fdput(f);
 
@@ -144,6 +149,7 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!tbl)
return -EINVAL;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stit, &stt->iommu_tables, next) {
if (tbl != stit->tbl)
continue;
@@ -151,14 +157,17 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!kref_get_unless_zero(&stit->kref)) {
/* stit is being destroyed */
iommu_tce_table_put(tbl);
+   rcu_read_unlock();
return -ENOTTY;
}
/*
 * The table is already known to this KVM, we just increased
 * its KVM reference count

[PATCH AUTOSEL 5.4 195/266] KVM: PPC: Book3S HV: Ignore kmemleak false positives

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit 0aca8a5575544bd21b3363058afb8f1e81505150 ]

kvmppc_pmd_alloc() and kvmppc_pte_alloc() allocate some memory but then
pud_populate() and pmd_populate() will use __pa() to reference the newly
allocated memory.

Since kmemleak cannot track references held as physical addresses, it
reports these objects as leaked; silence those false positives by using
kmemleak_ignore().

unreferenced object 0xc000201c382a1000 (size 4096):
 comm "qemu-kvm", pid 124828, jiffies 4295733767 (age 341.250s)
 hex dump (first 32 bytes):
   c0 00 20 09 f4 60 03 87 c0 00 20 10 72 a0 03 87  .. ..` .r...
   c0 00 20 0e 13 a0 03 87 c0 00 20 1b dc c0 03 87  .. ... .
 backtrace:
   [<4cc2790f>] kvmppc_create_pte+0x838/0xd20 [kvm_hv]
   kvmppc_pmd_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:366
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:590
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278
unreferenced object 0xc0002001f0c03900 (size 256):
 comm "qemu-kvm", pid 124830, jiffies 4295735235 (age 326.570s)
 hex dump (first 32 bytes):
   c0 00 20 10 fa a0 03 87 c0 00 20 10 fa a1 03 87  .. ... .
   c0 00 20 10 fa a2 03 87 c0 00 20 10 fa a3 03 87  .. ... .
 backtrace:
   [<23f675b8>] kvmppc_create_pte+0x854/0xd20 [kvm_hv]
   kvmppc_pte_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:356
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:593
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278

Signed-off-by: Qian Cai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index 2d415c36a61d..43b56f8f6beb 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -353,7 +353,13 @@ static struct kmem_cache *kvm_pmd_cache;
 
 static pte_t *kvmppc_pte_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   pte_t *pte;
+
+   pte = kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   /* pmd_populate() will only reference _pa(pte). */
+   kmemleak_ignore(pte);
+
+   return pte;
 }
 
 static void kvmppc_pte_free(pte_t *ptep)
@@ -363,7 +369,13 @@ static void kvmppc_pte_free(pte_t *ptep)
 
 static pmd_t *kvmppc_pmd_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   pmd_t *pmd;
+
+   pmd = kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   /* pud_populate() will only reference _pa(pmd). */
+   kmemleak_ignore(pmd);
+
+   return pmd;
 }
 
 static void kvmppc_pmd_free(pmd_t *pmdp)
-- 
2.25.1



[PATCH AUTOSEL 5.4 189/266] powerpc/32s: Don't warn when mapping RO data ROX.

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit 4b19f96a81bceaf0bcf44d79c0855c61158065ec ]

Mapping RO data as ROX is not an issue since that data
cannot be modified to introduce an exploit.

PPC64 accepts having RO data mapped ROX, as a trade-off between
kernel size and strictness of protection.

On PPC32, kernel size is even more critical, as the amount of
memory is usually small.

Depending on the number of available IBATs, the last IBAT might
extend beyond the end of text. Only warn if it crosses the end of
RO data.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/6499f8eeb2a36330e5c9fc1cee9a79374875bd54.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/book3s32/mmu.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 84d5fab94f8f..1424a120710e 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -187,6 +187,7 @@ void mmu_mark_initmem_nx(void)
int i;
unsigned long base = (unsigned long)_stext - PAGE_OFFSET;
unsigned long top = (unsigned long)_etext - PAGE_OFFSET;
+   unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
unsigned long size;
 
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
@@ -201,9 +202,10 @@ void mmu_mark_initmem_nx(void)
size = block_size(base, top);
size = max(size, 128UL << 10);
if ((top - base) > size) {
-   if (strict_kernel_rwx_enabled())
-   pr_warn("Kernel _etext not properly aligned\n");
size <<= 1;
+   if (strict_kernel_rwx_enabled() && base + size > border)
+   pr_warn("Some RW data is getting mapped X. "
+   "Adjust CONFIG_DATA_SHIFT to avoid that.\n");
}
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
-- 
2.25.1



[PATCH AUTOSEL 5.4 171/266] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c2e929b18cea6cbf71364f22d742d9aad7f4677a ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74
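
As a standalone sketch of the difference (not from the patch; PUD_SHIFT and
PTRS_PER_PUD are assumed values chosen to match the 34-bit shift reported
above, on an LP64 system):

  #include <stdio.h>

  #define PUD_SHIFT       34
  #define PTRS_PER_PUD    64

  /*
   * Macro form: pud_index_macro(0) would shift the *int* literal 0 by
   * 34 bits, which is undefined because 34 >= the width of int. This is
   * what UBSAN flags above.
   */
  #define pud_index_macro(address) \
          (((address) >> PUD_SHIFT) & (PTRS_PER_PUD - 1))

  /*
   * Inline function form: the argument is converted to unsigned long
   * (64-bit here), so a shift by 34 is well defined.
   */
  static inline unsigned long pud_index_inline(unsigned long address)
  {
          return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
  }

  int main(void)
  {
          printf("%lu\n", pud_index_inline(0));   /* 0, no UBSAN report */
          return 0;
  }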

Suggested-by: Christophe Leroy 
Signed-off-by: Qian Cai 
Signed-off-by: Michael Ellerman 
Reviewed-by: Christophe Leroy 
Link: https://lore.kernel.org/r/20200306044852.3236-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 23 
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index a143d394ff46..e1eb8aa9cfbb 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -998,10 +998,25 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pud_page_vaddr(pud)__va(pud_val(pud) & ~PUD_MASKED_BITS)
 #define pgd_page_vaddr(pgd)__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
-#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
-#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+static inline unsigned long pgd_index(unsigned long address)
+{
+   return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
+}
+
+static inline unsigned long pud_index(unsigned long address)
+{
+   return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
+}
+
+static inline unsigned long pmd_index(unsigned long address)
+{
+   return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
+}
+
+static inline unsigned long pte_index(unsigned long address)
+{
+   return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
-- 
2.25.1



[PATCH AUTOSEL 5.4 153/266] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change, builds with 'DEBUG' defined will no longer cause kexec
reboots to hang, and builds with or without 'DEBUG' defined will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index 423be34f0f5f..f42fe4e86ce5 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -200,13 +200,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -304,19 +305,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 151/266] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered while developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area cannot
be used for a 16-byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.
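
A standalone sketch of the off-by-one (editorial illustration; rtas base and
size are made up, not from the patch): a 16-byte save area that ends exactly
at the end of the rtas region was rejected by the old check and is accepted
by the new one.

  #include <stdio.h>

  int main(void)
  {
          unsigned long rtas_base = 0x20000, rtas_size = 0x1000;
          unsigned long a = rtas_base + rtas_size - 16; /* last 16-byte slot */

          int old_ok = (a >= rtas_base) && (a <  rtas_base + rtas_size - 16);
          int new_ok = (a >= rtas_base) && (a <= rtas_base + rtas_size - 16);

          printf("old: %d, new: %d\n", old_ok, new_ok); /* old: 0, new: 1 */
          return 0;
  }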

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 753adeb624f2..13ef77fd648f 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -395,10 +395,11 @@ static irqreturn_t ras_error_interrupt(int irq, void 
*dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 static inline struct rtas_error_log *fwnmi_get_errlog(void)
 {
-- 
2.25.1



[PATCH AUTOSEL 5.4 150/266] powerpc/64s/exception: Fix machine check no-loss idle wakeup

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit 8a5054d8cbbe03c68dcb0957c291c942132e4101 ]

The architecture allows machine check exceptions to cause idle
wakeups which resume at the 0x200 address and have to return via
the idle wakeup code, but the early machine check handler is run
first.

The case of a no-state-loss sleep is broken because the early
handler uses the non-volatile register r1, which is needed for the
wakeup protocol, but it is not restored.

Fix this by loading r1 from the MCE exception frame before returning
to the idle wakeup code. Also update the comment which has become
stale since the idle rewrite in C.

This crash was found and fix confirmed with a machine check injection
test in qemu powernv model (which is not upstream in qemu yet).

Fixes: 10d91611f426d ("powerpc/64s: Reimplement book3s idle code in C")
Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200508043408.886394-2-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/exceptions-64s.S | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index d0018dd17e0a..70ac8a6ba0c1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1090,17 +1090,19 @@ EXC_COMMON_BEGIN(machine_check_idle_common)
bl  machine_check_queue_event
 
/*
-* We have not used any non-volatile GPRs here, and as a rule
-* most exception code including machine check does not.
-* Therefore PACA_NAPSTATELOST does not need to be set. Idle
-* wakeup will restore volatile registers.
+* GPR-loss wakeups are relatively straightforward, because the
+* idle sleep code has saved all non-volatile registers on its
+* own stack, and r1 in PACAR1.
 *
-* Load the original SRR1 into r3 for pnv_powersave_wakeup_mce.
+* For no-loss wakeups the r1 and lr registers used by the
+* early machine check handler have to be restored first. r2 is
+* the kernel TOC, so no need to restore it.
 *
 * Then decrement MCE nesting after finishing with the stack.
 */
ld  r3,_MSR(r1)
ld  r4,_LINK(r1)
+   ld  r1,GPR1(r1)
 
lhz r11,PACA_IN_MCE(r13)
subir11,r11,1
@@ -1109,7 +,7 @@ EXC_COMMON_BEGIN(machine_check_idle_common)
mtlrr4
rlwinm  r10,r3,47-31,30,31
cmpwi   cr1,r10,2
-   bltlr   cr1 /* no state loss, return to idle caller */
+   bltlr   cr1 /* no state loss, return to idle caller with r3=SRR1 */
b   idle_return_gpr_loss
 #endif
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 142/266] powerpc/64: Don't initialise init_task->thread.regs

2020-06-17 Thread Sasha Levin
From: Michael Ellerman 

[ Upstream commit 7ffa8b7dc11752827329e4e84a574ea6aaf24716 ]

Aneesh increased the size of struct pt_regs by 16 bytes and started
seeing this WARN_ON:

  smp: Bringing up secondary CPUs ...
  [ cut here ]
  WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/process.c:455 
giveup_all+0xb4/0x110
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 
5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+ #318
  NIP:  c001a2b4 LR: c001a29c CTR: c31d
  REGS: c26d3980 TRAP: 0700   Not tainted  
(5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+)
  MSR:  8282b033   CR: 48048224  XER: 

  CFAR: c0019cc8 IRQMASK: 1
  GPR00: c001a264 c26d3c20 c26d7200 8280b033
  GPR04: 0001  0077 30206d7372203164
  GPR08: 2000 02002000 8280b033 3230303030303030
  GPR12: 8800 c31d 00800050 0266
  GPR16: 0309a1a0 0309a4b0 0309a2d8 0309a890
  GPR20: 030d0098 c264da40 fd62 c000ff798080
  GPR24: c264edf0 c001007469f0 fd62 c20e5e90
  GPR28: c264edf0 c264d200 1db6 c264d200
  NIP [c001a2b4] giveup_all+0xb4/0x110
  LR [c001a29c] giveup_all+0x9c/0x110
  Call Trace:
  [c26d3c20] [c001a264] giveup_all+0x64/0x110 (unreliable)
  [c26d3c90] [c001ae34] __switch_to+0x104/0x480
  [c26d3cf0] [c0e0b8a0] __schedule+0x320/0x970
  [c26d3dd0] [c0e0c518] schedule_idle+0x38/0x70
  [c26d3df0] [c019c7c8] do_idle+0x248/0x3f0
  [c26d3e70] [c019cbb8] cpu_startup_entry+0x38/0x40
  [c26d3ea0] [c0011bb0] rest_init+0xe0/0xf8
  [c26d3ed0] [c2004820] start_kernel+0x990/0x9e0
  [c26d3f90] [c000c49c] start_here_common+0x1c/0x400

Which was unexpected. The warning is checking the thread.regs->msr
value of the task we are switching from:

  usermsr = tsk->thread.regs->msr;
  ...
  WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));

ie. if MSR_VSX is set then both of MSR_FP and MSR_VEC are also set.

Dumping tsk->thread.regs->msr we see that it's: 0x1db6

Which is not a normal looking MSR, in fact the only valid bit is
MSR_VSX, all the other bits are reserved in the current definition of
the MSR.

We can see from the oops that it was swapper/0 that we were switching
from when we hit the warning, ie. init_task. So its thread.regs points
to the base (high addresses) in init_stack.

Dumping the content of init_task->thread.regs, with the members of
pt_regs annotated (the 16 bytes larger version), we see:

   c2780080gpr[0] gpr[1]
   c2666008gpr[2] gpr[3]
  c26d3ed0 0078gpr[4] gpr[5]
  c0011b68 c2780080gpr[6] gpr[7]
   gpr[8] gpr[9]
  c26d3f90 80002200gpr[10]gpr[11]
  c2004820 c26d7200gpr[12]gpr[13]
  1db6 c10aabe8gpr[14]gpr[15]
  c10aabe8 c10aabe8gpr[16]gpr[17]
  c294d598 gpr[18]gpr[19]
   1ff8gpr[20]gpr[21]
   c206d608gpr[22]gpr[23]
  c278e0cc gpr[24]gpr[25]
  2fff c000gpr[26]gpr[27]
  0200 0028gpr[28]gpr[29]
  1db6 0475gpr[30]gpr[31]
  0200 1db6nipmsr
   orig_r3ctr
  c000c49c link   xer
   ccrsofte
   trap   dar
   dsisr  result
   pprkuap
   pad[2] pad[3]

This looks suspiciously like stack frames, not a pt_regs. If we look
closely we can see return addresses from the stack trace above,
c2004820 (start_kernel) and c000c49c (start_here_common).

init_task->thread.regs is setup at build time in processor.h:

  #define INIT_THREAD  { \
.ksp = INIT_SP, \
.regs = (struct pt_regs *)INIT_SP - 1, /* XXX bogus, I think */ \

The early boot code where we setup the initial stack is:

  LOAD_REG_ADDR(r3,init_thread_union)

  /* set up a stack pointer */
  LOAD_REG_IMMEDIATE(r1,THREAD_SIZE)
  add   r1,r3,r1
  lir0,0
  stdu  r0,-STACK_FRAME_OVERHEAD(r1)

Which creates a stack frame of size 112 bytes (STACK_FRAME_OVERHEAD),
far too small to contain a pt_regs.

So the result is init_task->thread.regs is pointing at

Re: [PATCH] mm: Move p?d_alloc_track to separate header file

2020-06-17 Thread Stephen Rothwell
Hi Joerg,

Sorry for the late reply.

On Tue,  9 Jun 2020 14:05:33 +0200 Joerg Roedel  wrote:
>
> diff --git a/include/linux/pgalloc-track.h b/include/linux/pgalloc-track.h
> new file mode 100644
> index ..1dcc865029a2
> --- /dev/null
> +++ b/include/linux/pgalloc-track.h
> @@ -0,0 +1,51 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_PGALLLC_TRACK_H
> +#define _LINUX_PGALLLC_TRACK_H
> +

Maybe this could have a comment that it should always be included after
mm.h or we could add enough to make it standalone (even just an include
of mm.h would probably be enough).

-- 
Cheers,
Stephen Rothwell




[PATCH AUTOSEL 5.4 105/266] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, it sets tty->driver_data to NULL, and
the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open() and check whether tty->driver_data is NULL
before proceeding.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
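
A rough sketch of the interleaving being closed (editorial illustration,
not part of the patch):

  /*
   *   Task A: hvc_open()              Task B: hvc_open()
   *   hp = tty->driver_data
   *   hp->ops->notifier_add() fails
   *   tty->driver_data = NULL
   *                                   hp = tty->driver_data  (NULL)
   *                                   hp->port.lock          -> abort
   *
   * With hvc_open_mutex held across the whole open path, task B can no
   * longer observe the half-torn-down state, and the added NULL check
   * turns that case into -EIO instead of a crash.
   */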

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index cdcc64ea2554..f8e43a6faea9 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_MUTEX(hvc_structs_mutex);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct 
tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -383,6 +393,8 @@ static int hvc_open(struct tty_struct *tty, struct file * 
filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 082/266] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM, where we reenable the CRQ instead of
doing a full CRQ teardown and reset, we fail to refresh the client info in
the adapter info buffer. As a result, after Live Partition Migration (LPM)
we erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 59f0f1030c54..c5711c659b51 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -415,6 +415,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 5.4 068/266] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

The "mem=" option is an easy way to put high pressure on memory during
testing. Hence, after applying the memory limit, the actual usable memory
rather than the total memory should be considered when reserving memory
for crashkernel. Otherwise boot-up may run into OOM issues.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.
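
A standalone restatement of that example (editorial sketch of how the
crashkernel= ranges resolve; not kernel code, sizes as in the example above):

  #include <stdio.h>

  int main(void)
  {
          unsigned long long phys  = 256ULL << 30;  /* 256G machine */
          unsigned long long limit = 5ULL << 30;    /* mem=5G */
          unsigned long long total = limit ? limit : phys;

          /* crashkernel=2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G */
          unsigned long long reserve;
          if (total >= 128ULL << 30)      reserve = 4ULL << 30;
          else if (total >= 64ULL << 30)  reserve = 2ULL << 30;
          else if (total >= 16ULL << 30)  reserve = 1ULL << 30;
          else if (total >= 4ULL << 30)   reserve = 512ULL << 20;
          else if (total >= 2ULL << 30)   reserve = 384ULL << 20;
          else                            reserve = 0;

          /* 512M reserved with mem=5G; 4096M if total RAM were used instead */
          printf("%lluM reserved\n", reserve >> 20);
          return 0;
  }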

This issue is powerpc specific because it gives the fadump and kdump
reservations a higher priority than "mem=". See the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

On other arches, by contrast, "mem=" takes effect first and is already
reflected in memblock_phys_mem_size() before reserve_crashkernel() is
called.

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/machine_kexec.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/machine_kexec.c 
b/arch/powerpc/kernel/machine_kexec.c
index c4ed328a7b96..7a1c11a7cba5 100644
--- a/arch/powerpc/kernel/machine_kexec.c
+++ b/arch/powerpc/kernel/machine_kexec.c
@@ -114,11 +114,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -177,6 +178,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -185,7 +187,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 5.4 062/266] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit b4ac18eead28611ff470d0f47a35c4e0ac080d9c ]

Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values") changed 24x7 counters to print the _change_ in the counter
value rather than the raw value. In the case of transactions, the event
count is set to 0 at the beginning of the transaction. It also sets
the event's prev_count to the raw value at the time of initialization.
Because the event count is set to 0, we see some weird behaviour
whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704    120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704      5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733      8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733     10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884     56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

These large values show up when running with -I.

Because event_count is set to 0, in the interval case the overall
event_count no longer increases monotonically: the new delta can be
smaller than the previous count. When the intervals are printed this
produces a negative value, which shows up as these huge numbers.

This patch removes the part of 'h_24x7_event_read' that sets event_count
to 0. There won't be much impact, because event->hw.prev_count is already
set to the raw value at initialization time in order to print the change
in value.
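
A standalone sketch of the arithmetic (editorial illustration with made-up
raw counter values): perf's -I output prints the difference between
successive reads of the accumulated count, so resetting the count each
transaction makes that difference go "negative" and wrap.

  #include <stdio.h>

  int main(void)
  {
          unsigned long long raw[] = { 1000, 1120, 1128 }; /* hcall results */
          unsigned long long prev_count = 900;   /* set at event init */
          unsigned long long count, prev_print = 0;

          for (int i = 0; i < 3; i++) {
                  count = 0;                     /* old code: reset per txn */
                  count += raw[i] - prev_count;  /* delta added at commit */
                  prev_count = raw[i];
                  printf("%llu\n", count - prev_print); /* 100, 20, then huge */
                  prev_print = count;
          }
          return 0;
  }

Without the reset, count accumulates (100, 220, 228) and the printed
intervals stay sensible (100, 120, 8).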

With this patch
In power9 platform

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900    131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900      4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920    204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu 
Signed-off-by: Kajol Jain 
Tested-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-2-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 573e0b309c0c..48e8f4b17b91 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1400,16 +1400,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = &get_cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(&event->count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.25.1



[PATCH AUTOSEL 5.4 054/266] powerpc/ptdump: Add _PAGE_COHERENT flag

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit 3af4786eb429b2df76cbd7ce3bae21467ac3e4fb ]

For platforms using shared.c (4xx, Book3e, Book3s/32), also handle the
_PAGE_COHERENT flag which corresponds to the M bit of the WIMG flags.

Signed-off-by: Christophe Leroy 
[mpe: Make it more verbose, use "coherent" rather than "m"]
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/324c3d860717e8e91fca3bb6c0f8b23e1644a404.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/ptdump/shared.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index f7ed2f187cb0..784f8df17f73 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -30,6 +30,11 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_PRESENT,
.set= "present",
.clear  = "   ",
+   }, {
+   .mask   = _PAGE_COHERENT,
+   .val= _PAGE_COHERENT,
+   .set= "coherent",
+   .clear  = "",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
-- 
2.25.1



[PATCH AUTOSEL 5.4 044/266] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.
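
A standalone sketch of that overflow (editorial illustration; assumes a
64-bit unsigned long and that the default boundary mask is 0xffffffff):

  #include <stdio.h>

  int main(void)
  {
          unsigned long bad_mask = -1UL;         /* boundary set by ps3disk */
          unsigned long def_mask = 0xffffffffUL; /* default boundary mask   */
          unsigned long offset = 0;

          /* mirrors 'mask - (mask & offset) + 1' in get_max_segment_size() */
          printf("%lx\n", bad_mask - (bad_mask & offset) + 1); /* 0: wraps  */
          printf("%lx\n", def_mask - (def_mask & offset) + 1); /* 100000000 */
          return 0;
  }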

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index c5c6487a19d5..7b55811c2a81 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -454,7 +454,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
queue->queuedata = dev;
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 5.4 024/266] powerpc/kasan: Fix stack overflow by increasing THREAD_SHIFT

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit edbadaf0671072298e506074128b64e003c5812c ]

When CONFIG_KASAN is selected, the stack usage is increased.

In the same way as x86 and arm64 architectures, increase
THREAD_SHIFT when CONFIG_KASAN is selected.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Reported-by: 
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207129
Link: 
https://lore.kernel.org/r/2c50f3b1c9bbaa4217c9a98f3044bd2a36c46a4f.1586361277.git.christophe.le...@c-s.fr
Signed-off-by: Sasha Levin 
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 3dc5aecdd853..135d770e8e57 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -747,6 +747,7 @@ config THREAD_SHIFT
range 13 15
default "15" if PPC_256K_PAGES
default "14" if PPC64
+   default "14" if KASAN
default "13"
help
  Used to define the stack size. The default is almost always what you
-- 
2.25.1



[PATCH AUTOSEL 5.4 010/266] ASoC: fsl_esai: Disable exception interrupt before scheduling tasklet

2020-06-17 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit 1fecbb71fe0e46b886f84e3b6decca6643c3af6d ]

Disable the exception interrupt before scheduling the tasklet, otherwise
if the tasklet isn't handled immediately there will be an endless stream
of xrun interrupts.

Fixes: 7ccafa2b3879 ("ASoC: fsl_esai: recover the channel swap after xrun")
Signed-off-by: Shengjiu Wang 
Acked-by: Nicolin Chen 
Link: 
https://lore.kernel.org/r/a8f2ad955aac9e52587beedc1133b3efbe746895.1587968824.git.shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_esai.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index c7a49d03463a..84290be778f0 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -87,6 +87,10 @@ static irqreturn_t esai_isr(int irq, void *devid)
if ((saisr & (ESAI_SAISR_TUE | ESAI_SAISR_ROE)) &&
esai_priv->reset_at_xrun) {
dev_dbg(&pdev->dev, "reset module for xrun\n");
+   regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR,
+  ESAI_xCR_xEIE_MASK, 0);
+   regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR,
+  ESAI_xCR_xEIE_MASK, 0);
tasklet_schedule(&esai_priv->task);
}
 
-- 
2.25.1



[PATCH V3 (RESEND) 2/3] mm/sparsemem: Enable vmem_altmap support in vmemmap_alloc_block_buf()

2020-06-17 Thread Anshuman Khandual
There are many instances where the vmemmap allocation is switched between
regular memory and device memory just based on whether altmap is available
or not. vmemmap_alloc_block_buf() is used on various platforms to allocate
vmemmap mappings. Let's also enable it to handle altmap based device memory
allocation along with the existing regular memory allocations. This will
help in avoiding the altmap based allocation switch in many places.

While here, also implement a regular memory allocation fallback mechanism
for when the preferred device memory allocation fails. This preserves the
existing semantics on the powerpc platform. To summarize, there are three
different ways to call vmemmap_alloc_block_buf(), sketched below with the
full signature.

(., NULL,   false) /* Allocate from system RAM */
(., altmap, false) /* Allocate from altmap without any fallback */
(., altmap, true)  /* Allocate from altmap with fallback (system RAM) */
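
Restated as an editorial sketch with the full signature introduced by this
patch (size, node, altmap, sysram_fallback); the names are illustrative only:

  /* p = vmemmap_alloc_block_buf(size, node, NULL,   false);  system RAM only     */
  /* p = vmemmap_alloc_block_buf(size, node, altmap, false);  altmap, no fallback */
  /* p = vmemmap_alloc_block_buf(size, node, altmap, true);   altmap, then RAM    */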

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Cc: Andrew Morton 
Cc: x...@kernel.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org
Tested-by: Jia He 
Suggested-by: Robin Murphy 
Signed-off-by: Anshuman Khandual 
---
 arch/arm64/mm/mmu.c   |  3 ++-
 arch/powerpc/mm/init_64.c | 10 +-
 arch/x86/mm/init_64.c |  6 ++
 include/linux/mm.h|  3 ++-
 mm/sparse-vmemmap.c   | 30 --
 5 files changed, 35 insertions(+), 17 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index 0adad8859393..7ca21adb4412 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1100,7 +1100,8 @@ int __meminit vmemmap_populate(unsigned long start, 
unsigned long end, int node,
if (pmd_none(READ_ONCE(*pmdp))) {
void *p = NULL;
 
-   p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+   p = vmemmap_alloc_block_buf(PMD_SIZE, node,
+   NULL, false);
if (!p)
return -ENOMEM;
 
diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index bc73abf0bc25..01e25b56eccb 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -225,12 +225,12 @@ int __meminit vmemmap_populate(unsigned long start, 
unsigned long end, int node,
 * fall back to system memory if the altmap allocation fail.
 */
if (altmap && !altmap_cross_boundary(altmap, start, page_size)) 
{
-   p = altmap_alloc_block_buf(page_size, altmap);
-   if (!p)
-   pr_debug("altmap block allocation failed, 
falling back to system memory");
+   p = vmemmap_alloc_block_buf(page_size, node,
+   altmap, true);
+   } else {
+   p = vmemmap_alloc_block_buf(page_size, node,
+   NULL, false);
}
-   if (!p)
-   p = vmemmap_alloc_block_buf(page_size, node);
if (!p)
return -ENOMEM;
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 19c0ed3271a3..4ae4f767c004 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1463,10 +1463,8 @@ static int __meminit vmemmap_populate_hugepages(unsigned 
long start,
if (pmd_none(*pmd)) {
void *p;
 
-   if (altmap)
-   p = altmap_alloc_block_buf(PMD_SIZE, altmap);
-   else
-   p = vmemmap_alloc_block_buf(PMD_SIZE, node);
+   p = vmemmap_alloc_block_buf(PMD_SIZE, node,
+   altmap, false);
if (p) {
pte_t entry;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index e40ac543d248..dade7c3f634d 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3015,7 +3015,8 @@ pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long 
addr, int node,
struct vmem_altmap *altmap);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
-void *vmemmap_alloc_block_buf(unsigned long size, int node);
+void *vmemmap_alloc_block_buf(unsigned long size, int node,
+ struct vmem_altmap *altmap, bool sysram_fallback);
 void *altmap_alloc_block_buf(unsigned long size, struct vmem_altmap *altmap);
 void vmemmap_verify(pte_t *, int, unsigned long, unsigned long);
 int vmemmap_populate_basepages(unsigned long start, unsigned long 

[PATCH V3 (RESEND) 0/3] arm64: Enable vmemmap mapping from device memory

2020-06-17 Thread Anshuman Khandual
This series enables vmemmap backing memory allocation from device memory
ranges on arm64. But before that, it enables vmemmap_populate_basepages()
and vmemmap_alloc_block_buf() to accommodate struct vmem_altmap based
allocation requests.

This series applies on 5.8-rc1.

Pending Question:

altmap_alloc_block_buf() does not have any other remaining users in the
tree after this change. Should it be converted into a static function and
its declaration dropped from the header (include/linux/mm.h)? I avoided
doing so because I was not sure whether there are any off-tree users.

Changes in V3:

- Dropped comment from free_hotplug_page_range() per Robin
- Modified comment in unmap_hotplug_range() per Robin
- Enabled altmap support in vmemmap_alloc_block_buf() per Robin

Changes in V2: (https://lkml.org/lkml/2020/3/4/475)

- Rebased on latest hot-remove series (v14) adding P4D page table support

Changes in V1: (https://lkml.org/lkml/2020/1/23/12)

- Added an WARN_ON() in unmap_hotplug_range() when altmap is
  provided without the page table backing memory being freed

Changes in RFC V2: (https://lkml.org/lkml/2019/10/21/11)

- Changed the commit message on 1/2 patch per Will
- Changed the commit message on 2/2 patch as well
- Rebased on arm64 memory hot remove series (v10)

RFC V1: (https://lkml.org/lkml/2019/6/28/32)

Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Mark Rutland 
Cc: Paul Walmsley 
Cc: Palmer Dabbelt 
Cc: Tony Luck 
Cc: Fenghua Yu 
Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: David Hildenbrand 
Cc: Mike Rapoport 
Cc: Michal Hocko 
Cc: "Matthew Wilcox (Oracle)" 
Cc: "Kirill A. Shutemov" 
Cc: Andrew Morton 
Cc: Dan Williams 
Cc: Pavel Tatashin 
Cc: Benjamin Herrenschmidt 
Cc: Paul Mackerras 
Cc: Michael Ellerman 
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-i...@vger.kernel.org
Cc: linux-ri...@lists.infradead.org
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux...@kvack.org
Cc: linux-ker...@vger.kernel.org

Anshuman Khandual (3):
  mm/sparsemem: Enable vmem_altmap support in vmemmap_populate_basepages()
  mm/sparsemem: Enable vmem_altmap support in vmemmap_alloc_block_buf()
  arm64/mm: Enable vmem_altmap support for vmemmap mappings

 arch/arm64/mm/mmu.c   | 59 ++-
 arch/ia64/mm/discontig.c  |  2 +-
 arch/powerpc/mm/init_64.c | 10 +++
 arch/riscv/mm/init.c  |  2 +-
 arch/x86/mm/init_64.c | 12 
 include/linux/mm.h|  8 --
 mm/sparse-vmemmap.c   | 38 -
 7 files changed, 87 insertions(+), 44 deletions(-)

-- 
2.20.1



[PATCH AUTOSEL 5.7 306/388] ASoC: fsl_asrc_dma: Fix dma_chan leak when config DMA channel failed

2020-06-17 Thread Sasha Levin
From: Xiyu Yang 

[ Upstream commit 36124fb19f1ae68a500cd76a76d40c6e81bee346 ]

fsl_asrc_dma_hw_params() invokes dma_request_channel() or
fsl_asrc_get_dma_channel(), which returns a reference of the specified
dma_chan object to "pair->dma_chan[dir]" with increased refcnt.

The reference counting issue happens in one exception handling path of
fsl_asrc_dma_hw_params(). When configuring the DMA channel for the
Back-End fails, the function forgets to decrease the refcnt increased by
dma_request_channel() or fsl_asrc_get_dma_channel(), causing a refcnt
leak.

Fix this issue by calling dma_release_channel() when config DMA channel
failed.

Signed-off-by: Xiyu Yang 
Signed-off-by: Xin Tan 
Link: 
https://lore.kernel.org/r/1590415966-52416-1-git-send-email-xiyuyan...@fudan.edu.cn
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_asrc_dma.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/sound/soc/fsl/fsl_asrc_dma.c b/sound/soc/fsl/fsl_asrc_dma.c
index e7178817d7a7..1ee10eafe3e6 100644
--- a/sound/soc/fsl/fsl_asrc_dma.c
+++ b/sound/soc/fsl/fsl_asrc_dma.c
@@ -252,6 +252,7 @@ static int fsl_asrc_dma_hw_params(struct snd_soc_component 
*component,
ret = dmaengine_slave_config(pair->dma_chan[dir], &config_be);
if (ret) {
dev_err(dev, "failed to config DMA channel for Back-End\n");
+   dma_release_channel(pair->dma_chan[dir]);
return ret;
}
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 303/388] powerpc/64s/kuap: Add missing isync to KUAP restore paths

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit cb2b53cbffe3c388cd676b63f34e54ceb2643ae2 ]

Writing the AMR register is documented to require context
synchronizing operations before and after, for it to take effect as
expected. The KUAP restore at interrupt exit time deliberately avoids
the isync after the AMR update because it only needs to take effect
after the context synchronizing RFID that soon follows. Add a comment
for this.

The missing isync before the update doesn't have an obvious
justification, and it seems it could theoretically allow a rogue user
access to leak past the AMR update. Add isyncs for these.

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200429065654.1677541-3-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/kup-radix.h | 11 ++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h 
b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index 3bcef989a35d..101d60f16d46 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -16,7 +16,9 @@
 #ifdef CONFIG_PPC_KUAP
BEGIN_MMU_FTR_SECTION_NESTED(67)
ld  \gpr, STACK_REGS_KUAP(r1)
+   isync
mtspr   SPRN_AMR, \gpr
+   /* No isync required, see kuap_restore_amr() */
END_MMU_FTR_SECTION_NESTED_IFSET(MMU_FTR_RADIX_KUAP, 67)
 #endif
 .endm
@@ -62,8 +64,15 @@
 
 static inline void kuap_restore_amr(struct pt_regs *regs)
 {
-   if (mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (mmu_has_feature(MMU_FTR_RADIX_KUAP)) {
+   isync();
mtspr(SPRN_AMR, regs->kuap);
+   /*
+* No isync required here because we are about to RFI back to
+* previous context before any user accesses would be made,
+* which is a CSI.
+*/
+   }
 }
 
 static inline void kuap_check_amr(void)
-- 
2.25.1



[PATCH AUTOSEL 5.7 302/388] powerpc/4xx: Don't unmap NULL mbase

2020-06-17 Thread Sasha Levin
From: huhai 

[ Upstream commit bcec081ecc940fc38730b29c743bbee661164161 ]

Signed-off-by: huhai 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200521072648.1254699-1-...@ellerman.id.au
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/4xx/pci.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/4xx/pci.c b/arch/powerpc/platforms/4xx/pci.c
index e6e2adcc7b64..c13d64c3b019 100644
--- a/arch/powerpc/platforms/4xx/pci.c
+++ b/arch/powerpc/platforms/4xx/pci.c
@@ -1242,7 +1242,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
if (mbase == NULL) {
printk(KERN_ERR "%pOF: Can't map internal config space !",
port->node);
-   goto done;
+   return;
}
 
while (attempt && (0 == (in_le32(mbase + PECFG_460SX_DLLSTA)
@@ -1252,9 +1252,7 @@ static void __init ppc460sx_pciex_check_link(struct 
ppc4xx_pciex_port *port)
}
if (attempt)
port->link = 1;
-done:
iounmap(mbase);
-
 }
 
 static struct ppc4xx_pciex_hwops ppc460sx_pcie_hwops __initdata = {
-- 
2.25.1



[PATCH AUTOSEL 5.7 289/388] KVM: PPC: Book3S HV: Relax check on H_SVM_INIT_ABORT

2020-06-17 Thread Sasha Levin
From: Laurent Dufour 

[ Upstream commit e3326ae3d59e443a379367c6936941d6ab55d316 ]

Commit 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_*
Hcalls") added checks of the secure bit of SRR1 to filter out the Hcalls
reserved to the Ultravisor.

However, the Hcall H_SVM_INIT_ABORT is made by the Ultravisor passing the
context of the VM calling UV_ESM. This allows the Hypervisor to return to
the guest without going through the Ultravisor. Thus the Secure bit of SRR1
is not set in that particular case.

In the case a regular VM is calling H_SVM_INIT_ABORT, this hcall will be
filtered out in kvmppc_h_svm_init_abort() because kvm->arch.secure_guest is
not set in that case.

Fixes: 8c47b6ff29e3 ("KVM: PPC: Book3S HV: Check caller of H_SVM_* Hcalls")
Signed-off-by: Laurent Dufour 
Reviewed-by: Greg Kurz 
Reviewed-by: Ram Pai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_hv.c | 11 ---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c
index 93493f0cbfe8..ee581cde4878 100644
--- a/arch/powerpc/kvm/book3s_hv.c
+++ b/arch/powerpc/kvm/book3s_hv.c
@@ -1099,9 +1099,14 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu)
ret = kvmppc_h_svm_init_done(vcpu->kvm);
break;
case H_SVM_INIT_ABORT:
-   ret = H_UNSUPPORTED;
-   if (kvmppc_get_srr1(vcpu) & MSR_S)
-   ret = kvmppc_h_svm_init_abort(vcpu->kvm);
+   /*
+* Even if that call is made by the Ultravisor, the SSR1 value
+* is the guest context one, with the secure bit clear as it has
+* not yet been secured. So we can't check it here.
+* Instead the kvm->arch.secure_guest flag is checked inside
+* kvmppc_h_svm_init_abort().
+*/
+   ret = kvmppc_h_svm_init_abort(vcpu->kvm);
break;
 
default:
-- 
2.25.1



[PATCH AUTOSEL 5.7 288/388] KVM: PPC: Book3S: Fix some RCU-list locks

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit ab8b65be183180c3eef405d449163964ecc4b571 ]

It is unsafe to traverse kvm->arch.spapr_tce_tables and
stt->iommu_tables without the RCU read lock held. Also, add
cond_resched_rcu() in places with the RCU read lock held that could take
a while to finish.

 arch/powerpc/kvm/book3s_64_vio.c:76 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 no locks held by qemu-kvm/4265.

 stack backtrace:
 CPU: 96 PID: 4265 Comm: qemu-kvm Not tainted 5.7.0-rc4-next-20200508+ #2
 Call Trace:
 [c000201a8690f720] [c0715948] dump_stack+0xfc/0x174 (unreliable)
 [c000201a8690f770] [c01d9470] lockdep_rcu_suspicious+0x140/0x164
 [c000201a8690f7f0] [c00810b9fb48] 
kvm_spapr_tce_release_iommu_group+0x1f0/0x220 [kvm]
 [c000201a8690f870] [c00810b8462c] 
kvm_spapr_tce_release_vfio_group+0x54/0xb0 [kvm]
 [c000201a8690f8a0] [c00810b84710] kvm_vfio_destroy+0x88/0x140 [kvm]
 [c000201a8690f8f0] [c00810b7d488] kvm_put_kvm+0x370/0x600 [kvm]
 [c000201a8690f990] [c00810b7e3c0] kvm_vm_release+0x38/0x60 [kvm]
 [c000201a8690f9c0] [c05223f4] __fput+0x124/0x330
 [c000201a8690fa20] [c0151cd8] task_work_run+0xb8/0x130
 [c000201a8690fa70] [c01197e8] do_exit+0x4e8/0xfa0
 [c000201a8690fb70] [c011a374] do_group_exit+0x64/0xd0
 [c000201a8690fbb0] [c0132c90] get_signal+0x1f0/0x1200
 [c000201a8690fcc0] [c0020690] do_notify_resume+0x130/0x3c0
 [c000201a8690fda0] [c0038d64] syscall_exit_prepare+0x1a4/0x280
 [c000201a8690fe20] [c000c8f8] system_call_common+0xf8/0x278

 
 arch/powerpc/kvm/book3s_64_vio.c:368 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 2 locks held by qemu-kvm/4264:
  #0: c000201ae2d000d8 (&vcpu->mutex){+.+.}-{3:3}, at: 
kvm_vcpu_ioctl+0xdc/0x950 [kvm]
  #1: c000200c9ed0c468 (&kvm->srcu){}-{0:0}, at: 
kvmppc_h_put_tce+0x88/0x340 [kvm]

 
 arch/powerpc/kvm/book3s_64_vio.c:108 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by qemu-kvm/4257:
  #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: 
kvm_vfio_set_attr+0x598/0x6c0 [kvm]

 
 arch/powerpc/kvm/book3s_64_vio.c:146 RCU-list traversed in non-reader section!!

 other info that might help us debug this:

 rcu_scheduler_active = 2, debug_locks = 1
 1 lock held by qemu-kvm/4257:
  #0: c000200b1b363a40 (&kv->lock){+.+.}-{3:3}, at: 
kvm_vfio_set_attr+0x598/0x6c0 [kvm]

Signed-off-by: Qian Cai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_64_vio.c | 18 ++
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_vio.c b/arch/powerpc/kvm/book3s_64_vio.c
index 50555ad1db93..1a529df0ab44 100644
--- a/arch/powerpc/kvm/book3s_64_vio.c
+++ b/arch/powerpc/kvm/book3s_64_vio.c
@@ -73,6 +73,7 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
struct kvmppc_spapr_tce_iommu_table *stit, *tmp;
struct iommu_table_group *table_group = NULL;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
 
table_group = iommu_group_get_iommudata(grp);
@@ -87,7 +88,9 @@ extern void kvm_spapr_tce_release_iommu_group(struct kvm *kvm,
kref_put(&stit->kref, kvm_spapr_tce_liobn_put);
}
}
+   cond_resched_rcu();
}
+   rcu_read_unlock();
 }
 
 extern long kvm_spapr_tce_attach_iommu_group(struct kvm *kvm, int tablefd,
@@ -105,12 +108,14 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!f.file)
return -EBADF;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stt, &kvm->arch.spapr_tce_tables, list) {
if (stt == f.file->private_data) {
found = true;
break;
}
}
+   rcu_read_unlock();
 
fdput(f);
 
@@ -143,6 +148,7 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!tbl)
return -EINVAL;
 
+   rcu_read_lock();
list_for_each_entry_rcu(stit, &stt->iommu_tables, next) {
if (tbl != stit->tbl)
continue;
@@ -150,14 +156,17 @@ extern long kvm_spapr_tce_attach_iommu_group(struct kvm 
*kvm, int tablefd,
if (!kref_get_unless_zero(&stit->kref)) {
/* stit is being destroyed */
iommu_tce_table_put(tbl);
+   rcu_read_unlock();
return -ENOTTY;
}
/*
 * The table is already known to this KVM, we just increased
 * its KVM reference count

[PATCH AUTOSEL 5.7 287/388] KVM: PPC: Book3S HV: Ignore kmemleak false positives

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit 0aca8a5575544bd21b3363058afb8f1e81505150 ]

kvmppc_pmd_alloc() and kvmppc_pte_alloc() allocate some memory but then
pud_populate() and pmd_populate() will use __pa() to reference the newly
allocated memory.

Since kmemleak is unable to track the physical memory, resulting in false
positives, silence those by using kmemleak_ignore().
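
The pattern applied by the hunks below, restated as an editorial comment:

  /*
   * kmemleak scans objects for *virtual* pointers. Once the only live
   * reference to the pte/pmd page is the physical address stored by
   * pmd_populate()/pud_populate(), kmemleak loses track of it and
   * reports a (false) leak. kmemleak_ignore() marks the object so it
   * is neither scanned nor reported.
   */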

unreferenced object 0xc000201c382a1000 (size 4096):
 comm "qemu-kvm", pid 124828, jiffies 4295733767 (age 341.250s)
 hex dump (first 32 bytes):
   c0 00 20 09 f4 60 03 87 c0 00 20 10 72 a0 03 87  .. ..` .r...
   c0 00 20 0e 13 a0 03 87 c0 00 20 1b dc c0 03 87  .. ... .
 backtrace:
   [<4cc2790f>] kvmppc_create_pte+0x838/0xd20 [kvm_hv]
   kvmppc_pmd_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:366
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:590
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278
unreferenced object 0xc0002001f0c03900 (size 256):
 comm "qemu-kvm", pid 124830, jiffies 4295735235 (age 326.570s)
 hex dump (first 32 bytes):
   c0 00 20 10 fa a0 03 87 c0 00 20 10 fa a1 03 87  .. ... .
   c0 00 20 10 fa a2 03 87 c0 00 20 10 fa a3 03 87  .. ... .
 backtrace:
   [<23f675b8>] kvmppc_create_pte+0x854/0xd20 [kvm_hv]
   kvmppc_pte_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:356
   (inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:593
   [] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
   [] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
   [<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
   [<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
   [] kvmppc_vcpu_run+0x34/0x48 [kvm]
   [] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
   [<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
   [<48155cd6>] ksys_ioctl+0xd8/0x130
   [<41ffeaa7>] sys_ioctl+0x28/0x40
   [<4afc4310>] system_call_exception+0x114/0x1e0
   [] system_call_common+0xf0/0x278

Signed-off-by: Qian Cai 
Signed-off-by: Paul Mackerras 
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c 
b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index aa12cd4078b3..bc6c1aa3d0e9 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -353,7 +353,13 @@ static struct kmem_cache *kvm_pmd_cache;
 
 static pte_t *kvmppc_pte_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   pte_t *pte;
+
+   pte = kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   /* pmd_populate() will only reference _pa(pte). */
+   kmemleak_ignore(pte);
+
+   return pte;
 }
 
 static void kvmppc_pte_free(pte_t *ptep)
@@ -363,7 +369,13 @@ static void kvmppc_pte_free(pte_t *ptep)
 
 static pmd_t *kvmppc_pmd_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   pmd_t *pmd;
+
+   pmd = kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   /* pud_populate() will only reference _pa(pmd). */
+   kmemleak_ignore(pmd);
+
+   return pmd;
 }
 
 static void kvmppc_pmd_free(pmd_t *pmdp)
-- 
2.25.1



[PATCH AUTOSEL 5.7 278/388] powerpc/8xx: Drop CONFIG_8xx_COPYBACK option

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit d3efcd38c0b99162d889e36a30425345a18edb33 ]

CONFIG_8xx_COPYBACK was there to help disable copyback cache mode
when debugging hardware. But nobody will design new boards with 8xx now.

All 8xx platforms select it, so make it the default and remove
the option.

Also remove the Mx_RESETVAL values, which are pretty useless and hide
the real values when reading the code.

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/bcc968cda075516eb76e2f25e09821f582c566b4.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/configs/adder875_defconfig  |  1 -
 arch/powerpc/configs/ep88xc_defconfig|  1 -
 arch/powerpc/configs/mpc866_ads_defconfig|  1 -
 arch/powerpc/configs/mpc885_ads_defconfig|  1 -
 arch/powerpc/configs/tqm8xx_defconfig|  1 -
 arch/powerpc/include/asm/nohash/32/mmu-8xx.h |  2 --
 arch/powerpc/kernel/head_8xx.S   | 15 +--
 arch/powerpc/platforms/8xx/Kconfig   |  9 -
 8 files changed, 1 insertion(+), 30 deletions(-)

diff --git a/arch/powerpc/configs/adder875_defconfig 
b/arch/powerpc/configs/adder875_defconfig
index f55e23cb176c..5326bc739279 100644
--- a/arch/powerpc/configs/adder875_defconfig
+++ b/arch/powerpc/configs/adder875_defconfig
@@ -10,7 +10,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_ADDER875=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_1000=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/ep88xc_defconfig 
b/arch/powerpc/configs/ep88xc_defconfig
index 0e2e5e81a359..f5c3e72da719 100644
--- a/arch/powerpc/configs/ep88xc_defconfig
+++ b/arch/powerpc/configs/ep88xc_defconfig
@@ -12,7 +12,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_PPC_EP88XC=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/mpc866_ads_defconfig 
b/arch/powerpc/configs/mpc866_ads_defconfig
index 5320735395e7..5c56d36cdfc5 100644
--- a/arch/powerpc/configs/mpc866_ads_defconfig
+++ b/arch/powerpc/configs/mpc866_ads_defconfig
@@ -12,7 +12,6 @@ CONFIG_EXPERT=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_MPC86XADS=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_1000=y
 CONFIG_MATH_EMULATION=y
diff --git a/arch/powerpc/configs/mpc885_ads_defconfig 
b/arch/powerpc/configs/mpc885_ads_defconfig
index 82a008c04eae..949ff9ccda5e 100644
--- a/arch/powerpc/configs/mpc885_ads_defconfig
+++ b/arch/powerpc/configs/mpc885_ads_defconfig
@@ -11,7 +11,6 @@ CONFIG_EXPERT=y
 # CONFIG_VM_EVENT_COUNTERS is not set
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
-CONFIG_8xx_COPYBACK=y
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
 # CONFIG_SECCOMP is not set
diff --git a/arch/powerpc/configs/tqm8xx_defconfig 
b/arch/powerpc/configs/tqm8xx_defconfig
index eda8bfb2d0a3..77857d513022 100644
--- a/arch/powerpc/configs/tqm8xx_defconfig
+++ b/arch/powerpc/configs/tqm8xx_defconfig
@@ -15,7 +15,6 @@ CONFIG_MODULE_SRCVERSION_ALL=y
 # CONFIG_BLK_DEV_BSG is not set
 CONFIG_PARTITION_ADVANCED=y
 CONFIG_TQM8XX=y
-CONFIG_8xx_COPYBACK=y
 # CONFIG_8xx_CPU15 is not set
 CONFIG_GEN_RTC=y
 CONFIG_HZ_100=y
diff --git a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h 
b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
index 76af5b0cb16e..26b7cee34dfe 100644
--- a/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
+++ b/arch/powerpc/include/asm/nohash/32/mmu-8xx.h
@@ -19,7 +19,6 @@
 #define MI_RSV4I   0x0800  /* Reserve 4 TLB entries */
 #define MI_PPCS0x0200  /* Use MI_RPN prob/priv state */
 #define MI_IDXMASK 0x1f00  /* TLB index to be loaded */
-#define MI_RESETVAL0x  /* Value of register at reset */
 
 /* These are the Ks and Kp from the PowerPC books.  For proper operation,
  * Ks = 0, Kp = 1.
@@ -95,7 +94,6 @@
 #define MD_TWAM0x0400  /* Use 4K page hardware assist 
*/
 #define MD_PPCS0x0200  /* Use MI_RPN prob/priv state */
 #define MD_IDXMASK 0x1f00  /* TLB index to be loaded */
-#define MD_RESETVAL0x0400  /* Value of register at reset */
 
 #define SPRN_M_CASID   793 /* Address space ID (context) to match */
 #define MC_ASIDMASK0x000f  /* Bits used for ASID value */
diff --git a/arch/powerpc/kernel/head_8xx.S b/arch/powerpc/kernel/head_8xx.S
index 073a651787df..905205c79a25 100644
--- a/arch/powerpc/kernel/head_8xx.S
+++ b/arch/powerpc/kernel/head_8xx.S
@@ -779,10 +779,7 @@ start_here:
 initial_mmu:
li  r8, 0
mtspr   SPRN_MI_CTR, r8 /* remove PINNED ITLB entries */
-   lis r10, MD_RESETVAL@h
-#ifndef CONFIG_8xx_COPYBACK
-   orisr10, r10, MD_WTDEF@h
-#endif
+   lis r10, MD_TWAM@h
mtspr   SPRN_MD_CTR, r10/* remove PINNED DTLB entries */
 
tlbia  

[PATCH AUTOSEL 5.7 277/388] powerpc/32s: Don't warn when mapping RO data ROX.

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit 4b19f96a81bceaf0bcf44d79c0855c61158065ec ]

Mapping RO data as ROX is not an issue since that data
cannot be modified to introduce an exploit.

PPC64 accepts having RO data mapped ROX, as a trade-off
between kernel size and strictness of protection.

On PPC32, kernel size is even more critical as the amount of
memory is usually small.

Depending on the number of available IBATs, the last IBAT
might overflow the end of text. Only warn if it crosses
the end of RO data.
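
To make the check concrete, a small userspace sketch with made-up addresses (the base/_etext/__init_begin offsets and block size are illustrative, not taken from a real kernel layout):

#include <stdio.h>

int main(void)
{
        unsigned long base = 0x000000;          /* _stext - PAGE_OFFSET */
        unsigned long top = 0x530000;           /* _etext - PAGE_OFFSET */
        unsigned long border = 0x900000;        /* __init_begin - PAGE_OFFSET */
        unsigned long size = 0x400000;          /* largest IBAT block size */

        if (top - base > size)
                size <<= 1;     /* doubled IBAT runs past _etext ... */

        /* ... but only complain if it also crosses the end of RO data */
        printf("warn: %s\n", base + size > border ? "yes" : "no");
        return 0;
}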

Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/6499f8eeb2a36330e5c9fc1cee9a79374875bd54.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/book3s32/mmu.c | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/mm/book3s32/mmu.c b/arch/powerpc/mm/book3s32/mmu.c
index 39ba53ca5bb5..a9b2cbc74797 100644
--- a/arch/powerpc/mm/book3s32/mmu.c
+++ b/arch/powerpc/mm/book3s32/mmu.c
@@ -187,6 +187,7 @@ void mmu_mark_initmem_nx(void)
int i;
unsigned long base = (unsigned long)_stext - PAGE_OFFSET;
unsigned long top = (unsigned long)_etext - PAGE_OFFSET;
+   unsigned long border = (unsigned long)__init_begin - PAGE_OFFSET;
unsigned long size;
 
if (IS_ENABLED(CONFIG_PPC_BOOK3S_601))
@@ -201,9 +202,10 @@ void mmu_mark_initmem_nx(void)
size = block_size(base, top);
size = max(size, 128UL << 10);
if ((top - base) > size) {
-   if (strict_kernel_rwx_enabled())
-   pr_warn("Kernel _etext not properly aligned\n");
size <<= 1;
+   if (strict_kernel_rwx_enabled() && base + size > border)
+   pr_warn("Some RW data is getting mapped X. "
+   "Adjust CONFIG_DATA_SHIFT to avoid 
that.\n");
}
setibat(i++, PAGE_OFFSET + base, base, size, PAGE_KERNEL_TEXT);
base += size;
-- 
2.25.1



[PATCH AUTOSEL 5.7 251/388] powerpc/kasan: Fix error detection on memory allocation

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit d132443a73d7a131775df46f33000f67ed92de1e ]

In case (k_start & PAGE_MASK) doesn't equal k_start, 'va' will never be
NULL although 'block' is NULL.

Check the return of memblock_alloc() directly instead of
the resulting address in the loop.
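
A minimal userspace sketch of why the old check could not fire (the addresses are made up, and uintptr_t arithmetic stands in for the pointer math):

#include <stdio.h>
#include <stdint.h>

#define PAGE_MASK (~0xfffUL)

int main(void)
{
        uintptr_t block = 0;                    /* failed memblock_alloc() */
        uintptr_t k_start = 0x1234;             /* not page aligned */
        uintptr_t k_cur = k_start & PAGE_MASK;  /* first loop value: 0x1000 */
        uintptr_t va = block + k_cur - k_start; /* 0 + 0x1000 - 0x1234 */

        /* non-zero, so "if (!va)" never triggers even though block is NULL */
        printf("va = %#lx (non-NULL)\n", (unsigned long)va);
        return 0;
}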

Fixes: 509cd3f2b473 ("powerpc/32: Simplify KASAN init")
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7cb8ca82042bfc45a5cfe726c921cd7e7eeb12a3.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/kasan/kasan_init_32.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/kasan/kasan_init_32.c 
b/arch/powerpc/mm/kasan/kasan_init_32.c
index cbcad369fcb2..8b15fe09b967 100644
--- a/arch/powerpc/mm/kasan/kasan_init_32.c
+++ b/arch/powerpc/mm/kasan/kasan_init_32.c
@@ -76,15 +76,14 @@ static int __init kasan_init_region(void *start, size_t size)
return ret;
 
block = memblock_alloc(k_end - k_start, PAGE_SIZE);
+   if (!block)
+   return -ENOMEM;
 
for (k_cur = k_start & PAGE_MASK; k_cur < k_end; k_cur += PAGE_SIZE) {
pmd_t *pmd = pmd_ptr_k(k_cur);
void *va = block + k_cur - k_start;
pte_t pte = pfn_pte(PHYS_PFN(__pa(va)), PAGE_KERNEL);
 
-   if (!va)
-   return -ENOMEM;
-
__set_pte_at(&init_mm, k_cur, pte_offset_kernel(pmd, k_cur), pte, 0);
}
flush_tlb_kernel_range(k_start, k_end);
-- 
2.25.1



[PATCH AUTOSEL 5.7 250/388] powerpc/64s/pgtable: fix an undefined behaviour

2020-06-17 Thread Sasha Levin
From: Qian Cai 

[ Upstream commit c2e929b18cea6cbf71364f22d742d9aad7f4677a ]

Booting a power9 server with hash MMU could trigger an undefined
behaviour because pud_offset(p4d, 0) will do,

0 >> (PAGE_SHIFT:16 + PTE_INDEX_SIZE:8 + H_PMD_INDEX_SIZE:10)

Fix it by converting pud_index() and friends to static inline
functions.

UBSAN: shift-out-of-bounds in arch/powerpc/mm/ptdump/ptdump.c:282:15
shift exponent 34 is too large for 32-bit type 'int'
CPU: 6 PID: 1 Comm: swapper/0 Not tainted 5.6.0-rc4-next-20200303+ #13
Call Trace:
dump_stack+0xf4/0x164 (unreliable)
ubsan_epilogue+0x18/0x78
__ubsan_handle_shift_out_of_bounds+0x160/0x21c
walk_pagetables+0x2cc/0x700
walk_pud at arch/powerpc/mm/ptdump/ptdump.c:282
(inlined by) walk_pagetables at arch/powerpc/mm/ptdump/ptdump.c:311
ptdump_check_wx+0x8c/0xf0
mark_rodata_ro+0x48/0x80
kernel_init+0x74/0x194
ret_from_kernel_thread+0x5c/0x74
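
For illustration, a userspace sketch of the difference (SHIFT and the 0x1ff index mask are stand-ins for the real PAGE_SHIFT + PTE_INDEX_SIZE + H_PMD_INDEX_SIZE and PTRS_PER_PUD - 1 values):

#include <stdio.h>

#define SHIFT 34        /* stand-in for the combined shift */

/* Macro form: pud_index_macro(0) shifts the int literal 0 by 34 bits --
 * more than the width of int -- which is undefined behaviour and what
 * UBSAN reports above. */
#define pud_index_macro(address) (((address) >> SHIFT) & 0x1ff)

/* Inline-function form: the argument is converted to unsigned long, so
 * the 34-bit shift is well defined. */
static inline unsigned long pud_index_fn(unsigned long address)
{
        return (address >> SHIFT) & 0x1ff;
}

int main(void)
{
        printf("%lu\n", pud_index_fn(0));       /* well defined: prints 0 */
        /* pud_index_macro(0) would be flagged by -fsanitize=shift */
        return 0;
}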

Suggested-by: Christophe Leroy 
Signed-off-by: Qian Cai 
Signed-off-by: Michael Ellerman 
Reviewed-by: Christophe Leroy 
Link: https://lore.kernel.org/r/20200306044852.3236-1-...@lca.pw
Signed-off-by: Sasha Levin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 23 
 1 file changed, 19 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 368b136517e0..2838b98bc6df 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -998,10 +998,25 @@ extern struct page *pgd_page(pgd_t pgd);
 #define pud_page_vaddr(pud)__va(pud_val(pud) & ~PUD_MASKED_BITS)
 #define pgd_page_vaddr(pgd)__va(pgd_val(pgd) & ~PGD_MASKED_BITS)
 
-#define pgd_index(address) (((address) >> (PGDIR_SHIFT)) & (PTRS_PER_PGD - 1))
-#define pud_index(address) (((address) >> (PUD_SHIFT)) & (PTRS_PER_PUD - 1))
-#define pmd_index(address) (((address) >> (PMD_SHIFT)) & (PTRS_PER_PMD - 1))
-#define pte_index(address) (((address) >> (PAGE_SHIFT)) & (PTRS_PER_PTE - 1))
+static inline unsigned long pgd_index(unsigned long address)
+{
+   return (address >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1);
+}
+
+static inline unsigned long pud_index(unsigned long address)
+{
+   return (address >> PUD_SHIFT) & (PTRS_PER_PUD - 1);
+}
+
+static inline unsigned long pmd_index(unsigned long address)
+{
+   return (address >> PMD_SHIFT) & (PTRS_PER_PMD - 1);
+}
+
+static inline unsigned long pte_index(unsigned long address)
+{
+   return (address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1);
+}
 
 /*
  * Find an entry in a page-table-directory.  We combine the address region
-- 
2.25.1



[PATCH AUTOSEL 5.7 249/388] powerpc/powernv: add NULL check after kzalloc

2020-06-17 Thread Sasha Levin
From: Chen Zhou 

[ Upstream commit ceffa63acce7165c442395b7d64a11ab8b5c5dca ]

Fixes coccicheck warning:

./arch/powerpc/platforms/powernv/opal.c:813:1-5:
alloc with no test, possible model on line 814

Add NULL check after kzalloc.

Signed-off-by: Chen Zhou 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200509020838.121660-1-chenzho...@huawei.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/powernv/opal.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal.c 
b/arch/powerpc/platforms/powernv/opal.c
index 2b3dfd0b6cdd..d95954ad4c0a 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -811,6 +811,10 @@ static int opal_add_one_export(struct kobject *parent, const char *export_name,
goto out;
 
attr = kzalloc(sizeof(*attr), GFP_KERNEL);
+   if (!attr) {
+   rc = -ENOMEM;
+   goto out;
+   }
name = kstrdup(export_name, GFP_KERNEL);
if (!name) {
rc = -ENOMEM;
-- 
2.25.1



[PATCH AUTOSEL 5.7 226/388] powerpc/ps3: Fix kexec shutdown hang

2020-06-17 Thread Sasha Levin
From: Geoff Levand 

[ Upstream commit 126554465d93b10662742128918a5fc338cda4aa ]

The ps3_mm_region_destroy() and ps3_mm_vas_destroy() routines
are called very late in the shutdown via kexec's mmu_cleanup_all
routine.  By the time mmu_cleanup_all runs it is too late to use
udbg_printf, and calling it will cause PS3 systems to hang.

Remove all debugging statements from ps3_mm_region_destroy() and
ps3_mm_vas_destroy() and replace any error reporting with calls
to lv1_panic.

With this change builds with 'DEBUG' defined will not cause kexec
reboots to hang, and builds with 'DEBUG' defined or not will end
in lv1_panic if an error is encountered.

Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/7325c4af2b4c989c19d6a26b90b1fec9c0615ddf.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/ps3/mm.c | 22 --
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index 423be34f0f5f..f42fe4e86ce5 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -200,13 +200,14 @@ void ps3_mm_vas_destroy(void)
 {
int result;
 
-   DBG("%s:%d: map.vas_id= %llu\n", __func__, __LINE__, map.vas_id);
-
if (map.vas_id) {
result = lv1_select_virtual_address_space(0);
-   BUG_ON(result);
-   result = lv1_destruct_virtual_address_space(map.vas_id);
-   BUG_ON(result);
+   result += lv1_destruct_virtual_address_space(map.vas_id);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
map.vas_id = 0;
}
 }
@@ -304,19 +305,20 @@ static void ps3_mm_region_destroy(struct mem_region *r)
int result;
 
if (!r->destroy) {
-   pr_info("%s:%d: Not destroying high region: %llxh %llxh\n",
-   __func__, __LINE__, r->base, r->size);
return;
}
 
-   DBG("%s:%d: r->base = %llxh\n", __func__, __LINE__, r->base);
-
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+
+   if (result) {
+   lv1_panic(0);
+   }
+
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
+
ps3_mm_set_repository_highmem(NULL);
 }
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 220/388] powerpc/pseries/ras: Fix FWNMI_VALID off by one

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit deb70f7a35a22dffa55b2c3aac71bc6fb0f486ce ]

This was discovered developing qemu fwnmi sreset support. This
off-by-one bug means the last 16 bytes of the rtas area can not
be used for a 16 byte save area.

It's not a serious bug, and the QEMU implementation has to retain a
workaround for old kernels, but it's good to tighten it.
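
A quick userspace check of the arithmetic (0x7000/0x8000 are the bounds of the 4kB page mentioned in the macro below; OLD_VALID/NEW_VALID are illustrative names mirroring only the fixed-page half of VALID_FWNMI_BUFFER()):

#include <assert.h>

#define OLD_VALID(A) (((A) >= 0x7000) && ((A) < 0x7ff0))
#define NEW_VALID(A) (((A) >= 0x7000) && ((A) <= 0x8000 - 16))

int main(void)
{
        unsigned long last = 0x8000 - 16;       /* last 16-byte area: 0x7ff0..0x7fff */

        assert(!OLD_VALID(last));               /* off by one: wrongly rejected */
        assert(NEW_VALID(last));                /* accepted: still inside the page */
        return 0;
}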

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Acked-by: Mahesh Salgaonkar 
Link: https://lore.kernel.org/r/20200508043408.886394-7-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/platforms/pseries/ras.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/pseries/ras.c 
b/arch/powerpc/platforms/pseries/ras.c
index 1d1da639b8b7..16ba5c542e55 100644
--- a/arch/powerpc/platforms/pseries/ras.c
+++ b/arch/powerpc/platforms/pseries/ras.c
@@ -395,10 +395,11 @@ static irqreturn_t ras_error_interrupt(int irq, void *dev_id)
 /*
  * Some versions of FWNMI place the buffer inside the 4kB page starting at
  * 0x7000. Other versions place it inside the rtas buffer. We check both.
+ * Minimum size of the buffer is 16 bytes.
  */
 #define VALID_FWNMI_BUFFER(A) \
-   ((((A) >= 0x7000) && ((A) < 0x7ff0)) || \
-   (((A) >= rtas.base) && ((A) < (rtas.base + rtas.size - 16))))
+   ((((A) >= 0x7000) && ((A) <= 0x8000 - 16)) || \
+   (((A) >= rtas.base) && ((A) <= (rtas.base + rtas.size - 16))))
 
 static inline struct rtas_error_log *fwnmi_get_errlog(void)
 {
-- 
2.25.1



[PATCH AUTOSEL 5.7 219/388] powerpc/64s/exceptions: Machine check reconcile irq state

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit f0fd9dd3c213c947dfb5bc2cad3ef5e30d3258ec ]

pseries fwnmi machine check code pops the soft-irq checks in rtas_call
(after the next patch to remove rtas_token from this call path).
Rather than play whack-a-mole with these and forever have fragile
code, it seems better to have the early machine check handler perform
the same kind of reconcile as the other NMI interrupts.

  WARNING: CPU: 0 PID: 493 at arch/powerpc/kernel/irq.c:343
  CPU: 0 PID: 493 Comm: a Tainted: GW
  NIP:  c001ed2c LR: c0042c40 CTR: 
  REGS: c001fffd38b0 TRAP: 0700   Tainted: GW
  MSR:  80021003   CR: 28000488  XER: 
  CFAR: c001ec90 IRQMASK: 0
  GPR00: c0043820 c001fffd3b40 c12ba300 
  GPR04: 48000488   deadbeef
  GPR08: 0080   1001
  GPR12:  c14a  
  GPR16:    
  GPR20:    
  GPR24:    
  GPR28:  0001 c1360810 
  NIP [c001ed2c] arch_local_irq_restore.part.0+0xac/0x100
  LR [c0042c40] unlock_rtas+0x30/0x90
  Call Trace:
  [c001fffd3b40] [c1360810] 0xc1360810 (unreliable)
  [c001fffd3b60] [c0043820] rtas_call+0x1c0/0x280
  [c001fffd3bb0] [c00dc328] fwnmi_release_errinfo+0x38/0x70
  [c001fffd3c10] [c00dcd8c] pseries_machine_check_realmode+0x1dc/0x540
  [c001fffd3cd0] [c003fe04] machine_check_early+0x54/0x70
  [c001fffd3d00] [c0008384] machine_check_early_common+0x134/0x1f0
  --- interrupt: 200 at 0x13f1307c8
  LR = 0x7fff888b8528
  Instruction dump:
  6000 7d2000a6 71298000 41820068 3922 7d210164 4b9c 6000
  6000 7d2000a6 71298000 4c820020 <0fe0> 4e800020 6000 6000

Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200508043408.886394-5-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/exceptions-64s.S | 19 +++
 1 file changed, 19 insertions(+)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index 463372046169..d3e19934cca9 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1117,11 +1117,30 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
li  r10,MSR_RI
mtmsrd  r10,1
 
+   /*
+* Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
+* system_reset_common)
+*/
+   li  r10,IRQS_ALL_DISABLED
+   stb r10,PACAIRQSOFTMASK(r13)
+   lbz r10,PACAIRQHAPPENED(r13)
+   std r10,RESULT(r1)
+   ori r10,r10,PACA_IRQ_HARD_DIS
+   stb r10,PACAIRQHAPPENED(r13)
+
addir3,r1,STACK_FRAME_OVERHEAD
bl  machine_check_early
std r3,RESULT(r1)   /* Save result */
ld  r12,_MSR(r1)
 
+   /*
+* Restore soft mask settings.
+*/
+   ld  r10,RESULT(r1)
+   stb r10,PACAIRQHAPPENED(r13)
+   ld  r10,SOFTE(r1)
+   stb r10,PACAIRQSOFTMASK(r13)
+
 #ifdef CONFIG_PPC_P7_NAP
/*
 * Check if thread was in power saving mode. We come here when any
-- 
2.25.1



[PATCH AUTOSEL 5.7 218/388] powerpc/64s/exception: Fix machine check no-loss idle wakeup

2020-06-17 Thread Sasha Levin
From: Nicholas Piggin 

[ Upstream commit 8a5054d8cbbe03c68dcb0957c291c942132e4101 ]

The architecture allows for machine check exceptions to cause idle
wakeups which resume at the 0x200 address which has to return via
the idle wakeup code, but the early machine check handler is run
first.

The case of a no state-loss sleep is broken because the early
handler uses non-volatile register r1, which is needed for the wakeup
protocol, but it is not restored.

Fix this by loading r1 from the MCE exception frame before returning
to the idle wakeup code. Also update the comment which has become
stale since the idle rewrite in C.

This crash was found and fix confirmed with a machine check injection
test in qemu powernv model (which is not upstream in qemu yet).

Fixes: 10d91611f426d ("powerpc/64s: Reimplement book3s idle code in C")
Signed-off-by: Nicholas Piggin 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200508043408.886394-2-npig...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kernel/exceptions-64s.S | 14 --
 1 file changed, 8 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S 
b/arch/powerpc/kernel/exceptions-64s.S
index ebeebab74b56..463372046169 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -1225,17 +1225,19 @@ EXC_COMMON_BEGIN(machine_check_idle_common)
bl  machine_check_queue_event
 
/*
-* We have not used any non-volatile GPRs here, and as a rule
-* most exception code including machine check does not.
-* Therefore PACA_NAPSTATELOST does not need to be set. Idle
-* wakeup will restore volatile registers.
+* GPR-loss wakeups are relatively straightforward, because the
+* idle sleep code has saved all non-volatile registers on its
+* own stack, and r1 in PACAR1.
 *
-* Load the original SRR1 into r3 for pnv_powersave_wakeup_mce.
+* For no-loss wakeups the r1 and lr registers used by the
+* early machine check handler have to be restored first. r2 is
+* the kernel TOC, so no need to restore it.
 *
 * Then decrement MCE nesting after finishing with the stack.
 */
ld  r3,_MSR(r1)
ld  r4,_LINK(r1)
+   ld  r1,GPR1(r1)
 
lhz r11,PACA_IN_MCE(r13)
subir11,r11,1
@@ -1244,7 +1246,7 @@ EXC_COMMON_BEGIN(machine_check_idle_common)
mtlrr4
rlwinm  r10,r3,47-31,30,31
cmpwi   cr1,r10,2
-   bltlr   cr1 /* no state loss, return to idle caller */
+   bltlr   cr1 /* no state loss, return to idle caller with r3=SRR1 */
b   idle_return_gpr_loss
 #endif
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 205/388] powerpc/64: Don't initialise init_task->thread.regs

2020-06-17 Thread Sasha Levin
From: Michael Ellerman 

[ Upstream commit 7ffa8b7dc11752827329e4e84a574ea6aaf24716 ]

Aneesh increased the size of struct pt_regs by 16 bytes and started
seeing this WARN_ON:

  smp: Bringing up secondary CPUs ...
  [ cut here ]
  WARNING: CPU: 0 PID: 0 at arch/powerpc/kernel/process.c:455 giveup_all+0xb4/0x110
  Modules linked in:
  CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+ #318
  NIP:  c001a2b4 LR: c001a29c CTR: c31d
  REGS: c26d3980 TRAP: 0700   Not tainted  (5.7.0-rc2-gcc-8.2.0-1.g8f6a41f-default+)
  MSR:  8282b033   CR: 48048224  XER: 

  CFAR: c0019cc8 IRQMASK: 1
  GPR00: c001a264 c26d3c20 c26d7200 8280b033
  GPR04: 0001  0077 30206d7372203164
  GPR08: 2000 02002000 8280b033 3230303030303030
  GPR12: 8800 c31d 00800050 0266
  GPR16: 0309a1a0 0309a4b0 0309a2d8 0309a890
  GPR20: 030d0098 c264da40 fd62 c000ff798080
  GPR24: c264edf0 c001007469f0 fd62 c20e5e90
  GPR28: c264edf0 c264d200 1db6 c264d200
  NIP [c001a2b4] giveup_all+0xb4/0x110
  LR [c001a29c] giveup_all+0x9c/0x110
  Call Trace:
  [c26d3c20] [c001a264] giveup_all+0x64/0x110 (unreliable)
  [c26d3c90] [c001ae34] __switch_to+0x104/0x480
  [c26d3cf0] [c0e0b8a0] __schedule+0x320/0x970
  [c26d3dd0] [c0e0c518] schedule_idle+0x38/0x70
  [c26d3df0] [c019c7c8] do_idle+0x248/0x3f0
  [c26d3e70] [c019cbb8] cpu_startup_entry+0x38/0x40
  [c26d3ea0] [c0011bb0] rest_init+0xe0/0xf8
  [c26d3ed0] [c2004820] start_kernel+0x990/0x9e0
  [c26d3f90] [c000c49c] start_here_common+0x1c/0x400

Which was unexpected. The warning is checking the thread.regs->msr
value of the task we are switching from:

  usermsr = tsk->thread.regs->msr;
  ...
  WARN_ON((usermsr & MSR_VSX) && !((usermsr & MSR_FP) && (usermsr & MSR_VEC)));

ie. if MSR_VSX is set then both of MSR_FP and MSR_VEC are also set.

Dumping tsk->thread.regs->msr we see that it's: 0x1db6

Which is not a normal looking MSR, in fact the only valid bit is
MSR_VSX, all the other bits are reserved in the current definition of
the MSR.

We can see from the oops that it was swapper/0 that we were switching
from when we hit the warning, ie. init_task. So its thread.regs points
to the base (high addresses) in init_stack.

Dumping the content of init_task->thread.regs, with the members of
pt_regs annotated (the 16 bytes larger version), we see:

   c2780080gpr[0] gpr[1]
   c2666008gpr[2] gpr[3]
  c26d3ed0 0078gpr[4] gpr[5]
  c0011b68 c2780080gpr[6] gpr[7]
   gpr[8] gpr[9]
  c26d3f90 80002200gpr[10]gpr[11]
  c2004820 c26d7200gpr[12]gpr[13]
  1db6 c10aabe8gpr[14]gpr[15]
  c10aabe8 c10aabe8gpr[16]gpr[17]
  c294d598 gpr[18]gpr[19]
   1ff8gpr[20]gpr[21]
   c206d608gpr[22]gpr[23]
  c278e0cc gpr[24]gpr[25]
  2fff c000gpr[26]gpr[27]
  0200 0028gpr[28]gpr[29]
  1db6 0475gpr[30]gpr[31]
  0200 1db6nipmsr
   orig_r3ctr
  c000c49c link   xer
   ccrsofte
   trap   dar
   dsisr  result
   pprkuap
   pad[2] pad[3]

This looks suspiciously like stack frames, not a pt_regs. If we look
closely we can see return addresses from the stack trace above,
c2004820 (start_kernel) and c000c49c (start_here_common).

init_task->thread.regs is setup at build time in processor.h:

  #define INIT_THREAD  { \
.ksp = INIT_SP, \
.regs = (struct pt_regs *)INIT_SP - 1, /* XXX bogus, I think */ \

The early boot code where we setup the initial stack is:

  LOAD_REG_ADDR(r3,init_thread_union)

  /* set up a stack pointer */
  LOAD_REG_IMMEDIATE(r1,THREAD_SIZE)
  add   r1,r3,r1
  lir0,0
  stdu  r0,-STACK_FRAME_OVERHEAD(r1)

Which creates a stack frame of size 112 bytes (STACK_FRAME_OVERHEAD).
Which is far too small to contain a pt_regs.

So the result is init_task->thread.regs is pointing at

Re: [PATCH] mm: Move p?d_alloc_track to separate header file

2020-06-17 Thread Andrew Morton
On Tue,  9 Jun 2020 14:05:33 +0200 Joerg Roedel  wrote:

> From: Joerg Roedel 
> 
> The functions are only used in two source files, so there is no need
> for them to be in the global  header. Move them to the new
>  header and include it only where needed.
> 
> ...
>
> new file mode 100644
> index ..1dcc865029a2
> --- /dev/null
> +++ b/include/linux/pgalloc-track.h
> @@ -0,0 +1,51 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +#ifndef _LINUX_PGALLLC_TRACK_H
> +#define _LINUX_PGALLLC_TRACK_H

hm, no #includes.  I guess this is OK, given the limited use.

But it does make one wonder whether ioremap.c should be moved from lib/
to mm/ and this file should be moved from include/linux/ to mm/.

Oh well.


[PATCH AUTOSEL 5.7 146/388] tty: hvc: Fix data abort due to race in hvc_open

2020-06-17 Thread Sasha Levin
From: Raghavendra Rao Ananta 

[ Upstream commit e2bd1dcbe1aa34ff5570b3427c530e4332ecf0fe ]

Potentially, hvc_open() can be called in parallel when two tasks call
open() on /dev/hvcX. In such a scenario, if the hp->ops->notifier_add()
callback in the function fails, where it sets the tty->driver_data to
NULL, the parallel hvc_open() can see this NULL and cause a memory abort.
Hence, serialize hvc_open and check if tty->driver_data is NULL before
proceeding.

The issue can be easily reproduced by launching two tasks simultaneously
that do nothing but open() and close() on /dev/hvcX.
For example:
$ ./simple_open_close /dev/hvc0 & ./simple_open_close /dev/hvc0 &
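
A hedged sketch of such a reproducer (simple_open_close is not in the tree, so this is only a guess at its shape):

#include <fcntl.h>
#include <unistd.h>

int main(int argc, char **argv)
{
        const char *dev = argc > 1 ? argv[1] : "/dev/hvc0";
        int i;

        /* Hammer open()/close(); run two instances in parallel. */
        for (i = 0; i < 100000; i++) {
                int fd = open(dev, O_RDWR);

                if (fd >= 0)
                        close(fd);
        }
        return 0;
}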

Signed-off-by: Raghavendra Rao Ananta 
Link: https://lore.kernel.org/r/20200428032601.22127-1-rana...@codeaurora.org
Signed-off-by: Greg Kroah-Hartman 
Signed-off-by: Sasha Levin 
---
 drivers/tty/hvc/hvc_console.c | 16 ++--
 1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/hvc/hvc_console.c b/drivers/tty/hvc/hvc_console.c
index cdcc64ea2554..f8e43a6faea9 100644
--- a/drivers/tty/hvc/hvc_console.c
+++ b/drivers/tty/hvc/hvc_console.c
@@ -75,6 +75,8 @@ static LIST_HEAD(hvc_structs);
  */
 static DEFINE_MUTEX(hvc_structs_mutex);
 
+/* Mutex to serialize hvc_open */
+static DEFINE_MUTEX(hvc_open_mutex);
 /*
  * This value is used to assign a tty->index value to a hvc_struct based
  * upon order of exposure via hvc_probe(), when we can not match it to
@@ -346,16 +348,24 @@ static int hvc_install(struct tty_driver *driver, struct tty_struct *tty)
  */
 static int hvc_open(struct tty_struct *tty, struct file * filp)
 {
-   struct hvc_struct *hp = tty->driver_data;
+   struct hvc_struct *hp;
unsigned long flags;
int rc = 0;
 
+   mutex_lock(&hvc_open_mutex);
+
+   hp = tty->driver_data;
+   if (!hp) {
+   rc = -EIO;
+   goto out;
+   }
+
spin_lock_irqsave(&hp->port.lock, flags);
/* Check and then increment for fast path open. */
if (hp->port.count++ > 0) {
spin_unlock_irqrestore(&hp->port.lock, flags);
hvc_kick();
-   return 0;
+   goto out;
} /* else count == 0 */
spin_unlock_irqrestore(&hp->port.lock, flags);
 
@@ -383,6 +393,8 @@ static int hvc_open(struct tty_struct *tty, struct file * filp)
/* Force wakeup of the polling thread */
hvc_kick();
 
+out:
+   mutex_unlock(&hvc_open_mutex);
return rc;
 }
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 110/388] ibmvnic: Flush existing work items before device removal

2020-06-17 Thread Sasha Levin
From: Thomas Falcon 

[ Upstream commit 6954a9e4192b86d778fb52b525fd7b62d51b1147 ]

Ensure that all scheduled work items have completed before continuing
with device removal and after further event scheduling has been
halted. This patch fixes a bug where a scheduled driver reset event
is processed following device removal.

Signed-off-by: Thomas Falcon 
Signed-off-by: David S. Miller 
Signed-off-by: Sasha Levin 
---
 drivers/net/ethernet/ibm/ibmvnic.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/ibm/ibmvnic.c 
b/drivers/net/ethernet/ibm/ibmvnic.c
index 197dc5b2c090..1b4d04e4474b 100644
--- a/drivers/net/ethernet/ibm/ibmvnic.c
+++ b/drivers/net/ethernet/ibm/ibmvnic.c
@@ -5184,6 +5184,9 @@ static int ibmvnic_remove(struct vio_dev *dev)
adapter->state = VNIC_REMOVING;
spin_unlock_irqrestore(&adapter->state_lock, flags);
 
+   flush_work(&adapter->ibmvnic_reset);
+   flush_delayed_work(&adapter->ibmvnic_delayed_reset);
+
rtnl_lock();
unregister_netdevice(netdev);
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 107/388] scsi: ibmvscsi: Don't send host info in adapter info MAD after LPM

2020-06-17 Thread Sasha Levin
From: Tyrel Datwyler 

[ Upstream commit 4919b33b63c8b69d8dcf2b867431d0e3b6dc6d28 ]

The adapter info MAD is used to send the client info and receive the host
info as a response. A persistent buffer is used and as such the client info
is overwritten after the response. During the course of a normal adapter
reset the client info is refreshed in the buffer in preparation for sending
the adapter info MAD.

However, in the special case of LPM where we reenable the CRQ instead of a
full CRQ teardown and reset we fail to refresh the client info in the
adapter info buffer. As a result, after Live Partition Migration (LPM) we
erroneously report the host's info as our own.

[mkp: typos]

Link: https://lore.kernel.org/r/20200603203632.18426-1-tyr...@linux.ibm.com
Signed-off-by: Tyrel Datwyler 
Signed-off-by: Martin K. Petersen 
Signed-off-by: Sasha Levin 
---
 drivers/scsi/ibmvscsi/ibmvscsi.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/scsi/ibmvscsi/ibmvscsi.c b/drivers/scsi/ibmvscsi/ibmvscsi.c
index 59f0f1030c54..c5711c659b51 100644
--- a/drivers/scsi/ibmvscsi/ibmvscsi.c
+++ b/drivers/scsi/ibmvscsi/ibmvscsi.c
@@ -415,6 +415,8 @@ static int ibmvscsi_reenable_crq_queue(struct crq_queue 
*queue,
int rc = 0;
struct vio_dev *vdev = to_vio_dev(hostdata->dev);
 
+   set_adapter_info(hostdata);
+
/* Re-enable the CRQ */
do {
if (rc)
-- 
2.25.1



[PATCH AUTOSEL 5.7 086/388] powerpc/crashkernel: Take "mem=" option into account

2020-06-17 Thread Sasha Levin
From: Pingfan Liu 

[ Upstream commit be5470e0c285a68dc3afdea965032f5ddc8269d7 ]

'mem=" option is an easy way to put high pressure on memory during
some test. Hence after applying the memory limit, instead of total
mem, the actual usable memory should be considered when reserving mem
for crashkernel. Otherwise the boot up may experience OOM issue.

E.g. it would reserve 4G prior to the change and 512M afterward, if
passing
crashkernel="2G-4G:384M,4G-16G:512M,16G-64G:1G,64G-128G:2G,128G-:4G",
and mem=5G on a 256G machine.

This issue is powerpc specific because it puts higher priority on
fadump and kdump reservation than on "mem=". Refer to the following
code:
if (fadump_reserve_mem() == 0)
reserve_crashkernel();
...
/* Ensure that total memory size is page-aligned. */
limit = ALIGN(memory_limit ?: memblock_phys_mem_size(), PAGE_SIZE);
memblock_enforce_memory_limit(limit);

While on other arches, "mem=" takes higher priority and is already
reflected in memblock_phys_mem_size() by the time
reserve_crashkernel() is called.

Signed-off-by: Pingfan Liu 
Reviewed-by: Hari Bathini 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/1585749644-4148-1-git-send-email-kernelf...@gmail.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/kexec/core.c | 8 +---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kexec/core.c b/arch/powerpc/kexec/core.c
index 078fe3d76feb..56da5eb2b923 100644
--- a/arch/powerpc/kexec/core.c
+++ b/arch/powerpc/kexec/core.c
@@ -115,11 +115,12 @@ void machine_kexec(struct kimage *image)
 
 void __init reserve_crashkernel(void)
 {
-   unsigned long long crash_size, crash_base;
+   unsigned long long crash_size, crash_base, total_mem_sz;
int ret;
 
+   total_mem_sz = memory_limit ? memory_limit : memblock_phys_mem_size();
/* use common parsing */
-   ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
+   ret = parse_crashkernel(boot_command_line, total_mem_sz,
&crash_size, &crash_base);
if (ret == 0 && crash_size > 0) {
crashk_res.start = crash_base;
@@ -178,6 +179,7 @@ void __init reserve_crashkernel(void)
/* Crash kernel trumps memory limit */
if (memory_limit && memory_limit <= crashk_res.end) {
memory_limit = crashk_res.end + 1;
+   total_mem_sz = memory_limit;
printk("Adjusted memory limit for crashkernel, now 0x%llx\n",
   memory_limit);
}
@@ -186,7 +188,7 @@ void __init reserve_crashkernel(void)
"for crashkernel (System RAM: %ldMB)\n",
(unsigned long)(crash_size >> 20),
(unsigned long)(crashk_res.start >> 20),
-   (unsigned long)(memblock_phys_mem_size() >> 20));
+   (unsigned long)(total_mem_sz >> 20));
 
if (!memblock_is_region_memory(crashk_res.start, crash_size) ||
memblock_reserve(crashk_res.start, crash_size)) {
-- 
2.25.1



[PATCH AUTOSEL 5.7 080/388] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-06-17 Thread Sasha Levin
From: Kajol Jain 

[ Upstream commit b4ac18eead28611ff470d0f47a35c4e0ac080d9c ]

Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values") changed 24x7 counters to print the _change_ in the counter value
rather than the raw value. In case of transactions, the event count
is set to 0 at the beginning of the transaction. It also sets
the event's prev_count to the raw value at the time of initialization.
Because the event count is set to 0, we see some weird behaviour
whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704  120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704  5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733  8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733 10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884 56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887 18,446,744,073,709,551,616 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Getting these large values when we use -I.

Because we set event_count to 0, in the interval case the overall
event_count does not increase monotonically: a new delta can be smaller
than the previous count, so when intervals are printed the difference
goes negative and shows up as these large values.

This patch removes the part of 'h_24x7_event_read' that sets event_count
to 0. There won't be much impact, as we already set event->hw.prev_count
to the raw value at the time of initialization to print the change in value.
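
To illustrate where numbers like the ones above come from (a negative 64-bit delta printed as unsigned; the values here are made up):

#include <stdio.h>
#include <inttypes.h>

int main(void)
{
        int64_t prev = 120, now = 8;            /* counter reset between reads */
        uint64_t printed = (uint64_t)(now - prev);

        /* -112 wraps to 2^64 - 112 = 18446744073709551504 */
        printf("%" PRIu64 "\n", printed);
        return 0;
}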

With this patch
In power9 platform

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900 131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920 204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Suggested-by: Sukadev Bhattiprolu 
Signed-off-by: Kajol Jain 
Tested-by: Madhavan Srinivasan 
Signed-off-by: Michael Ellerman 
Link: https://lore.kernel.org/r/20200525104308.9814-2-kj...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/perf/hv-24x7.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 573e0b309c0c..48e8f4b17b91 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1400,16 +1400,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = &get_cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(&event->count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.25.1



[PATCH AUTOSEL 5.7 072/388] powerpc/ptdump: Add _PAGE_COHERENT flag

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit 3af4786eb429b2df76cbd7ce3bae21467ac3e4fb ]

For platforms using shared.c (4xx, Book3e, Book3s/32), also handle the
_PAGE_COHERENT flag which corresponds to the M bit of the WIMG flags.

Signed-off-by: Christophe Leroy 
[mpe: Make it more verbose, use "coherent" rather than "m"]
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/324c3d860717e8e91fca3bb6c0f8b23e1644a404.1589866984.git.christophe.le...@csgroup.eu
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/ptdump/shared.c | 5 +
 1 file changed, 5 insertions(+)

diff --git a/arch/powerpc/mm/ptdump/shared.c b/arch/powerpc/mm/ptdump/shared.c
index f7ed2f187cb0..784f8df17f73 100644
--- a/arch/powerpc/mm/ptdump/shared.c
+++ b/arch/powerpc/mm/ptdump/shared.c
@@ -30,6 +30,11 @@ static const struct flag_info flag_array[] = {
.val= _PAGE_PRESENT,
.set= "present",
.clear  = "   ",
+   }, {
+   .mask   = _PAGE_COHERENT,
+   .val= _PAGE_COHERENT,
+   .set= "coherent",
+   .clear  = "",
}, {
.mask   = _PAGE_GUARDED,
.val= _PAGE_GUARDED,
-- 
2.25.1



[PATCH AUTOSEL 5.7 065/388] powerpc/book3s64/radix/tlb: Determine hugepage flush correctly

2020-06-17 Thread Sasha Levin
From: "Aneesh Kumar K.V" 

[ Upstream commit 8f53f9c0f68ab2168f637494b9e24034899c1310 ]

With a 64K page size flush with start and end:

  (start, end) = (721f680d, 721f680e)

results in:

  (hstart, hend) = (721f6820, 721f6800)

ie. hstart is above hend, which indicates no huge page flush is
needed.

However the current logic incorrectly sets hflush = true in this case,
because hstart != hend.

That causes us to call __tlbie_va_range() passing hstart/hend, to do a
huge page flush even though we don't need to. __tlbie_va_range() will
skip the actual tlbie operation for start > end. But it will still end
up calling fixup_tlbie_va_range() and doing the TLB fixups in there,
which is harmless but unnecessary work.
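
A small userspace illustration of the rounding (PMD_SIZE here is a stand-in value, and the addresses are arbitrary):

#include <stdio.h>

#define PMD_SIZE  (2UL << 20)           /* stand-in value */
#define PMD_MASK  (~(PMD_SIZE - 1))

int main(void)
{
        unsigned long start = 0x40000d000UL;    /* range covers no full PMD */
        unsigned long end   = 0x40000e000UL;
        unsigned long hstart = (start + PMD_SIZE - 1) & PMD_MASK;
        unsigned long hend   = end & PMD_MASK;

        /* hstart rounds up past hend, so hstart < hend is false: no hflush */
        printf("hstart=%#lx hend=%#lx -> hflush=%d\n",
               hstart, hend, hstart < hend);
        return 0;
}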

Reported-by: Bharata B Rao 
Signed-off-by: Aneesh Kumar K.V 
Reviewed-by: Nicholas Piggin 
[mpe: Drop else case, hflush is already false, flesh out change log]
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/20200513030616.152288-1-aneesh.ku...@linux.ibm.com
Signed-off-by: Sasha Levin 
---
 arch/powerpc/mm/book3s64/radix_tlb.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c 
b/arch/powerpc/mm/book3s64/radix_tlb.c
index 758ade2c2b6e..b5cc9b23cf02 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -884,9 +884,7 @@ static inline void __radix__flush_tlb_range(struct mm_struct *mm,
if (IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE)) {
hstart = (start + PMD_SIZE - 1) & PMD_MASK;
hend = end & PMD_MASK;
-   if (hstart == hend)
-   hflush = false;
-   else
+   if (hstart < hend)
hflush = true;
}
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 058/388] ps3disk: use the default segment boundary

2020-06-17 Thread Sasha Levin
From: Emmanuel Nicolet 

[ Upstream commit 720bc316690bd27dea9d71510b50f0cd698ffc32 ]

Since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

  kernel BUG at block/bio.c:1853!
  Oops: Exception in kernel mode, sig: 5 [#1]
  BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
  Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
  ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
  CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
  NIP:  c027d0d0 LR: c027d0b0 CTR: 
  REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
  MSR:  80028032   CR: 44008240  XER: 2000
  IRQMASK: 0
  GPR00: c0289368 c135b120 c084a500 c4ff8300
  GPR04: 0c00 c4c905e0 c4c905e0 
  GPR08:  0001  
  GPR12:  c08ef000 003e 00080001
  GPR16: 0100   0004
  GPR20: c062fd7e 0001  0080
  GPR24: c0781788 c135b350 0080 c4c905e0
  GPR28: c135b348 c4ff8300  c4c9
  NIP [c027d0d0] .bio_split+0x28/0xac
  LR [c027d0b0] .bio_split+0x8/0xac
  Call Trace:
  [c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
  [c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
  [c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
  [c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
  [c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
  [c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
  [c135b600] [c0206184] .mpage_readpages+0xa4/0x184
  [c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
  [c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
  [c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
  [c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
  [c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
  [c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
  [c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
  [c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
  [c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
  [c135be20] [c000a524] system_call+0x5c/0x70
  Instruction dump:
  7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
  7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
  [ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the
request queue to -1UL. This makes get_max_segment_size() return zero
when offset is zero, whatever the max segment size. The test with
BLK_SEG_BOUNDARY_MASK fails and 'mask - (mask & offset) + 1' overflows
to zero in the return statement.

Not setting the segment boundary and using the default
value (BLK_SEG_BOUNDARY_MASK) fixes the problem.
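
A userspace sketch of the arithmetic (this mirrors the expression quoted above, not the block layer code itself; a 64-bit unsigned long is assumed):

#include <stdio.h>

static unsigned long seg_size(unsigned long mask, unsigned long offset)
{
        return mask - (mask & offset) + 1;      /* expression from the changelog */
}

int main(void)
{
        printf("%#lx\n", seg_size(-1UL, 0));            /* wraps to 0 */
        printf("%#lx\n", seg_size(0xffffffffUL, 0));    /* 0x100000000 */
        return 0;
}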

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
Signed-off-by: Michael Ellerman 
Link: 
https://lore.kernel.org/r/060a416c43138f45105c0540eff1a45539f7e2fc.1589049250.git.ge...@infradead.org
Signed-off-by: Sasha Levin 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index c5c6487a19d5..7b55811c2a81 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -454,7 +454,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
queue->queuedata = dev;
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.25.1



[PATCH AUTOSEL 5.7 032/388] powerpc/kasan: Fix stack overflow by increasing THREAD_SHIFT

2020-06-17 Thread Sasha Levin
From: Christophe Leroy 

[ Upstream commit edbadaf0671072298e506074128b64e003c5812c ]

When CONFIG_KASAN is selected, the stack usage is increased.

In the same way as x86 and arm64 architectures, increase
THREAD_SHIFT when CONFIG_KASAN is selected.

Fixes: 2edb16efc899 ("powerpc/32: Add KASAN support")
Reported-by: 
Signed-off-by: Christophe Leroy 
Signed-off-by: Michael Ellerman 
Link: https://bugzilla.kernel.org/show_bug.cgi?id=207129
Link: 
https://lore.kernel.org/r/2c50f3b1c9bbaa4217c9a98f3044bd2a36c46a4f.1586361277.git.christophe.le...@c-s.fr
Signed-off-by: Sasha Levin 
---
 arch/powerpc/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index b29d7cb38368..51a074c26793 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -773,6 +773,7 @@ config THREAD_SHIFT
range 13 15
default "15" if PPC_256K_PAGES
default "14" if PPC64
+   default "14" if KASAN
default "13"
help
  Used to define the stack size. The default is almost always what you
-- 
2.25.1



[PATCH AUTOSEL 5.7 013/388] ASoC: fsl_esai: Disable exception interrupt before scheduling tasklet

2020-06-17 Thread Sasha Levin
From: Shengjiu Wang 

[ Upstream commit 1fecbb71fe0e46b886f84e3b6decca6643c3af6d ]

Disable the exception interrupt before scheduling the tasklet, otherwise
if the tasklet isn't handled immediately, there will be endless xrun
interrupts.

Fixes: 7ccafa2b3879 ("ASoC: fsl_esai: recover the channel swap after xrun")
Signed-off-by: Shengjiu Wang 
Acked-by: Nicolin Chen 
Link: 
https://lore.kernel.org/r/a8f2ad955aac9e52587beedc1133b3efbe746895.1587968824.git.shengjiu.w...@nxp.com
Signed-off-by: Mark Brown 
Signed-off-by: Sasha Levin 
---
 sound/soc/fsl/fsl_esai.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/sound/soc/fsl/fsl_esai.c b/sound/soc/fsl/fsl_esai.c
index c7a49d03463a..84290be778f0 100644
--- a/sound/soc/fsl/fsl_esai.c
+++ b/sound/soc/fsl/fsl_esai.c
@@ -87,6 +87,10 @@ static irqreturn_t esai_isr(int irq, void *devid)
if ((saisr & (ESAI_SAISR_TUE | ESAI_SAISR_ROE)) &&
esai_priv->reset_at_xrun) {
dev_dbg(&pdev->dev, "reset module for xrun\n");
+   regmap_update_bits(esai_priv->regmap, REG_ESAI_TCR,
+  ESAI_xCR_xEIE_MASK, 0);
+   regmap_update_bits(esai_priv->regmap, REG_ESAI_RCR,
+  ESAI_xCR_xEIE_MASK, 0);
tasklet_schedule(&esai_priv->task);
}
 
-- 
2.25.1



Re: [PATCH 3/3] powerpc/8xx: Provide ptep_get() with 16k pages

2020-06-17 Thread Michael Ellerman
Christophe Leroy  writes:
> Le 17/06/2020 à 16:38, Peter Zijlstra a écrit :
>> On Thu, Jun 18, 2020 at 12:21:22AM +1000, Michael Ellerman wrote:
>>> Peter Zijlstra  writes:
 On Mon, Jun 15, 2020 at 12:57:59PM +, Christophe Leroy wrote:
>> 
> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
> +#define __HAVE_ARCH_PTEP_GET
> +static inline pte_t ptep_get(pte_t *ptep)
> +{
> + pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0};
> +
> + return pte;
> +}
> +#endif

 Would it make sense to have a comment with this magic? The casual reader
 might wonder WTH just happened when he stumbles on this :-)
>>>
>>> I tried writing a helpful comment but it's too late for my brain to form
>>> sensible sentences.
>>>
>>> Christophe can you send a follow-up with a comment explaining it? In
>>> particular the zero entries stand out, it's kind of subtle that those
>>> entries are only populated with the right value when we write to the
>>> page table.
>> 
>> static inline pte_t ptep_get(pte_t *ptep)
>> {
>>  unsigned long val = READ_ONCE(ptep->pte);
>>  /* 16K pages have 4 identical value 4K entries */
>>  pte_t pte = {val, val, val, val};
>>  return pte;
>> }
>> 
>> Maybe something like that?
>
> This should work as well. Indeed nobody cares about what's in the other 
> three. They are only there to ensure that ptep++ increases the ptep 
> pointer by 16 bytes. Only the HW require 4 identical values, that's 
> taken care of in set_pte_at() and pte_update().

Right, but it seems less error-prone to have the in-memory
representation match what we have in the page table (well that's
in-memory too but you know what I mean).

> So we should use the most efficient. Thinking once more, maybe what you 
> propose is the most efficient as there is no need to load another 
> register with value 0 in order to write it in the stack.

On 64-bit I'd say it makes zero difference, the only thing that's going
to matter is the load from ptep->pte. I don't know whether that's true
on the 8xx cores though.

cheers


Re: [PATCH 3/3] powerpc/8xx: Provide ptep_get() with 16k pages

2020-06-17 Thread Michael Ellerman
Peter Zijlstra  writes:
> On Thu, Jun 18, 2020 at 12:21:22AM +1000, Michael Ellerman wrote:
>> Peter Zijlstra  writes:
>> > On Mon, Jun 15, 2020 at 12:57:59PM +, Christophe Leroy wrote:
>
>> >> +#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
>> >> +#define __HAVE_ARCH_PTEP_GET
>> >> +static inline pte_t ptep_get(pte_t *ptep)
>> >> +{
>> >> + pte_t pte = {READ_ONCE(ptep->pte), 0, 0, 0};
>> >> +
>> >> + return pte;
>> >> +}
>> >> +#endif
>> >
>> > Would it make sense to have a comment with this magic? The casual reader
>> > might wonder WTH just happened when he stumbles on this :-)
>> 
>> I tried writing a helpful comment but it's too late for my brain to form
>> sensible sentences.
>> 
>> Christophe can you send a follow-up with a comment explaining it? In
>> particular the zero entries stand out, it's kind of subtle that those
>> entries are only populated with the right value when we write to the
>> page table.
>
> static inline pte_t ptep_get(pte_t *ptep)
> {
>   unsigned long val = READ_ONCE(ptep->pte);
>   /* 16K pages have 4 identical value 4K entries */
>   pte_t pte = {val, val, val, val};
>   return pte;
> }
>
> Maybe something like that?

I think val wants to be pte_basic_t, but otherwise yeah I like that much
better.
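
For reference, a sketch that folds the two suggestions above together (illustrative only, not necessarily what gets merged; pte_basic_t is the per-entry type mentioned):

#if defined(CONFIG_PPC_8xx) && defined(CONFIG_PPC_16K_PAGES)
#define __HAVE_ARCH_PTEP_GET
static inline pte_t ptep_get(pte_t *ptep)
{
        pte_basic_t val = READ_ONCE(ptep->pte);
        /* 16K pages have 4 identically valued 4K entries */
        pte_t pte = {val, val, val, val};

        return pte;
}
#endif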

cheers


Re: [PATCH v5 01/13] powerpc: Remove Xilinx PPC405/PPC440 support

2020-06-17 Thread Michael Ellerman
Nick Desaulniers  writes:
> On Wed, Jun 17, 2020 at 3:20 AM Michael Ellerman  wrote:
>> Michael Ellerman  writes:
>> > Michal Simek  writes:
>> 
>>
>> >> Or if bamboo requires uImage to be built by default you can do it via
>> >> Kconfig.
>> >>
>> >> diff --git a/arch/powerpc/platforms/44x/Kconfig
>> >> b/arch/powerpc/platforms/44x/Kconfig
>> >> index 39e93d23fb38..300864d7b8c9 100644
>> >> --- a/arch/powerpc/platforms/44x/Kconfig
>> >> +++ b/arch/powerpc/platforms/44x/Kconfig
>> >> @@ -13,6 +13,7 @@ config BAMBOO
>> >> select PPC44x_SIMPLE
>> >> select 440EP
>> >> select FORCE_PCI
>> >> +   select DEFAULT_UIMAGE
>> >> help
>> >>   This option enables support for the IBM PPC440EP evaluation 
>> >> board.
>> >
>> > Who knows what the actual bamboo board used. But I'd be happy to take a
>> > SOB'ed patch to do the above, because these days the qemu emulation is
>> > much more likely to be used than the actual board.
>>
>> I just went to see why my CI boot of 44x didn't catch this, and it's
>> because I don't use the uImage, I just boot the vmlinux directly:
>>
>>   $ qemu-system-ppc -M bamboo -m 128m -display none -kernel build~/vmlinux 
>> -append "console=ttyS0" -display none -nodefaults -serial mon:stdio
>>   Linux version 5.8.0-rc1-00118-g69119673bd50 (michael@alpine1-p1) (gcc 
>> (Ubuntu 9.3.0-10ubuntu2) 9.3.0, GNU ld (GNU Binutils for Ubuntu) 2.34) #4 
>> Wed Jun 17 20:19:22 AEST 2020
>>   Using PowerPC 44x Platform machine description
>>   ioremap() called early from find_legacy_serial_ports+0x690/0x770. Use 
>> early_ioremap() instead
>>   printk: bootconsole [udbg0] enabled
>>
>>
>> So that's probably the simplest solution?
>
> If the uImage or zImage self decompresses, I would prefer to test that as 
> well.

The uImage is decompressed by qemu AIUI.

>> That means previously arch/powerpc/boot/zImage was just a hardlink to
>> the uImage:
>
> It sounds like we can just boot the zImage, or is that no longer
> created with the uImage?

The zImage won't boot on bamboo.

Because of the vagaries of the arch/powerpc/boot/Makefile the zImage
ends up pointing to treeImage.ebony, which is for a different board.

The zImage link is made to the first item in $(image-y):

$(obj)/zImage:  $(addprefix $(obj)/, $(image-y))
$(Q)rm -f $@; ln $< $@
 ^
 first preqrequisite

Which for this defconfig happens to be:

image-$(CONFIG_EBONY)   += treeImage.ebony cuImage.ebony

If you turned off CONFIG_EBONY then the zImage will be a link to
treeImage.bamboo, but qemu can't boot that either.

It's kind of nuts that the zImage points to some arbitrary image
depending on what's configured and the order of things in the Makefile.
But I'm not sure how we make it less nuts without risking breaking
people's existing setups.

cheers


Re: [PATCH] scsi: target/sbp: remove firewire SBP target driver

2020-06-17 Thread Finn Thain
On Tue, 16 Jun 2020, Martin K. Petersen wrote:

> 
> However, keeping code around is not free.

Right. And removing code isn't free either, if it forces people to find 
workarounds.

> Core interfaces change frequently.  Nobody enjoys having to tweak host 
> templates for 50 devices they have never even heard about.

And yet some people seem to enjoy writing patches that are as trivial as 
they are invasive...

You seem to be making an argument for more automation here, not an 
argument for less code. Or is there some upper bound to the size of the 
kernel, that might be lifted by adding maintainers? (Can you deliver a 
better product by adding more developers to your project?)

> Also, we now live in a reality where there is a constant barrage of 
> build bots and code analyzers sending mail. So the effective cost of 
> keeping code around in the tree is going up.

But if maintenance cost rises due to good analysis, the value of the code 
should rise too. So what's the problem? It seems to me that the real 
problem is too many analyses that generate pedantic noise and no tangible 
improvement in code quality or value.

> I get a substantial amount of code analysis mail about drivers nobody 
> have touched in a decade or more.
> 

When stable, mature code fails analysis, the analysis is also questionable 
(in the absence of real examples).

> Consequently, I am much more inclined to remove drivers than I have been 
> in the past. But I am also very happy to bring them back if somebody 
> uses them or - even better - are willing to step up and maintain them.
> 

You seem to be saying that 1) a driver should be removed when it no longer 
meets the present threshold for code quality and 2) that a low quality 
driver is eligible for re-addition because someone wants to use it.
I don't think you can have it both ways.

> I don't particularly like the notion of a driver being orphaned because 
> all that really means is that the driver transitions from being (at 
> least partially) somebody else's problem to being mine and mine alone.
> 

Yes it's your problem but only on a best-effort basis.

Many issues detected by automatic analyzers can be fixed with automatic 
code transformation tools. This kind of solution works tree-wide, so even 
if some defect in your driver is "yours and yours alone", the solution 
will probably come from others.

This email, like yours, is just hand-waving. So feel free to ignore it or 
(preferably) provide evidence of real defects.


Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Denis Efremov



On 6/16/20 9:53 PM, Joe Perches wrote:
> On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote:
>>  v4:
>>   - Break out the memzero_explicit() change as suggested by Dan Carpenter
>> so that it can be backported to stable.
>>   - Drop the "crypto: Remove unnecessary memzero_explicit()" patch for
>> now as there can be a bit more discussion on what is best. It will be
>> introduced as a separate patch later on after this one is merged.
> 
> To this larger audience and last week without reply:
> https://lore.kernel.org/lkml/573b3fbd5927c643920e1364230c296b23e7584d.ca...@perches.com/
> 
> Are there _any_ fastpath uses of kfree or vfree?
> 
> Many patches have been posted recently to fix mispairings
> of specific types of alloc and free functions.

I've prepared a coccinelle script to highlight these mispairings in a function
a couple of days ago: https://lkml.org/lkml/2020/6/5/953
I've listed all the fixes in the commit message. 

Not so many mispairings actually, and most of them are harmless like:
kmalloc(E) -> kvfree(E)

However, the coccinelle script can't detect cross-function mispairings, i.e.
allocation in one function, free in another function.

Thanks,
Denis


Re: [PATCH v4 0/3] mm, treewide: Rename kzfree() to kfree_sensitive()

2020-06-17 Thread Joe Perches
On Thu, 2020-06-18 at 00:31 +0300, Denis Efremov wrote:
> 
> On 6/16/20 9:53 PM, Joe Perches wrote:
> > On Mon, 2020-06-15 at 21:57 -0400, Waiman Long wrote:
> > >  v4:
> > >   - Break out the memzero_explicit() change as suggested by Dan Carpenter
> > > so that it can be backported to stable.
> > >   - Drop the "crypto: Remove unnecessary memzero_explicit()" patch for
> > > now as there can be a bit more discussion on what is best. It will be
> > > introduced as a separate patch later on after this one is merged.
> > 
> > To this larger audience and last week without reply:
> > https://lore.kernel.org/lkml/573b3fbd5927c643920e1364230c296b23e7584d.ca...@perches.com/
> > 
> > Are there _any_ fastpath uses of kfree or vfree?
> > 
> > Many patches have been posted recently to fix mispairings
> > of specific types of alloc and free functions.
> 
> I've prepared a coccinelle script to highlight these mispairings in a function
> a couple of days ago: https://lkml.org/lkml/2020/6/5/953
> I've listed all the fixes in the commit message. 
> 
> Not so many mispairings actually, and most of them are harmless like:
> kmalloc(E) -> kvfree(E)
> 
> However, coccinelle script can't detect cross-functions mispairings, i.e.
> allocation in one function, free in another funtion.

Hey Denis, thanks for those patches.

If possible, it's probably better to not require these pairings
and use a single standard kfree/free function.

Given the existing ifs in kfree in slab/slob/slub, it seems
likely that adding a few more wouldn't have much impact.




Re: [v1 PATCH 2/2] Add Documentation regarding the ima-kexec-buffer node in the chosen node documentation

2020-06-17 Thread Rob Herring
On Sun, Jun 07, 2020 at 04:33:23PM -0700, Prakhar Srivastava wrote:
> Add Documentation regarding the ima-kexec-buffer node in
>  the chosen node documentation

Run 'git log --oneline Documentation/devicetree/bindings/chosen.txt' and 
write $subject using the dominant format used.

For the commit message, answer why you need the change, not what the 
change is. I can read the diff for that.

>  
> Signed-off-by: Prakhar Srivastava 
> ---
>  Documentation/devicetree/bindings/chosen.txt | 17 +
>  1 file changed, 17 insertions(+)

This file has moved to a schema here[1]. You need to update it.

> 
> diff --git a/Documentation/devicetree/bindings/chosen.txt 
> b/Documentation/devicetree/bindings/chosen.txt
> index 45e79172a646..a15f70c007ef 100644
> --- a/Documentation/devicetree/bindings/chosen.txt
> +++ b/Documentation/devicetree/bindings/chosen.txt
> @@ -135,3 +135,20 @@ e.g.
>   linux,initrd-end = <0x8280>;
>   };
>  };
> +
> +linux,ima-kexec-buffer
> +--
> +
> +This property(currently used by powerpc, arm64) holds the memory range,
> +the address and the size, of the IMA measurement logs that are being carried
> +over to the kexec session.

What's IMA? 

> +
> +/ {
> + chosen {
> + linux,ima-kexec-buffer = <0x9 0x8200 0x0 0x8000>;
> + };
> +};
> +
> +This porperty does not represent real hardware, but the memory allocated for

typo

> +carrying the IMA measurement logs. The address and the suze are expressed in

typo

> +#address-cells and #size-cells, respectively of the root node.
> -- 
> 2.25.1
> 


[1] https://github.com/devicetree-org/dt-schema/blob/master/schemas/chosen.yaml



[PATCH v2 0/2] powerpc/pci: unmap interrupts when a PHB is removed

2020-06-17 Thread Cédric Le Goater
Hello,

When a passthrough IO adapter is removed from a pseries machine using
hash MMU and the XIVE interrupt mode, the POWER hypervisor expects the
guest OS to clear all page table entries related to the adapter. If
some are still present, the RTAS call which isolates the PCI slot
returns error 9001 "valid outstanding translations" and the removal of
the IO adapter fails. This is because, when the PHBs are scanned, Linux
automatically maps some interrupts in the Linux interrupt number space,
but these mappings are never removed.

To solve this problem, we introduce a PPC platform specific
pcibios_remove_bus() routine which clears all interrupt mappings when
the bus is removed. This also clears the associated page table entries
of the ESB pages when using XIVE.

For this purpose, we record the logical interrupt numbers of the
mapped interrupts under the PHB structure and let pcibios_remove_bus()
do the cleanup.
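
As a rough sketch of the idea only (the real code is in patch 1/2, which is
not quoted here): the 'irq_map' and 'irq_count' fields follow the PHB fields
described in the series, and restricting the teardown to the root bus is a
simplification made for this sketch.

#include <linux/irqdomain.h>
#include <linux/pci.h>
#include <asm/pci-bridge.h>

/* Sketch: dispose of the recorded interrupt mappings when the PHB goes. */
void pcibios_remove_bus(struct pci_bus *bus)
{
        struct pci_controller *phb = pci_bus_to_host(bus);
        int i;

        if (!pci_is_root_bus(bus))
                return;

        for (i = 0; i < phb->irq_count; i++)
                if (phb->irq_map[i])
                        irq_dispose_mapping(phb->irq_map[i]);
}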

Tested on:

  - PowerNV with PCI, OpenCAPI, CAPI and GPU adapters. I don't know
    how to inject a failure on a PHB, but that would be a good test.
  - KVM P8+P9 guests with passthrough PCI adapters, but PHBs cannot
    be removed under QEMU/KVM.
  - PowerVM with passthrough PCI adapters (main target)
  
Thanks,

C.

Changes since v1:

 - extended the removal to interrupts other than the legacy INTx.

Cédric Le Goater (2):
  powerpc/pci: unmap legacy INTx interrupts when a PHB is removed
  powerpc/pci: unmap all interrupts when a PHB is removed

 arch/powerpc/include/asm/pci-bridge.h |   6 ++
 arch/powerpc/kernel/pci-common.c  | 114 ++
 2 files changed, 120 insertions(+)

-- 
2.25.4



[PATCH v2 2/2] powerpc/pci: unmap all interrupts when a PHB is removed

2020-06-17 Thread Cédric Le Goater
Some PCI adapters, like GPUs, use the "interrupt-map" property to
describe interrupt mappings other than the legacy INTx interrupts.
There can be more than 4 mappings.

To clear all interrupts when a PHB is removed, we need to enlarge the
'irq_map' array in which mappings are recorded. Compute the number of
interrupt mappings from the "interrupt-map" property and allocate an
'irq_map' array big enough to hold them.
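
As an illustration with typical (made-up) cell counts: if the PHB node has
#address-cells = 3 and #interrupt-cells = 1, and the interrupt parent uses
#interrupt-cells = 2 with no #address-cells, each "interrupt-map" entry
occupies 3 + 1 + 1 + 0 + 2 = 7 cells (child unit address, child interrupt
specifier, parent phandle, parent unit address, parent interrupt specifier),
so a 28-cell property describes 28 / 7 = 4 mappings and 'irq_map' is sized
for 4 entries.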

Signed-off-by: Cédric Le Goater 
---
 arch/powerpc/kernel/pci-common.c | 49 +++-
 1 file changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c
index 515480a4bac6..deb831f0ae13 100644
--- a/arch/powerpc/kernel/pci-common.c
+++ b/arch/powerpc/kernel/pci-common.c
@@ -353,9 +353,56 @@ struct pci_controller *pci_find_controller_for_domain(int domain_nr)
         return NULL;
 }
 
+/*
+ * Assumption is made on the interrupt parent. All interrupt-map
+ * entries are considered to have the same parent.
+ */
+static int pcibios_irq_map_count(struct pci_controller *phb)
+{
+        const __be32 *imap;
+        int imaplen;
+        struct device_node *parent;
+        u32 intsize, addrsize, parintsize, paraddrsize;
+
+        if (of_property_read_u32(phb->dn, "#interrupt-cells", &intsize))
+                return 0;
+        if (of_property_read_u32(phb->dn, "#address-cells", &addrsize))
+                return 0;
+
+        imap = of_get_property(phb->dn, "interrupt-map", &imaplen);
+        if (!imap) {
+                pr_debug("%pOF : no interrupt-map\n", phb->dn);
+                return 0;
+        }
+        imaplen /= sizeof(u32);
+        pr_debug("%pOF : imaplen=%d\n", phb->dn, imaplen);
+
+        if (imaplen < (addrsize + intsize + 1))
+                return 0;
+
+        imap += intsize + addrsize;
+        parent = of_find_node_by_phandle(be32_to_cpup(imap));
+        if (!parent) {
+                pr_debug("%pOF : no imap parent found !\n", phb->dn);
+                return 0;
+        }
+
+        if (of_property_read_u32(parent, "#interrupt-cells", &parintsize)) {
+                pr_debug("%pOF : parent lacks #interrupt-cells!\n", phb->dn);
+                return 0;
+        }
+
+        if (of_property_read_u32(parent, "#address-cells", &paraddrsize))
+                paraddrsize = 0;
+
+        return imaplen / (addrsize + intsize + 1 + paraddrsize + parintsize);
+}
+
 static void pcibios_irq_map_init(struct pci_controller *phb)
 {
-        phb->irq_count = PCI_NUM_INTX;
+        phb->irq_count = pcibios_irq_map_count(phb);
+        if (phb->irq_count < PCI_NUM_INTX)
+                phb->irq_count = PCI_NUM_INTX;
 
         pr_debug("%pOF : interrupt map #%d\n", phb->dn, phb->irq_count);
 
-- 
2.25.4


