RE: [PATCH 0/6] Kill setup_irq()

2020-03-27 Thread Brian Cain
> -Original Message-
> From: linux-hexagon-ow...@vger.kernel.org  ow...@vger.kernel.org> On Behalf Of afzal mohammed
> Sent: Friday, March 27, 2020 11:08 AM
> To: Thomas Gleixner 
> Cc: linux-ker...@vger.kernel.org; linux-arm-ker...@lists.infradead.org;
> linux-samsung-...@vger.kernel.org; x...@kernel.org; linux-
> s...@vger.kernel.org; linux-s...@vger.kernel.org; linuxppc-
> d...@lists.ozlabs.org; linux-par...@vger.kernel.org; linux-
> m...@vger.kernel.org; linux-m...@lists.linux-m68k.org; linux-
> i...@vger.kernel.org; linux-hexa...@vger.kernel.org; linux-c6x-dev@linux-
> c6x.org; linux-o...@vger.kernel.org; linux-al...@vger.kernel.org
> Subject: [PATCH 0/6] Kill setup_irq()
...
> Note 1: sh toolchain is available, but that will not make the  relevant
changes
> compile as it has dependency of 64bit arch toolchain,  did try a Kconfig
hack
> to make it compile w/ 32bit sh toolchain, but it  failed due to other
reasons
> (unknown operands), so gave up on that.
> Note 2: hexagon final image creation fails even w/o my patch, but it  has
> been ensured that w/ my changes relevant object files are getting  built
w/o
> warnings.

Afzal,

What's the nature of the failure in "Note 2"?

-Brian


Re: [yyu168-linux_cet:cet 55/58] powerpc64le-linux-ld: warning: discarding dynamic section .rela___ksymtab+jiffies_to_timeval

2020-03-27 Thread Yu-cheng Yu
On Thu, 2020-02-06 at 04:55 -0800, H.J. Lu wrote:
> On Wed, Feb 5, 2020 at 7:26 PM Michael Ellerman  wrote:
> > "H.J. Lu"  writes:
> > > On Tue, Feb 4, 2020 at 3:37 PM kbuild test robot  wrote:
> > > > tree:   https://github.com/yyu168/linux_cet.git cet
> > > > head:   bba707cc4715c1036b6561ab38b16747f9c49cfa
> > > > commit: 71bb971dd76eeacd351690f28864ad5c5bec3691 [55/58] Discard 
> > > > .note.gnu.property sections in generic NOTES
> > > > config: powerpc-rhel-kconfig (attached as .config)
> > > > compiler: powerpc64le-linux-gcc (GCC) 7.5.0
> > > > reproduce:
> > > > wget 
> > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > >  -O ~/bin/make.cross
> > > > chmod +x ~/bin/make.cross
> > > > git checkout 71bb971dd76eeacd351690f28864ad5c5bec3691
> > > > # save the attached .config to linux build tree
> > > > GCC_VERSION=7.5.0 make.cross ARCH=powerpc
> > > > 
> > > > If you fix the issue, kindly add following tag
> > > > Reported-by: kbuild test robot 
> > > > 
> > > > All warnings (new ones prefixed by >>):
> > > > 
> > > >powerpc64le-linux-ld: warning: discarding dynamic section 
> > > > .rela___ksymtab_gpl+__wait_rcu_gp
> > > 
> > > arch/powerpc/kernel/vmlinux.lds.S has
> > > 
> > >  .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x))
> > >  {
> > >   __rela_dyn_start = .;
> > >   *(.rela*)  Keep .rela* sections
> > >  }
> > 
> > The above is inside #ifdef CONFIG_RELOCATABLE
> > 
> > > ...
> > >  /DISCARD/ : {
> > >   *(*.EMB.apuinfo)
> > >   *(.glink .iplt .plt .rela* .comment)
> > > Discard  .rela* sections.  But it is 
> > > ignored.
> > >   *(.gnu.version*)
> > >   *(.gnu.attributes)
> > >   *(.eh_frame)
> > >  }
> > 
> > But that is not #ifdef'ed at all.
> > 
> > > With my
> > > 
> > > ommit 71bb971dd76eeacd351690f28864ad5c5bec3691
> > > Author: H.J. Lu 
> > > Date:   Thu Jan 30 12:39:09 2020 -0800
> > > 
> > > Discard .note.gnu.property sections in generic NOTES
> > > 
> > > With the command-line option, -mx86-used-note=yes, the x86 assembler
> > > in binutils 2.32 and above generates a program property note in a note
> > > section, .note.gnu.property, to encode used x86 ISAs and features.  
> > > But
> > > kernel linker script only contains a single NOTE segment:
> > > 
> > > /DISCARD/ : { *(.note.gnu.property) }
> > > 
> > > is placed before
> > > 
> > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x))
> > >  {
> > >   __rela_dyn_start = .;
> > >   *(.rela*)  Keep .rela* sections
> > >  }
> > > 
> > > Then .rela* in
> > > 
> > >  /DISCARD/ : {
> > >   *(*.EMB.apuinfo)
> > >   *(.glink .iplt .plt .rela* .comment)
> > >   *(.gnu.version*)
> > >   *(.gnu.attributes)
> > >   *(.eh_frame)
> > >  }
> > > 
> > > is honored.  Can someone from POWERPC comment on it?
> > 
> > Hmm OK. I'm not really a toolchain person.
> > 
> > The comment on DISCARDS says:
> > 
> >* Some archs want to discard exit text/data at runtime rather than
> >* link time due to cross-section references such as alt instructions,
> >* bug table, eh_frame, etc.  DISCARDS must be the last of output
> >* section definitions so that such archs put those in earlier section
> >* definitions.
> >*/
> > 
> > But I guess you're changing those semantics in your series.
> > 
> > This seems to fix the warning for me?
> > 
> > diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
> > b/arch/powerpc/kernel/vmlinux.lds.S
> > index b4c89a1acebb..076b3e8a849d 100644
> > --- a/arch/powerpc/kernel/vmlinux.lds.S
> > +++ b/arch/powerpc/kernel/vmlinux.lds.S
> > @@ -365,9 +365,12 @@ SECTIONS
> > DISCARDS
> > /DISCARD/ : {
> > *(*.EMB.apuinfo)
> > -   *(.glink .iplt .plt .rela* .comment)
> > +   *(.glink .iplt .plt .comment)
> > *(.gnu.version*)
> > *(.gnu.attributes)
> > *(.eh_frame)
> > +#ifndef CONFIG_RELOCATABLE
> > +   *(.rela*)
> > +#endif
> > }
> >  }
> > 
> > 
> > cheers
> 
> This looks correct me.
> 
> Reviewed-by: H.J. Lu 
> 
> Thanks.
> 

Has this been merged into any branch yet?  I just checked the tip tree and did
not see it.

Thanks,
Yu-cheng



[PATCH 6/9] powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig

2020-03-27 Thread Geoff Levand
Set CONFIG_UEVENT_HELPER=y in ps3_defconfig.

commit 1be01d4a57142ded23bdb9e0c8d9369e693b26cc (driver: base: Disable
CONFIG_UEVENT_HELPER by default) disabled the CONFIG_UEVENT_HELPER option
that is needed for hotplug and module loading by most older 32bit powerpc
distributions that users typically install on the PS3.

Cc: Geert Uytterhoeven 
Signed-off-by: Geoff Levand 
---
 arch/powerpc/configs/ps3_defconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/powerpc/configs/ps3_defconfig 
b/arch/powerpc/configs/ps3_defconfig
index 4db51719342a..81b55c880fc3 100644
--- a/arch/powerpc/configs/ps3_defconfig
+++ b/arch/powerpc/configs/ps3_defconfig
@@ -60,6 +60,8 @@ CONFIG_CFG80211=m
 CONFIG_CFG80211_WEXT=y
 CONFIG_MAC80211=m
 # CONFIG_MAC80211_RC_MINSTREL is not set
+CONFIG_UEVENT_HELPER=y
+CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug"
 CONFIG_BLK_DEV_LOOP=y
 CONFIG_BLK_DEV_RAM=y
 CONFIG_BLK_DEV_RAM_SIZE=65535
-- 
2.20.1




[PATCH 2/9] drivers/ps3: Remove duplicate error messages

2020-03-27 Thread Geoff Levand
From: Markus Elfring 

Remove duplicate memory allocation failure error messages.

Signed-off-by: Markus Elfring 
Signed-off-by: Geoff Levand 
---
 drivers/ps3/ps3-lpm.c   | 2 --
 drivers/ps3/ps3-vuart.c | 1 -
 2 files changed, 3 deletions(-)

diff --git a/drivers/ps3/ps3-lpm.c b/drivers/ps3/ps3-lpm.c
index 83c45659bc9d..fcc346296e3a 100644
--- a/drivers/ps3/ps3-lpm.c
+++ b/drivers/ps3/ps3-lpm.c
@@ -,8 +,6 @@ int ps3_lpm_open(enum ps3_lpm_tb_type tb_type, void 
*tb_cache,
lpm_priv->tb_cache_internal = kzalloc(
lpm_priv->tb_cache_size + 127, GFP_KERNEL);
if (!lpm_priv->tb_cache_internal) {
-   dev_err(sbd_core(), "%s:%u: alloc internal tb_cache "
-   "failed\n", __func__, __LINE__);
result = -ENOMEM;
goto fail_malloc;
}
diff --git a/drivers/ps3/ps3-vuart.c b/drivers/ps3/ps3-vuart.c
index ddaa5ea5801a..3b5878edc6c2 100644
--- a/drivers/ps3/ps3-vuart.c
+++ b/drivers/ps3/ps3-vuart.c
@@ -917,7 +917,6 @@ static int ps3_vuart_bus_interrupt_get(void)
vuart_bus_priv.bmp = kzalloc(sizeof(struct ports_bmp), GFP_KERNEL);
 
if (!vuart_bus_priv.bmp) {
-   pr_debug("%s:%d: kzalloc failed.\n", __func__, __LINE__);
result = -ENOMEM;
goto fail_bmp_malloc;
}
-- 
2.20.1




[PATCH 5/9] ps3disk: use the default segment boundary

2020-03-27 Thread Geoff Levand
From: Emmanuel Nicolet 

Hi,
since commit dcebd755926b ("block: use bio_for_each_bvec() to compute
multi-page bvec count"), the kernel will bug_on on the PS3 because
bio_split() is called with sectors == 0:

kernel BUG at block/bio.c:1853!
Oops: Exception in kernel mode, sig: 5 [#1]
BE PAGE_SIZE=4K MMU=Hash PREEMPT SMP NR_CPUS=8 NUMA PS3
Modules linked in: firewire_sbp2 rtc_ps3(+) soundcore ps3_gelic(+) \
ps3rom(+) firewire_core ps3vram(+) usb_common crc_itu_t
CPU: 0 PID: 97 Comm: blkid Not tainted 5.3.0-rc4 #1
NIP:  c027d0d0 LR: c027d0b0 CTR: 
REGS: c135ae90 TRAP: 0700   Not tainted  (5.3.0-rc4)
MSR:  80028032   CR: 44008240  XER: 2000
IRQMASK: 0
GPR00: c0289368 c135b120 c084a500 c4ff8300
GPR04: 0c00 c4c905e0 c4c905e0 
GPR08:  0001  
GPR12:  c08ef000 003e 00080001
GPR16: 0100   0004
GPR20: c062fd7e 0001  0080
GPR24: c0781788 c135b350 0080 c4c905e0
GPR28: c135b348 c4ff8300  c4c9
NIP [c027d0d0] .bio_split+0x28/0xac
LR [c027d0b0] .bio_split+0x8/0xac
Call Trace:
[c135b120] [c027d130] .bio_split+0x88/0xac (unreliable)
[c135b1b0] [c0289368] .__blk_queue_split+0x11c/0x53c
[c135b2d0] [c028f614] .blk_mq_make_request+0x80/0x7d4
[c135b3d0] [c0283a8c] .generic_make_request+0x118/0x294
[c135b4b0] [c0283d34] .submit_bio+0x12c/0x174
[c135b580] [c0205a44] .mpage_bio_submit+0x3c/0x4c
[c135b600] [c0206184] .mpage_readpages+0xa4/0x184
[c135b750] [c01ff8fc] .blkdev_readpages+0x24/0x38
[c135b7c0] [c01589f0] .read_pages+0x6c/0x1a8
[c135b8b0] [c0158c74] .__do_page_cache_readahead+0x118/0x184
[c135b9b0] [c01591a8] .force_page_cache_readahead+0xe4/0xe8
[c135ba50] [c014fc24] .generic_file_read_iter+0x1d8/0x830
[c135bb50] [c01ffadc] .blkdev_read_iter+0x40/0x5c
[c135bbc0] [c01b9e00] .new_sync_read+0x144/0x1a0
[c135bcd0] [c01bc454] .vfs_read+0xa0/0x124
[c135bd70] [c01bc7a4] .ksys_read+0x70/0xd8
[c135be20] [c000a524] system_call+0x5c/0x70
Instruction dump:
7fe3fb78 482e30dc 7c0802a6 482e3085 7c9e2378 f821ff71 7ca42b78 7d3e00d0
7c7d1b78 79290fe0 7cc53378 69290001 <0b09> 81230028 7bca0020 7929ba62
[ end trace 313fec760f30aa1f ]---

The problem originates from setting the segment boundary of the request
queue to -1UL. This makes get_max_segment_size() return zero when offset
is zero, whatever the max segment size. The test with BLK_SEG_BOUNDARY_MASK
fails and 'mask - (mask & offset) + 1' overflows to zero in the return
statement.

Not setting the segment boundary and using the default value
(BLK_SEG_BOUNDARY_MASK) fixes the problem.
Maybe BLK_SEG_BOUNDARY_MASK should be set to -1UL? It's currently set to
only 0xUL. I don't know if that would break anything.

Signed-off-by: Emmanuel Nicolet 
Signed-off-by: Geoff Levand 
---
 drivers/block/ps3disk.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/block/ps3disk.c b/drivers/block/ps3disk.c
index c5c6487a19d5..7b55811c2a81 100644
--- a/drivers/block/ps3disk.c
+++ b/drivers/block/ps3disk.c
@@ -454,7 +454,6 @@ static int ps3disk_probe(struct ps3_system_bus_device *_dev)
queue->queuedata = dev;
 
blk_queue_max_hw_sectors(queue, dev->bounce_size >> 9);
-   blk_queue_segment_boundary(queue, -1UL);
blk_queue_dma_alignment(queue, dev->blk_size-1);
blk_queue_logical_block_size(queue, dev->blk_size);
 
-- 
2.20.1




[PATCH 9/9] powerpc/ps3: Add udbg_panic

2020-03-27 Thread Geoff Levand
BUG_ON() won't work in the early init code, so replace it with
a new routine udbg_panic() that uses udbg_printf() and lv1_panic().

Signed-off-by: Geoff Levand 
---
 arch/powerpc/platforms/ps3/mm.c | 9 -
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/platforms/ps3/mm.c b/arch/powerpc/platforms/ps3/mm.c
index 423be34f0f5f..7c7c2f53eacb 100644
--- a/arch/powerpc/platforms/ps3/mm.c
+++ b/arch/powerpc/platforms/ps3/mm.c
@@ -26,6 +26,13 @@
 #define DBG pr_devel
 #endif
 
+#define udbg_panic(_val, _msg) \
+if (_val) { \
+   udbg_printf("%s:%d: " _msg ": %d\n", \
+   __func__, __LINE__, (int)(_val)); \
+   lv1_panic(0); \
+}
+
 enum {
 #if defined(CONFIG_PS3_DYNAMIC_DMA)
USE_DYNAMIC_DMA = 1,
@@ -313,7 +320,7 @@ static void ps3_mm_region_destroy(struct mem_region *r)
 
if (r->base) {
result = lv1_release_memory(r->base);
-   BUG_ON(result);
+   udbg_panic(result, "lv1_release_memory failed");
r->size = r->base = r->offset = 0;
map.total = map.rm.size;
}
-- 
2.20.1



[PATCH 3/9] net/ps3_gelic_net: Remove duplicate error message

2020-03-27 Thread Geoff Levand
From: Markus Elfring 

Remove an extra message for a memory allocation failure in
function gelic_descr_prepare_rx().

Signed-off-by: Markus Elfring 
Signed-off-by: Geoff Levand 
---
 drivers/net/ethernet/toshiba/ps3_gelic_net.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/net/ethernet/toshiba/ps3_gelic_net.c 
b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
index 070dd6fa9401..8522f3898e0d 100644
--- a/drivers/net/ethernet/toshiba/ps3_gelic_net.c
+++ b/drivers/net/ethernet/toshiba/ps3_gelic_net.c
@@ -382,8 +382,6 @@ static int gelic_descr_prepare_rx(struct gelic_card *card,
descr->skb = dev_alloc_skb(bufsize + GELIC_NET_RXBUF_ALIGN - 1);
if (!descr->skb) {
descr->buf_addr = 0; /* tell DMAC don't touch memory */
-   dev_info(ctodev(card),
-"%s:allocate skb failed !!\n", __func__);
return -ENOMEM;
}
descr->buf_size = cpu_to_be32(bufsize);
-- 
2.20.1




[PATCH 8/9] powerpc/ps3: Add lv1_panic

2020-03-27 Thread Geoff Levand
lv1_panic takes a single parameter, 0=halt, 1=reboot, and it will
never return.

Signed-off-by: Geoff Levand 
---
 arch/powerpc/boot/ppc_asm.h| 6 ++
 arch/powerpc/include/asm/ppc_asm.h | 6 ++
 2 files changed, 12 insertions(+)

diff --git a/arch/powerpc/boot/ppc_asm.h b/arch/powerpc/boot/ppc_asm.h
index 192b97523b05..cf758bc63846 100644
--- a/arch/powerpc/boot/ppc_asm.h
+++ b/arch/powerpc/boot/ppc_asm.h
@@ -8,6 +8,12 @@
  * Copyright (C) 1995-1999 Gary Thomas, Paul Mackerras, Cort Dougan.
  */
 
+.macro lv1_panic
+   li  r3, 0
+   li  r11, 255
+   .long 0x4422
+.endm
+
 /* Condition Register Bit Fields */
 
 #definecr0 0
diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index 6b03dff61a05..e76a6a4020ea 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -13,6 +13,12 @@
 
 #ifdef __ASSEMBLY__
 
+.macro lv1_panic
+   li  r3, 0
+   li  r11, 255
+   .long 0x4422
+.endm
+
 #define SZL(BITS_PER_LONG/8)
 
 /*
-- 
2.20.1




[PATCH 0/9] PS3 patches for v5.7

2020-03-27 Thread Geoff Levand
Hi Michael,

Here are a few PS3 specific patches.  A few remove some reduntant messages,
a few add some minor debugging support, and a few fix some problems during
system boot.

Please consider for v5.7.

-Geoff

The following changes since commit 16fbf79b0f83bc752cee8589279f1ebfe57b3b6e:

  Linux 5.6-rc7 (2020-03-22 18:31:56 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git 
for-merge-ps3

for you to fetch changes up to 1333a8985c4190763c9c0312bcefad8b1ea863c7:

  powerpc/ps3: Add udbg_panic (2020-03-27 13:07:31 -0700)


Dan Carpenter (1):
  powerpc/ps3: remove an unneeded NULL check

Emmanuel Nicolet (1):
  ps3disk: use the default segment boundary

Geoff Levand (4):
  powerpc/ps3: Set CONFIG_UEVENT_HELPER=y in ps3_defconfig
  powerpc/ps3: Add check for otheros image size
  powerpc/ps3: Add lv1_panic
  powerpc/ps3: Add udbg_panic

Markus Elfring (3):
  powerpc/ps3: Remove duplicate error messages
  drivers/ps3: Remove duplicate error messages
  net/ps3_gelic_net: Remove duplicate error message

 arch/powerpc/boot/ppc_asm.h  |  6 ++
 arch/powerpc/boot/wrapper| 13 +++--
 arch/powerpc/configs/ps3_defconfig   |  2 ++
 arch/powerpc/include/asm/ppc_asm.h   |  6 ++
 arch/powerpc/platforms/ps3/mm.c  |  9 -
 arch/powerpc/platforms/ps3/os-area.c |  4 +---
 drivers/block/ps3disk.c  |  1 -
 drivers/net/ethernet/toshiba/ps3_gelic_net.c |  2 --
 drivers/ps3/ps3-lpm.c|  2 --
 drivers/ps3/ps3-vuart.c  |  1 -
 drivers/ps3/sys-manager-core.c   |  2 +-
 11 files changed, 35 insertions(+), 13 deletions(-)

-- 
2.20.1



[PATCH 4/9] powerpc/ps3: remove an unneeded NULL check

2020-03-27 Thread Geoff Levand
From: Dan Carpenter 

Static checkers don't like the inconsistent NULL checking on "ops".
This function is only called once and "ops" isn't NULL so the check can
be removed.

Signed-off-by: Dan Carpenter 
Signed-off-by: Geoff Levand 
---
 drivers/ps3/sys-manager-core.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ps3/sys-manager-core.c b/drivers/ps3/sys-manager-core.c
index 24709c572c0c..e061b7d0632b 100644
--- a/drivers/ps3/sys-manager-core.c
+++ b/drivers/ps3/sys-manager-core.c
@@ -31,7 +31,7 @@ void ps3_sys_manager_register_ops(const struct 
ps3_sys_manager_ops *ops)
 {
BUG_ON(!ops);
BUG_ON(!ops->dev);
-   ps3_sys_manager_ops = ops ? *ops : ps3_sys_manager_ops;
+   ps3_sys_manager_ops = *ops;
 }
 EXPORT_SYMBOL_GPL(ps3_sys_manager_register_ops);
 
-- 
2.20.1




[PATCH 7/9] powerpc/ps3: Add check for otheros image size

2020-03-27 Thread Geoff Levand
The ps3's otheros flash loader has a size limit of 16 MiB for the
uncompressed image.  If that limit will be reached output the
flash image file as 'otheros-too-big.bld'.

Signed-off-by: Geoff Levand 
---
 arch/powerpc/boot/wrapper | 13 +++--
 1 file changed, 11 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index ed6266367bc0..1dfd9fd929c8 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -570,7 +570,16 @@ ps3)
 count=$overlay_size bs=1
 
 odir="$(dirname "$ofile.bin")"
-rm -f "$odir/otheros.bld"
-gzip -n --force -9 --stdout "$ofile.bin" > "$odir/otheros.bld"
+
+# The ps3's flash loader has a size limit of 16 MiB for the uncompressed
+# image.  If a compressed image that exceeded this limit is written to
+# flash the loader will decompress that image until the 16 MiB limit is
+# reached, then enter the system reset vector of the partially decompressed
+# image.  No warning is issued.
+rm -f "$odir"/{otheros,otheros-too-big}.bld
+size=$(${CROSS}nm --no-sort --radix=d "$ofile" | egrep ' _end$' | cut -d' 
' -f1)
+bld="otheros.bld"
+[ $size -le 16777216 ] || bld="otheros-too-big.bld"
+gzip -n --force -9 --stdout "$ofile.bin" > "$odir/$bld"
 ;;
 esac
-- 
2.20.1




[PATCH 1/9] powerpc/ps3: Remove duplicate error messages

2020-03-27 Thread Geoff Levand
From: Markus Elfring 

Remove duplicate memory allocation failure error messages.

Signed-off-by: Markus Elfring 
Signed-off-by: Geoff Levand 
---
 arch/powerpc/platforms/ps3/os-area.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/powerpc/platforms/ps3/os-area.c 
b/arch/powerpc/platforms/ps3/os-area.c
index cbddd63caf2d..e8530371aed6 100644
--- a/arch/powerpc/platforms/ps3/os-area.c
+++ b/arch/powerpc/platforms/ps3/os-area.c
@@ -613,10 +613,8 @@ static int update_flash_db(void)
/* Read in header and db from flash. */
 
header = kmalloc(buf_len, GFP_KERNEL);
-   if (!header) {
-   pr_debug("%s: kmalloc failed\n", __func__);
+   if (!header)
return -ENOMEM;
-   }
 
count = os_area_flash_read(header, buf_len, 0);
if (count < 0) {
-- 
2.20.1




Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S

2020-03-27 Thread Christophe Leroy




Le 27/03/2020 à 19:19, Segher Boessenkool a écrit :

On Fri, Mar 27, 2020 at 04:12:13PM +0100, Christophe Leroy wrote:

Maybe you could also change invalidate_dcache_range():

for (i = 0; i < size >> shift; i++, addr += bytes) {
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
dcbf(addr);
else
dcbi(addr);
}


But please note that flushing is pretty much the opposite from
invalidating (a flush (dcbf) makes sure that what is in the cache now
ends up in memory, while an invalidate (dcbi) makes sure it will *not*
end up in memory).  (Both will remove the addressed cache line from the
data caches).

So you cannot blindly replace them; in all cases you need to look and
see if it does what you need here.

(dcbi is much harder to use correctly -- it can race very easily -- so
in practice you will be fine most of the time; but be careful around
startup code and the like).



At the time being, invalidate_dcache_range() is used in only one place, 
and that's a place for PPC32 only. So I was just suggesting that just in 
case. Maybe there is no point in bothering with that at the time being.


Christophe


Re: [patch V3 12/20] powerpc/ps3: Convert half completion to rcuwait

2020-03-27 Thread Geoff Levand
Hi,

On 3/21/20 4:25 AM, Thomas Gleixner wrote:
> From: Thomas Gleixner 
> 
> The PS3 notification interrupt and kthread use a hacked up completion to
> communicate. Since we're wanting to change the completion implementation and
> this is abuse anyway, replace it with a simple rcuwait since there is only 
> ever
> the one waiter.
> 
> AFAICT the kthread uses TASK_INTERRUPTIBLE to not increase loadavg, kthreads
> cannot receive signals by default and this one doesn't look different. Use
> TASK_IDLE instead.

I tested the patch set applied against v5.6-rc7 on the PS3 and it worked
as expected.

Tested by: Geoff Levand 



Re: [PATCH v2] powerpc/boot: Delete unneeded .globl _zimage_start

2020-03-27 Thread Segher Boessenkool
On Fri, Mar 27, 2020 at 09:50:54AM -0700, Fangrui Song wrote:
> We aim for compatibility with GNU in many aspects to make it easier for
> people to switch over. However, just because there is a subtle behavior
> in GNU toolchain does not mean we need to emulate that behavior.

It isn't subtle.  It is the explicit documented behaviour.

> With
> all due respect, there are a large quantity of legacy behaviors we don't
> want to support.

And it isn't legacy at all.  It is simply the defined semantics.

It is semantics that real code relies on (and has for ages) as well.


Segher


Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard

2020-03-27 Thread Nathan Chancellor
On Fri, Mar 27, 2020 at 06:45:21PM +0100, Christophe Leroy wrote:
> Subject line, change longjump to longjmp
> 
> Le 27/03/2020 à 11:07, Clement Courbet a écrit :
> > Declaring setjmp()/longjmp() as taking longs makes the signature
> > non-standard, and makes clang complain. In the past, this has been
> > worked around by adding -ffreestanding to the compile flags.
> > 
> > The implementation looks like it only ever propagates the value
> > (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
> > with integer parameters.
> > 
> > This allows removing -ffreestanding from the compilation flags.
> > 
> > Context:
> > https://lore.kernel.org/patchwork/patch/1214060
> > https://lore.kernel.org/patchwork/patch/1216174
> > 
> > Signed-off-by: Clement Courbet 
> > ---
> >   arch/powerpc/include/asm/setjmp.h | 6 --
> >   arch/powerpc/kexec/Makefile   | 3 ---
> >   2 files changed, 4 insertions(+), 5 deletions(-)
> > 
> > diff --git a/arch/powerpc/include/asm/setjmp.h 
> > b/arch/powerpc/include/asm/setjmp.h
> > index e9f81bb3f83b..84bb0d140d59 100644
> > --- a/arch/powerpc/include/asm/setjmp.h
> > +++ b/arch/powerpc/include/asm/setjmp.h
> > @@ -7,7 +7,9 @@
> >   #define JMP_BUF_LEN23
> > -extern long setjmp(long *) __attribute__((returns_twice));
> > -extern void longjmp(long *, long) __attribute__((noreturn));
> > +typedef long *jmp_buf;
> 
> Do we need that new opaque typedef ? Why not just keep long * ?

Yes, otherwise the warning comes back:

In file included from arch/powerpc/kexec/crash.c:25:
arch/powerpc/include/asm/setjmp.h:10:12: error: declaration of built-in 
function 'setjmp' requires the declaration of the 'jmp_buf' type, commonly 
provided in the header . [-Werror,-Wincomplete-setjmp-declaration]
extern int setjmp(long *env) __attribute__((returns_twice));
   ^
arch/powerpc/include/asm/setjmp.h:11:13: error: declaration of built-in 
function 'longjmp' requires the declaration of the 'jmp_buf' type, commonly 
provided in the header . [-Werror,-Wincomplete-setjmp-declaration]
extern void longjmp(long *env, int val) __attribute__((noreturn));
^
2 errors generated.

> > +
> > +extern int setjmp(jmp_buf env) __attribute__((returns_twice));
> > +extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
> >   #endif /* _ASM_POWERPC_SETJMP_H */
> > diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
> > index 378f6108a414..86380c69f5ce 100644
> > --- a/arch/powerpc/kexec/Makefile
> > +++ b/arch/powerpc/kexec/Makefile
> > @@ -3,9 +3,6 @@
> >   # Makefile for the linux kernel.
> >   #
> > -# Avoid clang warnings around longjmp/setjmp declarations
> > -CFLAGS_crash.o += -ffreestanding
> > -
> >   obj-y += core.o crash.o core_$(BITS).o
> >   obj-$(CONFIG_PPC32)   += relocate_32.o
> > 
> 
> Christophe


Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S

2020-03-27 Thread Segher Boessenkool
On Fri, Mar 27, 2020 at 04:12:13PM +0100, Christophe Leroy wrote:
> Maybe you could also change invalidate_dcache_range():
> 
>   for (i = 0; i < size >> shift; i++, addr += bytes) {
>   if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
>   dcbf(addr);
>   else
>   dcbi(addr);
>   }

But please note that flushing is pretty much the opposite from
invalidating (a flush (dcbf) makes sure that what is in the cache now
ends up in memory, while an invalidate (dcbi) makes sure it will *not*
end up in memory).  (Both will remove the addressed cache line from the
data caches).

So you cannot blindly replace them; in all cases you need to look and
see if it does what you need here.

(dcbi is much harder to use correctly -- it can race very easily -- so
in practice you will be fine most of the time; but be careful around
startup code and the like).


Segher


[PATCH V2 5/5] selftests/powerpc: Add README for GZIP engine tests

2020-03-27 Thread Raphael Moreira Zinsly
Include a README file with the instructions to use the
testcases at selftests/powerpc/nx-gzip.

Signed-off-by: Bulent Abali 
Signed-off-by: Raphael Moreira Zinsly 
---
 .../powerpc/nx-gzip/99-nx-gzip.rules  |  1 +
 .../testing/selftests/powerpc/nx-gzip/README  | 44 +++
 2 files changed, 45 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/README

diff --git a/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules 
b/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules
new file mode 100644
index ..5a7118495cb3
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/99-nx-gzip.rules
@@ -0,0 +1 @@
+SUBSYSTEM=="nxgzip", KERNEL=="nx-gzip", MODE="0666"
diff --git a/tools/testing/selftests/powerpc/nx-gzip/README 
b/tools/testing/selftests/powerpc/nx-gzip/README
new file mode 100644
index ..a80c289f1d2c
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/README
@@ -0,0 +1,44 @@
+Test the nx-gzip function:
+=
+
+Verify that following device exists:
+  /dev/crypto/nx-gzip
+If you get a permission error run as sudo or set the device permissions:
+   sudo chmod go+rw /dev/crypto/nx-gzip
+However, chmod may not survive across boots. You may create a udev file such
+as:
+   /etc/udev/rules.d/99-nx-gzip.rules
+
+
+Then make and run:
+$ make
+gcc -O3 -I./inc -o gzfht_test gzfht_test.c gzip_vas.c
+gcc -O3 -I./inc -o gunz_test gunz_test.c gzip_vas.c
+
+
+Compress any file using Fixed Huffman mode. Output will have a .nx.gz suffix:
+$ ./gzfht_test gzip_vas.c
+file gzip_vas.c read, 5218 bytes
+compressed 5218 to 2545 bytes total, crc32 checksum = 817543a3
+
+
+Uncompress the previous output. Output will have a .nx.gunzip suffix:
+./gunz_test gzip_vas.c.nx.gz
+gzHeader FLG 0
+00 00 00 00 04 03
+gzHeader MTIME, XFL, OS ignored
+computed checksum 817543a3 isize 1462
+stored   checksum 817543a3 isize 1462
+decomp is complete: fclose
+
+
+Compare two files:
+$ sha1sum gzip_vas.c.nx.gz.nx.gunzip gzip_vas.c
+4e87536f3ee9e771ef30fb0fb27572032ca44ef8  gzip_vas.c.nx.gz.nx.gunzip
+4e87536f3ee9e771ef30fb0fb27572032ca44ef8  gzip_vas.c
+
+
+Note that the code here are intended for testing the nx-gzip hardware function.
+They are not intended for demonstrating performance or compression ratio.
+For more information and source code consider using:
+https://github.com/libnxz/power-gzip
-- 
2.21.0



[PATCH V2 4/5] selftests/powerpc: Add NX-GZIP engine decompress testcase

2020-03-27 Thread Raphael Moreira Zinsly
Include a decompression testcase for the powerpc NX-GZIP
engine.

Signed-off-by: Bulent Abali 
Signed-off-by: Raphael Moreira Zinsly 
---
 .../selftests/powerpc/nx-gzip/Makefile|7 +-
 .../selftests/powerpc/nx-gzip/gunz_test.c | 1078 +
 2 files changed, 1082 insertions(+), 3 deletions(-)
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gunz_test.c

diff --git a/tools/testing/selftests/powerpc/nx-gzip/Makefile 
b/tools/testing/selftests/powerpc/nx-gzip/Makefile
index ab903f63bbbd..82abc19a49a0 100644
--- a/tools/testing/selftests/powerpc/nx-gzip/Makefile
+++ b/tools/testing/selftests/powerpc/nx-gzip/Makefile
@@ -1,9 +1,9 @@
 CC = gcc
 CFLAGS = -O3
 INC = ./inc
-SRC = gzfht_test.c
+SRC = gzfht_test.c gunz_test.c
 OBJ = $(SRC:.c=.o)
-TESTS = gzfht_test
+TESTS = gzfht_test gunz_test
 EXTRA_SOURCES = gzip_vas.c
 
 all:   $(TESTS)
@@ -16,6 +16,7 @@ $(TESTS): $(OBJ)
 
 run_tests: $(TESTS)
./gzfht_test gzip_vas.c
+   ./gunz_test gzip_vas.c.nx.gz
 
 clean:
-   rm -f $(TESTS) *.o *~ *.gz
+   rm -f $(TESTS) *.o *~ *.gz *.gunzip
diff --git a/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c 
b/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c
new file mode 100644
index ..82eb268a8397
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/gunz_test.c
@@ -0,0 +1,1078 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/* P9 gunzip sample code for demonstrating the P9 NX hardware
+ * interface.  Not intended for productive uses or for performance or
+ * compression ratio measurements.  Note also that /dev/crypto/gzip,
+ * VAS and skiboot support are required
+ *
+ * Copyright 2020 IBM Corp.
+ *
+ * Author: Bulent Abali 
+ *
+ * https://github.com/libnxz/power-gzip for zlib api and other utils
+ * Definitions of acronyms used here.  See
+ * P9 NX Gzip Accelerator User's Manual for details:
+ * https://github.com/libnxz/power-gzip/blob/develop/doc/power_nx_gzip_um.pdf
+ *
+ * adler/crc: 32 bit checksums appended to stream tail
+ * ce:   completion extension
+ * cpb:  coprocessor parameter block (metadata)
+ * crb:  coprocessor request block (command)
+ * csb:  coprocessor status block (status)
+ * dht:  dynamic huffman table
+ * dde:  data descriptor element (address, length)
+ * ddl:  list of ddes
+ * dh/fh:dynamic and fixed huffman types
+ * fc:   coprocessor function code
+ * histlen:  history/dictionary length
+ * history:  sliding window of up to 32KB of data
+ * lzcount:  Deflate LZ symbol counts
+ * rembytecnt: remaining byte count
+ * sfbt: source final block type; last block's type during decomp
+ * spbc: source processed byte count
+ * subc: source unprocessed bit count
+ * tebc: target ending bit count; valid bits in the last byte
+ * tpbc: target processed byte count
+ * vas:  virtual accelerator switch; the user mode interface
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "nxu.h"
+#include "nx.h"
+#include "crb.h"
+
+int nx_dbg;
+FILE *nx_gzip_log;
+
+#define NX_MIN(X, Y) (((X) < (Y))?(X):(Y))
+#define NX_MAX(X, Y) (((X) > (Y))?(X):(Y))
+
+#define GETINPC(X) fgetc(X)
+#define FNAME_MAX 1024
+
+/* fifo queue management */
+#define fifo_used_bytes(used) (used)
+#define fifo_free_bytes(used, len) ((len)-(used))
+/* amount of free bytes in the first and last parts */
+#define fifo_free_first_bytes(cur, used, len)  cur)+(used)) <= (len)) \
+ ? (len)-((cur)+(used)) : 0)
+#define fifo_free_last_bytes(cur, used, len)   cur)+(used)) <= (len)) \
+ ? (cur) : (len)-(used))
+/* amount of used bytes in the first and last parts */
+#define fifo_used_first_bytes(cur, used, len)  cur)+(used)) <= (len)) \
+ ? (used) : (len)-(cur))
+#define fifo_used_last_bytes(cur, used, len)   cur)+(used)) <= (len)) \
+ ? 0 : ((used)+(cur))-(len))
+/* first and last free parts start here */
+#define fifo_free_first_offset(cur, used)  ((cur)+(used))
+#define fifo_free_last_offset(cur, used, len)  \
+  fifo_used_last_bytes(cur, used, len)
+/* first and last used parts start here */
+#define fifo_used_first_offset(cur)(cur)
+#define fifo_used_last_offset(cur) (0)
+
+const int fifo_in_len = 1<<24;
+const int fifo_out_len = 1<<24;
+const int page_sz = 1<<16;
+const int line_sz = 1<<7;
+const int window_max = 1<<15;
+const int retry_max = 50;
+
+void *nx_fault_storage_address;
+
+/*
+ * Fault in pages prior to NX job submission.  wr=1 may be required to
+ * touch writeable pages.  System zero pages do not fault-in the page as
+ * intended.  Typically set wr=1 for NX target pages and set wr=0 

[PATCH V2 2/5] selftests/powerpc: Add header files for NX compresion/decompression

2020-03-27 Thread Raphael Moreira Zinsly
Add files to be able to compress and decompress files using the
powerpc NX-GZIP engine.

Signed-off-by: Bulent Abali 
Signed-off-by: Raphael Moreira Zinsly 
---
 .../powerpc/nx-gzip/inc/copy-paste.h  |  54 ++
 .../selftests/powerpc/nx-gzip/inc/nx_dbg.h|  95 +++
 .../selftests/powerpc/nx-gzip/inc/nxu.h   | 651 ++
 3 files changed, 800 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nxu.h

diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h 
b/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h
new file mode 100644
index ..107139b6c7df
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/inc/copy-paste.h
@@ -0,0 +1,54 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#include "nx-helpers.h"
+
+/*
+ * Macros taken from arch/powerpc/include/asm/ppc-opcode.h and other
+ * header files.
+ */
+#define ___PPC_RA(a)(((a) & 0x1f) << 16)
+#define ___PPC_RB(b)(((b) & 0x1f) << 11)
+
+#define PPC_INST_COPY   0x7c20060c
+#define PPC_INST_PASTE  0x7c20070d
+
+#define PPC_COPY(a, b)  stringify_in_c(.long PPC_INST_COPY | \
+   ___PPC_RA(a) | ___PPC_RB(b))
+#define PPC_PASTE(a, b) stringify_in_c(.long PPC_INST_PASTE | \
+   ___PPC_RA(a) | ___PPC_RB(b))
+#define CR0_SHIFT  28
+#define CR0_MASK   0xF
+/*
+ * Copy/paste instructions:
+ *
+ * copy RA,RB
+ * Copy contents of address (RA) + effective_address(RB)
+ * to internal copy-buffer.
+ *
+ * paste RA,RB
+ * Paste contents of internal copy-buffer to the address
+ * (RA) + effective_address(RB)
+ */
+static inline int vas_copy(void *crb, int offset)
+{
+   asm volatile(PPC_COPY(%0, %1)";"
+   :
+   : "b" (offset), "b" (crb)
+   : "memory");
+
+   return 0;
+}
+
+static inline int vas_paste(void *paste_address, int offset)
+{
+   u32 cr;
+
+   cr = 0;
+   asm volatile(PPC_PASTE(%1, %2)";"
+   "mfocrf %0, 0x80;"
+   : "=r" (cr)
+   : "b" (offset), "b" (paste_address)
+   : "memory", "cr0");
+
+   return (cr >> CR0_SHIFT) & CR0_MASK;
+}
diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h 
b/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h
new file mode 100644
index ..f2c0eee2317e
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/inc/nx_dbg.h
@@ -0,0 +1,95 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * Copyright 2020 IBM Corporation
+ *
+ */
+
+#ifndef _NXU_DBG_H_
+#define _NXU_DBG_H_
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern FILE * nx_gzip_log;
+extern int nx_gzip_trace;
+extern unsigned int nx_gzip_inflate_impl;
+extern unsigned int nx_gzip_deflate_impl;
+extern unsigned int nx_gzip_inflate_flags;
+extern unsigned int nx_gzip_deflate_flags;
+
+extern int nx_dbg;
+pthread_mutex_t mutex_log;
+
+#define nx_gzip_trace_enabled()   (nx_gzip_trace & 0x1)
+#define nx_gzip_hw_trace_enabled()(nx_gzip_trace & 0x2)
+#define nx_gzip_sw_trace_enabled()(nx_gzip_trace & 0x4)
+#define nx_gzip_gather_statistics()   (nx_gzip_trace & 0x8)
+#define nx_gzip_per_stream_stat() (nx_gzip_trace & 0x10)
+
+#define prt(fmt, ...) do { \
+   pthread_mutex_lock(_log); \
+   flock(nx_gzip_log->_fileno, LOCK_EX);   \
+   time_t t; struct tm *m; time(); m = localtime();\
+   fprintf(nx_gzip_log, "[%04d/%02d/%02d %02d:%02d:%02d] " \
+   "pid %d: " fmt, \
+   (int)m->tm_year + 1900, (int)m->tm_mon+1, (int)m->tm_mday, \
+   (int)m->tm_hour, (int)m->tm_min, (int)m->tm_sec,\
+   (int)getpid(), ## __VA_ARGS__); \
+   fflush(nx_gzip_log);\
+   flock(nx_gzip_log->_fileno, LOCK_UN);   \
+   pthread_mutex_unlock(_log);   \
+} while (0)
+
+/* Use in case of an error */
+#define prt_err(fmt, ...) do { if (nx_dbg >= 0) {  \
+   prt("%s:%u: Error: "fmt,\
+   __FILE__, __LINE__, ## __VA_ARGS__);\
+}} while (0)
+
+/* Use in case of an warning */
+#define prt_warn(fmt, ...) do {if (nx_dbg >= 1) {  
\
+   prt("%s:%u: Warning: "fmt,  \
+   __FILE__, __LINE__, ## __VA_ARGS__);\
+}} while (0)
+
+/* Informational printouts */
+#define prt_info(fmt, ...) do {if (nx_dbg >= 2) {  
\
+ 

[PATCH V2 3/5] selftests/powerpc: Add NX-GZIP engine compress testcase

2020-03-27 Thread Raphael Moreira Zinsly
Add a compression testcase for the powerpc NX-GZIP engine.

Signed-off-by: Bulent Abali 
Signed-off-by: Raphael Moreira Zinsly 
---
 .../selftests/powerpc/nx-gzip/Makefile|  21 +
 .../selftests/powerpc/nx-gzip/gzfht_test.c| 489 ++
 .../selftests/powerpc/nx-gzip/gzip_vas.c  | 259 ++
 3 files changed, 769 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/Makefile
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/gzip_vas.c

diff --git a/tools/testing/selftests/powerpc/nx-gzip/Makefile 
b/tools/testing/selftests/powerpc/nx-gzip/Makefile
new file mode 100644
index ..ab903f63bbbd
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/Makefile
@@ -0,0 +1,21 @@
+CC = gcc
+CFLAGS = -O3
+INC = ./inc
+SRC = gzfht_test.c
+OBJ = $(SRC:.c=.o)
+TESTS = gzfht_test
+EXTRA_SOURCES = gzip_vas.c
+
+all:   $(TESTS)
+
+$(OBJ): %.o: %.c
+   $(CC) $(CFLAGS) -I$(INC) -c $<
+
+$(TESTS): $(OBJ)
+   $(CC) $(CFLAGS) -I$(INC) -o $@ $@.o $(EXTRA_SOURCES)
+
+run_tests: $(TESTS)
+   ./gzfht_test gzip_vas.c
+
+clean:
+   rm -f $(TESTS) *.o *~ *.gz
diff --git a/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c 
b/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c
new file mode 100644
index ..7a21c25f5611
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/gzfht_test.c
@@ -0,0 +1,489 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+
+/* P9 gzip sample code for demonstrating the P9 NX hardware interface.
+ * Not intended for productive uses or for performance or compression
+ * ratio measurements.  For simplicity of demonstration, this sample
+ * code compresses in to fixed Huffman blocks only (Deflate btype=1)
+ * and has very simple memory management.  Dynamic Huffman blocks
+ * (Deflate btype=2) are more involved as detailed in the user guide.
+ * Note also that /dev/crypto/gzip, VAS and skiboot support are
+ * required.
+ *
+ * Copyright 2020 IBM Corp.
+ *
+ * https://github.com/libnxz/power-gzip for zlib api and other utils
+ *
+ * Author: Bulent Abali 
+ *
+ * Definitions of acronyms used here. See
+ * P9 NX Gzip Accelerator User's Manual for details:
+ * https://github.com/libnxz/power-gzip/blob/develop/doc/power_nx_gzip_um.pdf
+ *
+ * adler/crc: 32 bit checksums appended to stream tail
+ * ce:   completion extension
+ * cpb:  coprocessor parameter block (metadata)
+ * crb:  coprocessor request block (command)
+ * csb:  coprocessor status block (status)
+ * dht:  dynamic huffman table
+ * dde:  data descriptor element (address, length)
+ * ddl:  list of ddes
+ * dh/fh:dynamic and fixed huffman types
+ * fc:   coprocessor function code
+ * histlen:  history/dictionary length
+ * history:  sliding window of up to 32KB of data
+ * lzcount:  Deflate LZ symbol counts
+ * rembytecnt: remaining byte count
+ * sfbt: source final block type; last block's type during decomp
+ * spbc: source processed byte count
+ * subc: source unprocessed bit count
+ * tebc: target ending bit count; valid bits in the last byte
+ * tpbc: target processed byte count
+ * vas:  virtual accelerator switch; the user mode interface
+ */
+
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include "nxu.h"
+#include "nx.h"
+
+int nx_dbg;
+FILE *nx_gzip_log;
+void *nx_fault_storage_address;
+
+#define NX_MIN(X, Y) (((X) < (Y)) ? (X) : (Y))
+#define FNAME_MAX 1024
+#define FEXT ".nx.gz"
+
+/*
+ * LZ counts returned in the user supplied nx_gzip_crb_cpb_t structure.
+ */
+static int compress_fht_sample(char *src, uint32_t srclen, char *dst,
+   uint32_t dstlen, int with_count,
+   struct nx_gzip_crb_cpb_t *cmdp, void *handle)
+{
+   int cc;
+   uint32_t fc;
+
+   assert(!!cmdp);
+
+   put32(cmdp->crb, gzip_fc, 0);  /* clear */
+   fc = (with_count) ? GZIP_FC_COMPRESS_RESUME_FHT_COUNT :
+   GZIP_FC_COMPRESS_RESUME_FHT;
+   putnn(cmdp->crb, gzip_fc, fc);
+   putnn(cmdp->cpb, in_histlen, 0); /* resuming with no history */
+   memset((void *) >crb.csb, 0, sizeof(cmdp->crb.csb));
+
+   /* Section 6.6 programming notes; spbc may be in two different
+* places depending on FC.
+*/
+   if (!with_count)
+   put32(cmdp->cpb, out_spbc_comp, 0);
+   else
+   put32(cmdp->cpb, out_spbc_comp_with_count, 0);
+
+   /* Figure 6-3 6-4; CSB location */
+   put64(cmdp->crb, csb_address, 0);
+   put64(cmdp->crb, csb_address,
+ (uint64_t) >crb.csb & csb_address_mask);
+
+   /* Source direct dde (scatter-gather list) */
+   clear_dde(cmdp->crb.source_dde);
+   putnn(cmdp->crb.source_dde, dde_count, 0);
+   

[PATCH V2 1/5] selftests/powerpc: Add header files for GZIP engine test

2020-03-27 Thread Raphael Moreira Zinsly
Add files to access the powerpc NX-GZIP engine in user space.

Signed-off-by: Bulent Abali 
Signed-off-by: Raphael Moreira Zinsly 
---
 .../selftests/powerpc/nx-gzip/inc/crb.h   | 159 ++
 .../selftests/powerpc/nx-gzip/inc/nx-gzip.h   |  27 +++
 .../powerpc/nx-gzip/inc/nx-helpers.h  |  54 ++
 .../selftests/powerpc/nx-gzip/inc/nx.h|  38 +
 4 files changed, 278 insertions(+)
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/crb.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx-gzip.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx-helpers.h
 create mode 100644 tools/testing/selftests/powerpc/nx-gzip/inc/nx.h

diff --git a/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h 
b/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h
new file mode 100644
index ..9056e3dc1831
--- /dev/null
+++ b/tools/testing/selftests/powerpc/nx-gzip/inc/crb.h
@@ -0,0 +1,159 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef __CRB_H
+#define __CRB_H
+#include 
+#include "nx.h"
+
+typedef unsigned char u8;
+typedef unsigned int u32;
+typedef unsigned long long u64;
+
+/* CCW 842 CI/FC masks
+ * NX P8 workbook, section 4.3.1, figure 4-6
+ * "CI/FC Boundary by NX CT type"
+ */
+#define CCW_CI_842  (0x3ff8)
+#define CCW_FC_842  (0x0007)
+
+/* Chapter 6.5.8 Coprocessor-Completion Block (CCB) */
+
+#define CCB_VALUE  (0x3fff)
+#define CCB_ADDRESS(0xfff8)
+#define CCB_CM (0x0007)
+#define CCB_CM0(0x0004)
+#define CCB_CM12   (0x0003)
+
+#define CCB_CM0_ALL_COMPLETIONS(0x0)
+#define CCB_CM0_LAST_IN_CHAIN  (0x4)
+#define CCB_CM12_STORE (0x0)
+#define CCB_CM12_INTERRUPT (0x1)
+
+#define CCB_SIZE   (0x10)
+#define CCB_ALIGN  CCB_SIZE
+
+struct coprocessor_completion_block {
+   __be64 value;
+   __be64 address;
+} __aligned(CCB_ALIGN);
+
+
+/* Chapter 6.5.7 Coprocessor-Status Block (CSB) */
+
+#define CSB_V  (0x80)
+#define CSB_F  (0x04)
+#define CSB_CH (0x03)
+#define CSB_CE_INCOMPLETE  (0x80)
+#define CSB_CE_TERMINATION (0x40)
+#define CSB_CE_TPBC(0x20)
+
+#define CSB_CC_SUCCESS (0)
+#define CSB_CC_INVALID_ALIGN   (1)
+#define CSB_CC_OPERAND_OVERLAP (2)
+#define CSB_CC_DATA_LENGTH (3)
+#define CSB_CC_TRANSLATION (5)
+#define CSB_CC_PROTECTION  (6)
+#define CSB_CC_RD_EXTERNAL (7)
+#define CSB_CC_INVALID_OPERAND (8)
+#define CSB_CC_PRIVILEGE   (9)
+#define CSB_CC_INTERNAL(10)
+#define CSB_CC_WR_EXTERNAL (12)
+#define CSB_CC_NOSPC   (13)
+#define CSB_CC_EXCESSIVE_DDE   (14)
+#define CSB_CC_WR_TRANSLATION  (15)
+#define CSB_CC_WR_PROTECTION   (16)
+#define CSB_CC_UNKNOWN_CODE(17)
+#define CSB_CC_ABORT   (18)
+#define CSB_CC_TRANSPORT   (20)
+#define CSB_CC_SEGMENTED_DDL   (31)
+#define CSB_CC_PROGRESS_POINT  (32)
+#define CSB_CC_DDE_OVERFLOW(33)
+#define CSB_CC_SESSION (34)
+#define CSB_CC_PROVISION   (36)
+#define CSB_CC_CHAIN   (37)
+#define CSB_CC_SEQUENCE(38)
+#define CSB_CC_HW  (39)
+
+#define CSB_SIZE   (0x10)
+#define CSB_ALIGN  CSB_SIZE
+
+struct coprocessor_status_block {
+   u8 flags;
+   u8 cs;
+   u8 cc;
+   u8 ce;
+   __be32 count;
+   __be64 address;
+} __aligned(CSB_ALIGN);
+
+
+/* Chapter 6.5.10 Data-Descriptor List (DDL)
+ * each list contains one or more Data-Descriptor Entries (DDE)
+ */
+
+#define DDE_P  (0x8000)
+
+#define DDE_SIZE   (0x10)
+#define DDE_ALIGN  DDE_SIZE
+
+struct data_descriptor_entry {
+   __be16 flags;
+   u8 count;
+   u8 index;
+   __be32 length;
+   __be64 address;
+} __aligned(DDE_ALIGN);
+
+
+/* Chapter 6.5.2 Coprocessor-Request Block (CRB) */
+
+#define CRB_SIZE   (0x80)
+#define CRB_ALIGN  (0x100) /* Errata: requires 256 alignment */
+
+
+/* Coprocessor Status Block field
+ *   ADDRESS   address of CSB
+ *   C CCB is valid
+ *   AT0 = addrs are virtual, 1 = addrs are phys
+ *   M enable perf monitor
+ */
+#define CRB_CSB_ADDRESS(0xfff0)
+#define CRB_CSB_C  (0x0008)
+#define CRB_CSB_AT (0x0002)
+#define CRB_CSB_M  (0x0001)
+
+struct coprocessor_request_block {
+   __be32 ccw;
+   __be32 flags;
+   __be64 csb_addr;
+
+   struct data_descriptor_entry source;
+   struct data_descriptor_entry target;
+
+   struct coprocessor_completion_block ccb;
+
+   u8 reserved[48];
+
+   struct coprocessor_status_block csb;
+} __aligned(CRB_ALIGN);
+
+#define crb_csb_addr(c) __be64_to_cpu(c->csb_addr)

[PATCH V2 0/5] selftests/powerpc: Add NX-GZIP engine testcase

2020-03-27 Thread Raphael Moreira Zinsly
This patch series are intended to test the POWER9 Nest
Accelerator (NX) GZIP engine that is being introduced by
https://lists.ozlabs.org/pipermail/linuxppc-dev/2020-March/205659.html
More information about how to access the NX can be found in that patch, also a
complete userspace library and more documentation can be found at:
https://github.com/libnxz/power-gzip

Changes in V2:
- Fixed errors and warnings caught by scripts/checkpatch.pl, including
  line breaks inside strings.
- Fixed infinite loop and out-of-boundaries writing found by Daniel
  Axtens.

Best regards,
Raphael



Re: [PATCH v2 1/1] ppc/crash: Skip spinlocks during crash

2020-03-27 Thread Leonardo Bras
Hello Michael,

On Fri, 2020-03-27 at 14:50 +1100, Michael Ellerman wrote:
> Hi Leonardo,
> 
> Leonardo Bras  writes:
> > During a crash, there is chance that the cpus that handle the NMI IPI
> > are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
> > will cause a deadlock. (rtas_lock and printk logbuf_log as of today)
> 
> Please give us more detail on how those locks are causing you trouble, a
> stack trace would be good if you have it.

Sure, I have hit it in printf and rtas_call, as said before. 

After crash_send_ipi(), it's tested how many cpus_in_crash are there,
and once they hit the total value, it's printed "IPI complete". This
printk call itself already got stuck in spin_lock, for example.

Here are the stack traces:

#0  arch_spin_lock 
#1  do_raw_spin_lock 
#2  __raw_spin_lock 
#3  _raw_spin_lock 
#4  vprintk_emit 
#5  vprintk_func
#7  crash_kexec_prepare_cpus 
#8  default_machine_crash_shutdown
#9  machine_crash_shutdown 
#10 __crash_kexec
#11 crash_kexec
#12 oops_end

#0 arch_spin_lock
#1  lock_rtas () 
#2  rtas_call (token=8204, nargs=1, nret=1, outputs=0x0)
#3  ics_rtas_mask_real_irq (hw_irq=4100) 
#4  machine_kexec_mask_interrupts
#5  default_machine_crash_shutdown
#6  machine_crash_shutdown 
#7  __crash_kexec
#8  crash_kexec
#9  oops_end

> > This is a problem if the system has kdump set up, given if it crashes
> > for any reason kdump may not be saved for crash analysis.
> > 
> > Skip spinlocks after NMI IPI is sent to all other cpus.
> 
> We don't want to add overhead to all spinlocks for the life of the
> system, just to handle this one case.

I understand. 
Other than this patch, I would propose doing something uglier, like
forcing the said locks to unlocked state when cpus_in_crash hits it's
maximum value, before printing "IPI complete".
Creating similar functions that don't lock, just for this case, looks
like overkill to me.

Do you have any other suggestion?

> There's already a flag that is set when the system is crashing,
> "oops_in_progress", maybe we need to use that somewhere to skip a lock
> or do an early return.

I think that would not work, because oops_in_progress should be 0 here:
oops_end() calls bust_spinlocks(0) before calling crash_kexec(), and
bust_spinlocks(0) will decrement oops_in_progress.
(just verified, it's 0 before printing "IPI complete").

Thank you the feedback, :)

Best regards,
Leonardo



signature.asc
Description: This is a digitally signed message part


[PATCH 1/3] powerpc/head_check: Automatic verbosity

2020-03-27 Thread Geoff Levand
To aid debugging build problems turn on shell tracing for the
head_check script when the build is verbose.

Signed-off-by: Geoff Levand 
---
 arch/powerpc/tools/head_check.sh | 6 --
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh
index ad9e57209aa4..37061fb9b58e 100644
--- a/arch/powerpc/tools/head_check.sh
+++ b/arch/powerpc/tools/head_check.sh
@@ -31,8 +31,10 @@
 # level entry code (boot, interrupt vectors, etc) until r2 is set up. This
 # could cause the kernel to die in early boot.
 
-# Turn this on if you want more debug output:
-# set -x
+# Allow for verbose output
+if [ "$V" = "1" ]; then
+   set -x
+fi
 
 if [ $# -lt 2 ]; then
echo "$0 [path to nm] [path to vmlinux]" 1>&2
-- 
2.20.1




[PATCH 0/3] powerpc: Minor updates to improve build debugging

2020-03-27 Thread Geoff Levand
Hi Michael,

Here are a few minor updates to the powerpc build files that make debugging
build problems a little easier.  Please consider.

-Geoff

The following changes since commit 16fbf79b0f83bc752cee8589279f1ebfe57b3b6e:

  Linux 5.6-rc7 (2020-03-22 18:31:56 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/geoff/ps3-linux.git 
for-merge-powerpc

for you to fetch changes up to e6fb89a0946dd620f1d8120578744942ca934e11:

  powerpc/head_check: Avoid broken pipe (2020-03-27 09:41:41 -0700)


Geoff Levand (3):
  powerpc/head_check: Automatic verbosity
  powerpc/wrapper: Output linker map file
  powerpc/head_check: Avoid broken pipe

 arch/powerpc/boot/wrapper| 3 ++-
 arch/powerpc/tools/head_check.sh | 8 +---
 2 files changed, 7 insertions(+), 4 deletions(-)

-- 
2.20.1



[PATCH 3/3] powerpc/head_check: Avoid broken pipe

2020-03-27 Thread Geoff Levand
Remove the '-m4' option to grep to allow grep to process all of nm's
output.  This avoids the nm warning:

  nm terminated with signal 13 [Broken pipe]

Signed-off-by: Geoff Levand 
---
 arch/powerpc/tools/head_check.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/tools/head_check.sh b/arch/powerpc/tools/head_check.sh
index 37061fb9b58e..e32d3162e5ed 100644
--- a/arch/powerpc/tools/head_check.sh
+++ b/arch/powerpc/tools/head_check.sh
@@ -46,7 +46,7 @@ nm="$1"
 vmlinux="$2"
 
 # gcc-4.6-era toolchain make _stext an A (absolute) symbol rather than T
-$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a 
text_start$" -e " t start_text$" -m4 > .tmp_symbols.txt
+$nm "$vmlinux" | grep -e " [TA] _stext$" -e " t start_first_256B$" -e " a 
text_start$" -e " t start_text$" > .tmp_symbols.txt
 
 
 vma=$(cat .tmp_symbols.txt | grep -e " [TA] _stext$" | cut -d' ' -f1)
-- 
2.20.1



[PATCH 2/3] powerpc/wrapper: Output linker map file

2020-03-27 Thread Geoff Levand
To aid debugging wrapper troubles, output a linker map file
'wrapper.map' when the build is verbose.

Signed-off-by: Geoff Levand 
---
 arch/powerpc/boot/wrapper | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/wrapper b/arch/powerpc/boot/wrapper
index ed6266367bc0..35ace40d9fc2 100755
--- a/arch/powerpc/boot/wrapper
+++ b/arch/powerpc/boot/wrapper
@@ -29,6 +29,7 @@ set -e
 # Allow for verbose output
 if [ "$V" = 1 ]; then
 set -x
+map="-Map wrapper.map"
 fi
 
 # defaults
@@ -500,7 +501,7 @@ if [ "$platform" != "miboot" ]; then
 text_start="-Ttext $link_address"
 fi
 #link everything
-${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" \
+${CROSS}ld -m $format -T $lds $text_start $pie $nodl -o "$ofile" $map \
$platformo $tmp $object/wrapper.a
 rm $tmp
 fi
-- 
2.20.1




[RFC PATCH] powerpc: Use ppu_intrinsics.h instead of opencoding

2020-03-27 Thread Christophe Leroy
ppu_intrinsics.h already includes helpers for things
like sync(), isync(), dcbX(), etc ...

Use it instead of opencoding.

Signed-off-by: Christophe Leroy 
---
 arch/powerpc/include/asm/barrier.h  | 11 +++
 arch/powerpc/include/asm/cache.h| 25 ++---
 arch/powerpc/include/asm/checksum.h |  4 +++-
 arch/powerpc/include/asm/synch.h| 12 
 4 files changed, 20 insertions(+), 32 deletions(-)

diff --git a/arch/powerpc/include/asm/barrier.h 
b/arch/powerpc/include/asm/barrier.h
index 123adcefd40f..392c91519220 100644
--- a/arch/powerpc/include/asm/barrier.h
+++ b/arch/powerpc/include/asm/barrier.h
@@ -7,6 +7,10 @@
 
 #include 
 
+#ifndef __ASSEMBLY__
+#include 
+#endif
+
 /*
  * Memory barrier.
  * The sync instruction guarantees that all memory accesses initiated
@@ -31,9 +35,9 @@
  * However, on CPUs that don't support lwsync, lwsync actually maps to a
  * heavy-weight sync, so smp_wmb() can be a lighter-weight eieio.
  */
-#define mb()   __asm__ __volatile__ ("sync" : : : "memory")
-#define rmb()  __asm__ __volatile__ ("sync" : : : "memory")
-#define wmb()  __asm__ __volatile__ ("sync" : : : "memory")
+#define mb()   __sync()
+#define rmb()  __sync()
+#define wmb()  __sync()
 
 /* The sub-arch has lwsync */
 #if defined(__powerpc64__) || defined(CONFIG_PPC_E500MC)
@@ -42,7 +46,6 @@
 #define SMPWMB  eieio
 #endif
 
-#define __lwsync() __asm__ __volatile__ (stringify_in_c(LWSYNC) : : 
:"memory")
 #define dma_rmb()  __lwsync()
 #define dma_wmb()  __asm__ __volatile__ (stringify_in_c(SMPWMB) : : 
:"memory")
 
diff --git a/arch/powerpc/include/asm/cache.h b/arch/powerpc/include/asm/cache.h
index 72b81015cebe..5b5e9a63060a 100644
--- a/arch/powerpc/include/asm/cache.h
+++ b/arch/powerpc/include/asm/cache.h
@@ -34,6 +34,8 @@
 #define IFETCH_ALIGN_BYTES (1 << IFETCH_ALIGN_SHIFT)
 
 #if !defined(__ASSEMBLY__)
+#include 
+
 #ifdef CONFIG_PPC64
 
 struct ppc_cache_info {
@@ -111,31 +113,16 @@ extern void _set_L3CR(unsigned long);
 #define _set_L3CR(val) do { } while(0)
 #endif
 
-static inline void dcbz(void *addr)
-{
-   __asm__ __volatile__ ("dcbz 0, %0" : : "r"(addr) : "memory");
-}
+#define dcbz   __dcbz
+#define dcbf   __dcbf
+#define dcbst  __dcbst
+#define icbi   __icbi
 
 static inline void dcbi(void *addr)
 {
__asm__ __volatile__ ("dcbi 0, %0" : : "r"(addr) : "memory");
 }
 
-static inline void dcbf(void *addr)
-{
-   __asm__ __volatile__ ("dcbf 0, %0" : : "r"(addr) : "memory");
-}
-
-static inline void dcbst(void *addr)
-{
-   __asm__ __volatile__ ("dcbst 0, %0" : : "r"(addr) : "memory");
-}
-
-static inline void icbi(void *addr)
-{
-   asm volatile ("icbi 0, %0" : : "r"(addr) : "memory");
-}
-
 static inline void iccci(void *addr)
 {
asm volatile ("iccci 0, %0" : : "r"(addr) : "memory");
diff --git a/arch/powerpc/include/asm/checksum.h 
b/arch/powerpc/include/asm/checksum.h
index 9cce06194dcc..16abea7c3c64 100644
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -8,6 +8,8 @@
 
 #include 
 #include 
+
+#include 
 /*
  * Computes the checksum of a memory block at src, length len,
  * and adds in "sum" (32-bit), while copying the block to dst.
@@ -42,7 +44,7 @@ static inline __sum16 csum_fold(__wsum sum)
unsigned int tmp;
 
/* swap the two 16-bit halves of sum */
-   __asm__("rlwinm %0,%1,16,0,31" : "=r" (tmp) : "r" (sum));
+   tmp = __rlwinm(sum, 16, 0, 31);
/* if there is a carry from adding the two 16-bit halves,
   it will carry from the lower half into the upper half,
   giving us the correct sum in the upper half. */
diff --git a/arch/powerpc/include/asm/synch.h b/arch/powerpc/include/asm/synch.h
index aca70fb43147..44020f89854e 100644
--- a/arch/powerpc/include/asm/synch.h
+++ b/arch/powerpc/include/asm/synch.h
@@ -7,19 +7,15 @@
 #include 
 
 #ifndef __ASSEMBLY__
+#include 
+
 extern unsigned int __start___lwsync_fixup, __stop___lwsync_fixup;
 extern void do_lwsync_fixups(unsigned long value, void *fixup_start,
 void *fixup_end);
 
-static inline void eieio(void)
-{
-   __asm__ __volatile__ ("eieio" : : : "memory");
-}
+#define eieio __eieio
+#define isync __isync
 
-static inline void isync(void)
-{
-   __asm__ __volatile__ ("isync" : : : "memory");
-}
 #endif /* __ASSEMBLY__ */
 
 #if defined(__powerpc64__)
-- 
2.25.0



Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard

2020-03-27 Thread Christophe Leroy

Subject line, change longjump to longjmp

Le 27/03/2020 à 11:07, Clement Courbet a écrit :

Declaring setjmp()/longjmp() as taking longs makes the signature
non-standard, and makes clang complain. In the past, this has been
worked around by adding -ffreestanding to the compile flags.

The implementation looks like it only ever propagates the value
(in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
with integer parameters.

This allows removing -ffreestanding from the compilation flags.

Context:
https://lore.kernel.org/patchwork/patch/1214060
https://lore.kernel.org/patchwork/patch/1216174

Signed-off-by: Clement Courbet 
---
  arch/powerpc/include/asm/setjmp.h | 6 --
  arch/powerpc/kexec/Makefile   | 3 ---
  2 files changed, 4 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/setjmp.h 
b/arch/powerpc/include/asm/setjmp.h
index e9f81bb3f83b..84bb0d140d59 100644
--- a/arch/powerpc/include/asm/setjmp.h
+++ b/arch/powerpc/include/asm/setjmp.h
@@ -7,7 +7,9 @@
  
  #define JMP_BUF_LEN23
  
-extern long setjmp(long *) __attribute__((returns_twice));

-extern void longjmp(long *, long) __attribute__((noreturn));
+typedef long *jmp_buf;


Do we need that new opaque typedef ? Why not just keep long * ?


+
+extern int setjmp(jmp_buf env) __attribute__((returns_twice));
+extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
  
  #endif /* _ASM_POWERPC_SETJMP_H */

diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
index 378f6108a414..86380c69f5ce 100644
--- a/arch/powerpc/kexec/Makefile
+++ b/arch/powerpc/kexec/Makefile
@@ -3,9 +3,6 @@
  # Makefile for the linux kernel.
  #
  
-# Avoid clang warnings around longjmp/setjmp declarations

-CFLAGS_crash.o += -ffreestanding
-
  obj-y += core.o crash.o core_$(BITS).o
  
  obj-$(CONFIG_PPC32)		+= relocate_32.o




Christophe


RFA [PPC kernel] Avoid upcoming PPC kernel build failure

2020-03-27 Thread H.J. Lu
On Fri, Mar 27, 2020 at 7:54 AM Yu-cheng Yu  wrote:
>
> On Thu, 2020-02-06 at 04:55 -0800, H.J. Lu wrote:
> > On Wed, Feb 5, 2020 at 7:26 PM Michael Ellerman  wrote:
> > > "H.J. Lu"  writes:
> > > > On Tue, Feb 4, 2020 at 3:37 PM kbuild test robot  wrote:
> > > > > tree:   https://github.com/yyu168/linux_cet.git cet
> > > > > head:   bba707cc4715c1036b6561ab38b16747f9c49cfa
> > > > > commit: 71bb971dd76eeacd351690f28864ad5c5bec3691 [55/58] Discard 
> > > > > .note.gnu.property sections in generic NOTES
> > > > > config: powerpc-rhel-kconfig (attached as .config)
> > > > > compiler: powerpc64le-linux-gcc (GCC) 7.5.0
> > > > > reproduce:
> > > > > wget 
> > > > > https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross
> > > > >  -O ~/bin/make.cross
> > > > > chmod +x ~/bin/make.cross
> > > > > git checkout 71bb971dd76eeacd351690f28864ad5c5bec3691
> > > > > # save the attached .config to linux build tree
> > > > > GCC_VERSION=7.5.0 make.cross ARCH=powerpc
> > > > >
> > > > > If you fix the issue, kindly add following tag
> > > > > Reported-by: kbuild test robot 
> > > > >
> > > > > All warnings (new ones prefixed by >>):
> > > > >
> > > > >powerpc64le-linux-ld: warning: discarding dynamic section 
> > > > > .rela___ksymtab_gpl+__wait_rcu_gp
> > > >
> > > > arch/powerpc/kernel/vmlinux.lds.S has
> > > >
> > > >  .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x))
> > > >  {
> > > >   __rela_dyn_start = .;
> > > >   *(.rela*)  Keep .rela* sections
> > > >  }
> > >
> > > The above is inside #ifdef CONFIG_RELOCATABLE
> > >
> > > > ...
> > > >  /DISCARD/ : {
> > > >   *(*.EMB.apuinfo)
> > > >   *(.glink .iplt .plt .rela* .comment)
> > > > Discard  .rela* sections.  But it is 
> > > > ignored.
> > > >   *(.gnu.version*)
> > > >   *(.gnu.attributes)
> > > >   *(.eh_frame)
> > > >  }
> > >
> > > But that is not #ifdef'ed at all.
> > >
> > > > With my
> > > >
> > > > ommit 71bb971dd76eeacd351690f28864ad5c5bec3691
> > > > Author: H.J. Lu 
> > > > Date:   Thu Jan 30 12:39:09 2020 -0800
> > > >
> > > > Discard .note.gnu.property sections in generic NOTES
> > > >
> > > > With the command-line option, -mx86-used-note=yes, the x86 assembler
> > > > in binutils 2.32 and above generates a program property note in a 
> > > > note
> > > > section, .note.gnu.property, to encode used x86 ISAs and features.  
> > > > But
> > > > kernel linker script only contains a single NOTE segment:
> > > >
> > > > /DISCARD/ : { *(.note.gnu.property) }
> > > >
> > > > is placed before
> > > >
> > > > .rela.dyn : AT(ADDR(.rela.dyn) - (0xc000 -0x))
> > > >  {
> > > >   __rela_dyn_start = .;
> > > >   *(.rela*)  Keep .rela* sections
> > > >  }
> > > >
> > > > Then .rela* in
> > > >
> > > >  /DISCARD/ : {
> > > >   *(*.EMB.apuinfo)
> > > >   *(.glink .iplt .plt .rela* .comment)
> > > >   *(.gnu.version*)
> > > >   *(.gnu.attributes)
> > > >   *(.eh_frame)
> > > >  }
> > > >
> > > > is honored.  Can someone from POWERPC comment on it?
> > >
> > > Hmm OK. I'm not really a toolchain person.
> > >
> > > The comment on DISCARDS says:
> > >
> > >* Some archs want to discard exit text/data at runtime rather than
> > >* link time due to cross-section references such as alt instructions,
> > >* bug table, eh_frame, etc.  DISCARDS must be the last of output
> > >* section definitions so that such archs put those in earlier section
> > >* definitions.
> > >*/
> > >
> > > But I guess you're changing those semantics in your series.
> > >
> > > This seems to fix the warning for me?
> > >
> > > diff --git a/arch/powerpc/kernel/vmlinux.lds.S 
> > > b/arch/powerpc/kernel/vmlinux.lds.S
> > > index b4c89a1acebb..076b3e8a849d 100644
> > > --- a/arch/powerpc/kernel/vmlinux.lds.S
> > > +++ b/arch/powerpc/kernel/vmlinux.lds.S
> > > @@ -365,9 +365,12 @@ SECTIONS
> > > DISCARDS
> > > /DISCARD/ : {
> > > *(*.EMB.apuinfo)
> > > -   *(.glink .iplt .plt .rela* .comment)
> > > +   *(.glink .iplt .plt .comment)
> > > *(.gnu.version*)
> > > *(.gnu.attributes)
> > > *(.eh_frame)
> > > +#ifndef CONFIG_RELOCATABLE
> > > +   *(.rela*)
> > > +#endif
> > > }
> > >  }
> > >
> > >
> > > cheers
> >
> > This looks correct me.
> >
> > Reviewed-by: H.J. Lu 
> >
> > Thanks.
> >
>
> Has this been merged into any branch yet?  I just checked the tip tree and did
> not see it.
>

FYI, my patches have been queued on x86/build branch.   Could someone
from PPC community add this patch to PPC kernel to avoid upcoming PPC
kernel build failure?

Thanks.

-- 
H.J.


Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard

2020-03-27 Thread Nathan Chancellor
On Fri, Mar 27, 2020 at 10:10:44AM -0700, Nick Desaulniers wrote:
> On Fri, Mar 27, 2020 at 3:08 AM Clement Courbet  wrote:
> >
> > Declaring setjmp()/longjmp() as taking longs makes the signature
> > non-standard, and makes clang complain. In the past, this has been
> > worked around by adding -ffreestanding to the compile flags.
> >
> > The implementation looks like it only ever propagates the value
> > (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
> > with integer parameters.
> >
> > This allows removing -ffreestanding from the compilation flags.
> >
> > Context:
> > https://lore.kernel.org/patchwork/patch/1214060
> > https://lore.kernel.org/patchwork/patch/1216174
> >
> > Signed-off-by: Clement Courbet 

Thanks for fixing this properly, not really sure why I did not think of
this in the first place. I guess my thought was the warning makes it
seem like clang is going to ignore the kernel's implementation of
setjmp/longjmp but I can't truly remember.

> Hi Clement, thanks for the patch! Would you mind sending a V2 that
> included a similar fix to arch/powerpc/xmon/Makefile?

Agreed.

> For context, this was the original patch:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aea447141c7e7824b81b49acd1bc78
> which was then modified to:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9029ef9c95765e7b63c4d9aa780674447db1ec0
> 
> So on your V2, if you include in the commit message, the line:
> 
> Fixes c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")
> 
> then that will help our LTS branch maintainers back port it to the
> appropriate branches.

The tags should be:

Cc: sta...@vger.kernel.org # v4.14+
Fixes: c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")

that way it explicitly gets picked up for stable, rather than Sasha's
AUTOSEL process, which could miss it.

With the xmon/Makefile -ffreestanding removed and the tags updated,
consider this:

Reviewed-by: Nathan Chancellor 
Tested-by: Nathan Chancellor 

Cheers,
Nathan


Re: [PATCH v1] powerpc: Make setjmp/longjump signature standard

2020-03-27 Thread Nick Desaulniers
On Fri, Mar 27, 2020 at 3:08 AM Clement Courbet  wrote:
>
> Declaring setjmp()/longjmp() as taking longs makes the signature
> non-standard, and makes clang complain. In the past, this has been
> worked around by adding -ffreestanding to the compile flags.
>
> The implementation looks like it only ever propagates the value
> (in longjmp) or sets it to 1 (in setjmp), and we only call longjmp
> with integer parameters.
>
> This allows removing -ffreestanding from the compilation flags.
>
> Context:
> https://lore.kernel.org/patchwork/patch/1214060
> https://lore.kernel.org/patchwork/patch/1216174
>
> Signed-off-by: Clement Courbet 

Hi Clement, thanks for the patch! Would you mind sending a V2 that
included a similar fix to arch/powerpc/xmon/Makefile?

For context, this was the original patch:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=aea447141c7e7824b81b49acd1bc78
which was then modified to:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=c9029ef9c95765e7b63c4d9aa780674447db1ec0

So on your V2, if you include in the commit message, the line:

Fixes c9029ef9c957 ("powerpc: Avoid clang warnings around setjmp and longjmp")

then that will help our LTS branch maintainers back port it to the
appropriate branches.

>
> ---
>  arch/powerpc/include/asm/setjmp.h | 6 --
>  arch/powerpc/kexec/Makefile   | 3 ---
>  2 files changed, 4 insertions(+), 5 deletions(-)
>
> diff --git a/arch/powerpc/include/asm/setjmp.h 
> b/arch/powerpc/include/asm/setjmp.h
> index e9f81bb3f83b..84bb0d140d59 100644
> --- a/arch/powerpc/include/asm/setjmp.h
> +++ b/arch/powerpc/include/asm/setjmp.h
> @@ -7,7 +7,9 @@
>
>  #define JMP_BUF_LEN23
>
> -extern long setjmp(long *) __attribute__((returns_twice));
> -extern void longjmp(long *, long) __attribute__((noreturn));
> +typedef long *jmp_buf;
> +
> +extern int setjmp(jmp_buf env) __attribute__((returns_twice));
> +extern void longjmp(jmp_buf env, int val) __attribute__((noreturn));
>
>  #endif /* _ASM_POWERPC_SETJMP_H */
> diff --git a/arch/powerpc/kexec/Makefile b/arch/powerpc/kexec/Makefile
> index 378f6108a414..86380c69f5ce 100644
> --- a/arch/powerpc/kexec/Makefile
> +++ b/arch/powerpc/kexec/Makefile
> @@ -3,9 +3,6 @@
>  # Makefile for the linux kernel.
>  #
>
> -# Avoid clang warnings around longjmp/setjmp declarations
> -CFLAGS_crash.o += -ffreestanding
> -
>  obj-y  += core.o crash.o core_$(BITS).o
>
>  obj-$(CONFIG_PPC32)+= relocate_32.o
> --
> 2.25.1.696.g5e7596f4ac-goog
>


-- 
Thanks,
~Nick Desaulniers


Re: [PATCH] x86: Alias memset to __builtin_memset.

2020-03-27 Thread Segher Boessenkool
Hi!

On Thu, Mar 26, 2020 at 01:38:39PM +0100, Clement Courbet wrote:
> --- a/arch/powerpc/include/asm/setjmp.h
> +++ b/arch/powerpc/include/asm/setjmp.h
> @@ -12,7 +12,9 @@
> 
>  #define JMP_BUF_LEN23
> -extern long setjmp(long *);
> -extern void longjmp(long *, long);
> +typedef long * jmp_buf;
> +
> +extern int setjmp(jmp_buf);
> +extern void longjmp(jmp_buf, int);
> 
> I'm happy to send a patch for this, and get rid of more -ffreestanding.
> Opinions ?

Pedantically, jmp_buf should be an array type.  But, this will probably
work fine, and it certainly is better than what we had before.

You could do
  typedef long jmp_buf[JMP_BUF_LEN];
perhaps?

Thanks,


Segher


[PATCH] soc: fsl: qe: clean up an indentation issue

2020-03-27 Thread Colin King
From: Colin Ian King 

There is a statement that not indented correctly, remove the
extraneous space.

Signed-off-by: Colin Ian King 
---
 drivers/soc/fsl/qe/ucc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/soc/fsl/qe/ucc.c b/drivers/soc/fsl/qe/ucc.c
index d6c93970df4d..cac0fb7693a0 100644
--- a/drivers/soc/fsl/qe/ucc.c
+++ b/drivers/soc/fsl/qe/ucc.c
@@ -519,7 +519,7 @@ int ucc_set_tdm_rxtx_clk(u32 tdm_num, enum qe_clock clock,
int clock_bits;
u32 shift;
struct qe_mux __iomem *qe_mux_reg;
-__be32 __iomem *cmxs1cr;
+   __be32 __iomem *cmxs1cr;
 
qe_mux_reg = _immr->qmx;
 
-- 
2.25.1



[PATCH 0/6] Kill setup_irq()

2020-03-27 Thread afzal mohammed
Hi Thomas,

As compared to the situation mentioned earlier[1], now powerpc patch is
also in -next, and the pending ARM patches has been picked up by ARM SoC
maintainers today and is expected to show up in next -next. All other
subsytem patches has been picked by relevant maintainers & are already
in -next except alpha, c6x, hexagon, unicore32 & sh.

As it is the case, i am sending you patches for the above 5
architecture's plus the core removal patch.

Status of 5 arch's:
---
alpha:  received ack from Matt Turner, build test success
c6x:did receive ack from Mark Salter in v1, the final
 version (v3) was with minor changes, hence removed his
 ack & cc'ed him, build test success
hexagon:build test success
unicore32:  couldn't get toolchain from kernel.org, 0day test robot
 or Segher's buildall
sh: To compile the relevant changes sh64 compiler is
 required, couldn't get it from above mentioned 3
 sources.

Note 1: sh toolchain is available, but that will not make the
 relevant changes compile as it has dependency of 64bit arch toolchain,
 did try a Kconfig hack to make it compile w/ 32bit sh toolchain, but it
 failed due to other reasons (unknown operands), so gave up on that.
Note 2: hexagon final image creation fails even w/o my patch, but it
 has been ensured that w/ my changes relevant object files are getting
 built  w/o warnings.

Regards
afzal

[1] https://lkml.kernel.org/r/20200321172626.GA6323@afzalpc

afzal mohammed (6):
  alpha: Replace setup_irq() by request_irq()
  c6x: replace setup_irq() by request_irq()
  hexagon: replace setup_irq() by request_irq()
  sh: replace setup_irq() by request_irq()
  unicore32: replace setup_irq() by request_irq()
  genirq: Remove setup_irq() and remove_irq()

 arch/alpha/kernel/irq_alpha.c | 29 
 arch/alpha/kernel/irq_i8259.c |  8 ++
 arch/alpha/kernel/irq_impl.h  |  7 +
 arch/alpha/kernel/irq_pyxis.c |  3 ++-
 arch/alpha/kernel/sys_alcor.c |  3 ++-
 arch/alpha/kernel/sys_cabriolet.c |  3 ++-
 arch/alpha/kernel/sys_eb64p.c |  3 ++-
 arch/alpha/kernel/sys_marvel.c|  2 +-
 arch/alpha/kernel/sys_miata.c |  6 +++--
 arch/alpha/kernel/sys_ruffian.c   |  3 ++-
 arch/alpha/kernel/sys_rx164.c |  3 ++-
 arch/alpha/kernel/sys_sx164.c |  3 ++-
 arch/alpha/kernel/sys_wildfire.c  |  7 ++---
 arch/alpha/kernel/time.c  |  6 ++---
 arch/c6x/platforms/timer64.c  | 11 +++-
 arch/hexagon/kernel/smp.c | 22 
 arch/hexagon/kernel/time.c| 11 +++-
 arch/sh/boards/mach-cayman/irq.c  | 18 +
 arch/sh/drivers/dma/dma-pvr2.c|  9 +++
 arch/unicore32/kernel/time.c  | 11 +++-
 include/linux/irq.h   |  2 --
 kernel/irq/manage.c   | 44 ---
 22 files changed, 60 insertions(+), 154 deletions(-)

-- 
2.25.1



Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash

2020-03-27 Thread Leonardo Bras
Hello Christophe, thanks for the feedback.

I noticed an error in this patch and sent a v2, that can be seen here:
http://patchwork.ozlabs.org/patch/1262468/

Comments inline::

On Fri, 2020-03-27 at 07:50 +0100, Christophe Leroy wrote:
> > @@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
> > if (likely(__arch_spin_trylock(lock) == 0))
> > break;
> > do {
> > +   if (unlikely(crash_skip_spinlock))
> > +   return;

Complete function for reference:
static inline void arch_spin_lock(arch_spinlock_t *lock)
{
while (1) {
if (likely(__arch_spin_trylock(lock) == 0))
break;
do {
if (unlikely(crash_skip_spinlock))
return;
HMT_low();
if (is_shared_processor())
splpar_spin_yield(lock);
} while (unlikely(lock->slock != 0));
HMT_medium();
}
}

> You are adding a test that reads a global var in the middle of a so hot 
> path ? That must kill performance. 

I thought it would, in worst case scenario, increase a maximum delay of
an arch_spin_lock() call 1 spin cycle. Here is what I thought:

- If the lock is already free, it would change nothing, 
- Otherwise, the lock will wait.
- Waiting cycle just got bigger.
- Worst case scenario: running one more cycle, given lock->slock can
turn to 0 just after checking.

Could you please point where I failed to see the performance penalty?
(I need to get better at this :) )


> Can we do different ?

Sure, a less intrusive way of doing it would be to free the currently
needed locks before proceeding. I just thought it would be harder to
maintain.

> Christophe

Best regards,
Leonardo


signature.asc
Description: This is a digitally signed message part


Re: [PATCH v2] powerpc/boot: Delete unneeded .globl _zimage_start

2020-03-27 Thread Segher Boessenkool
On Thu, Mar 26, 2020 at 03:26:12PM -0700, Fangrui Song wrote:
> On 2020-03-26, Segher Boessenkool wrote:
> >On Wed, Mar 25, 2020 at 09:42:57AM -0700, Fangrui Song wrote:
> >>.globl sets the symbol binding to STB_GLOBAL while .weak sets the
> >>binding to STB_WEAK. GNU as let .weak override .globl since binutils-gdb
> >>5ca547dc2399a0a5d9f20626d4bf5547c3ccfddd (1996). Clang integrated
> >>assembler let the last win but it may error in the future.
> >
> >GNU AS works for more than just ELF.  The way the assembler language
> >is defined, it is not .weak overriding .globl -- instead, .weak sets a
> >symbol attribute.  On an existing symbol (but it creates on if there is
> >none yet).
> >
> >Clang is buggy if it does not allow valid (and perfectly normal)
> >assembler code like this.
> 
> https://sourceware.org/pipermail/binutils/2020-March/110399.html
> 
> Alan: "I think it is completely fine for you to make the llvm assembler
> error on inconsistent binding, or the last directive win.  Either of
> those behaviours is logical and good, but you quite possibly will run
> into a need to fix more user assembly.

This would be fine and consistent behaviour, of course.  But it is not
appropriate if you want to pretend to be compatible to GNU toolchains.

Which is exactly why you want this kernel patch at all.  And the kernel
can (in this case) accommodate your buggy assembler, sure, but are you
going to "fix" all other programs with this "problem" as well?


Segher


Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S

2020-03-27 Thread Christophe Leroy




Le 27/03/2020 à 10:03, Balamuruhan S a écrit :

On Fri, 2020-03-27 at 07:48 +0100, Christophe Leroy wrote:


Le 26/03/2020 à 07:15, Balamuruhan S a écrit :

Data Cache Block Invalidate (dcbi) instruction was implemented back in
PowerPC
architecture version 2.03. It is obsolete and attempt to use of this
illegal
instruction results in a hypervisor emulation assistance interrupt. So,
ifdef
it out the option `i` in xmon for 64bit Book3S.


I don't understand. You say two contradictory things:
1/ You say it _was_ added back.
2/ You say it _is_ obsolete.

How can it be obsolete if it was added back ?


I actually learnt it from P8 and P9 User Manual,

The POWER8/POWER9 core does not provide support for the following optional or
obsolete instructions (attempted use of these results in a hypervisor emulation
assistance interrupt):
• tlbia - TLB invalidate all
• tlbiex - TLB invalidate entry by index (obsolete)
• slbiex - SLB invalidate entry by index (obsolete)
• dcba - Data cache block allocate (Book II; obsolete)
• dcbi - Data cache block invalidate (obsolete)
• rfi - Return from interrupt (32-bit; obsolete)



Then that's exactly what you have to say in the coming log.

Maybe you could also change invalidate_dcache_range():

for (i = 0; i < size >> shift; i++, addr += bytes) {
if (IS_ENABLED(CONFIG_PPC_BOOK3S_64))
dcbf(addr);
else
dcbi(addr);
}




Christophe


Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Andy Shevchenko
On Fri, Mar 27, 2020 at 02:22:55PM +0100, Arnd Bergmann wrote:
> On Fri, Mar 27, 2020 at 2:15 PM Andy Shevchenko
>  wrote:
> > On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote:
> > > On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote:
> > > > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek  
> > > > wrote:

...

> > > > It does raise a follow-up question about ppc40x though: is it time to
> > > > retire all of it?
> > >
> > > Who knows?
> > >
> > > I have in possession nice WD My Book Live, based on this architecture, 
> > > and I
> > > won't it gone from modern kernel support. OTOH I understand that amount 
> > > of real
> > > users not too big.
> >
> > +Cc: Christian Lamparter, whom I owe for that WD box.
> 
> According to https://openwrt.org/toh/wd/mybooklive, that one is based on
> APM82181/ppc464, so it is about several generations newer than what I
> asked about (ppc40x).
> 
> > > Ah, and I have Amiga board, but that one is being used only for testing, 
> > > so,
> > > I don't care much.
> 
> I think there are a couple of ppc440 based Amiga boards, but again, not 405
> to my knowledge.

Ah, you are right. No objections from ppc40x removal!

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Arnd Bergmann
On Fri, Mar 27, 2020 at 2:15 PM Andy Shevchenko
 wrote:
> On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote:
> > On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote:
> > > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek  
> > > wrote:
> > > >
> > > > recently we wanted to update xilinx intc driver and we found that 
> > > > function
> > > > which we wanted to remove is still wired by ancient Xilinx PowerPC
> > > > platforms. Here is the thread about it.
> > > > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/
> > > >
> > > > I have been talking about it internally and there is no interest in 
> > > > these
> > > > platforms and it is also orphan for quite a long time. None is really
> > > > running/testing these platforms regularly that's why I think it makes 
> > > > sense
> > > > to remove them also with drivers which are specific to this platform.
> > > >
> > > > U-Boot support was removed in 2017 without anybody complain about it
> > > > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed
> > > >
> > > > Based on current ppc/next.
> > > >
> > > > If anyone has any objection about it, please let me know.
> > >
> > > Acked-by: Arnd Bergmann 
> > >
> > > This looks reasonable to me as well, in particular as the code only
> > > supports the two
> > > ppc44x virtex developer boards and no commercial products.
> > >
> > > It does raise a follow-up question about ppc40x though: is it time to
> > > retire all of it?
> >
> > Who knows?
> >
> > I have in possession nice WD My Book Live, based on this architecture, and I
> > won't it gone from modern kernel support. OTOH I understand that amount of 
> > real
> > users not too big.
>
> +Cc: Christian Lamparter, whom I owe for that WD box.

According to https://openwrt.org/toh/wd/mybooklive, that one is based on
APM82181/ppc464, so it is about several generations newer than what I
asked about (ppc40x).

> > Ah, and I have Amiga board, but that one is being used only for testing, so,
> > I don't care much.

I think there are a couple of ppc440 based Amiga boards, but again, not 405
to my knowledge.

  Arnd


Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Andy Shevchenko
On Fri, Mar 27, 2020 at 03:10:26PM +0200, Andy Shevchenko wrote:
> On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote:
> > On Fri, Mar 27, 2020 at 1:12 PM Michal Simek  
> > wrote:
> > >
> > > recently we wanted to update xilinx intc driver and we found that function
> > > which we wanted to remove is still wired by ancient Xilinx PowerPC
> > > platforms. Here is the thread about it.
> > > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/
> > >
> > > I have been talking about it internally and there is no interest in these
> > > platforms and it is also orphan for quite a long time. None is really
> > > running/testing these platforms regularly that's why I think it makes 
> > > sense
> > > to remove them also with drivers which are specific to this platform.
> > >
> > > U-Boot support was removed in 2017 without anybody complain about it
> > > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed
> > >
> > > Based on current ppc/next.
> > >
> > > If anyone has any objection about it, please let me know.
> > 
> > Acked-by: Arnd Bergmann 
> > 
> > This looks reasonable to me as well, in particular as the code only
> > supports the two
> > ppc44x virtex developer boards and no commercial products.
> > 
> > It does raise a follow-up question about ppc40x though: is it time to
> > retire all of it?
> 
> Who knows?
> 
> I have in possession nice WD My Book Live, based on this architecture, and I
> won't it gone from modern kernel support. OTOH I understand that amount of 
> real
> users not too big.

+Cc: Christian Lamparter, whom I owe for that WD box.

> 
> Ah, and I have Amiga board, but that one is being used only for testing, so,
> I don't care much.
> 
> > The other ppc405 machines appear to have seen even fewer updates after the
> > OpenBlockS 600 got added in 2011, so it's possible nobody is using them any 
> > more
> > with modern kernels.
> > 
> > I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after
> > they had not been maintained for years.
> > 
> > However, 44x (in its ppc476 incarnation) is clearly still is used
> > through the fsp2 platform,
> > and can not be deprecated at least until that is known to have stopped
> > getting kernel
> > updates.
> > 
> > Arnd
> 
> -- 
> With Best Regards,
> Andy Shevchenko
> 
> 

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Andy Shevchenko
On Fri, Mar 27, 2020 at 01:54:33PM +0100, Arnd Bergmann wrote:
> On Fri, Mar 27, 2020 at 1:12 PM Michal Simek  wrote:
> >
> > recently we wanted to update xilinx intc driver and we found that function
> > which we wanted to remove is still wired by ancient Xilinx PowerPC
> > platforms. Here is the thread about it.
> > https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/
> >
> > I have been talking about it internally and there is no interest in these
> > platforms and it is also orphan for quite a long time. None is really
> > running/testing these platforms regularly that's why I think it makes sense
> > to remove them also with drivers which are specific to this platform.
> >
> > U-Boot support was removed in 2017 without anybody complain about it
> > https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed
> >
> > Based on current ppc/next.
> >
> > If anyone has any objection about it, please let me know.
> 
> Acked-by: Arnd Bergmann 
> 
> This looks reasonable to me as well, in particular as the code only
> supports the two
> ppc44x virtex developer boards and no commercial products.
> 
> It does raise a follow-up question about ppc40x though: is it time to
> retire all of it?

Who knows?

I have in possession nice WD My Book Live, based on this architecture, and I
won't it gone from modern kernel support. OTOH I understand that amount of real
users not too big.

Ah, and I have Amiga board, but that one is being used only for testing, so,
I don't care much.

> The other ppc405 machines appear to have seen even fewer updates after the
> OpenBlockS 600 got added in 2011, so it's possible nobody is using them any 
> more
> with modern kernels.
> 
> I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after
> they had not been maintained for years.
> 
> However, 44x (in its ppc476 incarnation) is clearly still is used
> through the fsp2 platform,
> and can not be deprecated at least until that is known to have stopped
> getting kernel
> updates.
> 
> Arnd

-- 
With Best Regards,
Andy Shevchenko




Re: [PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Arnd Bergmann
On Fri, Mar 27, 2020 at 1:12 PM Michal Simek  wrote:
>
> recently we wanted to update xilinx intc driver and we found that function
> which we wanted to remove is still wired by ancient Xilinx PowerPC
> platforms. Here is the thread about it.
> https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/
>
> I have been talking about it internally and there is no interest in these
> platforms and it is also orphan for quite a long time. None is really
> running/testing these platforms regularly that's why I think it makes sense
> to remove them also with drivers which are specific to this platform.
>
> U-Boot support was removed in 2017 without anybody complain about it
> https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed
>
> Based on current ppc/next.
>
> If anyone has any objection about it, please let me know.

Acked-by: Arnd Bergmann 

This looks reasonable to me as well, in particular as the code only
supports the two
ppc44x virtex developer boards and no commercial products.

It does raise a follow-up question about ppc40x though: is it time to
retire all of it?
The other ppc405 machines appear to have seen even fewer updates after the
OpenBlockS 600 got added in 2011, so it's possible nobody is using them any more
with modern kernels.

I see that OpenWRT removed both ppc40x and ppc44x exactly a year ago after
they had not been maintained for years.

However, 44x (in its ppc476 incarnation) is clearly still is used
through the fsp2 platform,
and can not be deprecated at least until that is known to have stopped
getting kernel
updates.

Arnd


Re: [patch V3 03/20] usb: gadget: Use completion interface instead of open coding it

2020-03-27 Thread Sebastian Siewior
On 2020-03-25 10:37:57 [+0200], Felipe Balbi wrote:
> Do you want to carry it via your tree? If so:

We would like to do so.

> Acked-by: Felipe Balbi 

Thank you.

> Otherwise, let me know and I'll pick this patch.

Sebastian


[PATCH 2/2] powerpc: Remove Xilinx PPC405/PPC440 support

2020-03-27 Thread Michal Simek
The latest Xilinx design tools called ISE and EDK has been released in
October 2013. New tool doesn't support any PPC405/PPC440 new designs.
These platforms are no longer supported and tested.

PowerPC 405/440 port is orphan from 2013 by
commit cdeb89943bfc ("MAINTAINERS: Fix incorrect status tag") and
commit 19624236cce1 ("MAINTAINERS: Update Grant's email address and 
maintainership")
that's why it is time to remove the support fot these platforms.

Signed-off-by: Michal Simek 
---

 Documentation/devicetree/bindings/xilinx.txt | 143 --
 Documentation/powerpc/bootwrapper.rst|  28 +-
 MAINTAINERS  |   6 -
 arch/powerpc/Kconfig.debug   |   2 +-
 arch/powerpc/boot/Makefile   |   7 +-
 arch/powerpc/boot/dts/Makefile   |   1 -
 arch/powerpc/boot/dts/virtex440-ml507.dts| 406 
 arch/powerpc/boot/dts/virtex440-ml510.dts| 466 ---
 arch/powerpc/boot/ops.h  |   1 -
 arch/powerpc/boot/serial.c   |   5 -
 arch/powerpc/boot/uartlite.c |  79 
 arch/powerpc/boot/virtex.c   |  97 
 arch/powerpc/boot/virtex405-head.S   |  31 --
 arch/powerpc/boot/wrapper|   8 -
 arch/powerpc/configs/40x/virtex_defconfig|  75 ---
 arch/powerpc/configs/44x/virtex5_defconfig   |  74 ---
 arch/powerpc/configs/ppc40x_defconfig|   8 -
 arch/powerpc/configs/ppc44x_defconfig|   8 -
 arch/powerpc/include/asm/xilinx_intc.h   |  16 -
 arch/powerpc/include/asm/xilinx_pci.h|  21 -
 arch/powerpc/kernel/cputable.c   |  39 --
 arch/powerpc/platforms/40x/Kconfig   |  31 --
 arch/powerpc/platforms/40x/Makefile  |   1 -
 arch/powerpc/platforms/40x/virtex.c  |  54 ---
 arch/powerpc/platforms/44x/Kconfig   |  37 --
 arch/powerpc/platforms/44x/Makefile  |   2 -
 arch/powerpc/platforms/44x/virtex.c  |  60 ---
 arch/powerpc/platforms/44x/virtex_ml510.c|  30 --
 arch/powerpc/platforms/Kconfig   |   4 -
 arch/powerpc/sysdev/Makefile |   2 -
 arch/powerpc/sysdev/xilinx_intc.c|  88 
 arch/powerpc/sysdev/xilinx_pci.c | 132 --
 arch/powerpc/xmon/ppc-dis.c  |   6 -
 arch/powerpc/xmon/ppc-opc.c  |  23 -
 arch/powerpc/xmon/ppc.h  |   5 -
 drivers/char/Kconfig |   2 +-
 drivers/video/fbdev/Kconfig  |   2 +-
 37 files changed, 7 insertions(+), 1993 deletions(-)
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml507.dts
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml510.dts
 delete mode 100644 arch/powerpc/boot/uartlite.c
 delete mode 100644 arch/powerpc/boot/virtex.c
 delete mode 100644 arch/powerpc/boot/virtex405-head.S
 delete mode 100644 arch/powerpc/configs/40x/virtex_defconfig
 delete mode 100644 arch/powerpc/configs/44x/virtex5_defconfig
 delete mode 100644 arch/powerpc/include/asm/xilinx_intc.h
 delete mode 100644 arch/powerpc/include/asm/xilinx_pci.h
 delete mode 100644 arch/powerpc/platforms/40x/virtex.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex_ml510.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_intc.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_pci.c

diff --git a/Documentation/devicetree/bindings/xilinx.txt 
b/Documentation/devicetree/bindings/xilinx.txt
index d058ace29345..28199b31fe5e 100644
--- a/Documentation/devicetree/bindings/xilinx.txt
+++ b/Documentation/devicetree/bindings/xilinx.txt
@@ -86,149 +86,6 @@
xlnx,use-parity = <0>;
};
 
-   Some IP cores actually implement 2 or more logical devices.  In
-   this case, the device should still describe the whole IP core with
-   a single node and add a child node for each logical device.  The
-   ranges property can be used to translate from parent IP-core to the
-   registers of each device.  In addition, the parent node should be
-   compatible with the bus type 'xlnx,compound', and should contain
-   #address-cells and #size-cells, as with any other bus.  (Note: this
-   makes the assumption that both logical devices have the same bus
-   binding.  If this is not true, then separate nodes should be used
-   for each logical device).  The 'cell-index' property can be used to
-   enumerate logical devices within an IP core.  For example, the
-   following is the system.mhs entry for the dual ps2 controller found
-   on the ml403 reference design.
-
-   BEGIN opb_ps2_dual_ref
-   PARAMETER INSTANCE = opb_ps2_dual_ref_0
-   PARAMETER HW_VER = 1.00.a
-   PARAMETER C_BASEADDR = 0xA900
-   PARAMETER C_HIGHADDR = 0xA9001FFF
-   BUS_INTERFACE SOPB = opb_v20_0
-   PORT Sys_Intr1 = ps2_1_intr
-   PORT Sys_Intr2 = ps2_2_intr
-   PORT 

[PATCH 0/2] powerpc: Remove support for ppc405/440 Xilinx platforms

2020-03-27 Thread Michal Simek
Hi,

recently we wanted to update xilinx intc driver and we found that function
which we wanted to remove is still wired by ancient Xilinx PowerPC
platforms. Here is the thread about it.
https://lore.kernel.org/linux-next/48d3232d-0f1d-42ea-3109-f44bbabfa...@xilinx.com/

I have been talking about it internally and there is no interest in these
platforms and it is also orphan for quite a long time. None is really
running/testing these platforms regularly that's why I think it makes sense
to remove them also with drivers which are specific to this platform.

U-Boot support was removed in 2017 without anybody complain about it
https://github.com/Xilinx/u-boot-xlnx/commit/98f705c9cefdfdba62c069821bbba10273a0a8ed

Based on current ppc/next.

If anyone has any objection about it, please let me know.

Thanks,
Michal


Michal Simek (2):
  sound: ac97: Remove sound driver for ancient platform
  powerpc: Remove Xilinx PPC405/PPC440 support

 Documentation/devicetree/bindings/xilinx.txt |  143 --
 Documentation/powerpc/bootwrapper.rst|   28 +-
 MAINTAINERS  |6 -
 arch/powerpc/Kconfig.debug   |2 +-
 arch/powerpc/boot/Makefile   |7 +-
 arch/powerpc/boot/dts/Makefile   |1 -
 arch/powerpc/boot/dts/virtex440-ml507.dts|  406 --
 arch/powerpc/boot/dts/virtex440-ml510.dts|  466 ---
 arch/powerpc/boot/ops.h  |1 -
 arch/powerpc/boot/serial.c   |5 -
 arch/powerpc/boot/uartlite.c |   79 --
 arch/powerpc/boot/virtex.c   |   97 --
 arch/powerpc/boot/virtex405-head.S   |   31 -
 arch/powerpc/boot/wrapper|8 -
 arch/powerpc/configs/40x/virtex_defconfig|   75 -
 arch/powerpc/configs/44x/virtex5_defconfig   |   74 -
 arch/powerpc/configs/ppc40x_defconfig|8 -
 arch/powerpc/configs/ppc44x_defconfig|8 -
 arch/powerpc/include/asm/xilinx_intc.h   |   16 -
 arch/powerpc/include/asm/xilinx_pci.h|   21 -
 arch/powerpc/kernel/cputable.c   |   39 -
 arch/powerpc/platforms/40x/Kconfig   |   31 -
 arch/powerpc/platforms/40x/Makefile  |1 -
 arch/powerpc/platforms/40x/virtex.c  |   54 -
 arch/powerpc/platforms/44x/Kconfig   |   37 -
 arch/powerpc/platforms/44x/Makefile  |2 -
 arch/powerpc/platforms/44x/virtex.c  |   60 -
 arch/powerpc/platforms/44x/virtex_ml510.c|   30 -
 arch/powerpc/platforms/Kconfig   |4 -
 arch/powerpc/sysdev/Makefile |2 -
 arch/powerpc/sysdev/xilinx_intc.c|   88 --
 arch/powerpc/sysdev/xilinx_pci.c |  132 --
 arch/powerpc/xmon/ppc-dis.c  |6 -
 arch/powerpc/xmon/ppc-opc.c  |   23 -
 arch/powerpc/xmon/ppc.h  |5 -
 drivers/char/Kconfig |2 +-
 drivers/video/fbdev/Kconfig  |2 +-
 sound/drivers/Kconfig|   12 -
 sound/drivers/Makefile   |2 -
 sound/drivers/ml403-ac97cr.c | 1298 --
 40 files changed, 7 insertions(+), 3305 deletions(-)
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml507.dts
 delete mode 100644 arch/powerpc/boot/dts/virtex440-ml510.dts
 delete mode 100644 arch/powerpc/boot/uartlite.c
 delete mode 100644 arch/powerpc/boot/virtex.c
 delete mode 100644 arch/powerpc/boot/virtex405-head.S
 delete mode 100644 arch/powerpc/configs/40x/virtex_defconfig
 delete mode 100644 arch/powerpc/configs/44x/virtex5_defconfig
 delete mode 100644 arch/powerpc/include/asm/xilinx_intc.h
 delete mode 100644 arch/powerpc/include/asm/xilinx_pci.h
 delete mode 100644 arch/powerpc/platforms/40x/virtex.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex.c
 delete mode 100644 arch/powerpc/platforms/44x/virtex_ml510.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_intc.c
 delete mode 100644 arch/powerpc/sysdev/xilinx_pci.c
 delete mode 100644 sound/drivers/ml403-ac97cr.c

-- 
2.26.0



Re: [PATCH v3] powerpc/perf: Use SIER_USER_MASK while updating SPRN_SIER for EBB events

2020-03-27 Thread Athira Rajeev



> On 19-Mar-2020, at 4:22 PM, Michael Ellerman  wrote:
> 
> Hi Athira,
> 
> Athira Rajeev  writes:
>> Sampled Instruction Event Register (SIER), is a PMU register,
>   ^
>   that
>> captures architecture state for a given sample. And sier_user_mask
>   ^  ^
>   don't think we need "architecture" SIER_USER_MASK
> 
>> defined in commit 330a1eb7775b ("powerpc/perf: Core EBB support for 64-bit
>> book3s") defines the architected bits that needs to be saved from the SPR.
> 
> Not quite, it defines the bits that are visible to userspace.
> 
> And I think it's true that for EBB events the bits we need/want to save
> are only the user visible bits.
> 
>> Currently all of the bits from SIER are saved for EBB events. Patch fixes
>> this by ANDing the "sier_user_mask" to data from SIER in ebb_switch_out().
>> This will force save only architected bits from the SIER.
> 
> s/architected/user visible/
> 
> 
> But, why does it matter? The kernel saves the user visible bits, as well
> as the kernel-only bits into the thread struct. And then later the
> kernel restores that value into the hardware before returning to
> userspace.
> 
> But the hardware enforces the visibility of the bits, so userspace can't
> observe any bits that it shouldn't.
> 
> Or is there some other mechanism whereby userspace can see those bits? ;)
> 
> If there was, what would the security implications of that be?

Hi Michael,

Thanks for your comments. 

In ebb_switch_in, we set PMCC bit [MMCR0 44:45 ] to 10 which means SIER ( Group 
B ) register is readable in problem state. Hence the intention of the patch was 
to make sure we are not exposing the bits which the userspace shouldn't be 
reading. 

But following your comment about "hardware enforcing the visibility of bits", I 
did try an "ebb" experiment which showed that reading SPRN_SIER didn't expose 
any bits other than the user visible bits. Sorry for the confusion here. 

In that case, Can we drop the existing definition of SIER_USER_MASK if it is no 
more needed ?

Thanks
Athira
> 
> cheers
> 
>> Fixes: 330a1eb7775b ("powerpc/perf: Core EBB support for 64-bit book3s")
>> Signed-off-by: Athira Rajeev 
>> ---
>> Changelog:
>> v2 -> v3:
>> - Corrected name of SIER register in commit message
>>  as pointed by Segher Boessenkool
>> v1 -> v2:
>> - Make the commit message more clearer.
>> 
>> arch/powerpc/perf/core-book3s.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/arch/powerpc/perf/core-book3s.c 
>> b/arch/powerpc/perf/core-book3s.c
>> index 3086055..48b61cc 100644
>> --- a/arch/powerpc/perf/core-book3s.c
>> +++ b/arch/powerpc/perf/core-book3s.c
>> @@ -579,7 +579,7 @@ static void ebb_switch_out(unsigned long mmcr0)
>>  return;
>> 
>>  current->thread.siar  = mfspr(SPRN_SIAR);
>> -current->thread.sier  = mfspr(SPRN_SIER);
>> +current->thread.sier  = mfspr(SPRN_SIER) & SIER_USER_MASK;
>>  current->thread.sdar  = mfspr(SPRN_SDAR);
>>  current->thread.mmcr0 = mmcr0 & MMCR0_USER_MASK;
>>  current->thread.mmcr2 = mfspr(SPRN_MMCR2) & MMCR2_USER_MASK;
>> -- 
>> 1.8.3.1



[PATCH v4 4/6] powerpc/sysfs: Show idle_purr and idle_spurr for every CPU

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

The total PURR and SPURR ticks are already exposed via the per-cpu
sysfs files "purr" and "spurr". This patch adds support for exposing
the idle PURR and SPURR ticks via new per-cpu sysfs files named
"idle_purr" and "idle_spurr".

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/sysfs.c | 82 +++--
 1 file changed, 79 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 479c706..571b325 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 #include "cacheinfo.h"
@@ -760,6 +761,74 @@ static void create_svm_file(void)
 }
 #endif /* CONFIG_PPC_SVM */
 
+#ifdef CONFIG_PPC_PSERIES
+static void read_idle_purr(void *val)
+{
+   u64 *ret = val;
+
+   *ret = read_this_idle_purr();
+}
+
+static ssize_t idle_purr_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   struct cpu *cpu = container_of(dev, struct cpu, dev);
+   u64 val;
+
+   smp_call_function_single(cpu->dev.id, read_idle_purr, , 1);
+   return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_purr, 0400, idle_purr_show, NULL);
+
+static void create_idle_purr_file(struct device *s)
+{
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   device_create_file(s, _attr_idle_purr);
+}
+
+static void remove_idle_purr_file(struct device *s)
+{
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   device_remove_file(s, _attr_idle_purr);
+}
+
+static void read_idle_spurr(void *val)
+{
+   u64 *ret = val;
+
+   *ret = read_this_idle_spurr();
+}
+
+static ssize_t idle_spurr_show(struct device *dev,
+  struct device_attribute *attr, char *buf)
+{
+   struct cpu *cpu = container_of(dev, struct cpu, dev);
+   u64 val;
+
+   smp_call_function_single(cpu->dev.id, read_idle_spurr, , 1);
+   return sprintf(buf, "%llx\n", val);
+}
+static DEVICE_ATTR(idle_spurr, 0400, idle_spurr_show, NULL);
+
+static void create_idle_spurr_file(struct device *s)
+{
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   device_create_file(s, _attr_idle_spurr);
+}
+
+static void remove_idle_spurr_file(struct device *s)
+{
+   if (firmware_has_feature(FW_FEATURE_LPAR))
+   device_remove_file(s, _attr_idle_spurr);
+}
+
+#else /* CONFIG_PPC_PSERIES */
+#define create_idle_purr_file(s)
+#define remove_idle_purr_file(s)
+#define create_idle_spurr_file(s)
+#define remove_idle_spurr_file(s)
+#endif /* CONFIG_PPC_PSERIES */
+
 static int register_cpu_online(unsigned int cpu)
 {
struct cpu *c = _cpu(cpu_devices, cpu);
@@ -823,10 +892,13 @@ static int register_cpu_online(unsigned int cpu)
if (!firmware_has_feature(FW_FEATURE_LPAR))
add_write_permission_dev_attr(_attr_purr);
device_create_file(s, _attr_purr);
+   create_idle_purr_file(s);
}
 
-   if (cpu_has_feature(CPU_FTR_SPURR))
+   if (cpu_has_feature(CPU_FTR_SPURR)) {
device_create_file(s, _attr_spurr);
+   create_idle_spurr_file(s);
+   }
 
if (cpu_has_feature(CPU_FTR_DSCR))
device_create_file(s, _attr_dscr);
@@ -910,11 +982,15 @@ static int unregister_cpu_online(unsigned int cpu)
device_remove_file(s, _attr_mmcra);
 #endif /* CONFIG_PMU_SYSFS */
 
-   if (cpu_has_feature(CPU_FTR_PURR))
+   if (cpu_has_feature(CPU_FTR_PURR)) {
device_remove_file(s, _attr_purr);
+   remove_idle_purr_file(s);
+   }
 
-   if (cpu_has_feature(CPU_FTR_SPURR))
+   if (cpu_has_feature(CPU_FTR_SPURR)) {
device_remove_file(s, _attr_spurr);
+   remove_idle_spurr_file(s);
+   }
 
if (cpu_has_feature(CPU_FTR_DSCR))
device_remove_file(s, _attr_dscr);
-- 
1.9.4



[PATCH v4 6/6] pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently purr, spurr, idle_purr, idle_spurr are exposed for every CPU
via the sysfs interface
/sys/devices/system/cpu/cpuX/[idle_][s]purr. Each sysfs read currently
generates an IPI to obtain the desired value from the target CPU X.
Since these aforementioned sysfs are typically read one after another,
we end up generating 4 IPIs per CPU in a short duration.

In order to minimize the IPI noise, this patch caches the values of
all the four entities whenever one of them is read. If subsequently
any of these are read within the next 10ms, the cached value is
returned. With this, we will generate at most one IPI every 10ms for
every CPU.

Test-results: While reading the four sysfs files back-to-back for a
given CPU every second for 100 seconds.

Without the patch:
 16 [XICS 2 Edge IPI] = 422 times
 DBL [Doorbell interrupts] = 13 times
 Total : 435 IPIs.

With the patch:
  16 [XICS 2 Edge IPI] = 111 times
  DBL [Doorbell interrupts] = 17 times
  Total : 128 IPIs.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/kernel/sysfs.c | 117 
 1 file changed, 97 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/sysfs.c b/arch/powerpc/kernel/sysfs.c
index 571b325..bd92023 100644
--- a/arch/powerpc/kernel/sysfs.c
+++ b/arch/powerpc/kernel/sysfs.c
@@ -586,8 +586,6 @@ void ppc_enable_pmcs(void)
  * SPRs which are not related to PMU.
  */
 #ifdef CONFIG_PPC64
-SYSFS_SPRSETUP(purr, SPRN_PURR);
-SYSFS_SPRSETUP(spurr, SPRN_SPURR);
 SYSFS_SPRSETUP(pir, SPRN_PIR);
 SYSFS_SPRSETUP(tscr, SPRN_TSCR);
 
@@ -596,8 +594,6 @@ void ppc_enable_pmcs(void)
   enable write when needed with a separate function.
   Lets be conservative and default to pseries.
 */
-static DEVICE_ATTR(spurr, 0400, show_spurr, NULL);
-static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
 static DEVICE_ATTR(pir, 0400, show_pir, NULL);
 static DEVICE_ATTR(tscr, 0600, show_tscr, store_tscr);
 #endif /* CONFIG_PPC64 */
@@ -761,22 +757,110 @@ static void create_svm_file(void)
 }
 #endif /* CONFIG_PPC_SVM */
 
+#ifdef CONFIG_PPC64
+/*
+ * The duration (in ms) from the last IPI to the target CPU until
+ * which a cached value of purr, spurr, idle_purr, idle_spurr can be
+ * reported to the user on a corresponding sysfs file read. Beyond
+ * this duration, fresh values need to be obtained by sending IPIs to
+ * the target CPU when the sysfs files are read.
+ */
+static unsigned long util_stats_staleness_tolerance_ms = 10;
+struct util_acct_stats {
+   u64 latest_purr;
+   u64 latest_spurr;
+#ifdef CONFIG_PPC_PSERIES
+   u64 latest_idle_purr;
+   u64 latest_idle_spurr;
+#endif
+   unsigned long last_update_jiffies;
+};
+
+DEFINE_PER_CPU(struct util_acct_stats, util_acct_stats);
+
+static void update_util_acct_stats(void *ptr)
+{
+   struct util_acct_stats *stats = ptr;
+
+   stats->latest_purr = mfspr(SPRN_PURR);
+   stats->latest_spurr = mfspr(SPRN_SPURR);
 #ifdef CONFIG_PPC_PSERIES
-static void read_idle_purr(void *val)
+   stats->latest_idle_purr = read_this_idle_purr();
+   stats->latest_idle_spurr = read_this_idle_spurr();
+#endif
+   stats->last_update_jiffies = jiffies;
+}
+
+struct util_acct_stats *get_util_stats_ptr(int cpu)
+{
+   struct util_acct_stats *stats = per_cpu_ptr(_acct_stats, cpu);
+   unsigned long delta_jiffies;
+
+   delta_jiffies = jiffies - stats->last_update_jiffies;
+
+   /*
+* If we have a recent enough data, reuse that instead of
+* sending an IPI.
+*/
+   if (jiffies_to_msecs(delta_jiffies) < util_stats_staleness_tolerance_ms)
+   return stats;
+
+   smp_call_function_single(cpu, update_util_acct_stats, stats, 1);
+   return stats;
+}
+
+static ssize_t show_purr(struct device *dev,
+struct device_attribute *attr, char *buf)
 {
-   u64 *ret = val;
+   struct cpu *cpu = container_of(dev, struct cpu, dev);
+   struct util_acct_stats *stats;
 
-   *ret = read_this_idle_purr();
+   stats = get_util_stats_ptr(cpu->dev.id);
+   return sprintf(buf, "%llx\n", stats->latest_purr);
 }
 
+static void write_purr(void *val)
+{
+   mtspr(SPRN_PURR, *(unsigned long *)val);
+}
+
+static ssize_t __used store_purr(struct device *dev,
+struct device_attribute *attr,
+const char *buf, size_t count)
+{
+   struct cpu *cpu = container_of(dev, struct cpu, dev);
+   unsigned long val;
+   int ret = kstrtoul(buf, 16, );
+
+   if (ret != 0)
+   return -EINVAL;
+
+   smp_call_function_single(cpu->dev.id, write_purr, , 1);
+   return count;
+}
+static DEVICE_ATTR(purr, 0400, show_purr, store_purr);
+
+static ssize_t show_spurr(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+   

[PATCH v4 5/6] Documentation: Document sysfs interfaces purr, spurr, idle_purr, idle_spurr

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Add documentation for the following sysfs interfaces:
/sys/devices/system/cpu/cpuX/purr
/sys/devices/system/cpu/cpuX/spurr
/sys/devices/system/cpu/cpuX/idle_purr
/sys/devices/system/cpu/cpuX/idle_spurr

Signed-off-by: Gautham R. Shenoy 
---
 Documentation/ABI/testing/sysfs-devices-system-cpu | 39 ++
 1 file changed, 39 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-devices-system-cpu 
b/Documentation/ABI/testing/sysfs-devices-system-cpu
index 2e0e3b4..bc07677 100644
--- a/Documentation/ABI/testing/sysfs-devices-system-cpu
+++ b/Documentation/ABI/testing/sysfs-devices-system-cpu
@@ -580,3 +580,42 @@ Description:   Secure Virtual Machine
If 1, it means the system is using the Protected Execution
Facility in POWER9 and newer processors. i.e., it is a Secure
Virtual Machine.
+
+What:  /sys/devices/system/cpu/cpuX/purr
+Date:  Apr 2005
+Contact:   Linux for PowerPC mailing list 
+Description:   PURR ticks for this CPU since the system boot.
+
+   The Processor Utilization Resources Register (PURR) is
+   a 64-bit counter which provides an estimate of the
+   resources used by the CPU thread. The contents of this
+   register increases monotonically. This sysfs interface
+   exposes the number of PURR ticks for cpuX.
+
+What:  /sys/devices/system/cpu/cpuX/spurr
+Date:  Dec 2006
+Contact:   Linux for PowerPC mailing list 
+Description:   SPURR ticks for this CPU since the system boot.
+
+   The Scaled Processor Utilization Resources Register
+   (SPURR) is a 64-bit counter that provides a frequency
+   invariant estimate of the resources used by the CPU
+   thread. The contents of this register increases
+   monotonically. This sysfs interface exposes the number
+   of SPURR ticks for cpuX.
+
+What:  /sys/devices/system/cpu/cpuX/idle_purr
+Date:  Mar 2020
+Contact:   Linux for PowerPC mailing list 
+Description:   PURR ticks for cpuX when it was idle.
+
+   This sysfs interface exposes the number of PURR ticks
+   for cpuX when it was idle.
+
+What:  /sys/devices/system/cpu/cpuX/idle_spurr
+Date:  Mar 2020
+Contact:   Linux for PowerPC mailing list 
+Description:   SPURR ticks for cpuX when it was idle.
+
+   This sysfs interface exposes the number of SPURR ticks
+   for cpuX when it was idle.
-- 
1.9.4



[PATCH v4 1/6] powerpc: Move idle_loop_prolog()/epilog() functions to header file

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently prior to entering an idle state on a Linux Guest, the
pseries cpuidle driver implement an idle_loop_prolog() and
idle_loop_epilog() functions which ensure that idle_purr is correctly
computed, and the hypervisor is informed that the CPU cycles have been
donated.

These prolog and epilog functions are also required in the default
idle call, i.e pseries_lpar_idle(). Hence move these accessor
functions to a common header file and call them from
pseries_lpar_idle(). Since the existing header files such as
asm/processor.h have enough clutter, create a new header file
asm/idle.h. Finally rename idle_loop_prolog() and idle_loop_epilog()
to pseries_idle_prolog() and pseries_idle_epilog() as they are only
relavent for on pseries guests.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/idle.h| 31 +
 arch/powerpc/platforms/pseries/setup.c |  7 +--
 drivers/cpuidle/cpuidle-pseries.c  | 36 +++---
 3 files changed, 43 insertions(+), 31 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
new file mode 100644
index 000..32064a4c
--- /dev/null
+++ b/arch/powerpc/include/asm/idle.h
@@ -0,0 +1,31 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+#ifndef _ASM_POWERPC_IDLE_H
+#define _ASM_POWERPC_IDLE_H
+#include 
+#include 
+
+#ifdef CONFIG_PPC_PSERIES
+static inline void pseries_idle_prolog(unsigned long *in_purr)
+{
+   ppc64_runlatch_off();
+   *in_purr = mfspr(SPRN_PURR);
+   /*
+* Indicate to the HV that we are idle. Now would be
+* a good time to find other work to dispatch.
+*/
+   get_lppaca()->idle = 1;
+}
+
+static inline void pseries_idle_epilog(unsigned long in_purr)
+{
+   u64 wait_cycles;
+
+   wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+   wait_cycles += mfspr(SPRN_PURR) - in_purr;
+   get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+   get_lppaca()->idle = 0;
+
+   ppc64_runlatch_on();
+}
+#endif /* CONFIG_PPC_PSERIES */
+#endif
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 0c8421d..2f53e6b 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -68,6 +68,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -319,6 +320,8 @@ static int alloc_dispatch_log_kmem_cache(void)
 
 static void pseries_lpar_idle(void)
 {
+   unsigned long in_purr;
+
/*
 * Default handler to go into low thread priority and possibly
 * low power mode by ceding processor to hypervisor
@@ -328,7 +331,7 @@ static void pseries_lpar_idle(void)
return;
 
/* Indicate to hypervisor that we are idle. */
-   get_lppaca()->idle = 1;
+   pseries_idle_prolog(_purr);
 
/*
 * Yield the processor to the hypervisor.  We return if
@@ -339,7 +342,7 @@ static void pseries_lpar_idle(void)
 */
cede_processor();
 
-   get_lppaca()->idle = 0;
+   pseries_idle_epilog(in_purr);
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c 
b/drivers/cpuidle/cpuidle-pseries.c
index 74c2479..46d5e05 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -19,6 +19,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 
 struct cpuidle_driver pseries_idle_driver = {
@@ -31,29 +32,6 @@ struct cpuidle_driver pseries_idle_driver = {
 static u64 snooze_timeout __read_mostly;
 static bool snooze_timeout_en __read_mostly;
 
-static inline void idle_loop_prolog(unsigned long *in_purr)
-{
-   ppc64_runlatch_off();
-   *in_purr = mfspr(SPRN_PURR);
-   /*
-* Indicate to the HV that we are idle. Now would be
-* a good time to find other work to dispatch.
-*/
-   get_lppaca()->idle = 1;
-}
-
-static inline void idle_loop_epilog(unsigned long in_purr)
-{
-   u64 wait_cycles;
-
-   wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-   wait_cycles += mfspr(SPRN_PURR) - in_purr;
-   get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
-   get_lppaca()->idle = 0;
-
-   ppc64_runlatch_on();
-}
-
 static int snooze_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
@@ -63,7 +41,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
set_thread_flag(TIF_POLLING_NRFLAG);
 
-   idle_loop_prolog(_purr);
+   pseries_idle_prolog(_purr);
local_irq_enable();
snooze_exit_time = get_tb() + snooze_timeout;
 
@@ -87,7 +65,7 @@ static int snooze_loop(struct cpuidle_device *dev,
 
local_irq_disable();
 
-   idle_loop_epilog(in_purr);
+   pseries_idle_epilog(in_purr);
 
return index;
 }
@@ -115,7 +93,7 @@ static int 

[PATCH v4 3/6] powerpc/pseries: Account for SPURR ticks on idle CPUs

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

On Pseries LPARs, to calculate utilization, we need to know the
[S]PURR ticks when the CPUs were busy or idle.

Via pseries_idle_prolog(), pseries_idle_epilog(), we track the idle
PURR ticks in the VPA variable "wait_state_cycles". This patch extends
the support to account for the idle SPURR ticks. It also provides an
accessor function to accurately reads idle SPURR ticks.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/idle.h| 33 +
 arch/powerpc/platforms/pseries/setup.c |  2 ++
 2 files changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index d4bfb6a..accd1f5 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -5,13 +5,20 @@
 #include 
 
 #ifdef CONFIG_PPC_PSERIES
+DECLARE_PER_CPU(u64, idle_spurr_cycles);
 DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+DECLARE_PER_CPU(u64, idle_entry_spurr_snap);
 
 static inline void snapshot_purr_idle_entry(void)
 {
*this_cpu_ptr(_entry_purr_snap) = mfspr(SPRN_PURR);
 }
 
+static inline void snapshot_spurr_idle_entry(void)
+{
+   *this_cpu_ptr(_entry_spurr_snap) = mfspr(SPRN_SPURR);
+}
+
 static inline void update_idle_purr_accounting(void)
 {
u64 wait_cycles;
@@ -22,10 +29,19 @@ static inline void update_idle_purr_accounting(void)
get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
 }
 
+static inline void update_idle_spurr_accounting(void)
+{
+   u64 *idle_spurr_cycles_ptr = this_cpu_ptr(_spurr_cycles);
+   u64 in_spurr = *this_cpu_ptr(_entry_spurr_snap);
+
+   *idle_spurr_cycles_ptr += mfspr(SPRN_SPURR) - in_spurr;
+}
+
 static inline void pseries_idle_prolog(void)
 {
ppc64_runlatch_off();
snapshot_purr_idle_entry();
+   snapshot_spurr_idle_entry();
/*
 * Indicate to the HV that we are idle. Now would be
 * a good time to find other work to dispatch.
@@ -36,6 +52,7 @@ static inline void pseries_idle_prolog(void)
 static inline void pseries_idle_epilog(void)
 {
update_idle_purr_accounting();
+   update_idle_spurr_accounting();
get_lppaca()->idle = 0;
ppc64_runlatch_on();
 }
@@ -56,5 +73,21 @@ static inline u64 read_this_idle_purr(void)
return be64_to_cpu(get_lppaca()->wait_state_cycles);
 }
 
+static inline u64 read_this_idle_spurr(void)
+{
+   /*
+* If we are reading from an idle context, update the
+* idle-spurr cycles corresponding to the last idle period.
+* Since the idle context is not yet over, take a fresh
+* snapshot of the idle-spurr.
+*/
+   if (get_lppaca()->idle == 1) {
+   update_idle_spurr_accounting();
+   snapshot_spurr_idle_entry();
+   }
+
+   return *this_cpu_ptr(_spurr_cycles);
+}
+
 #endif /* CONFIG_PPC_PSERIES */
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 4905c96..1b55e80 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,7 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_spurr_cycles);
 DEFINE_PER_CPU(u64, idle_entry_purr_snap);
+DEFINE_PER_CPU(u64, idle_entry_spurr_snap);
 static void pseries_lpar_idle(void)
 {
/*
-- 
1.9.4



[PATCH v4 2/6] powerpc/idle: Add accessor function to always read latest idle PURR

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Currently when CPU goes idle, we take a snapshot of PURR via
pseries_idle_prolog() which is used at the CPU idle exit to compute
the idle PURR cycles via the function pseries_idle_epilog().  Thus,
the value of idle PURR cycle thus read before pseries_idle_prolog() and
after pseries_idle_epilog() is always correct.

However, if we were to read the idle PURR cycles from an interrupt
context between pseries_idle_prolog() and pseries_idle_epilog() (this will
be done in a future patch), then, the value of the idle PURR thus read
will not include the cycles spent in the most recent idle period.

This patch addresses the issue by providing accessor function to read
the idle PURR such such that it includes the cycles spent in the most
recent idle period, if we read it between pseries_idle_prolog() and
pseries_idle_epilog(). In order to achieve it, the patch saves the
snapshot of PURR in pseries_idle_prolog() in a per-cpu variable,
instead of on the stack, so that it can be accessed from an interrupt
context.

Signed-off-by: Gautham R. Shenoy 
---
 arch/powerpc/include/asm/idle.h| 47 +++---
 arch/powerpc/platforms/pseries/setup.c |  7 +++--
 drivers/cpuidle/cpuidle-pseries.c  | 15 +--
 3 files changed, 47 insertions(+), 22 deletions(-)

diff --git a/arch/powerpc/include/asm/idle.h b/arch/powerpc/include/asm/idle.h
index 32064a4c..d4bfb6a 100644
--- a/arch/powerpc/include/asm/idle.h
+++ b/arch/powerpc/include/asm/idle.h
@@ -5,10 +5,27 @@
 #include 
 
 #ifdef CONFIG_PPC_PSERIES
-static inline void pseries_idle_prolog(unsigned long *in_purr)
+DECLARE_PER_CPU(u64, idle_entry_purr_snap);
+
+static inline void snapshot_purr_idle_entry(void)
+{
+   *this_cpu_ptr(_entry_purr_snap) = mfspr(SPRN_PURR);
+}
+
+static inline void update_idle_purr_accounting(void)
+{
+   u64 wait_cycles;
+   u64 in_purr = *this_cpu_ptr(_entry_purr_snap);
+
+   wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
+   wait_cycles += mfspr(SPRN_PURR) - in_purr;
+   get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+}
+
+static inline void pseries_idle_prolog(void)
 {
ppc64_runlatch_off();
-   *in_purr = mfspr(SPRN_PURR);
+   snapshot_purr_idle_entry();
/*
 * Indicate to the HV that we are idle. Now would be
 * a good time to find other work to dispatch.
@@ -16,16 +33,28 @@ static inline void pseries_idle_prolog(unsigned long 
*in_purr)
get_lppaca()->idle = 1;
 }
 
-static inline void pseries_idle_epilog(unsigned long in_purr)
+static inline void pseries_idle_epilog(void)
 {
-   u64 wait_cycles;
-
-   wait_cycles = be64_to_cpu(get_lppaca()->wait_state_cycles);
-   wait_cycles += mfspr(SPRN_PURR) - in_purr;
-   get_lppaca()->wait_state_cycles = cpu_to_be64(wait_cycles);
+   update_idle_purr_accounting();
get_lppaca()->idle = 0;
-
ppc64_runlatch_on();
 }
+
+static inline u64 read_this_idle_purr(void)
+{
+   /*
+* If we are reading from an idle context, update the
+* idle-purr cycles corresponding to the last idle period.
+* Since the idle context is not yet over, take a fresh
+* snapshot of the idle-purr.
+*/
+   if (unlikely(get_lppaca()->idle == 1)) {
+   update_idle_purr_accounting();
+   snapshot_purr_idle_entry();
+   }
+
+   return be64_to_cpu(get_lppaca()->wait_state_cycles);
+}
+
 #endif /* CONFIG_PPC_PSERIES */
 #endif
diff --git a/arch/powerpc/platforms/pseries/setup.c 
b/arch/powerpc/platforms/pseries/setup.c
index 2f53e6b..4905c96 100644
--- a/arch/powerpc/platforms/pseries/setup.c
+++ b/arch/powerpc/platforms/pseries/setup.c
@@ -318,10 +318,9 @@ static int alloc_dispatch_log_kmem_cache(void)
 }
 machine_early_initcall(pseries, alloc_dispatch_log_kmem_cache);
 
+DEFINE_PER_CPU(u64, idle_entry_purr_snap);
 static void pseries_lpar_idle(void)
 {
-   unsigned long in_purr;
-
/*
 * Default handler to go into low thread priority and possibly
 * low power mode by ceding processor to hypervisor
@@ -331,7 +330,7 @@ static void pseries_lpar_idle(void)
return;
 
/* Indicate to hypervisor that we are idle. */
-   pseries_idle_prolog(_purr);
+   pseries_idle_prolog();
 
/*
 * Yield the processor to the hypervisor.  We return if
@@ -342,7 +341,7 @@ static void pseries_lpar_idle(void)
 */
cede_processor();
 
-   pseries_idle_epilog(in_purr);
+   pseries_idle_epilog();
 }
 
 /*
diff --git a/drivers/cpuidle/cpuidle-pseries.c 
b/drivers/cpuidle/cpuidle-pseries.c
index 46d5e05..6513ef2 100644
--- a/drivers/cpuidle/cpuidle-pseries.c
+++ b/drivers/cpuidle/cpuidle-pseries.c
@@ -36,12 +36,11 @@ static int snooze_loop(struct cpuidle_device *dev,
struct cpuidle_driver *drv,
int index)
 {
-   unsigned long in_purr;
u64 

[PATCH v4 0/6] [PATCH v4 0/6] Track and expose idle PURR and SPURR ticks

2020-03-27 Thread Gautham R. Shenoy
From: "Gautham R. Shenoy" 

Hi,

This is the fourth version of the patches to track and expose idle PURR
and SPURR ticks. These patches are required by tools such as lparstat
to compute system utilization for capacity planning purposes.

The previous versions can be found here:
v3: https://lkml.org/lkml/2020/3/11/331
v2: https://lkml.org/lkml/2020/2/21/21
v1: https://lore.kernel.org/patchwork/cover/1159341/

They key changes from v3 are:

   - Fixed the build errors on !CONFIG_PPC64 and !CONFIG_PPC_PSERIES
 configurations notified by the kbuild bot.


Motivation:
===
On PSeries LPARs, the data centers planners desire a more accurate
view of system utilization per resource such as CPU to plan the system
capacity requirements better. Such accuracy can be obtained by reading
PURR/SPURR registers for CPU resource utilization.

Tools such as lparstat which are used to compute the utilization need
to know [S]PURR ticks when the cpu was busy or idle. The [S]PURR
counters are already exposed through sysfs.  We already account for
PURR ticks when we go to idle so that we can update the VPA area. This
patchset extends support to account for SPURR ticks when idle, and
expose both via per-cpu sysfs files.

This patch series also introduces a patch (Patch 6/6) to send an IPI
in order to read and cache the values of purr, spurr, idle_purr and
idle_spurr of the target CPU when any one of them is read via
sysfs. These cached values will be presented if any of these sysfs are
read within the next 10ms. If these sysfs files are read after 10ms
from the earlier IPI, a fresh IPI is issued to read and cache the
values again. This minimizes the number of IPIs required to be sent
when these values are read back-to-back via the sysfs interface.

Without patch 6/6 (Without caching): 
 16 [XICS 2 Edge IPI] = 422 times
 DBL [Doorbell interrupts] = 13 times
 Total : 435 IPIs.

With patch 6/6 (With caching):
  16 [XICS 2 Edge IPI] = 111 times
  DBL [Doorbell interrupts] = 17 times
  Total : 128 IPIs.

These patches are required for enhancement to the lparstat utility
that compute the CPU utilization based on PURR and SPURR which can be
found here :
https://groups.google.com/forum/#!topic/powerpc-utils-devel/fYRo69xO9r4


With the patches, when lparstat is run on a LPAR running CPU-Hogs,
=
sudo ./src/lparstat -E 1 3
System Configuration
type=Dedicated mode=Capped smt=8 lcpu=2 mem=4834176 kB cpus=0 ent=2.00 
---Actual--- -Normalized-
%busy  %idle   Frequency %busy  %idle
-- --  - -- --
 99.99   0.00  3.35GHz[111%] 110.99   0.00
100.00   0.00  3.35GHz[111%] 111.00   0.00
100.00   0.00  3.35GHz[111%] 111.00   0.00

With patches, when lparstat is run on and idle LPAR
=
---Actual--- -Normalized-
%busy  %idle   Frequency %busy  %idle
-- --  - -- --
0.20  99.81  2.17GHz[ 72%]   0.19  71.82
0.42  99.58  2.11GHz[ 70%]   0.31  69.69
0.41  99.59  2.11GHz[ 70%]   0.31  69.69

Gautham R. Shenoy (6):
  powerpc: Move idle_loop_prolog()/epilog() functions to header file
  powerpc/idle: Add accessor function to always read latest idle PURR
  powerpc/pseries: Account for SPURR ticks on idle CPUs
  powerpc/sysfs: Show idle_purr and idle_spurr for every CPU
  Documentation: Document sysfs interfaces purr, spurr, idle_purr,
idle_spurr
  pseries/sysfs: Minimise IPI noise while reading [idle_][s]purr

 Documentation/ABI/testing/sysfs-devices-system-cpu |  39 +
 arch/powerpc/include/asm/idle.h|  93 
 arch/powerpc/kernel/sysfs.c| 167 -
 arch/powerpc/platforms/pseries/setup.c |   8 +-
 drivers/cpuidle/cpuidle-pseries.c  |  39 +
 5 files changed, 305 insertions(+), 41 deletions(-)
 create mode 100644 arch/powerpc/include/asm/idle.h

-- 
1.9.4



[PATCH v7 6/6] perf/tools/pmu-events/powerpc: Add hv_24x7 socket/chip level metric events

2020-03-27 Thread Kajol Jain
The hv_24×7 feature in IBM® POWER9™ processor-based servers provide the
facility to continuously collect large numbers of hardware performance
metrics efficiently and accurately.
This patch adds hv_24x7  metric file for different Socket/chip
resources.

Result:

power9 platform:

command:# ./perf stat --metric-only -M Memory_RD_BW_Chip -C 0 -I 1000

 1.96188  0.9  0.3
 2.000285720  0.5  0.1
 3.000424990  0.4  0.1

command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000

 1.979812.32.3
 2.0002917132.32.3
 3.0004217192.32.3
 4.0005509122.32.3

Signed-off-by: Kajol Jain 
---
 .../arch/powerpc/power9/nest_metrics.json | 19 +++
 1 file changed, 19 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json

diff --git a/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json 
b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json
new file mode 100644
index ..c121e526442a
--- /dev/null
+++ b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json
@@ -0,0 +1,19 @@
+[
+{
+"MetricExpr": "(hv_24x7@PM_MCS01_128B_RD_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS01_128B_RD_DISP_PORT23\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_RD_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_RD_DISP_PORT23\\,chip\\=?@)",
+"MetricName": "Memory_RD_BW_Chip",
+"MetricGroup": "Memory_BW",
+"ScaleUnit": "1.6e-2MB"
+},
+{
+   "MetricExpr": "(hv_24x7@PM_MCS01_128B_WR_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS01_128B_WR_DISP_PORT23\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_WR_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_WR_DISP_PORT23\\,chip\\=?@ )",
+"MetricName": "Memory_WR_BW_Chip",
+"MetricGroup": "Memory_BW",
+"ScaleUnit": "1.6e-2MB"
+},
+{
+   "MetricExpr": "(hv_24x7@PM_PB_CYC\\,chip\\=?@ )",
+"MetricName": "PowerBUS_Frequency",
+"ScaleUnit": "2.5e-7GHz"
+}
+]
-- 
2.18.1



[PATCH v7 5/6] tools/perf: Enable Hz/hz prinitg for --metric-only option

2020-03-27 Thread Kajol Jain
Commit 54b5091606c18 ("perf stat: Implement --metric-only mode")
added function 'valid_only_metric()' which drops "Hz" or "hz",
if it is part of "ScaleUnit". This patch enable it since hv_24x7
supports couple of frequency events.

Signed-off-by: Kajol Jain 
---
 tools/perf/util/stat-display.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 16efdba1973a..ecdebfcdd379 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -237,8 +237,6 @@ static bool valid_only_metric(const char *unit)
if (!unit)
return false;
if (strstr(unit, "/sec") ||
-   strstr(unit, "hz") ||
-   strstr(unit, "Hz") ||
strstr(unit, "CPUs utilized"))
return false;
return true;
-- 
2.18.1



[PATCH v7 4/6] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-03-27 Thread Kajol Jain
Patch enhances current metric infrastructure to handle "?" in the metric
expression. The "?" can be use for parameters whose value not known while
creating metric events and which can be replace later at runtime to
the proper value. It also add flexibility to create multiple events out
of single metric event added in json file.

Patch adds function 'arch_get_runtimeparam' which is a arch specific
function, returns the count of metric events need to be created.
By default it return 1.

This infrastructure needed for hv_24x7 socket/chip level events.
"hv_24x7" chip level events needs specific chip-id to which the
data is requested. Function 'arch_get_runtimeparam' implemented
in header.c which extract number of sockets from sysfs file
"sockets" under "/sys/devices/hv_24x7/interface/".

With this patch basically we are trying to create as many metric events
as define by runtime_param.

For that one loop is added in function 'metricgroup__add_metric',
which create multiple events at run time depend on return value of
'arch_get_runtimeparam' and merge that event in 'group_list'.

To achieve that we are actually passing this parameter value as part of
`expr__find_other` function and changing "?" present in metric expression
with this value.

As in our json file, there gonna be single metric event, and out of
which we are creating multiple events.

To understand which data count belongs to which parameter value,
we also printing param value in generic_metric function.

For example,
command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
 1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 
GHz  PowerBUS_Frequency_0
 1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 
GHz  PowerBUS_Frequency_1
 2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 
GHz  PowerBUS_Frequency_0
 2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 
GHz  PowerBUS_Frequency_1

So, here _0 and _1 after PowerBUS_Frequency specify parameter value.

Signed-off-by: Kajol Jain 
---
 tools/perf/arch/powerpc/util/header.c |  8 
 tools/perf/tests/expr.c   |  8 
 tools/perf/util/expr.c| 11 ++-
 tools/perf/util/expr.h|  5 +++--
 tools/perf/util/expr.l| 27 +++---
 tools/perf/util/metricgroup.c | 28 ---
 tools/perf/util/metricgroup.h |  2 ++
 tools/perf/util/stat-shadow.c | 17 ++--
 8 files changed, 79 insertions(+), 27 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index 3b4cdfc5efd6..d4870074f14c 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -7,6 +7,8 @@
 #include 
 #include 
 #include "header.h"
+#include "metricgroup.h"
+#include 
 
 #define mfspr(rn)   ({unsigned long rval; \
 asm volatile("mfspr %0," __stringify(rn) \
@@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 
return bufp;
 }
+
+int arch_get_runtimeparam(void)
+{
+   int count;
+   return sysfs__read_int("/devices/hv_24x7/interface/sockets", ) < 
0 ? 1 : count;
+}
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index ea10fc4412c4..516504cf0ea5 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -10,7 +10,7 @@ static int test(struct expr_parse_ctx *ctx, const char *e, 
double val2)
 {
double val;
 
-   if (expr__parse(, ctx, e))
+   if (expr__parse(, ctx, e, 1))
TEST_ASSERT_VAL("parse test failed", 0);
TEST_ASSERT_VAL("unexpected value", val == val2);
return 0;
@@ -44,15 +44,15 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
return ret;
 
p = "FOO/0";
-   ret = expr__parse(, , p);
+   ret = expr__parse(, , p, 1);
TEST_ASSERT_VAL("division by zero", ret == -1);
 
p = "BAR/";
-   ret = expr__parse(, , p);
+   ret = expr__parse(, , p, 1);
TEST_ASSERT_VAL("missing operand", ret == -1);
 
TEST_ASSERT_VAL("find other",
-   expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", 
, _other) == 0);
+   expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", 
, _other, 1) == 0);
TEST_ASSERT_VAL("find other", num_other == 3);
TEST_ASSERT_VAL("find other", !strcmp(other[0], "BAR"));
TEST_ASSERT_VAL("find other", !strcmp(other[1], "BAZ"));
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index c3382d58cf40..9d56e5fb3873 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -27,10 +27,11 @@ void expr__ctx_init(struct expr_parse_ctx *ctx)
 
 static int
 __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
- int start)
+ int start, int param)
 {
struct 

[PATCH v7 3/6] perf/tools: Refactoring metricgroup__add_metric function

2020-03-27 Thread Kajol Jain
This patch refactor metricgroup__add_metric function where
some part of it move to function metricgroup__add_metric_param.
No logic change.

Signed-off-by: Kajol Jain 
---
 tools/perf/util/metricgroup.c | 61 +--
 1 file changed, 36 insertions(+), 25 deletions(-)

diff --git a/tools/perf/util/metricgroup.c b/tools/perf/util/metricgroup.c
index 926449a7cdbf..b905eaa907a7 100644
--- a/tools/perf/util/metricgroup.c
+++ b/tools/perf/util/metricgroup.c
@@ -485,6 +485,40 @@ static bool metricgroup__has_constraint(struct pmu_event 
*pe)
return false;
 }
 
+static int __metricgroup__add_metric(struct strbuf *events,
+   struct list_head *group_list, struct pmu_event *pe)
+{
+
+   const char **ids;
+   int idnum;
+   struct egroup *eg;
+   int ret = -EINVAL;
+
+   if (expr__find_other(pe->metric_expr, NULL, , ) < 0)
+   return ret;
+
+   if (events->len > 0)
+   strbuf_addf(events, ",");
+
+   if (metricgroup__has_constraint(pe))
+   metricgroup__add_metric_non_group(events, ids, idnum);
+   else
+   metricgroup__add_metric_weak_group(events, ids, idnum);
+
+   eg = malloc(sizeof(*eg));
+   if (!eg)
+   return -ENOMEM;
+
+   eg->ids = ids;
+   eg->idnum = idnum;
+   eg->metric_name = pe->metric_name;
+   eg->metric_expr = pe->metric_expr;
+   eg->metric_unit = pe->unit;
+   list_add_tail(>nd, group_list);
+
+   return 0;
+}
+
 static int metricgroup__add_metric(const char *metric, struct strbuf *events,
   struct list_head *group_list)
 {
@@ -504,35 +538,12 @@ static int metricgroup__add_metric(const char *metric, 
struct strbuf *events,
continue;
if (match_metric(pe->metric_group, metric) ||
match_metric(pe->metric_name, metric)) {
-   const char **ids;
-   int idnum;
-   struct egroup *eg;
 
pr_debug("metric expr %s for %s\n", pe->metric_expr, 
pe->metric_name);
 
-   if (expr__find_other(pe->metric_expr,
-NULL, , ) < 0)
+   ret = __metricgroup__add_metric(events, group_list, pe);
+   if (ret == -EINVAL || !ret)
continue;
-   if (events->len > 0)
-   strbuf_addf(events, ",");
-
-   if (metricgroup__has_constraint(pe))
-   metricgroup__add_metric_non_group(events, ids, 
idnum);
-   else
-   metricgroup__add_metric_weak_group(events, ids, 
idnum);
-
-   eg = malloc(sizeof(struct egroup));
-   if (!eg) {
-   ret = -ENOMEM;
-   break;
-   }
-   eg->ids = ids;
-   eg->idnum = idnum;
-   eg->metric_name = pe->metric_name;
-   eg->metric_expr = pe->metric_expr;
-   eg->metric_unit = pe->unit;
-   list_add_tail(>nd, group_list);
-   ret = 0;
}
}
return ret;
-- 
2.18.1



[PATCH v7 2/6] perf expr: Add expr_scanner_ctx object

2020-03-27 Thread Kajol Jain
From: Jiri Olsa 

Adding expr_scanner_ctx object to hold user data
for the expr scanner. Currently it holds only
start_token, Kajol Jain will use it to hold 24x7
runtime param.

Signed-off-by: Jiri Olsa 
---
 tools/perf/util/expr.c |  6 --
 tools/perf/util/expr.h |  4 
 tools/perf/util/expr.l | 10 +-
 3 files changed, 13 insertions(+), 7 deletions(-)

diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index c8ccc548a585..c3382d58cf40 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -3,7 +3,6 @@
 #include 
 #include "expr.h"
 #include "expr-bison.h"
-#define YY_EXTRA_TYPE int
 #include "expr-flex.h"
 
 #ifdef PARSER_DEBUG
@@ -30,11 +29,14 @@ static int
 __expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
  int start)
 {
+   struct expr_scanner_ctx scanner_ctx = {
+   .start_token = start,
+   };
YY_BUFFER_STATE buffer;
void *scanner;
int ret;
 
-   ret = expr_lex_init_extra(start, );
+   ret = expr_lex_init_extra(_ctx, );
if (ret)
return ret;
 
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index b9e53f2b5844..0938ad166ece 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -15,6 +15,10 @@ struct expr_parse_ctx {
struct expr_parse_id ids[MAX_PARSE_ID];
 };
 
+struct expr_scanner_ctx {
+   int start_token;
+};
+
 void expr__ctx_init(struct expr_parse_ctx *ctx);
 void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val);
 int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char 
*expr);
diff --git a/tools/perf/util/expr.l b/tools/perf/util/expr.l
index eaad29243c23..2582c2464938 100644
--- a/tools/perf/util/expr.l
+++ b/tools/perf/util/expr.l
@@ -76,13 +76,13 @@ sym [0-9a-zA-Z_\.:@]+
 symbol {spec}*{sym}*{spec}*{sym}*
 
 %%
-   {
-   int start_token;
+   struct expr_scanner_ctx *sctx = expr_get_extra(yyscanner);
 
-   start_token = expr_get_extra(yyscanner);
+   {
+   int start_token = sctx->start_token;
 
-   if (start_token) {
-   expr_set_extra(NULL, yyscanner);
+   if (sctx->start_token) {
+   sctx->start_token = 0;
return start_token;
}
}
-- 
2.18.1



[PATCH v7 1/6] perf expr: Add expr_ prefix for parse_ctx and parse_id

2020-03-27 Thread Kajol Jain
From: Jiri Olsa 

Adding expr_ prefix for parse_ctx and parse_id,
to straighten out the expr* namespace.

There's no functional change.

Signed-off-by: Jiri Olsa 
---
 tools/perf/tests/expr.c   |  4 ++--
 tools/perf/util/expr.c| 10 +-
 tools/perf/util/expr.h| 12 ++--
 tools/perf/util/expr.y|  6 +++---
 tools/perf/util/stat-shadow.c |  2 +-
 5 files changed, 17 insertions(+), 17 deletions(-)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 28313e59d6f6..ea10fc4412c4 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -6,7 +6,7 @@
 #include 
 #include 
 
-static int test(struct parse_ctx *ctx, const char *e, double val2)
+static int test(struct expr_parse_ctx *ctx, const char *e, double val2)
 {
double val;
 
@@ -22,7 +22,7 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
const char **other;
double val;
int i, ret;
-   struct parse_ctx ctx;
+   struct expr_parse_ctx ctx;
int num_other;
 
expr__ctx_init();
diff --git a/tools/perf/util/expr.c b/tools/perf/util/expr.c
index fd192ddf93c1..c8ccc548a585 100644
--- a/tools/perf/util/expr.c
+++ b/tools/perf/util/expr.c
@@ -11,7 +11,7 @@ extern int expr_debug;
 #endif
 
 /* Caller must make sure id is allocated */
-void expr__add_id(struct parse_ctx *ctx, const char *name, double val)
+void expr__add_id(struct expr_parse_ctx *ctx, const char *name, double val)
 {
int idx;
 
@@ -21,13 +21,13 @@ void expr__add_id(struct parse_ctx *ctx, const char *name, 
double val)
ctx->ids[idx].val = val;
 }
 
-void expr__ctx_init(struct parse_ctx *ctx)
+void expr__ctx_init(struct expr_parse_ctx *ctx)
 {
ctx->num_ids = 0;
 }
 
 static int
-__expr__parse(double *val, struct parse_ctx *ctx, const char *expr,
+__expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
  int start)
 {
YY_BUFFER_STATE buffer;
@@ -52,7 +52,7 @@ __expr__parse(double *val, struct parse_ctx *ctx, const char 
*expr,
return ret;
 }
 
-int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr)
+int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char 
*expr)
 {
return __expr__parse(final_val, ctx, expr, EXPR_PARSE) ? -1 : 0;
 }
@@ -75,7 +75,7 @@ int expr__find_other(const char *expr, const char *one, const 
char ***other,
 int *num_other)
 {
int err, i = 0, j = 0;
-   struct parse_ctx ctx;
+   struct expr_parse_ctx ctx;
 
expr__ctx_init();
err = __expr__parse(NULL, , expr, EXPR_OTHER);
diff --git a/tools/perf/util/expr.h b/tools/perf/util/expr.h
index 9377538f4097..b9e53f2b5844 100644
--- a/tools/perf/util/expr.h
+++ b/tools/perf/util/expr.h
@@ -5,19 +5,19 @@
 #define EXPR_MAX_OTHER 20
 #define MAX_PARSE_ID EXPR_MAX_OTHER
 
-struct parse_id {
+struct expr_parse_id {
const char *name;
double val;
 };
 
-struct parse_ctx {
+struct expr_parse_ctx {
int num_ids;
-   struct parse_id ids[MAX_PARSE_ID];
+   struct expr_parse_id ids[MAX_PARSE_ID];
 };
 
-void expr__ctx_init(struct parse_ctx *ctx);
-void expr__add_id(struct parse_ctx *ctx, const char *id, double val);
-int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr);
+void expr__ctx_init(struct expr_parse_ctx *ctx);
+void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val);
+int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char 
*expr);
 int expr__find_other(const char *expr, const char *one, const char ***other,
int *num_other);
 
diff --git a/tools/perf/util/expr.y b/tools/perf/util/expr.y
index 4720cbe79357..cd17486c1c5d 100644
--- a/tools/perf/util/expr.y
+++ b/tools/perf/util/expr.y
@@ -15,7 +15,7 @@
 %define api.pure full
 
 %parse-param { double *final_val }
-%parse-param { struct parse_ctx *ctx }
+%parse-param { struct expr_parse_ctx *ctx }
 %parse-param {void *scanner}
 %lex-param {void* scanner}
 
@@ -39,14 +39,14 @@
 
 %{
 static void expr_error(double *final_val __maybe_unused,
-  struct parse_ctx *ctx __maybe_unused,
+  struct expr_parse_ctx *ctx __maybe_unused,
   void *scanner,
   const char *s)
 {
pr_debug("%s\n", s);
 }
 
-static int lookup_id(struct parse_ctx *ctx, char *id, double *val)
+static int lookup_id(struct expr_parse_ctx *ctx, char *id, double *val)
 {
int i;
 
diff --git a/tools/perf/util/stat-shadow.c b/tools/perf/util/stat-shadow.c
index 0fd713d3674f..402af3e8d287 100644
--- a/tools/perf/util/stat-shadow.c
+++ b/tools/perf/util/stat-shadow.c
@@ -729,7 +729,7 @@ static void generic_metric(struct perf_stat_config *config,
   struct runtime_stat *st)
 {
print_metric_t print_metric = out->print_metric;
-   struct parse_ctx pctx;
+   struct expr_parse_ctx pctx;
double 

[PATCH v7 0/6] powerpc/perf: Add json file metric support for the hv_24x7 socket/chip level events

2020-03-27 Thread Kajol Jain
Patchset adds json file metric support for the hv_24x7 socket/chip level
events. "hv_24x7" pmu interface events needs system dependent parameter
like socket/chip/core. For example, hv_24x7 chip level events needs
specific chip-id to which the data is requested should be added as part
of pmu events.

So to enable JSON file support to "hv_24x7" interface, patchset reads
total number of sockets details in sysfs under 
"/sys/devices/hv_24x7/interface/".

Second patch of the patchset adds expr_scanner_ctx object to hold user
data for the expr scanner, which can be used to hold runtime parameter.

Patch 4 & 6 of the patchset handles perf tool plumbing needed to replace
the "?" character in the metric expression to proper value and hv_24x7
json metric file for different Socket/chip resources.

Patch set also enable Hz/hz prinitg for --metric-only option to print
metric data for bus frequency.

Applied and tested all these patches cleanly on top of jiri's flex changes
with the changes done by Kan Liang for "Support metric group constraint"
patchset and made required changes.

Also apply this patch on top of the fix patch send earlier
for printing metric name incase overlapping events.
https://git.kernel.org/pub/scm/linux/kernel/git/acme/linux.git/commit/?h=perf/core=37cd7f65bf71a48f25eeb6d9be5dacb20d008ea6

Changelog:
v6 -> v7
- Spit patchset into two patch series one for kernel changes and other
  for tool side changes.
- Made changes Suggested by Jiri, including rather then reading runtime
  parameter from metric name, actually add it in structure egroup and
  metric_expr.
- As we don't need to read runtime parameter from metric name,
  now I am not appending it and rather just printing it in
  generic_metric function.

Kernel Side changes patch series: https://lkml.org/lkml/2020/3/27/58

v5 -> v6
- resolve compilation issue due to rearranging patch series.
- Rather then adding new function to take careof case for runtime param
  in metricgroup__add_metric, using metricgroup__add_metric_param itself
  for that work.
- Address some optimization suggested like using directly file path
  rather then adding new macro in header.c
- Change commit message on patch where we are adding "?" support
  by adding simple example.

v4 -> v5
- Using sysfs__read_int instead of sysfs__read_ull while reading
  parameter value in powerpc/util/header.c file.

- Using asprintf rather then malloc and sprintf 
  Suggested by Arnaldo Carvalho de Melo

- Break patch 6 from previous version to two patch,
  - One to add refactor current "metricgroup__add_metric" function
and another where actually "?" handling infra added.

- Add expr__runtimeparam as part of 'expr_scanner_ctx' struct
  rather then making it global variable. Thanks Jiri for
  adding this structure to hold user data for the expr scanner.

- Add runtime param as agrugement to function 'expr__find_other'
  and 'expr__parse' and made changes on references accordingly.

v3 -> v4
- Apply these patch on top of Kan liang changes.
  As suggested by Jiri.

v2 -> v3
- Remove setting  event_count to 0 part in function 'h_24x7_event_read'
  with comment rather then adding 0 to event_count value.
  Suggested by: Sukadev Bhattiprolu

- Apply tool side changes require to replace "?" on Jiri's flex patch
  series and made all require changes to make it compatible with added
  flex change.

v1 -> v2
- Rename hv-24x7 metric json file as nest_metrics.json

Jiri Olsa (2):
  perf expr: Add expr_ prefix for parse_ctx and parse_id
  perf expr: Add expr_scanner_ctx object

Kajol Jain (4):
  perf/tools: Refactoring metricgroup__add_metric function
  perf/tools: Enhance JSON/metric infrastructure to handle "?"
  tools/perf: Enable Hz/hz prinitg for --metric-only option
  perf/tools/pmu-events/powerpc: Add hv_24x7 socket/chip level metric
events

 tools/perf/arch/powerpc/util/header.c |  8 ++
 .../arch/powerpc/power9/nest_metrics.json | 19 +
 tools/perf/tests/expr.c   | 12 +--
 tools/perf/util/expr.c| 25 +++---
 tools/perf/util/expr.h| 19 +++--
 tools/perf/util/expr.l| 37 ++---
 tools/perf/util/expr.y|  6 +-
 tools/perf/util/metricgroup.c | 79 +--
 tools/perf/util/metricgroup.h |  2 +
 tools/perf/util/stat-display.c|  2 -
 tools/perf/util/stat-shadow.c | 19 +++--
 11 files changed, 157 insertions(+), 71 deletions(-)
 create mode 100644 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json

-- 
2.18.1



Re: hardcoded SIGSEGV in __die() ?

2020-03-27 Thread Joakim Tjernlund
On Thu, 2020-03-26 at 11:28 +1100, Michael Ellerman wrote:
> CAUTION: This email originated from outside of the organization. Do not click 
> links or open attachments unless you recognize the sender and know the 
> content is safe.
> 
> 
> Joakim Tjernlund  writes:
> > On Mon, 2020-03-23 at 15:45 +0100, Christophe Leroy wrote:
> > > Le 23/03/2020 à 15:43, Christophe Leroy a écrit :
> > > > Le 23/03/2020 à 15:17, Joakim Tjernlund a écrit :
> > > > > In __die(), see below, there is this call to notify_send() with
> > > > > SIGSEGV hardcoded, this seems odd
> > > > > to me as the variable "err" holds the true signal(in my case SIGBUS)
> > > > > Should not SIGSEGV be replaced with the true signal no.?
> > > > 
> > > > As far as I can see, comes from
> > > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3D66fcb1059data=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=LBzRMxHWJzNEztnnG0UzJb7PHvaDGVswQD%2B8WpY9YX8%3Dreserved=0
> > > > 
> > > 
> > > And
> > > https://nam03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Ftorvalds%2Flinux.git%2Fcommit%2F%3Fid%3Dae87221d3ce49d9de1e43756da834fd0bf05a2addata=02%7C01%7CJoakim.Tjernlund%40infinera.com%7Caa316058f9e34dd758c808d7d11ca391%7C285643de5f5b4b03a1530ae2dc8aaf77%7C1%7C0%7C637207793252449714sdata=Dh%2BUTRgG85oVSgC3SCR1B7izQH4HofT4ppOMiy9xvDA%3Dreserved=0
> > > shows it is (was?) similar on x86.
> > > 
> > 
> > I tried to follow that chain thinking it would end up sending a signal to 
> > user space but I cannot see
> > that happens. Seems to be related to debugging.
> > 
> > In short, I cannot see any signal being delivered to user space. If so that 
> > would explain why
> > our user space process never dies.
> > Is there a signal hidden in machine_check handler for SIGBUS I cannot see?
> 
> It's platform specific. What platform are you on?

I am on e500, e5500(e500mc) and 83xx :)


> 
> See the ppc_md & cur_cpu_spec calls here:
> 
> void machine_check_exception(struct pt_regs *regs)
> {
> int recover = 0;
> bool nested = in_nmi();
> if (!nested)
> nmi_enter();
> 
> __this_cpu_inc(irq_stat.mce_exceptions);
> 
> add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
> 
> /* See if any machine dependent calls. In theory, we would want
>  * to call the CPU first, and call the ppc_md. one if the CPU
>  * one returns a positive number. However there is existing code
>  * that assumes the board gets a first chance, so let's keep it
>  * that way for now and fix things later. --BenH.
>  */
> if (ppc_md.machine_check_exception)
> recover = ppc_md.machine_check_exception(regs);
> else if (cur_cpu_spec->machine_check)
> recover = cur_cpu_spec->machine_check(regs);
> 
> if (recover > 0)
> goto bail;
> 
> 
> Either the ppc_md or cpu_spec handlers can send a signal, but after a
> bit of grepping I think only the pseries and powernv ones do.

Seems so

> 
> If you get into die() then it's an oops, which is not the same as a
> normal signal.

Exactly, and the die/OOPS does not seem work as intended either. The system 
tries to limp along
and generates more similar OOPses and may even hang.

> 
> cheers



Re: [PATCH v6 09/11] perf/tools: Enhance JSON/metric infrastructure to handle "?"

2020-03-27 Thread kajoljain



On 3/24/20 6:41 PM, Jiri Olsa wrote:
> On Fri, Mar 20, 2020 at 06:24:04PM +0530, Kajol Jain wrote:
>> Patch enhances current metric infrastructure to handle "?" in the metric
>> expression. The "?" can be use for parameters whose value not known while
>> creating metric events and which can be replace later at runtime to
>> the proper value. It also add flexibility to create multiple events out
>> of single metric event added in json file.
>>
>> Patch adds function 'arch_get_runtimeparam' which is a arch specific
>> function, returns the count of metric events need to be created.
>> By default it return 1.
>>
>> This infrastructure needed for hv_24x7 socket/chip level events.
>> "hv_24x7" chip level events needs specific chip-id to which the
>> data is requested. Function 'arch_get_runtimeparam' implemented
>> in header.c which extract number of sockets from sysfs file
>> "sockets" under "/sys/devices/hv_24x7/interface/".
>>
>>
>> With this patch basically we are trying to create as many metric events
>> as define by runtime_param.
>>
>> For that one loop is added in function 'metricgroup__add_metric',
>> which create multiple events at run time depend on return value of
>> 'arch_get_runtimeparam' and merge that event in 'group_list'.
>>
>> To achieve that we are actually passing this parameter value as part of
>> `expr__find_other` function and changing "?" present in metric expression
>> with this value.
>>
>> As in our json file, there gonna be single metric event, and out of
>> which we are creating multiple events, I am also merging this value
>> to the original metric name to specify parameter value.
>>
>> For example,
>> command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
>> #   time counts unit events
>>  1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  
>> 2.3 GHz  PowerBUS_Frequency_0
>>  1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  
>> 2.3 GHz  PowerBUS_Frequency_1
>>  2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  
>> 2.3 GHz  PowerBUS_Frequency_0
>>  2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  
>> 2.3 GHz  PowerBUS_Frequency_1
>>
>> So, here _0 and _1 after PowerBUS_Frequency specify parameter value.
>>
>> As after adding this to group_list, again we call expr__parse
>> in 'generic_metric' function present in util/stat-display.c.
>> By this time again we need to pass this parameter value. So, now to get this 
>> value
>> actually I am trying to extract it from metric name itself. Because
>> otherwise it gonna point to last updated value present in runtime_param.
>> And gonna match for that value only.
> 
> so why can't we pass that param as integer value through the metric objects?
> 
> it get's created in metricgroup__add_metric_param:
>   - as struct egroup *eg
>   - we can add egroup::param and store the param value there
> 
> then in metricgroup__setup_events it moves to:
>   - struct metric_expr *expr
>   - we can add metric_expr::param to keep the param
> 
> then in perf_stat__print_shadow_stats there's:
>   - struct metric_expr *mexp loop
>   - calling generic_metric metric - we could call it with mexp::param
>   - and pass the param to expr__parse
> 
Hi jiri,
Thanks for the suggestion, Yes it make more sense to use like that.
Will update.

Thanks,
Kajol
> jirka
> 


[PATCH] selftests/powerpc: Fix try-run when source tree is not writable

2020-03-27 Thread Michael Ellerman
We added a usage of try-run to pmu/ebb/Makefile to detect if the
toolchain supported the -no-pie option.

This fails if we build out-of-tree and the source tree is not
writable, as try-run tries to write its temporary files to the current
directory. That leads to the -no-pie option being silently dropped,
which leads to broken executables with some toolchains.

If we remove the redirect to /dev/null in try-run, we see the error:

  make[3]: Entering directory '/linux/tools/testing/selftests/powerpc/pmu/ebb'
  /usr/bin/ld: cannot open output file .54.tmp: Read-only file system
  collect2: error: ld returned 1 exit status
  make[3]: Nothing to be done for 'all'.

And looking with strace we see it's trying to use a file that's in the
source tree:

  lstat("/linux/tools/testing/selftests/powerpc/pmu/ebb/.54.tmp", 
0x7c0f83c8)

We can fix it by setting TMPOUT to point to the $(OUTPUT) directory,
and we can verify with strace it's now trying to write to the output
directory:

  lstat("/output/kselftest/powerpc/pmu/ebb/.54.tmp", 0x7fffd1bf6bf8)

And also see that the -no-pie option is now correctly detected.

Fixes: 0695f8bca93e ("selftests/powerpc: Handle Makefile for unrecognized 
option")
Cc: sta...@vger.kernel.org # v5.5+
Signed-off-by: Michael Ellerman 
---
 tools/testing/selftests/powerpc/pmu/ebb/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/tools/testing/selftests/powerpc/pmu/ebb/Makefile 
b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
index 417306353e07..ca35dd8848b0 100644
--- a/tools/testing/selftests/powerpc/pmu/ebb/Makefile
+++ b/tools/testing/selftests/powerpc/pmu/ebb/Makefile
@@ -7,6 +7,7 @@ include ../../../../../../scripts/Kbuild.include
 # The EBB handler is 64-bit code and everything links against it
 CFLAGS += -m64
 
+TMPOUT = $(OUTPUT)/
 # Toolchains may build PIE by default which breaks the assembly
 no-pie-option := $(call try-run, echo 'int main() { return 0; }' | \
 $(CC) -Werror $(KBUILD_CPPFLAGS) $(CC_OPTION_CFLAGS) -no-pie -x c - -o 
"$$TMP", -no-pie)

base-commit: 11a48a5a18c63fd7621bb050228cebf13566e4d8
prerequisite-patch-id: e460261e180666bfb63bcd0cc554874d73c3b71f
prerequisite-patch-id: 67db368cfdf8d3aefd78f140420281f9b4b53e07
prerequisite-patch-id: cf5d957a366998b4a3ce70a79b6e969eb98fca7d
prerequisite-patch-id: 3ace935c6ae425ad635eb38f906e790c3c9bf41f
-- 
2.25.1



Re: [PATCH v2] mm/sparse: Fix kernel crash with pfn_section_valid check

2020-03-27 Thread David Hildenbrand
On 26.03.20 14:32, Aneesh Kumar K.V wrote:
> Fixes the below crash
> 
> BUG: Kernel NULL pointer dereference on read at 0x
> Faulting instruction address: 0xc0c3447c
> Oops: Kernel access of bad area, sig: 11 [#1]
> LE PAGE_SIZE=64K MMU=Hash SMP NR_CPUS=2048 NUMA pSeries
> CPU: 11 PID: 7519 Comm: lt-ndctl Not tainted 5.6.0-rc7-autotest #1
> ...
> NIP [c0c3447c] vmemmap_populated+0x98/0xc0
> LR [c0088354] vmemmap_free+0x144/0x320
> Call Trace:
>  section_deactivate+0x220/0x240
>  __remove_pages+0x118/0x170
>  arch_remove_memory+0x3c/0x150
>  memunmap_pages+0x1cc/0x2f0
>  devm_action_release+0x30/0x50
>  release_nodes+0x2f8/0x3e0
>  device_release_driver_internal+0x168/0x270
>  unbind_store+0x130/0x170
>  drv_attr_store+0x44/0x60
>  sysfs_kf_write+0x68/0x80
>  kernfs_fop_write+0x100/0x290
>  __vfs_write+0x3c/0x70
>  vfs_write+0xcc/0x240
>  ksys_write+0x7c/0x140
>  system_call+0x5c/0x68
> 
> The crash is due to NULL dereference at
> 
> test_bit(idx, ms->usage->subsection_map); due to ms->usage = NULL; in 
> pfn_section_valid()
> 
> With commit: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in 
> SPARSEMEM|!VMEMMAP case")
> section_mem_map is set to NULL after depopulate_section_mem(). This
> was done so that pfn_page() can work correctly with kernel config that 
> disables
> SPARSEMEM_VMEMMAP. With that config pfn_to_page does
> 
>   __section_mem_map_addr(__sec) + __pfn;
> where
> 
> static inline struct page *__section_mem_map_addr(struct mem_section *section)
> {
>   unsigned long map = section->section_mem_map;
>   map &= SECTION_MAP_MASK;
>   return (struct page *)map;
> }
> 
> Now with SPASEMEM_VMEMAP enabled, mem_section->usage->subsection_map is used 
> to
> check the pfn validity (pfn_valid()). Since section_deactivate release
> mem_section->usage if a section is fully deactivated, pfn_valid() check after
> a subsection_deactivate cause a kernel crash.
> 
> static inline int pfn_valid(unsigned long pfn)
> {
> ...
>   return early_section(ms) || pfn_section_valid(ms, pfn);
> }
> 
> where
> 
> static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
> {
>   int idx = subsection_map_index(pfn);
> 
>   return test_bit(idx, ms->usage->subsection_map);
> }
> 
> Avoid this by clearing SECTION_HAS_MEM_MAP when mem_section->usage is freed.
> For architectures like ppc64 where large pages are used for vmmemap mapping 
> (16MB),
> a specific vmemmap mapping can cover multiple sections. Hence before a vmemmap
> mapping page can be freed, the kernel needs to make sure there are no valid 
> sections
> within that mapping. Clearing the section valid bit before
> depopulate_section_memap enables this.
> 
> Fixes: d41e2f3bd546 ("mm/hotplug: fix hot remove failure in 
> SPARSEMEM|!VMEMMAP case")
> Reported-by: Sachin Sant 
> Tested-by: Sachin Sant 
> Cc: Baoquan He 
> Cc: Michael Ellerman 
> Cc: Dan Williams 
> Cc: Pankaj Gupta 
> Cc: David Hildenbrand 
> Cc: Michal Hocko 
> Cc: Wei Yang 
> Cc: Oscar Salvador 
> Cc: Mike Rapoport 
> Cc: 
> Signed-off-by: Aneesh Kumar K.V 
> ---
>  mm/sparse.c | 6 ++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/mm/sparse.c b/mm/sparse.c
> index aadb7298dcef..65599e8bd636 100644
> --- a/mm/sparse.c
> +++ b/mm/sparse.c
> @@ -781,6 +781,12 @@ static void section_deactivate(unsigned long pfn, 
> unsigned long nr_pages,
>   ms->usage = NULL;
>   }
>   memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);
> + /*
> +  * Mark the section invalid so that valid_section()
> +  * return false. This prevents code from dereferencing

s/return/returns/

> +  * ms->usage array.

maybe "(see pfn_section_valid())"

> +  */
> + ms->section_mem_map &= ~SECTION_HAS_MEM_MAP;


-- 
Thanks,

David / dhildenb



Re: [PATCH 1/2] powerpc/vmlinux.lds: Explicitly retain .gnu.hash

2020-03-27 Thread Michael Ellerman
Alan Modra  writes:
> On Thu, Feb 27, 2020 at 03:59:32PM +1100, Michael Ellerman wrote:
>> Relocatable kernel builds produce a warning about .gnu.hash being an
>> orphan section:
>> 
>>   ld: warning: orphan section `.gnu.hash' from `linker stubs' being placed 
>> in section `.gnu.hash'
>> 
>> If we try to discard it the build fails:
>> 
>>   ld -EL -m elf64lppc -pie --orphan-handling=warn --build-id -o
>> .tmp_vmlinux1 -T ./arch/powerpc/kernel/vmlinux.lds --whole-archive
>> arch/powerpc/kernel/head_64.o arch/powerpc/kernel/entry_64.o
>> ...
>> sound/built-in.a net/built-in.a virt/built-in.a --no-whole-archive
>> --start-group lib/lib.a --end-group
>>   ld: could not find section .gnu.hash
>> 
>> So add an entry to explicitly retain it, as we do for .hash.
>
> Looks fine to me.  You can also pass --hash-style=sysv to ld (since
> binutils-2.18) to disable generation of .gnu.hash.

Our current minimum is 2.21, so that's probably 5-10 years away :)

cheers


Re: [PATCH 2/2] powerpc/vmlinux.lds: Discard .interp section

2020-03-27 Thread Michael Ellerman
Alan Modra  writes:
> On Thu, Feb 27, 2020 at 03:59:33PM +1100, Michael Ellerman wrote:
>> The .interp section specifies which "interpreter", ie. dynamic loader,
>> the kernel requests. But that doesn't make any sense, the kernel is
>> not a regular binary that is run with an interpreter.
>> 
>> The content seems to be some default value, this file doesn't even
>> exist on my system:
>>     2f 75 73 72 2f 6c 69 62  2f 6c 64 2e 73 6f 2e 31  
>> |/usr/lib/ld.so.1|
>> 
>> So the section serves no useful purpose and consumes a small amount of
>> space.
>> 
>> Also Alan Modra says we "likely could discard" it, so do so.
>
> Yes, but you ought to check with the mimimum required binutils.  It is
> quite possible that an older linker will blow up.

OK, I guess I'll have to test.

> If the minimum required binutils is at least binutils-2.26 then
> passing --no-dynamic-linker to ld is a more elegant solution.

The current minimum is 2.21, though there's talk of increasing it to
2.23.

cheers


Re: [PATCH v2] powerpc/kprobes: Blacklist functions running with MMU disabled on PPC32

2020-03-27 Thread Naveen N. Rao

Christophe Leroy wrote:

kprobe does not handle events happening in real mode, all
functions running with MMU disabled have to be blacklisted.

As already done for PPC64, do it for PPC32.

Signed-off-by: Christophe Leroy 
---
v2:
- Don't rename nonrecoverable as local, mark it noprobe instead.
- Add missing linux/kprobes.h include in pq2.c
---
 arch/powerpc/include/asm/ppc_asm.h   | 10 +++
 arch/powerpc/kernel/cpu_setup_6xx.S  |  4 +-
 arch/powerpc/kernel/entry_32.S   | 65 
 arch/powerpc/kernel/fpu.S|  1 +
 arch/powerpc/kernel/idle_6xx.S   |  2 +-
 arch/powerpc/kernel/idle_e500.S  |  2 +-
 arch/powerpc/kernel/l2cr_6xx.S   |  2 +-
 arch/powerpc/kernel/misc.S   |  2 +
 arch/powerpc/kernel/misc_32.S|  4 +-
 arch/powerpc/kernel/swsusp_32.S  |  6 +-
 arch/powerpc/kernel/vector.S |  1 +
 arch/powerpc/mm/book3s32/hash_low.S  | 38 ++--
 arch/powerpc/mm/mem.c|  2 +
 arch/powerpc/platforms/52xx/lite5200_sleep.S |  2 +
 arch/powerpc/platforms/82xx/pq2.c|  3 +
 arch/powerpc/platforms/83xx/suspend-asm.S|  1 +
 arch/powerpc/platforms/powermac/cache.S  |  2 +
 arch/powerpc/platforms/powermac/sleep.S  | 13 ++--
 18 files changed, 86 insertions(+), 74 deletions(-)

diff --git a/arch/powerpc/include/asm/ppc_asm.h 
b/arch/powerpc/include/asm/ppc_asm.h
index 6b03dff61a05..e8f34ba89497 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -267,8 +267,18 @@ GLUE(.,name):
.pushsection "_kprobe_blacklist","aw";  \
PPC_LONG (entry) ;  \
.popsection
+#define _NOKPROBE_ENTRY(entry) \
+   _ASM_NOKPROBE_SYMBOL(entry) \
+   _ENTRY(entry)
+#define _NOKPROBE_GLOBAL(entry)\
+   _ASM_NOKPROBE_SYMBOL(entry) \
+   _GLOBAL(entry)
 #else
 #define _ASM_NOKPROBE_SYMBOL(entry)
+#define _NOKPROBE_ENTRY(entry) \
+   _ENTRY(entry)
+#define _NOKPROBE_GLOBAL(entry)\
+   _GLOBAL(entry)
 #endif


Michael hasn't preferred including NOKPROBE variants of those macros 
previously, since he would like to see some cleanups there:

https://patchwork.ozlabs.org/patch/696138/



 #define FUNC_START(name)   _GLOBAL(name)
diff --git a/arch/powerpc/kernel/cpu_setup_6xx.S 
b/arch/powerpc/kernel/cpu_setup_6xx.S
index f6517f67265a..1cb947268546 100644
--- a/arch/powerpc/kernel/cpu_setup_6xx.S
+++ b/arch/powerpc/kernel/cpu_setup_6xx.S
@@ -276,7 +276,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_NO_DPM)
  * in some 750 cpus where using a not yet initialized FPU register after
  * power on reset may hang the CPU
  */
-_GLOBAL(__init_fpu_registers)
+_NOKPROBE_GLOBAL(__init_fpu_registers)
mfmsr   r10
ori r11,r10,MSR_FP
mtmsr   r11
@@ -381,7 +381,7 @@ _GLOBAL(__save_cpu_setup)
  * restore CPU state as backed up by the previous
  * function. This does not include cache setting
  */
-_GLOBAL(__restore_cpu_setup)
+_NOKPROBE_GLOBAL(__restore_cpu_setup)
/* Some CR fields are volatile, we back it up all */
mfcrr7

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 16af0d8d90a8..f788d586254d 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -44,24 +44,21 @@
.align  12

 #ifdef CONFIG_BOOKE
-   .globl  mcheck_transfer_to_handler
-mcheck_transfer_to_handler:
+_NOKPROBE_ENTRY(mcheck_transfer_to_handler)
mfspr   r0,SPRN_DSRR0
stw r0,_DSRR0(r11)
mfspr   r0,SPRN_DSRR1
stw r0,_DSRR1(r11)
/* fall through */

-   .globl  debug_transfer_to_handler
-debug_transfer_to_handler:
+_NOKPROBE_ENTRY(debug_transfer_to_handler)
mfspr   r0,SPRN_CSRR0
stw r0,_CSRR0(r11)
mfspr   r0,SPRN_CSRR1
stw r0,_CSRR1(r11)
/* fall through */

-   .globl  crit_transfer_to_handler
-crit_transfer_to_handler:
+_NOKPROBE_ENTRY(crit_transfer_to_handler)
 #ifdef CONFIG_PPC_BOOK3E_MMU
mfspr   r0,SPRN_MAS0
stw r0,MAS0(r11)
@@ -97,8 +94,7 @@ crit_transfer_to_handler:
 #endif

 #ifdef CONFIG_40x
-   .globl  crit_transfer_to_handler
-crit_transfer_to_handler:
+_NOKPROBE_ENTRY(crit_transfer_to_handler)
lwz r0,crit_r10@l(0)
stw r0,GPR10(r11)
lwz r0,crit_r11@l(0)
@@ -124,13 +120,11 @@ crit_transfer_to_handler:
  * Note that we rely on the caller having set cr0.eq iff the exception
  * occurred in kernel mode (i.e. MSR:PR = 0).
  */
-   .globl  transfer_to_handler_full
-transfer_to_handler_full:
+_NOKPROBE_ENTRY(transfer_to_handler_full)
SAVE_NVGPRS(r11)
/* fall through */

-   .globl  transfer_to_handler

Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S

2020-03-27 Thread Balamuruhan S
On Fri, 2020-03-27 at 07:48 +0100, Christophe Leroy wrote:
> 
> Le 26/03/2020 à 07:15, Balamuruhan S a écrit :
> > Data Cache Block Invalidate (dcbi) instruction was implemented back in
> > PowerPC
> > architecture version 2.03. It is obsolete and attempt to use of this
> > illegal
> > instruction results in a hypervisor emulation assistance interrupt. So,
> > ifdef
> > it out the option `i` in xmon for 64bit Book3S.
> 
> I don't understand. You say two contradictory things:
> 1/ You say it _was_ added back.
> 2/ You say it _is_ obsolete.
> 
> How can it be obsolete if it was added back ?

I actually learnt it from P8 and P9 User Manual,

The POWER8/POWER9 core does not provide support for the following optional or
obsolete instructions (attempted use of these results in a hypervisor emulation
assistance interrupt):
• tlbia - TLB invalidate all
• tlbiex - TLB invalidate entry by index (obsolete)
• slbiex - SLB invalidate entry by index (obsolete)
• dcba - Data cache block allocate (Book II; obsolete)
• dcbi - Data cache block invalidate (obsolete)
• rfi - Return from interrupt (32-bit; obsolete)

> 
> [...]
> 
> > diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
> > index 0ec9640335bb..bfd5a97689cd 100644
> > --- a/arch/powerpc/xmon/xmon.c
> > +++ b/arch/powerpc/xmon/xmon.c
> > @@ -335,10 +335,12 @@ static inline void cflush(void *p)
> > asm volatile ("dcbf 0,%0; icbi 0,%0" : : "r" (p));
> >   }
> >   
> > +#ifndef CONFIG_PPC_BOOK3S_64
> 
> You don't need that #ifndef. Keeping it should be harmless.

okay.

> 
> >   static inline void cinval(void *p)
> >   {
> > asm volatile ("dcbi 0,%0; icbi 0,%0" : : "r" (p));
> >   }
> > +#endif
> >   
> >   /**
> >* write_ciabr() - write the CIABR SPR
> > @@ -1791,8 +1793,9 @@ static void prregs(struct pt_regs *fp)
> >   
> >   static void cacheflush(void)
> >   {
> > -   int cmd;
> > unsigned long nflush;
> > +#ifndef CONFIG_PPC_BOOK3S_64
> 
> Don't make it so complex, see below
> 
> > +   int cmd;
> >   
> > cmd = inchar();
> > if (cmd != 'i')
> > @@ -1800,13 +1803,14 @@ static void cacheflush(void)
> > scanhex((void *));
> > if (termch != '\n')
> > termch = 0;
> > +#endif
> > nflush = 1;
> > scanhex();
> > nflush = (nflush + L1_CACHE_BYTES - 1) / L1_CACHE_BYTES;
> > if (setjmp(bus_error_jmp) == 0) {
> > catch_memory_errors = 1;
> > sync();
> > -
> > +#ifndef CONFIG_PPC_BOOK3S_64
> 
> You don't need that ifndef, just ensure below that regardless of cmd, 
> book3s/64 calls cflush and not cinval.
> 
> > if (cmd != 'i') {
> 
> The only thing you have to do is to replace the above test by:
> 
>   if (cmd != 'i' || IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {

yes, this is the better way.

> 
> > for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
> > cflush((void *) adrs);
> > @@ -1814,6 +1818,10 @@ static void cacheflush(void)
> > for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
> > cinval((void *) adrs);
> > }
> > +#else
> 
> Don't need that at all, it's a duplication of the above.

sure :+1:

Thanks for reviewing.

-- Bala

> 
> > +   for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
> > +   cflush((void *)adrs);
> > +#endif
> > sync();
> > /* wait a little while to see if we get a machine check */
> > __delay(200);
> > 
> > base-commit: a87b93bdf800a4d7a42d95683624a4516e516b4f
> > 
> 
> Christophe



Re: [PATCH 3/4] powerpc/uaccess: evaluate macro arguments once, before user access is allowed

2020-03-27 Thread Christophe Leroy




Le 27/03/2020 à 08:02, Nicholas Piggin a écrit :

get/put_user can be called with nontrivial arguments. fs/proc/page.c
has a good example:

 if (put_user(stable_page_flags(ppage), out)) {

stable_page_flags is quite a lot of code, including spin locks in the
page allocator.

Ensure these arguments are evaluated before user access is allowed.
This improves security by reducing code with access to userspace, but
it also fixes a PREEMPT bug with KUAP on powerpc/64s:
stable_page_flags is currently called with AMR set to allow writes,
it ends up calling spin_unlock(), which can call preempt_schedule. But
the task switch code can not be called with AMR set (it relies on
interrupts saving the register), so this blows up.

It's fine if the code inside allow_user_access is preemptible, because
a timer or IPI will save the AMR, but it's not okay to explicitly
cause a reschedule.

Signed-off-by: Nicholas Piggin 
---
  arch/powerpc/include/asm/uaccess.h | 97 ++
  1 file changed, 59 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 670910df3cc7..1cf8595aeef1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -162,36 +162,48 @@ do {  
\
prevent_write_to_user(ptr, size);   \
  } while (0)
  
-#define __put_user_nocheck(x, ptr, size, do_allow)			\

+#define __put_user_nocheck(x, ptr, size, do_allow) \


No need to touch this line. Anyway at the end, you still have several \ 
which are not aligned.



  ({\
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
if (!is_kernel_addr((unsigned long)__pu_addr))  \
might_fault();  \
-   __chk_user_ptr(ptr);\
-   if (do_allow)   
\


No need to touch that line


-   __put_user_size((x), __pu_addr, (size), __pu_err);  
\
-   else
\


No need to touch that line


-   __put_user_size_allowed((x), __pu_addr, (size), __pu_err);  
\
+   __chk_user_ptr(__pu_addr);  \
+   if (do_allow)   \
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   else\
+   __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, 
__pu_err); \
+   \
__pu_err;   \
  })
  
-#define __put_user_check(x, ptr, size)	\

-({ \
-   long __pu_err = -EFAULT;\
-   __typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
-   might_fault();  \
-   if (access_ok(__pu_addr, size)) \
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
-   __pu_err;   \


Same comment applies, you are touching some lines just to change the \, 
but at the end you still have some misaligned ones.


It would help the review not to touch unchanged lines just for that.

Same comment applies a few places below as well.


+#define __put_user_check(x, ptr, size) \
+({ \
+   long __pu_err = -EFAULT;\
+   __typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
+   might_fault();  \
+   if (access_ok(__pu_addr, __pu_size))\
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   \
+   __pu_err;   \
  })
  
  #define __put_user_nosleep(x, ptr, size)			\

  ({\
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);

Re: [PATCH 1/4] powerpc/64s: implement probe_kernel_read/write without touching AMR

2020-03-27 Thread Christophe Leroy




Le 27/03/2020 à 08:02, Nicholas Piggin a écrit :

There is no need to allow user accesses when probing kernel addresses.

Signed-off-by: Nicholas Piggin 
---
  arch/powerpc/include/asm/uaccess.h | 25 ++-
  arch/powerpc/lib/Makefile  |  2 +-
  arch/powerpc/lib/uaccess.c | 50 ++
  3 files changed, 68 insertions(+), 9 deletions(-)
  create mode 100644 arch/powerpc/lib/uaccess.c



[...]


diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index b8de3be10eb4..a15060b5008e 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -36,7 +36,7 @@ extra-$(CONFIG_PPC64) += crtsavres.o
  endif
  
  obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \

-  memcpy_power7.o
+  memcpy_power7.o uaccess.o


Why only book3s/64 ? It applies to the 8xx and book3s/32 as well, I 
think it should just be for all powerpc.


  
  obj64-y	+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \

   memcpy_64.o memcpy_mcsafe_64.o
diff --git a/arch/powerpc/lib/uaccess.c b/arch/powerpc/lib/uaccess.c
new file mode 100644
index ..0057ab52d6fe
--- /dev/null
+++ b/arch/powerpc/lib/uaccess.c
@@ -0,0 +1,50 @@
+#include 
+#include 
+
+static __always_inline long
+probe_read_common(void *dst, const void __user *src, size_t size)
+{
+   long ret;
+
+   pagefault_disable();
+   ret = raw_copy_from_user_allowed(dst, src, size);
+   pagefault_enable();
+
+   return ret ? -EFAULT : 0;
+}
+
+static __always_inline long
+probe_write_common(void __user *dst, const void *src, size_t size)
+{
+   long ret;
+
+   pagefault_disable();
+   ret = raw_copy_to_user_allowed(dst, src, size);
+   pagefault_enable();
+
+   return ret ? -EFAULT : 0;
+}
+
+long probe_kernel_read(void *dst, const void *src, size_t size)
+{
+   long ret;
+   mm_segment_t old_fs = get_fs();
+
+   set_fs(KERNEL_DS);
+   ret = probe_read_common(dst, (__force const void __user *)src, size);


I think you should squash probe_read_common() here, having it separated 
is a lot of lines for no added value. It also may make people believe it 
overwrites the generic probe_read_common()




+   set_fs(old_fs);
+
+   return ret;
+}
+
+long probe_kernel_write(void *dst, const void *src, size_t size)
+{
+   long ret;
+   mm_segment_t old_fs = get_fs();
+
+   set_fs(KERNEL_DS);
+   ret = probe_write_common((__force void __user *)dst, src, size);


Same comment as for probe_read_common()


+   set_fs(old_fs);
+
+   return ret;
+}



Christophe


[PATCH 3/4] powerpc/uaccess: evaluate macro arguments once, before user access is allowed

2020-03-27 Thread Nicholas Piggin
get/put_user can be called with nontrivial arguments. fs/proc/page.c
has a good example:

if (put_user(stable_page_flags(ppage), out)) {

stable_page_flags is quite a lot of code, including spin locks in the
page allocator.

Ensure these arguments are evaluated before user access is allowed.
This improves security by reducing code with access to userspace, but
it also fixes a PREEMPT bug with KUAP on powerpc/64s:
stable_page_flags is currently called with AMR set to allow writes,
it ends up calling spin_unlock(), which can call preempt_schedule. But
the task switch code can not be called with AMR set (it relies on
interrupts saving the register), so this blows up.

It's fine if the code inside allow_user_access is preemptible, because
a timer or IPI will save the AMR, but it's not okay to explicitly
cause a reschedule.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/uaccess.h | 97 ++
 1 file changed, 59 insertions(+), 38 deletions(-)

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 670910df3cc7..1cf8595aeef1 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -162,36 +162,48 @@ do {  
\
prevent_write_to_user(ptr, size);   \
 } while (0)
 
-#define __put_user_nocheck(x, ptr, size, do_allow) \
+#define __put_user_nocheck(x, ptr, size, do_allow) \
 ({ \
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
if (!is_kernel_addr((unsigned long)__pu_addr))  \
might_fault();  \
-   __chk_user_ptr(ptr);\
-   if (do_allow)   
\
-   __put_user_size((x), __pu_addr, (size), __pu_err);  
\
-   else
\
-   __put_user_size_allowed((x), __pu_addr, (size), __pu_err);  
\
+   __chk_user_ptr(__pu_addr);  \
+   if (do_allow)   \
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   else\
+   __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, 
__pu_err); \
+   \
__pu_err;   \
 })
 
-#define __put_user_check(x, ptr, size) \
-({ \
-   long __pu_err = -EFAULT;\
-   __typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
-   might_fault();  \
-   if (access_ok(__pu_addr, size)) \
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
-   __pu_err;   \
+#define __put_user_check(x, ptr, size) \
+({ \
+   long __pu_err = -EFAULT;\
+   __typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
+   might_fault();  \
+   if (access_ok(__pu_addr, __pu_size))\
+   __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+   \
+   __pu_err;   \
 })
 
 #define __put_user_nosleep(x, ptr, size)   \
 ({ \
long __pu_err;  \
__typeof__(*(ptr)) __user *__pu_addr = (ptr);   \
-   __chk_user_ptr(ptr);\
-   __put_user_size((x), __pu_addr, (size), __pu_err);  \
+   __typeof__(*(ptr)) __pu_val = (x);  \
+   __typeof__(size) __pu_size = (size);\
+   \
+   __chk_user_ptr(__pu_addr);  \
+   

[PATCH 2/4] powerpc/64s: use mmu_has_feature in set_kuap() and get_kuap()

2020-03-27 Thread Nicholas Piggin
Commit 8150a153c013 ("powerpc/64s: Use early_mmu_has_feature() in
set_kuap()"), had to switch to using the _early feature test, because
probe_kernel_read was being called very early. After the previous
patch, probe_kernel_read no longer touches kuap, so it can go back to
using the non-_early variant, for better performance.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/book3s/64/kup-radix.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/kup-radix.h 
b/arch/powerpc/include/asm/book3s/64/kup-radix.h
index 3bcef989a35d..67a7fd0182e6 100644
--- a/arch/powerpc/include/asm/book3s/64/kup-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/kup-radix.h
@@ -79,7 +79,7 @@ static inline void kuap_check_amr(void)
 
 static inline unsigned long get_kuap(void)
 {
-   if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (!mmu_has_feature(MMU_FTR_RADIX_KUAP))
return 0;
 
return mfspr(SPRN_AMR);
@@ -87,7 +87,7 @@ static inline unsigned long get_kuap(void)
 
 static inline void set_kuap(unsigned long value)
 {
-   if (!early_mmu_has_feature(MMU_FTR_RADIX_KUAP))
+   if (!mmu_has_feature(MMU_FTR_RADIX_KUAP))
return;
 
/*
-- 
2.23.0



[PATCH 1/4] powerpc/64s: implement probe_kernel_read/write without touching AMR

2020-03-27 Thread Nicholas Piggin
There is no need to allow user accesses when probing kernel addresses.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/uaccess.h | 25 ++-
 arch/powerpc/lib/Makefile  |  2 +-
 arch/powerpc/lib/uaccess.c | 50 ++
 3 files changed, 68 insertions(+), 9 deletions(-)
 create mode 100644 arch/powerpc/lib/uaccess.c

diff --git a/arch/powerpc/include/asm/uaccess.h 
b/arch/powerpc/include/asm/uaccess.h
index 2f500debae21..670910df3cc7 100644
--- a/arch/powerpc/include/asm/uaccess.h
+++ b/arch/powerpc/include/asm/uaccess.h
@@ -341,8 +341,8 @@ raw_copy_in_user(void __user *to, const void __user *from, 
unsigned long n)
 }
 #endif /* __powerpc64__ */
 
-static inline unsigned long raw_copy_from_user(void *to,
-   const void __user *from, unsigned long n)
+static inline unsigned long
+raw_copy_from_user_allowed(void *to, const void __user *from, unsigned long n)
 {
unsigned long ret;
if (__builtin_constant_p(n) && (n <= 8)) {
@@ -351,19 +351,19 @@ static inline unsigned long raw_copy_from_user(void *to,
switch (n) {
case 1:
barrier_nospec();
-   __get_user_size(*(u8 *)to, from, 1, ret);
+   __get_user_size_allowed(*(u8 *)to, from, 1, ret);
break;
case 2:
barrier_nospec();
-   __get_user_size(*(u16 *)to, from, 2, ret);
+   __get_user_size_allowed(*(u16 *)to, from, 2, ret);
break;
case 4:
barrier_nospec();
-   __get_user_size(*(u32 *)to, from, 4, ret);
+   __get_user_size_allowed(*(u32 *)to, from, 4, ret);
break;
case 8:
barrier_nospec();
-   __get_user_size(*(u64 *)to, from, 8, ret);
+   __get_user_size_allowed(*(u64 *)to, from, 8, ret);
break;
}
if (ret == 0)
@@ -371,9 +371,18 @@ static inline unsigned long raw_copy_from_user(void *to,
}
 
barrier_nospec();
-   allow_read_from_user(from, n);
ret = __copy_tofrom_user((__force void __user *)to, from, n);
-   prevent_read_from_user(from, n);
+   return ret;
+}
+
+static inline unsigned long
+raw_copy_from_user(void *to, const void __user *from, unsigned long n)
+{
+   unsigned long ret;
+
+   allow_read_from_user(to, n);
+   ret = raw_copy_from_user_allowed(to, from, n);
+   prevent_read_from_user(to, n);
return ret;
 }
 
diff --git a/arch/powerpc/lib/Makefile b/arch/powerpc/lib/Makefile
index b8de3be10eb4..a15060b5008e 100644
--- a/arch/powerpc/lib/Makefile
+++ b/arch/powerpc/lib/Makefile
@@ -36,7 +36,7 @@ extra-$(CONFIG_PPC64) += crtsavres.o
 endif
 
 obj-$(CONFIG_PPC_BOOK3S_64) += copyuser_power7.o copypage_power7.o \
-  memcpy_power7.o
+  memcpy_power7.o uaccess.o
 
 obj64-y+= copypage_64.o copyuser_64.o mem_64.o hweight_64.o \
   memcpy_64.o memcpy_mcsafe_64.o
diff --git a/arch/powerpc/lib/uaccess.c b/arch/powerpc/lib/uaccess.c
new file mode 100644
index ..0057ab52d6fe
--- /dev/null
+++ b/arch/powerpc/lib/uaccess.c
@@ -0,0 +1,50 @@
+#include 
+#include 
+
+static __always_inline long
+probe_read_common(void *dst, const void __user *src, size_t size)
+{
+   long ret;
+
+   pagefault_disable();
+   ret = raw_copy_from_user_allowed(dst, src, size);
+   pagefault_enable();
+
+   return ret ? -EFAULT : 0;
+}
+
+static __always_inline long
+probe_write_common(void __user *dst, const void *src, size_t size)
+{
+   long ret;
+
+   pagefault_disable();
+   ret = raw_copy_to_user_allowed(dst, src, size);
+   pagefault_enable();
+
+   return ret ? -EFAULT : 0;
+}
+
+long probe_kernel_read(void *dst, const void *src, size_t size)
+{
+   long ret;
+   mm_segment_t old_fs = get_fs();
+
+   set_fs(KERNEL_DS);
+   ret = probe_read_common(dst, (__force const void __user *)src, size);
+   set_fs(old_fs);
+
+   return ret;
+}
+
+long probe_kernel_write(void *dst, const void *src, size_t size)
+{
+   long ret;
+   mm_segment_t old_fs = get_fs();
+
+   set_fs(KERNEL_DS);
+   ret = probe_write_common((__force void __user *)dst, src, size);
+   set_fs(old_fs);
+
+   return ret;
+}
-- 
2.23.0



Re: [PATCH V2 0/3] mm/debug: Add more arch page table helper tests

2020-03-27 Thread Christophe Leroy




On 03/27/2020 06:46 AM, Anshuman Khandual wrote:


On 03/26/2020 08:53 PM, Christophe Leroy wrote:



Le 26/03/2020 à 03:23, Anshuman Khandual a écrit :



On 03/24/2020 10:52 AM, Anshuman Khandual wrote:

This series adds more arch page table helper tests. The new tests here are
either related to core memory functions and advanced arch pgtable helpers.
This also creates a documentation file enlisting all expected semantics as
suggested by Mike Rapoport (https://lkml.org/lkml/2020/1/30/40).

This series has been tested on arm64 and x86 platforms.


If folks can test these patches out on remaining ARCH_HAS_DEBUG_VM_PGTABLE
enabled platforms i.e s390, arc, powerpc (32 and 64), that will be really
appreciated. Thank you.



On powerpc 8xx (PPC32), I get:

[   53.338368] debug_vm_pgtable: debug_vm_pgtable: Validating architecture page 
table helpers
[   53.347403] [ cut here ]
[   53.351832] WARNING: CPU: 0 PID: 1 at mm/debug_vm_pgtable.c:647 
debug_vm_pgtable+0x280/0x3f4


mm/debug_vm_pgtable.c:647 ?

With the following commits in place

53a8338ce (HEAD) Documentation/mm: Add descriptions for arch page table helper
5d4913fc1 mm/debug: Add tests validating arch advanced page table helpers
bcaf120a7 mm/debug: Add tests validating arch page table helpers for core 
features
d6ed5a4a5 x86/memory: Drop pud_mknotpresent()
0739d1f8d mm/debug: Add tests validating architecture page table helpers
16fbf79b0 (tag: v5.6-rc7) Linux 5.6-rc7


I have:

facaa5eb5909 (HEAD -> helpers0) mm/debug: Add tests validating arch 
advanced page table helpers
6389fed515fc mm/debug: Add tests validating arch page table helpers for 
core features

dc14ecc8b94e mm/debug: add tests validating architecture page table helpers
c6624071c338 (origin/merge, merge) Automatic merge of branches 'master', 
'next' and 'fixes' into merge
58e05c5508e6 Automatic merge of branches 'master', 'next' and 'fixes' 
into merge
1b649e0bcae7 (origin/master, origin/HEAD) Merge 
git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net


origin is https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git

I can't see your last patch in powerpc mailing list 
(https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=166237)




mm/debug_vm_pgtable.c:647 is here.


Line 647 is:

WARN_ON(!pte_same(pte, __swp_entry_to_pte(swp)));




#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot)
{
 swp_entry_t swp;
 pmd_t pmd;  -> Line #647

 pmd = pfn_pmd(pfn, prot);
 swp = __pmd_to_swp_entry(pmd);
 WARN_ON(!pmd_same(pmd, __swp_entry_to_pmd(swp)));
}
#else
static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { }
#end

Did I miss something ?



[...]


Could you please point me to the exact test which is failing ?


[   53.519778] Freeing unused kernel memory: 608K



So I assume that the system should have come till runtime just fine apart from
the above warning message because.



Yes it boots fine otherwise.

Christophe


Re: [PATCH 1/1] ppc/crash: Skip spinlocks during crash

2020-03-27 Thread Christophe Leroy




Le 26/03/2020 à 23:28, Leonardo Bras a écrit :

During a crash, there is chance that the cpus that handle the NMI IPI
are holding a spin_lock. If this spin_lock is needed by crashing_cpu it
will cause a deadlock. (rtas_lock and printk logbuf_log as of today)

This is a problem if the system has kdump set up, given if it crashes
for any reason kdump may not be saved for crash analysis.

Skip spinlocks after NMI IPI is sent to all other cpus.

Signed-off-by: Leonardo Bras 
---
  arch/powerpc/include/asm/spinlock.h | 6 ++
  arch/powerpc/kexec/crash.c  | 3 +++
  2 files changed, 9 insertions(+)

diff --git a/arch/powerpc/include/asm/spinlock.h 
b/arch/powerpc/include/asm/spinlock.h
index 860228e917dc..a6381d110795 100644
--- a/arch/powerpc/include/asm/spinlock.h
+++ b/arch/powerpc/include/asm/spinlock.h
@@ -111,6 +111,8 @@ static inline void splpar_spin_yield(arch_spinlock_t *lock) 
{};
  static inline void splpar_rw_yield(arch_rwlock_t *lock) {};
  #endif
  
+extern bool crash_skip_spinlock __read_mostly;

+
  static inline bool is_shared_processor(void)
  {
  #ifdef CONFIG_PPC_SPLPAR
@@ -142,6 +144,8 @@ static inline void arch_spin_lock(arch_spinlock_t *lock)
if (likely(__arch_spin_trylock(lock) == 0))
break;
do {
+   if (unlikely(crash_skip_spinlock))
+   return;


You are adding a test that reads a global var in the middle of a so hot 
path ? That must kill performance. Can we do different ?


Christophe


Re: [PATCH v2] powerpc xmon: use `dcbf` inplace of `dcbi` instruction for 64bit Book3S

2020-03-27 Thread Christophe Leroy




Le 26/03/2020 à 07:15, Balamuruhan S a écrit :

Data Cache Block Invalidate (dcbi) instruction was implemented back in PowerPC
architecture version 2.03. It is obsolete and attempt to use of this illegal
instruction results in a hypervisor emulation assistance interrupt. So, ifdef
it out the option `i` in xmon for 64bit Book3S.


I don't understand. You say two contradictory things:
1/ You say it _was_ added back.
2/ You say it _is_ obsolete.

How can it be obsolete if it was added back ?

[...]


diff --git a/arch/powerpc/xmon/xmon.c b/arch/powerpc/xmon/xmon.c
index 0ec9640335bb..bfd5a97689cd 100644
--- a/arch/powerpc/xmon/xmon.c
+++ b/arch/powerpc/xmon/xmon.c
@@ -335,10 +335,12 @@ static inline void cflush(void *p)
asm volatile ("dcbf 0,%0; icbi 0,%0" : : "r" (p));
  }
  
+#ifndef CONFIG_PPC_BOOK3S_64


You don't need that #ifndef. Keeping it should be harmless.


  static inline void cinval(void *p)
  {
asm volatile ("dcbi 0,%0; icbi 0,%0" : : "r" (p));
  }
+#endif
  
  /**

   * write_ciabr() - write the CIABR SPR
@@ -1791,8 +1793,9 @@ static void prregs(struct pt_regs *fp)
  
  static void cacheflush(void)

  {
-   int cmd;
unsigned long nflush;
+#ifndef CONFIG_PPC_BOOK3S_64


Don't make it so complex, see below


+   int cmd;
  
  	cmd = inchar();

if (cmd != 'i')
@@ -1800,13 +1803,14 @@ static void cacheflush(void)
scanhex((void *));
if (termch != '\n')
termch = 0;
+#endif
nflush = 1;
scanhex();
nflush = (nflush + L1_CACHE_BYTES - 1) / L1_CACHE_BYTES;
if (setjmp(bus_error_jmp) == 0) {
catch_memory_errors = 1;
sync();
-
+#ifndef CONFIG_PPC_BOOK3S_64


You don't need that ifndef, just ensure below that regardless of cmd, 
book3s/64 calls cflush and not cinval.



if (cmd != 'i') {


The only thing you have to do is to replace the above test by:

if (cmd != 'i' || IS_ENABLED(CONFIG_PPC_BOOK3S_64)) {


for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
cflush((void *) adrs);
@@ -1814,6 +1818,10 @@ static void cacheflush(void)
for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
cinval((void *) adrs);
}
+#else


Don't need that at all, it's a duplication of the above.


+   for (; nflush > 0; --nflush, adrs += L1_CACHE_BYTES)
+   cflush((void *)adrs);
+#endif
sync();
/* wait a little while to see if we get a machine check */
__delay(200);

base-commit: a87b93bdf800a4d7a42d95683624a4516e516b4f



Christophe


Re: [PATCH V2 0/3] mm/debug: Add more arch page table helper tests

2020-03-27 Thread Anshuman Khandual


On 03/26/2020 08:53 PM, Christophe Leroy wrote:
> 
> 
> Le 26/03/2020 à 03:23, Anshuman Khandual a écrit :
>>
>>
>> On 03/24/2020 10:52 AM, Anshuman Khandual wrote:
>>> This series adds more arch page table helper tests. The new tests here are
>>> either related to core memory functions and advanced arch pgtable helpers.
>>> This also creates a documentation file enlisting all expected semantics as
>>> suggested by Mike Rapoport (https://lkml.org/lkml/2020/1/30/40).
>>>
>>> This series has been tested on arm64 and x86 platforms.
>>
>> If folks can test these patches out on remaining ARCH_HAS_DEBUG_VM_PGTABLE
>> enabled platforms i.e s390, arc, powerpc (32 and 64), that will be really
>> appreciated. Thank you.
>>
> 
> On powerpc 8xx (PPC32), I get:
> 
> [   53.338368] debug_vm_pgtable: debug_vm_pgtable: Validating architecture 
> page table helpers
> [   53.347403] [ cut here ]
> [   53.351832] WARNING: CPU: 0 PID: 1 at mm/debug_vm_pgtable.c:647 
> debug_vm_pgtable+0x280/0x3f4

mm/debug_vm_pgtable.c:647 ?

With the following commits in place

53a8338ce (HEAD) Documentation/mm: Add descriptions for arch page table helper
5d4913fc1 mm/debug: Add tests validating arch advanced page table helpers
bcaf120a7 mm/debug: Add tests validating arch page table helpers for core 
features
d6ed5a4a5 x86/memory: Drop pud_mknotpresent()
0739d1f8d mm/debug: Add tests validating architecture page table helpers
16fbf79b0 (tag: v5.6-rc7) Linux 5.6-rc7

mm/debug_vm_pgtable.c:647 is here.

#ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot)
{
swp_entry_t swp;
pmd_t pmd;  -> Line #647

pmd = pfn_pmd(pfn, prot);
swp = __pmd_to_swp_entry(pmd);
WARN_ON(!pmd_same(pmd, __swp_entry_to_pmd(swp)));
}
#else
static void __init pmd_swap_tests(unsigned long pfn, pgprot_t prot) { }
#end

Did I miss something ?

> [   53.360140] CPU: 0 PID: 1 Comm: swapper Not tainted 
> 5.6.0-rc7-s3k-dev-01090-g92710e99881f #3544
> [   53.368718] NIP:  c0777c04 LR: c0777bb8 CTR: 
> [   53.373720] REGS: c9023df0 TRAP: 0700   Not tainted 
> (5.6.0-rc7-s3k-dev-01090-g92710e99881f)
> [   53.382042] MSR:  00029032   CR: 22000222  XER: 2000
> [   53.388667]
> [   53.388667] GPR00: c0777bb8 c9023ea8 c612 0001 1e41  
>  007641c9
> [   53.388667] GPR08:  0001   82000222  
> c00039b8 
> [   53.388667] GPR16:    fff0 065fc000 1e41 
> c660 01e4
> [   53.388667] GPR24: 01d9 c062d14c c65fc000 c642d448 06c9  
> c65f8000 c65fc040
> [   53.423400] NIP [c0777c04] debug_vm_pgtable+0x280/0x3f4
> [   53.428559] LR [c0777bb8] debug_vm_pgtable+0x234/0x3f4
> [   53.433593] Call Trace:
> [   53.436048] [c9023ea8] [c0777bb8] debug_vm_pgtable+0x234/0x3f4 (unreliable)
> [   53.442936] [c9023f28] [c00039e0] kernel_init+0x28/0x124
> [   53.448184] [c9023f38] [c000f174] ret_from_kernel_thread+0x14/0x1c
> [   53.454245] Instruction dump:
> [   53.457180] 41a20008 4bea3ed9 62890021 7d36b92e 7d36b82e 71290fd0 3149 
> 7d2a4910
> [   53.464838] 0f09 5789077e 3149 7d2a4910 <0f09> 38c0 
> 38a0 3880
> [   53.472671] ---[ end trace fd5dd92744dc0065 ]---
Could you please point me to the exact test which is failing ?

> [   53.519778] Freeing unused kernel memory: 608K
> 
> 
So I assume that the system should have come till runtime just fine apart from
the above warning message because.


[PATCH v7 5/5] powerpc/hv-24x7: Update post_mobility_fixup() to handle migration

2020-03-27 Thread Kajol Jain
Function 'read_sys_info_pseries()' is added to get system parameter
values like number of sockets and chips per socket.
and it gets these details via rtas_call with token
"PROCESSOR_MODULE_INFO".

Incase lpar migrate from one system to another, system
parameter details like chips per sockets or number of sockets might
change. So, it needs to be re-initialized otherwise, these values
corresponds to previous system values.
This patch adds a call to 'read_sys_info_pseries()' from
'post-mobility_fixup()' to re-init the physsockets and physchips values.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/platforms/pseries/mobility.c | 12 
 1 file changed, 12 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/mobility.c 
b/arch/powerpc/platforms/pseries/mobility.c
index b571285f6c14..226accd6218b 100644
--- a/arch/powerpc/platforms/pseries/mobility.c
+++ b/arch/powerpc/platforms/pseries/mobility.c
@@ -371,6 +371,18 @@ void post_mobility_fixup(void)
/* Possibly switch to a new RFI flush type */
pseries_setup_rfi_flush();
 
+   /*
+* Incase lpar migrate from one system to another, system
+* parameter details like chips per sockets and number of sockets
+* might change. So, it needs to be re-initialized otherwise these
+* values corresponds to previous system.
+* Here, adding a call to read_sys_info_pseries() declared in
+* platforms/pseries/pseries.h to re-init the physsockets and
+* physchips value.
+*/
+   if (IS_ENABLED(CONFIG_HV_PERF_CTRS) && IS_ENABLED(CONFIG_PPC_RTAS))
+   read_sys_info_pseries();
+
return;
 }
 
-- 
2.18.1



[PATCH v7 4/5] Documentation/ABI: Add ABI documentation for chips and sockets

2020-03-27 Thread Kajol Jain
Add documentation for the following sysfs files:
/sys/devices/hv_24x7/interface/chips,
/sys/devices/hv_24x7/interface/sockets

Signed-off-by: Kajol Jain 
---
 .../testing/sysfs-bus-event_source-devices-hv_24x7 | 14 ++
 1 file changed, 14 insertions(+)

diff --git a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7 
b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
index ec27c6c9e737..e17e5b444a1c 100644
--- a/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
+++ b/Documentation/ABI/testing/sysfs-bus-event_source-devices-hv_24x7
@@ -22,6 +22,20 @@ Description:
Exposes the "version" field of the 24x7 catalog. This is also
extractable from the provided binary "catalog" sysfs entry.
 
+What:  /sys/devices/hv_24x7/interface/sockets
+Date:  March 2020
+Contact:   Linux on PowerPC Developer List 
+Description:   read only
+   This sysfs interface exposes the number of sockets present in 
the
+   system.
+
+What:  /sys/devices/hv_24x7/interface/chips
+Date:  March 2020
+Contact:   Linux on PowerPC Developer List 
+Description:   read only
+   This sysfs interface exposes the number of chips per socket
+   present in the system.
+
 What:  /sys/bus/event_source/devices/hv_24x7/event_descs/
 Date:  February 2014
 Contact:   Linux on PowerPC Developer List 
-- 
2.18.1



[PATCH v7 3/5] powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show processor details

2020-03-27 Thread Kajol Jain
To expose the system dependent parameter like total number of
sockets and numbers of chips per socket, patch adds two sysfs files.
"sockets" and "chips" are added to /sys/devices/hv_24x7/interface/
of the "hv_24x7" pmu.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-24x7.c | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 9ae00f29bd21..a31bd5b88f7a 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -454,6 +454,20 @@ static ssize_t device_show_string(struct device *dev,
return sprintf(buf, "%s\n", (char *)d->var);
 }
 
+#ifdef CONFIG_PPC_RTAS
+static ssize_t sockets_show(struct device *dev,
+   struct device_attribute *attr, char *buf)
+{
+   return sprintf(buf, "%d\n", physsockets);
+}
+
+static ssize_t chips_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+   return sprintf(buf, "%d\n", physchips);
+}
+#endif
+
 static struct attribute *device_str_attr_create_(char *name, char *str)
 {
struct dev_ext_attribute *attr = kzalloc(sizeof(*attr), GFP_KERNEL);
@@ -1100,6 +1114,10 @@ PAGE_0_ATTR(catalog_len, "%lld\n",
(unsigned long long)be32_to_cpu(page_0->length) * 4096);
 static BIN_ATTR_RO(catalog, 0/* real length varies */);
 static DEVICE_ATTR_RO(domains);
+#ifdef CONFIG_PPC_RTAS
+static DEVICE_ATTR_RO(sockets);
+static DEVICE_ATTR_RO(chips);
+#endif
 
 static struct bin_attribute *if_bin_attrs[] = {
_attr_catalog,
@@ -1110,6 +1128,10 @@ static struct attribute *if_attrs[] = {
_attr_catalog_len.attr,
_attr_catalog_version.attr,
_attr_domains.attr,
+#ifdef CONFIG_PPC_RTAS
+   _attr_sockets.attr,
+   _attr_chips.attr,
+#endif
NULL,
 };
 
-- 
2.18.1



[PATCH v7 2/5] powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor details

2020-03-27 Thread Kajol Jain
For hv_24x7 socket/chip level events, specific chip-id to which
the data requested should be added as part of pmu events.
But number of chips/socket in the system details are not exposed.

Patch implements read_sys_info_pseries() to get system
parameter values like number of sockets and chips per socket.
Rtas_call with token "PROCESSOR_MODULE_INFO"
is used to get these values.

Sub-sequent patch exports these values via sysfs.

Patch also make these parameters default to 1.

Signed-off-by: Kajol Jain 
---
 arch/powerpc/perf/hv-24x7.c  | 72 
 arch/powerpc/platforms/pseries/pseries.h |  3 +
 2 files changed, 75 insertions(+)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 48e8f4b17b91..9ae00f29bd21 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -20,6 +20,11 @@
 #include 
 #include 
 
+#ifdef CONFIG_PPC_RTAS
+#include 
+#include <../../platforms/pseries/pseries.h>
+#endif
+
 #include "hv-24x7.h"
 #include "hv-24x7-catalog.h"
 #include "hv-common.h"
@@ -57,6 +62,69 @@ static bool is_physical_domain(unsigned domain)
}
 }
 
+#ifdef CONFIG_PPC_RTAS
+#define PROCESSOR_MODULE_INFO   43
+#define PROCESSOR_MAX_LENGTH   (8 * 1024)
+
+static int strbe16toh(const char *buf, int offset)
+{
+   return (buf[offset] << 8) + buf[offset + 1];
+}
+
+static u32 physsockets;/* Physical sockets */
+static u32 physchips;  /* Physical chips */
+
+/*
+ * Function read_sys_info_pseries() make a rtas_call which require
+ * data buffer of size 8K. As standard 'rtas_data_buf' is of size
+ * 4K, we are adding new local buffer 'rtas_local_data_buf'.
+ */
+char rtas_local_data_buf[PROCESSOR_MAX_LENGTH] __cacheline_aligned;
+
+/*
+ * read_sys_info_pseries()
+ * Retrieve the number of sockets and chips per socket details
+ * through the get-system-parameter rtas call.
+ */
+void read_sys_info_pseries(void)
+{
+   int call_status, len, ntypes;
+
+   /*
+* Making system parameter: chips and sockets default to 1.
+*/
+   physsockets = 1;
+   physchips = 1;
+   memset(rtas_local_data_buf, 0, PROCESSOR_MAX_LENGTH);
+   spin_lock(_data_buf_lock);
+
+   call_status = rtas_call(rtas_token("ibm,get-system-parameter"), 3, 1,
+   NULL,
+   PROCESSOR_MODULE_INFO,
+   __pa(rtas_local_data_buf),
+   PROCESSOR_MAX_LENGTH);
+
+   spin_unlock(_data_buf_lock);
+
+   if (call_status != 0) {
+   pr_info("%s %s Error calling get-system-parameter (0x%x)\n",
+   __FILE__, __func__, call_status);
+   } else {
+   rtas_local_data_buf[PROCESSOR_MAX_LENGTH - 1] = '\0';
+   len = strbe16toh(rtas_local_data_buf, 0);
+   if (len < 6)
+   return;
+
+   ntypes = strbe16toh(rtas_local_data_buf, 2);
+
+   if (!ntypes)
+   return;
+   physsockets = strbe16toh(rtas_local_data_buf, 4);
+   physchips = strbe16toh(rtas_local_data_buf, 6);
+   }
+}
+#endif /* CONFIG_PPC_RTAS */
+
 /* Domains for which more than one result element are returned for each event. 
*/
 static bool domain_needs_aggregation(unsigned int domain)
 {
@@ -1605,6 +1673,10 @@ static int hv_24x7_init(void)
if (r)
return r;
 
+#ifdef CONFIG_PPC_RTAS
+   read_sys_info_pseries();
+#endif
+
return 0;
 }
 
diff --git a/arch/powerpc/platforms/pseries/pseries.h 
b/arch/powerpc/platforms/pseries/pseries.h
index 13fa370a87e4..1727559ce304 100644
--- a/arch/powerpc/platforms/pseries/pseries.h
+++ b/arch/powerpc/platforms/pseries/pseries.h
@@ -19,6 +19,9 @@ extern void request_event_sources_irqs(struct device_node *np,
 struct pt_regs;
 
 extern int pSeries_system_reset_exception(struct pt_regs *regs);
+#ifdef CONFIG_PPC_RTAS
+extern void read_sys_info_pseries(void);
+#endif
 extern int pSeries_machine_check_exception(struct pt_regs *regs);
 extern long pseries_machine_check_realmode(struct pt_regs *regs);
 
-- 
2.18.1



[PATCH v7 1/5] powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple hv-24x7 events run

2020-03-27 Thread Kajol Jain
Commit 2b206ee6b0df ("powerpc/perf/hv-24x7: Display change in counter
values")' added to print _change_ in the counter value rather then raw
value for 24x7 counters. Incase of transactions, the event count
is set to 0 at the beginning of the transaction. It also sets
the event's prev_count to the raw value at the time of initialization.
Because of setting event count to 0, we are seeing some weird behaviour,
whenever we run multiple 24x7 events at a time.

For example:

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000121704120 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000121704  5 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000357733  8 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000357733 10 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495215 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000641884 56 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000641884 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 5.000791887 18,446,744,073,709,551,616 
hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Getting these large values in case we do -I.

As we are setting event_count to 0, for interval case, overall event_count is 
not
coming in incremental order. As we may can get new delta lesser then previous 
count.
Because of which when we print intervals, we are getting negative value which 
create
these large values.

This patch removes part where we set event_count to 0 in function
'h_24x7_event_read'. There won't be much impact as we do set 
event->hw.prev_count
to the raw value at the time of initialization to print change value.

With this patch
In power9 platform

command#: ./perf stat -e "{hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/,
   hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/}"
   -C 0 -I 1000 sleep 100

 1.000117685 93 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 1.000117685  1 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 2.000349331 98 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 2.000349331  2 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 3.000495900131 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 3.000495900  4 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.000645920204 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/
 4.000645920 61 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=1/
 4.284169997 22 hv_24x7/PM_MCS01_128B_RD_DISP_PORT01,chip=0/

Signed-off-by: Kajol Jain 
Suggested-by: Sukadev Bhattiprolu 
---
 arch/powerpc/perf/hv-24x7.c | 10 --
 1 file changed, 10 deletions(-)

diff --git a/arch/powerpc/perf/hv-24x7.c b/arch/powerpc/perf/hv-24x7.c
index 573e0b309c0c..48e8f4b17b91 100644
--- a/arch/powerpc/perf/hv-24x7.c
+++ b/arch/powerpc/perf/hv-24x7.c
@@ -1400,16 +1400,6 @@ static void h_24x7_event_read(struct perf_event *event)
h24x7hw = _cpu_var(hv_24x7_hw);
h24x7hw->events[i] = event;
put_cpu_var(h24x7hw);
-   /*
-* Clear the event count so we can compute the _change_
-* in the 24x7 raw counter value at the end of the txn.
-*
-* Note that we could alternatively read the 24x7 value
-* now and save its value in event->hw.prev_count. But
-* that would require issuing a hcall, which would then
-* defeat the purpose of using the txn interface.
-*/
-   local64_set(>count, 0);
}
 
put_cpu_var(hv_24x7_reqb);
-- 
2.18.1



[PATCH v7 0/5] powerpc/hv-24x7: Expose chip/sockets info to add json file metric support for the hv_24x7 socket/chip level events

2020-03-27 Thread Kajol Jain
Patchset fixes the inconsistent results we are getting when
we run multiple 24x7 events.

"hv_24x7" pmu interface events needs system dependent parameter
like socket/chip/core. For example, hv_24x7 chip level events needs
specific chip-id to which the data is requested should be added as part
of pmu events.

So to enable JSON file support to "hv_24x7" interface, patchset expose
total number of sockets and chips per-socket details in sysfs
files (sockets, chips) under "/sys/devices/hv_24x7/interface/".

To get sockets and number of chips per sockets, patchset adds a rtas call
with token "PROCESSOR_MODULE_INFO" to get these details. Patchset also
handles partition migration case to re-init these system depended
parameters by adding proper calls in post_mobility_fixup() (mobility.c).

v6: http://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=164769
Changelog:
v6 -> v7
- Split patchset into two patch series, one with kernel changes
  and another with perf tool side changes. This pachset contain
  all kernel side changes.

Kajol Jain (5):
  powerpc/perf/hv-24x7: Fix inconsistent output values incase multiple
hv-24x7 events run
  powerpc/hv-24x7: Add rtas call in hv-24x7 driver to get processor
details
  powerpc/hv-24x7: Add sysfs files inside hv-24x7 device to show
processor details
  Documentation/ABI: Add ABI documentation for chips and sockets
  powerpc/hv-24x7: Update post_mobility_fixup() to handle migration

 .../sysfs-bus-event_source-devices-hv_24x7|  14 +++
 arch/powerpc/perf/hv-24x7.c   | 104 --
 arch/powerpc/platforms/pseries/mobility.c |  12 ++
 arch/powerpc/platforms/pseries/pseries.h  |   3 +
 4 files changed, 123 insertions(+), 10 deletions(-)

-- 
2.18.1