[PATCH RFC 2/4] powerpc: Add Microwatt platform

2020-05-08 Thread Paul Mackerras
Microwatt is an FPGA-based implementation of the Power ISA.  It
currently only implements little-endian 64-bit mode, and does
not (yet) support SMP.

This adds a new machine type to support FPGA-based SoCs with a
Microwatt core.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/Kconfig                      |    2 +-
 arch/powerpc/configs/microwatt_defconfig  | 1418 +
 arch/powerpc/platforms/Kconfig            |    1 +
 arch/powerpc/platforms/Makefile           |    1 +
 arch/powerpc/platforms/microwatt/Kconfig  |    9 +
 arch/powerpc/platforms/microwatt/Makefile |    1 +
 arch/powerpc/platforms/microwatt/setup.c  |   40 +
 7 files changed, 1471 insertions(+), 1 deletion(-)
 create mode 100644 arch/powerpc/configs/microwatt_defconfig
 create mode 100644 arch/powerpc/platforms/microwatt/Kconfig
 create mode 100644 arch/powerpc/platforms/microwatt/Makefile
 create mode 100644 arch/powerpc/platforms/microwatt/setup.c

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 497b7d0b2d7e..97286b8312f5 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -407,7 +407,7 @@ config HUGETLB_PAGE_SIZE_VARIABLE
 
 config MATH_EMULATION
 	bool "Math emulation"
-	depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE
+	depends on 4xx || PPC_8xx || PPC_MPC832x || BOOKE || PPC_MICROWATT
 	help
 	  Some PowerPC chips designed for embedded applications do not have
 	  a floating-point unit and therefore do not implement the
diff --git a/arch/powerpc/configs/microwatt_defconfig b/arch/powerpc/configs/microwatt_defconfig
new file mode 100644
index ..f4f4c965a786
--- /dev/null
+++ b/arch/powerpc/configs/microwatt_defconfig
@@ -0,0 +1,1418 @@
+#
+# Automatically generated file; DO NOT EDIT.
+# Linux/powerpc 5.6.0 Kernel Configuration
+#
+
+#
+# Compiler: powerpc64le-linux-gnu-gcc (GCC) 9.2.1 20190827 (Red Hat Cross 9.2.1-1)
+#
+CONFIG_CC_IS_GCC=y
+CONFIG_GCC_VERSION=90201
+CONFIG_CLANG_VERSION=0
+CONFIG_CC_HAS_ASM_GOTO=y
+CONFIG_CC_HAS_ASM_INLINE=y
+CONFIG_CC_HAS_WARN_MAYBE_UNINITIALIZED=y
+CONFIG_CC_DISABLE_WARN_MAYBE_UNINITIALIZED=y
+CONFIG_IRQ_WORK=y
+CONFIG_BUILDTIME_TABLE_SORT=y
+CONFIG_THREAD_INFO_IN_TASK=y
+
+#
+# General setup
+#
+CONFIG_BROKEN_ON_SMP=y
+CONFIG_INIT_ENV_ARG_LIMIT=32
+# CONFIG_COMPILE_TEST is not set
+CONFIG_LOCALVERSION=""
+CONFIG_LOCALVERSION_AUTO=y
+CONFIG_BUILD_SALT=""
+CONFIG_HAVE_KERNEL_GZIP=y
+CONFIG_HAVE_KERNEL_XZ=y
+# CONFIG_KERNEL_GZIP is not set
+CONFIG_KERNEL_XZ=y
+CONFIG_DEFAULT_HOSTNAME="arty"
+# CONFIG_SWAP is not set
+# CONFIG_SYSVIPC is not set
+# CONFIG_CROSS_MEMORY_ATTACH is not set
+# CONFIG_USELIB is not set
+CONFIG_HAVE_ARCH_AUDITSYSCALL=y
+
+#
+# IRQ subsystem
+#
+CONFIG_GENERIC_IRQ_SHOW=y
+CONFIG_GENERIC_IRQ_SHOW_LEVEL=y
+CONFIG_HARDIRQS_SW_RESEND=y
+CONFIG_IRQ_DOMAIN=y
+CONFIG_IRQ_FORCED_THREADING=y
+CONFIG_SPARSE_IRQ=y
+# end of IRQ subsystem
+
+CONFIG_GENERIC_TIME_VSYSCALL=y
+CONFIG_GENERIC_CLOCKEVENTS=y
+CONFIG_GENERIC_CMOS_UPDATE=y
+
+#
+# Timers subsystem
+#
+CONFIG_TICK_ONESHOT=y
+CONFIG_HZ_PERIODIC=y
+# CONFIG_NO_HZ_IDLE is not set
+# CONFIG_NO_HZ is not set
+CONFIG_HIGH_RES_TIMERS=y
+# end of Timers subsystem
+
+# CONFIG_PREEMPT_NONE is not set
+CONFIG_PREEMPT_VOLUNTARY=y
+# CONFIG_PREEMPT is not set
+
+#
+# CPU/Task time and stats accounting
+#
+CONFIG_TICK_CPU_ACCOUNTING=y
+# CONFIG_VIRT_CPU_ACCOUNTING_NATIVE is not set
+# CONFIG_VIRT_CPU_ACCOUNTING_GEN is not set
+# CONFIG_IRQ_TIME_ACCOUNTING is not set
+# CONFIG_BSD_PROCESS_ACCT is not set
+# CONFIG_PSI is not set
+# end of CPU/Task time and stats accounting
+
+#
+# RCU Subsystem
+#
+CONFIG_TINY_RCU=y
+# CONFIG_RCU_EXPERT is not set
+CONFIG_SRCU=y
+CONFIG_TINY_SRCU=y
+# end of RCU Subsystem
+
+# CONFIG_IKCONFIG is not set
+# CONFIG_IKHEADERS is not set
+CONFIG_LOG_BUF_SHIFT=16
+CONFIG_PRINTK_SAFE_LOG_BUF_SHIFT=12
+
+#
+# Scheduler features
+#
+# end of Scheduler features
+
+CONFIG_ARCH_SUPPORTS_NUMA_BALANCING=y
+CONFIG_CC_HAS_INT128=y
+# CONFIG_CGROUPS is not set
+# CONFIG_NAMESPACES is not set
+# CONFIG_CHECKPOINT_RESTORE is not set
+# CONFIG_SCHED_AUTOGROUP is not set
+# CONFIG_SYSFS_DEPRECATED is not set
+# CONFIG_RELAY is not set
+CONFIG_BLK_DEV_INITRD=y
+CONFIG_INITRAMFS_SOURCE="rootfs.cpio"
+CONFIG_INITRAMFS_ROOT_UID=0
+CONFIG_INITRAMFS_ROOT_GID=0
+# CONFIG_RD_GZIP is not set
+# CONFIG_RD_BZIP2 is not set
+# CONFIG_RD_LZMA is not set
+CONFIG_RD_XZ=y
+# CONFIG_RD_LZO is not set
+# CONFIG_RD_LZ4 is not set
+CONFIG_INITRAMFS_COMPRESSION_XZ=y
+# CONFIG_INITRAMFS_COMPRESSION_NONE is not set
+# CONFIG_BOOT_CONFIG is not set
+# CONFIG_CC_OPTIMIZE_FOR_PERFORMANCE is not set
+CONFIG_CC_OPTIMIZE_FOR_SIZE=y
+CONFIG_HAVE_LD_DEAD_CODE_DATA_ELIMINATION=y
+# CONFIG_LD_DEAD_CODE_DATA_ELIMINATION is not set
+CONFIG_SYSCTL=y
+CONFIG_SYSCTL_EXCEPTION_TRACE=y
+CONFIG_EXPERT=y
+CONFIG_MULTIUSER=y
+# CONFIG_SGETMASK_SYSCALL is not set
+# CONFIG_SYSFS_SYSCALL is not set
+# CONFIG_FHANDLE is not set
+CONFIG_POSIX_TIMERS=y
+CONFIG_PRINTK=y

[PATCH RFC 0/4] Add support for Microwatt-based SoCs

2020-05-08 Thread Paul Mackerras
This patch series adds support for running Linux on a Microwatt SoC
(system on chip) implementation on an FPGA.  Microwatt is a small
Power ISA implementation, targeted at FPGAs, aiming for Power ISA
v3.0B compliance.  It does not currently implement any floating-point
or vector instructions, hypervisor mode, big-endian mode, 32-bit mode,
or the HPT/SLB MMU facilities.  However, it does support enough to run
Linux (as of my "mmu" branch plus Ben's "litedram" branch).

Paul.


[PATCH RFC 1/4] powerpc/radix: Fix compilation for radix with CONFIG_SMP=n

2020-05-08 Thread Paul Mackerras
This fixes the compile errors we currently get with CONFIG_SMP=n and
CONFIG_PPC_RADIX_MMU=y.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/book3s/64/tlbflush-radix.h | 2 ++
 arch/powerpc/mm/book3s64/radix_tlb.c                | 2 --
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
index ca8db193ae38..adcc6114d170 100644
--- a/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
+++ b/arch/powerpc/include/asm/book3s/64/tlbflush-radix.h
@@ -68,6 +68,8 @@ extern void radix__flush_tlb_page_psize(struct mm_struct *mm, unsigned long vmad
 #define radix__flush_all_mm(mm)		radix__local_flush_all_mm(mm)
 #define radix__flush_tlb_page(vma,addr)	radix__local_flush_tlb_page(vma,addr)
 #define radix__flush_tlb_page_psize(mm,addr,p)	radix__local_flush_tlb_page_psize(mm,addr,p)
+#define exit_flush_lazy_tlbs(mm)   do { } while (0)
+#define __flush_all_mm(mm, fullmm) radix__local_flush_all_mm(mm)
 #endif
 extern void radix__flush_tlb_pwc(struct mmu_gather *tlb, unsigned long addr);
 extern void radix__flush_tlb_collapsed_pmd(struct mm_struct *mm, unsigned long addr);
diff --git a/arch/powerpc/mm/book3s64/radix_tlb.c b/arch/powerpc/mm/book3s64/radix_tlb.c
index 03f43c924e00..e3ea026cf91e 100644
--- a/arch/powerpc/mm/book3s64/radix_tlb.c
+++ b/arch/powerpc/mm/book3s64/radix_tlb.c
@@ -776,8 +776,6 @@ void radix__flush_tlb_page(struct vm_area_struct *vma, unsigned long vmaddr)
 }
 EXPORT_SYMBOL(radix__flush_tlb_page);
 
-#else /* CONFIG_SMP */
-#define radix__flush_all_mm radix__local_flush_all_mm
 #endif /* CONFIG_SMP */
 
 static void do_tlbiel_kernel(void *info)
-- 
2.25.3



[PATCH RFC 4/4] powerpc/radix: Add support for microwatt's PRTBL SPR

2020-05-08 Thread Paul Mackerras
Microwatt currently doesn't implement hypervisor mode and therefore
doesn't implement the partition table.  It does implement the process
table and radix page table walks.

This adds code to write the base address of the process table to the
PRTBL SPR, which has been assigned SPR 720 for now, as that is in the
range of SPR numbers assigned for experimental use.  PRTBL is only
written when we have neither the FW_FEATURE_LPAR feature nor the
CPU_FTR_HVMODE feature.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/include/asm/reg.h   |  1 +
 arch/powerpc/mm/book3s64/radix_pgtable.c | 13 +
 2 files changed, 10 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/include/asm/reg.h b/arch/powerpc/include/asm/reg.h
index 1aa46dff0957..6ea3fc42740d 100644
--- a/arch/powerpc/include/asm/reg.h
+++ b/arch/powerpc/include/asm/reg.h
@@ -721,6 +721,7 @@
 #endif
 #define SPRN_TIR   0x1BE   /* Thread Identification Register */
 #define SPRN_PTCR  0x1D0   /* Partition table control Register */
+#define SPRN_PRTBL 0x2D0   /* Process table pointer */
 #define SPRN_PSPB  0x09F   /* Problem State Priority Boost reg */
 #define SPRN_PTEHI 0x3D5   /* 981 7450 PTE HI word (S/W TLB load) */
 #define SPRN_PTELO 0x3D6   /* 982 7450 PTE LO word (S/W TLB load) */
diff --git a/arch/powerpc/mm/book3s64/radix_pgtable.c b/arch/powerpc/mm/book3s64/radix_pgtable.c
index dd1bea45325c..2e6a376c9d82 100644
--- a/arch/powerpc/mm/book3s64/radix_pgtable.c
+++ b/arch/powerpc/mm/book3s64/radix_pgtable.c
@@ -600,10 +600,15 @@ void __init radix__early_init_mmu(void)
 	radix_init_pgtable();
 
 	if (!firmware_has_feature(FW_FEATURE_LPAR)) {
-		lpcr = mfspr(SPRN_LPCR);
-		mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
-		radix_init_partition_table();
-		radix_init_amor();
+		if (cpu_has_feature(CPU_FTR_HVMODE)) {
+			lpcr = mfspr(SPRN_LPCR);
+			mtspr(SPRN_LPCR, lpcr | LPCR_UPRT | LPCR_HR);
+			radix_init_partition_table();
+			radix_init_amor();
+		} else {
+			mtspr(SPRN_PRTBL, (__pa(process_tb) |
+					   (PRTB_SIZE_SHIFT - 12)));
+		}
 	} else {
 		radix_init_pseries();
 	}
-- 
2.25.3
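
As a rough illustration of the value written to PRTBL in the patch above,
here is a minimal user-space sketch.  The helper prtbl_value() and its
parameters are hypothetical stand-ins for __pa(process_tb) and
PRTB_SIZE_SHIFT; only the SPR number and the arithmetic are taken from
the patch, and size_shift is assumed to be log2 of the table size in bytes.

```c
#include <stdint.h>

/* SPR number used by the patch: 0x2D0 is 720 decimal. */
#define SPRN_PRTBL 0x2D0

/*
 * Hypothetical helper (not in the patch): combine the physical base
 * address of the process table with its encoded size, mirroring the
 * expression passed to mtspr() above.  size_shift stands in for
 * PRTB_SIZE_SHIFT; the low bits of the result hold size_shift - 12.
 */
static uint64_t prtbl_value(uint64_t table_pa, unsigned int size_shift)
{
	return table_pa | (uint64_t)(size_shift - 12);
}
```

For example, a 64 KiB table (size_shift of 16) at physical address
0x10000 would give 0x10004 under these assumptions.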



[PATCH RFC 3/4] powerpc/microwatt: Add early debug UART support for Microwatt

2020-05-08 Thread Paul Mackerras
Currently microwatt-based SoCs come with a "potato" UART.  This
adds udbg support for the potato UART, giving us both an early
debug console, and a runtime console using the hvc-udbg support.

Signed-off-by: Paul Mackerras 
---
 arch/powerpc/Kconfig.debug   |   6 ++
 arch/powerpc/include/asm/udbg.h  |   1 +
 arch/powerpc/kernel/udbg.c   |   2 +
 arch/powerpc/platforms/microwatt/setup.c | 108 +++
 4 files changed, 117 insertions(+)

diff --git a/arch/powerpc/Kconfig.debug b/arch/powerpc/Kconfig.debug
index 0b063830eea8..399abc6d2af7 100644
--- a/arch/powerpc/Kconfig.debug
+++ b/arch/powerpc/Kconfig.debug
@@ -277,6 +277,12 @@ config PPC_EARLY_DEBUG_MEMCONS
  This console provides input and output buffers stored within the
  kernel BSS and should be safe to select on any system. A debugger
  can then be used to read kernel output or send input to the console.
+
+config PPC_EARLY_DEBUG_MICROWATT
+   bool "Microwatt potato UART"
+   help
+ Select this to enable early debugging using the potato UART
+included in the Microwatt SOC.
 endchoice
 
 config PPC_MEMCONS_OUTPUT_SIZE
diff --git a/arch/powerpc/include/asm/udbg.h b/arch/powerpc/include/asm/udbg.h
index 0ea9e70ed78b..2dbd2d3b0591 100644
--- a/arch/powerpc/include/asm/udbg.h
+++ b/arch/powerpc/include/asm/udbg.h
@@ -53,6 +53,7 @@ extern void __init udbg_init_ehv_bc(void);
 extern void __init udbg_init_ps3gelic(void);
 extern void __init udbg_init_debug_opal_raw(void);
 extern void __init udbg_init_debug_opal_hvsi(void);
+extern void __init udbg_init_debug_microwatt(void);
 
 #endif /* __KERNEL__ */
 #endif /* _ASM_POWERPC_UDBG_H */
diff --git a/arch/powerpc/kernel/udbg.c b/arch/powerpc/kernel/udbg.c
index 01595e8cafe7..e614993021c6 100644
--- a/arch/powerpc/kernel/udbg.c
+++ b/arch/powerpc/kernel/udbg.c
@@ -67,6 +67,8 @@ void __init udbg_early_init(void)
udbg_init_debug_opal_raw();
 #elif defined(CONFIG_PPC_EARLY_DEBUG_OPAL_HVSI)
udbg_init_debug_opal_hvsi();
+#elif defined(CONFIG_PPC_EARLY_DEBUG_MICROWATT)
+   udbg_init_debug_microwatt();
 #endif
 
 #ifdef CONFIG_PPC_EARLY_DEBUG
diff --git a/arch/powerpc/platforms/microwatt/setup.c b/arch/powerpc/platforms/microwatt/setup.c
index 3cfc5955a6fe..a5145adeaae7 100644
--- a/arch/powerpc/platforms/microwatt/setup.c
+++ b/arch/powerpc/platforms/microwatt/setup.c
@@ -10,8 +10,115 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 
+static u64 potato_uart_base;
+
+#define PROC_FREQ 1
+#define UART_FREQ 115200
+#define UART_BASE 0xc0002000
+
+#define POTATO_CONSOLE_TX  0x00
+#define POTATO_CONSOLE_RX  0x08
+#define POTATO_CONSOLE_STATUS  0x10
+#define   POTATO_CONSOLE_STATUS_RX_EMPTY   0x01
+#define   POTATO_CONSOLE_STATUS_TX_EMPTY   0x02
+#define   POTATO_CONSOLE_STATUS_RX_FULL0x04
+#define   POTATO_CONSOLE_STATUS_TX_FULL0x08
+#define POTATO_CONSOLE_CLOCK_DIV   0x18
+#define POTATO_CONSOLE_IRQ_EN  0x20
+
+static u64 potato_uart_reg_read(int offset)
+{
+   u64 val, msr;
+
+   msr = mfmsr();
+   __asm__ volatile("mtmsrd %3,0; ldcix %0,%1,%2; mtmsrd %4,0"
+: "=r" (val) : "b" (potato_uart_base), "r" (offset),
+  "r" (msr & ~MSR_DR), "r" (msr));
+
+   return val;
+}
+
+static void potato_uart_reg_write(int offset, u64 val)
+{
+   u64 msr;
+
+   msr = mfmsr();
+   __asm__ volatile("mtmsrd %3,0; stdcix %0,%1,%2; mtmsrd %4,0"
+: : "r" (val), "b" (potato_uart_base), "r" (offset),
+  "r" (msr & ~MSR_DR), "r" (msr));
+}
+
+static int potato_uart_rx_empty(void)
+{
+   u64 val;
+
+   val = potato_uart_reg_read(POTATO_CONSOLE_STATUS);
+
+   if (val & POTATO_CONSOLE_STATUS_RX_EMPTY)
+   return 1;
+
+   return 0;
+}
+
+static int potato_uart_tx_full(void)
+{
+   u64 val;
+
+   val = potato_uart_reg_read(POTATO_CONSOLE_STATUS);
+
+   if (val & POTATO_CONSOLE_STATUS_TX_FULL)
+   return 1;
+
+   return 0;
+}
+
+static int potato_uart_read(void)
+{
+   while (potato_uart_rx_empty())
+   ;
+   return potato_uart_reg_read(POTATO_CONSOLE_RX);
+}
+
+static int potato_uart_read_poll(void)
+{
+   if (potato_uart_rx_empty())
+   return -1;
+   return potato_uart_reg_read(POTATO_CONSOLE_RX);
+}
+
+static void potato_uart_write(char c)
+{
+   if (c == '\n')
+   potato_uart_write('\r');
+   while (potato_uart_tx_full())
+   ;
+   potato_uart_reg_write(POTATO_CONSOLE_TX, c);
+}
+
+static unsigned long potato_uart_divisor(unsigned long proc_freq, unsigned long uart_freq)
+{
+   return proc_freq / (uart_freq * 16) - 1;
+}
+
+void potato_uart_init(void)
+{
+   potato_uart_base = UART_BASE;
+
+   

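The divisor computation in potato_uart_divisor() can be tried in
isolation.  The sketch below copies the formula from the patch; the
100 MHz clock used in the note after it is an assumed value chosen for
illustration, not one taken from the patch.

```c
/*
 * Same formula as potato_uart_divisor() in the patch: the UART samples
 * at 16 times the baud rate, and the register holds the count minus one.
 */
static unsigned long potato_uart_divisor(unsigned long proc_freq,
					 unsigned long uart_freq)
{
	return proc_freq / (uart_freq * 16) - 1;
}
```

With an assumed 100000000 Hz clock and 115200 baud, the integer division
gives 54, so the function returns 53.
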
Re: remove a few uses of ->queuedata

2020-05-08 Thread Ming Lei
On Fri, May 08, 2020 at 06:15:02PM +0200, Christoph Hellwig wrote:
> Hi all,
> 
> various bio based drivers use queue->queuedata despite already having
> set up disk->private_data, which can be used just as easily.  This
> series cleans them up to only use a single private data pointer.
> 
> blk-mq based drivers that have code paths that can't easily get at
> the gendisk are unaffected by this series.

Yeah, before the disk is added, requests may still be queued to the LLD
for blk-mq based drivers.

So is there a similar situation for these bio based drivers?


Thanks,
Ming



Re: [PATCH v4 11/16] powerpc/64s: machine check interrupt update NMI accounting

2020-05-08 Thread kbuild test robot
Hi Nicholas,

I love your patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on tip/perf/core v5.7-rc4 next-20200508]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Nicholas-Piggin/powerpc-machine-check-and-system-reset-fixes/20200509-030554
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r002-20200509 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

   In file included from include/linux/kernel.h:15,
from include/asm-generic/bug.h:19,
from arch/powerpc/include/asm/bug.h:109,
from include/linux/bug.h:5,
from arch/powerpc/include/asm/mmu.h:130,
from arch/powerpc/include/asm/paca.h:18,
from arch/powerpc/include/asm/current.h:13,
from include/linux/sched.h:12,
from arch/powerpc/kernel/process.c:14:
   arch/powerpc/kernel/process.c: In function 'show_regs':
>> arch/powerpc/kernel/process.c:1425:74: error: 'struct paca_struct' has no member named 'in_nmi'
    1425 |  pr_cont("IRQMASK: %lx IN_NMI:%d IN_MCE:%d", regs->softe, (int)get_paca()->in_nmi, (int)get_paca()->in_mce);
         |                                                                          ^~
   include/linux/printk.h:312:26: note: in definition of macro 'pr_cont'
 312 |  printk(KERN_CONT fmt, ##__VA_ARGS__)
 |  ^~~
>> arch/powerpc/kernel/process.c:1425:99: error: 'struct paca_struct' has no member named 'in_mce'
    1425 |  pr_cont("IRQMASK: %lx IN_NMI:%d IN_MCE:%d", regs->softe, (int)get_paca()->in_nmi, (int)get_paca()->in_mce);
         |                                                                                                   ^~
   include/linux/printk.h:312:26: note: in definition of macro 'pr_cont'
 312 |  printk(KERN_CONT fmt, ##__VA_ARGS__)
 |  ^~~

vim +1425 arch/powerpc/kernel/process.c

  1401  
  1402  void show_regs(struct pt_regs * regs)
  1403  {
  1404  int i, trap;
  1405  
  1406  show_regs_print_info(KERN_DEFAULT);
  1407  
  1408  printk("NIP:  "REG" LR: "REG" CTR: "REG"\n",
  1409 regs->nip, regs->link, regs->ctr);
  1410  printk("REGS: %px TRAP: %04lx   %s  (%s)\n",
  1411 regs, regs->trap, print_tainted(), 
init_utsname()->release);
  1412  printk("MSR:  "REG" ", regs->msr);
  1413  print_msr_bits(regs->msr);
  1414  pr_cont("  CR: %08lx  XER: %08lx\n", regs->ccr, regs->xer);
  1415  trap = TRAP(regs);
  1416  if ((TRAP(regs) != 0xc00) && cpu_has_feature(CPU_FTR_CFAR))
  1417  pr_cont("CFAR: "REG" ", regs->orig_gpr3);
  1418  if (trap == 0x200 || trap == 0x300 || trap == 0x600)
  1419  #if defined(CONFIG_4xx) || defined(CONFIG_BOOKE)
  1420  pr_cont("DEAR: "REG" ESR: "REG" ", regs->dar, 
regs->dsisr);
  1421  #else
  1422  pr_cont("DAR: "REG" DSISR: %08lx ", regs->dar, 
regs->dsisr);
  1423  #endif
  1424  #ifdef CONFIG_PPC64
> 1425  pr_cont("IRQMASK: %lx IN_NMI:%d IN_MCE:%d", regs->softe, (int)get_paca()->in_nmi, (int)get_paca()->in_mce);
  1426  #endif
  1427  #ifdef CONFIG_PPC_TRANSACTIONAL_MEM
  1428  if (MSR_TM_ACTIVE(regs->msr))
  1429  pr_cont("\nPACATMSCRATCH: %016llx ", 
get_paca()->tm_scratch);
  1430  #endif
  1431  
  1432  for (i = 0;  i < 32;  i++) {
  1433  if ((i % REGS_PER_LINE) == 0)
  1434  pr_cont("\nGPR%02d: ", i);
  1435  pr_cont(REG " ", regs->gpr[i]);
  1436  if (i == LAST_VOLATILE && !FULL_REGS(regs))
  1437  break;
  1438  }
  1439  pr_cont("\n");
  1440  #ifdef CONFIG_KALLSYMS
  1441  /*
  144

[PATCH -next] powerpc/powernv: add NULL check after kzalloc

2020-05-08 Thread Chen Zhou
Fixes coccicheck warning:

./arch/powerpc/platforms/powernv/opal.c:813:1-5:
alloc with no test, possible model on line 814

Add NULL check after kzalloc.

Signed-off-by: Chen Zhou 
---
 arch/powerpc/platforms/powernv/opal.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/platforms/powernv/opal.c b/arch/powerpc/platforms/powernv/opal.c
index 2b3dfd0b6cdd..d95954ad4c0a 100644
--- a/arch/powerpc/platforms/powernv/opal.c
+++ b/arch/powerpc/platforms/powernv/opal.c
@@ -811,6 +811,10 @@ static int opal_add_one_export(struct kobject *parent, const char *export_name,
 		goto out;
 
 	attr = kzalloc(sizeof(*attr), GFP_KERNEL);
+	if (!attr) {
+		rc = -ENOMEM;
+		goto out;
+	}
 	name = kstrdup(export_name, GFP_KERNEL);
 	if (!name) {
 		rc = -ENOMEM;
-- 
2.20.1
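
The check-then-goto pattern added by this patch can be sketched in user
space.  This is a hypothetical analogue, not the kernel function: struct
export_attr and add_one_export() are made-up names, and calloc()/malloc()
stand in for kzalloc()/kstrdup().

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the attribute object allocated in the patch. */
struct export_attr {
	char *name;
};

/*
 * Check every allocation before use and funnel all failures through a
 * single cleanup label.  free(NULL) is a no-op, so the label is safe
 * whichever allocation failed.
 */
static int add_one_export(const char *export_name, struct export_attr **out)
{
	struct export_attr *attr;
	char *name = NULL;
	size_t len;
	int rc;

	attr = calloc(1, sizeof(*attr));
	if (!attr) {
		rc = -ENOMEM;
		goto out;
	}

	len = strlen(export_name) + 1;
	name = malloc(len);
	if (!name) {
		rc = -ENOMEM;
		goto out;
	}
	memcpy(name, export_name, len);

	attr->name = name;
	*out = attr;
	return 0;

out:
	free(name);
	free(attr);
	return rc;
}
```

On success the caller owns the returned object and is expected to free
both attr->name and attr.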



[PATCH] powerpc/kvm: silence kmemleak false positives

2020-05-08 Thread Qian Cai
kvmppc_pmd_alloc() and kvmppc_pte_alloc() allocate some memory but then
pud_populate() and pmd_populate() will use __pa() to reference the newly
allocated memory. The same is in xive_native_provision_pages().

Since kmemleak is unable to track the physical memory resulting in false
positives, silence those by using kmemleak_ignore().

unreferenced object 0xc000201c382a1000 (size 4096):
  comm "qemu-kvm", pid 124828, jiffies 4295733767 (age 341.250s)
  hex dump (first 32 bytes):
c0 00 20 09 f4 60 03 87 c0 00 20 10 72 a0 03 87  .. ..` .r...
c0 00 20 0e 13 a0 03 87 c0 00 20 1b dc c0 03 87  .. ... .
  backtrace:
[<4cc2790f>] kvmppc_create_pte+0x838/0xd20 [kvm_hv]
kvmppc_pmd_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:366
(inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:590
[] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
[] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
[<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
[<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
[] kvmppc_vcpu_run+0x34/0x48 [kvm]
[] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
[<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
[<48155cd6>] ksys_ioctl+0xd8/0x130
[<41ffeaa7>] sys_ioctl+0x28/0x40
[<4afc4310>] system_call_exception+0x114/0x1e0
[] system_call_common+0xf0/0x278
unreferenced object 0xc0002001f0c03900 (size 256):
  comm "qemu-kvm", pid 124830, jiffies 4295735235 (age 326.570s)
  hex dump (first 32 bytes):
c0 00 20 10 fa a0 03 87 c0 00 20 10 fa a1 03 87  .. ... .
c0 00 20 10 fa a2 03 87 c0 00 20 10 fa a3 03 87  .. ... .
  backtrace:
[<23f675b8>] kvmppc_create_pte+0x854/0xd20 [kvm_hv]
kvmppc_pte_alloc at arch/powerpc/kvm/book3s_64_mmu_radix.c:356
(inlined by) kvmppc_create_pte at arch/powerpc/kvm/book3s_64_mmu_radix.c:593
[] kvmppc_book3s_instantiate_page+0x2e0/0x8c0 [kvm_hv]
[] kvmppc_book3s_radix_page_fault+0x1b4/0x2b0 [kvm_hv]
[<86dddc0e>] kvmppc_book3s_hv_page_fault+0x214/0x12a0 [kvm_hv]
[<5ae9ccc2>] kvmppc_vcpu_run_hv+0xc5c/0x15f0 [kvm_hv]
[] kvmppc_vcpu_run+0x34/0x48 [kvm]
[] kvm_arch_vcpu_ioctl_run+0x314/0x420 [kvm]
[<2543dd54>] kvm_vcpu_ioctl+0x33c/0x950 [kvm]
[<48155cd6>] ksys_ioctl+0xd8/0x130
[<41ffeaa7>] sys_ioctl+0x28/0x40
[<4afc4310>] system_call_exception+0x114/0x1e0
[] system_call_common+0xf0/0x278
unreferenced object 0xc000201b53e9 (size 65536):
  comm "qemu-kvm", pid 124557, jiffies 4295650285 (age 364.370s)
  hex dump (first 32 bytes):
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  backtrace:
[] xive_native_alloc_vp_block+0x168/0x210
xive_native_provision_pages at arch/powerpc/sysdev/xive/native.c:645
(inlined by) xive_native_alloc_vp_block at 
arch/powerpc/sysdev/xive/native.c:674
[<4d5c7964>] kvmppc_xive_compute_vp_id+0x20c/0x3b0 [kvm]
[<55317cd2>] kvmppc_xive_connect_vcpu+0xa4/0x4a0 [kvm]
[<93dfc014>] kvm_arch_vcpu_ioctl+0x388/0x508 [kvm]
[] kvm_vcpu_ioctl+0x15c/0x950 [kvm]
[<48155cd6>] ksys_ioctl+0xd8/0x130
[<41ffeaa7>] sys_ioctl+0x28/0x40
[<4afc4310>] system_call_exception+0x114/0x1e0
[] system_call_common+0xf0/0x278

Signed-off-by: Qian Cai 
---
 arch/powerpc/kvm/book3s_64_mmu_radix.c | 16 ++--
 arch/powerpc/sysdev/xive/native.c  |  4 
 2 files changed, 18 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kvm/book3s_64_mmu_radix.c b/arch/powerpc/kvm/book3s_64_mmu_radix.c
index aa12cd4078b3..bc6c1aa3d0e9 100644
--- a/arch/powerpc/kvm/book3s_64_mmu_radix.c
+++ b/arch/powerpc/kvm/book3s_64_mmu_radix.c
@@ -353,7 +353,13 @@ static struct kmem_cache *kvm_pmd_cache;
 
 static pte_t *kvmppc_pte_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   pte_t *pte;
+
+   pte = kmem_cache_alloc(kvm_pte_cache, GFP_KERNEL);
+   /* pmd_populate() will only reference _pa(pte). */
+   kmemleak_ignore(pte);
+
+   return pte;
 }
 
 static void kvmppc_pte_free(pte_t *ptep)
@@ -363,7 +369,13 @@ static void kvmppc_pte_free(pte_t *ptep)
 
 static pmd_t *kvmppc_pmd_alloc(void)
 {
-   return kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   pmd_t *pmd;
+
+   pmd = kmem_cache_alloc(kvm_pmd_cache, GFP_KERNEL);
+   /* pud_populate() will only reference _pa(pmd). */
+   kmemleak_ignore(pmd);
+
+   return pmd;
 }
 
 static void kvmppc_pmd_free(pmd_t *pmdp)
diff --git a/arch/powerpc/sysdev/xive/native.c 

Re: [PATCH V3 2/3] mm/hugetlb: Define a generic fallback for is_hugepage_only_range()

2020-05-08 Thread Mike Kravetz
On 5/7/20 8:07 PM, Anshuman Khandual wrote:
> There are multiple similar definitions for is_hugepage_only_range() on
> various platforms. Let's just add its generic fallback definition for
> platforms that do not override it. This helps reduce code duplication.
> 
> Cc: Russell King 
> Cc: Catalin Marinas 
> Cc: Will Deacon 
> Cc: Tony Luck 
> Cc: Fenghua Yu 
> Cc: Thomas Bogendoerfer 
> Cc: "James E.J. Bottomley" 
> Cc: Helge Deller 
> Cc: Benjamin Herrenschmidt 
> Cc: Paul Mackerras 
> Cc: Michael Ellerman 
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Heiko Carstens 
> Cc: Vasily Gorbik 
> Cc: Christian Borntraeger 
> Cc: Yoshinori Sato 
> Cc: Rich Felker 
> Cc: "David S. Miller" 
> Cc: Thomas Gleixner 
> Cc: Ingo Molnar 
> Cc: Borislav Petkov 
> Cc: "H. Peter Anvin" 
> Cc: Mike Kravetz 
> Cc: Andrew Morton 
> Cc: x...@kernel.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-i...@vger.kernel.org
> Cc: linux-m...@vger.kernel.org
> Cc: linux-par...@vger.kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-ri...@lists.infradead.org
> Cc: linux-s...@vger.kernel.org
> Cc: linux...@vger.kernel.org
> Cc: sparcli...@vger.kernel.org
> Cc: linux...@kvack.org
> Cc: linux-a...@vger.kernel.org
> Cc: linux-ker...@vger.kernel.org
> Signed-off-by: Anshuman Khandual 
> ---
>  arch/arm/include/asm/hugetlb.h | 6 --
>  arch/arm64/include/asm/hugetlb.h   | 6 --
>  arch/ia64/include/asm/hugetlb.h| 1 +
>  arch/mips/include/asm/hugetlb.h| 7 ---
>  arch/parisc/include/asm/hugetlb.h  | 6 --
>  arch/powerpc/include/asm/hugetlb.h | 1 +
>  arch/riscv/include/asm/hugetlb.h   | 6 --
>  arch/s390/include/asm/hugetlb.h| 7 ---
>  arch/sh/include/asm/hugetlb.h  | 6 --
>  arch/sparc/include/asm/hugetlb.h   | 6 --
>  arch/x86/include/asm/hugetlb.h | 6 --
>  include/linux/hugetlb.h| 9 +
>  12 files changed, 11 insertions(+), 56 deletions(-)
> 

> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
> index 43a1cef8f0f1..c01c0c6f7fd4 100644
> --- a/include/linux/hugetlb.h
> +++ b/include/linux/hugetlb.h
> @@ -591,6 +591,15 @@ static inline unsigned int blocks_per_huge_page(struct hstate *h)
>  
>  #include 
>  
> +#ifndef is_hugepage_only_range
> +static inline int is_hugepage_only_range(struct mm_struct *mm,
> + unsigned long addr, unsigned long len)
> +{
> + return 0;
> +}
> +#define is_hugepage_only_range is_hugepage_only_range
> +#endif
> +
>  #ifndef arch_make_huge_pte
>  static inline pte_t arch_make_huge_pte(pte_t entry, struct vm_area_struct *vma,
>  struct page *page, int writable)
> 

Did you try building without CONFIG_HUGETLB_PAGE defined?  I'm guessing
that you need a stub for is_hugepage_only_range().  Or, perhaps add this
to asm-generic/hugetlb.h?

-- 
Mike Kravetz
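
For reference, the override-or-fallback idiom used in the quoted hunk can
be shown as a standalone sketch.  The function name is kept from the
patch, but the opaque struct declaration and the comments are
illustrative assumptions; no architecture override is defined here, so
the generic version is the one compiled.

```c
/* Opaque here; the sketch never looks inside it. */
struct mm_struct;

/*
 * An architecture that needs its own version would provide the function
 * and a same-named macro before this point:
 *
 *   static inline int is_hugepage_only_range(...) { ... }
 *   #define is_hugepage_only_range is_hugepage_only_range
 *
 * The #ifndef below then skips the generic definition.
 */
#ifndef is_hugepage_only_range
static inline int is_hugepage_only_range(struct mm_struct *mm,
					 unsigned long addr, unsigned long len)
{
	(void)mm;
	(void)addr;
	(void)len;
	return 0;	/* generic answer: no range is hugepage-only */
}
#define is_hugepage_only_range is_hugepage_only_range
#endif
```

Defining the macro to its own name is what lets a later #ifndef detect
that a definition already exists.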


Re: [PATCH v8 5/5] powerpc/hv-24x7: Update post_mobility_fixup() to handle migration

2020-05-08 Thread kbuild test robot
Hi Kajol,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on powerpc/next]
[also build test ERROR on linus/master v5.7-rc4 next-20200508]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Kajol-Jain/powerpc-hv-24x7-Expose-chip-sockets-info-to-add-json-file-metric-support-for-the-hv_24x7-socket-chip-level-events/20200507-032548
base:   https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git next
config: powerpc-randconfig-r013-20200508 (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce:
wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day GCC_VERSION=9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kbuild test robot 

All errors (new ones prefixed by >>):

>> arch/powerpc/platforms/pseries/mobility.c:48:20: error: static declaration of 'read_sys_info_pseries' follows non-static declaration
  48 | static inline void read_sys_info_pseries(void) { }
 |^
   In file included from arch/powerpc/platforms/pseries/mobility.c:22:
   arch/powerpc/include/asm/rtas.h:485:13: note: previous declaration of 
'read_sys_info_pseries' was here
 485 | extern void read_sys_info_pseries(void);
 | ^

vim +/read_sys_info_pseries +48 arch/powerpc/platforms/pseries/mobility.c

44  
45  #ifdef CONFIG_HV_PERF_CTRS
46  void read_sys_info_pseries(void);
47  #else
  > 48  static inline void read_sys_info_pseries(void) { }
49  #endif
50  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org
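
One common way to avoid this class of error is to keep the real prototype
and the empty stub in a single place, guarded by the same condition, so
that no translation unit ever sees both.  The following is a minimal
sketch under that assumption, not the fix chosen for this series;
HAVE_HV_PERF_CTRS is a hypothetical stand-in for CONFIG_HV_PERF_CTRS and
post_mobility_fixup_sketch() is a made-up caller.

```c
/* Hypothetical stand-in for CONFIG_HV_PERF_CTRS; left undefined here. */
/* #define HAVE_HV_PERF_CTRS 1 */

/* What a shared header could contain: exactly one of the two forms. */
#ifdef HAVE_HV_PERF_CTRS
void read_sys_info_pseries(void);
#else
static inline void read_sys_info_pseries(void) { }
#endif

/* Made-up caller: compiles the same way with or without the feature. */
static int post_mobility_fixup_sketch(void)
{
	read_sys_info_pseries();
	return 0;
}
```

With the guard in one header, the .c file needs no declaration of its
own, which removes the static versus non-static conflict reported above.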



Re: [PATCH 5/5] powerpc/mpc85xx: Add Cyrus P5040 device tree source

2020-05-08 Thread Scott Wood
On Thu, 2020-05-07 at 22:30 +0100, Darren Stevens wrote:
> 
> +/include/ "p5040si-pre.dtsi"
> +
> +/ {
> + model = "varisys,CYRUS5040";
> + compatible = "varisys,CYRUS";

Is this board 100% compatible with the Cyrus P5020 board, down to every last
quirk, except for the SoC plugged into it?  If not, they shouldn't have the
same compatible.  If they are, then couldn't everything in this file but the
SoC include be moved to a dtsi shared with cyrus_p5020.dts?


> + #address-cells = <2>;
> + #size-cells = <2>;
> + interrupt-parent = <>;
> +
> + aliases{
> + ethernet0 = 
> + ethernet1 = 
> + };

Space after "aliases"

-Scott




Re: [PATCH 4/5] powerpc/mpc85xx: Add Cyrus HDD LED

2020-05-08 Thread Scott Wood
On Thu, 2020-05-07 at 22:15 +0100, Darren Stevens wrote:
> The Cyrus board has its HDD LED connected to a GPIO pin. Add a device
> tree entry for this.
> 
> Signed-off-By: Darren Stevens 
> 
> ---
>  arch/powerpc/boot/dts/fsl/cyrus_p5020.dts | 10 ++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/cyrus_p5020.dts b/arch/powerpc/boot/dts/fsl/cyrus_p5020.dts
> index f0548fe..74c100f 100644
> --- a/arch/powerpc/boot/dts/fsl/cyrus_p5020.dts
> @@ -83,6 +83,16 @@
>   gpios = < 2 1>;
>   };
>  
> + leds {
> + compatible = "gpio-leds";
> +
> + hdd {
> + label = "Disk Activity";
> + gpios = < 5 0>;
> + linux,default-trigger = "disk-activity";
> + };

Documentation/devicetree/bindings/leds/common.yaml says that label is
deprecated, and to "use 'function' and 'color' properties instead".

Also, please CC devicet...@vger.kernel.org on these patches.

-Scott




Re: [PATCH 2/2] selftests: vm: pkeys: Fix powerpc access right updates

2020-05-08 Thread Florian Weimer
* Sandipan Das:

> Hi Florian,
>
> On 08/05/20 11:31 pm, Florian Weimer wrote:
>> * Sandipan Das:
>> 
>>> The Power ISA mandates that all writes to the Authority
>>> Mask Register (AMR) must always be preceded as well as
>>> succeeded by a context-synchronizing instruction. This
>>> applies to both the privileged and unprivileged variants
>>> of the Move To AMR instruction.
>> 
>> Ugh.  Do you have a reference for that?
>> 
>> We need to fix this in glibc.
>> 
>
> This is from Table 6 of Chapter 11 in page 1134 of Power
> ISA 3.0B. The document can be found here:
> https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv

Thanks a lot!  I filed:

  

Florian



Re: [PATCH 2/2] selftests: vm: pkeys: Fix powerpc access right updates

2020-05-08 Thread Sandipan Das
Hi Florian,

On 08/05/20 11:31 pm, Florian Weimer wrote:
> * Sandipan Das:
> 
>> The Power ISA mandates that all writes to the Authority
>> Mask Register (AMR) must always be preceded as well as
>> succeeded by a context-synchronizing instruction. This
>> applies to both the privileged and unprivileged variants
>> of the Move To AMR instruction.
> 
> Ugh.  Do you have a reference for that?
> 
> We need to fix this in glibc.
> 

This is from Table 6 of Chapter 11 in page 1134 of Power
ISA 3.0B. The document can be found here:
https://ibm.ent.box.com/s/1hzcwkwf8rbju5h9iyf44wm94amnlcrv

- Sandipan


Re: remove a few uses of ->queuedata

2020-05-08 Thread Dan Williams
On Fri, May 8, 2020 at 9:16 AM Christoph Hellwig  wrote:
>
> Hi all,
>
> various bio based drivers use queue->queuedata despite already having
> set up disk->private_data, which can be used just as easily.  This
> series cleans them up to only use a single private data pointer.

...but isn't the queue pretty much guaranteed to be cache hot and the
gendisk cache cold? I'm not immediately seeing what else needs the
gendisk in the I/O path. Is there another motivation I'm missing?


Re: [PATCH 2/2] selftests: vm: pkeys: Fix powerpc access right updates

2020-05-08 Thread Florian Weimer
* Sandipan Das:

> The Power ISA mandates that all writes to the Authority
> Mask Register (AMR) must always be preceded as well as
> succeeded by a context-synchronizing instruction. This
> applies to both the privileged and unprivileged variants
> of the Move To AMR instruction.

Ugh.  Do you have a reference for that?

We need to fix this in glibc.

Thanks,
Florian



[PATCH 2/2] selftests: vm: pkeys: Fix powerpc access right updates

2020-05-08 Thread Sandipan Das
The Power ISA mandates that all writes to the Authority
Mask Register (AMR) must always be preceded as well as
succeeded by a context-synchronizing instruction. This
applies to both the privileged and unprivileged variants
of the Move To AMR instruction.

Fixes: 130f573c2a79 ("selftests/vm/pkeys: introduce powerpc support")
Reported-by: Aneesh Kumar K.V 
Suggested-by: Aneesh Kumar K.V 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-powerpc.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
index eb5077de8f1e..1ebb586b2fbc 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -55,7 +55,8 @@ static inline void __write_pkey_reg(u64 pkey_reg)
dprintf4("%s() changing %016llx to %016llx\n",
 __func__, __read_pkey_reg(), pkey_reg);
 
-   asm volatile("mtspr 0xd, %0" : : "r" ((unsigned long)(amr)) : "memory");
+   asm volatile("isync; mtspr 0xd, %0; isync"
+: : "r" ((unsigned long)(amr)) : "memory");
 
dprintf4("%s() pkey register after changing %016llx to %016llx\n",
__func__, __read_pkey_reg(), pkey_reg);
-- 
2.17.1



[PATCH 1/2] selftests: vm: pkeys: Fix powerpc access right definitions

2020-05-08 Thread Sandipan Das
For powerpc, PKEY_DISABLE_WRITE and PKEY_DISABLE_ACCESS are
redefined only if the system headers already define them.
Otherwise, the test fails to compile due to their absence.
This makes sure that they are always defined irrespective of
them being present in the system headers.

Fixes: 130f573c2a79 ("selftests/vm/pkeys: introduce powerpc support")
Reported-by: Aneesh Kumar K.V 
Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/vm/pkey-powerpc.h | 8 ++--
 1 file changed, 2 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/vm/pkey-powerpc.h 
b/tools/testing/selftests/vm/pkey-powerpc.h
index 3a761e51a587..eb5077de8f1e 100644
--- a/tools/testing/selftests/vm/pkey-powerpc.h
+++ b/tools/testing/selftests/vm/pkey-powerpc.h
@@ -16,15 +16,11 @@
 #define fpregs fp_regs
 #define si_pkey_offset 0x20
 
-#ifdef PKEY_DISABLE_ACCESS
 #undef PKEY_DISABLE_ACCESS
-# define PKEY_DISABLE_ACCESS   0x3  /* disable read and write */
-#endif
+#define PKEY_DISABLE_ACCESS0x3  /* disable read and write */
 
-#ifdef PKEY_DISABLE_WRITE
 #undef PKEY_DISABLE_WRITE
-# define PKEY_DISABLE_WRITE0x2
-#endif
+#define PKEY_DISABLE_WRITE 0x2
 
 #define NR_PKEYS   32
 #define NR_RESERVED_PKEYS_4K   27 /* pkey-0, pkey-1, exec-only-pkey
-- 
2.17.1
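
For readers following the patch above: a minimal sketch of why the
#ifdef guards can be dropped. It relies only on standard C preprocessor
behaviour, where #undef of a name that is not defined is ignored. The
"system header" value of 0x1 is simulated here, and
check_pkey_overrides() is a hypothetical helper, not part of the
selftest.

```c
#include <assert.h>

/* Simulate a system header that defines one macro (with a different
 * value) and leaves the other one undefined. */
#define PKEY_DISABLE_ACCESS 0x1

/* Unconditional override, in the style of the patch. The #undef of
 * PKEY_DISABLE_WRITE is legal even though nothing defined it. */
#undef PKEY_DISABLE_ACCESS
#define PKEY_DISABLE_ACCESS 0x3  /* disable read and write */

#undef PKEY_DISABLE_WRITE
#define PKEY_DISABLE_WRITE 0x2

/* Hypothetical helper: aborts if either override did not take effect. */
static void check_pkey_overrides(void)
{
	assert(PKEY_DISABLE_ACCESS == 0x3);
	assert(PKEY_DISABLE_WRITE == 0x2);
}
```

Either way the macros end up with the powerpc values, whether or not
the headers supplied a definition first.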



[PATCH 0/2] selftests: vm: pkeys: Some powerpc fixes

2020-05-08 Thread Sandipan Das
Some fixes for the powerpc bits w.r.t. the way the pkey
access rights are defined and how the permission register
is updated.

Sandipan Das (2):
  selftests: vm: pkeys: Fix powerpc access right definitions
  selftests: vm: pkeys: Fix powerpc access right updates

 tools/testing/selftests/vm/pkey-powerpc.h | 11 ---
 1 file changed, 4 insertions(+), 7 deletions(-)

-- 
2.17.1



Re: [PATCH v4 02/14] arm: add support for folded p4d page tables

2020-05-08 Thread Mike Rapoport
On Fri, May 08, 2020 at 08:53:27AM +0200, Marek Szyprowski wrote:
> Hi Mike,
> 
> On 07.05.2020 18:11, Mike Rapoport wrote:
> > On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
> >> On 14.04.2020 17:34, Mike Rapoport wrote:
> >>> From: Mike Rapoport 
> >>>
> >>> Implement primitives necessary for the 4th level folding, add walks of p4d
> >>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
> >>>
> >>> Signed-off-by: Mike Rapoport 
> >> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between
> >> current linux-next and v5.7-rc1 pointed to this commit. I've tested this
> >> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
> >>
> >> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
> >> memory_range[0]:0x4000..0xbe9f
> >> memory_range[0]:0x4000..0xbe9f
> >> # kexec -e
> >> kexec_core: Starting new kernel
> >> 8<--- cut here ---
> >> Unable to handle kernel paging request at virtual address c010f1f4
> >> pgd = c6817793
> >> [c010f1f4] *pgd=441e(bad)
> >> Internal error: Oops: 80d [#1] PREEMPT ARM
> >> Modules linked in:
> >> CPU: 0 PID: 1329 Comm: kexec Tainted: G    W
> >> 5.7.0-rc3-00127-g6cba81ed0f62 #611
> >> Hardware name: Samsung Exynos (Flattened Device Tree)
> >> PC is at machine_kexec+0x40/0xfc
> > Any chance you have the debug info in this kernel?
> > scripts/faddr2line would come in handy here.
> 
> # ./scripts/faddr2line --list vmlinux machine_kexec+0x40
> machine_kexec+0x40/0xf8:
> 
> machine_kexec at arch/arm/kernel/machine_kexec.c:182
>   177    reboot_code_buffer = 
> page_address(image->control_code_page);
>   178
>   179    /* Prepare parameters for reboot_code_buffer*/
>   180    set_kernel_text_rw();
>   181    kexec_start_address = image->start;
>  >182<   kexec_indirection_page = page_list;
>   183    kexec_mach_type = machine_arch_type;
>   184    kexec_boot_atags = image->arch.kernel_r2;
>   185
>   186    /* copy our kernel relocation code to the control code 
> page */
>   187    reboot_entry = fncpy(reboot_code_buffer,

Can you please try the patch below:

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 963b5284d284..f86b3d17928e 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -571,7 +571,7 @@ static inline void section_update(unsigned long addr, 
pmdval_t mask,
 {
pmd_t *pmd;
 
-   pmd = pmd_off_k(addr);
+   pmd = pmd_offset(pud_offset(p4d_offset(pgd_offset(mm, addr), addr), 
addr), addr);
 
 #ifdef CONFIG_ARM_LPAE
pmd[0] = __pmd((pmd_val(pmd[0]) & mask) | prot);

>  > ...
> 
> Best regards
> -- 
> Marek Szyprowski, PhD
> Samsung R&D Institute Poland
> 

-- 
Sincerely yours,
Mike.
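
The visible change in the diff above is that the walk starts from
pgd_offset(mm, addr) rather than going through pmd_off_k(). A toy model
of that difference follows, assuming the _k helper always starts from
the kernel's own (init_mm) tables. All toy_* names and the table layout
are hypothetical stand-ins, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins, NOT the kernel's types: one flat table per mm. */
#define TOY_SHIFT   21
#define TOY_ENTRIES 8

struct toy_mm {
	unsigned long pmd[TOY_ENTRIES];
};

static struct toy_mm toy_init_mm;	/* plays the role of init_mm */

/* With the upper levels folded, the walk reduces to a single index. */
static unsigned long *toy_pmd_off(struct toy_mm *mm, unsigned long addr)
{
	return &mm->pmd[(addr >> TOY_SHIFT) % TOY_ENTRIES];
}

/* "_k" variant: always starts from the kernel's own table. */
static unsigned long *toy_pmd_off_k(unsigned long addr)
{
	return toy_pmd_off(&toy_init_mm, addr);
}

/* Update the entry covering addr in the table that belongs to mm. */
static void toy_section_update(struct toy_mm *mm, unsigned long addr,
			       unsigned long mask, unsigned long prot)
{
	unsigned long *pmd;

	assert(mm != NULL);
	pmd = toy_pmd_off(mm, addr);
	*pmd = (*pmd & mask) | prot;
}
```

In this model an update made through the mm that was passed in is not
visible through the _k lookup, which is why the two walks are not
interchangeable when mm is not the kernel's.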


Re: ioremap() called early from pnv_pci_init_ioda_phb()

2020-05-08 Thread Qian Cai



> On May 8, 2020, at 10:39 AM, Qian Cai  wrote:
> 
> Booting POWER9 PowerNV has this message,
> 
> "ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use 
> early_ioremap() instead"
> 
> but using the patch below will result in leaks, because nothing ever calls
> early_iounmap(). However, it looks to me as if the phb->regs mapping is
> meant by design to stay around forever, since it is used in
> pnv_ioda_get_inval_reg(). So is the check_early_ioremap_leak() initcall
> just too strict?
> 
> --- a/arch/powerpc/platforms/powernv/pci-ioda.c
> +++ b/arch/powerpc/platforms/powernv/pci-ioda.c
> @@ -36,6 +36,7 @@
> #include 
> #include 
> #include 
> +#include 
> 
> #include 
> 
> @@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct 
> device_node *np,
>/* Get registers */
>if (!of_address_to_resource(np, 0, )) {
>phb->regs_phys = r.start;
> -   phb->regs = ioremap(r.start, resource_size());
> +   phb->regs = early_ioremap(r.start, resource_size());
>if (phb->regs == NULL)
>pr_err("  Failed to map registers !\n”);

This will also trigger a panic on debugfs reads, so isn't this commit bogus,
at least for powerpc64?

d538aadc2718 ("powerpc/ioremap: warn on early use of ioremap()")

[11017.617022][T122068] Faulting instruction address: 0xc00db564
[11017.617257][T122066] Faulting instruction address: 0xc00db564
[11017.617950][T122073] Faulting instruction address: 0xc00db564
[11017.61][T122064] BUG: Unable to handle kernel data access on read at 
0xffe20e10
[11017.618935][T122064] Faulting instruction address: 0xc00db564
[11017.737996][T122072] 
[11017.738010][T122073] Oops: Kernel access of bad area, sig: 11 [#2]
[11017.738024][T122073] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 
DEBUG_PAGEALLOC NUMA PowerNV
[11017.738051][T122073] Modules linked in: brd ext4 crc16 mbcache jbd2 loop 
kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata 
libphy firmware_class dm_mirror dm_region_hash dm_log dm_mod
[11017.738110][T122073] CPU: 108 PID: 122073 Comm: read_all Tainted: G  D W 
5.7.0-rc4-next-20200508+ #4
[11017.738138][T122073] NIP:  c00db564 LR: c056f660 CTR: 
c00db550
[11017.738173][T122073] REGS: c00374f6f980 TRAP: 0380   Tainted: G  D W 
 (5.7.0-rc4-next-20200508+)
[11017.738234][T122073] MSR:  90009033   CR: 
22002282  XER: 2004
[11017.738278][T122073] CFAR: c056f65c IRQMASK: 0 
[11017.738278][T122073] GPR00: c056f660 c00374f6fc10 
c1689400 c000201ffc41aa00 
[11017.738278][T122073] GPR04: c00374f6fc70  
 0001 
[11017.738278][T122073] GPR08:  ffe2 
 c008ee380080 
[11017.738278][T122073] GPR12: c00db550 c000201fff671280 
  
[11017.738278][T122073] GPR16: 0002 10040800 
1001ccd8 1001cc80 
[11017.738278][T122073] GPR20: 1001cc98 1001ccc8 
1001cca8 1001cb48 
[11017.738278][T122073] GPR24:   
03ff 7fffebb67390 
[11017.738278][T122073] GPR28: c00374f6fd90 c000200c0c6a7550 
 c000200c0c6a7500 
[11017.738542][T122073] NIP [c00db564] pnv_eeh_dbgfs_get_inbB+0x14/0x30
[11017.738579][T122073] LR [c056f660] simple_attr_read+0xa0/0x180
[11017.738613][T122073] Call Trace:
[11017.738645][T122073] [c00374f6fc10] [c056f630] 
simple_attr_read+0x70/0x180 (unreliable)
[11017.738672][T122073] [c00374f6fcb0] [c064a2e0] 
full_proxy_read+0x90/0xe0
[11017.738686][T122073] [c00374f6fd00] [c051fe0c] 
__vfs_read+0x3c/0x70
[11017.738722][T122073] [c00374f6fd20] [c051feec] 
vfs_read+0xac/0x170
[11017.738757][T122073] [c00374f6fd70] [c052034c] 
ksys_read+0x7c/0x140
[11017.738818][T122073] [c00374f6fdc0] [c0038af4] 
system_call_exception+0x114/0x1e0
[11017.738867][T122073] [c00374f6fe20] [c000c8f0] 
system_call_common+0xf0/0x278
[11017.738916][T122073] Instruction dump:
[11017.738948][T122073] 7c0004ac f9490d10 a14d0c78 3860 b14d0c7a 4e800020 
6000 7c0802a6 
[11017.739001][T122073] 6000 e9230278 e9290028 7c0004ac  0c09 
4c00012c 3860 
[11017.739052][T122073] ---[ end trace f68728a0d3053b5e ]---
[11017.828156][T122073] 
[11017.828170][T122068] Oops: Kernel access of bad area, sig: 11 [#3]
[11017.828184][T122068] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=256 
DEBUG_PAGEALLOC NUMA PowerNV
[11017.828209][T122068] Modules linked in: brd ext4 crc16 mbcache jbd2 loop 
kvm_hv kvm ip_tables x_tables xfs sd_mod bnx2x ahci libahci tg3 mdio libata 
libphy firmwa

Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Steven Rostedt
On Fri, 8 May 2020 18:09:35 +0200
Borislav Petkov  wrote:

> On Fri, May 08, 2020 at 05:30:31PM +0530, Vaibhav Jain wrote:
> > I am referring to Kernel Loadable Modules with MODULE_LICENSE("GPL")
> > here.  
> 
> And what does "external" refer to? Because if it is out-of-tree, we
> don't export symbols for out-of-tree modules.

I've always wondered about this. Why not?

-- Steve


[PATCH] selftests: powerpc: Add test for execute-disabled pkeys

2020-05-08 Thread Sandipan Das
Apart from read and write access, memory protection keys can
also be used for restricting execute permission of pages on
powerpc. This adds a test to verify if the feature works as
expected.

Signed-off-by: Sandipan Das 
---
 tools/testing/selftests/powerpc/mm/Makefile   |   3 +-
 .../selftests/powerpc/mm/pkey_exec_prot.c | 326 ++
 2 files changed, 328 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/powerpc/mm/pkey_exec_prot.c

diff --git a/tools/testing/selftests/powerpc/mm/Makefile 
b/tools/testing/selftests/powerpc/mm/Makefile
index b9103c4bb414..2816229f648b 100644
--- a/tools/testing/selftests/powerpc/mm/Makefile
+++ b/tools/testing/selftests/powerpc/mm/Makefile
@@ -3,7 +3,7 @@ noarg:
$(MAKE) -C ../
 
 TEST_GEN_PROGS := hugetlb_vs_thp_test subpage_prot prot_sao segv_errors 
wild_bctr \
- large_vm_fork_separation bad_accesses
+ large_vm_fork_separation bad_accesses pkey_exec_prot
 TEST_GEN_PROGS_EXTENDED := tlbie_test
 TEST_GEN_FILES := tempfile
 
@@ -17,6 +17,7 @@ $(OUTPUT)/prot_sao: ../utils.c
 $(OUTPUT)/wild_bctr: CFLAGS += -m64
 $(OUTPUT)/large_vm_fork_separation: CFLAGS += -m64
 $(OUTPUT)/bad_accesses: CFLAGS += -m64
+$(OUTPUT)/pkey_exec_prot: CFLAGS += -m64
 
 $(OUTPUT)/tempfile:
dd if=/dev/zero of=$@ bs=64k count=1
diff --git a/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c 
b/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
new file mode 100644
index ..b346ad205e68
--- /dev/null
+++ b/tools/testing/selftests/powerpc/mm/pkey_exec_prot.c
@@ -0,0 +1,326 @@
+// SPDX-License-Identifier: GPL-2.0+
+
+/*
+ * Copyright 2020, Sandipan Das, IBM Corp.
+ *
+ * Test if applying execute protection on pages using memory
+ * protection keys works as expected.
+ */
+
+#define _GNU_SOURCE
+#include 
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+
+#include "utils.h"
+
+/* Override definitions as they might be inconsistent */
+#undef PKEY_DISABLE_ACCESS
+#define PKEY_DISABLE_ACCESS0x3
+
+#undef PKEY_DISABLE_WRITE
+#define PKEY_DISABLE_WRITE 0x2
+
+#undef PKEY_DISABLE_EXECUTE
+#define PKEY_DISABLE_EXECUTE   0x4
+
+/* Older distros might not define this */
+#ifndef SEGV_PKUERR
+#define SEGV_PKUERR4
+#endif
+
+#define SYS_pkey_mprotect  386
+#define SYS_pkey_alloc 384
+#define SYS_pkey_free  385
+
+#define PKEY_BITS_PER_PKEY 2
+#define NR_PKEYS   32
+
+#define PKEY_BITS_MASK ((1UL << PKEY_BITS_PER_PKEY) - 1)
+
+static unsigned long pkeyreg_get(void)
+{
+   unsigned long uamr;
+
+   asm volatile("mfspr %0, 0xd" : "=r"(uamr));
+   return uamr;
+}
+
+static void pkeyreg_set(unsigned long uamr)
+{
+   asm volatile("isync; mtspr  0xd, %0; isync;" : : "r"(uamr));
+}
+
+static void pkey_set_rights(int pkey, unsigned long rights)
+{
+   unsigned long uamr, shift;
+
+   shift = (NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
+   uamr = pkeyreg_get();
+   uamr &= ~(PKEY_BITS_MASK << shift);
+   uamr |= (rights & PKEY_BITS_MASK) << shift;
+   pkeyreg_set(uamr);
+}
+
+static int sys_pkey_mprotect(void *addr, size_t len, int prot, int pkey)
+{
+   return syscall(SYS_pkey_mprotect, addr, len, prot, pkey);
+}
+
+static int sys_pkey_alloc(unsigned long flags, unsigned long rights)
+{
+   return syscall(SYS_pkey_alloc, flags, rights);
+}
+
+static int sys_pkey_free(int pkey)
+{
+   return syscall(SYS_pkey_free, pkey);
+}
+
+static volatile int fpkey, fcode, ftype, faults;
+static unsigned long pgsize, numinsns;
+static volatile unsigned int *faddr;
+static unsigned int *insns;
+
+static void segv_handler(int signum, siginfo_t *sinfo, void *ctx)
+{
+   /* Check if this fault originated because of the expected reasons */
+   if (sinfo->si_code != SEGV_ACCERR && sinfo->si_code != SEGV_PKUERR) {
+   printf("got an unexpected fault, code = %d\n",
+  sinfo->si_code);
+   goto fail;
+   }
+
+   /* Check if this fault originated from the expected address */
+   if (sinfo->si_addr != (void *) faddr) {
+   printf("got an unexpected fault, addr = %p\n",
+  sinfo->si_addr);
+   goto fail;
+   }
+
+   /* Check if the expected number of faults has been exceeded */
+   if (faults == 0)
+   goto fail;
+
+   fcode = sinfo->si_code;
+
+   /* Restore permissions in order to continue */
+   switch (fcode) {
+   case SEGV_ACCERR:
+   if (mprotect(insns, pgsize, PROT_READ | PROT_WRITE)) {
+   perror("mprotect");
+   goto fail;
+   }
+   break;
+   case SEGV_PKUERR:
+   if (sinfo->si_pkey != fpkey)
+   goto fail;
+
+   if (ftype == PKEY_DISABLE_ACCESS) {
+   pkey_set_rights(fpkey, 0);
+   } else if (ftype == 
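
The field arithmetic in pkey_set_rights() above can be tried without
touching the register. This is a sketch only: it assumes a 64-bit
register image (uint64_t in place of the patch's unsigned long), and
the helper names pkey_shift() and amr_with_rights() are hypothetical.
The real write would still go through the isync-wrapped mtspr.

```c
#include <assert.h>
#include <stdint.h>

#define PKEY_BITS_PER_PKEY 2
#define NR_PKEYS           32
#define PKEY_BITS_MASK     ((UINT64_C(1) << PKEY_BITS_PER_PKEY) - 1)

/* Bit offset of a key's 2-bit field; key 0 occupies the top two bits. */
static unsigned int pkey_shift(int pkey)
{
	assert(pkey >= 0 && pkey < NR_PKEYS);
	return (unsigned int)(NR_PKEYS - pkey - 1) * PKEY_BITS_PER_PKEY;
}

/* Return amr with the field for pkey replaced by the low two bits of
 * rights. Higher bits of rights are dropped by the mask, exactly as in
 * the patch's pkey_set_rights(). */
static uint64_t amr_with_rights(uint64_t amr, int pkey, uint64_t rights)
{
	unsigned int shift = pkey_shift(pkey);

	amr &= ~(PKEY_BITS_MASK << shift);
	amr |= (rights & PKEY_BITS_MASK) << shift;
	return amr;
}
```

For example, amr_with_rights(0, 0, 0x3) sets only the top two bits,
while key 31 maps to the bottom two.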

[PATCH 15/15] nvdimm/pmem: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/nvdimm/pmem.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/pmem.c b/drivers/nvdimm/pmem.c
index 2df6994acf836..f8dc5941215bf 100644
--- a/drivers/nvdimm/pmem.c
+++ b/drivers/nvdimm/pmem.c
@@ -196,7 +196,7 @@ static blk_qc_t pmem_make_request(struct request_queue *q, 
struct bio *bio)
unsigned long start;
struct bio_vec bvec;
struct bvec_iter iter;
-   struct pmem_device *pmem = q->queuedata;
+   struct pmem_device *pmem = bio->bi_disk->private_data;
struct nd_region *nd_region = to_region(pmem);
 
if (bio->bi_opf & REQ_PREFLUSH)
@@ -231,7 +231,7 @@ static blk_qc_t pmem_make_request(struct request_queue *q, 
struct bio *bio)
 static int pmem_rw_page(struct block_device *bdev, sector_t sector,
   struct page *page, unsigned int op)
 {
-   struct pmem_device *pmem = bdev->bd_queue->queuedata;
+   struct pmem_device *pmem = bdev->bd_disk->private_data;
blk_status_t rc;
 
if (op_is_write(op))
@@ -464,7 +464,6 @@ static int pmem_attach_disk(struct device *dev,
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
if (pmem->pfn_flags & PFN_MAP)
blk_queue_flag_set(QUEUE_FLAG_DAX, q);
-   q->queuedata = pmem;
 
disk = alloc_disk_node(0, nid);
if (!disk)
@@ -474,6 +473,7 @@ static int pmem_attach_disk(struct device *dev,
disk->fops  = _fops;
disk->queue = q;
disk->flags = GENHD_FL_EXT_DEVT;
+   disk->private_data  = pmem;
disk->queue->backing_dev_info->capabilities |= BDI_CAP_SYNCHRONOUS_IO;
nvdimm_namespace_disk_name(ndns, disk->disk_name);
set_capacity(disk, (pmem->size - pmem->pfn_pad - pmem->data_offset)
-- 
2.26.2



[PATCH 13/15] nvdimm/blk: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/nvdimm/blk.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/nvdimm/blk.c b/drivers/nvdimm/blk.c
index 43751fab9d36a..ffe4728bad8b1 100644
--- a/drivers/nvdimm/blk.c
+++ b/drivers/nvdimm/blk.c
@@ -165,7 +165,7 @@ static int nsblk_do_bvec(struct nd_namespace_blk *nsblk,
 static blk_qc_t nd_blk_make_request(struct request_queue *q, struct bio *bio)
 {
struct bio_integrity_payload *bip;
-   struct nd_namespace_blk *nsblk;
+   struct nd_namespace_blk *nsblk = bio->bi_disk->private_data;
struct bvec_iter iter;
unsigned long start;
struct bio_vec bvec;
@@ -176,7 +176,6 @@ static blk_qc_t nd_blk_make_request(struct request_queue 
*q, struct bio *bio)
return BLK_QC_T_NONE;
 
bip = bio_integrity(bio);
-   nsblk = q->queuedata;
rw = bio_data_dir(bio);
do_acct = nd_iostat_start(bio, );
bio_for_each_segment(bvec, bio, iter) {
@@ -258,7 +257,6 @@ static int nsblk_attach_disk(struct nd_namespace_blk *nsblk)
blk_queue_max_hw_sectors(q, UINT_MAX);
blk_queue_logical_block_size(q, nsblk_sector_size(nsblk));
blk_queue_flag_set(QUEUE_FLAG_NONROT, q);
-   q->queuedata = nsblk;
 
disk = alloc_disk(0);
if (!disk)
@@ -268,6 +266,7 @@ static int nsblk_attach_disk(struct nd_namespace_blk *nsblk)
disk->fops  = _blk_fops;
disk->queue = q;
disk->flags = GENHD_FL_EXT_DEVT;
+   disk->private_data  = nsblk;
nvdimm_namespace_disk_name(>common, disk->disk_name);
 
if (devm_add_action_or_reset(dev, nd_blk_release_disk, disk))
-- 
2.26.2



[PATCH 14/15] nvdimm/btt: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/nvdimm/btt.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/nvdimm/btt.c b/drivers/nvdimm/btt.c
index 3b09419218d6f..eca3e48fefda1 100644
--- a/drivers/nvdimm/btt.c
+++ b/drivers/nvdimm/btt.c
@@ -1442,7 +1442,7 @@ static int btt_do_bvec(struct btt *btt, struct 
bio_integrity_payload *bip,
 static blk_qc_t btt_make_request(struct request_queue *q, struct bio *bio)
 {
struct bio_integrity_payload *bip = bio_integrity(bio);
-   struct btt *btt = q->queuedata;
+   struct btt *btt = bio->bi_disk->private_data;
struct bvec_iter iter;
unsigned long start;
struct bio_vec bvec;
@@ -1543,7 +1543,6 @@ static int btt_blk_init(struct btt *btt)
blk_queue_logical_block_size(btt->btt_queue, btt->sector_size);
blk_queue_max_hw_sectors(btt->btt_queue, UINT_MAX);
blk_queue_flag_set(QUEUE_FLAG_NONROT, btt->btt_queue);
-   btt->btt_queue->queuedata = btt;
 
if (btt_meta_size(btt)) {
int rc = nd_integrity_init(btt->btt_disk, btt_meta_size(btt));
-- 
2.26.2



[PATCH 12/15] md: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/md/md.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/md/md.c b/drivers/md/md.c
index 271e8a5873549..c079ecf77c564 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -466,7 +466,7 @@ static blk_qc_t md_make_request(struct request_queue *q, 
struct bio *bio)
 {
const int rw = bio_data_dir(bio);
const int sgrp = op_stat_group(bio_op(bio));
-   struct mddev *mddev = q->queuedata;
+   struct mddev *mddev = bio->bi_disk->private_data;
unsigned int sectors;
 
if (unlikely(test_bit(MD_BROKEN, >flags)) && (rw == WRITE)) {
@@ -5626,7 +5626,6 @@ static int md_alloc(dev_t dev, char *name)
mddev->queue = blk_alloc_queue(md_make_request, NUMA_NO_NODE);
if (!mddev->queue)
goto abort;
-   mddev->queue->queuedata = mddev;
 
blk_set_stacking_limits(>queue->limits);
 
-- 
2.26.2



[PATCH 11/15] dm: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/md/dm.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 0eb93da44ea2a..2aaae6c1ed312 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1783,7 +1783,7 @@ static blk_qc_t dm_process_bio(struct mapped_device *md,
 
 static blk_qc_t dm_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct mapped_device *md = q->queuedata;
+   struct mapped_device *md = bio->bi_disk->private_data;
blk_qc_t ret = BLK_QC_T_NONE;
int srcu_idx;
struct dm_table *map;
@@ -1980,7 +1980,6 @@ static struct mapped_device *alloc_dev(int minor)
md->queue = blk_alloc_queue(dm_make_request, numa_node_id);
if (!md->queue)
goto bad;
-   md->queue->queuedata = md;
 
md->disk = alloc_disk_node(1, md->numa_node_id);
if (!md->disk)
-- 
2.26.2



[PATCH 10/15] bcache: stop setting ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/md/bcache/super.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index d98354fa28e3e..a0fb5af2beeda 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -871,7 +871,6 @@ static int bcache_device_init(struct bcache_device *d, 
unsigned int block_size,
return -ENOMEM;
 
d->disk->queue  = q;
-   q->queuedata= d;
q->backing_dev_info->congested_data = d;
q->limits.max_hw_sectors= UINT_MAX;
q->limits.max_sectors   = UINT_MAX;
-- 
2.26.2



[PATCH 09/15] lightnvm: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/lightnvm/core.c  | 1 -
 drivers/lightnvm/pblk-init.c | 2 +-
 drivers/lightnvm/pblk.h  | 2 +-
 3 files changed, 2 insertions(+), 3 deletions(-)

diff --git a/drivers/lightnvm/core.c b/drivers/lightnvm/core.c
index db38a68abb6c0..85c5490cdfd2e 100644
--- a/drivers/lightnvm/core.c
+++ b/drivers/lightnvm/core.c
@@ -400,7 +400,6 @@ static int nvm_create_tgt(struct nvm_dev *dev, struct 
nvm_ioctl_create *create)
}
 
tdisk->private_data = targetdata;
-   tqueue->queuedata = targetdata;
 
mdts = (dev->geo.csecs >> 9) * NVM_MAX_VLBA;
if (dev->geo.mdts) {
diff --git a/drivers/lightnvm/pblk-init.c b/drivers/lightnvm/pblk-init.c
index 9a967a2e83dd7..bec904ec0f7c0 100644
--- a/drivers/lightnvm/pblk-init.c
+++ b/drivers/lightnvm/pblk-init.c
@@ -49,7 +49,7 @@ struct bio_set pblk_bio_set;
 
 static blk_qc_t pblk_make_rq(struct request_queue *q, struct bio *bio)
 {
-   struct pblk *pblk = q->queuedata;
+   struct pblk *pblk = bio->bi_disk->private_data;
 
if (bio_op(bio) == REQ_OP_DISCARD) {
pblk_discard(pblk, bio);
diff --git a/drivers/lightnvm/pblk.h b/drivers/lightnvm/pblk.h
index 86ffa875bfe16..ed364afaed0d8 100644
--- a/drivers/lightnvm/pblk.h
+++ b/drivers/lightnvm/pblk.h
@@ -1255,7 +1255,7 @@ static inline int pblk_boundary_ppa_checks(struct 
nvm_tgt_dev *tgt_dev,
continue;
}
 
-   print_ppa(tgt_dev->q->queuedata, ppa, "boundary", i);
+   print_ppa(tgt_dev->disk->private_data, ppa, "boundary", i);
 
return 1;
}
-- 
2.26.2



[PATCH 01/15] nfblock: use gendisk private_data

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 arch/m68k/emu/nfblock.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/m68k/emu/nfblock.c b/arch/m68k/emu/nfblock.c
index c3a630440512e..5c6f04b00e866 100644
--- a/arch/m68k/emu/nfblock.c
+++ b/arch/m68k/emu/nfblock.c
@@ -61,7 +61,7 @@ struct nfhd_device {
 
 static blk_qc_t nfhd_make_request(struct request_queue *queue, struct bio *bio)
 {
-   struct nfhd_device *dev = queue->queuedata;
+   struct nfhd_device *dev = bio->bi_disk->private_data;
struct bio_vec bvec;
struct bvec_iter iter;
int dir, len, shift;
@@ -122,7 +122,6 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 
bsize)
if (dev->queue == NULL)
goto free_dev;
 
-   dev->queue->queuedata = dev;
blk_queue_logical_block_size(dev->queue, bsize);
 
dev->disk = alloc_disk(16);
@@ -136,6 +135,7 @@ static int __init nfhd_init_one(int id, u32 blocks, u32 
bsize)
sprintf(dev->disk->disk_name, "nfhd%u", dev_id);
set_capacity(dev->disk, (sector_t)blocks * (bsize / 512));
dev->disk->queue = dev->queue;
+   dev->disk->private_data = dev;
 
add_disk(dev->disk);
 
-- 
2.26.2



[PATCH 04/15] null_blk: stop using ->queuedata for bio mode

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/null_blk_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/block/null_blk_main.c b/drivers/block/null_blk_main.c
index 8efd8778e2095..d14df79feca89 100644
--- a/drivers/block/null_blk_main.c
+++ b/drivers/block/null_blk_main.c
@@ -1365,7 +1365,7 @@ static blk_qc_t null_queue_bio(struct request_queue *q, 
struct bio *bio)
 {
sector_t sector = bio->bi_iter.bi_sector;
sector_t nr_sectors = bio_sectors(bio);
-   struct nullb *nullb = q->queuedata;
+   struct nullb *nullb = bio->bi_disk->private_data;
struct nullb_queue *nq = nullb_to_queue(nullb);
struct nullb_cmd *cmd;
 
-- 
2.26.2



[PATCH 07/15] umem: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/umem.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/umem.c b/drivers/block/umem.c
index d84e8a878df24..e59bff24e02cf 100644
--- a/drivers/block/umem.c
+++ b/drivers/block/umem.c
@@ -521,7 +521,8 @@ static int mm_check_plugged(struct cardinfo *card)
 
 static blk_qc_t mm_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct cardinfo *card = q->queuedata;
+   struct cardinfo *card = bio->bi_disk->private_data;
+
pr_debug("mm_make_request %llu %u\n",
 (unsigned long long)bio->bi_iter.bi_sector,
 bio->bi_iter.bi_size);
@@ -888,7 +889,6 @@ static int mm_pci_probe(struct pci_dev *dev, const struct 
pci_device_id *id)
card->queue = blk_alloc_queue(mm_make_request, NUMA_NO_NODE);
if (!card->queue)
goto failed_alloc;
-   card->queue->queuedata = card;
 
tasklet_init(>tasklet, process_page, (unsigned long)card);
 
-- 
2.26.2



remove a few uses of ->queuedata

2020-05-08 Thread Christoph Hellwig
Hi all,

various bio based drivers use queue->queuedata despite already having
set up disk->private_data, which can be used just as easily.  This
series cleans them up to only use a single private data pointer.

blk-mq based drivers that have code paths that can't easily get at
the gendisk are unaffected by this series.
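
As a rough illustration of the pattern the series applies, here is a
toy model of the lookup in a bio based make_request function. The
toy_* structs are simplified stand-ins and not the block layer
definitions. Only the pointers the patches touch (queuedata,
private_data, bi_disk) are modelled.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the block-layer objects, NOT the kernel's. */
struct toy_queue { void *queuedata; };
struct toy_disk  { struct toy_queue *queue; void *private_data; };
struct toy_bio   { struct toy_disk *bi_disk; };

/* Hypothetical driver state owning one queue and one disk. */
struct toy_dev {
	int id;
	struct toy_queue queue;
	struct toy_disk disk;
};

/* Setup after the series: only the disk carries the driver pointer. */
static void toy_attach(struct toy_dev *dev)
{
	dev->queue.queuedata = NULL;	/* no longer set */
	dev->disk.queue = &dev->queue;
	dev->disk.private_data = dev;
}

/* make_request-style lookup: via the bio's disk, not the queue. */
static struct toy_dev *toy_dev_from_bio(const struct toy_bio *bio)
{
	assert(bio != NULL && bio->bi_disk != NULL);
	return bio->bi_disk->private_data;
}
```

The driver reaches the same object either way. The difference is that
only one private pointer has to be kept in sync.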


[PATCH 03/15] drbd: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/drbd/drbd_main.c | 1 -
 drivers/block/drbd/drbd_req.c  | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c
index c094c3c2c5d4d..be787b268f044 100644
--- a/drivers/block/drbd/drbd_main.c
+++ b/drivers/block/drbd/drbd_main.c
@@ -2805,7 +2805,6 @@ enum drbd_ret_code drbd_create_device(struct 
drbd_config_context *adm_ctx, unsig
if (!q)
goto out_no_q;
device->rq_queue = q;
-   q->queuedata   = device;
 
disk = alloc_disk(1);
if (!disk)
diff --git a/drivers/block/drbd/drbd_req.c b/drivers/block/drbd/drbd_req.c
index 840c3aef3c5c9..02c104a0c45e0 100644
--- a/drivers/block/drbd/drbd_req.c
+++ b/drivers/block/drbd/drbd_req.c
@@ -1614,7 +1614,7 @@ void do_submit(struct work_struct *ws)
 
 blk_qc_t drbd_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct drbd_device *device = (struct drbd_device *) q->queuedata;
+   struct drbd_device *device = bio->bi_disk->private_data;
unsigned long start_jif;
 
blk_queue_split(q, );
-- 
2.26.2



[PATCH 05/15] ps3vram: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/ps3vram.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/ps3vram.c b/drivers/block/ps3vram.c
index 821d4d8b1d763..5a1d1d137c724 100644
--- a/drivers/block/ps3vram.c
+++ b/drivers/block/ps3vram.c
@@ -587,7 +587,7 @@ static struct bio *ps3vram_do_bio(struct 
ps3_system_bus_device *dev,
 
 static blk_qc_t ps3vram_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct ps3_system_bus_device *dev = q->queuedata;
+   struct ps3_system_bus_device *dev = bio->bi_disk->private_data;
struct ps3vram_priv *priv = ps3_system_bus_get_drvdata(dev);
int busy;
 
@@ -745,7 +745,6 @@ static int ps3vram_probe(struct ps3_system_bus_device *dev)
}
 
priv->queue = queue;
-   queue->queuedata = dev;
blk_queue_max_segments(queue, BLK_MAX_SEGMENTS);
blk_queue_max_segment_size(queue, BLK_MAX_SEGMENT_SIZE);
blk_queue_max_hw_sectors(queue, BLK_SAFE_MAX_SECTORS);
-- 
2.26.2



[PATCH 08/15] zram: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/zram/zram_drv.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/block/zram/zram_drv.c b/drivers/block/zram/zram_drv.c
index ebb234f36909c..e1a6c74c7a4ba 100644
--- a/drivers/block/zram/zram_drv.c
+++ b/drivers/block/zram/zram_drv.c
@@ -1593,7 +1593,7 @@ static void __zram_make_request(struct zram *zram, struct 
bio *bio)
  */
 static blk_qc_t zram_make_request(struct request_queue *queue, struct bio *bio)
 {
-   struct zram *zram = queue->queuedata;
+   struct zram *zram = bio->bi_disk->private_data;
 
if (!valid_io_request(zram, bio->bi_iter.bi_sector,
bio->bi_iter.bi_size)) {
@@ -1916,7 +1916,6 @@ static int zram_add(void)
zram->disk->first_minor = device_id;
zram->disk->fops = _devops;
zram->disk->queue = queue;
-   zram->disk->queue->queuedata = zram;
zram->disk->private_data = zram;
snprintf(zram->disk->disk_name, 16, "zram%d", device_id);
 
-- 
2.26.2



[PATCH 06/15] rsxx: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 drivers/block/rsxx/dev.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/block/rsxx/dev.c b/drivers/block/rsxx/dev.c
index 8ffa8260dcafe..6dde80b096c62 100644
--- a/drivers/block/rsxx/dev.c
+++ b/drivers/block/rsxx/dev.c
@@ -133,7 +133,7 @@ static void bio_dma_done_cb(struct rsxx_cardinfo *card,
 
 static blk_qc_t rsxx_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct rsxx_cardinfo *card = q->queuedata;
+   struct rsxx_cardinfo *card = bio->bi_disk->private_data;
struct rsxx_bio_meta *bio_meta;
blk_status_t st = BLK_STS_IOERR;
 
@@ -282,8 +282,6 @@ int rsxx_setup_dev(struct rsxx_cardinfo *card)
card->queue->limits.discard_alignment   = RSXX_HW_BLK_SIZE;
}
 
-   card->queue->queuedata = card;
-
snprintf(card->gendisk->disk_name, sizeof(card->gendisk->disk_name),
 "rsxx%d", card->disk_id);
card->gendisk->major = card->major;
@@ -304,7 +302,6 @@ void rsxx_destroy_dev(struct rsxx_cardinfo *card)
card->gendisk = NULL;
 
blk_cleanup_queue(card->queue);
-   card->queue->queuedata = NULL;
unregister_blkdev(card->major, DRIVER_NAME);
 }
 
-- 
2.26.2



[PATCH 02/15] simdisk: stop using ->queuedata

2020-05-08 Thread Christoph Hellwig
Signed-off-by: Christoph Hellwig 
---
 arch/xtensa/platforms/iss/simdisk.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/xtensa/platforms/iss/simdisk.c 
b/arch/xtensa/platforms/iss/simdisk.c
index 49322b66cda93..31b5020077a05 100644
--- a/arch/xtensa/platforms/iss/simdisk.c
+++ b/arch/xtensa/platforms/iss/simdisk.c
@@ -103,7 +103,7 @@ static void simdisk_transfer(struct simdisk *dev, unsigned 
long sector,
 
 static blk_qc_t simdisk_make_request(struct request_queue *q, struct bio *bio)
 {
-   struct simdisk *dev = q->queuedata;
+   struct simdisk *dev = bio->bi_disk->private_data;
struct bio_vec bvec;
struct bvec_iter iter;
sector_t sector = bio->bi_iter.bi_sector;
@@ -273,8 +273,6 @@ static int __init simdisk_setup(struct simdisk *dev, int 
which,
goto out_alloc_queue;
}
 
-   dev->queue->queuedata = dev;
-
dev->gd = alloc_disk(SIMDISK_MINORS);
if (dev->gd == NULL) {
pr_err("alloc_disk failed\n");
-- 
2.26.2



Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Borislav Petkov
On Fri, May 08, 2020 at 05:30:31PM +0530, Vaibhav Jain wrote:
> I am referring to Kernel Loadable Modules with MODULE_LICENSE("GPL")
> here.

And what does "external" refer to? Because if it is out-of-tree, we
don't export symbols for out-of-tree modules.

Looks like you're exporting it for that papr_scm.c thing, which is fine.
But that is not "external".

So?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Joe Perches
On Fri, 2020-05-08 at 17:30 +0530, Vaibhav Jain wrote:
> Hi Boris,
> 
> Borislav Petkov  writes:
> 
> > On Fri, May 08, 2020 at 04:19:19PM +0530, Vaibhav Jain wrote:
> > > 'seq_buf' provides a very useful abstraction for writing to a string
> > > buffer without needing to worry about it over-flowing. However even
> > > > though the API has been stable for a couple of years now, it is still not
> > > > exported to external modules, limiting its usage.
> > > 
> > > Hence this patch proposes update to 'seq_buf.c' to mark
> > > seq_buf_printf() which is part of the seq_buf API to be exported to
> > > external GPL modules. This symbol will be used in later parts of this
> > 
> > What is "external GPL modules"?
> I am referring to Kernel Loadable Modules with MODULE_LICENSE("GPL")
> here.

Any reason why these Kernel Loadable Modules with MODULE_LICENSE("GPL")
are not in the kernel tree?




[PATCH -next] soc: fsl: qbman: Remove unused inline function qm_eqcr_get_ci_stashing

2020-05-08 Thread YueHaibing
There are no in-tree callers anymore.

Signed-off-by: YueHaibing 
---
 drivers/soc/fsl/qbman/qman.c | 5 -
 1 file changed, 5 deletions(-)

diff --git a/drivers/soc/fsl/qbman/qman.c b/drivers/soc/fsl/qbman/qman.c
index 1e164e03410a..9888a7061873 100644
--- a/drivers/soc/fsl/qbman/qman.c
+++ b/drivers/soc/fsl/qbman/qman.c
@@ -449,11 +449,6 @@ static inline int qm_eqcr_init(struct qm_portal *portal,
return 0;
 }
 
-static inline unsigned int qm_eqcr_get_ci_stashing(struct qm_portal *portal)
-{
-   return (qm_in(portal, QM_REG_CFG) >> 28) & 0x7;
-}
-
 static inline void qm_eqcr_finish(struct qm_portal *portal)
 {
struct qm_eqcr *eqcr = &portal->eqcr;
-- 
2.17.1




ioremap() called early from pnv_pci_init_ioda_phb()

2020-05-08 Thread Qian Cai
Booting POWER9 PowerNV prints this message:

"ioremap() called early from pnv_pci_init_ioda_phb+0x420/0xdfc. Use 
early_ioremap() instead"

but using the patch below results in leaks, because nothing ever calls 
early_iounmap(). However, it looks to me like it was by design that the 
phb->regs mapping stays around forever, since it is used in 
pnv_ioda_get_inval_reg(), so is the check_early_ioremap_leak() initcall just 
too strict?

--- a/arch/powerpc/platforms/powernv/pci-ioda.c
+++ b/arch/powerpc/platforms/powernv/pci-ioda.c
@@ -36,6 +36,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -3827,7 +3828,7 @@ static void __init pnv_pci_init_ioda_phb(struct 
device_node *np,
/* Get registers */
if (!of_address_to_resource(np, 0, &r)) {
phb->regs_phys = r.start;
-   phb->regs = ioremap(r.start, resource_size(&r));
+   phb->regs = early_ioremap(r.start, resource_size(&r));
if (phb->regs == NULL)
pr_err("  Failed to map registers !\n");

[   23.080069][T1] [ cut here ]
[   23.080089][T1] Debug warning: early ioremap leak of 10 areas detected.
[   23.080089][T1] please boot with early_ioremap_debug and report the 
dmesg.
[   23.080157][T1] WARNING: CPU: 4 PID: 1 at mm/early_ioremap.c:99 
check_early_ioremap_leak+0xd4/0x108
[   23.080171][T1] Modules linked in:
[   23.080192][T1] CPU: 4 PID: 1 Comm: swapper/0 Not tainted 
5.7.0-rc4-next-20200508+ #4
[   23.080214][T1] NIP:  c103f2d8 LR: c103f2d4 CTR: 

[   23.080226][T1] REGS: c0003df0f860 TRAP: 0700   Not tainted  
(5.7.0-rc4-next-20200508+)
[   23.080259][T1] MSR:  90029033   CR: 
48000222  XER: 2004
[   23.080296][T1] CFAR: c010d5a8 IRQMASK: 0 
[   23.080296][T1] GPR00: c103f2d4 c0003df0faf0 
c1689400 0072 
[   23.080296][T1] GPR04: 0006  
c0003df0f7e4 0004 
[   23.080296][T1] GPR08: 001ffbb6  
c0003dee6680 0002 
[   23.080296][T1] GPR12:  c01fae00 
c1057860 c10578b0 
[   23.080296][T1] GPR16: c1002d38 c14f0660 
c14f0680 c14f06a0 
[   23.080296][T1] GPR20: c14f06c0 c14f06e0 
c14f0700 c14f0720 
[   23.080296][T1] GPR24: c0c4bc30 c00486b82000 
c15a0fe0 c15a0fc0 
[   23.080296][T1] GPR28: 0010 0010 
c1061e30 000a 
[   23.080507][T1] NIP [c103f2d8] 
check_early_ioremap_leak+0xd4/0x108
[   23.080530][T1] LR [c103f2d4] check_early_ioremap_leak+0xd0/0x108
[   23.080552][T1] Call Trace:
[   23.080571][T1] [c0003df0faf0] [c103f2d4] 
check_early_ioremap_leak+0xd0/0x108 (unreliable)
[   23.080607][T1] [c0003df0fb80] [c001130c] 
do_one_initcall+0xcc/0x660
[   23.080648][T1] [c0003df0fc80] [c1004c18] 
kernel_init_freeable+0x480/0x568
[   23.080681][T1] [c0003df0fdb0] [c0012180] 
kernel_init+0x24/0x194
[   23.080713][T1] [c0003df0fe20] [c000cb28] 
ret_from_kernel_thread+0x5c/0x74

This is from the early_ioremap_debug dmesg.

[0.00][T0] [ cut here ]
[0.00][T0] __early_ioremap(0x000600c3c001, 0001) [0] => 
 + ffbe
[0.00][T0] WARNING: CPU: 0 PID: 0 at mm/early_ioremap.c:162 
__early_ioremap+0x2d8/0x408
[0.00][T0] Modules linked in:
[0.00][T0] CPU: 0 PID: 0 Comm: swapper Not tainted 
5.7.0-rc4-next-20200508+ #4
[0.00][T0] NIP:  c103f5e4 LR: c103f5e0 CTR: 
c01e77f0
[0.00][T0] REGS: c168f980 TRAP: 0700   Not tainted  
(5.7.0-rc4-next-20200508+)
[0.00][T0] MSR:  90021033   CR: 
28000248  XER: 2004
[0.00][T0] CFAR: c010d5a8 IRQMASK: 1 
[0.00][T0] GPR00: c103f5e0 c168fc10 
c1689400 0050 
[0.00][T0] GPR04: c152f6f8  
c168f904  
[0.00][T0] GPR08:   
c162f600 0002 
[0.00][T0] GPR12: c01e77f0 c5b3 
  
[0.00][T0] GPR16:   
 1000 
[0.00][T0] GPR20:  81ae 
  
[0.00][T0] GPR24: 0001 c1061da8 
0008 0008 
[0.00][T0] GPR28:  c1061db0 
 c1061eb8 
[0.00][T0] NIP [c103f5e4] __early_ioremap+0x2d8/0x4
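
Editorial note: the "early ioremap leak of 10 areas detected" warning above comes from a late check that counts temporary mapping slots which were handed out and never released. The sketch below is a simplified user-space model of that bookkeeping, assuming a fixed slot table; the slot count and all model_* names are illustrative and this is not the code in mm/early_ioremap.c.

```c
#include <assert.h>
#include <stddef.h>

#define NR_SLOTS 16   /* assumed slot count for this sketch */

static void *slot_map[NR_SLOTS];   /* non-NULL means the slot is in use */

/* Hand out the first free slot; return -1 when none is left. */
static int model_early_map(void *cookie)
{
	assert(cookie != NULL);
	for (int i = 0; i < NR_SLOTS; i++) {
		if (slot_map[i] == NULL) {
			slot_map[i] = cookie;
			return i;
		}
	}
	return -1;
}

/* Release a slot obtained from model_early_map(). */
static void model_early_unmap(int slot)
{
	assert(slot >= 0 && slot < NR_SLOTS);
	slot_map[slot] = NULL;
}

/* Count slots still held; a late-boot check would warn if this is non-zero. */
static int model_count_leaks(void)
{
	int count = 0;

	for (int i = 0; i < NR_SLOTS; i++)
		if (slot_map[i] != NULL)
			count++;
	return count;
}
```

In this model a mapping that is meant to live forever, like phb->regs in the report, keeps its slot and is counted as a leak by construction.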

[PATCH -next] soc: fsl: dpio: remove set but not used variable 'addr_cena'

2020-05-08 Thread YueHaibing
drivers/soc/fsl/dpio//qbman-portal.c:650:11: warning: variable 'addr_cena' set 
but not used [-Wunused-but-set-variable]
  uint64_t addr_cena;
   ^

It is never used, so remove it.

Signed-off-by: YueHaibing 
---
 drivers/soc/fsl/dpio/qbman-portal.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/drivers/soc/fsl/dpio/qbman-portal.c 
b/drivers/soc/fsl/dpio/qbman-portal.c
index e2e9fbb58a72..0ce859b60888 100644
--- a/drivers/soc/fsl/dpio/qbman-portal.c
+++ b/drivers/soc/fsl/dpio/qbman-portal.c
@@ -647,7 +647,6 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s,
const uint32_t *cl = (uint32_t *)d;
uint32_t eqcr_ci, eqcr_pi, half_mask, full_mask;
int i, num_enqueued = 0;
-   uint64_t addr_cena;
 
spin_lock(&s->access_spinlock);
half_mask = (s->eqcr.pi_ci_mask>>1);
@@ -700,7 +699,6 @@ int qbman_swp_enqueue_multiple_direct(struct qbman_swp *s,
 
/* Flush all the cacheline without load/store in between */
eqcr_pi = s->eqcr.pi;
-   addr_cena = (size_t)s->addr_cena;
for (i = 0; i < num_enqueued; i++)
eqcr_pi++;
s->eqcr.pi = eqcr_pi & full_mask;
-- 
2.17.1




[PATCH -next] soc: fsl: dpio: Remove unused inline function qbman_write_eqcr_am_rt_register

2020-05-08 Thread YueHaibing
There are no in-tree callers anymore since commit
3b2abda7d28c ("soc: fsl: dpio: Replace QMAN array mode with ring mode enqueue")

Signed-off-by: YueHaibing 
---
 drivers/soc/fsl/dpio/qbman-portal.c | 12 
 1 file changed, 12 deletions(-)

diff --git a/drivers/soc/fsl/dpio/qbman-portal.c 
b/drivers/soc/fsl/dpio/qbman-portal.c
index 804b8ba9bf5c..e2e9fbb58a72 100644
--- a/drivers/soc/fsl/dpio/qbman-portal.c
+++ b/drivers/soc/fsl/dpio/qbman-portal.c
@@ -572,18 +572,6 @@ void qbman_eq_desc_set_qd(struct qbman_eq_desc *d, u32 
qdid,
 #define EQAR_VB(eqar)  ((eqar) & 0x80)
 #define EQAR_SUCCESS(eqar) ((eqar) & 0x100)
 
-static inline void qbman_write_eqcr_am_rt_register(struct qbman_swp *p,
-  u8 idx)
-{
-   if (idx < 16)
-   qbman_write_register(p, QBMAN_CINH_SWP_EQCR_AM_RT + idx * 4,
-QMAN_RT_MODE);
-   else
-   qbman_write_register(p, QBMAN_CINH_SWP_EQCR_AM_RT2 +
-(idx - 16) * 4,
-QMAN_RT_MODE);
-}
-
 #define QB_RT_BIT ((u32)0x100)
 /**
  * qbman_swp_enqueue_direct() - Issue an enqueue command
-- 
2.17.1




Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Michael Ellerman
Borislav Petkov  writes:
> On Fri, May 08, 2020 at 04:19:19PM +0530, Vaibhav Jain wrote:
>> 'seq_buf' provides a very useful abstraction for writing to a string
>> buffer without needing to worry about it over-flowing. However even
>> though the API has been stable for a couple of years now, it is still not
>> exported to external modules, limiting its usage.
>> 
>> Hence this patch proposes update to 'seq_buf.c' to mark
>> seq_buf_printf() which is part of the seq_buf API to be exported to
>> external GPL modules. This symbol will be used in later parts of this
>
> What is "external GPL modules"?

A module that has MODULE_LICENSE("GPL") ?

cheers
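
Editorial note: the quoted description says seq_buf lets callers write to a string buffer without worrying about overflow. The sketch below is a user-space model of that idea, assuming a caller-supplied buffer and a sticky overflow marker; struct sbuf and the sbuf_* functions are hypothetical names chosen for illustration and are not the kernel's seq_buf implementation.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical user-space model; field and function names are illustrative. */
struct sbuf {
	char   *buffer;
	size_t  size;      /* capacity of buffer in bytes */
	size_t  len;       /* bytes used; size + 1 marks overflow */
};

static void sbuf_init(struct sbuf *s, char *buf, size_t size)
{
	assert(buf != NULL && size > 0);
	s->buffer = buf;
	s->size = size;
	s->len = 0;
	buf[0] = '\0';
}

static int sbuf_has_overflowed(const struct sbuf *s)
{
	return s->len > s->size;
}

/* Append formatted text. Returns 0 on success, -1 if it did not fit;
 * after a failure the buffer stays marked as overflowed. */
static int sbuf_printf(struct sbuf *s, const char *fmt, ...)
{
	va_list ap;
	int n;

	if (sbuf_has_overflowed(s))
		return -1;

	va_start(ap, fmt);
	n = vsnprintf(s->buffer + s->len, s->size - s->len, fmt, ap);
	va_end(ap);

	if (n < 0 || (size_t)n >= s->size - s->len) {
		s->buffer[s->len] = '\0';   /* drop the partial write */
		s->len = s->size + 1;       /* sticky overflow marker */
		return -1;
	}
	s->len += (size_t)n;
	return 0;
}
```

A caller can chain several sbuf_printf() calls and test for overflow once at the end, which is the convenience the patch description refers to.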


Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-08 Thread David Hildenbrand
On 08.05.20 15:39, David Hildenbrand wrote:
> On 08.05.20 15:03, Srikar Dronamraju wrote:
>> * Michal Hocko  [2020-05-04 11:37:12]:
>>
>>>>>
>>>>> Have you tested on something else than ppc? Each arch does the NUMA
>>>>> setup separately and this is a big mess. E.g. x86 marks even memory less
>>>>> nodes (see init_memory_less_node) as online.
>>>>>
>>>>
>>>> while I have predominantly tested on ppc, I did test on X86 with CONFIG_NUMA
>>>> enabled/disabled on both single node and multi node machines.
>>>> However, I dont have a cpuless/memoryless x86 system.
>>>
>>> This should be able to emulate inside kvm, I believe.
>>>
>>
>> I did try but somehow not able to get cpuless / memoryless node in a x86 kvm
>> guest.
> 
> I use the following
> 
> #! /bin/bash
> sudo x86_64-softmmu/qemu-system-x86_64 \
> --enable-kvm \
> -m 4G,maxmem=20G,slots=2 \
> -smp sockets=2,cores=2 \
> -numa node,nodeid=0,cpus=0-1,mem=4G -numa node,nodeid=1,cpus=2-3,mem=0G \

Sorry, this line has to be

-numa node,nodeid=0,cpus=0-3,mem=4G -numa node,nodeid=1,mem=0G \

> -kernel /home/dhildenb/git/linux/arch/x86_64/boot/bzImage \
> -append "console=ttyS0 rd.shell rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0" \
> -initrd /boot/initramfs-5.2.8-200.fc30.x86_64.img \
> -machine pc,nvdimm \
> -nographic \
> -nodefaults \
> -chardev stdio,id=serial \
> -device isa-serial,chardev=serial \
> -chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
> -mon chardev=monitor,mode=readline
> 
> to get a cpu-less and memory-less node 1. Never tried with node 0.
> 


-- 
Thanks,

David / dhildenb



Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-08 Thread David Hildenbrand
On 08.05.20 15:03, Srikar Dronamraju wrote:
> * Michal Hocko  [2020-05-04 11:37:12]:
> 

>>>> Have you tested on something else than ppc? Each arch does the NUMA
>>>> setup separately and this is a big mess. E.g. x86 marks even memory less
>>>> nodes (see init_memory_less_node) as online.

>>>
>>> while I have predominantly tested on ppc, I did test on X86 with CONFIG_NUMA
>>> enabled/disabled on both single node and multi node machines.
>>> However, I dont have a cpuless/memoryless x86 system.
>>
>> This should be able to emulate inside kvm, I believe.
>>
> 
> I did try but somehow not able to get cpuless / memoryless node in a x86 kvm
> guest.

I use the following

#! /bin/bash
sudo x86_64-softmmu/qemu-system-x86_64 \
--enable-kvm \
-m 4G,maxmem=20G,slots=2 \
-smp sockets=2,cores=2 \
-numa node,nodeid=0,cpus=0-1,mem=4G -numa node,nodeid=1,cpus=2-3,mem=0G \
-kernel /home/dhildenb/git/linux/arch/x86_64/boot/bzImage \
-append "console=ttyS0 rd.shell rd.luks=0 rd.lvm=0 rd.md=0 rd.dm=0" \
-initrd /boot/initramfs-5.2.8-200.fc30.x86_64.img \
-machine pc,nvdimm \
-nographic \
-nodefaults \
-chardev stdio,id=serial \
-device isa-serial,chardev=serial \
-chardev socket,id=monitor,path=/var/tmp/monitor,server,nowait \
-mon chardev=monitor,mode=readline

to get a cpu-less and memory-less node 1. Never tried with node 0.

-- 
Thanks,

David / dhildenb



Re: [PATCH v4 04/16] powerpc/64s/exceptions: machine check reconcile irq state

2020-05-08 Thread Michael Ellerman
Nicholas Piggin  writes:

> pseries fwnmi machine check code pops the soft-irq checks in rtas_call
> (after the previous patch to remove rtas_token from this call path).
 ^
 I changed this to "next" which I think is what you meant?

cheers

> Rather than play whack a mole with these and forever having fragile
> code, it seems better to have the early machine check handler perform
> the same kind of reconcile as the other NMI interrupts.
>
>   WARNING: CPU: 0 PID: 493 at arch/powerpc/kernel/irq.c:343
>   CPU: 0 PID: 493 Comm: a Tainted: GW
>   NIP:  c001ed2c LR: c0042c40 CTR: 
>   REGS: c001fffd38b0 TRAP: 0700   Tainted: GW
>   MSR:  80021003   CR: 28000488  XER: 
>   CFAR: c001ec90 IRQMASK: 0
>   GPR00: c0043820 c001fffd3b40 c12ba300 
>   GPR04: 48000488   deadbeef
>   GPR08: 0080   1001
>   GPR12:  c14a  
>   GPR16:    
>   GPR20:    
>   GPR24:    
>   GPR28:  0001 c1360810 
>   NIP [c001ed2c] arch_local_irq_restore.part.0+0xac/0x100
>   LR [c0042c40] unlock_rtas+0x30/0x90
>   Call Trace:
>   [c001fffd3b40] [c1360810] 0xc1360810 (unreliable)
>   [c001fffd3b60] [c0043820] rtas_call+0x1c0/0x280
>   [c001fffd3bb0] [c00dc328] fwnmi_release_errinfo+0x38/0x70
>   [c001fffd3c10] [c00dcd8c] 
> pseries_machine_check_realmode+0x1dc/0x540
>   [c001fffd3cd0] [c003fe04] machine_check_early+0x54/0x70
>   [c001fffd3d00] [c0008384] machine_check_early_common+0x134/0x1f0
>   --- interrupt: 200 at 0x13f1307c8
>   LR = 0x7fff888b8528
>   Instruction dump:
>   6000 7d2000a6 71298000 41820068 3922 7d210164 4b9c 6000
>   6000 7d2000a6 71298000 4c820020 <0fe0> 4e800020 6000 6000
>
> Signed-off-by: Nicholas Piggin 
> ---
>  arch/powerpc/kernel/exceptions-64s.S | 19 +++
>  1 file changed, 19 insertions(+)
>
> diff --git a/arch/powerpc/kernel/exceptions-64s.S 
> b/arch/powerpc/kernel/exceptions-64s.S
> index a42b73efb1a9..072772803b7c 100644
> --- a/arch/powerpc/kernel/exceptions-64s.S
> +++ b/arch/powerpc/kernel/exceptions-64s.S
> @@ -1116,11 +1116,30 @@ END_FTR_SECTION_IFSET(CPU_FTR_HVMODE)
>   li  r10,MSR_RI
>   mtmsrd  r10,1
>  
> + /*
> +  * Set IRQS_ALL_DISABLED and save PACAIRQHAPPENED (see
> +  * system_reset_common)
> +  */
> + li  r10,IRQS_ALL_DISABLED
> + stb r10,PACAIRQSOFTMASK(r13)
> + lbz r10,PACAIRQHAPPENED(r13)
> + std r10,RESULT(r1)
> + ori r10,r10,PACA_IRQ_HARD_DIS
> + stb r10,PACAIRQHAPPENED(r13)
> +
>   addi    r3,r1,STACK_FRAME_OVERHEAD
>   bl  machine_check_early
>   std r3,RESULT(r1)   /* Save result */
>   ld  r12,_MSR(r1)
>  
> + /*
> +  * Restore soft mask settings.
> +  */
> + ld  r10,RESULT(r1)
> + stb r10,PACAIRQHAPPENED(r13)
> + ld  r10,SOFTE(r1)
> + stb r10,PACAIRQSOFTMASK(r13)
> +
>  #ifdef CONFIG_PPC_P7_NAP
>   /*
>* Check if thread was in power saving mode. We come here when any
> -- 
> 2.23.0


Re: [PATCH v3 1/3] powerpc/numa: Set numa_node for all possible cpus

2020-05-08 Thread Srikar Dronamraju
* Christopher Lameter  [2020-05-02 22:55:16]:

> On Fri, 1 May 2020, Srikar Dronamraju wrote:
> 
> > -   for_each_present_cpu(cpu)
> > -   numa_setup_cpu(cpu);
> > +   for_each_possible_cpu(cpu) {
> > +   /*
> > +* Powerpc with CONFIG_NUMA always used to have a node 0,
> > +* even if it was memoryless or cpuless. For all cpus that
> > +* are possible but not present, cpu_to_node() would point
> > +* to node 0. To remove a cpuless, memoryless dummy node,
> > +* powerpc need to make sure all possible but not present
> > +* cpu_to_node are set to a proper node.
> > +*/
> > +   if (cpu_present(cpu))
> > +   numa_setup_cpu(cpu);
> > +   else
> > +   set_cpu_numa_node(cpu, first_online_node);
> > +   }
> >  }
> 
> 
> Can this be folded into numa_setup_cpu?
> 
> This looks more like numa_setup_cpu needs to change?
> 

We can fold this into numa_setup_cpu().

However, until now we were sure that numa_setup_cpu() would be called only for
a present cpu. That assumption will change, and there will be a
(non-consequential) additional check every time a cpu is hotplugged in.

If Michael Ellerman is okay with the change, I can fold it in.
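
Editorial note: the folding discussed above can be pictured with a small user-space model in which the per-cpu helper itself handles cpus that are possible but not present. The arrays, the NR_CPUS_MODEL constant and both function names below are hypothetical; this is a sketch of the control flow only, not the powerpc numa_setup_cpu() code.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS_MODEL 4

static bool cpu_present_model[NR_CPUS_MODEL];
static int  firmware_node[NR_CPUS_MODEL];   /* node reported for present cpus */
static int  cpu_node[NR_CPUS_MODEL];        /* resulting cpu -> node map */

/* Folded variant: the helper falls back to the first online node for a
 * cpu that is possible but not present. */
static void setup_cpu_node(int cpu, int first_online_node)
{
	assert(cpu >= 0 && cpu < NR_CPUS_MODEL);
	if (cpu_present_model[cpu])
		cpu_node[cpu] = firmware_node[cpu];
	else
		cpu_node[cpu] = first_online_node;
}

/* Boot-time loop over every possible cpu, present or not. */
static void setup_all_possible_cpus(int first_online_node)
{
	for (int cpu = 0; cpu < NR_CPUS_MODEL; cpu++)
		setup_cpu_node(cpu, first_online_node);
}
```

The extra presence test inside setup_cpu_node() corresponds to the additional check on the hotplug path mentioned in the reply.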

-- 
Thanks and Regards
Srikar Dronamraju


[PATCH v2] powerpc/spufs: Rework fcheck() usage

2020-05-08 Thread Michael Ellerman
Currently the spu coredump code triggers an RCU warning:

  =
  WARNING: suspicious RCU usage
  5.7.0-rc3-01755-g7cd49f0b7ec7 #1 Not tainted
  -
  include/linux/fdtable.h:95 suspicious rcu_dereference_check() usage!

  other info that might help us debug this:

  rcu_scheduler_active = 2, debug_locks = 1
  1 lock held by spu-coredump/1343:
   #0: c007fa22f430 (sb_writers#2){.+.+}-{0:0}, at: 
.do_coredump+0x1010/0x13c8

  stack backtrace:
  CPU: 0 PID: 1343 Comm: spu-coredump Not tainted 5.7.0-rc3-01755-g7cd49f0b7ec7 
#1
  Call Trace:
.dump_stack+0xec/0x15c (unreliable)
.lockdep_rcu_suspicious+0x120/0x144
.coredump_next_context+0x148/0x158
.spufs_coredump_extra_notes_size+0x54/0x190
.elf_coredump_extra_notes_size+0x34/0x50
.elf_core_dump+0xe48/0x19d0
.do_coredump+0xe50/0x13c8
.get_signal+0x864/0xd88
.do_notify_resume+0x158/0x3c8
.interrupt_exit_user_prepare+0x19c/0x208
interrupt_return+0x14/0x1c0

This comes from fcheck_files() via fcheck().

It's pretty clearly documented that fcheck() must be wrapped with
rcu_read_lock(); adding that fixes the RCU warning.

hch points out that once we've released the RCU read lock the file may
be closed and freed, which would leave us with a pointer to a freed
spu_context.

To avoid that, take a reference to the spu_context while we hold the
RCU read lock, and drop that reference later once we're done with the
context.

Signed-off-by: Michael Ellerman 
---
 arch/powerpc/platforms/cell/spufs/coredump.c | 19 ---
 1 file changed, 16 insertions(+), 3 deletions(-)

v2: Take a reference and hold it until we're done.

diff --git a/arch/powerpc/platforms/cell/spufs/coredump.c 
b/arch/powerpc/platforms/cell/spufs/coredump.c
index 8b3296b62f65..37c155254cd5 100644
--- a/arch/powerpc/platforms/cell/spufs/coredump.c
+++ b/arch/powerpc/platforms/cell/spufs/coredump.c
@@ -82,13 +82,20 @@ static int match_context(const void *v, struct file *file, 
unsigned fd)
  */
 static struct spu_context *coredump_next_context(int *fd)
 {
+   struct spu_context *ctx;
struct file *file;
int n = iterate_fd(current->files, *fd, match_context, NULL);
if (!n)
return NULL;
*fd = n - 1;
+
+   rcu_read_lock();
file = fcheck(*fd);
-   return SPUFS_I(file_inode(file))->i_ctx;
+   ctx = SPUFS_I(file_inode(file))->i_ctx;
+   get_spu_context(ctx);
+   rcu_read_unlock();
+
+   return ctx;
 }
 
 int spufs_coredump_extra_notes_size(void)
@@ -99,17 +106,23 @@ int spufs_coredump_extra_notes_size(void)
fd = 0;
while ((ctx = coredump_next_context(&fd)) != NULL) {
rc = spu_acquire_saved(ctx);
-   if (rc)
+   if (rc) {
+   put_spu_context(ctx);
break;
+   }
+
rc = spufs_ctx_note_size(ctx, fd);
spu_release_saved(ctx);
-   if (rc < 0)
+   if (rc < 0) {
+   put_spu_context(ctx);
break;
+   }
 
size += rc;
 
/* start searching the next fd next time */
fd++;
+   put_spu_context(ctx);
}
 
return size;
-- 
2.25.1
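
Editorial note: the v2 change makes the lookup return a context with a reference held, so every way out of the loop body has to drop that reference. The user-space sketch below models only the get/put balance; struct ctx_model and the ctx_* and next_context() helpers are hypothetical, and the RCU locking from the patch is reduced to a comment.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a reference-counted context. */
struct ctx_model {
	int refs;
	int note_size;   /* negative value simulates an error */
};

static struct ctx_model *ctx_get(struct ctx_model *c)
{
	c->refs++;
	return c;
}

static void ctx_put(struct ctx_model *c)
{
	assert(c->refs > 0);
	c->refs--;
}

/* Return the next context with a reference held, or NULL at the end.
 * In the kernel patch the lookup and the get happen under rcu_read_lock(). */
static struct ctx_model *next_context(struct ctx_model *tbl, size_t n,
				      const size_t *idx)
{
	if (*idx >= n)
		return NULL;
	return ctx_get(&tbl[*idx]);
}

/* Sum note sizes; every exit from the loop body drops the reference. */
static int total_note_size(struct ctx_model *tbl, size_t n)
{
	struct ctx_model *c;
	size_t idx = 0;
	int size = 0;

	while ((c = next_context(tbl, n, &idx)) != NULL) {
		int rc = c->note_size;

		if (rc < 0) {
			ctx_put(c);   /* error path */
			break;
		}
		size += rc;
		idx++;            /* start at the next entry next time */
		ctx_put(c);       /* normal path */
	}
	return size;
}
```

Checking that each count returns to its starting value after the loop is a quick way to confirm that no path leaks a reference.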



Re: [PATCH v3 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-08 Thread Srikar Dronamraju
* Christopher Lameter  [2020-05-02 23:05:28]:

> On Fri, 1 May 2020, Srikar Dronamraju wrote:
> 
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -116,8 +116,10 @@ EXPORT_SYMBOL(latent_entropy);
> >   */
> >  nodemask_t node_states[NR_NODE_STATES] __read_mostly = {
> > [N_POSSIBLE] = NODE_MASK_ALL,
> > +#ifdef CONFIG_NUMA
> > +   [N_ONLINE] = NODE_MASK_NONE,
> 
> Hmmm I would have expected that you would have added something early
> in boot that would mark the current node (whatever it is) online instead?

Do correct me if I am wrong, but these are static structure initializations in
page_alloc.c. Wouldn't they take effect long before NUMA initialization
happens? I think we are already marking nodes as online as soon as we detect
the nodes.
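
Editorial note: the point about ordering can be shown with a user-space model in which the table gets its values from a static initializer and discovery code fills in online nodes afterwards. The enum, the 8-bit masks and both helper functions are hypothetical simplifications of the kernel's nodemask_t handling.

```c
#include <assert.h>

enum node_state_model { N_POSSIBLE_M, N_ONLINE_M, NR_STATES_M };

#define MASK_ALL     0xffu
#define MASK_NONE    0x00u
#define MASK_NODE(n) (1u << (n))

/* Static initializer: these values exist before any code runs. */
static unsigned int node_states_model[NR_STATES_M] = {
	[N_POSSIBLE_M] = MASK_ALL,
	[N_ONLINE_M]   = MASK_NONE,
};

/* Discovery code marks nodes online as it finds them. */
static void mark_node_online(int nid)
{
	assert(nid >= 0 && nid < 8);
	node_states_model[N_ONLINE_M] |= MASK_NODE(nid);
}

static int node_is_online(int nid)
{
	return (node_states_model[N_ONLINE_M] & MASK_NODE(nid)) != 0;
}
```

With the online mask starting empty, node 0 is online only if discovery marks it, which is the behaviour the patch is after.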

-- 
Thanks and Regards
Srikar Dronamraju


[tip: perf/core] perf metricgroups: Enhance JSON/metric infrastructure to handle "?"

2020-05-08 Thread tip-bot2 for Kajol Jain
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 1e1a873dc67fc748cc319a27603f33db91027730
Gitweb:
https://git.kernel.org/tip/1e1a873dc67fc748cc319a27603f33db91027730
Author:Kajol Jain 
AuthorDate:Thu, 02 Apr 2020 02:03:37 +05:30
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Thu, 30 Apr 2020 10:48:33 -03:00

perf metricgroups: Enhance JSON/metric infrastructure to handle "?"

This patch enhances the current metric infrastructure to handle "?" in a
metric expression. The "?" can be used for parameters whose value is not
known when the metric events are created and which can be replaced at
runtime with the proper value. It also adds the flexibility to create
multiple events out of a single metric event added in a JSON file.

The patch adds the function 'arch_get_runtimeparam', an arch-specific
function that returns the number of metric events that need to be
created. By default it returns 1.

This infrastructure is needed for hv_24x7 socket/chip level events.
"hv_24x7" chip level events need the specific chip-id for which the data
is requested. The function 'arch_get_runtimeparam' is implemented in
header.c and extracts the number of sockets from the sysfs file "sockets"
under "/sys/devices/hv_24x7/interface/".

With this patch we create as many metric events as defined by
runtime_param.

For that, a loop is added in the function 'metricgroup__add_metric',
which creates multiple events at run time depending on the return value
of 'arch_get_runtimeparam' and merges those events into 'group_list'.

To achieve that, the parameter value is passed to the `expr__find_other`
function, which replaces each "?" in the metric expression with this
value.

The JSON file contains a single metric event, out of which multiple
events are created.

To show which data count belongs to which parameter value, the param
value is also printed in the generic_metric function.

For example,

  command:# ./perf stat  -M PowerBUS_Frequency -C 0 -I 1000
1.000101867  9,356,933  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
1.000101867  9,366,134  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1
2.000314878  9,365,868  hv_24x7/pm_pb_cyc,chip=0/ #  2.3 GHz  PowerBUS_Frequency_0
2.000314878  9,366,092  hv_24x7/pm_pb_cyc,chip=1/ #  2.3 GHz  PowerBUS_Frequency_1

So, here _0 and _1 after PowerBUS_Frequency specify parameter value.

Signed-off-by: Kajol Jain 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Anju T Sudhakar 
Cc: Benjamin Herrenschmidt 
Cc: Greg Kroah-Hartman 
Cc: Jin Yao 
Cc: Joe Mario 
Cc: Kan Liang 
Cc: Madhavan Srinivasan 
Cc: Mamatha Inamdar 
Cc: Mark Rutland 
Cc: Michael Ellerman 
Cc: Michael Petlan 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Sukadev Bhattiprolu 
Cc: Thomas Gleixner 
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-5-kj...@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/arch/powerpc/util/header.c |  8 +++-
 tools/perf/tests/expr.c   |  8 +++
 tools/perf/util/expr.c| 11 +-
 tools/perf/util/expr.h|  5 +++--
 tools/perf/util/expr.l| 27 ++---
 tools/perf/util/metricgroup.c | 28 +++---
 tools/perf/util/metricgroup.h |  2 ++-
 tools/perf/util/stat-shadow.c | 17 ++--
 8 files changed, 79 insertions(+), 27 deletions(-)

diff --git a/tools/perf/arch/powerpc/util/header.c 
b/tools/perf/arch/powerpc/util/header.c
index 3b4cdfc..d487007 100644
--- a/tools/perf/arch/powerpc/util/header.c
+++ b/tools/perf/arch/powerpc/util/header.c
@@ -7,6 +7,8 @@
 #include 
 #include 
 #include "header.h"
+#include "metricgroup.h"
+#include 
 
 #define mfspr(rn)   ({unsigned long rval; \
 asm volatile("mfspr %0," __stringify(rn) \
@@ -44,3 +46,9 @@ get_cpuid_str(struct perf_pmu *pmu __maybe_unused)
 
return bufp;
 }
+
+int arch_get_runtimeparam(void)
+{
+   int count;
+   return sysfs__read_int("/devices/hv_24x7/interface/sockets", &count) < 0 ? 1 : count;
+}
diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index ea10fc4..516504c 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -10,7 +10,7 @@ static int test(struct expr_parse_ctx *ctx, const char *e, 
double val2)
 {
double val;
 
-   if (expr__parse(&val, ctx, e))
+   if (expr__parse(&val, ctx, e, 1))
TEST_ASSERT_VAL("parse test failed", 0);
TEST_ASSERT_VAL("unexpected value", val == val2);
return 0;
@@ -44,15 +44,15 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
return ret;
 
p = "FOO/0";
-   ret = expr__parse(&val, &ctx, p);
+   ret = expr__parse(&val, &ctx, p, 1);
TEST_ASSERT_VAL("division by zero", ret == -1);
 
p = "BAR/";
-   ret = 
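
Editorial note: the commit message describes replacing "?" in a metric expression with a runtime parameter value. The sketch below is a standalone illustration of that textual substitution under the assumption that a plain string copy is enough; substitute_runtime_param() is a hypothetical helper, and perf itself handles "?" inside its expression lexer rather than with a function like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Copy expr into out, replacing every '?' with the decimal value of param.
 * Returns 0 on success, -1 if out is too small. Hypothetical helper. */
static int substitute_runtime_param(const char *expr, int param,
				    char *out, size_t out_size)
{
	size_t used = 0;

	assert(expr != NULL && out != NULL && out_size > 0);

	for (const char *p = expr; *p != '\0'; p++) {
		if (*p == '?') {
			int n = snprintf(out + used, out_size - used, "%d", param);

			if (n < 0 || (size_t)n >= out_size - used)
				return -1;
			used += (size_t)n;
		} else {
			if (used + 1 >= out_size)
				return -1;
			out[used++] = *p;
		}
	}
	out[used] = '\0';
	return 0;
}
```

Calling the helper once per value from 0 up to the socket count minus one would yield one concrete expression per chip, matching the _0 and _1 suffixes in the example output of the commit message.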

[tip: perf/core] perf vendor events power9: Add hv_24x7 socket/chip level metric events

2020-05-08 Thread tip-bot2 for Kajol Jain
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 354575c00d61c174e0ff070f56cf3cdbe6d23f9e
Gitweb:
https://git.kernel.org/tip/354575c00d61c174e0ff070f56cf3cdbe6d23f9e
Author:Kajol Jain 
AuthorDate:Thu, 02 Apr 2020 02:03:40 +05:30
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Thu, 30 Apr 2020 10:48:33 -03:00

perf vendor events power9: Add hv_24x7 socket/chip level metric events

The hv_24x7 feature in IBM® POWER9™ processor-based servers provides the
facility to continuously collect large numbers of hardware performance
metrics efficiently and accurately.

This patch adds an hv_24x7 metric file for different socket/chip
resources.

Result:

power9 platform:

  command:# ./perf stat --metric-only -M Memory_RD_BW_Chip -C 0 -I 1000

 1.96188  0.9   0.3
 2.000285720  0.5   0.1
 3.000424990  0.4   0.1

  command:# ./perf stat --metric-only -M PowerBUS_Frequency -C 0 -I 1000

 1.97981  2.3   2.3
 2.000291713  2.3   2.3
 3.000421719  2.3   2.3
 4.000550912  2.3   2.3

Signed-off-by: Kajol Jain 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Anju T Sudhakar 
Cc: Benjamin Herrenschmidt 
Cc: Greg Kroah-Hartman 
Cc: Jin Yao 
Cc: Joe Mario 
Cc: Kan Liang 
Cc: Madhavan Srinivasan 
Cc: Mamatha Inamdar 
Cc: Mark Rutland 
Cc: Michael Ellerman 
Cc: Michael Petlan 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Sukadev Bhattiprolu 
Cc: Thomas Gleixner 
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-8-kj...@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json | 19 +++-
 1 file changed, 19 insertions(+)
 create mode 100644 tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json

diff --git a/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json 
b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json
new file mode 100644
index 000..c121e52
--- /dev/null
+++ b/tools/perf/pmu-events/arch/powerpc/power9/nest_metrics.json
@@ -0,0 +1,19 @@
+[
+{
+"MetricExpr": "(hv_24x7@PM_MCS01_128B_RD_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS01_128B_RD_DISP_PORT23\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_RD_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_RD_DISP_PORT23\\,chip\\=?@)",
+"MetricName": "Memory_RD_BW_Chip",
+"MetricGroup": "Memory_BW",
+"ScaleUnit": "1.6e-2MB"
+},
+{
+   "MetricExpr": "(hv_24x7@PM_MCS01_128B_WR_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS01_128B_WR_DISP_PORT23\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_WR_DISP_PORT01\\,chip\\=?@ + 
hv_24x7@PM_MCS23_128B_WR_DISP_PORT23\\,chip\\=?@ )",
+"MetricName": "Memory_WR_BW_Chip",
+"MetricGroup": "Memory_BW",
+"ScaleUnit": "1.6e-2MB"
+},
+{
+   "MetricExpr": "(hv_24x7@PM_PB_CYC\\,chip\\=?@ )",
+"MetricName": "PowerBUS_Frequency",
+"ScaleUnit": "2.5e-7GHz"
+}
+]


[tip: perf/core] perf tests expr: Added test for runtime param in metric expression

2020-05-08 Thread tip-bot2 for Kajol Jain
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 9022608ec5babbb0fa631234098d52895e7e34d8
Gitweb:
https://git.kernel.org/tip/9022608ec5babbb0fa631234098d52895e7e34d8
Author:Kajol Jain 
AuthorDate:Thu, 02 Apr 2020 02:03:38 +05:30
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Thu, 30 Apr 2020 10:48:33 -03:00

perf tests expr: Added test for runtime param in metric expression

Add a test case for parsing "?" in a metric expression.

Signed-off-by: Kajol Jain 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Anju T Sudhakar 
Cc: Benjamin Herrenschmidt 
Cc: Greg Kroah-Hartman 
Cc: Jin Yao 
Cc: Joe Mario 
Cc: Kan Liang 
Cc: Madhavan Srinivasan 
Cc: Mamatha Inamdar 
Cc: Mark Rutland 
Cc: Michael Ellerman 
Cc: Michael Petlan 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Sukadev Bhattiprolu 
Cc: Thomas Gleixner 
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-6-kj...@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/tests/expr.c | 8 
 1 file changed, 8 insertions(+)

diff --git a/tools/perf/tests/expr.c b/tools/perf/tests/expr.c
index 516504c..f9e8e56 100644
--- a/tools/perf/tests/expr.c
+++ b/tools/perf/tests/expr.c
@@ -59,6 +59,14 @@ int test__expr(struct test *t __maybe_unused, int subtest 
__maybe_unused)
TEST_ASSERT_VAL("find other", !strcmp(other[2], "BOZO"));
TEST_ASSERT_VAL("find other", other[3] == NULL);
 
+   TEST_ASSERT_VAL("find other",
+   expr__find_other("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@", NULL,
+  &other, &num_other, 3) == 0);
+   TEST_ASSERT_VAL("find other", num_other == 2);
+   TEST_ASSERT_VAL("find other", !strcmp(other[0], "EVENT1,param=3/"));
+   TEST_ASSERT_VAL("find other", !strcmp(other[1], "EVENT2,param=3/"));
+   TEST_ASSERT_VAL("find other", other[2] == NULL);
+
for (i = 0; i < num_other; i++)
zfree(&other[i]);
free((void *)other);


[tip: perf/core] perf tools: Enable Hz/hz prinitg for --metric-only option

2020-05-08 Thread tip-bot2 for Kajol Jain
The following commit has been merged into the perf/core branch of tip:

Commit-ID: 3351c6da896bf521b118bfbb699fbda8f2a816b3
Gitweb:        https://git.kernel.org/tip/3351c6da896bf521b118bfbb699fbda8f2a816b3
Author:Kajol Jain 
AuthorDate:Thu, 02 Apr 2020 02:03:39 +05:30
Committer: Arnaldo Carvalho de Melo 
CommitterDate: Thu, 30 Apr 2020 10:48:33 -03:00

perf tools: Enable Hz/hz printing for --metric-only option

Commit 54b5091606c18 ("perf stat: Implement --metric-only mode") added
function 'valid_only_metric()' which drops "Hz" or "hz", if it is part
of "ScaleUnit". This patch enables it, since hv_24x7 supports a couple of
frequency events.

Signed-off-by: Kajol Jain 
Acked-by: Jiri Olsa 
Cc: Alexander Shishkin 
Cc: Andi Kleen 
Cc: Anju T Sudhakar 
Cc: Benjamin Herrenschmidt 
Cc: Greg Kroah-Hartman 
Cc: Jin Yao 
Cc: Joe Mario 
Cc: Kan Liang 
Cc: Madhavan Srinivasan 
Cc: Mamatha Inamdar 
Cc: Mark Rutland 
Cc: Michael Ellerman 
Cc: Michael Petlan 
Cc: Namhyung Kim 
Cc: Paul Mackerras 
Cc: Peter Zijlstra 
Cc: Ravi Bangoria 
Cc: Sukadev Bhattiprolu 
Cc: Thomas Gleixner 
Cc: linuxppc-dev@lists.ozlabs.org
Link: http://lore.kernel.org/lkml/20200401203340.31402-7-kj...@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo 
---
 tools/perf/util/stat-display.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/tools/perf/util/stat-display.c b/tools/perf/util/stat-display.c
index 9e757d1..679aaa6 100644
--- a/tools/perf/util/stat-display.c
+++ b/tools/perf/util/stat-display.c
@@ -237,8 +237,6 @@ static bool valid_only_metric(const char *unit)
if (!unit)
return false;
if (strstr(unit, "/sec") ||
-   strstr(unit, "hz") ||
-   strstr(unit, "Hz") ||
strstr(unit, "CPUs utilized"))
return false;
return true;
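
For reference, a standalone sketch of valid_only_metric() as it would read once this hunk is applied, reconstructed from the diff context above. Only the includes are added so that it compiles on its own, and the indentation is restored by hand since the archive collapsed it.

```c
#include <stdbool.h>
#include <string.h>

/* Units containing "/sec" or "CPUs utilized" are still rejected. */
static bool valid_only_metric(const char *unit)
{
	if (!unit)
		return false;
	if (strstr(unit, "/sec") ||
	    strstr(unit, "CPUs utilized"))
		return false;
	return true;
}
```

With the two strstr() checks for "hz" and "Hz" gone, a unit such as "MHz" is now accepted.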


Re: [PATCH v2 3/3] mm/page_alloc: Keep memoryless cpuless node 0 offline

2020-05-08 Thread Srikar Dronamraju
* Michal Hocko  [2020-05-04 11:37:12]:

> > > 
> > > Have you tested on something else than ppc? Each arch does the NUMA
> > > setup separately and this is a big mess. E.g. x86 marks even memory less
> > > nodes (see init_memory_less_node) as online.
> > > 
> > 
> > while I have predominantly tested on ppc, I did test on X86 with CONFIG_NUMA
> > enabled/disabled on both single node and multi node machines.
> > However, I dont have a cpuless/memoryless x86 system.
> 
> This should be able to emulate inside kvm, I believe.
> 

I did try, but somehow I was not able to get a cpuless/memoryless node in an
x86 KVM guest.

Also, I am unable to see how to enable HAVE_MEMORYLESS_NODES on an x86 system.
# git grep -w HAVE_MEMORYLESS_NODES | cat
arch/ia64/Kconfig:config HAVE_MEMORYLESS_NODES
arch/powerpc/Kconfig:config HAVE_MEMORYLESS_NODES
#
I force-enabled it, but it got disabled during the kernel build.
Maybe I am missing something.

> > 
> > So we have a redundant page hinting numa faults which we can avoid.
> 
> interesting. Does this lead to any observable differences? Btw. it would
> be really great to describe how the online state influences the numa
> balancing.
> 

If numa_balancing is enabled, there is a check to see whether the number of
online nodes is 1. If it is one, numa_balancing is disabled; otherwise
numa_balancing stays as is. In this case, both the actual node (node nr > 0)
and node 0 were marked online without the patch.

Here are 2 sample numa programs.

numa01.sh is a set of 2 processes, each running as many threads as there are
cpus; each thread does 50 loops of operations on 3GB of process-shared memory.

numa02.sh is a single process with as many threads as there are cpus;
each thread does 800 loops of operations on 32MB of thread-local memory.

Testcase     Time:  Min      Max      Avg      StdDev
./numa01.sh  Real:  149.62   149.66   149.64   0.02
./numa01.sh  Sys:   3.21     3.71     3.46     0.25
./numa01.sh  User:  4755.13  4758.15  4756.64  1.51
./numa02.sh  Real:  24.98    25.02    25.00    0.02
./numa02.sh  Sys:   0.51     0.59     0.55     0.04
./numa02.sh  User:  790.28   790.88   790.58   0.30

Testcase     Time:  Min      Max      Avg      StdDev  %Change
./numa01.sh  Real:  149.44   149.46   149.45   0.01    0.127133%
./numa01.sh  Sys:   0.71     0.89     0.80     0.09    332.5%
./numa01.sh  User:  4754.19  4754.48  4754.33  0.15    0.0485873%
./numa02.sh  Real:  24.97    24.98    24.98    0.00    0.0800641%
./numa02.sh  Sys:   0.26     0.41     0.33     0.08    66.6667%
./numa02.sh  User:  789.75   790.28   790.01   0.27    0.072151%

numa01.sh
param                   no_patch  with_patch  %Change
-----                   --------  ----------  -------
numa_hint_faults        1131164   0           -100%
numa_hint_faults_local  1131164   0           -100%
numa_hit                213696    214244      0.256439%
numa_local              213696    214244      0.256439%
numa_pte_updates        1131294   0           -100%
pgfault                 1380845   241424      -82.5162%
pgmajfault              75        60          -20%

numa02.sh
param                   no_patch  with_patch  %Change
-----                   --------  ----------  -------
numa_hint_faults        111878    0           -100%
numa_hint_faults_local  111878    0           -100%
numa_hit                41854     43220       3.26373%
numa_local              41854     43220       3.26373%
numa_pte_updates        113926    0           -100%
pgfault                 163662    51210       -68.7099%
pgmajfault              56        52          -7.14286%

Observations:
The real time and user time don't change much. However, the system
time changes to some extent. The reason is the number of numa hinting
faults: with the patch we no longer see any numa hinting faults.
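
The %Change columns above look like plain relative differences. As a rough sketch (the helper name pct_change() is hypothetical, and the choice of baseline is inferred from the published figures rather than stated in the mail): the timing table is consistent with the with-patch average as the baseline, while the counter tables are consistent with the no-patch value as the baseline.

```c
/* Relative difference of value against baseline, in percent. */
static double pct_change(double value, double baseline)
{
	return (value - baseline) / baseline * 100.0;
}
```

For example, pct_change(3.46, 0.80) reproduces the 332.5% reported for numa01.sh system time, and pct_change(241424, 1380845) gives roughly the -82.5162% reported for pgfault.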

> > 2. Few people have complained about existence of this dummy node when
> > parsing lscpu and numactl o/p. They somehow start to think that the tools
> > are reporting incorrectly or the kernel is not able to recognize resources
> > connected to the node.
> 
> Please be more specific.

Taking the below example of numactl
available: 2 nodes (0,7)
node 0 cpus:
node 0 size: 0 MB
node 0 free: 0 MB
node 7 cpus: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31
node 7 size: 16238 MB
node 7 free: 15449 MB
node distances:
node   0   7 
  0:  10  20 
  7:  20  10 

We know node 0 can be special, but users may not feel the same.

When users parse numactl/lscpu output or the /sys directory, they find there are 2
online nodes. They find that none of the resources for one node (node 0) are
available, yet it is still online. However, they find other nodes (nodes 1-6) which
don't have resources either but are not online. So they tend to think the kernel has
been unable to online some of the resources or the resources have gone bad.
Please do note that on hypervisors like PowerVM, the admins don't have
control over 

Re: [PATCH 2/3] dts: ppc: t4240rdb: add uie_unsupported property to drop warning

2020-05-08 Thread Alexandre Belloni
On 08/05/2020 13:49:24+0800, Biwen Li wrote:
> From: Biwen Li 
> 
> This adds uie_unsupported property to drop warning as follows:
> - $ hwclock.util-linux
>   hwclock.util-linux: select() to /dev/rtc0
>   to wait for clock tick timed out
> 
> My case:
> - RTC ds1374's INT pin is connected to VCC on T4240RDB,
>   then the RTC cannot inform cpu about the alarm interrupt
> 
> Signed-off-by: Biwen Li 
> ---
>  arch/powerpc/boot/dts/fsl/t4240rdb.dts | 6 +-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/powerpc/boot/dts/fsl/t4240rdb.dts 
> b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> index a56a705d41f7..ccdd10202e56 100644
> --- a/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> +++ b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> @@ -144,7 +144,11 @@
>   rtc@68 {
>   compatible = "dallas,ds1374";
>   reg = <0x68>;
> - interrupts = <0x1 0x1 0 0>;

removing the interrupt should be enough to solve your issue

> + // The ds1374's INT pin isn't
> + // connected to cpu's INT pin,
> + // so the rtc cannot synchronize
> + // clock tick per second.
> + uie_unsupported;
>   };
>   };
>  
> -- 
> 2.17.1
> 

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH 1/3] rtc: ds1374: add uie_unsupported property to drop warning

2020-05-08 Thread Alexandre Belloni
Hi,

On 08/05/2020 13:49:23+0800, Biwen Li wrote:
> From: Biwen Li 
> 
> Add uie_unsupported property to drop warning as follows:
> - $ hwclock.util-linux
>   hwclock.util-linux: select() to /dev/rtc0
>   to wait for clock tick timed out
> 
> My case:
> - RTC ds1374's INT pin is connected to VCC on T4240RDB,
>   then the RTC cannot inform cpu about the alarm
>   interrupt
> 
> Signed-off-by: Biwen Li 
> ---
>  drivers/rtc/rtc-ds1374.c | 4 
>  1 file changed, 4 insertions(+)
> 
> diff --git a/drivers/rtc/rtc-ds1374.c b/drivers/rtc/rtc-ds1374.c
> index 9c51a12cf70f..e530e887a17e 100644
> --- a/drivers/rtc/rtc-ds1374.c
> +++ b/drivers/rtc/rtc-ds1374.c
> @@ -651,6 +651,10 @@ static int ds1374_probe(struct i2c_client *client,
>   if (ret)
>   return ret;
>  
> + if (of_property_read_bool(client->dev.of_node,
> +  "uie_unsupported"))
> + ds1374->rtc->uie_unsupported = true;
> +

This is not how this is supposed to work: either the RTC supports UIE or
it doesn't. It is not board dependent and certainly doesn't require an
(undocumented) DT property.

>  #ifdef CONFIG_RTC_DRV_DS1374_WDT
>   save_client = client;
>   ret = misc_register(_miscdev);
> -- 
> 2.17.1
> 

-- 
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


Re: [PATCH net 11/16] net: ethernet: marvell: mvneta: fix fixed-link phydev leaks

2020-05-08 Thread Greg Kroah-Hartman
On Fri, May 08, 2020 at 08:21:19AM +0200, Johan Hovold wrote:
> On Fri, May 08, 2020 at 03:35:02AM +0530, Naresh Kamboju wrote:
> > On Thu, 7 May 2020 at 16:43, Greg Kroah-Hartman
> >  wrote:
> > >
> > 
> > > > >
> > > > > Greg, 3f65047c853a ("of_mdio: add helper to deregister fixed-link
> > > > > PHYs") needs to be backported as well for these.
> > > > >
> > > > > Original series can be found here:
> > > > >
> > > > > 
> > > > > https://lkml.kernel.org/r/1480357509-28074-1-git-send-email-jo...@kernel.org
> > > >
> > > > Ah, thanks for that, I thought I dropped all of the ones that caused
> > > > build errors, but missed the above one.  I'll go take the whole series
> > > > instead.
> > >
> > > This should now all be fixed up, thanks.
> > 
> > While building kernel Image for arm architecture on stable-rc 4.4 branch
> > the following build error found.
> > 
> > of_mdio: add helper to deregister fixed-link PHYs
> > commit 3f65047c853a2a5abcd8ac1984af3452b5df4ada upstream.
> > 
> > Add helper to deregister fixed-link PHYs registered using
> > of_phy_register_fixed_link().
> > 
> > Convert the two drivers that care to deregister their fixed-link PHYs to
> > use the new helper, but note that most drivers currently fail to do so.
> > 
> > Signed-off-by: Johan Hovold 
> > Signed-off-by: David S. Miller 
> > [only take helper function for 4.4.y - gregkh]
> > 
> >  # make -sk KBUILD_BUILD_USER=TuxBuild -C/linux -j16 ARCH=arm
> > CROSS_COMPILE=arm-linux-gnueabihf- HOSTCC=gcc CC="sccache
> > arm-linux-gnueabihf-gcc" O=build zImage
> > 70 #
> > 71 ../drivers/of/of_mdio.c: In function ‘of_phy_deregister_fixed_link’:
> > 72 ../drivers/of/of_mdio.c:379:2: error: implicit declaration of
> > function ‘fixed_phy_unregister’; did you mean ‘fixed_phy_register’?
> > [-Werror=implicit-function-declaration]
> > 73  379 | fixed_phy_unregister(phydev);
> > 74  | ^~~~
> > 75  | fixed_phy_register
> > 76 ../drivers/of/of_mdio.c:381:22: error: ‘struct phy_device’ has no
> > member named ‘mdio’; did you mean ‘mdix’?
> > 77  381 | put_device(>mdio.dev); /* of_phy_find_device() */
> > 78  | ^~~~
> > 79  | mdix
> 
> Another dependency: 5bcbe0f35fb1 ("phy: fixed: Fix removal of phys.")
> 
> Greg, these patches are from four years ago so can't really remember if
> there are other dependencies or reasons against backporting them (the
> missing stable tags are per Dave's preference), sorry.
> 
> The cover letter also mentions another dependency, but that may just
> have been some context conflict.
> 
> Perhaps you better drop these unless you want to review them closer.

Good idea, I've dropped them all for now, sorry for the noise.

greg k-h


Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Vaibhav Jain
Hi Boris,

Borislav Petkov  writes:

> On Fri, May 08, 2020 at 04:19:19PM +0530, Vaibhav Jain wrote:
>> 'seq_buf' provides a very useful abstraction for writing to a string
>> buffer without needing to worry about it overflowing. However, even
>> though the API has been stable for a couple of years now, it is still not
>> exported to external modules, limiting its usage.
>> 
>> Hence this patch proposes update to 'seq_buf.c' to mark
>> seq_buf_printf() which is part of the seq_buf API to be exported to
>> external GPL modules. This symbol will be used in later parts of this
>
> What is "external GPL modules"?
I am referring to Kernel Loadable Modules with MODULE_LICENSE("GPL")
here.

>
> -- 
> Regards/Gruss,
> Boris.
>
> https://people.kernel.org/tglx/notes-about-netiquette


RE: [PATCH 2/3] dts: ppc: t4240rdb: add uie_unsupported property to drop warning

2020-05-08 Thread Biwen Li (OSS)
> 
> On 08/05/2020 13:49:24+0800, Biwen Li wrote:
> > From: Biwen Li 
> >
> > This adds uie_unsupported property to drop warning as follows:
> > - $ hwclock.util-linux
> >   hwclock.util-linux: select() to /dev/rtc0
> >   to wait for clock tick timed out
> >
> > My case:
> > - RTC ds1374's INT pin is connected to VCC on T4240RDB,
> >   then the RTC cannot inform cpu about the alarm interrupt
> >
> > Signed-off-by: Biwen Li 
> > ---
> >  arch/powerpc/boot/dts/fsl/t4240rdb.dts | 6 +-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> > index a56a705d41f7..ccdd10202e56 100644
> > --- a/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> > +++ b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
> > @@ -144,7 +144,11 @@
> > rtc@68 {
> > compatible = "dallas,ds1374";
> > reg = <0x68>;
> > -   interrupts = <0x1 0x1 0 0>;
> 
> removing the interrupt should be enough to solve your issue
Okay, got it. Thanks.
> 
> > +   // The ds1374's INT pin isn't
> > +   // connected to cpu's INT pin,
> > +   // so the rtc cannot synchronize
> > +   // clock tick per second.
> > +   uie_unsupported;
> > };
> > };
> >
> > --
> > 2.17.1
> >
> 
> --
> Alexandre Belloni, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


RE: [PATCH 1/3] rtc: ds1374: add uie_unsupported property to drop warning

2020-05-08 Thread Biwen Li (OSS)
> 
> Hi,
> 
> On 08/05/2020 13:49:23+0800, Biwen Li wrote:
> > From: Biwen Li 
> >
> > Add uie_unsupported property to drop warning as follows:
> > - $ hwclock.util-linux
> >   hwclock.util-linux: select() to /dev/rtc0
> >   to wait for clock tick timed out
> >
> > My case:
> > - RTC ds1374's INT pin is connected to VCC on T4240RDB,
> >   then the RTC cannot inform cpu about the alarm
> >   interrupt
> >
> > Signed-off-by: Biwen Li 
> > ---
> >  drivers/rtc/rtc-ds1374.c | 4 
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/rtc/rtc-ds1374.c b/drivers/rtc/rtc-ds1374.c index
> > 9c51a12cf70f..e530e887a17e 100644
> > --- a/drivers/rtc/rtc-ds1374.c
> > +++ b/drivers/rtc/rtc-ds1374.c
> > @@ -651,6 +651,10 @@ static int ds1374_probe(struct i2c_client *client,
> > if (ret)
> > return ret;
> >
> > +   if (of_property_read_bool(client->dev.of_node,
> > +"uie_unsupported"))
> > +   ds1374->rtc->uie_unsupported = true;
> > +
> 
> This is not how this is supposed to work: either the RTC supports UIE or
> it doesn't. It is not board dependent and certainly doesn't require an
> (undocumented) DT property.
Okay, got it. Thanks.
> 
> >  #ifdef CONFIG_RTC_DRV_DS1374_WDT
> > save_client = client;
> > ret = misc_register(_miscdev);
> > --
> > 2.17.1
> >
> 
> --
> Alexandre Belloni, Bootlin
> Embedded Linux and Kernel engineering
> https://bootlin.com


Re: [PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Borislav Petkov
On Fri, May 08, 2020 at 04:19:19PM +0530, Vaibhav Jain wrote:
> 'seq_buf' provides a very useful abstraction for writing to a string
> buffer without needing to worry about it overflowing. However, even
> though the API has been stable for a couple of years now, it is still not
> exported to external modules, limiting its usage.
> 
> Hence this patch proposes update to 'seq_buf.c' to mark
> seq_buf_printf() which is part of the seq_buf API to be exported to
> external GPL modules. This symbol will be used in later parts of this

What is "external GPL modules"?

-- 
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette


[PATCH v7 5/5] powerpc/papr_scm: Implement support for PAPR_SCM_PDSM_HEALTH

2020-05-08 Thread Vaibhav Jain
This patch implements support for PDSM request 'PAPR_SCM_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_scm_get_health() that queries the scm-dimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

The patch also introduces a new member 'struct papr_scm_priv.health'
that is an instance of 'struct nd_papr_pdsm_health' to cache the health
information of an nvdimm. As a result, functions drc_pmem_query_health()
and flags_show() are updated to populate and use this new struct
instead of the u64 integer that was used earlier.

Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: "Aneesh Kumar K . V" 
Signed-off-by: Vaibhav Jain 
---
Changelog:
v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  guard against the possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as it is exported to userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h |  39 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 125 +++---
 2 files changed, 147 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
index 671693439c1c..db0cf550dabe 100644
--- a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
@@ -113,6 +113,7 @@ struct nd_pdsm_cmd_pkg {
  */
 enum papr_scm_pdsm {
PAPR_SCM_PDSM_MIN = 0x0,
+   PAPR_SCM_PDSM_HEALTH,
PAPR_SCM_PDSM_MAX,
 };
 
@@ -131,4 +132,42 @@ static inline void *pdsm_cmd_to_payload(struct nd_pdsm_cmd_pkg *pcmd)
return (void *)((__u8 *) pcmd + pcmd->payload_offset);
 }
 
+/* Various scm-dimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY   0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL  2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl for PAPR_SCM_PDSM_HEALTH.
+ * Various flags indicate the health status of the dimm.
+ *
+ * dimm_unarmed        : Dimm not armed. So contents won't persist.
+ * dimm_bad_shutdown   : Previous shutdown did not persist contents.
+ * dimm_bad_restore    : Contents from previous shutdown weren't restored.
+ * dimm_scrubbed       : Contents of the dimm have been scrubbed.
+ * dimm_locked         : Contents of the dimm can't be modified until CEC reboot
+ * dimm_encrypted      : Contents of dimm are encrypted.
+ * dimm_health         : Dimm health indicator. One of PAPR_PDSM_DIMM_
+ */
+struct nd_papr_pdsm_health_v1 {
+   __u8 dimm_unarmed;
+   __u8 dimm_bad_shutdown;
+   __u8 dimm_bad_restore;
+   __u8 dimm_scrubbed;
+   __u8 dimm_locked;
+   __u8 dimm_encrypted;
+   __u16 dimm_health;
+} __packed;
+
+/*
+ * Typedef the current struct for dimm_health so that any application
+ * or kernel recompiled after introducing a new version automatically
+ * supports the new version.
+ */
+#define nd_papr_pdsm_health nd_papr_pdsm_health_v1
+
+/* Current version number for the dimm health struct */
+#define ND_PAPR_PDSM_HEALTH_VERSION 1
+
 #endif /* _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_ */
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index ed4b49a6f1e1..c59bf17ad054 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -88,7 +88,7 @@ struct papr_scm_priv {
unsigned long lasthealth_jiffies;
 
/* Health information for the dimm */
-   u64 health_bitmap;
+   struct nd_papr_pdsm_health health;
 };
 
 static int drc_pmem_bind(struct papr_scm_priv *p)
@@ -201,6 +201,7 @@ static int drc_pmem_query_n_bind(struct papr_scm_priv *p)
 static int __drc_pmem_query_health(struct papr_scm_priv *p)
 {
unsigned long ret[PLPAR_HCALL_BUFSIZE];
+   u64 health;
s64 rc;
 
/* issue the hcall */
@@ -208,18 +209,46 @@ static int __drc_pmem_query_health(struct papr_scm_priv *p)
if (rc != H_SUCCESS) {
dev_err(>pdev->dev,
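
The payload layout introduced by this patch can be checked from userspace with a small sketch. It mirrors 'struct nd_papr_pdsm_health_v1' and the PAPR_PDSM_DIMM_* values from the hunk above, substituting <stdint.h> types for __u8/__u16 and the GCC/Clang packed attribute for __packed. The helper dimm_health_name() and its label strings are hypothetical additions for illustration and are not part of the patch.

```c
#include <stdint.h>

/* Health indicator values, copied from the patch. */
#define PAPR_PDSM_DIMM_HEALTHY   0
#define PAPR_PDSM_DIMM_UNHEALTHY 1
#define PAPR_PDSM_DIMM_CRITICAL  2
#define PAPR_PDSM_DIMM_FATAL     3

/* Userspace mirror of the payload; packed so no padding can be added. */
struct nd_papr_pdsm_health_v1 {
	uint8_t dimm_unarmed;
	uint8_t dimm_bad_shutdown;
	uint8_t dimm_bad_restore;
	uint8_t dimm_scrubbed;
	uint8_t dimm_locked;
	uint8_t dimm_encrypted;
	uint16_t dimm_health;
} __attribute__((packed));

/* Hypothetical helper: map a dimm_health value to a printable label. */
static const char *dimm_health_name(uint16_t health)
{
	switch (health) {
	case PAPR_PDSM_DIMM_HEALTHY:
		return "healthy";
	case PAPR_PDSM_DIMM_UNHEALTHY:
		return "unhealthy";
	case PAPR_PDSM_DIMM_CRITICAL:
		return "critical";
	case PAPR_PDSM_DIMM_FATAL:
		return "fatal";
	default:
		return "unknown";
	}
}
```

Six one-byte flags plus one two-byte indicator give a payload of 8 bytes.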
 

[PATCH v7 2/5] seq_buf: Export seq_buf_printf() to external modules

2020-05-08 Thread Vaibhav Jain
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it overflowing. However, even
though the API has been stable for a couple of years now, it is still not
exported to external modules, limiting its usage.

Hence this patch proposes an update to 'seq_buf.c' to mark
seq_buf_printf(), which is part of the seq_buf API, to be exported to
external GPL modules. This symbol will be used in later parts of this
patchset to simplify content creation for a sysfs attribute.

Cc: Steven Rostedt 
Cc: Piotr Maziarz 
Cc: Cezary Rojewski 
Cc: Borislav Petkov 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2
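
To illustrate the idea of a string buffer that cannot overflow, here is a userspace analogue built on vsnprintf(). It is only a sketch: the names sketch_buf, sketch_buf_init(), sketch_buf_printf() and sketch_buf_has_overflowed() are hypothetical, and the behaviour on overflow is an assumption rather than a copy of the kernel's seq_buf implementation.

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for a bounded, append-only buffer. */
struct sketch_buf {
	char *buffer;
	size_t size;	/* total capacity, including the trailing NUL */
	size_t len;	/* bytes used; equals size once overflowed */
};

static void sketch_buf_init(struct sketch_buf *s, char *buf, size_t size)
{
	s->buffer = buf;
	s->size = size;
	s->len = 0;
	if (size)
		buf[0] = '\0';
}

static int sketch_buf_has_overflowed(const struct sketch_buf *s)
{
	return s->len >= s->size;
}

/* Append formatted text; returns 0 on success, -1 if it did not fit. */
static int sketch_buf_printf(struct sketch_buf *s, const char *fmt, ...)
{
	va_list args;
	size_t room;
	int n;

	if (sketch_buf_has_overflowed(s))
		return -1;

	room = s->size - s->len;
	va_start(args, fmt);
	n = vsnprintf(s->buffer + s->len, room, fmt, args);
	va_end(args);

	if (n < 0 || (size_t)n >= room) {
		s->len = s->size;	/* remember that output was truncated */
		return -1;
	}
	s->len += (size_t)n;
	return 0;
}
```

A caller can append several flag names in a row and check for overflow once at the end, which is the convenience the commit message alludes to.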



[PATCH v7 4/5] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-05-08 Thread Vaibhav Jain
Introduce support for Papr nvDimm Specific Methods (PDSM) in papr_scm
modules and add the command family to the white list of NVDIMM command
sets. Also advertise support for ND_CMD_CALL for the dimm
command mask and implement necessary scaffolding in the module to
handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in the newly introduced uapi header 'papr_scm_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_pkg_papr_scm->nd_command' and the size of the payload that needs to be
sent/received for servicing the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() when a PDSM request is received via the ND_CMD_CALL
command from libnvdimm.

Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: "Aneesh Kumar K . V" 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_pdsm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to guard
  against different compilers adding padding between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h | 134 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 101 -
 include/uapi/linux/ndctl.h|   1 +
 3 files changed, 230 insertions(+), 6 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_scm_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
new file mode 100644
index ..671693439c1c
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_scm_pdsm.h
@@ -0,0 +1,134 @@
+/* SPDX-License-Identifier: GPL-2.0+ WITH Linux-syscall-note */
+/*
+ * PAPR-SCM Dimm specific methods (PDSM) and structs for libndctl
+ *
+ * (C) Copyright IBM 2020
+ *
+ * Author: Vaibhav Jain 
+ */
+
+#ifndef _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_
+#define _UAPI_ASM_POWERPC_PAPR_SCM_PDSM_H_
+
+#include 
+
+/*
+ * PDSM Envelope:
+ *
+ * The ioctl ND_CMD_CALL transfers data between user-space and kernel via
+ * 'envelopes' which consists of a header and user-defined payload sections.
+ * The header is described by 'struct nd_pdsm_cmd_pkg' which expects a
+ * payload following it and offset of which relative to the struct is provided
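+ * by 'nd_pdsm_cmd_pkg.payload_offset'.
+ *
+ *  +-------------+---------------------+---------------------------+
+ *  |   64-Bytes  |       8-Bytes       |       Max 184-Bytes       |
+ *  +-------------+---------------------+---------------------------+
+ *  |               nd_pdsm_cmd_pkg     |                           |
+ *  |-------------+                     |                           |
+ *  |  nd_cmd_pkg |                     |                           |
+ *  +-------------+---------------------+---------------------------+
+ *  | nd_family   |                     |                           |
+ *  | nd_size_out | cmd_status          |                           |
+ *  | nd_size_in  | payload_version     |          PAYLOAD          |
+ *  | nd_command  | payload_offset ----->                           |
+ *  | nd_fw_size  |                     |                           |
+ *  +-------------+---------------------+---------------------------+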
+ *
+ * PDSM Header:
+ *
+ * The header is defined as 'struct nd_pdsm_cmd_pkg' which embeds a
+ * 'struct nd_cmd_pkg' instance. The PDSM command is assigned to member
+ * 'nd_cmd_pkg.nd_command'. Apart from size information of the envelope which is
+ * contained in 'struct nd_cmd_pkg', the header also has the following
+ * members:
+ *
+ * 'cmd_status'        : (Out) Errors if any encountered while servicing PDSM.
+ * 'payload_version'   : (In/Out) Version number associated with the payload.
+ * 
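
The envelope described in the comment above can be sketched in userspace as follows. This is an illustration under stated assumptions: the 64-byte 'struct nd_cmd_pkg' is replaced by an opaque stand-in, the order and widths of the three 8-byte header fields are guesses chosen to match the sizes in the diagram, and every sketch_* name is hypothetical. Only the pointer arithmetic follows pdsm_cmd_to_payload() as it appears in the series.

```c
#include <stdint.h>

/* Opaque stand-in for the 64-byte 'struct nd_cmd_pkg' header. */
struct sketch_nd_cmd_pkg {
	uint8_t raw[64];
};

/*
 * Stand-in for 'struct nd_pdsm_cmd_pkg'. Field order and widths are
 * assumptions that merely add up to the 8 bytes shown in the diagram.
 */
struct sketch_pdsm_cmd_pkg {
	struct sketch_nd_cmd_pkg hdr;
	int32_t cmd_status;
	uint16_t payload_offset;
	uint16_t payload_version;
} __attribute__((packed));

/* Header followed by the largest payload the diagram allows. */
struct sketch_envelope {
	struct sketch_pdsm_cmd_pkg pkg;
	uint8_t payload[184];
} __attribute__((packed));

/* Locate the payload relative to the start of the header. */
static void *sketch_cmd_to_payload(struct sketch_pdsm_cmd_pkg *pcmd)
{
	return (void *)((uint8_t *)pcmd + pcmd->payload_offset);
}
```

Setting payload_offset to the size of the header makes the payload start directly after it, for a total envelope of 64 + 8 + 184 = 256 bytes.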

[PATCH v7 3/5] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-05-08 Thread Vaibhav Jain
Implement support for fetching nvdimm health information via the
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit big-endian integers, the bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via the newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried once 60s have passed since the previous
successful hcall.

The patch also adds documentation text describing the flags reported by
the new sysfs attribute 'papr/flags' at
Documentation/ABI/testing/sysfs-bus-papr-scm.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: "Aneesh Kumar K . V" 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v6..v7 :
* Used the exported seq_buf_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Fixed some minor consistency issues in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two functions, one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-scm |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c| 169 ++-
 2 files changed, 194 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-scm

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-scm b/Documentation/ABI/testing/sysfs-bus-papr-scm
new file mode 100644
index ..6143d06072f1
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-scm
@@ -0,0 +1,27 @@
+What:  /sys/bus/nd/devices/nmemX/papr/flags
+Date:  Apr, 2020
+KernelVersion: v5.8
+Contact:   linuxppc-dev , linux-nvd...@lists.01.org,
+Description:
+   (RO) Report flags indicating various states of a
+   papr-scm NVDIMM device. Each flag maps to one or
+   more bits set in the dimm-health-bitmap retrieved in
+   response to H_SCM_HEALTH hcall. The details of the bit
+   flags returned in response to this hcall are available
+   at 'Documentation/powerpc/papr_hcalls.rst'. Below are
+   the flags reported in this sysfs file:
+   * "not_armed"   : Indicates that NVDIMM contents will not
+ survive a power cycle.
+   * "flush_fail"  : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+   * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+   * "encrypted"   : NVDIMM contents are encrypted.
+   * "smart_notify": There is a health event for the NVDIMM.
+   * "scrubbed": Indicating that contents of the
+ NVDIMM have been scrubbed.
+   * "locked"  : Indicating that NVDIMM contents can't
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c b/arch/powerpc/platforms/pseries/papr_scm.c
index f35592423380..142636e1a59f 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -12,6 +12,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 
@@ -22,6 +23,44 @@
 (1ul << ND_CMD_GET_CONFIG_DATA) | \
 (1ul << ND_CMD_SET_CONFIG_DATA))
 
+/* DIMM health bitmap indicators */
+/* SCM device is unable to persist memory contents */
+#define PAPR_SCM_DIMM_UNARMED   (1ULL << (63 - 0))
+/* SCM device failed to persist memory contents */
+#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY(1ULL << (63 - 1))
+/* SCM device contents are persisted from previous IPL */
+#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN(1ULL << (63 - 2))
+/* SCM device contents 
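
The bit definitions above use big-endian (IBM) bit numbering, where bit 0 is the most significant bit of the 64-bit word. A small userspace sketch of how such flags could be tested follows. The three macros are copied from the hunk; effective_health() and health_flag_set() are hypothetical helpers, and the parameter names assigned to the two hcall return values are assumptions based on the commit message's "bitwise-and" description.

```c
#include <stdbool.h>
#include <stdint.h>

/* Copied from the patch: bit n counts from the most significant bit. */
#define PAPR_SCM_DIMM_UNARMED        (1ULL << (63 - 0))
#define PAPR_SCM_DIMM_SHUTDOWN_DIRTY (1ULL << (63 - 1))
#define PAPR_SCM_DIMM_SHUTDOWN_CLEAN (1ULL << (63 - 2))

/*
 * Hypothetical helper: combine the two 64-bit values returned by the
 * hcall with a bitwise-and, as the commit message describes.
 */
static uint64_t effective_health(uint64_t bitmap, uint64_t valid_mask)
{
	return bitmap & valid_mask;
}

/* Hypothetical helper: true if any bit of flag is set in health. */
static bool health_flag_set(uint64_t health, uint64_t flag)
{
	return (health & flag) != 0;
}
```

For instance, a raw bitmap with both of the top two bits set, combined with a second value that only has the second bit set, reports SHUTDOWN_DIRTY but not UNARMED.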

[PATCH v7 0/5] powerpc/papr_scm: Add support for reporting nvdimm health

2020-05-08 Thread Vaibhav Jain
The PAPR standard[1][3] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[2].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[2].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndctl changes located at Ref[4]
should provide a way for the user to retrieve NVDIMM health status
using ndctl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in an emulation environment:

 # ndctl list -DH
[
  {
"dev":"nmem0",
"health":{
  "health_state":"fatal",
  "shutdown_state":"dirty"
}
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
"dev":"nmem0",
"health":{
  "health_state":"ok",
  "shutdown_state":"clean"
}
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==================================

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMs. PDSM
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_scm_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==========================

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
that is used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.
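
As a rough illustration of how the health bitmap returned by H_SCM_HEALTH
could be interpreted, here is a minimal userspace sketch. It assumes the
PAPR convention used by the macros in this series (bit 0 is the most
significant bit of the 64-bit word) and masks the health bitmap with the
valid bitmap before decoding. The helper decode_health(), the PAPR_BIT()
macro and the short labels are hypothetical and are not taken from the
patches.

```c
#include <stddef.h>
#include <stdint.h>

/* PAPR numbers bits from the most significant end: bit 0 is the MSB. */
#define PAPR_BIT(n) (1ULL << (63 - (n)))

/* Short labels for bits 00..09 of the health bitmap (illustrative wording). */
static const char *const health_bit_desc[] = {
	"unable to persist memory contents",
	"failed to persist memory contents",
	"contents persisted from previous IPL",
	"contents not persisted from previous IPL",
	"memory life remaining is critically low",
	"will be garded off next IPL due to failure",
	"cannot persist due to platform health status",
	"unable to persist in certain conditions",
	"encrypted",
	"erase or secure erase completed",
};

#define NUM_HEALTH_BITS (sizeof(health_bit_desc) / sizeof(health_bit_desc[0]))

/*
 * Hypothetical helper: store the labels of all bits that are both asserted
 * in 'bitmap' and marked valid in 'valid'. Returns the number of labels
 * written to 'out' (at most 'max').
 */
static size_t decode_health(uint64_t bitmap, uint64_t valid,
			    const char **out, size_t max)
{
	uint64_t effective = bitmap & valid;
	size_t n = 0;

	for (size_t bit = 0; bit < NUM_HEALTH_BITS && n < max; bit++)
		if (effective & PAPR_BIT(bit))
			out[n++] = health_bit_desc[bit];

	return n;
}
```

A bit that is set in the health bitmap but not in the valid bitmap is
ignored by this sketch.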

The fourth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_SCM_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

Changelog:
==========

v6..v7:

* Incorporate various review comments from Mpe.
* Removed papr_scm.h header file and moved its contents to papr_scm.c.
* Added a patch to export seq_buf_printf() [Mpe, Steven Rostedt]
* Split function drc_pmem_query_health() into two functions, one that takes
  care of caching and concurrency and other one that doesn't.
* Fixed a possible incorrect way to make local copy of nvdimm health data.
* Some variable renames changed as suggested in previous review.
* Removed unused macros/defines from papr_scm_pdsm.h
* Updated papr_scm_pdsm.h to remove usage of __KERNEL__ define.
* Updated papr_scm_pdsm.h to remove redefinition of __packed macro.

v5..v6:

* Incorporate review comments from Mpe and Dan Williams.
* Changed the usage of term DSM to PDSM as former conflicted with
  usage in ACPI context.
* UAPI updates to remove usage of bool and marking the structs 
  defined as 'packed'.
* Simplified the health-bitmap handling in papr_scm to use u64
  instead of __be64 integers.
* Caching of the health information so reading the dimm-flag file
  doesn't result in costly hcalls every time.
* Changed dimm-flag 'save_fail' to 'flush_fail'
* Moved the dimm flag file from 'papr_flags' to 'papr/flags'.
* Added a patch to document H_SCM_HEALTH hcall return values.
* Added sysfs ABI documentation for newly introduced dimm-flag
  sysfs file 'papr/flags'

v4..v5:

* Fixed a bug in new implementation of papr_scm_ndctl() that was triggering
  a false error condition.

v3..v4:

* Restructured papr_scm_ndctl() to dispatch ND_CMD_CALL commands to a new
  function named papr_scm_service_dsm() to service DSM requests. [Aneesh]

v2..v3:

* Updated the papr_scm_dsm.h header to be more conformant to general kernel
  guidelines for UAPI headers. [Aneesh]

* Changed the definition of macro PAPR_SCM_DIMM_UNARMED_MASK to not
  include case when the NVDIMM is unarmed because it's a vPMEM
  NVDIMM. [Aneesh]

v1..v2:

* Restructured the patch-set based on review comments on V1 patch-set to
simplify the patch review. Multiple small patches have been combined into
single patches to reduce cross referencing that was needed in earlier
patch-set. Hence most of the patches in this patch-set are now new. [Aneesh]

* Removed the initial work done for fetching NVDIMM performance statistics.
These changes will be re-proposed in a separate patch-set. [Aneesh]

* Simplified handling 

[PATCH v7 1/5] powerpc: Document details on H_SCM_HEALTH hcall

2020-05-08 Thread Vaibhav Jain
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: "Aneesh Kumar K . V" 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v6..v7:
* None

v5..v6
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 43 ---
 1 file changed, 39 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst 
b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..9a5ba5eaf323 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,48 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the NVDIMM. The asserted bits in the health-bitmap indicate one or more states
+(described in table below) of the NVDIMM and health-bit-valid-bitmap indicate
+which bits in health-bitmap are valid.
+
+Health Bitmap Flags:
+
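++------+-----------------------------------------------------------------------+
+|  Bit | Definition                                                            |
++======+=======================================================================+
+|  00  | SCM device is unable to persist memory contents.                      |
+|      | If the system is powered down, nothing will be saved.                 |
++------+-----------------------------------------------------------------------+
+|  01  | SCM device failed to persist memory contents. Either contents were not|
+|      | saved successfully on power down or were not restored properly on     |
+|      | power up.                                                             |
++------+-----------------------------------------------------------------------+
+|  02  | SCM device contents are persisted from previous IPL. The data from    |
+|      | the last boot were successfully restored.                             |
++------+-----------------------------------------------------------------------+
+|  03  | SCM device contents are not persisted from previous IPL. There was no |
+|      | data to restore from the last boot.                                   |
++------+-----------------------------------------------------------------------+
+|  04  | SCM device memory life remaining is critically low                    |
++------+-----------------------------------------------------------------------+
+|  05  | SCM device will be garded off next IPL due to failure                 |
++------+-----------------------------------------------------------------------+
+|  06  | SCM contents cannot persist due to current platform health status. A  |
+|      | hardware failure may prevent data from being saved or restored.       |
++------+-----------------------------------------------------------------------+
+|  07  | SCM device is unable to persist memory contents in certain conditions |
++------+-----------------------------------------------------------------------+
+|  08  | SCM device is encrypted                                               |
++------+-----------------------------------------------------------------------+
+|  09  | SCM device has successfully completed a requested erase or secure     |
+|      | erase procedure.                                                      |
++------+-----------------------------------------------------------------------+
+|10:63 | Reserved / Unused                                                     |
++------+-----------------------------------------------------------------------+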
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2



Re: [PATCH] powerpc/spufs: adjust list element pointer type

2020-05-08 Thread Julia Lawall



On Fri, 8 May 2020, Jeremy Kerr wrote:

> Hi Julia,
>
> > Other uses of &gang->aff_list_head, eg in spufs_assert_affinity, indicate
> > that the list elements have type spu_context, not spu as used here.  Change
> > the type of tmp accordingly.
>
> Looks good to me; we could even use ctx there, rather than the separate
> tmp variable.

I thought about that, but it seemed a little bit abusive, since ctx is
used in an iteration over another list.  But if you prefer that I can
change it.

julia

>
> Reviewed-by: Jeremy Kerr 
>
> Cheers,
>
>
> Jeremy
>
>


Re: [PATCH] powerpc/spufs: adjust list element pointer type

2020-05-08 Thread Jeremy Kerr
Hi Julia,

> Other uses of &gang->aff_list_head, eg in spufs_assert_affinity, indicate
> that the list elements have type spu_context, not spu as used here.  Change
> the type of tmp accordingly.

Looks good to me; we could even use ctx there, rather than the separate
tmp variable.

Reviewed-by: Jeremy Kerr 

Cheers,


Jeremy



[PATCH] powerpc/spufs: adjust list element pointer type

2020-05-08 Thread Julia Lawall
Other uses of &gang->aff_list_head, eg in spufs_assert_affinity, indicate
that the list elements have type spu_context, not spu as used here.  Change
the type of tmp accordingly.

This has no impact on the execution, because tmp is not used in the body of
the loop.

Fixes: c5fc8d2a92461 ("[CELL] cell: add placement computation for scheduling of 
affinity contexts")
Signed-off-by: Julia Lawall 

---
 arch/powerpc/platforms/cell/spufs/sched.c |3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/sched.c 
b/arch/powerpc/platforms/cell/spufs/sched.c
index f18d5067cd0f..487fcb47f10d 100644
--- a/arch/powerpc/platforms/cell/spufs/sched.c
+++ b/arch/powerpc/platforms/cell/spufs/sched.c
@@ -344,8 +344,7 @@ static struct spu *aff_ref_location(struct spu_context 
*ctx, int mem_aff,
 static void aff_set_ref_point_location(struct spu_gang *gang)
 {
int mem_aff, gs, lowest_offset;
-   struct spu_context *ctx;
-   struct spu *tmp;
+   struct spu_context *tmp, *ctx;
 
mem_aff = gang->aff_ref_ctx->flags & SPU_CREATE_AFFINITY_MEM;
lowest_offset = 0;



Re: [PATCH v2 0/5] Statsfs: a new ram-based file system for Linux kernel statistics

2020-05-08 Thread Paolo Bonzini
[Answering for Emanuele because he's not available until Monday]

On 07/05/20 19:45, Jonathan Adams wrote:
> This is good work.  As David Rientjes mentioned, I'm currently investigating
> a similar project, based on a google-internal debugfs-based FS we call
> "metricfs".  It's
> designed in a slightly different fashion than statsfs here is, and the
> statistics exported are
> mostly fed into our OpenTelemetry-like system.  We're motivated by
> wanting an upstreamed solution, so that we can upstream the metrics we
> create that are of general interest, and lower the overall rebasing
> burden for our tree.

Cool.  We included a public reading API exactly so that there could be
other "frontends".  I was mostly thinking of BPF as an in-tree user, but
your metricfs could definitely use the reading API.

>  - the 8/16/32/64 signed/unsigned integers seems like a wart, and the
> built-in support to grab any offset from a structure doesn't seem like
> much of an advantage. A simpler interface would be to just support an
> "integer" (possibly signed/unsigned) type, which is always 64-bit, and
> allow the caller to provide a function pointer to retrieve the value,
> with one or two void *s cbargs.  Then the framework could provide an
> offset-based callback (or callbacks) similar to the existing
> functionality, and a similar one for per-CPU based statistics.  A
> second "clear" callback could be optionally provided to allow for
> statistics to be cleared, as in your current proposal.

Ok, so basically splitting get_simple_value into many separate
callbacks.  The callbacks would be in a struct like

struct stats_fs_type {
	uint64_t (*get)(struct stats_fs_value *, void *);
	void (*clear)(struct stats_fs_value *, void *);
	bool is_signed;	/* "signed" itself is a reserved word in C */
};

static uint64_t stats_fs_get_u8(struct stats_fs_value *val, void *base)
{
	return *((uint8_t *)(base + (uintptr_t)val->arg));
}

static void stats_fs_clear_u8(struct stats_fs_value *val, void *base)
{
	*((uint8_t *)(base + (uintptr_t)val->arg)) = 0;
}

struct stats_fs_type stats_fs_type_u8 = {
	stats_fs_get_u8,
	stats_fs_clear_u8,
	false
};

and custom types can be defined using "&(struct stats_fs_type) {...}".
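
To make the idea concrete, here is a self-contained userspace sketch of
such a callback-based type. It is only an approximation of what the kernel
code might look like: the 'offset' field, the 'demo_stats' structure and
the u64 variant are assumptions made for illustration, and the flag is
spelled 'is_signed' because 'signed' is a C keyword.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct stats_fs_type;

/* Hypothetical value descriptor: a name, a field offset and a type. */
struct stats_fs_value {
	const char *name;
	size_t offset;			/* offset of the field inside the base struct */
	const struct stats_fs_type *type;
};

struct stats_fs_type {
	uint64_t (*get)(const struct stats_fs_value *, void *);
	void (*clear)(const struct stats_fs_value *, void *);
	bool is_signed;
};

static uint64_t stats_fs_get_u8(const struct stats_fs_value *val, void *base)
{
	return *(uint8_t *)((char *)base + val->offset);
}

static void stats_fs_clear_u8(const struct stats_fs_value *val, void *base)
{
	*(uint8_t *)((char *)base + val->offset) = 0;
}

static uint64_t stats_fs_get_u64(const struct stats_fs_value *val, void *base)
{
	return *(uint64_t *)((char *)base + val->offset);
}

static void stats_fs_clear_u64(const struct stats_fs_value *val, void *base)
{
	*(uint64_t *)((char *)base + val->offset) = 0;
}

const struct stats_fs_type stats_fs_type_u8 = {
	stats_fs_get_u8, stats_fs_clear_u8, false
};

const struct stats_fs_type stats_fs_type_u64 = {
	stats_fs_get_u64, stats_fs_clear_u64, false
};

/* Example structure whose fields are exported as statistics. */
struct demo_stats {
	uint8_t exits;
	uint64_t bytes;
};
```

A caller would describe each field with offsetof(struct demo_stats, ...)
and go through value->type->get(value, base), so every value is read as a
64-bit integer regardless of its storage width.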

>  - Beyond the statistic's type, one *very* useful piece of metadata
> for telemetry tools is knowing whether a given statistic is
> "cumulative" (an unsigned counter which is only ever increased), as
> opposed to a floating value (like "amount of memory used").

Good idea.  Also, clearing does not make sense for a floating value, so
we can use cumulative/floating to get a default for the mode: KVM
statistics for example are mostly cumulative and mode 644, except a few
that are floating and those are all mode 444.  Therefore it makes sense
to add cumulative/floating even before outputting it as metadata.
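
The default could be derived along these lines. This is only a sketch; the
enum and function names are made up for illustration and are not part of
the posted statsfs API.

```c
#include <stdbool.h>

/* Hypothetical classification of a statistic. */
enum stats_fs_kind {
	STATS_FS_CUMULATIVE,	/* counter that only ever increases */
	STATS_FS_FLOATING,	/* value that goes up and down */
};

/* Clearing only makes sense for cumulative counters. */
static bool stats_fs_can_clear(enum stats_fs_kind kind)
{
	return kind == STATS_FS_CUMULATIVE;
}

/* Default file mode: writable (to clear) only if clearing makes sense. */
static unsigned int stats_fs_default_mode(enum stats_fs_kind kind)
{
	return stats_fs_can_clear(kind) ? 0644 : 0444;
}
```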

> I'm more
> concerned with getting the statistics model and capabilities right
> from the beginning, because those are harder to adjust later.

Agreed.

> 1. Each metricfs metric can have one or two string or integer "keys".
> If these exist, they expand the metric from a single value into a
> multi-dimensional table. For example, we use this to report a hash
> table we keep of functions calling "WARN()", in a 'warnings'
> statistic:
> 
> % cat .../warnings/values
> x86_pmu_stop 1
> %
>
> Indicates that the x86_pmu_stop() function has had a WARN() fire once
> since the system was booted.  If multiple functions have fired
> WARN()s, they are listed in this table with their own counts. [1]  We
> also use these to report per-CPU counters on a CPU-by-CPU basis:
> 
> % cat .../irq_x86/NMI/values
> 0 42
> 1 18
> ... one line per cpu
> % cat .../rx_bytes/values
> lo 501360681
> eth0 1457631256

These seem like two different things.

The percpu and per-interface values are best represented as subordinate
sources, one per CPU and one per interface.  For interfaces I would just
use a separate directory, but it doesn't really make sense for CPUs.  So
if we can cater for it in the model, it's better.  For example:

- add a new argument to statsfs_create_source and statsfs_create_values
that makes it not create directories and files respectively.

- add a new "aggregate function" STATS_FS_LIST that directs the parent
to build a table of all the simple values below it

We can also add a helper statsfs_add_values_percpu that creates a new
source for each CPU, I think.

The warnings one instead is a real hash table.  It should be possible to
implement it as some kind of customized aggregation, that is implemented
in the client instead of coming from subordinate sources.  The
presentation can then just use STATS_FS_LIST.  I don't see anything in
the design that is a blocker.
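
As a very rough sketch of what a STATS_FS_LIST presentation might produce,
the following userspace code formats one "key value" line per subordinate
source, similar to the metricfs output quoted above. The 'sub_source'
structure and stats_fs_list_format() are hypothetical names; nothing like
them exists in the posted series.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical subordinate source: a key (CPU number, interface name, ...)
 * and the simple value it carries. */
struct sub_source {
	const char *name;
	uint64_t value;
};

/*
 * Build a table with one "name value" line per subordinate source.
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the buffer is too small.
 */
static int stats_fs_list_format(const struct sub_source *subs, size_t n,
				char *buf, size_t len)
{
	size_t used = 0;

	if (len == 0)
		return -1;
	buf[0] = '\0';

	for (size_t i = 0; i < n; i++) {
		int w = snprintf(buf + used, len - used, "%s %llu\n",
				 subs[i].name,
				 (unsigned long long)subs[i].value);

		if (w < 0 || (size_t)w >= len - used)
			return -1;
		used += (size_t)w;
	}

	return (int)used;
}
```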

> 2.  We also export some metadata about each statistic.  For example,
> the metadata for the NMI counter above looks like:
> 
> % cat .../NMI/annotations
> DESCRIPTION Non-maskable\ interrupts
> CUMULATIVE
> % cat .../NMI/fields
> cpu value
> int int
> %

Good 

Re: [PATCH v8 11/30] powerpc: Use a datatype for instructions

2020-05-08 Thread Christophe Leroy




On 08/05/2020 at 03:51, Jordan Niethe wrote:

On Wed, May 6, 2020 at 1:45 PM Jordan Niethe  wrote:


Currently unsigned ints are used to represent instructions on powerpc.
This has worked well as instructions have always been 4 byte words.
However, a future ISA version will introduce some changes to
instructions that mean this scheme will no longer work as well. This
change is Prefixed Instructions. A prefixed instruction is made up of a
word prefix followed by a word suffix to make an 8 byte double word
instruction. No matter the endianness of the system the prefix always
comes first. Prefixed instructions are only planned for powerpc64.

Introduce a ppc_inst type to represent both prefixed and word
instructions on powerpc64 while keeping it possible to exclusively have
word instructions on powerpc32.
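
For readers unfamiliar with the idea, a simplified userspace sketch of such
a type could look like the following. It assumes that a prefix word is
identified by primary opcode 1 (the top six bits of the word), and the
helper names other than ppc_inst() and ppc_inst_equal() are illustrative;
the layout and helpers in the actual series may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Sketch of an instruction type covering word and prefixed instructions. */
struct ppc_inst {
	uint32_t val;		/* word instruction, or the prefix */
	uint32_t suffix;	/* only meaningful for prefixed instructions */
};

/* Assumption: a prefix word has primary opcode 1. */
#define OP_PREFIX 1

static struct ppc_inst ppc_inst(uint32_t word)
{
	return (struct ppc_inst){ .val = word, .suffix = 0 };
}

static struct ppc_inst ppc_inst_prefix(uint32_t prefix, uint32_t suffix)
{
	return (struct ppc_inst){ .val = prefix, .suffix = suffix };
}

static bool ppc_inst_prefixed(struct ppc_inst x)
{
	return (x.val >> 26) == OP_PREFIX;
}

/* Length in bytes: 4 for a word instruction, 8 for a prefixed one. */
static int ppc_inst_len(struct ppc_inst x)
{
	return ppc_inst_prefixed(x) ? 8 : 4;
}

/* Compare field by field; the suffix only matters when prefixed. */
static bool ppc_inst_equal(struct ppc_inst a, struct ppc_inst b)
{
	if (a.val != b.val)
		return false;
	return !ppc_inst_prefixed(a) || a.suffix == b.suffix;
}
```

Wrapping the raw word in a struct means the compiler rejects code that
still passes a plain unsigned int where an instruction is expected.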

Signed-off-by: Jordan Niethe 
---
v4: New to series
v5: Add to epapr_paravirt.c, kgdb.c
v6: - setup_32.c: machine_init(): Use type
 - feature-fixups.c: do_final_fixups(): Use type
 - optprobes.c: arch_prepare_optimized_kprobe(): change a void * to
   struct ppc_inst *
 - fault.c: store_updates_sp(): Use type
 - Change ppc_inst_equal() implementation from memcpy()
v7: - Fix compilation issue in early_init_dt_scan_epapr() and
   do_patch_instruction() with CONFIG_STRICT_KERNEL_RWX
v8: - style
 - Use in crash_dump.c, mpc86xx_smp.c, smp.c
---


[...]





Hi mpe,
Could you add this fixup.
--- a/arch/powerpc/lib/feature-fixups.c
+++ b/arch/powerpc/lib/feature-fixups.c
@@ -356,7 +356,7 @@ static void patch_btb_flush_section(long *curr)
 end = (void *)curr + *(curr + 1);
 for (; start < end; start++) {
 pr_devel("patching dest %lx\n", (unsigned long)start);
-   patch_instruction(start, ppc_inst(PPC_INST_NOP));
+   patch_instruction((struct ppc_inst *)start,
ppc_inst(PPC_INST_NOP));
 }
  }



Why not declare start and end as struct ppc_inst? Wouldn't it be
cleaner than a cast?


Christophe


Re: [PATCH v4 02/14] arm: add support for folded p4d page tables

2020-05-08 Thread Marek Szyprowski
Hi Mike,

On 07.05.2020 18:11, Mike Rapoport wrote:
> On Thu, May 07, 2020 at 02:16:56PM +0200, Marek Szyprowski wrote:
>> On 14.04.2020 17:34, Mike Rapoport wrote:
>>> From: Mike Rapoport 
>>>
>>> Implement primitives necessary for the 4th level folding, add walks of p4d
>>> level where appropriate, and remove __ARCH_USE_5LEVEL_HACK.
>>>
>>> Signed-off-by: Mike Rapoport 
>> Today I've noticed that kexec is broken on ARM 32bit. Bisecting between
>> current linux-next and v5.7-rc1 pointed to this commit. I've tested this
>> on Odroid XU4 and Raspberry Pi4 boards. Here is the relevant log:
>>
>> # kexec --kexec-syscall -l zImage --append "$(cat /proc/cmdline)"
>> memory_range[0]:0x4000..0xbe9f
>> memory_range[0]:0x4000..0xbe9f
>> # kexec -e
>> kexec_core: Starting new kernel
>> 8<--- cut here ---
>> Unable to handle kernel paging request at virtual address c010f1f4
>> pgd = c6817793
>> [c010f1f4] *pgd=441e(bad)
>> Internal error: Oops: 80d [#1] PREEMPT ARM
>> Modules linked in:
>> CPU: 0 PID: 1329 Comm: kexec Tainted: G    W
>> 5.7.0-rc3-00127-g6cba81ed0f62 #611
>> Hardware name: Samsung Exynos (Flattened Device Tree)
>> PC is at machine_kexec+0x40/0xfc
> Any chance you have the debug info in this kernel?
> scripts/faddr2line would come handy here.

# ./scripts/faddr2line --list vmlinux machine_kexec+0x40
machine_kexec+0x40/0xf8:

machine_kexec at arch/arm/kernel/machine_kexec.c:182
  177    reboot_code_buffer = 
page_address(image->control_code_page);
  178
  179    /* Prepare parameters for reboot_code_buffer*/
  180    set_kernel_text_rw();
  181    kexec_start_address = image->start;
 >182<   kexec_indirection_page = page_list;
  183    kexec_mach_type = machine_arch_type;
  184    kexec_boot_atags = image->arch.kernel_r2;
  185
  186    /* copy our kernel relocation code to the control code 
page */
  187    reboot_entry = fncpy(reboot_code_buffer,

 > ...

Best regards
-- 
Marek Szyprowski, PhD
Samsung R&D Institute Poland



Re: [PATCH net 11/16] net: ethernet: marvell: mvneta: fix fixed-link phydev leaks

2020-05-08 Thread Johan Hovold
On Fri, May 08, 2020 at 03:35:02AM +0530, Naresh Kamboju wrote:
> On Thu, 7 May 2020 at 16:43, Greg Kroah-Hartman
>  wrote:
> >
> 
> > > >
> > > > Greg, 3f65047c853a ("of_mdio: add helper to deregister fixed-link
> > > > PHYs") needs to be backported as well for these.
> > > >
> > > > Original series can be found here:
> > > >
> > > > 
> > > > https://lkml.kernel.org/r/1480357509-28074-1-git-send-email-jo...@kernel.org
> > >
> > > Ah, thanks for that, I thought I dropped all of the ones that caused
> > > build errors, but missed the above one.  I'll go take the whole series
> > > instead.
> >
> > This should now all be fixed up, thanks.
> 
> While building kernel Image for arm architecture on stable-rc 4.4 branch
> the following build error found.
> 
> of_mdio: add helper to deregister fixed-link PHYs
> commit 3f65047c853a2a5abcd8ac1984af3452b5df4ada upstream.
> 
> Add helper to deregister fixed-link PHYs registered using
> of_phy_register_fixed_link().
> 
> Convert the two drivers that care to deregister their fixed-link PHYs to
> use the new helper, but note that most drivers currently fail to do so.
> 
> Signed-off-by: Johan Hovold 
> Signed-off-by: David S. Miller 
> [only take helper function for 4.4.y - gregkh]
> 
>  # make -sk KBUILD_BUILD_USER=TuxBuild -C/linux -j16 ARCH=arm
> CROSS_COMPILE=arm-linux-gnueabihf- HOSTCC=gcc CC="sccache
> arm-linux-gnueabihf-gcc" O=build zImage
> #
> ../drivers/of/of_mdio.c: In function ‘of_phy_deregister_fixed_link’:
> ../drivers/of/of_mdio.c:379:2: error: implicit declaration of
> function ‘fixed_phy_unregister’; did you mean ‘fixed_phy_register’?
> [-Werror=implicit-function-declaration]
>  379 | fixed_phy_unregister(phydev);
>  | ^~~~
>  | fixed_phy_register
> ../drivers/of/of_mdio.c:381:22: error: ‘struct phy_device’ has no
> member named ‘mdio’; did you mean ‘mdix’?
>  381 | put_device(&phydev->mdio.dev); /* of_phy_find_device() */
>  | ^~~~
>  | mdix

Another dependency: 5bcbe0f35fb1 ("phy: fixed: Fix removal of phys.")

Greg, these patches are from four years ago so can't really remember if
there are other dependencies or reasons against backporting them (the
missing stable tags are per Dave's preference), sorry.

The cover letter also mentions another dependency, but that may just
have been some context conflict.

Perhaps you better drop these unless you want to review them closer.

Johan


[PATCH 3/3] dts: ppc: t1024rdb: add wakeup-source property to drop warning

2020-05-08 Thread Biwen Li
From: Biwen Li 

This adds wakeup-source property to drop warning as follows:
- $ hwclock.util-linux
  hwclock.util-linux: select() to /dev/rtc0
  to wait for clock tick timed out

My case:
- RTC ds1339's INT pin isn't connected to the CPU's INT pin on T1024RDB,
  so the RTC cannot inform the CPU about the alarm interrupt

How to fix it?
- add wakeup-source property and remove IRQ line
  to set uie_unsupported flag

Signed-off-by: Biwen Li 
---
 arch/powerpc/boot/dts/fsl/t1024rdb.dts | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/dts/fsl/t1024rdb.dts 
b/arch/powerpc/boot/dts/fsl/t1024rdb.dts
index 645caff98ed1..191cbf5cda4e 100644
--- a/arch/powerpc/boot/dts/fsl/t1024rdb.dts
+++ b/arch/powerpc/boot/dts/fsl/t1024rdb.dts
@@ -161,7 +161,7 @@
rtc@68 {
compatible = "dallas,ds1339";
reg = <0x68>;
-   interrupts = <0x1 0x1 0 0>;
+   wakeup-source;
};
};
 
-- 
2.17.1



[PATCH 2/3] dts: ppc: t4240rdb: add uie_unsupported property to drop warning

2020-05-08 Thread Biwen Li
From: Biwen Li 

This adds uie_unsupported property to drop warning as follows:
- $ hwclock.util-linux
  hwclock.util-linux: select() to /dev/rtc0
  to wait for clock tick timed out

My case:
- RTC ds1374's INT pin is connected to VCC on T4240RDB,
  then the RTC cannot inform cpu about the alarm interrupt

Signed-off-by: Biwen Li 
---
 arch/powerpc/boot/dts/fsl/t4240rdb.dts | 6 +-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/boot/dts/fsl/t4240rdb.dts 
b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
index a56a705d41f7..ccdd10202e56 100644
--- a/arch/powerpc/boot/dts/fsl/t4240rdb.dts
+++ b/arch/powerpc/boot/dts/fsl/t4240rdb.dts
@@ -144,7 +144,11 @@
rtc@68 {
compatible = "dallas,ds1374";
reg = <0x68>;
-   interrupts = <0x1 0x1 0 0>;
+   // The ds1374's INT pin isn't
+   // connected to cpu's INT pin,
+   // so the rtc cannot synchronize
+   // clock tick per second.
+   uie_unsupported;
};
};
 
-- 
2.17.1



[PATCH 1/3] rtc: ds1374: add uie_unsupported property to drop warning

2020-05-08 Thread Biwen Li
From: Biwen Li 

Add uie_unsupported property to drop warning as follows:
- $ hwclock.util-linux
  hwclock.util-linux: select() to /dev/rtc0
  to wait for clock tick timed out

My case:
- RTC ds1374's INT pin is connected to VCC on T4240RDB,
  then the RTC cannot inform cpu about the alarm
  interrupt

Signed-off-by: Biwen Li 
---
 drivers/rtc/rtc-ds1374.c | 4 
 1 file changed, 4 insertions(+)

diff --git a/drivers/rtc/rtc-ds1374.c b/drivers/rtc/rtc-ds1374.c
index 9c51a12cf70f..e530e887a17e 100644
--- a/drivers/rtc/rtc-ds1374.c
+++ b/drivers/rtc/rtc-ds1374.c
@@ -651,6 +651,10 @@ static int ds1374_probe(struct i2c_client *client,
if (ret)
return ret;
 
+   if (of_property_read_bool(client->dev.of_node,
+"uie_unsupported"))
+   ds1374->rtc->uie_unsupported = true;
+
 #ifdef CONFIG_RTC_DRV_DS1374_WDT
save_client = client;
ret = misc_register(&ds1374_miscdev);
-- 
2.17.1



Re: [PATCH 0/7] sha1 library cleanup

2020-05-08 Thread Herbert Xu
Eric Biggers  wrote:
> <linux/cryptohash.h> sounds very generic and important, like it's the
> header to include if you're doing cryptographic hashing in the kernel.
> But actually it only includes the library implementation of the SHA-1
> compression function (not even the full SHA-1).  This should basically
> never be used anymore; SHA-1 is no longer considered secure, and there
> are much better ways to do cryptographic hashing in the kernel.
> 
> Also the function is named just "sha_transform()", which makes it
> unclear which version of SHA is meant.
> 
> Therefore, this series cleans things up by moving these SHA-1
> declarations into <crypto/sha.h> where they better belong, and changing
> the names to say SHA-1 rather than just SHA.
> 
> As future work, we should split sha.h into sha1.h and sha2.h and try to
> remove the remaining uses of SHA-1.  For example, the remaining use in
> drivers/char/random.c is probably one that can be gotten rid of.
> 
> This patch series applies to cryptodev/master.
> 
> Eric Biggers (7):
>  mptcp: use SHA256_BLOCK_SIZE, not SHA_MESSAGE_BYTES
>  crypto: powerpc/sha1 - remove unused temporary workspace
>  crypto: powerpc/sha1 - prefix the "sha1_" functions
>  crypto: s390/sha1 - prefix the "sha1_" functions
>  crypto: lib/sha1 - rename "sha" to "sha1"
>  crypto: lib/sha1 - remove unnecessary includes of linux/cryptohash.h
>  crypto: lib/sha1 - fold linux/cryptohash.h into crypto/sha.h
> 
> Documentation/security/siphash.rst  |  2 +-
> arch/arm/crypto/sha1_glue.c |  1 -
> arch/arm/crypto/sha1_neon_glue.c|  1 -
> arch/arm/crypto/sha256_glue.c   |  1 -
> arch/arm/crypto/sha256_neon_glue.c  |  1 -
> arch/arm/kernel/armksyms.c  |  1 -
> arch/arm64/crypto/sha256-glue.c |  1 -
> arch/arm64/crypto/sha512-glue.c |  1 -
> arch/microblaze/kernel/microblaze_ksyms.c   |  1 -
> arch/mips/cavium-octeon/crypto/octeon-md5.c |  1 -
> arch/powerpc/crypto/md5-glue.c  |  1 -
> arch/powerpc/crypto/sha1-spe-glue.c |  1 -
> arch/powerpc/crypto/sha1.c  | 33 ++---
> arch/powerpc/crypto/sha256-spe-glue.c   |  1 -
> arch/s390/crypto/sha1_s390.c| 12 
> arch/sparc/crypto/md5_glue.c|  1 -
> arch/sparc/crypto/sha1_glue.c   |  1 -
> arch/sparc/crypto/sha256_glue.c |  1 -
> arch/sparc/crypto/sha512_glue.c |  1 -
> arch/unicore32/kernel/ksyms.c   |  1 -
> arch/x86/crypto/sha1_ssse3_glue.c   |  1 -
> arch/x86/crypto/sha256_ssse3_glue.c |  1 -
> arch/x86/crypto/sha512_ssse3_glue.c |  1 -
> crypto/sha1_generic.c   |  5 ++--
> drivers/char/random.c   |  8 ++---
> drivers/crypto/atmel-sha.c  |  1 -
> drivers/crypto/chelsio/chcr_algo.c  |  1 -
> drivers/crypto/chelsio/chcr_ipsec.c |  1 -
> drivers/crypto/omap-sham.c  |  1 -
> fs/f2fs/hash.c  |  1 -
> include/crypto/sha.h| 10 +++
> include/linux/cryptohash.h  | 14 -
> include/linux/filter.h  |  4 +--
> include/net/tcp.h   |  1 -
> kernel/bpf/core.c   | 18 +--
> lib/crypto/chacha.c |  1 -
> lib/sha1.c  | 24 ---
> net/core/secure_seq.c   |  1 -
> net/ipv6/addrconf.c | 10 +++
> net/ipv6/seg6_hmac.c|  1 -
> net/mptcp/crypto.c  |  4 +--
> 41 files changed, 69 insertions(+), 104 deletions(-)
> delete mode 100644 include/linux/cryptohash.h
> 
> 
> base-commit: 12b3cf9093542d9f752a4968815ece836159013f

All applied.  Thanks.
-- 
Email: Herbert Xu 
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt