Re: [v1 PATCH 1/2] Refactoring carrying over IMA measuremnet logs over Kexec.

2020-06-07 Thread kernel test robot
Hi Prakhar,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on arm64/for-next/core]
[also build test WARNING on powerpc/next soc/for-next v5.7 next-20200605]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Prakhar-Srivastava/Adding-support-to-carry-IMA-measurement-logs/20200608-073805
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git 
for-next/core
config: arm64-allyesconfig (attached as .config)
compiler: aarch64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> security/integrity/ima/ima_kexec.c:59:5: warning: no previous prototype for 
>> 'ima_get_kexec_buffer' [-Wmissing-prototypes]
59 | int ima_get_kexec_buffer(void **addr, size_t *size)
| ^~~~
>> security/integrity/ima/ima_kexec.c:85:5: warning: no previous prototype for 
>> 'delete_fdt_mem_rsv' [-Wmissing-prototypes]
85 | int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long size)
| ^~
>> security/integrity/ima/ima_kexec.c:115:5: warning: no previous prototype for 
>> 'ima_free_kexec_buffer' [-Wmissing-prototypes]
115 | int ima_free_kexec_buffer(void)
| ^
>> security/integrity/ima/ima_kexec.c:144:6: warning: no previous prototype for 
>> 'remove_ima_buffer' [-Wmissing-prototypes]
144 | void remove_ima_buffer(void *fdt, int chosen_node)
|  ^
security/integrity/ima/ima_kexec.c:231:6: warning: no previous prototype for 
'ima_add_kexec_buffer' [-Wmissing-prototypes]
231 | void ima_add_kexec_buffer(struct kimage *image)
|  ^~~~

vim +/ima_get_kexec_buffer +59 security/integrity/ima/ima_kexec.c

51  
52  /**
53   * ima_get_kexec_buffer - get IMA buffer from the previous kernel
54   * @addr:   On successful return, set to point to the buffer 
contents.
55   * @size:   On successful return, set to the buffer size.
56   *
57   * Return: 0 on success, negative errno on error.
58   */
  > 59  int ima_get_kexec_buffer(void **addr, size_t *size)
60  {
61  int ret, len;
62  unsigned long tmp_addr;
63  size_t tmp_size;
64  const void *prop;
65  
66  prop = of_get_property(of_chosen, "linux,ima-kexec-buffer", 
);
67  if (!prop)
68  return -ENOENT;
69  
70  ret = do_get_kexec_buffer(prop, len, _addr, _size);
71  if (ret)
72  return ret;
73  
74  *addr = __va(tmp_addr);
75  *size = tmp_size;
76  
77  return 0;
78  }
79  
80  /**
81   * delete_fdt_mem_rsv - delete memory reservation with given address 
and size
82   *
83   * Return: 0 on success, or negative errno on error.
84   */
  > 85  int delete_fdt_mem_rsv(void *fdt, unsigned long start, unsigned long 
size)
86  {
87  int i, ret, num_rsvs = fdt_num_mem_rsv(fdt);
88  
89  for (i = 0; i < num_rsvs; i++) {
90  uint64_t rsv_start, rsv_size;
91  
92  ret = fdt_get_mem_rsv(fdt, i, _start, _size);
93  if (ret) {
94  pr_err("Malformed device tree.\n");
95  return -EINVAL;
96  }
97  
98  if (rsv_start == start && rsv_size == size) {
99  ret = fdt_del_mem_rsv(fdt, i);
   100  if (ret) {
   101  pr_err("Error deleting device tree 
reservation.\n");
   102  return -EINVAL;
   103  }
   104  
   105  return 0;
   106  }
   107  }
   108  
   109  return -ENOENT;
   110  }
   111  
   112  /**
   113   * ima_free_kexec_buffer - free memory used by the IMA buffer
   114   */
 > 115  int ima_free_kexec_buffer(void)
   116  {
   117  int ret;
   118  unsigned long addr;
   119  size_t size;
   120  struct property *prop;
   121  
   122  prop = of_find_property(of_chosen, "linux,ima-kexec-buffer", 
NULL);
   123  if (!prop)
   124  return -ENOENT;
   125  
   126  

Re: [v1 PATCH 1/2] Refactoring carrying over IMA measuremnet logs over Kexec.

2020-06-07 Thread kernel test robot
Hi Prakhar,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on arm64/for-next/core]
[also build test ERROR on powerpc/next soc/for-next v5.7 next-20200605]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Prakhar-Srivastava/Adding-support-to-carry-IMA-measurement-logs/20200608-073805
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git 
for-next/core
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross 
ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All errors (new ones prefixed by >>, old ones prefixed by <<):

arch/powerpc/kexec/ima.c:24:5: warning: no previous prototype for 
'arch_ima_add_kexec_buffer' [-Wmissing-prototypes]
24 | int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long 
load_addr,
| ^
arch/powerpc/kexec/ima.c:62:5: warning: no previous prototype for 
'setup_ima_buffer' [-Wmissing-prototypes]
62 | int setup_ima_buffer(const struct kimage *image, void *fdt, int 
chosen_node)
| ^~~~
arch/powerpc/kexec/ima.c: In function 'setup_ima_buffer':
>> arch/powerpc/kexec/ima.c:71:8: error: implicit declaration of function 
>> 'get_addr_size_cells'; did you mean 'fdt_address_cells'? 
>> [-Werror=implicit-function-declaration]
71 |  ret = get_addr_size_cells(_cells, _cells);
|^~~
|fdt_address_cells
cc1: some warnings being treated as errors

vim +71 arch/powerpc/kexec/ima.c

ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 53  
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 54  /**
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 55   * setup_ima_buffer - add IMA buffer information to the fdt
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 56   * @image: kexec image being loaded.
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 57   * @fdt:   Flattened device tree for the next kernel.
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 58   * @chosen_node:   Offset to the chosen node.
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 59   *
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 60   * Return: 0 on success, or negative errno on error.
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 61   */
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 62  int setup_ima_buffer(const struct kimage *image, void *fdt, int 
chosen_node)
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 63  {
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 64 int ret, addr_cells, size_cells, entry_size;
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 65 u8 value[16];
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 66  
aea659ce44ba07 arch/powerpc/kexec/ima.cPrakhar Srivastava2020-06-07 
 67  // remove_ima_buffer(fdt, chosen_node);
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 68 if (!image->arch.ima_buffer_size)
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 69 return 0;
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
 70  
ab6b1d1fc4aae6 arch/powerpc/kernel/ima_kexec.c Thiago Jung Bauermann 2016-12-19 
@71 ret = get_addr_size_cells(_cells, _cells);

:: The code at line 71 was first introduced by commit
:: ab6b1d1fc4aae6b8bd6fb1422405568094c9b40f powerpc: ima: send the kexec 
buffer to the next kernel

:: TO: Thiago Jung Bauermann 
:: CC: Linus Torvalds 

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


Re: [v1 PATCH 1/2] Refactoring carrying over IMA measuremnet logs over Kexec.

2020-06-07 Thread kernel test robot
Hi Prakhar,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on arm64/for-next/core]
[also build test WARNING on powerpc/next soc/for-next v5.7 next-20200605]
[if your patch is applied to the wrong git tree, please drop us a note to help
improve the system. BTW, we also suggest to use '--base' option to specify the
base tree in git format-patch, please see https://stackoverflow.com/a/37406982]

url:
https://github.com/0day-ci/linux/commits/Prakhar-Srivastava/Adding-support-to-carry-IMA-measurement-logs/20200608-073805
base:   https://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git 
for-next/core
config: arm64-randconfig-r012-20200607 (attached as .config)
compiler: clang version 11.0.0 (https://github.com/llvm/llvm-project 
e429cffd4f228f70c1d9df0e5d77c08590dd9766)
reproduce (this is a W=1 build):
wget 
https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O 
~/bin/make.cross
chmod +x ~/bin/make.cross
# install arm64 cross compiling tool for clang build
# apt-get install binutils-aarch64-linux-gnu
# save the attached .config to linux build tree
COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross ARCH=arm64 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot 

All warnings (new ones prefixed by >>, old ones prefixed by <<):

>> arch/arm64/kernel/machine_kexec_file.c:50:5: warning: no previous prototype 
>> for function 'arch_ima_add_kexec_buffer' [-Wmissing-prototypes]
int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
^
arch/arm64/kernel/machine_kexec_file.c:50:1: note: declare 'static' if the 
function is not intended to be used outside of this translation unit
int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
^
static
1 warning generated.

vim +/arch_ima_add_kexec_buffer +50 arch/arm64/kernel/machine_kexec_file.c

41  
42  /**
43   * arch_ima_add_kexec_buffer - do arch-specific steps to add the IMA 
buffer
44   *
45   * Architectures should use this function to pass on the IMA buffer
46   * information to the next kernel.
47   *
48   * Return: 0 on success, negative errno on error.
49   */
  > 50  int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long 
load_addr,
51size_t size)
52  {
53  image->arch.ima_buffer_addr = load_addr;
54  image->arch.ima_buffer_size = size;
55  return 0;
56  }
57  

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-...@lists.01.org


.config.gz
Description: application/gzip


[v1 PATCH 2/2] Add Documentation regarding the ima-kexec-buffer node in the chosen node documentation

2020-06-07 Thread Prakhar Srivastava
Add Documentation regarding the ima-kexec-buffer node in
 the chosen node documentation
 
Signed-off-by: Prakhar Srivastava 
---
 Documentation/devicetree/bindings/chosen.txt | 17 +
 1 file changed, 17 insertions(+)

diff --git a/Documentation/devicetree/bindings/chosen.txt 
b/Documentation/devicetree/bindings/chosen.txt
index 45e79172a646..a15f70c007ef 100644
--- a/Documentation/devicetree/bindings/chosen.txt
+++ b/Documentation/devicetree/bindings/chosen.txt
@@ -135,3 +135,20 @@ e.g.
linux,initrd-end = <0x8280>;
};
 };
+
+linux,ima-kexec-buffer
+--
+
+This property(currently used by powerpc, arm64) holds the memory range,
+the address and the size, of the IMA measurement logs that are being carried
+over to the kexec session.
+
+/ {
+   chosen {
+   linux,ima-kexec-buffer = <0x9 0x8200 0x0 0x8000>;
+   };
+};
+
+This porperty does not represent real hardware, but the memory allocated for
+carrying the IMA measurement logs. The address and the suze are expressed in
+#address-cells and #size-cells, respectively of the root node.
-- 
2.25.1



[v1 PATCH 0/2] Adding support to carry IMA measurement logs

2020-06-07 Thread Prakhar Srivastava
IMA during kexec(kexec file load) verifies the kernel signature and measures
the signature of the kernel. The signature in the logs can be used to verfiy 
the 
authenticity of the kernel. The logs don not get carried over kexec and thus
remote attesation cannot verify the signature of the running kernel.

Add a new chosen node entry linux,ima-kexec-buffer to hold the address and
the size of the memory reserved to carry the IMA measurement log.

Tested on:
  arm64 with Uboot

Changelog:

v1:
  Refactoring carrying over IMA measuremnet logs over Kexec. This patch
moves the non-architecture specific code out of powerpc and adds to
security/ima.(Suggested by Thiago)
  Add Documentation regarding the ima-kexec-buffer node in the chosen
node documentation

v0:
  Add a layer of abstraction to use the memory reserved by device tree
for ima buffer pass.
  Add support for ima buffer pass using reserved memory for arm64 kexec.
Update the arch sepcific code path in kexec file load to store the
ima buffer in the reserved memory. The same reserved memory is read
on kexec or cold boot.

 Documentation/devicetree/bindings/chosen.txt |  17 +++
 arch/arm64/Kconfig   |   1 +
 arch/arm64/include/asm/ima.h |  24 +++
 arch/arm64/include/asm/kexec.h   |   3 +
 arch/arm64/kernel/machine_kexec_file.c   |  47 +-
 arch/powerpc/include/asm/ima.h   |   9 --
 arch/powerpc/kexec/ima.c | 117 +-
 security/integrity/ima/ima_kexec.c   | 151 +++
 8 files changed, 236 insertions(+), 133 deletions(-)
 create mode 100644 arch/arm64/include/asm/ima.h

-- 
2.25.1



[v1 PATCH 1/2] Refactoring carrying over IMA measuremnet logs over Kexec.

2020-06-07 Thread Prakhar Srivastava
This patch moves the non-architecture specific code out of powerpc and
 adds to security/ima. 
Update the arm64 and powerpc kexec file load paths to carry the IMA measurement
logs.

Signed-off-by: Prakhar Srivastava 
---
 arch/arm64/Kconfig |   1 +
 arch/arm64/include/asm/ima.h   |  24 
 arch/arm64/include/asm/kexec.h |   3 +
 arch/arm64/kernel/machine_kexec_file.c |  47 ++--
 arch/powerpc/include/asm/ima.h |   9 --
 arch/powerpc/kexec/ima.c   | 117 +--
 security/integrity/ima/ima_kexec.c | 151 +
 7 files changed, 219 insertions(+), 133 deletions(-)
 create mode 100644 arch/arm64/include/asm/ima.h

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 5d513f461957..3d544e2e25e6 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1070,6 +1070,7 @@ config KEXEC
 config KEXEC_FILE
bool "kexec file based system call"
select KEXEC_CORE
+   select HAVE_IMA_KEXEC
help
  This is new version of kexec system call. This system call is
  file based and takes file descriptors as system call argument
diff --git a/arch/arm64/include/asm/ima.h b/arch/arm64/include/asm/ima.h
new file mode 100644
index ..8946bae8baa2
--- /dev/null
+++ b/arch/arm64/include/asm/ima.h
@@ -0,0 +1,24 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_ARCH_IMA_H
+#define _ASM_ARCH_IMA_H
+
+struct kimage;
+
+#ifdef CONFIG_IMA_KEXEC
+int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
+ size_t size);
+
+int setup_ima_buffer(const struct kimage *image, void *fdt, int chosen_node);
+#else
+static inline int arch_ima_add_kexec_buffer(struct kimage *image,
+   unsigned long load_addr, size_t size)
+{
+   return 0;
+}
+static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
+  int chosen_node)
+{
+   return 0;
+}
+#endif /* CONFIG_IMA_KEXEC */
+#endif /* _ASM_ARCH_IMA_H */
diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index d24b527e8c00..7bd60c185ad3 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -100,6 +100,9 @@ struct kimage_arch {
void *elf_headers;
unsigned long elf_headers_mem;
unsigned long elf_headers_sz;
+
+   phys_addr_t ima_buffer_addr;
+   size_t ima_buffer_size;
 };
 
 extern const struct kexec_file_ops kexec_image_ops;
diff --git a/arch/arm64/kernel/machine_kexec_file.c 
b/arch/arm64/kernel/machine_kexec_file.c
index b40c3b0def92..1e9007c926db 100644
--- a/arch/arm64/kernel/machine_kexec_file.c
+++ b/arch/arm64/kernel/machine_kexec_file.c
@@ -24,20 +24,37 @@
 #include 
 
 /* relevant device tree properties */
-#define FDT_PROP_KEXEC_ELFHDR  "linux,elfcorehdr"
-#define FDT_PROP_MEM_RANGE "linux,usable-memory-range"
-#define FDT_PROP_INITRD_START  "linux,initrd-start"
-#define FDT_PROP_INITRD_END"linux,initrd-end"
-#define FDT_PROP_BOOTARGS  "bootargs"
-#define FDT_PROP_KASLR_SEED"kaslr-seed"
-#define FDT_PROP_RNG_SEED  "rng-seed"
-#define RNG_SEED_SIZE  128
+#define FDT_PROP_KEXEC_ELFHDR  "linux,elfcorehdr"
+#define FDT_PROP_MEM_RANGE "linux,usable-memory-range"
+#define FDT_PROP_INITRD_START  "linux,initrd-start"
+#define FDT_PROP_INITRD_END"linux,initrd-end"
+#define FDT_PROP_BOOTARGS  "bootargs"
+#define FDT_PROP_KASLR_SEED"kaslr-seed"
+#define FDT_PROP_RNG_SEED  "rng-seed"
+#define FDT_PROP_IMA_KEXEC_BUFFER  "linux,ima-kexec-buffer"
+#define RNG_SEED_SIZE  128
 
 const struct kexec_file_ops * const kexec_file_loaders[] = {
_image_ops,
NULL
 };
 
+/**
+ * arch_ima_add_kexec_buffer - do arch-specific steps to add the IMA buffer
+ *
+ * Architectures should use this function to pass on the IMA buffer
+ * information to the next kernel.
+ *
+ * Return: 0 on success, negative errno on error.
+ */
+int arch_ima_add_kexec_buffer(struct kimage *image, unsigned long load_addr,
+ size_t size)
+{
+   image->arch.ima_buffer_addr = load_addr;
+   image->arch.ima_buffer_size = size;
+   return 0;
+}
+
 int arch_kimage_file_post_load_cleanup(struct kimage *image)
 {
vfree(image->arch.dtb);
@@ -66,6 +83,9 @@ static int setup_dtb(struct kimage *image,
if (ret && ret != -FDT_ERR_NOTFOUND)
goto out;
ret = fdt_delprop(dtb, off, FDT_PROP_MEM_RANGE);
+   if (ret && ret != -FDT_ERR_NOTFOUND)
+   goto out;
+   ret = fdt_delprop(dtb, off, FDT_PROP_IMA_KEXEC_BUFFER);
if (ret && ret != -FDT_ERR_NOTFOUND)
goto out;
 
@@ -119,6 +139,17 @@ static int setup_dtb(struct kimage *image,
goto out;
}
 
+   if (image->arch.ima_buffer_size > 0) {
+
+   ret = 

Re: Boot issue with the latest Git kernel

2020-06-07 Thread Aneesh Kumar K.V
Christian Zigotzky  writes:

> On 04 June 2020 at 7:15 pm, Christophe Leroy wrote:
>> Yes today's linux-next boots on my powerpc 8xx board.
>>
>> Christophe
> Hello Christophe,
>
> Thanks for testing.
>
> I was able to perform a 'git bisect' [1] and identified the bad commit. 
> [2] I reverted this commit and after that the kernel boots and works 
> without any problems.
>
> Could you please check this commit?
>
> Thanks,
> Christian
>
>
> [1] https://forum.hyperion-entertainment.com/viewtopic.php?p=50772#p50772
> [2] 
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2ba3e6947aed9bb9575eb1603c0ac6e39185d32a

This was also reported here.

https://lore.kernel.org/linuxppc-dev/1591181457.9020.13.camel@abdul

-aneesh


Re: Boot issue with the latest Git kernel

2020-06-07 Thread Christian Zigotzky

Hi All,

It seems, someone has fixed the boot issue. The latest Git kernel boots 
on my PowerPC machines.


Thanks,
Christian


On 05 June 2020 at 6:23 pm, Christian Zigotzky wrote:

On 04 June 2020 at 7:15 pm, Christophe Leroy wrote:

Yes today's linux-next boots on my powerpc 8xx board.

Christophe

Hello Christophe,

Thanks for testing.

I was able to perform a 'git bisect' [1] and identified the bad 
commit. [2] I reverted this commit and after that the kernel boots and 
works without any problems.


Could you please check this commit?

Thanks,
Christian


[1] https://forum.hyperion-entertainment.com/viewtopic.php?p=50772#p50772
[2] 
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2ba3e6947aed9bb9575eb1603c0ac6e39185d32a




[PATCH v11 6/6] powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

2020-06-07 Thread Vaibhav Jain
This patch implements support for PDSM request 'PAPR_PDSM_HEALTH'
that returns a newly introduced 'struct nd_papr_pdsm_health' instance
containing dimm health information back to user space in response to
ND_CMD_CALL. This functionality is implemented in newly introduced
papr_pdsm_health() that queries the nvdimm health information and
then copies this information to the package payload whose layout is
defined by 'struct nd_papr_pdsm_health'.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes in size [ Dan Williams ]
* Added new field 'extension_flags' to 'struct nd_papr_pdsm_health'
  [ Dan Williams ]
* Updated papr_pdsm_health() to set field 'extension_flags' to 0.
* Introduced a define ND_PDSM_PAYLOAD_MAX_SIZE that indicates the
  maximum size of a payload.
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'. [ Ira ]

v9..v10:
* Removed code in papr_pdsm_health that performed validation on pdsm
  payload version and corrosponding struct and defines used for
  validation of payload version.
* Dropped usage of struct papr_pdsm_health in 'struct
  papr_scm_priv'. Instead papr_psdm_health() now uses
  'papr_scm_priv.health_bitmap' to populate the pdsm payload.
* Above change also fixes the problem where this patch was removing
  the code that was previously introduced in this patch-series.
  [ Ira ]
* Introduced a new def ND_PDSM_ENVELOPE_HDR_SIZE that indicates the
  space allocated to 'struct nd_pdsm_cmd_pkg' fields except 'struct
  nd_cmd_pkg'. This def is useful in validating payload sizes.
* Reworked papr_pdsm_health() to enforce a specific payload size for
  'PAPR_PDSM_HEALTH' pdsm request.

Resend:
* Added ack from Aneesh.

v8..v9:
* s/PAPR_SCM_PDSM_HEALTH/PAPR_PDSM_HEALTH/g  [ Dan , Aneesh ]
* s/PAPR_SCM_PSDM_DIMM_*/PAPR_PDSM_DIMM_*/g
* Renamed papr_scm_get_health() to papr_psdm_health()
* Updated patch description to replace papr-scm dimm with nvdimm.

v7..v8:
* None

Resend:
* None

v6..v7:
* Updated flags_show() to use seq_buf_printf(). [Mpe]
* Updated papr_scm_get_health() to use newly introduced
  __drc_pmem_query_health() bypassing the cache [Mpe].

v5..v6:
* Added attribute '__packed' to 'struct nd_papr_pdsm_health_v1' to
  gaurd against possibility of different compilers adding different
  paddings to the struct [ Dan Williams ]

* Updated 'struct nd_papr_pdsm_health_v1' to use __u8 instead of
  'bool' and also updated drc_pmem_query_health() to take this into
  account. [ Dan Williams ]

v4..v5:
* None

v3..v4:
* Call the DSM_PAPR_SCM_HEALTH service function from
  papr_scm_service_dsm() instead of papr_scm_ndctl(). [Aneesh]

v2..v3:
* Updated struct nd_papr_scm_dimm_health_stat_v1 to use '__xx' types
  as its exported to the userspace [Aneesh]
* Changed the constants DSM_PAPR_SCM_DIMM_XX indicating dimm health
  from enum to #defines [Aneesh]

v1..v2:
* New patch in the series
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 43 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 71 +++
 2 files changed, 114 insertions(+)

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
b/arch/powerpc/include/uapi/asm/papr_pdsm.h
index df2447455cfe..12c7aa5ee8bf 100644
--- a/arch/powerpc/include/uapi/asm/papr_pdsm.h
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -72,13 +72,56 @@ struct nd_pdsm_cmd_pkg {
__u8 payload[]; /* In/Out: Sub-cmd data buffer */
 } __packed;
 
+/* Calculate size used by the pdsm header fields minus 'struct nd_cmd_pkg' */
+#define ND_PDSM_HDR_SIZE \
+   (sizeof(struct nd_pdsm_cmd_pkg) - sizeof(struct nd_cmd_pkg))
+
+/* Max payload size that we can handle */
+#define ND_PDSM_PAYLOAD_MAX_SIZE 184
+
 /*
  * Methods to be embedded in ND_CMD_CALL request. These are sent to the kernel
  * via 'nd_pdsm_cmd_pkg.hdr.nd_command' member of the ioctl struct
  */
 enum papr_pdsm {
PAPR_PDSM_MIN = 0x0,
+   PAPR_PDSM_HEALTH,
PAPR_PDSM_MAX,
 };
 
+/* Various nvdimm health indicators */
+#define PAPR_PDSM_DIMM_HEALTHY   0
+#define PAPR_PDSM_DIMM_UNHEALTHY 1
+#define PAPR_PDSM_DIMM_CRITICAL  2
+#define PAPR_PDSM_DIMM_FATAL 3
+
+/*
+ * Struct exchanged between kernel & ndctl in for PAPR_PDSM_HEALTH
+ * Various flags indicate the health status of the dimm.
+ *
+ * extension_flags : Any extension fields present in the struct.
+ * dimm_unarmed: Dimm not armed. So contents wont persist.
+ * dimm_bad_shutdown   : Previous shutdown did not persist contents.
+ * dimm_bad_restore: Contents from previous shutdown werent restored.
+ * dimm_scrubbed   : Contents of the dimm have been scrubbed.
+ * dimm_locked : Contents of the dimm cant be modified until CEC reboot
+ * dimm_encrypted  : Contents of dimm are 

[PATCH v11 2/6] seq_buf: Export seq_buf_printf

2020-06-07 Thread Vaibhav Jain
'seq_buf' provides a very useful abstraction for writing to a string
buffer without needing to worry about it over-flowing. However even
though the API has been stable for couple of years now its still not
exported to kernel loadable modules limiting its usage.

Hence this patch proposes update to 'seq_buf.c' to mark
seq_buf_printf() which is part of the seq_buf API to be exported to
kernel loadable GPL modules. This symbol will be used in later parts
of this patch-set to simplify content creation for a sysfs attribute.

Cc: Piotr Maziarz 
Cc: Cezary Rojewski 
Cc: Christoph Hellwig 
Cc: Steven Rostedt 
Cc: Borislav Petkov 
Acked-by: Steven Rostedt (VMware) 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* None

v9..v10:
* None

Resend:
* Added ack from Steven Rostedt

v8..v9:
* None

v7..v8:
* Updated the patch title [ Christoph Hellwig ]
* Updated patch description to replace confusing term 'external kernel
  modules' to 'kernel lodable modules'.

Resend:
* Added ack from Steven Rostedt

v6..v7:
* New patch in the series
---
 lib/seq_buf.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/lib/seq_buf.c b/lib/seq_buf.c
index 4e865d42ab03..707453f5d58e 100644
--- a/lib/seq_buf.c
+++ b/lib/seq_buf.c
@@ -91,6 +91,7 @@ int seq_buf_printf(struct seq_buf *s, const char *fmt, ...)
 
return ret;
 }
+EXPORT_SYMBOL_GPL(seq_buf_printf);
 
 #ifdef CONFIG_BINARY_PRINTF
 /**
-- 
2.26.2



[PATCH v11 4/6] powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()

2020-06-07 Thread Vaibhav Jain
Since papr_scm_ndctl() can be called from outside papr_scm, its
exposed to the possibility of receiving NULL as value of 'cmd_rc'
argument. This patch updates papr_scm_ndctl() to protect against such
possibility by assigning it pointer to a local variable in case cmd_rc
== NULL.

Finally the patch also updates the 'default' add a debug log unknown
'cmd' values.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* Instead of returning *cmd_rd just return '0' in case nd_cmd is
  handled. In case of unknown nd-cmd return -EINVAL
  [ Ira and Dan Williams ]
* Updated patch description.

v9..v10
* New patch in the series
---
 arch/powerpc/platforms/pseries/papr_scm.c | 6 ++
 1 file changed, 6 insertions(+)

diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
b/arch/powerpc/platforms/pseries/papr_scm.c
index 0c091622b15e..692ad3d79826 100644
--- a/arch/powerpc/platforms/pseries/papr_scm.c
+++ b/arch/powerpc/platforms/pseries/papr_scm.c
@@ -355,11 +355,16 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
 {
struct nd_cmd_get_config_size *get_size_hdr;
struct papr_scm_priv *p;
+   int rc;
 
/* Only dimm-specific calls are supported atm */
if (!nvdimm)
return -EINVAL;
 
+   /* Use a local variable in case cmd_rc pointer is NULL */
+   if (!cmd_rc)
+   cmd_rc = 
+
p = nvdimm_provider_data(nvdimm);
 
switch (cmd) {
@@ -381,6 +386,7 @@ static int papr_scm_ndctl(struct nvdimm_bus_descriptor 
*nd_desc,
break;
 
default:
+   dev_dbg(>pdev->dev, "Unknown command = %d\n", cmd);
return -EINVAL;
}
 
-- 
2.26.2



[PATCH v11 5/6] ndctl/papr_scm, uapi: Add support for PAPR nvdimm specific methods

2020-06-07 Thread Vaibhav Jain
Introduce support for PAPR NVDIMM Specific Methods (PDSM) in papr_scm
module and add the command family NVDIMM_FAMILY_PAPR to the white list
of NVDIMM command sets. Also advertise support for ND_CMD_CALL for the
nvdimm command mask and implement necessary scaffolding in the module
to handle ND_CMD_CALL ioctl and PDSM requests that we receive.

The layout of the PDSM request as we expect from libnvdimm/libndctl is
described in newly introduced uapi header 'papr_pdsm.h' which
defines a new 'struct nd_pdsm_cmd_pkg' header. This header is used
to communicate the PDSM request via member
'nd_cmd_pkg.nd_command' and size of payload that need to be
sent/received for servicing the PDSM.

A new function is_cmd_valid() is implemented that reads the args to
papr_scm_ndctl() and performs sanity tests on them. A new function
papr_scm_service_pdsm() is introduced and is called from
papr_scm_ndctl() in case of a PDSM request is received via ND_CMD_CALL
command from libnvdimm.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license
  incompatibility issue with non-GPL-2.0 user-space code trying to
  include the header in its code. [ Ira ]
* Verified papr_pdsm.h with UAPI_HEADER_TEST config.
* Moved the is_cmd_valid() check in papr_scm_ndctl() before check for
  cmd_rc == NULL. This prevents cmd_rc to be updated in case the
  nd-cmd is invalid or unknown.

v9..v10:
* Simplified 'struct nd_pdsm_cmd_pkg' by removing the
  'payload_version' field.
* Removed the corrosponding documentation on versioning and backward
  compatibility from 'papr_pdsm.h'
* Reduced the size of reserved fields to 4-bytes making 'struct
  nd_pdsm_cmd_pkg' 64 + 8 bytes long.
* Updated is_cmd_valid() to enforce validation checks on pdsm
  commands. [ Dan Williams ]
* Added check for reserved fields being set to '0' in is_cmd_valid()
  [ Ira ]
* Moved changes for checking cmd_rc == NULL and logging improvements
  to a separate prelim patch [ Ira ].
* Moved  pdsm package validation checks from papr_scm_service_pdsm()
  to is_cmd_valid().
* Marked papr_scm_service_pdsm() return type as 'void' since errors
  are reported in nd_pdsm_cmd_pkg.cmd_status field.

Resend:
* Added ack from Aneesh.

v8..v9:
* Reduced the usage of term SCM replacing it with appropriate
  replacement [ Dan Williams, Aneesh ]
* Renamed 'papr_scm_pdsm.h' to 'papr_pdsm.h'
* s/PAPR_SCM_PDSM_*/PAPR_PDSM_*/g
* s/NVDIMM_FAMILY_PAPR_SCM/NVDIMM_FAMILY_PAPR/g
* Minor updates to 'papr_psdm.h' to replace usage of term 'SCM'.
* Minor update to patch description.

v7..v8:
* Removed the 'payload_offset' field from 'struct
  nd_pdsm_cmd_pkg'. Instead command payload is always assumed to start
  at 'nd_pdsm_cmd_pkg.payload'. [ Aneesh ]
* To enable introducing new fields to 'struct nd_pdsm_cmd_pkg',
  'reserved' field of 10-bytes is introduced. [ Aneesh ]
* Fixed a typo in "Backward Compatibility" section of papr_scm_pdsm.h
  [ Ira ]

Resend:
* None

v6..v7 :
* Removed the re-definitions of __packed macro from papr_scm_pdsm.h
  [Mpe].
* Removed the usage of __KERNEL__ macros in papr_scm_pdsm.h [Mpe].
* Removed macros that were unused in papr_scm.c from papr_scm_pdsm.h
  [Mpe].
* Made functions defined in papr_scm_pdsm.h as static inline. [Mpe]

v5..v6 :
* Changed the usage of the term DSM to PDSM to distinguish it from the
  ACPI term [ Dan Williams ]
* Renamed papr_scm_dsm.h to papr_scm_pdsm.h and updated various struct
  to reflect the new terminology.
* Updated the patch description and title to reflect the new terminology.
* Squashed patch to introduce new command family in 'ndctl.h' with
  this patch [ Dan Williams ]
* Updated the papr_scm_pdsm method starting index from 0x1 to 0x0
  [ Dan Williams ]
* Removed redundant license text from the papr_scm_psdm.h file.
  [ Dan Williams ]
* s/envelop/envelope/ at various places [ Dan Williams ]
* Added '__packed' attribute to command package header to gaurd
  against different compiler adding paddings between the fields.
  [ Dan Williams]
* Converted various pr_debug to dev_debug [ Dan Williams ]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Updated the patch prefix to 'ndctl/uapi' [Aneesh]

v1..v2 :
* None
---
 arch/powerpc/include/uapi/asm/papr_pdsm.h |  84 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 126 +-
 include/uapi/linux/ndctl.h|   1 +
 3 files changed, 207 insertions(+), 4 deletions(-)
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

diff --git a/arch/powerpc/include/uapi/asm/papr_pdsm.h 
b/arch/powerpc/include/uapi/asm/papr_pdsm.h
new file mode 100644
index ..df2447455cfe
--- /dev/null
+++ b/arch/powerpc/include/uapi/asm/papr_pdsm.h
@@ -0,0 +1,84 @@
+/* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */
+/*
+ * PAPR nvDimm Specific 

[PATCH v11 3/6] powerpc/papr_scm: Fetch nvdimm health information from PHYP

2020-06-07 Thread Vaibhav Jain
Implement support for fetching nvdimm health information via
H_SCM_HEALTH hcall as documented in Ref[1]. The hcall returns a pair
of 64-bit bitmap, bitwise-and of which is then stored in
'struct papr_scm_priv' and subsequently partially exposed to
user-space via newly introduced dimm specific attribute
'papr/flags'. Since the hcall is costly, the health information is
cached and only re-queried, 60s after the previous successful hcall.

The patch also adds a  documentation text describing flags reported by
the the new sysfs attribute 'papr/flags' is also introduced at
Documentation/ABI/testing/sysfs-bus-papr-pmem.

[1] commit 58b278f568f0 ("powerpc: Provide initial documentation for
PAPR hcalls")

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* None

v9..v10:
* Removed an avoidable 'goto' in __drc_pmem_query_health. [ Ira ].

Resend:
* Added ack from Aneesh.

v8..v9:
* Rename some variables and defines to reduce usage of term SCM
  replacing it with PMEM [Dan Williams, Aneesh]
* s/PAPR_SCM_DIMM/PAPR_PMEM/g
* s/papr_scm_nd_attributes/papr_nd_attributes/g
* s/papr_scm_nd_attribute_group/papr_nd_attribute_group/g
* s/papr_scm_dimm_attr_groups/papr_nd_attribute_groups/g
* Renamed file sysfs-bus-papr-scm to sysfs-bus-papr-pmem

v7..v8:
* Update type of variable 'rc' in __drc_pmem_query_health() and
  drc_pmem_query_health() to long and int respectively. [ Ira ]
* Updated the patch description to s/64 bit Big Endian Number/64-bit
  bitmap/ [ Ira, Aneesh ].

Resend:
* None

v6..v7 :
* Used the exported buf_seq_printf() function to generate content for
  'papr/flags'
* Moved the PAPR_SCM_DIMM_* bit-flags macro definitions to papr_scm.c
  and removed the papr_scm.h file [Mpe]
* Some minor consistency issued in sysfs-bus-papr-scm
  documentation. [Mpe]
* s/dimm_mutex/health_mutex/g [Mpe]
* Split drc_pmem_query_health() into two function one of which takes
  care of caching and locking. [Mpe]
* Fixed a local copy creation of dimm health information using
  READ_ONCE(). [Mpe]

v5..v6 :
* Change the flags sysfs attribute from 'papr_flags' to 'papr/flags'
  [Dan Williams]
* Include documentation for 'papr/flags' attr [Dan Williams]
* Change flag 'save_fail' to 'flush_fail' [Dan Williams]
* Caching of health bitmap to reduce expensive hcalls [Dan Williams]
* Removed usage of PPC_BIT from 'papr-scm.h' header [Mpe]
* Replaced two __be64 integers from papr_scm_priv to a single u64
  integer [Mpe]
* Updated patch description to reflect the changes made in this
  version.
* Removed avoidable usage of 'papr_scm_priv.dimm_mutex' from
  flags_show() [Dan Williams]

v4..v5 :
* None

v3..v4 :
* None

v2..v3 :
* Removed PAPR_SCM_DIMM_HEALTH_NON_CRITICAL as a condition for
 NVDIMM unarmed [Aneesh]

v1..v2 :
* New patch in the series.
---
 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 +++
 arch/powerpc/platforms/pseries/papr_scm.c | 168 +-
 2 files changed, 193 insertions(+), 2 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem

diff --git a/Documentation/ABI/testing/sysfs-bus-papr-pmem 
b/Documentation/ABI/testing/sysfs-bus-papr-pmem
new file mode 100644
index ..5b10d036a8d4
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-bus-papr-pmem
@@ -0,0 +1,27 @@
+What:  /sys/bus/nd/devices/nmemX/papr/flags
+Date:  Apr, 2020
+KernelVersion: v5.8
+Contact:   linuxppc-dev , 
linux-nvd...@lists.01.org,
+Description:
+   (RO) Report flags indicating various states of a
+   papr-pmem NVDIMM device. Each flag maps to a one or
+   more bits set in the dimm-health-bitmap retrieved in
+   response to H_SCM_HEALTH hcall. The details of the bit
+   flags returned in response to this hcall is available
+   at 'Documentation/powerpc/papr_hcalls.rst' . Below are
+   the flags reported in this sysfs file:
+
+   * "not_armed"   : Indicates that NVDIMM contents will not
+ survive a power cycle.
+   * "flush_fail"  : Indicates that NVDIMM contents
+ couldn't be flushed during last
+ shut-down event.
+   * "restore_fail": Indicates that NVDIMM contents
+ couldn't be restored during NVDIMM
+ initialization.
+   * "encrypted"   : NVDIMM contents are encrypted.
+   * "smart_notify": There is health event for the NVDIMM.
+   * "scrubbed": Indicating that contents of the
+ NVDIMM have been scrubbed.
+   * "locked"  : Indicating that NVDIMM contents cant
+ be modified until next power cycle.
diff --git a/arch/powerpc/platforms/pseries/papr_scm.c 
b/arch/powerpc/platforms/pseries/papr_scm.c

[PATCH v11 1/6] powerpc: Document details on H_SCM_HEALTH hcall

2020-06-07 Thread Vaibhav Jain
Add documentation to 'papr_hcalls.rst' describing the bitmap flags
that are returned from H_SCM_HEALTH hcall as per the PAPR-SCM
specification.

Cc: "Aneesh Kumar K . V" 
Cc: Dan Williams 
Cc: Michael Ellerman 
Cc: Ira Weiny 
Acked-by: Ira Weiny 
Signed-off-by: Vaibhav Jain 
---
Changelog:

v10..v11:
* None

v9..v10:
* Added ack from Ira.

Resend:
* None

v8..v9:
* s/SCM/PMEM device. [ Dan Williams, Aneesh ]

v7..v8:
* Added a clarification on bit-ordering of Health Bitmap

Resend:
* None

v6..v7:
* None

v5..v6:
* New patch in the series
---
 Documentation/powerpc/papr_hcalls.rst | 46 ---
 1 file changed, 42 insertions(+), 4 deletions(-)

diff --git a/Documentation/powerpc/papr_hcalls.rst 
b/Documentation/powerpc/papr_hcalls.rst
index 3493631a60f8..48fcf1255a33 100644
--- a/Documentation/powerpc/papr_hcalls.rst
+++ b/Documentation/powerpc/papr_hcalls.rst
@@ -220,13 +220,51 @@ from the LPAR memory.
 **H_SCM_HEALTH**
 
 | Input: drcIndex
-| Out: *health-bitmap, health-bit-valid-bitmap*
+| Out: *health-bitmap (r4), health-bit-valid-bitmap (r5)*
 | Return Value: *H_Success, H_Parameter, H_Hardware*
 
 Given a DRC Index return the info on predictive failure and overall health of
-the NVDIMM. The asserted bits in the health-bitmap indicate a single predictive
-failure and health-bit-valid-bitmap indicate which bits in health-bitmap are
-valid.
+the PMEM device. The asserted bits in the health-bitmap indicate one or more 
states
+(described in table below) of the PMEM device and health-bit-valid-bitmap 
indicate
+which bits in health-bitmap are valid. The bits are reported in
+reverse bit ordering for example a value of 0xC400
+indicates bits 0, 1, and 5 are valid.
+
+Health Bitmap Flags:
+
++--+---+
+|  Bit |   Definition  
|
++==+===+
+|  00  | PMEM device is unable to persist memory contents. 
|
+|  | If the system is powered down, nothing will be saved. 
|
++--+---+
+|  01  | PMEM device failed to persist memory contents. Either contents were   
|
+|  | not saved successfully on power down or were not restored properly on 
|
+|  | power up. 
|
++--+---+
+|  02  | PMEM device contents are persisted from previous IPL. The data from   
|
+|  | the last boot were successfully restored. 
|
++--+---+
+|  03  | PMEM device contents are not persisted from previous IPL. There was 
no|
+|  | data to restore from the last boot.   
|
++--+---+
+|  04  | PMEM device memory life remaining is critically low   
|
++--+---+
+|  05  | PMEM device will be garded off next IPL due to failure
|
++--+---+
+|  06  | PMEM device contents cannot persist due to current platform health
|
+|  | status. A hardware failure may prevent data from being saved or   
|
+|  | restored. 
|
++--+---+
+|  07  | PMEM device is unable to persist memory contents in certain 
conditions|
++--+---+
+|  08  | PMEM device is encrypted  
|
++--+---+
+|  09  | PMEM device has successfully completed a requested erase or secure
|
+|  | erase procedure.  
|
++--+---+
+|10:63 | Reserved / Unused 
|
++--+---+
 
 **H_SCM_PERFORMANCE_STATS**
 
-- 
2.26.2



[PATCH v11 0/6] powerpc/papr_scm: Add support for reporting nvdimm health

2020-06-07 Thread Vaibhav Jain
Changes since v10 [1]:
* Changed the definition of 'struct nd_papr_pdsm_health' to a maximal
  struct 184 bytes which can be extended in future with newly
  introduced 'extension_flags'
* Fixed a suspicious conversion from u64 to u8 in papr_pdsm_health
  that was preventing correct initialization of 'struct
  nd_papr_pdsm_health'.
* Moved in-lines 'nd_pdsm_cmd_pkg()' and 'pdsm_cmd_to_payload()' from
  'papr_pdsm.h' header to 'papr_scm.c'. The avoids a potential license
  incompatibility issue with non-GPL-2.0 user-space code trying to
  include the header in its code.
* Fixed the control flow in papr_scm_ndctl() to return '0' in case
  nd_cmd is handled. In case of unknown nd-cmd return -EINVAL.

[1] 
https://lore.kernel.org/linux-nvdimm/20200604234136.253703-1-vaib...@linux.ibm.com/
---

The PAPR standard[2][4] provides mechanisms to query the health and
performance stats of an NVDIMM via various hcalls as described in
Ref[3].  Until now these stats were never available nor exposed to the
user-space tools like 'ndctl'. This is partly due to PAPR platform not
having support for ACPI and NFIT. Hence 'ndctl' is unable to query and
report the dimm health status and a user had no way to determine the
current health status of a NDVIMM.

To overcome this limitation, this patch-set updates papr_scm kernel
module to query and fetch NVDIMM health stats using hcalls described
in Ref[3].  This health and performance stats are then exposed to
userspace via sysfs and PAPR-NVDIMM-Specific-Methods(PDSM) issued by
libndctl.

These changes coupled with proposed ndtcl changes located at Ref[5]
should provide a way for the user to retrieve NVDIMM health status
using ndtcl.

Below is a sample output using proposed kernel + ndctl for PAPR NVDIMM
in a emulation environment:

 # ndctl list -DH
[
  {
"dev":"nmem0",
"health":{
  "health_state":"fatal",
  "shutdown_state":"dirty"
}
  }
]

Dimm health report output on a pseries guest lpar with vPMEM or HMS
based NVDIMMs that are in perfectly healthy conditions:

 # ndctl list -d nmem0 -H
[
  {
"dev":"nmem0",
"health":{
  "health_state":"ok",
  "shutdown_state":"clean"
}
  }
]

PAPR NVDIMM-Specific-Methods(PDSM)
==

PDSM requests are issued by vendor specific code in libndctl to
execute certain operations or fetch information from NVDIMMS. PDSMs
requests can be sent to papr_scm module via libndctl(userspace) and
libnvdimm (kernel) using the ND_CMD_CALL ioctl command which can be
handled in the dimm control function papr_scm_ndctl(). Current
patchset proposes a single PDSM to retrieve NVDIMM health, defined in
the newly introduced uapi header named 'papr_pdsm.h'. Support for
more PDSMs will be added in future.

Structure of the patch-set
==

The patch-set starts with a doc patch documenting details of hcall
H_SCM_HEALTH. Second patch exports kernel symbol seq_buf_printf()
thats used in subsequent patches to generate sysfs attribute content.

Third patch implements support for fetching NVDIMM health information
from PHYP and partially exposing it to user-space via a NVDIMM sysfs
flag.

Fourth patch updates papr_scm_ndctl() to handle a possible error case
and also improve debug logging.

Fifth patch deals with implementing support for servicing PDSM
commands in papr_scm module.

Finally the last patch implements support for servicing PDSM
'PAPR_PDSM_HEALTH' that returns the NVDIMM health information to
libndctl.

References:
[2] "Power Architecture Platform Reference"
  https://en.wikipedia.org/wiki/Power_Architecture_Platform_Reference
[3] commit 58b278f568f0
 ("powerpc: Provide initial documentation for PAPR hcalls")
[4] "Linux on Power Architecture Platform Reference"
 https://members.openpowerfoundation.org/document/dl/469
[5] https://github.com/vaibhav92/ndctl/tree/papr_scm_health_v11

---

Vaibhav Jain (6):
  powerpc: Document details on H_SCM_HEALTH hcall
  seq_buf: Export seq_buf_printf
  powerpc/papr_scm: Fetch nvdimm health information from PHYP
  powerpc/papr_scm: Improve error logging and handling papr_scm_ndctl()
  ndctl/papr_scm,uapi: Add support for PAPR nvdimm specific methods
  powerpc/papr_scm: Implement support for PAPR_PDSM_HEALTH

 Documentation/ABI/testing/sysfs-bus-papr-pmem |  27 ++
 Documentation/powerpc/papr_hcalls.rst |  46 ++-
 arch/powerpc/include/uapi/asm/papr_pdsm.h | 127 ++
 arch/powerpc/platforms/pseries/papr_scm.c | 373 +-
 include/uapi/linux/ndctl.h|   1 +
 lib/seq_buf.c |   1 +
 6 files changed, 564 insertions(+), 11 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-bus-papr-pmem
 create mode 100644 arch/powerpc/include/uapi/asm/papr_pdsm.h

-- 
2.26.2



[PATCH 1/2] powerpc/64s: remove PROT_SAO support

2020-06-07 Thread Nicholas Piggin
ISA v3.1 does not support the SAO storage control attribute required to
implement PROT_SAO. PROT_SAO was used by specialised system software
(Lx86) that has been discontinued for about 7 years, and is not thought
to be used elsewhere, so removal should not cause problems.

We rather remove it than keep support for older processors, because
live migrating guest partitions to newer processors may not be possible
if SAO is in use.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/include/asm/book3s/64/pgtable.h  |  8 ++--
 arch/powerpc/include/asm/cputable.h   |  9 ++--
 arch/powerpc/include/asm/kvm_book3s_64.h  |  3 +-
 arch/powerpc/include/asm/mman.h   | 24 +++
 arch/powerpc/include/asm/nohash/64/pgtable.h  |  2 -
 arch/powerpc/kernel/dt_cpu_ftrs.c |  2 +-
 arch/powerpc/mm/book3s64/hash_utils.c |  2 -
 include/linux/mm.h|  2 -
 include/trace/events/mmflags.h|  2 -
 mm/ksm.c  |  4 --
 tools/testing/selftests/powerpc/mm/.gitignore |  1 -
 tools/testing/selftests/powerpc/mm/Makefile   |  4 +-
 tools/testing/selftests/powerpc/mm/prot_sao.c | 42 ---
 13 files changed, 18 insertions(+), 87 deletions(-)
 delete mode 100644 tools/testing/selftests/powerpc/mm/prot_sao.c

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h 
b/arch/powerpc/include/asm/book3s/64/pgtable.h
index f17442c3a092..d9e92586f8dc 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -20,9 +20,13 @@
 #define _PAGE_RW   (_PAGE_READ | _PAGE_WRITE)
 #define _PAGE_RWX  (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC)
 #define _PAGE_PRIVILEGED   0x8 /* kernel access only */
-#define _PAGE_SAO  0x00010 /* Strong access order */
+
+#define _PAGE_CACHE_CTL0x00030 /* Bits for the folowing cache 
modes */
+   /*  No bits set is normal cacheable memory */
+   /*  0x00010 unused, is SAO bit on radix POWER9 */
 #define _PAGE_NON_IDEMPOTENT   0x00020 /* non idempotent memory */
 #define _PAGE_TOLERANT 0x00030 /* tolerant memory, cache inhibited */
+
 #define _PAGE_DIRTY0x00080 /* C: page changed */
 #define _PAGE_ACCESSED 0x00100 /* R: page referenced */
 /*
@@ -825,8 +829,6 @@ static inline void __set_pte_at(struct mm_struct *mm, 
unsigned long addr,
return hash__set_pte_at(mm, addr, ptep, pte, percpu);
 }
 
-#define _PAGE_CACHE_CTL(_PAGE_SAO | _PAGE_NON_IDEMPOTENT | 
_PAGE_TOLERANT)
-
 #define pgprot_noncached pgprot_noncached
 static inline pgprot_t pgprot_noncached(pgprot_t prot)
 {
diff --git a/arch/powerpc/include/asm/cputable.h 
b/arch/powerpc/include/asm/cputable.h
index bac2252c839e..c7e923ba 100644
--- a/arch/powerpc/include/asm/cputable.h
+++ b/arch/powerpc/include/asm/cputable.h
@@ -191,7 +191,6 @@ static inline void cpu_feature_keys_init(void) { }
 #define CPU_FTR_SPURR  LONG_ASM_CONST(0x0100)
 #define CPU_FTR_DSCR   LONG_ASM_CONST(0x0200)
 #define CPU_FTR_VSXLONG_ASM_CONST(0x0400)
-#define CPU_FTR_SAOLONG_ASM_CONST(0x0800)
 #define CPU_FTR_CP_USE_DCBTZ   LONG_ASM_CONST(0x1000)
 #define CPU_FTR_UNALIGNED_LD_STD   LONG_ASM_CONST(0x2000)
 #define CPU_FTR_ASYM_SMT   LONG_ASM_CONST(0x4000)
@@ -435,7 +434,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | \
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-   CPU_FTR_DSCR | CPU_FTR_SAO  | CPU_FTR_ASYM_SMT | \
+   CPU_FTR_DSCR | CPU_FTR_ASYM_SMT | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | \
CPU_FTR_VMX_COPY | CPU_FTR_HAS_PPR | CPU_FTR_DABRX | CPU_FTR_PKEY)
@@ -444,7 +443,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | \
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-   CPU_FTR_DSCR | CPU_FTR_SAO  | \
+   CPU_FTR_DSCR | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | CPU_FTR_VMX_COPY | \
CPU_FTR_DBELL | CPU_FTR_HAS_PPR | CPU_FTR_DAWR | \
@@ -455,7 +454,7 @@ static inline void cpu_feature_keys_init(void) { }
CPU_FTR_MMCRA | CPU_FTR_SMT | \
CPU_FTR_COHERENT_ICACHE | \
CPU_FTR_PURR | CPU_FTR_SPURR | CPU_FTR_REAL_LE | \
-   CPU_FTR_DSCR | CPU_FTR_SAO  | \
+   CPU_FTR_DSCR | \
CPU_FTR_STCX_CHECKS_ADDRESS | CPU_FTR_POPCNTB | CPU_FTR_POPCNTD | \
CPU_FTR_CFAR | CPU_FTR_HVMODE | 

[PATCH 2/2] powerpc/64s/hash: disable subpage_prot syscall by default

2020-06-07 Thread Nicholas Piggin
The subpage_prot syscall was added for specialised system software
(Lx86) that has been discontinued for about 7 years, and is not thought
to be used elsewhere, so disable it by default.

Signed-off-by: Nicholas Piggin 
---
 arch/powerpc/Kconfig   | 1 +
 arch/powerpc/configs/powernv_defconfig | 1 -
 arch/powerpc/configs/pseries_defconfig | 1 -
 3 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 9fa23eb320ff..1701646f845d 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -834,6 +834,7 @@ config FORCE_MAX_ZONEORDER
 
 config PPC_SUBPAGE_PROT
bool "Support setting protections for 4k subpages"
+   default n
depends on PPC_BOOK3S_64 && PPC_64K_PAGES
help
  This option adds support for a system call to allow user programs
diff --git a/arch/powerpc/configs/powernv_defconfig 
b/arch/powerpc/configs/powernv_defconfig
index 2de9aadf0f50..afc0dd73a1e6 100644
--- a/arch/powerpc/configs/powernv_defconfig
+++ b/arch/powerpc/configs/powernv_defconfig
@@ -64,7 +64,6 @@ CONFIG_HWPOISON_INJECT=m
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_DEFERRED_STRUCT_PAGE_INIT=y
 CONFIG_PPC_64K_PAGES=y
-CONFIG_PPC_SUBPAGE_PROT=y
 CONFIG_SCHED_SMT=y
 CONFIG_PM=y
 CONFIG_HOTPLUG_PCI=y
diff --git a/arch/powerpc/configs/pseries_defconfig 
b/arch/powerpc/configs/pseries_defconfig
index dfa4a726333b..894e8d85fb48 100644
--- a/arch/powerpc/configs/pseries_defconfig
+++ b/arch/powerpc/configs/pseries_defconfig
@@ -57,7 +57,6 @@ CONFIG_MEMORY_HOTREMOVE=y
 CONFIG_KSM=y
 CONFIG_TRANSPARENT_HUGEPAGE=y
 CONFIG_PPC_64K_PAGES=y
-CONFIG_PPC_SUBPAGE_PROT=y
 CONFIG_SCHED_SMT=y
 CONFIG_HOTPLUG_PCI=y
 CONFIG_HOTPLUG_PCI_RPA=m
-- 
2.23.0



[PATCH v5 4/4] riscv: Check relocations at compile time

2020-06-07 Thread Alexandre Ghiti
Relocating kernel at runtime is done very early in the boot process, so
it is not convenient to check for relocations there and react in case a
relocation was not expected.

There exists a script in scripts/ that extracts the relocations from
vmlinux that is then used at postlink to check the relocations.

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Anup Patel 
---
 arch/riscv/Makefile.postlink | 36 
 arch/riscv/tools/relocs_check.sh | 26 +++
 2 files changed, 62 insertions(+)
 create mode 100644 arch/riscv/Makefile.postlink
 create mode 100755 arch/riscv/tools/relocs_check.sh

diff --git a/arch/riscv/Makefile.postlink b/arch/riscv/Makefile.postlink
new file mode 100644
index ..bf2b2bca1845
--- /dev/null
+++ b/arch/riscv/Makefile.postlink
@@ -0,0 +1,36 @@
+# SPDX-License-Identifier: GPL-2.0
+# ===
+# Post-link riscv pass
+# ===
+#
+# Check that vmlinux relocations look sane
+
+PHONY := __archpost
+__archpost:
+
+-include include/config/auto.conf
+include scripts/Kbuild.include
+
+quiet_cmd_relocs_check = CHKREL  $@
+cmd_relocs_check = \
+   $(CONFIG_SHELL) $(srctree)/arch/riscv/tools/relocs_check.sh 
"$(OBJDUMP)" "$(NM)" "$@"
+
+# `@true` prevents complaint when there is nothing to be done
+
+vmlinux: FORCE
+   @true
+ifdef CONFIG_RELOCATABLE
+   $(call if_changed,relocs_check)
+endif
+
+%.ko: FORCE
+   @true
+
+clean:
+   @true
+
+PHONY += FORCE clean
+
+FORCE:
+
+.PHONY: $(PHONY)
diff --git a/arch/riscv/tools/relocs_check.sh b/arch/riscv/tools/relocs_check.sh
new file mode 100755
index ..baeb2e7b2290
--- /dev/null
+++ b/arch/riscv/tools/relocs_check.sh
@@ -0,0 +1,26 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0-or-later
+# Based on powerpc relocs_check.sh
+
+# This script checks the relocations of a vmlinux for "suspicious"
+# relocations.
+
+if [ $# -lt 3 ]; then
+echo "$0 [path to objdump] [path to nm] [path to vmlinux]" 1>&2
+exit 1
+fi
+
+bad_relocs=$(
+${srctree}/scripts/relocs_check.sh "$@" |
+   # These relocations are okay
+   #   R_RISCV_RELATIVE
+   grep -F -w -v 'R_RISCV_RELATIVE'
+)
+
+if [ -z "$bad_relocs" ]; then
+   exit 0
+fi
+
+num_bad=$(echo "$bad_relocs" | wc -l)
+echo "WARNING: $num_bad bad relocations"
+echo "$bad_relocs"
-- 
2.20.1



[PATCH v5 3/4] powerpc: Move script to check relocations at compile time in scripts/

2020-06-07 Thread Alexandre Ghiti
Relocating kernel at runtime is done very early in the boot process, so
it is not convenient to check for relocations there and react in case a
relocation was not expected.

Powerpc architecture has a script that allows to check at compile time
for such unexpected relocations: extract the common logic to scripts/
so that other architectures can take advantage of it.

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Anup Patel 
---
 arch/powerpc/tools/relocs_check.sh | 18 ++
 scripts/relocs_check.sh| 20 
 2 files changed, 22 insertions(+), 16 deletions(-)
 create mode 100755 scripts/relocs_check.sh

diff --git a/arch/powerpc/tools/relocs_check.sh 
b/arch/powerpc/tools/relocs_check.sh
index 014e00e74d2b..e367895941ae 100755
--- a/arch/powerpc/tools/relocs_check.sh
+++ b/arch/powerpc/tools/relocs_check.sh
@@ -15,21 +15,8 @@ if [ $# -lt 3 ]; then
exit 1
 fi
 
-# Have Kbuild supply the path to objdump and nm so we handle cross compilation.
-objdump="$1"
-nm="$2"
-vmlinux="$3"
-
-# Remove from the bad relocations those that match an undefined weak symbol
-# which will result in an absolute relocation to 0.
-# Weak unresolved symbols are of that form in nm output:
-# "  w _binary__btf_vmlinux_bin_end"
-undef_weak_symbols=$($nm "$vmlinux" | awk '$1 ~ /w/ { print $2 }')
-
 bad_relocs=$(
-$objdump -R "$vmlinux" |
-   # Only look at relocation lines.
-   grep -E '\

[PATCH v5 2/4] riscv: Introduce CONFIG_RELOCATABLE

2020-06-07 Thread Alexandre Ghiti
This config allows to compile the kernel as PIE and to relocate it at
any virtual address at runtime: this paves the way to KASLR and to 4-level
page table folding at runtime. Runtime relocation is possible since
relocation metadata are embedded into the kernel.

Note that relocating at runtime introduces an overhead even if the
kernel is loaded at the same address it was linked at and that the compiler
options are those used in arm64 which uses the same RELA relocation
format.

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Zong Li 
Reviewed-by: Anup Patel 
---
 arch/riscv/Kconfig  | 12 +++
 arch/riscv/Makefile |  5 ++-
 arch/riscv/kernel/vmlinux.lds.S |  6 ++--
 arch/riscv/mm/Makefile  |  4 +++
 arch/riscv/mm/init.c| 63 +
 5 files changed, 87 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index a31e1a41913a..93127d5913fe 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -170,6 +170,18 @@ config PGTABLE_LEVELS
default 3 if 64BIT
default 2
 
+config RELOCATABLE
+   bool
+   depends on MMU
+   help
+  This builds a kernel as a Position Independent Executable (PIE),
+  which retains all relocation metadata required to relocate the
+  kernel binary at runtime to a different virtual address than the
+  address it was linked at.
+  Since RISCV uses the RELA relocation format, this requires a
+  relocation pass at runtime even if the kernel is loaded at the
+  same address it was linked at.
+
 source "arch/riscv/Kconfig.socs"
 
 menu "Platform type"
diff --git a/arch/riscv/Makefile b/arch/riscv/Makefile
index fb6e37db836d..1406416ea743 100644
--- a/arch/riscv/Makefile
+++ b/arch/riscv/Makefile
@@ -9,7 +9,10 @@
 #
 
 OBJCOPYFLAGS:= -O binary
-LDFLAGS_vmlinux :=
+ifeq ($(CONFIG_RELOCATABLE),y)
+LDFLAGS_vmlinux := -shared -Bsymbolic -z notext -z norelro
+KBUILD_CFLAGS += -fPIE
+endif
 ifeq ($(CONFIG_DYNAMIC_FTRACE),y)
LDFLAGS_vmlinux := --no-relax
 endif
diff --git a/arch/riscv/kernel/vmlinux.lds.S b/arch/riscv/kernel/vmlinux.lds.S
index a9abde62909f..e8ffba8c2044 100644
--- a/arch/riscv/kernel/vmlinux.lds.S
+++ b/arch/riscv/kernel/vmlinux.lds.S
@@ -85,8 +85,10 @@ SECTIONS
 
BSS_SECTION(PAGE_SIZE, PAGE_SIZE, 0)
 
-   .rel.dyn : {
-   *(.rel.dyn*)
+   .rela.dyn : ALIGN(8) {
+   __rela_dyn_start = .;
+   *(.rela .rela*)
+   __rela_dyn_end = .;
}
 
_end = .;
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index 363ef01c30b1..dc5cdaa80bc1 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -1,6 +1,10 @@
 # SPDX-License-Identifier: GPL-2.0-only
 
 CFLAGS_init.o := -mcmodel=medany
+ifdef CONFIG_RELOCATABLE
+CFLAGS_init.o += -fno-pie
+endif
+
 ifdef CONFIG_FTRACE
 CFLAGS_REMOVE_init.o = -pg
 endif
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 71da78914645..29b33289a12f 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -13,6 +13,9 @@
 #include 
 #include 
 #include 
+#ifdef CONFIG_RELOCATABLE
+#include 
+#endif
 
 #include 
 #include 
@@ -379,6 +382,53 @@ static uintptr_t __init best_map_size(phys_addr_t base, 
phys_addr_t size)
 #error "setup_vm() is called from head.S before relocate so it should not use 
absolute addressing."
 #endif
 
+#ifdef CONFIG_RELOCATABLE
+extern unsigned long __rela_dyn_start, __rela_dyn_end;
+
+#ifdef CONFIG_64BIT
+#define Elf_Rela Elf64_Rela
+#define Elf_Addr Elf64_Addr
+#else
+#define Elf_Rela Elf32_Rela
+#define Elf_Addr Elf32_Addr
+#endif
+
+void __init relocate_kernel(uintptr_t load_pa)
+{
+   Elf_Rela *rela = (Elf_Rela *)&__rela_dyn_start;
+   /*
+* This holds the offset between the linked virtual address and the
+* relocated virtual address.
+*/
+   uintptr_t reloc_offset = kernel_virt_addr - KERNEL_LINK_ADDR;
+   /*
+* This holds the offset between kernel linked virtual address and
+* physical address.
+*/
+   uintptr_t va_kernel_link_pa_offset = KERNEL_LINK_ADDR - load_pa;
+
+   for ( ; rela < (Elf_Rela *)&__rela_dyn_end; rela++) {
+   Elf_Addr addr = (rela->r_offset - va_kernel_link_pa_offset);
+   Elf_Addr relocated_addr = rela->r_addend;
+
+   if (rela->r_info != R_RISCV_RELATIVE)
+   continue;
+
+   /*
+* Make sure to not relocate vdso symbols like rt_sigreturn
+* which are linked from the address 0 in vmlinux since
+* vdso symbol addresses are actually used as an offset from
+* mm->context.vdso in VDSO_OFFSET macro.
+*/
+   if (relocated_addr >= KERNEL_LINK_ADDR)
+   relocated_addr += reloc_offset;
+
+   *(Elf_Addr *)addr = relocated_addr;
+   }
+}
+
+#endif
+
 

[PATCH v5 1/4] riscv: Move kernel mapping to vmalloc zone

2020-06-07 Thread Alexandre Ghiti
This is a preparatory patch for relocatable kernel.

The kernel used to be linked at PAGE_OFFSET address and used to be loaded
physically at the beginning of the main memory. Therefore, we could use
the linear mapping for the kernel mapping.

But the relocated kernel base address will be different from PAGE_OFFSET
and since in the linear mapping, two different virtual addresses cannot
point to the same physical address, the kernel mapping needs to lie outside
the linear mapping.

In addition, because modules and BPF must be close to the kernel (inside
+-2GB window), the kernel is placed at the end of the vmalloc zone minus
2GB, which leaves room for modules and BPF. The kernel could not be
placed at the beginning of the vmalloc zone since other vmalloc
allocations from the kernel could get all the +-2GB window around the
kernel which would prevent new modules and BPF programs to be loaded.

Signed-off-by: Alexandre Ghiti 
Reviewed-by: Zong Li 
---
 arch/riscv/boot/loader.lds.S |  3 +-
 arch/riscv/include/asm/page.h| 10 +-
 arch/riscv/include/asm/pgtable.h | 38 ++---
 arch/riscv/kernel/head.S |  3 +-
 arch/riscv/kernel/module.c   |  4 +--
 arch/riscv/kernel/vmlinux.lds.S  |  3 +-
 arch/riscv/mm/init.c | 58 +---
 arch/riscv/mm/physaddr.c |  2 +-
 8 files changed, 88 insertions(+), 33 deletions(-)

diff --git a/arch/riscv/boot/loader.lds.S b/arch/riscv/boot/loader.lds.S
index 47a5003c2e28..62d94696a19c 100644
--- a/arch/riscv/boot/loader.lds.S
+++ b/arch/riscv/boot/loader.lds.S
@@ -1,13 +1,14 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 
 #include 
+#include 
 
 OUTPUT_ARCH(riscv)
 ENTRY(_start)
 
 SECTIONS
 {
-   . = PAGE_OFFSET;
+   . = KERNEL_LINK_ADDR;
 
.payload : {
*(.payload)
diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
index 2d50f76efe48..48bb09b6a9b7 100644
--- a/arch/riscv/include/asm/page.h
+++ b/arch/riscv/include/asm/page.h
@@ -90,18 +90,26 @@ typedef struct page *pgtable_t;
 
 #ifdef CONFIG_MMU
 extern unsigned long va_pa_offset;
+extern unsigned long va_kernel_pa_offset;
 extern unsigned long pfn_base;
 #define ARCH_PFN_OFFSET(pfn_base)
 #else
 #define va_pa_offset   0
+#define va_kernel_pa_offset0
 #define ARCH_PFN_OFFSET(PAGE_OFFSET >> PAGE_SHIFT)
 #endif /* CONFIG_MMU */
 
 extern unsigned long max_low_pfn;
 extern unsigned long min_low_pfn;
+extern unsigned long kernel_virt_addr;
 
 #define __pa_to_va_nodebug(x)  ((void *)((unsigned long) (x) + va_pa_offset))
-#define __va_to_pa_nodebug(x)  ((unsigned long)(x) - va_pa_offset)
+#define linear_mapping_va_to_pa(x) ((unsigned long)(x) - va_pa_offset)
+#define kernel_mapping_va_to_pa(x) \
+   ((unsigned long)(x) - va_kernel_pa_offset)
+#define __va_to_pa_nodebug(x)  \
+   (((x) >= PAGE_OFFSET) ? \
+   linear_mapping_va_to_pa(x) : kernel_mapping_va_to_pa(x))
 
 #ifdef CONFIG_DEBUG_VIRTUAL
 extern phys_addr_t __virt_to_phys(unsigned long x);
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 35b60035b6b0..94ef3b49dfb6 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -11,23 +11,29 @@
 
 #include 
 
-#ifndef __ASSEMBLY__
-
-/* Page Upper Directory not used in RISC-V */
-#include 
-#include 
-#include 
-#include 
-
-#ifdef CONFIG_MMU
+#ifndef CONFIG_MMU
+#define KERNEL_VIRT_ADDR   PAGE_OFFSET
+#define KERNEL_LINK_ADDR   PAGE_OFFSET
+#else
+/*
+ * Leave 2GB for modules and BPF that must lie within a 2GB range around
+ * the kernel.
+ */
+#define KERNEL_VIRT_ADDR   (VMALLOC_END - SZ_2G + 1)
+#define KERNEL_LINK_ADDR   KERNEL_VIRT_ADDR
 
 #define VMALLOC_SIZE (KERN_VIRT_SIZE >> 1)
 #define VMALLOC_END  (PAGE_OFFSET - 1)
 #define VMALLOC_START(PAGE_OFFSET - VMALLOC_SIZE)
 
 #define BPF_JIT_REGION_SIZE(SZ_128M)
-#define BPF_JIT_REGION_START   (PAGE_OFFSET - BPF_JIT_REGION_SIZE)
-#define BPF_JIT_REGION_END (VMALLOC_END)
+#define BPF_JIT_REGION_START   PFN_ALIGN((unsigned long)&_end)
+#define BPF_JIT_REGION_END (BPF_JIT_REGION_START + BPF_JIT_REGION_SIZE)
+
+#ifdef CONFIG_64BIT
+#define VMALLOC_MODULE_START   BPF_JIT_REGION_END
+#define VMALLOC_MODULE_END (((unsigned long)&_start & PAGE_MASK) + SZ_2G)
+#endif
 
 /*
  * Roughly size the vmemmap space to be large enough to fit enough
@@ -57,9 +63,16 @@
 #define FIXADDR_SIZE PGDIR_SIZE
 #endif
 #define FIXADDR_START(FIXADDR_TOP - FIXADDR_SIZE)
-
 #endif
 
+#ifndef __ASSEMBLY__
+
+/* Page Upper Directory not used in RISC-V */
+#include 
+#include 
+#include 
+#include 
+
 #ifdef CONFIG_64BIT
 #include 
 #else
@@ -483,6 +496,7 @@ static inline void __kernel_map_pages(struct page *page, 
int numpages, int enabl
 
 #define kern_addr_valid(addr)   (1) /* FIXME */
 
+extern char _start[];
 extern void *dtb_early_va;
 void setup_bootmem(void);
 void 

[PATCH v5 0/4] vmalloc kernel mapping and relocatable kernel

2020-06-07 Thread Alexandre Ghiti
This patchset originally implemented relocatable kernel support but now 
 
also moves the kernel mapping into the vmalloc zone.
 

 
The first patch explains why we need to move the kernel into vmalloc
 
zone (instead of memcpying it around). That patch should ease KASLR 
 
implementation a lot.   
 

 
The second patch allows to build relocatable kernels but is not selected
 
by default. 
 

 
The third and fourth patches take advantage of an already existing powerpc  
 
script that checks relocations at compile-time, and uses it for riscv.  
 

 
Changes in v5:
  * Add "static __init" to create_kernel_page_table function as reported by
Kbuild test robot
  * Add reviewed-by from Zong
  * Rebase onto v5.7

Changes in v4:  
 
  * Fix BPF region that overlapped with kernel's as suggested by Zong   
 
  * Fix end of module region that could be larger than 2GB as suggested by Zong 
 
  * Fix the size of the vm area reserved for the kernel as we could lose
 
PMD_SIZE if the size was already aligned on PMD_SIZE
 
  * Split compile time relocations check patch into 2 patches as suggested by 
Anup
  * Applied Reviewed-by from Zong and Anup  
 

 
Changes in v3:  
 
  * Move kernel mapping to vmalloc  
 

 
Changes in v2:  
 
  * Make RELOCATABLE depend on MMU as suggested by Anup 
 
  * Rename kernel_load_addr into kernel_virt_addr as suggested by Anup  
 
  * Use __pa_symbol instead of __pa, as suggested by Zong   
 
  * Rebased on top of v5.6-rc3  
 
  * Tested with sv48 patchset   
 
  * Add Reviewed/Tested-by from Zong and Anup 

Alexandre Ghiti (4):
  riscv: Move kernel mapping to vmalloc zone
  riscv: Introduce CONFIG_RELOCATABLE
  powerpc: Move script to check relocations at compile time in scripts/
  riscv: Check relocations at compile time

 arch/powerpc/tools/relocs_check.sh |  18 +
 arch/riscv/Kconfig |  12 +++
 arch/riscv/Makefile|   5 +-
 arch/riscv/Makefile.postlink   |  36 +
 arch/riscv/boot/loader.lds.S   |   3 +-
 arch/riscv/include/asm/page.h  |  10 ++-
 arch/riscv/include/asm/pgtable.h   |  38 ++---
 arch/riscv/kernel/head.S   |   3 +-
 arch/riscv/kernel/module.c |   4 +-
 arch/riscv/kernel/vmlinux.lds.S|   9 ++-
 arch/riscv/mm/Makefile |   4 +
 arch/riscv/mm/init.c   | 121 +
 arch/riscv/mm/physaddr.c   |   2 +-
 arch/riscv/tools/relocs_check.sh   |  26 +++
 scripts/relocs_check.sh|  20 +
 15 files changed, 259 insertions(+), 52 deletions(-)
 create mode 100644 arch/riscv/Makefile.postlink
 create mode 100755 arch/riscv/tools/relocs_check.sh
 create mode 100755 scripts/relocs_check.sh

-- 
2.20.1