Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Dave Young
On 04/03/19 at 01:35pm, Chao Fan wrote:
> On Tue, Apr 02, 2019 at 08:03:19PM +0800, Dave Young wrote:
> >On 04/01/19 at 12:08am, Junichi Nomura wrote:
> >> Commit 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in
> >> boot_params") broke kexec boot on EFI systems.  efi_get_rsdp_addr()
> >> in the early parsing code tries to search RSDP from EFI table but
> >> that will crash because the table address is virtual when the kernel
> >> was booted by kexec.
> >> 
> >> In the case of kexec, physical address of EFI tables is provided
> >> via efi_setup_data in boot_params, which is set up by kexec(1).
> >> 
> >> Factor out the table parsing code and use different pointers depending
> >> on whether the kernel is booted by kexec or not.
> >> 
> >> Fixes: 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in 
> >> boot_params")
> >> Signed-off-by: Jun'ichi Nomura 
> >> Acked-by: Baoquan He 
> >> Cc: Chao Fan 
> >> Cc: Borislav Petkov 
> >> Cc: Dave Young 
> [...]
> >
> >I failed to kexec reboot on my laptop, kernel panics too quick,  I'm not
> >sure this is caused by your patch though.
> >
> >Actually there are something probably i915 changes break kexec,  the
> >above test is with "nomodeset" which should work.
> >
> >Let me do more testing and update here tomorrow.
> >
> 
> Hi Dave,
> 
> Yesterday I tested the normal kexec case; today I tested the kdump
> case. Since kdump adds "nokaslr" to the cmdline, I dropped it from
> KDUMP_COMMANDLINE_APPEND.
> It booted OK, so the patch works in both normal kexec and kdump.
> 

Actually, I got some different kexec test results.

Yesterday, with my installed kernel (based on git head from several weeks
ago), the kexec kernel panicked.

Then I tried the latest mainline after a git pull and everything worked
(with or without the patch; I could not reproduce the bug this patch
fixes).

Today I tested again: kexec reboot hangs (with or without your patch), but
kdump always works (with or without the patch).

This is weird to me. I probably need to find out first why I cannot
reproduce the bug this patch is addressing.

earlyprintk no longer seems to work for me, so it is not easy to debug on
the laptop now.

But the patch itself is clear and I think it should be good.  There might
be other things broken.

Thanks
Dave

___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Chao Fan
On Tue, Apr 02, 2019 at 08:03:19PM +0800, Dave Young wrote:
>On 04/01/19 at 12:08am, Junichi Nomura wrote:
>> Commit 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in
>> boot_params") broke kexec boot on EFI systems.  efi_get_rsdp_addr()
>> in the early parsing code tries to search RSDP from EFI table but
>> that will crash because the table address is virtual when the kernel
>> was booted by kexec.
>> 
>> In the case of kexec, physical address of EFI tables is provided
>> via efi_setup_data in boot_params, which is set up by kexec(1).
>> 
>> Factor out the table parsing code and use different pointers depending
>> on whether the kernel is booted by kexec or not.
>> 
>> Fixes: 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in 
>> boot_params")
>> Signed-off-by: Jun'ichi Nomura 
>> Acked-by: Baoquan He 
>> Cc: Chao Fan 
>> Cc: Borislav Petkov 
>> Cc: Dave Young 
[...]
>
>I failed to kexec reboot on my laptop, kernel panics too quick,  I'm not sure
>this is caused by your patch though.
>
>Actually there are something probably i915 changes break kexec,  the
>above test is with "nomodeset" which should work.
>
>Let me do more testing and update here tomorrow.
>

Hi Dave,

Yesterday I tested the normal kexec case; today I tested the kdump
case. Since kdump adds "nokaslr" to the cmdline, I dropped it from
KDUMP_COMMANDLINE_APPEND.
It booted OK, so the patch works in both normal kexec and kdump.

[root@localhost ~]# echo 1 > /proc/sys/kernel/sysrq
[root@localhost ~]# echo c > /proc/sysrq-trigger
[   67.776136] sysrq: Trigger a crash
[   67.777412] Kernel panic - not syncing: sysrq triggered crash
[   67.779429] CPU: 1 PID: 1652 Comm: bash Kdump: loaded Not tainted 5.1.0-rc3+ #4
[   67.780755] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 0.0.0 02/06/2015
[   67.782062] Call Trace:
[   67.782490]  dump_stack+0x5c/0x80
[   67.783049]  panic+0x101/0x2a7
[   67.783560]  ? printk+0x58/0x6f
[   67.784091]  sysrq_handle_crash+0x11/0x11
[   67.784762]  __handle_sysrq.cold.7+0x45/0xf2
[   67.785467]  write_sysrq_trigger+0x2b/0x30
[   67.786087]  proc_reg_write+0x39/0x60
[   67.786597]  vfs_write+0xa5/0x1a0
[   67.787061]  ksys_write+0x4f/0xb0
[   67.787492]  do_syscall_64+0x5b/0x160
[   67.788010]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[   67.788740] RIP: 0033:0x7f66266fbed8
[   67.789239] Code: 89 02 48 c7 c0 ff ff ff ff eb b3 0f 1f 80 00 00 00 00 f3 0f 1e fa 48 8d 05 45 78 0d 00 8b 00 85 c0 75 17 b8 01 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 58 c3 0f 1f 80 00 00 00 00 41 54 49 89 d4 55
[   67.791325] RSP: 002b:7ffecdaf6138 EFLAGS: 0246 ORIG_RAX: 0001
[   67.792084] RAX: ffda RBX: 0002 RCX: 7f66266fbed8
[   67.792820] RDX: 0002 RSI: 55dcc8d29880 RDI: 0001
[   67.793515] RBP: 55dcc8d29880 R08: 000a R09: 7ffecdaf5cc0
[   67.794276] R10: 000a R11: 0246 R12: 7f66267cf780
[   67.795017] R13: 0002 R14: 7f66267ca740 R15: 0002
early console in extract_kernel
input_data: 0x376033b1
input_len: 0x008412d4
output: 0x3600
output_len: 0x01e15844
kernel_total_size: 0x01e2c000
trampoline_32bit: 0x0009d000
booted via startup_64()


Physical KASLR disabled: no suitable memory region!

Virtual KASLR using RDRAND RDTSC...

Decompressing Linux... Parsing ELF... Performing relocations... done.
Booting the kernel.
[...]
 Starting Kdump Vmcore Save Service...
kdump: dump target is /dev/mapper/fedora-root
kdump: saving to /sysroot//var/crash/127.0.0.1-2019-04-03-01:28:01/
[3.551609] EXT4-fs (dm-0): re-mounted. Opts: (null)
kdump: saving vmcore-dmesg.txt
kdump: saving vmcore-dmesg.txt complete
kdump: saving vmcore
Copying data  : [100.0 %] |   eta: 0s
kdump: saving vmcore complete

Thanks,
Chao Fan

>Thanks
>Dave
>
>
>





[PATCH 3/3] kdump: update Documentation about crashkernel on arm64

2019-04-02 Thread Chen Zhou
Now we support crashkernel=X,[high,low] on arm64, update the
Documentation.

Signed-off-by: Chen Zhou 
---
 Documentation/admin-guide/kernel-parameters.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 27a5f8c..6772f4f 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -715,14 +715,14 @@
Documentation/kdump/kdump.txt for an example.
 
crashkernel=size[KMG],high
-   [KNL, x86_64] range could be above 4G. Allow kernel
+   [KNL, x86_64, arm64] range could be above 4G. Allow kernel
to allocate physical memory region from top, so could
be above 4G if system have more than 4G ram installed.
Otherwise memory region will be allocated below 4G, if
available.
It will be ignored if crashkernel=X is specified.
crashkernel=size[KMG],low
-   [KNL, x86_64] range under 4G. When crashkernel=X,high
+   [KNL, x86_64, arm64] range under 4G. When crashkernel=X,high
is passed, kernel could allocate physical memory region
above 4G, that cause second kernel crash on system
that require some amount of low memory, e.g. swiotlb
-- 
2.7.4




[PATCH 1/3] arm64: kdump: support reserving crashkernel above 4G

2019-04-02 Thread Chen Zhou
When crashkernel is reserved above 4G in memory, kernel should
reserve some amount of low memory for swiotlb and some DMA buffers.

Kernel would try to allocate at least 256M below 4G automatically
as x86_64 if crashkernel is above 4G. Meanwhile, support
crashkernel=X,[high,low] in arm64.
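
As a rough illustration of the default sizing rule described above (the
max() expression in reserve_crashkernel_low() in the diff that follows),
here is a minimal standalone sketch. The helper name crash_low_size() and
the MiB() macro are hypothetical and not part of the patch; the patch itself
takes the swiotlb size from swiotlb_size_or_default().

```c
#include <stdint.h>

/* Hypothetical helper macro: convert mebibytes to bytes. */
#define MiB(x) ((uint64_t)(x) << 20)

/*
 * Default low-memory reservation when no crashkernel=Y,low is given:
 * swiotlb size plus 8M of slack, but never less than 256M.
 */
uint64_t crash_low_size(uint64_t swiotlb_bytes)
{
	uint64_t want = swiotlb_bytes + MiB(8);

	return want > MiB(256) ? want : MiB(256);
}
```

With a 64M swiotlb the 256M floor applies; only a swiotlb larger than 248M
would push the reservation above that floor.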

Signed-off-by: Chen Zhou 
---
 arch/arm64/kernel/setup.c |  3 ++
 arch/arm64/mm/init.c  | 71 +--
 2 files changed, 71 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 413d566..82cd9a0 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -243,6 +243,9 @@ static void __init request_standard_resources(void)
request_resource(res, &kernel_data);
 #ifdef CONFIG_KEXEC_CORE
/* Userspace will find "Crash kernel" region in /proc/iomem. */
+   if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+   crashk_low_res.end <= res->end)
+   request_resource(res, &crashk_low_res);
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6bc1350..ceb2a25 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -64,6 +64,57 @@ EXPORT_SYMBOL(memstart_addr);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+static int __init reserve_crashkernel_low(void)
+{
+   unsigned long long base, low_base = 0, low_size = 0;
+   unsigned long total_low_mem;
+   int ret;
+
+   total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
+
+   /* crashkernel=Y,low */
+   ret = parse_crashkernel_low(boot_command_line, total_low_mem, &low_size, &base);
+   if (ret) {
+   /*
+* two parts from lib/swiotlb.c:
+* -swiotlb size: user-specified with swiotlb= or default.
+*
+* -swiotlb overflow buffer: now hardcoded to 32k. We round it
+* to 8M for other buffers that may need to stay low too. Also
+* make sure we allocate enough extra low memory so that we
+* don't run out of DMA buffers for 32-bit devices.
+*/
+   low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
+   } else {
+   /* passed with crashkernel=0,low ? */
+   if (!low_size)
+   return 0;
+   }
+
+   low_base = memblock_find_in_range(0, 1ULL << 32, low_size, SZ_2M);
+   if (!low_base) {
+   pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
+   (unsigned long)(low_size >> 20));
+   return -ENOMEM;
+   }
+
+   ret = memblock_reserve(low_base, low_size);
+   if (ret) {
+   pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
+   return ret;
+   }
+
+   pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
+   (unsigned long)(low_size >> 20),
+   (unsigned long)(low_base >> 20),
+   (unsigned long)(total_low_mem >> 20));
+
+   crashk_low_res.start = low_base;
+   crashk_low_res.end   = low_base + low_size - 1;
+
+   return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -74,19 +125,28 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
unsigned long long crash_base, crash_size;
+   bool high = false;
int ret;
 
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
&crash_size, &crash_base);
/* no crashkernel= or invalid value specified */
-   if (ret || !crash_size)
-   return;
+   if (ret || !crash_size) {
+   /* crashkernel=X,high */
+   ret = parse_crashkernel_high(boot_command_line, memblock_phys_mem_size(),
+   &crash_size, &crash_base);
+   if (ret || !crash_size)
+   return;
+   high = true;
+   }
 
crash_size = PAGE_ALIGN(crash_size);
 
if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
-   crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
+   crash_base = memblock_find_in_range(0,
+   high ? memblock_end_of_DRAM()
+   : ARCH_LOW_ADDRESS_LIMIT,
crash_size, SZ_2M);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
@@ -112,6 +172,11 @@ static void __init reserve_crashkernel(void)
  

[PATCH 0/3] support reserving crashkernel above 4G on arm64 kdump

2019-04-02 Thread Chen Zhou
When crashkernel is reserved above 4G in memory, kernel should reserve
some amount of low memory for swiotlb and some DMA buffers. So there may
be two crash kernel regions, one is below 4G, the other is above 4G.

Crash dump kernel reads more than one crash kernel regions via a dtb
property under node /chosen,
linux,usable-memory-range = .
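
A minimal sketch of how a consumer might read up to two (base, size) pairs
from such a property, assuming the usual flattened-device-tree encoding of
big-endian 32-bit cells. The names parse_usable_ranges(), struct mem_range
and MAX_USABLE_RANGES are hypothetical; the kernel patch in this series does
the equivalent walk with dt_mem_next_cell().

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_USABLE_RANGES 2	/* hypothetical cap: one low and one high region */

struct mem_range {
	uint64_t base;
	uint64_t size;
};

/* Read one big-endian 32-bit cell from raw property bytes. */
static uint64_t read_be32(const uint8_t *p)
{
	return ((uint64_t)p[0] << 24) | ((uint64_t)p[1] << 16) |
	       ((uint64_t)p[2] << 8) | (uint64_t)p[3];
}

/* Combine `cells` consecutive cells into one value and advance the cursor. */
static uint64_t next_value(const uint8_t **p, int cells)
{
	uint64_t v = 0;

	while (cells-- > 0) {
		v = (v << 32) | read_be32(*p);
		*p += 4;
	}
	return v;
}

/* Returns the number of complete (base, size) pairs stored in out[]. */
int parse_usable_ranges(const uint8_t *prop, size_t len,
			int addr_cells, int size_cells,
			struct mem_range *out)
{
	const uint8_t *end = prop + len;
	size_t stride = 4 * (size_t)(addr_cells + size_cells);
	int nr = 0;

	if (stride == 0)
		return 0;

	while (nr < MAX_USABLE_RANGES && (size_t)(end - prop) >= stride) {
		out[nr].base = next_value(&prop, addr_cells);
		out[nr].size = next_value(&prop, size_cells);
		nr++;
	}
	return nr;
}
```

A trailing partial pair is ignored, and anything beyond the second range is
dropped, which mirrors the cap the series applies in the kernel.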

Besides, we need to modify kexec-tools:
  arm64: support more than one crash kernel regions

Chen Zhou (3):
  arm64: kdump: support reserving crashkernel above 4G
  arm64: kdump: support more than one crash kernel regions
  kdump: update Documentation about crashkernel on arm64

 Documentation/admin-guide/kernel-parameters.txt |   4 +-
 arch/arm64/kernel/setup.c   |   3 +
 arch/arm64/mm/init.c| 108 
 include/linux/memblock.h|   1 +
 mm/memblock.c   |  40 +
 5 files changed, 139 insertions(+), 17 deletions(-)

-- 
2.7.4




[PATCH 2/3] arm64: kdump: support more than one crash kernel regions

2019-04-02 Thread Chen Zhou
After commit (arm64: kdump: support reserving crashkernel above 4G),
there may be two crash kernel regions, one is below 4G, the other is
above 4G.

Crash dump kernel reads more than one crash kernel regions via a dtb
property under node /chosen,
linux,usable-memory-range = 

Signed-off-by: Chen Zhou 
---
 arch/arm64/mm/init.c | 37 +
 include/linux/memblock.h |  1 +
 mm/memblock.c| 40 
 3 files changed, 66 insertions(+), 12 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index ceb2a25..769c77a 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -64,6 +64,8 @@ EXPORT_SYMBOL(memstart_addr);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+# define CRASH_MAX_USABLE_RANGES 2
+
 static int __init reserve_crashkernel_low(void)
 {
unsigned long long base, low_base = 0, low_size = 0;
@@ -346,8 +348,8 @@ static int __init early_init_dt_scan_usablemem(unsigned long node,
const char *uname, int depth, void *data)
 {
struct memblock_region *usablemem = data;
-   const __be32 *reg;
-   int len;
+   const __be32 *reg, *endp;
+   int len, nr = 0;
 
if (depth != 1 || strcmp(uname, "chosen") != 0)
return 0;
@@ -356,22 +358,33 @@ static int __init early_init_dt_scan_usablemem(unsigned long node,
if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
return 1;
 
-   usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
-   usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+   endp = reg + (len / sizeof(__be32));
+   while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+   usablemem[nr].base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+   usablemem[nr].size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+   if (++nr >= CRASH_MAX_USABLE_RANGES)
+   break;
+   }
 
return 1;
 }
 
 static void __init fdt_enforce_memory_region(void)
 {
-   struct memblock_region reg = {
-   .size = 0,
-   };
-
-   of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
-
-   if (reg.size)
-   memblock_cap_memory_range(reg.base, reg.size);
+   int i, cnt = 0;
+   struct memblock_region regs[CRASH_MAX_USABLE_RANGES];
+
+   memset(regs, 0, sizeof(regs));
+   of_scan_flat_dt(early_init_dt_scan_usablemem, regs);
+
+   for (i = 0; i < CRASH_MAX_USABLE_RANGES; i++)
+   if (regs[i].size)
+   cnt++;
+   else
+   break;
+   if (cnt)
+   memblock_cap_memory_ranges(regs, cnt);
 }
 
 void __init arm64_memblock_init(void)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 47e3c06..aeade34 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -446,6 +446,7 @@ phys_addr_t memblock_start_of_DRAM(void);
 phys_addr_t memblock_end_of_DRAM(void);
 void memblock_enforce_memory_limit(phys_addr_t memory_limit);
 void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
+void memblock_cap_memory_ranges(struct memblock_region *regs, int cnt);
 void memblock_mem_limit_remove_map(phys_addr_t limit);
 bool memblock_is_memory(phys_addr_t addr);
 bool memblock_is_map_memory(phys_addr_t addr);
diff --git a/mm/memblock.c b/mm/memblock.c
index 28fa8926..1a7f4ee7c 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1697,6 +1697,46 @@ void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
base + size, PHYS_ADDR_MAX);
 }
 
+void __init memblock_cap_memory_ranges(struct memblock_region *regs, int cnt)
+{
+   int start_rgn[INIT_MEMBLOCK_REGIONS], end_rgn[INIT_MEMBLOCK_REGIONS];
+   int i, j, ret, nr = 0;
+
+   for (i = 0; i < cnt; i++) {
+   ret = memblock_isolate_range(&memblock.memory, regs[i].base,
+   regs[i].size, &start_rgn[i], &end_rgn[i]);
+   if (ret)
+   break;
+   nr++;
+   }
+   if (!nr)
+   return;
+
+   /* remove all the MAP regions */
+   for (i = memblock.memory.cnt - 1; i >= end_rgn[nr - 1]; i--)
+   if (!memblock_is_nomap(&memblock.memory.regions[i]))
+   memblock_remove_region(&memblock.memory, i);
+
+   for (i = nr - 1; i > 0; i--)
+   for (j = start_rgn[i] - 1; j >= end_rgn[i - 1]; j--)
+   if (!memblock_is_nomap(&memblock.memory.regions[j]))
+   memblock_remove_region(&memblock.memory, j);
+
+   for (i = start_rgn[0] - 1; i >= 0; i--)
+   if (!memblock_is_nomap(&memblock.memory.regions[i]))
+   memblock_remove_region(&memblock.memory, i);
+
+   /* truncate the reserved regions */
+   memblock_remove_range(&memblock.reserved, 0, regs[0].base);
+
+   for (i = nr - 1; i > 0; i--)
+   memblock_remove_range(&memblock.reserved,
+   regs[i].base, regs[i - 

[PATCH] arm64: support more than one crash kernel regions

2019-04-02 Thread Chen Zhou
When crashkernel is reserved above 4G in memory, kernel should
reserve some amount of low memory for swiotlb and some DMA buffers.
So there may be two crash kernel regions, one is below 4G, the other
is above 4G.

Currently, there is only one crash kernel region on arm64, and pass
"linux,usable-memory-range = " property to crash dump
kernel. Now, we pass
"linux,usable-memory-range = " to crash
dump kernel to support two crash kernel regions and load crash
kernel high.

Signed-off-by: Chen Zhou 
---
 kexec/arch/arm64/crashdump-arm64.c | 44 +
 kexec/arch/arm64/crashdump-arm64.h |  3 +-
 kexec/arch/arm64/kexec-arm64.c | 57 +-
 3 files changed, 72 insertions(+), 32 deletions(-)

diff --git a/kexec/arch/arm64/crashdump-arm64.c b/kexec/arch/arm64/crashdump-arm64.c
index 4fd7aa8..158e778 100644
--- a/kexec/arch/arm64/crashdump-arm64.c
+++ b/kexec/arch/arm64/crashdump-arm64.c
@@ -32,11 +32,11 @@ static struct memory_ranges system_memory_rgns = {
 };
 
 /* memory range reserved for crashkernel */
-struct memory_range crash_reserved_mem;
+struct memory_range crash_reserved_mem[CRASH_MAX_RESERVED_RANGES];
 struct memory_ranges usablemem_rgns = {
.size = 0,
-   .max_size = 1,
-   .ranges = &crash_reserved_mem,
+   .max_size = CRASH_MAX_RESERVED_RANGES,
+   .ranges = crash_reserved_mem,
 };
 
 struct memory_range elfcorehdr_mem;
@@ -108,7 +108,7 @@ int is_crashkernel_mem_reserved(void)
if (!usablemem_rgns.size)
kexec_iomem_for_each_line(NULL, iomem_range_callback, NULL);
 
-   return crash_reserved_mem.start != crash_reserved_mem.end;
+   return usablemem_rgns.size;
 }
 
 /*
@@ -122,6 +122,8 @@ int is_crashkernel_mem_reserved(void)
  */
 static int crash_get_memory_ranges(void)
 {
+   int i;
+
/*
 * First read all memory regions that can be considered as
 * system memory including the crash area.
@@ -129,16 +131,19 @@ static int crash_get_memory_ranges(void)
if (!usablemem_rgns.size)
kexec_iomem_for_each_line(NULL, iomem_range_callback, NULL);
 
-   /* allow only a single region for crash dump kernel */
-   if (usablemem_rgns.size != 1)
+   /* allow one or two region for crash dump kernel */
+   if (!usablemem_rgns.size)
return -EINVAL;
 
-   dbgprint_mem_range("Reserved memory range", &crash_reserved_mem, 1);
+   dbgprint_mem_range("Reserved memory range",
+   usablemem_rgns.ranges, usablemem_rgns.size);
 
-   if (mem_regions_exclude(&system_memory_rgns, &crash_reserved_mem)) {
-   fprintf(stderr,
-   "Error: Number of crash memory ranges excedeed the max limit\n");
-   return -ENOMEM;
+   for (i = 0; i < usablemem_rgns.size; i++) {
+   if (mem_regions_exclude(&system_memory_rgns, &crash_reserved_mem[i])) {
+   fprintf(stderr,
+   "Error: Number of crash memory ranges excedeed the max limit\n");
+   return -ENOMEM;
+   }
}
 
/*
@@ -199,7 +204,8 @@ int load_crashdump_segments(struct kexec_info *info)
return EFAILED;
 
elfcorehdr = add_buffer_phys_virt(info, buf, bufsz, bufsz, 0,
-   crash_reserved_mem.start, crash_reserved_mem.end,
+   crash_reserved_mem[usablemem_rgns.size - 1].start,
+   crash_reserved_mem[usablemem_rgns.size - 1].end,
-1, 0);
 
elfcorehdr_mem.start = elfcorehdr;
@@ -217,21 +223,23 @@ int load_crashdump_segments(struct kexec_info *info)
  * virt_to_phys() in add_segment().
  * So let's fix up those values for later use so the memory base
  * (arm64_mm.phys_offset) will be correctly replaced with
- * crash_reserved_mem.start.
+ * crash_reserved_mem[usablemem_rgns.size - 1].start.
  */
 void fixup_elf_addrs(struct mem_ehdr *ehdr)
 {
struct mem_phdr *phdr;
int i;
 
-   ehdr->e_entry += - arm64_mem.phys_offset + crash_reserved_mem.start;
+   ehdr->e_entry += -arm64_mem.phys_offset +
+   crash_reserved_mem[usablemem_rgns.size - 1].start;
 
for (i = 0; i < ehdr->e_phnum; i++) {
phdr = &ehdr->e_phdr[i];
if (phdr->p_type != PT_LOAD)
continue;
phdr->p_paddr +=
-   (-arm64_mem.phys_offset + crash_reserved_mem.start);
+   (-arm64_mem.phys_offset +
+crash_reserved_mem[usablemem_rgns.size - 1].start);
}
 }
 
@@ -240,11 +248,11 @@ int get_crash_kernel_load_range(uint64_t *start, uint64_t *end)
if (!usablemem_rgns.size)
kexec_iomem_for_each_line(NULL, iomem_range_callback, NULL);
 
-   if (!crash_reserved_mem.end)
+   if (!usablemem_rgns.size)
return -1;
 
-   *start = crash_reserved_mem.start;
-   *end = crash_reserved_mem.end;
+ 

Re: [PATCH v3 1/3] arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo

2019-04-02 Thread James Morse
Hi Kazu,

On 27/03/2019 16:07, Kazuhito Hagio wrote:
> On 3/26/2019 12:36 PM, James Morse wrote:
>> On 20/03/2019 05:09, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA architecture extension availability, arm64 hardware
>>> which supports this extension can support a virtual address-space upto
>>> 52-bits.
>>>
>>> Since at the moment we enable the support of this extension in kernel
>>> via CONFIG flags, e.g.
>>>  - User-space 52-bit LVA via CONFIG_ARM64_USER_VA_BITS_52
>>>
>>> so, there is no clear mechanism in the user-space right now to
>>> determine these CONFIG flag values and hence determine the maximum
>>> virtual address space supported by the underlying kernel.
>>>
>>> User-space tools like 'makedumpfile' therefore are broken currently
>>> as they have no proper method to calculate the 'PTRS_PER_PGD' value
>>> which is required to perform a page table walk to determine the
>>> physical address of a corresponding virtual address found in
>>> kcore/vmcoreinfo.
>>>
>>> If one appends 'PTRS_PER_PGD' number to vmcoreinfo for arm64,
>>> it can be used in user-space to determine the maximum virtual address
>>> supported by underlying kernel.
>>
>> I don't think this really solves the problem, it feels fragile.
>>
>> I can see how vmcoreinfo tells you VA_BITS==48, PAGE_SIZE==64K and 
>> PTRS_PER_PGD=1024.
>> You can use this to work out that the top level page table size isn't 
>> consistent with a
>> 48bit VA, so 52bit VA must be in use...
>>
>> But wasn't your problem walking the kernel page tables? In particular the 
>> offset that we
>> apply because the tables were based on a 48bit VA shifted up in 
>> swapper_pg_dir.
>>
>> Where does the TTBR1_EL1 offset come from with this property? I assume 
>> makedumpfile
>> hard-codes it when it sees 52bit is in use ... somewhere.

> My understanding is that the TTBR1_EL1 offset comes from a kernel
> virtual address with the exported PTRS_PER_PGD.
> 
> With T1SZ is 48bit and T0SZ is 52bit,

(PTRS_PER_PGD doesn't tell you this, PTRS_PER_PGD lets you spot something odd is
happening, and this just happens to be the only odd combination today.)


> kva = 0xffff000000000000  <--- start of kernel virtual address

Does makedumpfile have this value? If the kernel were using 52bit VA for TTBR1
this value would be different.


> pgd_index(kva) = (kva >> PGDIR_SHIFT) & (PTRS_PER_PGD - 1)
>                = (0xffff000000000000 >> 42) & (1024 - 1)
>                = 0x003fffc0 & 0x3ff
>                = 0x3c0  <--- the offset (0x3c0) is included
> 
> This is what kernel does now, so makedumpfile also wants to do.

Sure, and it would work today. I'm worried about tomorrow, where we support
something new, and need to bundle new information out through vmcoreinfo. This
ends up being used to fingerprint the kernel support, instead of as the value it
was intended to be.
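
The pgd_index() arithmetic quoted above can be checked with a small
standalone sketch. PGDIR_SHIFT = 42 and PTRS_PER_PGD = 1024 are taken from
the quoted calculation; the two start addresses used in the note below
(0xffff000000000000 for a 48-bit TTBR1 range, 0xfff0000000000000 for a
52-bit one) are assumptions inferred from that arithmetic, not values read
from vmcoreinfo.

```c
#include <stdint.h>

/* Assumed configuration from the quoted example (64K pages). */
#define PGDIR_SHIFT 42

/* Index into the top-level page table for a virtual address. */
uint64_t pgd_index(uint64_t va, uint64_t ptrs_per_pgd)
{
	return (va >> PGDIR_SHIFT) & (ptrs_per_pgd - 1);
}
```

With 1024 entries, pgd_index(0xffff000000000000, 1024) gives 0x3c0 and
pgd_index(0xfff0000000000000, 1024) gives 0, so the result depends on
knowing where the kernel range starts, which is the value being asked about.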


>> We haven't solved the problem!
>>
>> Today __cpu_setup() sets T0SZ and T1SZ differently for 52bit VA, but in the 
>> future it
>> could set them the same, or different the other-way-round.
>>
>> Will makedumpfile using this value keep working once T1SZ is 52bit VA too? 
>> In this case
>> there would be no ttbr offset.
> 
> If T1SZ is 52bit, probably kernel virtual address starts from
> 0xfff0000000000000,

I didn't think this 'bottom of the ttbr1 mapping range' value was exposed
anywhere.
Where can user-space get this from? (I can't see it in the vmcoreinfo list)


> then the offset becomes 0 with the pgd_index() above.
> I think makedumpfile will keep working with that.


Steve mentions a 52/48 combination in his kernel series:
https://lore.kernel.org/linux-arm-kernel/20190218170245.14915-1-steve.cap...@arm.com/


I think vmcoreinfo-users will eventually need to spot 52bit used in TTBR1
and/or TTBR0, and possibly: configured, but not enabled in either. (This is
because the bits are also used for pointer-auth; the kernel may be built for
both pointer-auth and 52-bit VA, and choose which to enable at boot based on
some policy.)

I don't see how you can do this with one value.
I'd like to get this right now, so user-space doesn't need updating again!


Thanks,

James



Re: [PATCH v3 1/3] arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo

2019-04-02 Thread James Morse
Hi Bhupesh,

On 28/03/2019 11:42, Bhupesh Sharma wrote:
> On 03/26/2019 10:06 PM, James Morse wrote:
>> On 20/03/2019 05:09, Bhupesh Sharma wrote:
>>> With ARMv8.2-LVA architecture extension availability, arm64 hardware
>>> which supports this extension can support a virtual address-space upto
>>> 52-bits.
>>>
>>> Since at the moment we enable the support of this extension in kernel
>>> via CONFIG flags, e.g.
>>>   - User-space 52-bit LVA via CONFIG_ARM64_USER_VA_BITS_52
>>>
>>> so, there is no clear mechanism in the user-space right now to
>>> determine these CONFIG flag values and hence determine the maximum
>>> virtual address space supported by the underlying kernel.
>>>
>>> User-space tools like 'makedumpfile' therefore are broken currently
>>> as they have no proper method to calculate the 'PTRS_PER_PGD' value
>>> which is required to perform a page table walk to determine the
>>> physical address of a corresponding virtual address found in
>>> kcore/vmcoreinfo.
>>>
>>> If one appends 'PTRS_PER_PGD' number to vmcoreinfo for arm64,
>>> it can be used in user-space to determine the maximum virtual address
>>> supported by underlying kernel.
>>
>> I don't think this really solves the problem, it feels fragile.
>>
>> I can see how vmcoreinfo tells you VA_BITS==48, PAGE_SIZE==64K and 
>> PTRS_PER_PGD=1024.
>> You can use this to work out that the top level page table size isn't 
>> consistent with a
>> 48bit VA, so 52bit VA must be in use...
>>
>> But wasn't your problem walking the kernel page tables? In particular the 
>> offset that we
>> apply because the tables were based on a 48bit VA shifted up in 
>> swapper_pg_dir.
>>
>> Where does the TTBR1_EL1 offset come from with this property? I assume 
>> makedumpfile
>> hard-codes it when it sees 52bit is in use ... somewhere.
>> We haven't solved the problem!

> But isn't the TTBR1_EL1 offset already appended by the kernel via 
> e842dfb5a2d3 ("arm64:
> mm: Offset TTBR1 to allow 52-bit PTRS_PER_PGD")
> in case of kernel configuration where 52-bit userspace VAs are possible.

> Accordingly we have the following assembler helper in 
> 'arch/arm64/include/asm/assembler.h':
> 
>    .macro  offset_ttbr1, ttbr
> #ifdef CONFIG_ARM64_52BIT_VA
>    orr \ttbr, \ttbr, #TTBR1_BADDR_4852_OFFSET
> #endif
>    .endm
> 
> where:
> #ifdef CONFIG_ARM64_52BIT_VA
> /* Must be at least 64-byte aligned to prevent corruption of the TTBR */
> #define TTBR1_BADDR_4852_OFFSET    (((UL(1) << (52 - PGDIR_SHIFT)) - \
>     (UL(1) << (48 - PGDIR_SHIFT))) * 8)
> #endif

Sure, and all this would work today, because there is only one weird
combination. But once we support another combination of 52bit-va, you'd either
need another value, or to start using PTRS_PER_PGD as a flag for
v5.1_FUNNY_BEHAVIOUR_ONE.
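
For reference, the value of the quoted TTBR1_BADDR_4852_OFFSET macro can be
worked out with a short standalone sketch. PGDIR_SHIFT = 42 is an assumption
(the 64K-page configuration discussed in this thread), and the function name
is hypothetical.

```c
#include <stdint.h>

/* Assumed: 64K pages, so each PGD entry covers 2^42 bytes of VA. */
#define PGDIR_SHIFT 42

/*
 * Byte offset from the start of a PGD sized for 52-bit VA to the first
 * entry a 48-bit VA uses, following the quoted macro term by term.
 */
uint64_t ttbr1_baddr_4852_offset(void)
{
	uint64_t entries_52 = UINT64_C(1) << (52 - PGDIR_SHIFT);	/* 1024 */
	uint64_t entries_48 = UINT64_C(1) << (48 - PGDIR_SHIFT);	/* 64 */

	return (entries_52 - entries_48) * 8;	/* 8 bytes per entry */
}
```

Under that assumption the offset is 0x1e00 bytes, i.e. 0x3c0 entries, the
same index that shows up in the pgd_index() discussion elsewhere in this
thread.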


[...]

> Note that the above computation holds true both for PTRS_PER_PGD = 64 (48-bit 
> kernel with
> 48-bit User VA) and 1024 (48-bit with 52-bit User VA) cases. And these are the
> configurations for which we are trying to fix the user-space regressions 
> reported (on
> arm64) recently.

... and revisit it when there is another combination?


>> Today __cpu_setup() sets T0SZ and T1SZ differently for 52bit VA, but in the 
>> future it
>> could set them the same, or different the other-way-round.
>>
>> Will makedumpfile using this value keep working once T1SZ is 52bit VA too? 
>> In this case
>> there would be no ttbr offset.
>>
>> If you need another vmcoreinfo flag once that happens, we've done something 
>> wrong here.
> 
> I am currently experimenting with Steve's patches for 52-bit kernel VA
> () and will comment more on the same when I 
> am able to
> get the user-space utilities like makedumpfile and kexec-tools to work with 
> the same on
> both ARMv8 Fast Simulator model and older CPUs which don't support ARMv8.2 
> extensions.


> However, I think we should not hold up fixes for regressions already 
> reported, because the
> 52-bit kernel VA changes probably still need some more rework.

Chucking things into vmcoreinfo isn't free: we need to keep them there forever,
otherwise yesterday's version of the tools breaks. Can we take the time to get
this right for the cases we know about?

Yes the kernel code is going to move around, this is why the information we
expose via vmcoreinfo needs to be thought through: something we would always
need, regardless of how the kernel implements it.


>> (Not to mention what happens if the TTBR1_EL1 uses 52bit va, but TTBR0_EL1 
>> doesn't)
> 
> I am wondering if there are any real users of the above combination.

Heh! Is there any hardware that supports this?

Pointer-auth changes all this again, as we may prefer to use the bits for
pointer-auth in one TTB or the other. PTRS_PER_PGD may show the 52bit value in
this case, but neither TTBR is mapping 52bits of VA.


> So far, I have generally come across discussions where the following 
> variations 

Re: [PATCH 1/2 RESEND v10] x86/mm, resource: add a new I/O resource descriptor 'IORES_DESC_RESERVED'

2019-04-02 Thread Borislav Petkov
On Tue, Apr 02, 2019 at 08:02:04PM +0800, lijiang wrote:
> These regions(E820_TYPE_{RESERVED_KERN,RAM,UNUSABLE}) are still marked as
> IORES_DESC_NONE and should not be mapped encrypted when using ioremap().

Seems to me like we're going in circles. You said here:

https://lkml.kernel.org/r/9eb61523-7a08-24c4-ac15-050537bd9...@redhat.com

that the kernel doesn't pass the e820 reserved ranges to the second
kernel.

I suggested to use a special IORES descriptor for them -
IORES_DESC_RESERVED.

Now you say that that is not enough and some of those you want passed,
are still marked as IORES_DESC_NONE.

Sounds to me like you need to try again.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.



Re: [PATCH 1/2 RESEND v10] x86/mm, resource: add a new I/O resource descriptor 'IORES_DESC_RESERVED'

2019-04-02 Thread lijiang
On 2019-04-02 17:06, Borislav Petkov wrote:
> On Fri, Mar 29, 2019 at 08:39:13PM +0800, Lianbo Jiang wrote:
>> -static int __ioremap_check_desc_other(struct resource *res)
>> +/*
>> + * Originally, these areas described as IORES_DESC_NONE are not mapped
>> + * as encrypted when using ioremap(), for example, E820_TYPE_{RESERVED,
>> + * RESERVED_KERN,RAM,UNUSABLE}, etc. It checks for a resource that is
>> + * not described as IORES_DESC_NONE, which can make sure the reserved
>> + * areas are not mapped as encrypted when using ioremap().
>> + *
>> + * Now IORES_DESC_RESERVED has been created for the reserved areas so
>> + * the check needs to be expanded so that these areas are not mapped
>> + * encrypted when using ioremap().
>> + */
>> +static int __ioremap_check_desc_none_and_reserved(struct resource *res)
>>  {
>> -return (res->desc != IORES_DESC_NONE);
>> +return ((res->desc != IORES_DESC_NONE) &&
> 
> Why is this still checking IORES_DESC_NONE when the idea is to have this
> specific IORES_DESC_RESERVED for all marked as *reserved* regions in
> e820 which should not be mapped encrypted?
> 
> IOW, which regions are still marked as IORES_DESC_NONE and should not be
> mapped encrypted?
> 
Thanks for your comment.

These regions(E820_TYPE_{RESERVED_KERN,RAM,UNUSABLE}) are still marked as
IORES_DESC_NONE and should not be mapped encrypted when using ioremap().
Please refer to the following function.

static unsigned long __init e820_type_to_iores_desc(struct e820_entry *entry)
{
	switch (entry->type) {
	case E820_TYPE_ACPI:		return IORES_DESC_ACPI_TABLES;
	case E820_TYPE_NVS:		return IORES_DESC_ACPI_NV_STORAGE;
	case E820_TYPE_PMEM:		return IORES_DESC_PERSISTENT_MEMORY;
	case E820_TYPE_PRAM:		return IORES_DESC_PERSISTENT_MEMORY_LEGACY;
	case E820_TYPE_RESERVED:	return IORES_DESC_RESERVED;
	case E820_TYPE_RESERVED_KERN:	/* Fall-through: */
	case E820_TYPE_RAM:		/* Fall-through: */
	case E820_TYPE_UNUSABLE:	/* Fall-through: */
	default:			return IORES_DESC_NONE;
	}
}


Thanks.
Lianbo
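
The check being discussed can be sketched in user space as below. The quoted
hunk is cut off after the "&&", so the second comparison is an assumption based
on the function name and its comment, not the actual patch. The enum values and
the cut-down resource struct are stand-ins for illustration only.

```c
#include <stdbool.h>

/* Stand-in descriptor values; the real ones live in the kernel headers. */
enum iores_desc_sketch {
	IORES_DESC_NONE,
	IORES_DESC_ACPI_TABLES,
	IORES_DESC_ACPI_NV_STORAGE,
	IORES_DESC_PERSISTENT_MEMORY,
	IORES_DESC_PERSISTENT_MEMORY_LEGACY,
	IORES_DESC_RESERVED,
};

/* Cut-down stand-in for struct resource: only the descriptor matters here. */
struct res_sketch {
	enum iores_desc_sketch desc;
};

/*
 * Returns true only for resources whose descriptor is neither NONE nor
 * RESERVED. The second comparison is assumed; the quoted patch is truncated.
 */
static bool desc_is_neither_none_nor_reserved(const struct res_sketch *res)
{
	return res->desc != IORES_DESC_NONE &&
	       res->desc != IORES_DESC_RESERVED;
}
```

Read against e820_type_to_iores_desc() above, the helper returns false both for
E820_TYPE_RESERVED (now IORES_DESC_RESERVED) and for the RESERVED_KERN, RAM and
UNUSABLE types that still fall through to IORES_DESC_NONE, which is why both
comparisons appear in the proposed check.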



Re: [PATCH 1/3 v2] x86/kexec: Do not map the kexec area as decrypted when SEV is active

2019-04-02 Thread Borislav Petkov
On Wed, Mar 27, 2019 at 01:36:27PM +0800, Lianbo Jiang wrote:
> Currently, the arch_kexec_post_{alloc,free}_pages() unconditionally
> maps the kexec area as decrypted. This works fine when SME is active.
> Because in SME, the first kernel is loaded in decrypted area by the
> BIOS, so the second kernel must be also loaded into the decrypted
> memory.
> 
> When SEV is active, the first kernel is loaded into the encrypted
> area, so the second kernel must be also loaded into the encrypted
> memory. Lets make sure that arch_kexec_post_{alloc,free}_pages()
> does not clear the memory encryption mask from the kexec area when
> SEV is active.

This commit message still doesn't explain the big picture why you want
this change.

And it must explain it because it might be all clear in your head now
but months from now, you, we, all would've forgotten why this change was
needed.

So pls add blurb that this whole effort is being done so that SEV VMs
can kdump too. I.e., the 10000ft picture.

Anyone must be able to figure out *why* a change has been done just by
doing git archeology. So make sure you explain it properly.

If unsure, try to put yourself in the shoes of some future kernel
developer who is trying to find out why this change has been done. Now
read the commit message you've written. Does it make any sense to him? I
think not.

Do you catch my drift?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.
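
The behaviour described in the quoted commit message can be sketched as
follows. This is a user-space illustration, not the patch itself: the flag
sev_is_active and the helper mark_area_decrypted() are hypothetical stand-ins
for the kernel's SEV query and for clearing the memory encryption mask.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's "is SEV active?" query. */
static bool sev_is_active;

/* Records whether the sketch cleared the encryption mask on the kexec area. */
static bool area_decrypted;

/* Hypothetical stand-in for clearing the encryption mask on the area. */
static int mark_area_decrypted(void)
{
	area_decrypted = true;
	return 0;
}

/*
 * Sketch of the post-allocation hook: under SEV the kexec area stays
 * encrypted, otherwise (the SME case) it is mapped decrypted as before.
 */
static int kexec_post_alloc_pages_sketch(void)
{
	if (sev_is_active)
		return 0;

	return mark_area_decrypted();
}
```

The early return is the whole change in this sketch: with SEV the first kernel
runs from encrypted memory, so the second kernel has to be loaded into
encrypted memory as well.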



[PATCH v3] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Junichi Nomura
Commit 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in
boot_params") broke kexec boot on EFI systems.  efi_get_rsdp_addr()
in the early parsing code tries to search RSDP from EFI table but
that will crash because the table address is virtual when the kernel
was booted by kexec.

In the case of kexec, physical address of EFI tables is provided
via efi_setup_data in boot_params, which is set up by kexec(1).

Factor out the table parsing code and use different pointers depending
on whether the kernel is booted by kexec or not.

Fixes: 3a63f70bf4c3a ("x86/boot: Early parse RSDP and save it in boot_params")
Signed-off-by: Jun'ichi Nomura 
Acked-by: Baoquan He 
Tested-by: Chao Fan 
Cc: Borislav Petkov 
Cc: Dave Young 

--
v2: Added comments above __efi_get_rsdp_addr() and kexec_get_rsdp_addr() 

v3: Properly ifdef out 64bit-only kexec code to avoid 32bit build warnings

diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 0ef4ad5..d9f9abd 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -44,17 +44,114 @@ static acpi_physical_address get_acpi_rsdp(void)
return addr;
 }
 
-/* Search EFI system tables for RSDP. */
-static acpi_physical_address efi_get_rsdp_addr(void)
+#if defined(CONFIG_EFI) && defined(CONFIG_X86_64)
+static unsigned long efi_get_kexec_setup_data_addr(void)
+{
+   struct setup_data *data;
+   u64 pa_data;
+
+   pa_data = boot_params->hdr.setup_data;
+   while (pa_data) {
+   data = (struct setup_data *) pa_data;
+   if (data->type == SETUP_EFI)
+   return pa_data + sizeof(struct setup_data);
+   pa_data = data->next;
+   }
+   return 0;
+}
+#endif
+
+#ifdef CONFIG_EFI
+/*
+ * Search EFI system tables for RSDP.  If both ACPI_20_TABLE_GUID and
+ * ACPI_TABLE_GUID are found, take the former, which has more features.
+ */
+static acpi_physical_address
+__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
+   bool efi_64)
 {
acpi_physical_address rsdp_addr = 0;
+   int i;
 
+   /* Get EFI tables from systab. */
+   for (i = 0; i < nr_tables; i++) {
+   acpi_physical_address table;
+   efi_guid_t guid;
+
+   if (efi_64) {
+   efi_config_table_64_t *tbl = (efi_config_table_64_t *) config_tables + i;
+
+   guid  = tbl->guid;
+   table = tbl->table;
+
+   if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
+   debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
+   return 0;
+   }
+   } else {
+   efi_config_table_32_t *tbl = (efi_config_table_32_t *) config_tables + i;
+
+   guid  = tbl->guid;
+   table = tbl->table;
+   }
+
+   if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
+   rsdp_addr = table;
+   else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
+   return table;
+   }
+
+   return rsdp_addr;
+}
+#endif
+
+/*
+ * EFI/kexec support is only added for 64bit. So we don't have to
+ * care 32bit case.
+ */
+static acpi_physical_address kexec_get_rsdp_addr(void)
+{
+#if defined(CONFIG_EFI) && defined(CONFIG_X86_64)
+   efi_system_table_64_t *systab;
+   struct efi_setup_data *esd;
+   struct efi_info *ei;
+   char *sig;
+
+   esd = (struct efi_setup_data *) efi_get_kexec_setup_data_addr();
+   if (!esd)
+   return 0;
+
+   if (!esd->tables) {
+   debug_putstr("Wrong kexec SETUP_EFI data.\n");
+   return 0;
+   }
+
+   ei = &boot_params->efi_info;
+   sig = (char *)&ei->efi_loader_signature;
+   if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+   debug_putstr("Wrong kexec EFI loader signature.\n");
+   return 0;
+   }
+
+   /* Get systab from boot params. */
+   systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
+   if (!systab)
+   error("EFI system table not found in kexec boot_params.");
+
+   return __efi_get_rsdp_addr((unsigned long) esd->tables,
+  systab->nr_tables, true);
+#else
+   return 0;
+#endif
+}
+
+static acpi_physical_address efi_get_rsdp_addr(void)
+{
 #ifdef CONFIG_EFI
-   unsigned long systab, systab_tables, config_tables;
+   unsigned long systab, config_tables;
unsigned int nr_tables;
struct efi_info *ei;
bool efi_64;
-   int size, i;
char *sig;
 
ei = &boot_params->efi_info;
@@ -88,49 +185,20 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 
config_tables   = stbl->tables;
nr_tables   = stbl->nr_tables;
- 
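
The kexec path in the patch above depends on walking the setup_data list from
boot_params to find the SETUP_EFI node. A minimal user-space sketch of that
walk follows. It assumes that addresses in the list can be dereferenced
directly, as they can in the identity-mapped decompressor; the struct and
function names are stand-ins, not the kernel's.

```c
#include <stdint.h>

#define SETUP_EFI 4	/* node type used for the EFI data handed over by kexec */

/* Stand-in for the header of a setup_data node; the payload follows it. */
struct setup_data_sketch {
	uint64_t next;	/* address of the next node, 0 terminates the list */
	uint32_t type;
	uint32_t len;
};

/*
 * Walk the list starting at pa_data and return the address of the payload
 * that follows the first SETUP_EFI header, or 0 if there is no such node.
 */
static uint64_t find_efi_setup_payload(uint64_t pa_data)
{
	while (pa_data) {
		const struct setup_data_sketch *data =
			(const struct setup_data_sketch *)(uintptr_t)pa_data;

		if (data->type == SETUP_EFI)
			return pa_data + sizeof(*data);
		pa_data = data->next;
	}

	return 0;
}
```

A zero return is what lets the caller tell a kexec boot from a normal EFI boot:
with no SETUP_EFI node, kexec_get_rsdp_addr() in the patch returns 0 as well.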

Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Junichi Nomura
On 4/2/19 8:06 PM, Chao Fan wrote:
> On Tue, Apr 02, 2019 at 09:53:51AM +0000, Junichi Nomura wrote:
>> On Tue, Apr 02, 2019 at 05:41:49PM +0800, Chao Fan wrote:
>>> [   77.989030] kexec_core: Starting new kernel
>>> early console in extract_kernel
>>> input_data: 0x00017f6033b1
>>> input_len: 0x008412d4
>>> output: 0x00017e00
>>> output_len: 0x01e15844
>>> kernel_total_size: 0x01e2c000
>>> trampoline_32bit: 0x0009d000
>>> booted via startup_64()
>>>
>>>
>>> Physical KASLR disabled: no suitable memory region!
>>> --
>>>
>>> I am not sure whether I have done the right test.
>>> This guest is booted from EFI. Here we can see the kexeced kernel
>>> has completed the compressed boot stage. So I think your PATCH works.
>>
>> Thanks for testing.  If your test bed doesn't boot even with the patch,
>> you could check what was found as RSDP with a debug patch like below.
> 
> Oh no, it booted. I just pasted the compressed stage log.

Ah, then I think the patch worked as expected. Thanks.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.



Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Chao Fan
On Tue, Apr 02, 2019 at 09:53:51AM +0000, Junichi Nomura wrote:
>On Tue, Apr 02, 2019 at 05:41:49PM +0800, Chao Fan wrote:
>> [   77.989030] kexec_core: Starting new kernel
>> early console in extract_kernel
>> input_data: 0x00017f6033b1
>> input_len: 0x008412d4
>> output: 0x00017e00
>> output_len: 0x01e15844
>> kernel_total_size: 0x01e2c000
>> trampoline_32bit: 0x0009d000
>> booted via startup_64()
>> 
>> 
>> Physical KASLR disabled: no suitable memory region!
>> --
>> 
>> I am not sure whether I have done the right test.
>> This guest is booted from EFI. Here we can see the kexeced kernel
>> has completed the compressed boot stage. So I think your PATCH works.
>
>Thanks for testing.  If your test bed doesn't boot even with the patch,
>you could check what was found as RSDP with a debug patch like below.

Oh no, it booted. I just pasted the compressed stage log.

Thanks,
Chao Fan

>
>diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
>--- a/arch/x86/boot/compressed/misc.c
>+++ b/arch/x86/boot/compressed/misc.c
>@@ -379,6 +379,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
>   debug_putaddr(output);
>   debug_putaddr(output_len);
>   debug_putaddr(kernel_total_size);
>+  debug_putaddr(boot_params->acpi_rsdp_addr);
> 
> #ifdef CONFIG_X86_64
>   /* Report address of 32-bit trampoline */
>
>





Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Junichi Nomura
On Tue, Apr 02, 2019 at 05:41:49PM +0800, Chao Fan wrote:
> [   77.989030] kexec_core: Starting new kernel
> early console in extract_kernel
> input_data: 0x00017f6033b1
> input_len: 0x008412d4
> output: 0x00017e00
> output_len: 0x01e15844
> kernel_total_size: 0x01e2c000
> trampoline_32bit: 0x0009d000
> booted via startup_64()
> 
> 
> Physical KASLR disabled: no suitable memory region!
> --
> 
> I am not sure whether I have done the right test.
> This guest is booted from EFI. Here we can see the kexeced kernel
> has completed the compressed boot stage. So I think your PATCH works.

Thanks for testing.  If your test bed doesn't boot even with the patch,
you could check what was found as RSDP with a debug patch like below.

diff --git a/arch/x86/boot/compressed/misc.c b/arch/x86/boot/compressed/misc.c
--- a/arch/x86/boot/compressed/misc.c
+++ b/arch/x86/boot/compressed/misc.c
@@ -379,6 +379,7 @@ asmlinkage __visible void *extract_kernel(void *rmode, memptr heap,
debug_putaddr(output);
debug_putaddr(output_len);
debug_putaddr(kernel_total_size);
+   debug_putaddr(boot_params->acpi_rsdp_addr);
 
 #ifdef CONFIG_X86_64
/* Report address of 32-bit trampoline */



Re: [PATCH 1/2 RESEND v10] x86/mm, resource: add a new I/O resource descriptor 'IORES_DESC_RESERVED'

2019-04-02 Thread Borislav Petkov
On Fri, Mar 29, 2019 at 08:39:13PM +0800, Lianbo Jiang wrote:
> -static int __ioremap_check_desc_other(struct resource *res)
> +/*
> + * Originally, these areas described as IORES_DESC_NONE are not mapped
> + * as encrypted when using ioremap(), for example, E820_TYPE_{RESERVED,
> + * RESERVED_KERN,RAM,UNUSABLE}, etc. It checks for a resource that is
> + * not described as IORES_DESC_NONE, which can make sure the reserved
> + * areas are not mapped as encrypted when using ioremap().
> + *
> + * Now IORES_DESC_RESERVED has been created for the reserved areas so
> + * the check needs to be expanded so that these areas are not mapped
> + * encrypted when using ioremap().
> + */
> +static int __ioremap_check_desc_none_and_reserved(struct resource *res)
>  {
> - return (res->desc != IORES_DESC_NONE);
> + return ((res->desc != IORES_DESC_NONE) &&

Why is this still checking IORES_DESC_NONE when the idea is to have this
specific IORES_DESC_RESERVED for all marked as *reserved* regions in
e820 which should not be mapped encrypted?

IOW, which regions are still marked as IORES_DESC_NONE and should not be
mapped encrypted?

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.



Re: [PATCH v2] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-02 Thread Chao Fan
Hi,

I have tested your PATCH in a QEMU guest:
--
Fedora 29 (Workstation Edition)
Kernel 5.1.0-rc3+ on an x86_64 (ttyS0)

localhost login: root
Password:
Last login: Tue Apr  2 05:30:33 on ttyS0
[root@localhost ~]# cd /boot
[root@localhost boot]# ls
config-4.18.16-300.fc29.x86_64
efi
extlinux
grub2
initramfs-0-rescue-858ff1f5d0cb453898ae6c7f77c68ba7.img
initramfs-4.18.16-300.fc29.x86_64.img
initramfs-4.18.16-300.fc29.x86_64kdump.img
initramfs-5.1.0-rc3+.img
initramfs-5.1.0-rc3+kdump.img
loader
lost+found
System.map
System.map-4.18.16-300.fc29.x86_64
System.map-5.1.0-rc3+
vmlinuz
vmlinuz-0-rescue-858ff1f5d0cb453898ae6c7f77c68ba7
vmlinuz-4.18.16-300.fc29.x86_64
vmlinuz-5.1.0-rc3+
[root@localhost boot]# kexec -l vmlinuz-5.1.0-rc3+ --initrd=initramfs-5.1.0-rc3+.img --reuse-cmdline
[root@localhost boot]# kexec -e
[   77.933760] Unregister pv shared memory for cpu 1
[   77.933763] Unregister pv shared memory for cpu 2
[   77.933766] Unregister pv shared memory for cpu 5
[   77.933825] Unregister pv shared memory for cpu 7
[   77.933882] Unregister pv shared memory for cpu 3
[   77.933904] Unregister pv shared memory for cpu 8
[   77.936147] Unregister pv shared memory for cpu 4
[   77.936199] Unregister pv shared memory for cpu 9
[   77.945822] Unregister pv shared memory for cpu 0
[   77.946308] Unregister pv shared memory for cpu 6
[   77.947420] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[   77.989030] kexec_core: Starting new kernel
early console in extract_kernel
input_data: 0x00017f6033b1
input_len: 0x008412d4
output: 0x00017e00
output_len: 0x01e15844
kernel_total_size: 0x01e2c000
trampoline_32bit: 0x0009d000
booted via startup_64()


Physical KASLR disabled: no suitable memory region!
--

I am not sure whether I have done the right test.
This guest is booted from EFI. Here we can see the kexeced kernel
has completed the compressed boot stage. So I think your PATCH works.

Thanks,
Chao Fan



