[PATCH 19/57] docs: kdump: convert it to ReST

2019-04-15 Thread Mauro Carvalho Chehab
Convert the kdump documentation to ReST and add it to the
user-facing manual, as the documents are mainly aimed at
sysadmins who would be enabling kdump.

Note: the vmcoreinfo.rst has one very long title for
sub-sections. I opted to break this one, in order to make it
easier to display in html.

Signed-off-by: Mauro Carvalho Chehab 
---
 Documentation/kdump/kdump.txt  | 131 +
 Documentation/kdump/vmcoreinfo.txt |  59 ++---
 2 files changed, 104 insertions(+), 86 deletions(-)

diff --git a/Documentation/kdump/kdump.txt b/Documentation/kdump/kdump.txt
index 51814450a7f8..1da2d7b765f6 100644
--- a/Documentation/kdump/kdump.txt
+++ b/Documentation/kdump/kdump.txt
@@ -71,9 +71,8 @@ This is a symlink to the latest version.
 
 The latest kexec-tools git tree is available at:
 
-git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
-and
-http://www.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
+- git://git.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
+- http://www.kernel.org/pub/scm/utils/kernel/kexec/kexec-tools.git
 
 There is also a gitweb interface available at
 http://www.kernel.org/git/?p=utils/kernel/kexec/kexec-tools.git
@@ -81,25 +80,25 @@ 
http://www.kernel.org/git/?p=utils/kernel/kexec/kexec-tools.git
 More information about kexec-tools can be found at
 http://horms.net/projects/kexec/
 
-3) Unpack the tarball with the tar command, as follows:
+3) Unpack the tarball with the tar command, as follows::
 
-   tar xvpzf kexec-tools.tar.gz
+   tar xvpzf kexec-tools.tar.gz
 
-4) Change to the kexec-tools directory, as follows:
+4) Change to the kexec-tools directory, as follows::
 
-   cd kexec-tools-VERSION
+   cd kexec-tools-VERSION
 
-5) Configure the package, as follows:
+5) Configure the package, as follows::
 
-   ./configure
+   ./configure
 
-6) Compile the package, as follows:
+6) Compile the package, as follows::
 
-   make
+   make
 
-7) Install the package, as follows:
+7) Install the package, as follows::
 
-   make install
+   make install
 
 
 Build the system and dump-capture kernels
@@ -126,25 +125,25 @@ dump-capture kernels for enabling kdump support.
 System kernel config options
 
 
-1) Enable "kexec system call" in "Processor type and features."
+1) Enable "kexec system call" in "Processor type and features."::
 
-   CONFIG_KEXEC=y
+   CONFIG_KEXEC=y
 
 2) Enable "sysfs file system support" in "Filesystem" -> "Pseudo
-   filesystems." This is usually enabled by default.
+   filesystems." This is usually enabled by default::
 
-   CONFIG_SYSFS=y
+   CONFIG_SYSFS=y
 
Note that "sysfs file system support" might not appear in the "Pseudo
filesystems" menu if "Configure standard kernel features (for small
systems)" is not enabled in "General Setup." In this case, check the
-   .config file itself to ensure that sysfs is turned on, as follows:
+   .config file itself to ensure that sysfs is turned on, as follows::
 
-   grep 'CONFIG_SYSFS' .config
+   grep 'CONFIG_SYSFS' .config
 
-3) Enable "Compile the kernel with debug info" in "Kernel hacking."
+3) Enable "Compile the kernel with debug info" in "Kernel hacking."::
 
-   CONFIG_DEBUG_INFO=Y
+   CONFIG_DEBUG_INFO=Y
 
This causes the kernel to be built with debug symbols. The dump
analysis tools require a vmlinux with debug symbols in order to read
@@ -154,29 +153,32 @@ Dump-capture kernel config options (Arch Independent)
 -
 
 1) Enable "kernel crash dumps" support under "Processor type and
-   features":
+   features"::
 
-   CONFIG_CRASH_DUMP=y
+   CONFIG_CRASH_DUMP=y
 
-2) Enable "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems".
+2) Enable "/proc/vmcore support" under "Filesystems" -> "Pseudo filesystems"::
+
+   CONFIG_PROC_VMCORE=y
 
-   CONFIG_PROC_VMCORE=y
(CONFIG_PROC_VMCORE is set by default when CONFIG_CRASH_DUMP is selected.)
 
 Dump-capture kernel config options (Arch Dependent, i386 and x86_64)
 
 
 1) On i386, enable high memory support under "Processor type and
-   features":
+   features"::
 
-   CONFIG_HIGHMEM64G=y
-   or
-   CONFIG_HIGHMEM4G
+   CONFIG_HIGHMEM64G=y
+
+   or::
+
+   CONFIG_HIGHMEM4G
 
 2) On i386 and x86_64, disable symmetric multi-processing support
-   under "Processor type and features":
+   under "Processor type and features"::
 
-   CONFIG_SMP=n
+   CONFIG_SMP=n
 
(If CONFIG_SMP=y, then specify maxcpus=1 on the kernel command line
when loading the dump-capture kernel, see section "Load the Dump-capture
@@ -184,9 +186,9 @@ Dump-capture kernel config options (Arch Dependent, i386 
and x86_64)
 
 3) If one wants to build and use a relocatable kernel,
Enable "Build a relocatable kernel" support under "Processor type and
-   features"
+   features"::
 
-   CONFIG_RELOCATABLE=y
+   

[PATCH 00/57] Convert files to ReST

2019-04-15 Thread Mauro Carvalho Chehab
This series converts lots of files so that they are properly parsed
by Sphinx as ReST files.

As it touches a lot of stuff, the series is based on linux-next.

I have a separate patch series which does the actual rename and
adjustment of references. I opted to submit this one first, as it
sounds easier to merge this way: each subsystem maintainer
can apply the conversion directly on their trees (or on the docs
tree), avoiding merge conflicts.

Both this series and the next steps are on my devel git tree,
at:

https://git.linuxtv.org/mchehab/experimental.git/log/?h=all_with_indexes-v3

The final output in html can be seen at:

https://www.infradead.org/~mchehab/rst_conversion/

Mauro Carvalho Chehab (57):
  docs: trace: fix some Sphinx warnings
  docs: acpi: convert text files to ReST
  docs: aoe: convert text files to ReST
  docs: arm64: convert documentation to ReST format
  docs: cdrom/cdrom-standard.tex: convert from LaTeX to ReST
  docs: cdrom: convert remaining files to ReST
  docs: cgroup-v1: convert to ReST file format
  docs: cgroup-v1/blkio-controller.rst: add a note about CFQ scheduler
  docs: cpu-freq: convert files to ReST
  docs: device-mapper: convert it to ReST format
  docs: extcon: move it to acpi dir and convert it to ReST
  docs: fault-injection: convert it to ReST format
  docs: fb: convert documentation to ReST format
  docs: fpga: convert it to ReST
  docs: gpio: convert it to ReST
  docs: ide: convert it to ReST format
  docs: infiniband: convert it to ReST format
  docs: kbuild: convert it to ReST output
  docs: kdump: convert it to ReST
  docs: livepatch: convert it to ReST format
  docs: locking: convert docs to ReST format
  docs: mic: convert it to ReST format
  docs: netlabel: convert it to ReST
  docs: pcmcia: convert it to ReST format
  docs: power: convert docs to ReST
  docs: powerpc: convert docs to ReST
  docs: pps/pps.txt convert it to ReST and move to API book
  docs: ptp.txt: convert to ReST and move to driver-api
  docs: riscv: convert it to ReST format
  docs: s390: Debugging390.txt: convert table to ascii artwork
  docs: s390: convert text files to ReST format
  s390: include/asm/debug.h add kerneldoc markups
  docs: serial: convert it to ReST format
  docs: target: convert it to ReST format
  docs: timers: convert documentation to ReST
  docs: usb: convert documents to ReST
  docs: watchdog: convert documents to ReST format
  docs: x86: convert text files to ReST
  docs: xilinx: convert eemi.txt to ReST
  docs: scheduler: convert files to ReST
  docs: EDID/HOWTO.txt: convert to ReST and move to kernel-API
  docs: connector.txt: convert to ReST
  docs: lcd-panel-cgram.txt convert it to ReST and move to admin-guide
  docs: lp855x-driver.txt: convert to ReST and move to kernel-api
  docs: m68k: convert it to ReST file format and add to arch bookset
  docs: cma/debugfs.txt: convert to ReST and move to admin-guide/mm
  docs: console.txt: convert to ReST format
  docs: pti_intel_mid.txt: convert to ReST
  docs: early-userspace: convert docs to ReST
  docs: driver-model: convert it to ReST format
  docs: arm: convert text files to ReST format
  docs: memory-devices: convert ti-emif.txt to ReST format
  docs: xen-tpmfront.txt: convert the file to ReST format
  docs: bus-devices: ti-gpmc.txt: convert it to ReST
  docs: nvmem: convert file to ReST format
  docs: phy: convert samsung-usb2.txt to ReST format
  docs: Prepare files to be renamed to *.rst

 Documentation/EDID/HOWTO.txt  |   29 +-
 Documentation/acpi/DSD-properties-rules.txt   |4 +-
 Documentation/acpi/acpi-lid.txt   |   37 +-
 Documentation/acpi/aml-debugger.txt   |   31 +-
 Documentation/acpi/apei/einj.txt  |   59 +-
 Documentation/acpi/apei/output_format.txt |  247 +-
 Documentation/acpi/cppc_sysfs.txt |   52 +-
 Documentation/acpi/debug.txt  |   20 +-
 .../drivers/extcon-intel-int3496.txt} |   14 +-
 .../acpi/dsd/data-node-references.txt |   11 +-
 Documentation/acpi/dsd/graph.txt  |   24 +-
 Documentation/acpi/dsd/leds.txt   |   18 +-
 Documentation/acpi/dsdt-override.txt  |4 +-
 Documentation/acpi/enumeration.txt|   42 +-
 Documentation/acpi/gpio-properties.txt|   42 +-
 Documentation/acpi/i2c-muxes.txt  |   21 +-
 Documentation/acpi/initrd_table_override.txt  |   90 +-
 Documentation/acpi/linuxized-acpica.txt   |   58 +-
 Documentation/acpi/lpit.txt   |8 +-
 Documentation/acpi/method-customizing.txt |   48 +-
 Documentation/acpi/method-tracing.txt |  132 +-
 Documentation/acpi/namespace.txt  |  323 +-
 Documentation/acpi/osi.txt|3 +-
 Documentation/acpi/scan_handlers.txt  |9 +-
 Documentation/acpi/ssdt-overlays.txt  |  128 +-
 Documentation/acpi/video_extension.txt|   16 +-
 Documentation/aoe/aoe.txt |   63 +-
 

Re: [PATCH v4 3/5] memblock: add memblock_cap_memory_ranges for multiple ranges

2019-04-15 Thread Chen Zhou
Hi Mike,

On 2019/4/16 3:09, Mike Rapoport wrote:
> Hi,
> 
> On Mon, Apr 15, 2019 at 06:57:23PM +0800, Chen Zhou wrote:
>> The memblock_cap_memory_range() removes all the memory except the
>> range passed to it. Extend this function to receive memblock_type
>> with the regions that should be kept.
>>
>> Enable this function in arm64 for reservation of multiple regions
>> for the crash kernel.
>>
>> Signed-off-by: Chen Zhou 
>> Signed-off-by: Mike Rapoport 
> 
> I didn't work on this version, please drop the signed-off.

Sorry about this. I should have asked you first before doing it this way. I will 
drop it.

remove_size);
>> +}
>> +
>> +memblock_remove_range(&memblock.reserved,
>> +regs[nr - 1].base + regs[nr - 1].size, PHYS_ADDR_MAX);
>> +}
>> +
> 
> I've double-checked and I see no problem with using
> for_each_mem_range_rev() iterators for removing some ranges. And with them
> this function becomes much clearer and more efficient.
> 
> Can you please check if the below patch works for you?
> 
>>From e25e6c9cd94a01abac124deacc66e5d258fdbf7c Mon Sep 17 00:00:00 2001
> From: Mike Rapoport 
> Date: Wed, 10 Apr 2019 16:02:32 +0300
> Subject: [PATCH] memblock: extend memblock_cap_memory_range to multiple ranges
> 
> The memblock_cap_memory_range() removes all the memory except the range
> passed to it. Extend this function to receive an array of memblock_regions
> that should be kept. This allows switching to simple iteration over
> memblock arrays with 'for_each_mem_range_rev' to remove the unneeded memory.
> 
> Enable use of this function in arm64 for reservation of multiple regions for
> the crash kernel.
> 
> Signed-off-by: Mike Rapoport 
> ---
>  arch/arm64/mm/init.c | 34 --
>  include/linux/memblock.h |  2 +-
>  mm/memblock.c| 44 
>  3 files changed, 45 insertions(+), 35 deletions(-)
> 
> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
> index 6bc1350..8665d29 100644
> --- a/arch/arm64/mm/init.c
> +++ b/arch/arm64/mm/init.c
> @@ -64,6 +64,10 @@ EXPORT_SYMBOL(memstart_addr);
>  phys_addr_t arm64_dma_phys_limit __ro_after_init;
>  
>  #ifdef CONFIG_KEXEC_CORE
> +
> +/* at most two crash kernel regions, low_region and high_region */
> +#define CRASH_MAX_USABLE_RANGES  2
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -280,9 +284,9 @@ early_param("mem", early_mem);
>  static int __init early_init_dt_scan_usablemem(unsigned long node,
>   const char *uname, int depth, void *data)
>  {
> - struct memblock_region *usablemem = data;
> - const __be32 *reg;
> - int len;
> + struct memblock_type *usablemem = data;
> + const __be32 *reg, *endp;
> + int len, nr = 0;
>  
>   if (depth != 1 || strcmp(uname, "chosen") != 0)
>   return 0;
> @@ -291,22 +295,32 @@ static int __init early_init_dt_scan_usablemem(unsigned 
> long node,
>   if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
>   return 1;
>  
> - usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
> - usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
> + endp = reg + (len / sizeof(__be32));
> + while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
> + unsigned long base = dt_mem_next_cell(dt_root_addr_cells, &reg);
> + unsigned long size = dt_mem_next_cell(dt_root_size_cells, &reg);
>  
> + if (memblock_add_range(usablemem, base, size, NUMA_NO_NODE,
> +MEMBLOCK_NONE))
> + return 0;
> + if (++nr >= CRASH_MAX_USABLE_RANGES)
> + break;
> + }
>   return 1;
>  }
>  
>  static void __init fdt_enforce_memory_region(void)
>  {
> - struct memblock_region reg = {
> - .size = 0,
> + struct memblock_region usable_regions[CRASH_MAX_USABLE_RANGES];
> + struct memblock_type usablemem = {
> + .max = CRASH_MAX_USABLE_RANGES,
> + .regions = usable_regions,
>   };
>  
> - of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
> + of_scan_flat_dt(early_init_dt_scan_usablemem, &usablemem);
>  
> - if (reg.size)
> - memblock_cap_memory_range(reg.base, reg.size);
> + if (usablemem.cnt)
> + memblock_cap_memory_ranges(usablemem.regions, usablemem.cnt);
>  }
>  
>  void __init arm64_memblock_init(void)
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 294d5d8..f5c029b 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -404,7 +404,7 @@ phys_addr_t memblock_mem_size(unsigned long limit_pfn);
>  phys_addr_t memblock_start_of_DRAM(void);
>  phys_addr_t memblock_end_of_DRAM(void);
>  void memblock_enforce_memory_limit(phys_addr_t memory_limit);
> -void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
> +void memblock_cap_memory_ranges(struct 

Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Junichi Nomura
On 4/16/19 8:00 AM, Junichi Nomura wrote:
> On 4/15/19 7:25 PM, Borislav Petkov wrote:
>> On Mon, Apr 15, 2019 at 11:07:17AM +0200, Borislav Petkov wrote:
>>> On Mon, Apr 15, 2019 at 07:01:54AM +, Junichi Nomura wrote:
>>>> OK. Then I'll go back to v3 and make sure to hang when
>>>> something is wrong during kexec boot on EFI system.
>>>
>>> No need - I have it here locally. I'll clean it up and post it for
>>> review.
>>
>> Here it is. Ok, not ok?
> 
> Thank you.  Basically ok.
> I put some comments below about whether to hang or return.
> 
>> +static acpi_physical_address kexec_get_rsdp_addr(void)
>> +{
>> +efi_system_table_64_t *systab;
>> +struct efi_setup_data *esd;
>> +struct efi_info *ei;
>> +char *sig;
>> +
>> +esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
>> +if (!esd)
>> +return 0;
>> +
>> +if (!esd->tables) {
>> +debug_putstr("Wrong kexec SETUP_EFI data.\n");
>> +return 0;
>> +}
> 
> I thought we should hang here instead of return so that we
> don't run into efi_get_rsdp_addr() in case of kexec.
> 
>> +ei = &boot_params->efi_info;
>> +sig = (char *)&ei->efi_loader_signature;
>> +if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
>> +debug_putstr("Wrong kexec EFI loader signature.\n");
>> +return 0;
>> +}
> 
> Same here.

One more question just for clarification.

I see kexec is only supported on 64bit kernel. But are we sure
we don't need to support kexec on EFI32 + 64bit kernel?

I don't have such an environment and as far as I tried with OVMF i386
and KVM guest, that combination doesn't work reliably even with v5.0.
So I suppose people don't care.

>> +/* Get systab from boot params. */
>> +systab = (efi_system_table_64_t *) (ei->efi_systab | 
>> ((__u64)ei->efi_systab_hi << 32));
>> +if (!systab)
>> +error("EFI system table not found in kexec boot_params.");
>> +
>> +return __efi_get_rsdp_addr((unsigned long)esd->tables, 
>> systab->nr_tables, true);
> 
> Same here when __efi_get_rsdp_addr() returns 0.
> 
> I'm fine with either way, though.

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.
___
kexec mailing list
kexec@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/kexec


Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Junichi Nomura
On 4/15/19 7:25 PM, Borislav Petkov wrote:
> On Mon, Apr 15, 2019 at 11:07:17AM +0200, Borislav Petkov wrote:
>> On Mon, Apr 15, 2019 at 07:01:54AM +, Junichi Nomura wrote:
>>> OK. Then I'll go back to v3 and make sure to hang when
>>> something is wrong during kexec boot on EFI system.
>>
>> No need - I have it here locally. I'll clean it up and post it for
>> review.
> 
> Here it is. Ok, not ok?

Thank you.  Basically ok.
I put some comments below about whether to hang or return.

> +static acpi_physical_address kexec_get_rsdp_addr(void)
> +{
> + efi_system_table_64_t *systab;
> + struct efi_setup_data *esd;
> + struct efi_info *ei;
> + char *sig;
> +
> + esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
> + if (!esd)
> + return 0;
> +
> + if (!esd->tables) {
> + debug_putstr("Wrong kexec SETUP_EFI data.\n");
> + return 0;
> + }

I thought we should hang here instead of return so that we
don't run into efi_get_rsdp_addr() in case of kexec.

> + ei = &boot_params->efi_info;
> + sig = (char *)&ei->efi_loader_signature;
> + if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
> + debug_putstr("Wrong kexec EFI loader signature.\n");
> + return 0;
> + }

Same here.

> + /* Get systab from boot params. */
> + systab = (efi_system_table_64_t *) (ei->efi_systab | 
> ((__u64)ei->efi_systab_hi << 32));
> + if (!systab)
> + error("EFI system table not found in kexec boot_params.");
> +
> + return __efi_get_rsdp_addr((unsigned long)esd->tables, 
> systab->nr_tables, true);

Same here when __efi_get_rsdp_addr() returns 0.

I'm fine with either way, though.
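
For reference, the two checks discussed above can be sketched in standalone
C. This is only an illustration under my own assumptions, not the code from
the patch: struct ex_efi_info and ex_systab_addr() are made-up names, and the
expected signature is passed in by the caller rather than taken from a
kernel macro.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields used in the hunk above. */
struct ex_efi_info {
	uint32_t loader_signature;	/* four ASCII bytes, no terminator */
	uint32_t systab;		/* low 32 bits of the table address */
	uint32_t systab_hi;		/* high 32 bits of the table address */
};

/*
 * Return the 64-bit system table address built from the two halves,
 * or 0 when the 4-byte loader signature does not match want_sig.
 */
static uint64_t ex_systab_addr(const struct ex_efi_info *ei,
			       const char *want_sig)
{
	if (strncmp((const char *)&ei->loader_signature, want_sig, 4))
		return 0;

	return ei->systab | ((uint64_t)ei->systab_hi << 32);
}
```

Whether a zero return should lead to a hang or to a fallback is exactly the
open question in this thread; the sketch takes no position on it.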

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.


Re: [PATCH v4 3/5] memblock: add memblock_cap_memory_ranges for multiple ranges

2019-04-15 Thread Mike Rapoport
Hi,

On Mon, Apr 15, 2019 at 06:57:23PM +0800, Chen Zhou wrote:
> The memblock_cap_memory_range() removes all the memory except the
> range passed to it. Extend this function to receive memblock_type
> with the regions that should be kept.
> 
> Enable this function in arm64 for reservation of multiple regions
> for the crash kernel.
> 
> Signed-off-by: Chen Zhou 
> Signed-off-by: Mike Rapoport 

I didn't work on this version, please drop the signed-off.

> ---
>  include/linux/memblock.h |  1 +
>  mm/memblock.c| 45 +
>  2 files changed, 46 insertions(+)
> 
> diff --git a/include/linux/memblock.h b/include/linux/memblock.h
> index 47e3c06..180877c 100644
> --- a/include/linux/memblock.h
> +++ b/include/linux/memblock.h
> @@ -446,6 +446,7 @@ phys_addr_t memblock_start_of_DRAM(void);
>  phys_addr_t memblock_end_of_DRAM(void);
>  void memblock_enforce_memory_limit(phys_addr_t memory_limit);
>  void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
> +void memblock_cap_memory_ranges(struct memblock_type *regions_to_keep);
>  void memblock_mem_limit_remove_map(phys_addr_t limit);
>  bool memblock_is_memory(phys_addr_t addr);
>  bool memblock_is_map_memory(phys_addr_t addr);
> diff --git a/mm/memblock.c b/mm/memblock.c
> index f315eca..9661807 100644
> --- a/mm/memblock.c
> +++ b/mm/memblock.c
> @@ -1697,6 +1697,51 @@ void __init memblock_cap_memory_range(phys_addr_t 
> base, phys_addr_t size)
>   base + size, PHYS_ADDR_MAX);
>  }
>  
> +void __init memblock_cap_memory_ranges(struct memblock_type *regions_to_keep)
> +{
> + int start_rgn[INIT_MEMBLOCK_REGIONS], end_rgn[INIT_MEMBLOCK_REGIONS];
> + int i, j, ret, nr = 0;
> + struct memblock_region *regs = regions_to_keep->regions;
> +
> + for (i = 0; i < regions_to_keep->cnt; i++) {
> + ret = memblock_isolate_range(&memblock.memory, regs[i].base,
> + regs[i].size, &start_rgn[i], &end_rgn[i]);
> + if (ret)
> + break;
> + nr++;
> + }
> + if (!nr)
> + return;
> +
> + /* remove all the MAP regions */
> + for (i = memblock.memory.cnt - 1; i >= end_rgn[nr - 1]; i--)
> + if (!memblock_is_nomap(&memblock.memory.regions[i]))
> + memblock_remove_region(&memblock.memory, i);
> +
> + for (i = nr - 1; i > 0; i--)
> + for (j = start_rgn[i] - 1; j >= end_rgn[i - 1]; j--)
> + if (!memblock_is_nomap(&memblock.memory.regions[j]))
> + memblock_remove_region(&memblock.memory, j);
> +
> + for (i = start_rgn[0] - 1; i >= 0; i--)
> + if (!memblock_is_nomap(&memblock.memory.regions[i]))
> + memblock_remove_region(&memblock.memory, i);
> +
> + /* truncate the reserved regions */
> + memblock_remove_range(&memblock.reserved, 0, regs[0].base);
> +
> + for (i = nr - 1; i > 0; i--) {
> + phys_addr_t remove_base = regs[i - 1].base + regs[i - 1].size;
> + phys_addr_t remove_size = regs[i].base - remove_base;
> +
> + memblock_remove_range(&memblock.reserved, remove_base,
> + remove_size);
> + }
> +
> + memblock_remove_range(&memblock.reserved,
> + regs[nr - 1].base + regs[nr - 1].size, PHYS_ADDR_MAX);
> +}
> +

I've double-checked and I see no problem with using
for_each_mem_range_rev() iterators for removing some ranges. And with them
this function becomes much clearer and more efficient.

Can you please check if the below patch works for you?

>From e25e6c9cd94a01abac124deacc66e5d258fdbf7c Mon Sep 17 00:00:00 2001
From: Mike Rapoport 
Date: Wed, 10 Apr 2019 16:02:32 +0300
Subject: [PATCH] memblock: extend memblock_cap_memory_range to multiple ranges

The memblock_cap_memory_range() removes all the memory except the range
passed to it. Extend this function to receive an array of memblock_regions
that should be kept. This allows switching to simple iteration over
memblock arrays with 'for_each_mem_range_rev' to remove the unneeded memory.

Enable use of this function in arm64 for reservation of multiple regions for
the crash kernel.

Signed-off-by: Mike Rapoport 
---
 arch/arm64/mm/init.c | 34 --
 include/linux/memblock.h |  2 +-
 mm/memblock.c| 44 
 3 files changed, 45 insertions(+), 35 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 6bc1350..8665d29 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -64,6 +64,10 @@ EXPORT_SYMBOL(memstart_addr);
 phys_addr_t arm64_dma_phys_limit __ro_after_init;
 
 #ifdef CONFIG_KEXEC_CORE
+
+/* at most two crash kernel regions, low_region and high_region */
+#define CRASH_MAX_USABLE_RANGES 2
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -280,9 +284,9 @@ early_param("mem", early_mem);
 static int __init early_init_dt_scan_usablemem(unsigned long node,
const char *uname, int depth, void *data)
 {
-   struct memblock_region 

Re: [PATCH 1/2 RESEND v10] x86/mm, resource: add a new I/O resource descriptor 'IORES_DESC_RESERVED'

2019-04-15 Thread lijiang
On 2019-04-02 20:43, Borislav Petkov wrote:
> On Tue, Apr 02, 2019 at 08:02:04PM +0800, lijiang wrote:
>> These regions(E820_TYPE_{RESERVED_KERN,RAM,UNUSABLE}) are still marked as
>> IORES_DESC_NONE and should not be mapped encrypted when using ioremap().
> 
> Seems to me like we're going in circles. You said here:
> 
> https://lkml.kernel.org/r/9eb61523-7a08-24c4-ac15-050537bd9...@redhat.com
> 
> that the kernel doesn't pass the e820 reserved ranges to the second
> kernel.
> 
> I suggested to use a special IORES descriptor for them -
> IORES_DESC_RESERVED.
> 
> Now you say that that is not enough and some of those you want passed,
> are still marked as IORES_DESC_NONE.
> 
Sorry for the delay.

They are different problems.

The first problem is about passing the e820 reserved ranges to the second kernel.
For this case, it is good enough to use IORES_DESC_RESERVED, which ensures that
the reserved resource ranges are matched exactly when walking through the
iomem resources.

The second problem is about the SEV case. Now that IORES_DESC_RESERVED has been
created for the reserved areas, the check needs to be expanded so that
these areas are not mapped encrypted when using ioremap().

+static int __ioremap_check_desc_none_and_reserved(struct resource *res)
 {
-   return (res->desc != IORES_DESC_NONE);
+   return ((res->desc != IORES_DESC_NONE) &&
+   (res->desc != IORES_DESC_RESERVED));
 }
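
To make the intent of the expanded check easier to see, here is a minimal
userspace sketch. It is not the kernel code: the EX_ enum values and
struct ex_resource are stand-ins invented for this example. The predicate
reports a match only for resources whose descriptor is neither NONE nor
RESERVED.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's I/O resource descriptors. */
enum ex_ioresource_desc {
	EX_IORES_DESC_NONE = 0,
	EX_IORES_DESC_RESERVED,
	EX_IORES_DESC_ACPI_TABLES,
};

struct ex_resource {
	enum ex_ioresource_desc desc;
};

/*
 * Return true only for resources that carry a descriptor other than
 * NONE or RESERVED; both of those are treated the same way here.
 */
static bool ex_check_desc_none_and_reserved(const struct ex_resource *res)
{
	return res->desc != EX_IORES_DESC_NONE &&
	       res->desc != EX_IORES_DESC_RESERVED;
}
```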


Maybe I should split it into two patches. The change to 
__ioremap_check_desc_none_and_reserved()
should be a separate patch. Any thoughts?

Thanks.
Lianbo

> Sounds to me like you need try again.
>



[PATCH v4 4/5] arm64: kdump: support more than one crash kernel regions

2019-04-15 Thread Chen Zhou
After commit (arm64: kdump: support reserving crashkernel above 4G),
there may be two crash kernel regions: one below 4G and the other
above 4G. Use memblock_cap_memory_ranges() to support multiple crash
kernel regions.

The crash dump kernel reads the crash kernel regions via a dtb
property under node /chosen,
linux,usable-memory-range = .

Besides, replace memblock_cap_memory_range() with
memblock_cap_memory_ranges().
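
As a rough illustration of reading several (base, size) pairs from such a
property, here is a standalone sketch. It is not the kernel implementation:
all ex_ names are hypothetical, the property is modelled as a plain byte
buffer of big-endian 32-bit cells, and the cell counts are passed in
explicitly.

```c
#include <stddef.h>
#include <stdint.h>

#define EX_MAX_USABLE_RANGES 2	/* at most a low and a high region */

struct ex_range {
	uint64_t base;
	uint64_t size;
};

/* Decode one big-endian 32-bit cell stored as four bytes. */
static uint32_t ex_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Read 'cells' consecutive cells as one number and advance the cursor. */
static uint64_t ex_next_cell(int cells, const uint8_t **pp)
{
	uint64_t v = 0;

	while (cells-- > 0) {
		v = (v << 32) | ex_be32(*pp);
		*pp += 4;
	}
	return v;
}

/* Store up to EX_MAX_USABLE_RANGES ranges in 'out'; return how many. */
static int ex_parse_usable_ranges(const uint8_t *prop, size_t len,
				  int addr_cells, int size_cells,
				  struct ex_range *out)
{
	const uint8_t *p = prop, *end = prop + len;
	size_t stride = 4 * (size_t)(addr_cells + size_cells);
	int nr = 0;

	if (stride == 0)
		return 0;

	/* Stop at the range limit or when less than one full pair is left. */
	while (nr < EX_MAX_USABLE_RANGES && (size_t)(end - p) >= stride) {
		out[nr].base = ex_next_cell(addr_cells, &p);
		out[nr].size = ex_next_cell(size_cells, &p);
		nr++;
	}
	return nr;
}
```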

Signed-off-by: Chen Zhou 
Signed-off-by: Mike Rapoport 
---
 arch/arm64/mm/init.c | 34 --
 include/linux/memblock.h |  1 -
 mm/memblock.c| 41 -
 3 files changed, 36 insertions(+), 40 deletions(-)

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f5dde73..921953d 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -52,6 +52,9 @@
 #include 
 #include 
 
+/* at most two crash kernel regions, low_region and high_region */
+#define CRASH_MAX_USABLE_RANGES 2
+
 /*
  * We need to be able to catch inadvertent references to memstart_addr
  * that occur (potentially in generic code) before arm64_memblock_init()
@@ -295,9 +298,9 @@ early_param("mem", early_mem);
 static int __init early_init_dt_scan_usablemem(unsigned long node,
const char *uname, int depth, void *data)
 {
-   struct memblock_region *usablemem = data;
-   const __be32 *reg;
-   int len;
+   struct memblock_type *usablemem = data;
+   const __be32 *reg, *endp;
+   int len, nr = 0;
 
if (depth != 1 || strcmp(uname, "chosen") != 0)
return 0;
@@ -306,22 +309,33 @@ static int __init early_init_dt_scan_usablemem(unsigned 
long node,
if (!reg || (len < (dt_root_addr_cells + dt_root_size_cells)))
return 1;
 
-   usablemem->base = dt_mem_next_cell(dt_root_addr_cells, &reg);
-   usablemem->size = dt_mem_next_cell(dt_root_size_cells, &reg);
+   endp = reg + (len / sizeof(__be32));
+   while ((endp - reg) >= (dt_root_addr_cells + dt_root_size_cells)) {
+   unsigned long base = dt_mem_next_cell(dt_root_addr_cells, &reg);
+   unsigned long size = dt_mem_next_cell(dt_root_size_cells, &reg);
+
+   if (memblock_add_range(usablemem, base, size, NUMA_NO_NODE,
+  MEMBLOCK_NONE))
+   return 0;
+   if (++nr >= CRASH_MAX_USABLE_RANGES)
+   break;
+   }
 
return 1;
 }
 
 static void __init fdt_enforce_memory_region(void)
 {
-   struct memblock_region reg = {
-   .size = 0,
+   struct memblock_region usable_regions[CRASH_MAX_USABLE_RANGES];
+   struct memblock_type usablemem = {
+   .max = CRASH_MAX_USABLE_RANGES,
+   .regions = usable_regions,
};
 
-   of_scan_flat_dt(early_init_dt_scan_usablemem, &reg);
+   of_scan_flat_dt(early_init_dt_scan_usablemem, &usablemem);
 
-   if (reg.size)
-   memblock_cap_memory_range(reg.base, reg.size);
+   if (usablemem.cnt)
+   memblock_cap_memory_ranges(&usablemem);
 }
 
 void __init arm64_memblock_init(void)
diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 180877c..f04dfc1 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -445,7 +445,6 @@ phys_addr_t memblock_mem_size(unsigned long limit_pfn);
 phys_addr_t memblock_start_of_DRAM(void);
 phys_addr_t memblock_end_of_DRAM(void);
 void memblock_enforce_memory_limit(phys_addr_t memory_limit);
-void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
 void memblock_cap_memory_ranges(struct memblock_type *regions_to_keep);
 void memblock_mem_limit_remove_map(phys_addr_t limit);
 bool memblock_is_memory(phys_addr_t addr);
diff --git a/mm/memblock.c b/mm/memblock.c
index 9661807..9b5cef4 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1669,34 +1669,6 @@ void __init memblock_enforce_memory_limit(phys_addr_t 
limit)
  PHYS_ADDR_MAX);
 }
 
-void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
-{
-   int start_rgn, end_rgn;
-   int i, ret;
-
-   if (!size)
-   return;
-
-   ret = memblock_isolate_range(&memblock.memory, base, size,
-   &start_rgn, &end_rgn);
-   if (ret)
-   return;
-
-   /* remove all the MAP regions */
-   for (i = memblock.memory.cnt - 1; i >= end_rgn; i--)
-   if (!memblock_is_nomap(&memblock.memory.regions[i]))
-   memblock_remove_region(&memblock.memory, i);
-
-   for (i = start_rgn - 1; i >= 0; i--)
-   if (!memblock_is_nomap(&memblock.memory.regions[i]))
-   memblock_remove_region(&memblock.memory, i);
-
-   /* truncate the reserved regions */
-   memblock_remove_range(&memblock.reserved, 0, base);
-   memblock_remove_range(&memblock.reserved,
-   base + size, PHYS_ADDR_MAX);
-}
-
 void __init memblock_cap_memory_ranges(struct memblock_type *regions_to_keep)
 {
int 

[PATCH v4 5/5] kdump: update Documentation about crashkernel on arm64

2019-04-15 Thread Chen Zhou
Now that crashkernel=X,[high,low] is supported on arm64, update the
Documentation.

Signed-off-by: Chen Zhou 
---
 Documentation/admin-guide/kernel-parameters.txt | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt 
b/Documentation/admin-guide/kernel-parameters.txt
index 308af3b..a055983 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -715,14 +715,14 @@
Documentation/kdump/kdump.txt for an example.
 
crashkernel=size[KMG],high
-   [KNL, x86_64] range could be above 4G. Allow kernel
+   [KNL, x86_64, arm64] range could be above 4G. Allow 
kernel
to allocate physical memory region from top, so could
be above 4G if system have more than 4G ram installed.
Otherwise memory region will be allocated below 4G, if
available.
It will be ignored if crashkernel=X is specified.
crashkernel=size[KMG],low
-   [KNL, x86_64] range under 4G. When crashkernel=X,high
+   [KNL, x86_64, arm64] range under 4G. When 
crashkernel=X,high
is passed, kernel could allocate physical memory region
above 4G, that cause second kernel crash on system
that require some amount of low memory, e.g. swiotlb
-- 
2.7.4
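
For readers unfamiliar with the syntax documented above, here is a rough
standalone sketch of how a value such as 512M,high could be split into a
size and a high/low qualifier. It is not the kernel's parser:
ex_parse_crashkernel() and the enum are hypothetical names, and the sketch
deliberately ignores every other crashkernel= form.

```c
#include <stdlib.h>
#include <string.h>

enum ex_crash_kind { EX_CRASH_PLAIN, EX_CRASH_HIGH, EX_CRASH_LOW };

/*
 * Parse the value part of "crashkernel=", e.g. "512M,high".
 * Returns 0 on success, or -1 if the text does not look like
 * size[KMG] optionally followed by ",high" or ",low".
 */
static int ex_parse_crashkernel(const char *val, unsigned long long *size,
				enum ex_crash_kind *kind)
{
	char *end;
	unsigned long long n = strtoull(val, &end, 0);

	if (end == val)
		return -1;	/* no digits at all */

	/* Each suffix multiplies by 1024 once more than the next one. */
	switch (*end) {
	case 'G': case 'g':
		n <<= 10;	/* fall through */
	case 'M': case 'm':
		n <<= 10;	/* fall through */
	case 'K': case 'k':
		n <<= 10;
		end++;
		break;
	default:
		break;
	}

	if (*end == '\0')
		*kind = EX_CRASH_PLAIN;
	else if (!strcmp(end, ",high"))
		*kind = EX_CRASH_HIGH;
	else if (!strcmp(end, ",low"))
		*kind = EX_CRASH_LOW;
	else
		return -1;

	*size = n;
	return 0;
}
```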




[PATCH v4 2/5] arm64: kdump: support reserving crashkernel above 4G

2019-04-15 Thread Chen Zhou
When crashkernel is reserved above 4G in memory, the kernel should
reserve some amount of low memory for swiotlb and some DMA buffers.

If crashkernel is above 4G, the kernel will try to allocate at least
256M below 4G automatically, as x86_64 does. Meanwhile, support
crashkernel=X,[high,low] on arm64.

Signed-off-by: Chen Zhou 
---
 arch/arm64/include/asm/kexec.h |  3 +++
 arch/arm64/kernel/setup.c  |  3 +++
 arch/arm64/mm/init.c   | 25 -
 3 files changed, 26 insertions(+), 5 deletions(-)

diff --git a/arch/arm64/include/asm/kexec.h b/arch/arm64/include/asm/kexec.h
index 67e4cb7..32949bf 100644
--- a/arch/arm64/include/asm/kexec.h
+++ b/arch/arm64/include/asm/kexec.h
@@ -28,6 +28,9 @@
 
 #define KEXEC_ARCH KEXEC_ARCH_AARCH64
 
+/* 2M alignment for crash kernel regions */
+#define CRASH_ALIGN SZ_2M
+
 #ifndef __ASSEMBLY__
 
 /**
diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
index 413d566..82cd9a0 100644
--- a/arch/arm64/kernel/setup.c
+++ b/arch/arm64/kernel/setup.c
@@ -243,6 +243,9 @@ static void __init request_standard_resources(void)
request_resource(res, &kernel_data);
 #ifdef CONFIG_KEXEC_CORE
/* Userspace will find "Crash kernel" region in /proc/iomem. */
+   if (crashk_low_res.end && crashk_low_res.start >= res->start &&
+   crashk_low_res.end <= res->end)
+   request_resource(res, &crashk_low_res);
if (crashk_res.end && crashk_res.start >= res->start &&
crashk_res.end <= res->end)
request_resource(res, &crashk_res);
diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index 972bf43..f5dde73 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -74,20 +74,30 @@ phys_addr_t arm64_dma_phys_limit __ro_after_init;
 static void __init reserve_crashkernel(void)
 {
unsigned long long crash_base, crash_size;
+   bool high = false;
int ret;
 
ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
_size, _base);
/* no crashkernel= or invalid value specified */
-   if (ret || !crash_size)
-   return;
+   if (ret || !crash_size) {
+   /* crashkernel=X,high */
+   ret = parse_crashkernel_high(boot_command_line,
+   memblock_phys_mem_size(),
+   _size, _base);
+   if (ret || !crash_size)
+   return;
+   high = true;
+   }
 
crash_size = PAGE_ALIGN(crash_size);
 
if (crash_base == 0) {
/* Current arm64 boot protocol requires 2MB alignment */
-   crash_base = memblock_find_in_range(0, ARCH_LOW_ADDRESS_LIMIT,
-   crash_size, SZ_2M);
+   crash_base = memblock_find_in_range(0,
+   high ? memblock_end_of_DRAM()
+   : ARCH_LOW_ADDRESS_LIMIT,
+   crash_size, CRASH_ALIGN);
if (crash_base == 0) {
pr_warn("cannot allocate crashkernel (size:0x%llx)\n",
crash_size);
@@ -105,13 +115,18 @@ static void __init reserve_crashkernel(void)
return;
}
 
-   if (!IS_ALIGNED(crash_base, SZ_2M)) {
+   if (!IS_ALIGNED(crash_base, CRASH_ALIGN)) {
pr_warn("cannot reserve crashkernel: base address is not 2MB aligned\n");
return;
}
}
memblock_reserve(crash_base, crash_size);
 
+   if (crash_base >= SZ_4G && reserve_crashkernel_low()) {
+   memblock_free(crash_base, crash_size);
+   return;
+   }
+
pr_info("crashkernel reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
crash_base, crash_base + crash_size, crash_size >> 20);
 
-- 
2.7.4




[PATCH v4 3/5] memblock: add memblock_cap_memory_ranges for multiple ranges

2019-04-15 Thread Chen Zhou
The memblock_cap_memory_range() removes all the memory except the
range passed to it. Extend this function to receive memblock_type
with the regions that should be kept.

Enable this function in arm64 for reservation of multiple regions
for the crash kernel.
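
The effect described above, keeping only the memory that falls inside a set of regions, can be modelled in userspace as a range intersection. This is a simplified sketch under stated assumptions: struct range and cap_memory_ranges() are hypothetical names, both inputs are assumed sorted and non-overlapping, and the NOMAP handling that the real memblock code performs is left out.

```c
#include <stddef.h>
#include <stdint.h>

struct range {
	uint64_t base;
	uint64_t size;
};

/*
 * Write into out[] the parts of mem[] that overlap any range in keep[].
 * Both inputs are assumed sorted by base and non-overlapping.
 * Returns the number of ranges written (at most max_out).
 */
static size_t cap_memory_ranges(const struct range *mem, size_t n_mem,
				const struct range *keep, size_t n_keep,
				struct range *out, size_t max_out)
{
	size_t i, j, n = 0;

	for (i = 0; i < n_mem; i++) {
		uint64_t m_start = mem[i].base;
		uint64_t m_end = mem[i].base + mem[i].size;

		for (j = 0; j < n_keep && n < max_out; j++) {
			uint64_t k_start = keep[j].base;
			uint64_t k_end = keep[j].base + keep[j].size;
			uint64_t start = m_start > k_start ? m_start : k_start;
			uint64_t end = m_end < k_end ? m_end : k_end;

			/* Only a non-empty overlap survives the cap. */
			if (start < end) {
				out[n].base = start;
				out[n].size = end - start;
				n++;
			}
		}
	}
	return n;
}
```

Everything outside the kept ranges is dropped, which is the behaviour the crash dump kernel needs when it may only touch its reserved regions.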

Signed-off-by: Chen Zhou 
Signed-off-by: Mike Rapoport 
---
 include/linux/memblock.h |  1 +
 mm/memblock.c| 45 +
 2 files changed, 46 insertions(+)

diff --git a/include/linux/memblock.h b/include/linux/memblock.h
index 47e3c06..180877c 100644
--- a/include/linux/memblock.h
+++ b/include/linux/memblock.h
@@ -446,6 +446,7 @@ phys_addr_t memblock_start_of_DRAM(void);
 phys_addr_t memblock_end_of_DRAM(void);
 void memblock_enforce_memory_limit(phys_addr_t memory_limit);
 void memblock_cap_memory_range(phys_addr_t base, phys_addr_t size);
+void memblock_cap_memory_ranges(struct memblock_type *regions_to_keep);
 void memblock_mem_limit_remove_map(phys_addr_t limit);
 bool memblock_is_memory(phys_addr_t addr);
 bool memblock_is_map_memory(phys_addr_t addr);
diff --git a/mm/memblock.c b/mm/memblock.c
index f315eca..9661807 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -1697,6 +1697,51 @@ void __init memblock_cap_memory_range(phys_addr_t base, phys_addr_t size)
base + size, PHYS_ADDR_MAX);
 }
 
+void __init memblock_cap_memory_ranges(struct memblock_type *regions_to_keep)
+{
+   int start_rgn[INIT_MEMBLOCK_REGIONS], end_rgn[INIT_MEMBLOCK_REGIONS];
+   int i, j, ret, nr = 0;
+   struct memblock_region *regs = regions_to_keep->regions;
+
+   for (i = 0; i < regions_to_keep->cnt; i++) {
+   ret = memblock_isolate_range(, regs[i].base,
+   regs[i].size, _rgn[i], _rgn[i]);
+   if (ret)
+   break;
+   nr++;
+   }
+   if (!nr)
+   return;
+
+   /* remove all the MAP regions */
+   for (i = memblock.memory.cnt - 1; i >= end_rgn[nr - 1]; i--)
+   if (!memblock_is_nomap([i]))
+   memblock_remove_region(, i);
+
+   for (i = nr - 1; i > 0; i--)
+   for (j = start_rgn[i] - 1; j >= end_rgn[i - 1]; j--)
+   if (!memblock_is_nomap([j]))
+   memblock_remove_region(, j);
+
+   for (i = start_rgn[0] - 1; i >= 0; i--)
+   if (!memblock_is_nomap([i]))
+   memblock_remove_region(, i);
+
+   /* truncate the reserved regions */
+   memblock_remove_range(, 0, regs[0].base);
+
+   for (i = nr - 1; i > 0; i--) {
+   phys_addr_t remove_base = regs[i - 1].base + regs[i - 1].size;
+   phys_addr_t remove_size = regs[i].base - remove_base;
+
+   memblock_remove_range(, remove_base,
+   remove_size);
+   }
+
+   memblock_remove_range(,
+   regs[nr - 1].base + regs[nr - 1].size, PHYS_ADDR_MAX);
+}
+
 void __init memblock_mem_limit_remove_map(phys_addr_t limit)
 {
phys_addr_t max_addr;
-- 
2.7.4




[PATCH v4 1/5] x86: kdump: move reserve_crashkernel_low() into kexec_core.c

2019-04-15 Thread Chen Zhou
In preparation for supporting more than one crash kernel region
on arm64, as x86_64 does, move reserve_crashkernel_low() into
kernel/kexec_core.c.
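
Moving the function into common code means the alignment has to come from a per-architecture macro with a generic fallback. The pattern can be sketched in plain C as follows. This is an illustration only: is_aligned() and align_up() are hypothetical helpers standing in for the kernel's own macros, and the 128M fallback mirrors the default added to include/linux/kexec.h in this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* An architecture header may define CRASH_ALIGN first, e.g. 2M or 16M. */
#ifndef CRASH_ALIGN
#define CRASH_ALIGN ((uint64_t)128 << 20)	/* generic fallback: 128M */
#endif

/* True if addr is a multiple of align (align must be a power of two). */
static bool is_aligned(uint64_t addr, uint64_t align)
{
	return (addr & (align - 1)) == 0;
}

/* Round addr up to the next multiple of align (power of two). */
static uint64_t align_up(uint64_t addr, uint64_t align)
{
	return (addr + align - 1) & ~(align - 1);
}
```

Because the fallback sits behind #ifndef, an architecture that defines CRASH_ALIGN in its own asm/kexec.h keeps its value as long as that header is included first.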

Signed-off-by: Chen Zhou 
---
 arch/x86/include/asm/kexec.h |  3 ++
 arch/x86/kernel/setup.c  | 66 +---
 include/linux/kexec.h|  5 
 kernel/kexec_core.c  | 56 +
 4 files changed, 71 insertions(+), 59 deletions(-)

diff --git a/arch/x86/include/asm/kexec.h b/arch/x86/include/asm/kexec.h
index 003f2da..485a514 100644
--- a/arch/x86/include/asm/kexec.h
+++ b/arch/x86/include/asm/kexec.h
@@ -18,6 +18,9 @@
 
 # define KEXEC_CONTROL_CODE_MAX_SIZE   2048
 
+/* 16M alignment for crash kernel regions */
+#define CRASH_ALIGN (16 << 20)
+
 #ifndef __ASSEMBLY__
 
 #include 
diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 3773905..4182035 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -447,9 +447,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 
 #ifdef CONFIG_KEXEC_CORE
 
-/* 16M alignment for crash kernel regions */
-#define CRASH_ALIGN (16 << 20)
-
 /*
  * Keep the crash kernel below this limit.  On 32 bits earlier kernels
  * would limit the kernel to the low 512 MiB due to mapping restrictions.
@@ -463,59 +460,6 @@ static void __init memblock_x86_reserve_range_setup_data(void)
 # define CRASH_ADDR_HIGH_MAX   MAXMEM
 #endif
 
-static int __init reserve_crashkernel_low(void)
-{
-#ifdef CONFIG_X86_64
-   unsigned long long base, low_base = 0, low_size = 0;
-   unsigned long total_low_mem;
-   int ret;
-
-   total_low_mem = memblock_mem_size(1UL << (32 - PAGE_SHIFT));
-
-   /* crashkernel=Y,low */
-   ret = parse_crashkernel_low(boot_command_line, total_low_mem, 
_size, );
-   if (ret) {
-   /*
-* two parts from lib/swiotlb.c:
-* -swiotlb size: user-specified with swiotlb= or default.
-*
-* -swiotlb overflow buffer: now hardcoded to 32k. We round it
-* to 8M for other buffers that may need to stay low too. Also
-* make sure we allocate enough extra low memory so that we
-* don't run out of DMA buffers for 32-bit devices.
-*/
-   low_size = max(swiotlb_size_or_default() + (8UL << 20), 256UL << 20);
-   } else {
-   /* passed with crashkernel=0,low ? */
-   if (!low_size)
-   return 0;
-   }
-
-   low_base = memblock_find_in_range(0, 1ULL << 32, low_size, CRASH_ALIGN);
-   if (!low_base) {
-   pr_err("Cannot reserve %ldMB crashkernel low memory, please try smaller size.\n",
-  (unsigned long)(low_size >> 20));
-   return -ENOMEM;
-   }
-
-   ret = memblock_reserve(low_base, low_size);
-   if (ret) {
-   pr_err("%s: Error reserving crashkernel low memblock.\n", __func__);
-   return ret;
-   }
-
-   pr_info("Reserving %ldMB of low memory at %ldMB for crashkernel (System low RAM: %ldMB)\n",
-   (unsigned long)(low_size >> 20),
-   (unsigned long)(low_base >> 20),
-   (unsigned long)(total_low_mem >> 20));
-
-   crashk_low_res.start = low_base;
-   crashk_low_res.end   = low_base + low_size - 1;
-   insert_resource(_resource, _low_res);
-#endif
-   return 0;
-}
-
 static void __init reserve_crashkernel(void)
 {
unsigned long long crash_size, crash_base, total_mem;
@@ -573,9 +517,13 @@ static void __init reserve_crashkernel(void)
return;
}
 
-   if (crash_base >= (1ULL << 32) && reserve_crashkernel_low()) {
-   memblock_free(crash_base, crash_size);
-   return;
+   if (crash_base >= (1ULL << 32)) {
+   if (reserve_crashkernel_low()) {
+   memblock_free(crash_base, crash_size);
+   return;
+   }
+
+   insert_resource(_resource, _low_res);
}
 
pr_info("Reserving %ldMB of memory at %ldMB for crashkernel (System RAM: %ldMB)\n",
diff --git a/include/linux/kexec.h b/include/linux/kexec.h
index b9b1bc5..096ad63 100644
--- a/include/linux/kexec.h
+++ b/include/linux/kexec.h
@@ -63,6 +63,10 @@
 
 #define KEXEC_CORE_NOTE_NAME   CRASH_CORE_NOTE_NAME
 
+#ifndef CRASH_ALIGN
+#define CRASH_ALIGN SZ_128M
+#endif
+
 /*
  * This structure is used to hold the arguments that are used when loading
  * kernel binaries.
@@ -281,6 +285,7 @@ extern void __crash_kexec(struct pt_regs *);
 extern void crash_kexec(struct pt_regs *);
 int kexec_should_crash(struct task_struct *);
 int kexec_crash_loaded(void);
+int __init reserve_crashkernel_low(void);
 void crash_save_cpu(struct pt_regs *regs, int cpu);
 extern int kimage_crash_copy_vmcoreinfo(struct kimage *image);
 
diff --git 

[PATCH v4 0/5] support reserving crashkernel above 4G on arm64 kdump

2019-04-15 Thread Chen Zhou
When the crashkernel region is reserved above 4G, the kernel should also
reserve some low memory for swiotlb and other DMA buffers. So there may
be two crash kernel regions, one below 4G and one above 4G.

The crash dump kernel reads the crash kernel regions via a dtb
property under node /chosen,
linux,usable-memory-range = .

Besides, we need to modify kexec-tools:
  arm64: support more than one crash kernel regions(see [1])

Changes since [v3]
- Add memblock_cap_memory_ranges for multiple ranges.
- Split patch "arm64: kdump: support more than one crash kernel regions"
as two. One is above "Add memblock_cap_memory_ranges", the other is using
memblock_cap_memory_ranges to support multiple crash kernel regions.
- Fix some compiling warnings.

Changes since [v2]
- Split patch "arm64: kdump: support reserving crashkernel above 4G" as
  two. Put "move reserve_crashkernel_low() into kexec_core.c" in a separate
  patch.

Changes since [v1]:
- Move common reserve_crashkernel_low() code into kernel/kexec_core.c.
- Remove memblock_cap_memory_ranges() I added in v1 and implement that
  in fdt_enforce_memory_region().
  There are at most two crash kernel regions. For the two-region
  case, we cap the memory range [min(regs[*].start), max(regs[*].end)]
  and then remove the memory range in the middle.
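
The two-region handling described in the last changelog entry can be sketched as a small helper that computes the enclosing span and the hole to remove afterwards. This is an illustration under assumptions: struct region and span_and_hole() are hypothetical names, the two regions are assumed not to overlap, and ends are exclusive here for simplicity.

```c
#include <stdint.h>

struct region {
	uint64_t start;
	uint64_t end;	/* exclusive */
};

/*
 * Given exactly two non-overlapping regions, compute the enclosing span
 * [min(start), max(end)) and the hole between them that has to be
 * removed again after capping memory to the span.
 */
static void span_and_hole(const struct region regs[2],
			  struct region *span, struct region *hole)
{
	const struct region *lo = regs[0].start <= regs[1].start ? &regs[0] : &regs[1];
	const struct region *hi = lo == &regs[0] ? &regs[1] : &regs[0];

	span->start = lo->start;
	span->end = hi->end;
	hole->start = lo->end;
	hole->end = hi->start;
}
```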

[1]: http://lists.infradead.org/pipermail/kexec/2019-April/022792.html
[v1]: https://lkml.org/lkml/2019/4/8/628
[v2]: https://lkml.org/lkml/2019/4/9/86
[v3]: https://lkml.org/lkml/2019/4/15/6

Chen Zhou (5):
  x86: kdump: move reserve_crashkernel_low() into kexec_core.c
  arm64: kdump: support reserving crashkernel above 4G
  memblock: add memblock_cap_memory_ranges for multiple ranges
  arm64: kdump: support more than one crash kernel regions
  kdump: update Documentation about crashkernel on arm64

 Documentation/admin-guide/kernel-parameters.txt |  4 +-
 arch/arm64/include/asm/kexec.h  |  3 ++
 arch/arm64/kernel/setup.c   |  3 ++
 arch/arm64/mm/init.c| 59 --
 arch/x86/include/asm/kexec.h|  3 ++
 arch/x86/kernel/setup.c | 66 +++--
 include/linux/kexec.h   |  5 ++
 include/linux/memblock.h|  2 +-
 kernel/kexec_core.c | 56 +
 mm/memblock.c   | 56 +++--
 10 files changed, 166 insertions(+), 91 deletions(-)

-- 
2.7.4




Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Borislav Petkov
On Mon, Apr 15, 2019 at 11:07:17AM +0200, Borislav Petkov wrote:
> On Mon, Apr 15, 2019 at 07:01:54AM +, Junichi Nomura wrote:
> > OK. Then I'll go back to v3 and make sure to hang when
> > something is wrong during kexec boot on EFI system.
> 
> No need - I have it here locally. I'll clean it up and post it for
> review.

Here it is. Ok, not ok?

---
diff --git a/arch/x86/boot/compressed/acpi.c b/arch/x86/boot/compressed/acpi.c
index 0ef4ad55b29b..089639a8a384 100644
--- a/arch/x86/boot/compressed/acpi.c
+++ b/arch/x86/boot/compressed/acpi.c
@@ -44,17 +44,109 @@ static acpi_physical_address get_acpi_rsdp(void)
return addr;
 }
 
-/* Search EFI system tables for RSDP. */
-static acpi_physical_address efi_get_rsdp_addr(void)
+/*
+ * Search EFI system tables for RSDP.  If both ACPI_20_TABLE_GUID and
+ * ACPI_TABLE_GUID are found, take the former, which has more features.
+ */
+static acpi_physical_address
+__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
+   bool efi_64)
 {
acpi_physical_address rsdp_addr = 0;
 
 #ifdef CONFIG_EFI
-   unsigned long systab, systab_tables, config_tables;
+   int i;
+
+   /* Get EFI tables from systab. */
+   for (i = 0; i < nr_tables; i++) {
+   acpi_physical_address table;
+   efi_guid_t guid;
+
+   if (efi_64) {
+   efi_config_table_64_t *tbl = (efi_config_table_64_t *) config_tables + i;
+
+   guid  = tbl->guid;
+   table = tbl->table;
+
+   if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
+   debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
+   return 0;
+   }
+   } else {
+   efi_config_table_32_t *tbl = (efi_config_table_32_t *) config_tables + i;
+
+   guid  = tbl->guid;
+   table = tbl->table;
+   }
+
+   if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
+   rsdp_addr = table;
+   else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
+   return table;
+   }
+#endif
+   return rsdp_addr;
+}
+
+/* EFI/kexec support is 64-bit only. */
+#ifdef CONFIG_X86_64
+static struct efi_setup_data * get_kexec_setup_data_addr(void)
+{
+   struct setup_data *data;
+   u64 pa_data;
+
+   pa_data = boot_params->hdr.setup_data;
+   while (pa_data) {
+   data = (struct setup_data *)pa_data;
+   if (data->type == SETUP_EFI)
+   return (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
+
+   pa_data = data->next;
+   }
+   return NULL;
+}
+
+static acpi_physical_address kexec_get_rsdp_addr(void)
+{
+   efi_system_table_64_t *systab;
+   struct efi_setup_data *esd;
+   struct efi_info *ei;
+   char *sig;
+
+   esd = (struct efi_setup_data *)get_kexec_setup_data_addr();
+   if (!esd)
+   return 0;
+
+   if (!esd->tables) {
+   debug_putstr("Wrong kexec SETUP_EFI data.\n");
+   return 0;
+   }
+
+   ei = _params->efi_info;
+   sig = (char *)>efi_loader_signature;
+   if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+   debug_putstr("Wrong kexec EFI loader signature.\n");
+   return 0;
+   }
+
+   /* Get systab from boot params. */
+   systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
+   if (!systab)
+   error("EFI system table not found in kexec boot_params.");
+
+   return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
+}
+#else
+static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
+#endif /* CONFIG_X86_64 */
+
+static acpi_physical_address efi_get_rsdp_addr(void)
+{
+#ifdef CONFIG_EFI
+   unsigned long systab, config_tables;
unsigned int nr_tables;
struct efi_info *ei;
bool efi_64;
-   int size, i;
char *sig;
 
ei = _params->efi_info;
@@ -88,49 +180,20 @@ static acpi_physical_address efi_get_rsdp_addr(void)
 
config_tables   = stbl->tables;
nr_tables   = stbl->nr_tables;
-   size= sizeof(efi_config_table_64_t);
} else {
efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
 
config_tables   = stbl->tables;
nr_tables   = stbl->nr_tables;
-   size= sizeof(efi_config_table_32_t);
}
 
if (!config_tables)
error("EFI config tables not found.");
 
-   /* Get EFI tables from systab. */
-   for (i = 0; i < nr_tables; i++) {
-   acpi_physical_address table;
-   efi_guid_t guid;
-
-  

Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Borislav Petkov
On Mon, Apr 15, 2019 at 07:01:54AM +, Junichi Nomura wrote:
> OK. Then I'll go back to v3 and make sure to hang when
> something is wrong during kexec boot on EFI system.

No need - I have it here locally. I'll clean it up and post it for
review.

-- 
Regards/Gruss,
Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.



Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Dave Young
On 04/12/19 at 08:23am, Baoquan He wrote:
> On 04/11/19 at 09:14am, Junichi Nomura wrote:
> > On 4/11/19 5:42 PM, Baoquan He wrote:
> > > On 04/11/19 at 08:16am, Junichi Nomura wrote:
> > >> kexec_get_rsdp_addr() might fail on kexec-booted kernel, e.g. if the
> > >> setup_data was invalid. In such a case, falling back to 
> > >> efi_get_rsdp_addr()
> > >> will hit the problem of accessing invalid table pointer again.
> > > 
> > > Seems you are trying to address Dave Young's comment in 
> > > http://lkml.kernel.org/r/20190404073233.gc5...@dhcp-128-65.nay.redhat.com
> > 
> > Right. His "In case kexec_get_rsdp_addr failed.." comment.
> > 
> > > We may need discuss and make clear if those are doable. E.g the first
> > > comment, if not hang by below line of code, returning 0 for what? Can
> > > kexec still be saved, or just reset to firmware?
> > > 
> > >   error("EFI system table not found in kexec boot_params.")
> > 
> > If we return 0 and also don't hang in the rest of get_rsdp_addr(),
> > it just work as the same way as v5.0 and earlier kernel do.
> > 
> > Failure cases in kexec_get_rsdp_addr() are the following:
> > 1. efi_setup_data is invalid
> > 2. loader signature is invalid
> > 3. EFI systab is not found in boot_params
> > 4. RSDP is not found by parsing tables pointed to by efi_setup_data
> > 
> > I think all of them are critical for EFI boot, so one option could be
> > we never return failure in kexec_get_rsdp_addr() and just hang.
> > But hanging in this very early stage of boot may make the problem
> > harder to investigate once happens. Even earlyprintk is not working yet.
> > So the other option is returning 0 to defer the crash for later stage.
> 
> OK, I got the point, thanks. So it is deferred to the late stage, KASLR
> may not avoid those memory region which is marked as hotpluggable in
> SRAT. Kernel can boot up, but doesn't function well on hotplug stuff.
> In this case, people don't know why it happened. We are still blind.
> 
> Seems early console in efi is the problem, but not kexec or hotplug. I
> am fine to hang, or make it continue booting for now.
> 
> Hi Dave, 
> 
> Is it possible to fix the efi early console issue? I mean the
> feasibility, I believe it won't be easy. Ask this because not only this
> issue encountered, any other issue could be triggered during boot
> decompressing stage. If efi has this problem, we can't debug them
> either.

For a normal boot, it may be doable to use some boot services, e.g. some
graphics protocols that the EFI firmware provides.

But for kexec it is different: EFI is already in virtual mode, boot
services are not available, and the kernel has taken over mode setting etc.
The early framebuffer may or may not be usable, so it is not reliable.

Thanks
Dave



Re: [PATCH v4] x86/boot: Use efi_setup_data for searching RSDP on kexec-ed kernel

2019-04-15 Thread Junichi Nomura
On 4/12/19 10:35 PM, Borislav Petkov wrote:
> On Fri, Apr 12, 2019 at 10:49:56AM +0200, Borislav Petkov wrote:
>> Now I need to go figure out whether there's a reliable way to know in
>> the kexec kernel that it *is* a kexec kernel.
> 
> Actually, thinking about this more, we don't need to know whether the
> kernel was kexeced or not. Why?
> 
> Because if it is kexec'ed, kexec(1) passes the required info in
> setup_data. Now, if for whatever reason the kexec'ed kernel fails to
> parse that EFI info and get the systab to figure out the RSDP, then it
> doesn't have any other choice but fail booting.
> 
> Because there's no way it can figure out where the EFI runtime has been
> mapped and recover by finding the RSDP from there.
> 
> So I think we're perfectly fine with the old approach:
> 
> if (!pa)
> pa = kexec_get_rsdp_addr();
> 
> if (!pa)
> pa = efi_get_rsdp_addr();

OK. Then I'll go back to v3 and make sure to hang when
something is wrong during kexec boot on EFI system.
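
The lookup order quoted above amounts to a "first non-zero result wins" chain, which can be sketched in standalone C as follows. This is an illustration only: first_rsdp() and the two stub lookups are hypothetical stand-ins, the address returned by the second stub is made up, and the sketch does not model the hang-on-error behaviour discussed in this thread.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t acpi_physical_address;
typedef acpi_physical_address (*rsdp_lookup_fn)(void);

/*
 * Try each lookup in order and return the first non-zero address.
 * Returns 0 if every lookup fails.
 */
static acpi_physical_address first_rsdp(const rsdp_lookup_fn *lookups, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		acpi_physical_address pa = lookups[i]();

		if (pa)
			return pa;
	}
	return 0;
}

/* Hypothetical stand-in for the kexec lookup: reports "not found". */
static acpi_physical_address stub_kexec_lookup(void)
{
	return 0;
}

/* Hypothetical stand-in for the EFI lookup: returns a made-up address. */
static acpi_physical_address stub_efi_lookup(void)
{
	return 0xf0000;
}
```

Returning 0 from one lookup lets the next one run, which is why the thread cares about what kexec_get_rsdp_addr() does when the setup data is present but malformed.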

-- 
Jun'ichi Nomura, NEC Corporation / NEC Solution Innovators, Ltd.