Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
On Thu, Jul 2, 2020 at 10:45 PM Catalin Marinas wrote:
>
> On Thu, 14 May 2020 00:22:35 +0530, Bhupesh Sharma wrote:
> > Apologies for the delayed update. It's been quite some time since I
> > posted the last version (v5), but I have been really caught up in some
> > other critical issues.
> >
> > Changes since v5:
> >
> > - v5 can be viewed here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> > - Addressed review comments from James Morse and Boris.
> > - Added Tested-by received from John on the v5 patchset.
> > - Rebased against the arm64 (for-next/ptr-auth) branch, which has
> >   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
> >   vmcoreinfo applied.
> >
> > [...]
>
> Applied to arm64 (for-next/vmcoreinfo), thanks!
>
> [1/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
>       https://git.kernel.org/arm64/c/1d50e5d0c505
> [2/2] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
>       https://git.kernel.org/arm64/c/bbdbc11804ff

Thanks Catalin for pulling in the changes. Dave and James, many thanks
for reviewing the patches as well.

Regards,
Bhupesh
Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hello Bharat,

On Wed, Jun 10, 2020 at 10:17 PM Bharat Gooty wrote:
>
> Hello Bhupesh,
> The v6 patch set on Linux 5.7 did not help. I have also applied the
> makedumpfile changes from
> http://lists.infradead.org/pipermail/kexec/2019-November/023963.html
> (makedumpfile 1.6.6). I tried to apply them on makedumpfile 1.6.7, but
> patch set_2 failed. I would like to know if you have a v5 patch set for
> the makedumpfile changes. With makedumpfile 1.6.6, I am able to collect
> the vmcore file.
> I used the latest crash utility (the
> https://www.redhat.com/archives/crash-utility/2019-November/msg00014.html
> changes are present).
> When I used the crash utility, the following is the error:
>
> Thanks,
> -Bharat
>
>
> -----Original Message-----
> From: Scott Branden [mailto:scott.bran...@broadcom.com]
> Sent: Thursday, April 30, 2020 4:34 AM
> To: Bhupesh Sharma; Amit Kachhap
> Cc: Mark Rutland; x...@kernel.org; Will Deacon; Linux Doc Mailing List;
> Catalin Marinas; Ard Biesheuvel; kexec mailing list; Linux Kernel
> Mailing List; Kazuhito Hagio; James Morse; Dave Anderson; bhupesh
> linux; linuxppc-dev@lists.ozlabs.org; linux-arm-kernel; Steve Capper;
> Ray Jui; Bharat Gooty
> Subject: Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export
> TCR_EL1.T1SZ in vmcoreinfo
>
> Hi Bhupesh,
>
> On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote:
> > Hi Amit,
> >
> > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap wrote:
> >> Hi Bhupesh,
> >>
> >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote:
> >>> Hi James,
> >>>
> >>> On 01/11/2020 12:30 AM, Dave Anderson wrote:
> >>>> ----- Original Message -----
> >>>>> Hi Bhupesh,
> >>>>>
> >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote:
> >>>>>> On 12/12/2019 04:02 PM, James Morse wrote:
> >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space
> >>>>>>>> size, and allows a single binary to support both 48-bit and
> >>>>>>>> 52-bit VA spaces.
> >>>>>>>>
> >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are
> >>>>>>>> running with a 64KB page size, then it is possible to use
> >>>>>>>> 52 bits of address space for both userspace and kernel
> >>>>>>>> addresses. However, any kernel binary that supports 52-bit
> >>>>>>>> must also be able to fall back to 48-bit at early boot time
> >>>>>>>> if the hardware feature is not present.
> >>>>>>>>
> >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory
> >>>>>>>> region addressed by TTBR1_EL1 (and hence can be used for
> >>>>>>>> determining the vabits_actual value) it makes more sense to
> >>>>>>>> export the same in vmcoreinfo rather than the vabits_actual
> >>>>>>>> variable, as the name of the variable can change in future
> >>>>>>>> kernel versions, but architectural constructs like
> >>>>>>>> TCR_EL1.T1SZ can be used better to indicate intended specific
> >>>>>>>> fields to user-space.
> >>>>>>>>
> >>>>>>>> User-space utilities like makedumpfile and crash-utility need
> >>>>>>>> to read/write this value from/to vmcoreinfo
> >>>>>>>
> >>>>>>> (write?)
> >>>>>>
> >>>>>> Yes, also write, so that the vmcoreinfo from a (crashing) arm64
> >>>>>> system can be used for analysis of the root cause of the
> >>>>>> panic/crash on, say, an x86_64 host using utilities like
> >>>>>> crash-utility/gdb.
> >>>>>
> >>>>> I read this as "User-space [...] needs to write to vmcoreinfo".
> >>>
> >>> That's correct. But for writing to the vmcore dump in the kdump
> >>> kernel, we need to read the symbols from the vmcoreinfo in the
> >>> primary kernel.
> >>>
> >>>>>>>> for determining if a virtual address lies in the linear map
> >>>>>>>> range.
> >>>>>>>
> >>>>>>> I think this is a fragile example. The debugger shouldn't need
> >>>>>>> to know this.
> >>>>>>
> >>>>>> Well that the c
Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hello Catalin, Will,

On Tue, Jun 2, 2020 at 10:54 AM Bhupesh Sharma wrote:
>
> Hello,
>
> On Thu, May 14, 2020 at 12:22 AM Bhupesh Sharma wrote:
> >
> > Apologies for the delayed update. It's been quite some time since I
> > posted the last version (v5), but I have been really caught up in
> > some other critical issues.
> >
> > Changes since v5:
> >
> > - v5 can be viewed here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> > - Addressed review comments from James Morse and Boris.
> > - Added Tested-by received from John on the v5 patchset.
> > - Rebased against the arm64 (for-next/ptr-auth) branch, which has
> >   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
> >   vmcoreinfo applied.
> >
> > Changes since v4:
> >
> > - v4 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> > - Addressed comments from Dave and added patches for documenting
> >   new variables appended to the vmcoreinfo documentation.
> > - Added the testing report shared by Akashi for PATCH 2/5.
> >
> > Changes since v3:
> >
> > - v3 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> > - Addressed comments from James and exported TCR_EL1.T1SZ in
> >   vmcoreinfo instead of PTRS_PER_PGD.
> > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
> >   'Documentation/arm64/memory.rst'
> >
> > Changes since v2:
> >
> > - v2 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> > - Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
> >   CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
> > - Updated the vmcoreinfo documentation to add a description of the
> >   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
> >
> > Changes since v1:
> >
> > - v1 was sent out as a single patch which can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> > - v2 breaks the single patch into two independent patches:
> >   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64
> >   arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo
> >   in core kernel code (all archs).
> >
> > This patchset primarily fixes the regressions reported in user-space
> > utilities like 'makedumpfile' and 'crash-utility' on the arm64
> > architecture with the availability of the 52-bit address space
> > feature in the underlying kernel. These regressions have been
> > reported both on CPUs which don't support the ARMv8.2 extensions
> > (i.e. LVA, LPA) and are running newer kernels, and also on prototype
> > platforms (like the ARMv8 FVP simulator model) which support the
> > ARMv8.2 extensions and are running newer kernels.
> >
> > The reason for these regressions is that right now user-space tools
> > have no direct access to these values (since these are not exported
> > from the kernel) and hence need to rely on a best-guess method of
> > determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
> > supported by the underlying kernel.
> >
> > Exporting these values via vmcoreinfo will help user-land in such
> > cases. In addition, as per a suggestion from the makedumpfile
> > maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS'
> > to vmcoreinfo in the core code itself rather than in arm64
> > arch-specific code, so that the user-space code for other archs can
> > also benefit from this addition to the vmcoreinfo and use it as a
> > standard way of determining the 'SECTIONS_SHIFT' value in user-land.
> >
> > Cc: Boris Petkov
> > Cc: Ingo Molnar
> > Cc: Thomas Gleixner
> > Cc: Jonathan Corbet
> > Cc: James Morse
> > Cc: Mark Rutland
> > Cc: Will Deacon
> > Cc: Steve Capper
> > Cc: Catalin Marinas
> > Cc: Ard Biesheuvel
> > Cc: Michael Ellerman
> > Cc: Paul Mackerras
> > Cc: Benjamin Herrenschmidt
> > Cc: Dave Anderson
> > Cc: Kazuhito Hagio
> > Cc: John Donnelly
> > Cc: scott.bran...@broadcom.com
> > Cc: Amit Kachhap
> > Cc: x...@kernel.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: linux-...@vger.kernel.org
> > Cc: ke...@lists.infradead.org
> >
> > Bhupesh Sharma (2):
> >   crash_
Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Hello Scott,

On Thu, Jun 4, 2020 at 12:17 AM Scott Branden wrote:
>
> Hi Bhupesh,
>
> It would be great to get this patch series upstreamed.
>
> On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> > Hi James,
> >
> > I am sorry this review mail skipped my attention due to holidays and
> > focus on other urgent issues.
> >
> > On 12/12/2019 04:02 PM, James Morse wrote:
> >> Hi Bhupesh,
> >>
> >> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>> Add documentation for the TCR_EL1.T1SZ variable being added to
> >>> vmcoreinfo.
> >>>
> >>> It indicates the size offset of the memory region addressed by
> >>> TTBR1_EL1 and hence can be used for determining the vabits_actual
> >>> value.
> >>
> >> used for determining random-internal-kernel-variable, that might not
> >> exist tomorrow.
> >>
> >> Could you describe how this is useful/necessary if a debugger wants
> >> to walk the page tables from the core file? I think this is a better
> >> argument.
> >>
> >> Wouldn't the documentation be better as part of the patch that adds
> >> the export? (... unless these have to go via different trees? ..)
> >
> > Ok, will fix the same in the v6 version.
> >
> >>> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> index 447b64314f56..f9349f9d3345 100644
> >>> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> @@ -398,6 +398,12 @@ KERNELOFFSET
> >>>  The kernel randomization offset. Used to compute the page offset.
> >>>  If KASLR is disabled, this value is zero.
> >>>
> >>> +TCR_EL1.T1SZ
> >>> +------------
> >>> +
> >>> +Indicates the size offset of the memory region addressed by
> >>> +TTBR1_EL1 and hence can be used for determining the vabits_actual
> >>> +value.
> >>
> >> 'vabits_actual' may not exist when the next person comes to read
> >> this documentation (it's going to rot really quickly).
> >>
> >> I think the first half of this text is enough to say what this is
> >> for. You should include words to the effect that it's the hardware
> >> value that goes with swapper_pg_dir. You may want to point readers
> >> to the arm-arm for more details on what the value means.
> >
> > Ok, got it. Fixed this in v6, which should be on its way shortly.
>
> I can't seem to find v6?

Oops. I remember Cc'ing you on the v6 patchset (maybe my email client
messed up); anyway, here is the v6 patchset for your reference:
<http://lists.infradead.org/pipermail/kexec/2020-May/025095.html>

Do share your review/test comments on the same.

Thanks,
Bhupesh
Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hello,

On Thu, May 14, 2020 at 12:22 AM Bhupesh Sharma wrote:
>
> Apologies for the delayed update. It's been quite some time since I
> posted the last version (v5), but I have been really caught up in some
> other critical issues.
>
> Changes since v5:
>
> - v5 can be viewed here:
>   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> - Addressed review comments from James Morse and Boris.
> - Added Tested-by received from John on the v5 patchset.
> - Rebased against the arm64 (for-next/ptr-auth) branch, which has
>   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
>   vmcoreinfo applied.
>
> Changes since v4:
>
> - v4 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> - Addressed comments from Dave and added patches for documenting
>   new variables appended to the vmcoreinfo documentation.
> - Added the testing report shared by Akashi for PATCH 2/5.
>
> Changes since v3:
>
> - v3 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> - Addressed comments from James and exported TCR_EL1.T1SZ in
>   vmcoreinfo instead of PTRS_PER_PGD.
> - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
>   'Documentation/arm64/memory.rst'
>
> Changes since v2:
>
> - v2 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> - Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
>   CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
> - Updated the vmcoreinfo documentation to add a description of the
>   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
>
> Changes since v1:
>
> - v1 was sent out as a single patch which can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> - v2 breaks the single patch into two independent patches:
>   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64 arch,
>   whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core
>   kernel code (all archs).
>
> This patchset primarily fixes the regressions reported in user-space
> utilities like 'makedumpfile' and 'crash-utility' on the arm64
> architecture with the availability of the 52-bit address space feature
> in the underlying kernel. These regressions have been reported both on
> CPUs which don't support the ARMv8.2 extensions (i.e. LVA, LPA) and
> are running newer kernels, and also on prototype platforms (like the
> ARMv8 FVP simulator model) which support the ARMv8.2 extensions and
> are running newer kernels.
>
> The reason for these regressions is that right now user-space tools
> have no direct access to these values (since these are not exported
> from the kernel) and hence need to rely on a best-guess method of
> determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
> supported by the underlying kernel.
>
> Exporting these values via vmcoreinfo will help user-land in such
> cases. In addition, as per a suggestion from the makedumpfile
> maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to
> vmcoreinfo in the core code itself rather than in arm64 arch-specific
> code, so that the user-space code for other archs can also benefit
> from this addition to the vmcoreinfo and use it as a standard way of
> determining the 'SECTIONS_SHIFT' value in user-land.
>
> Cc: Boris Petkov
> Cc: Ingo Molnar
> Cc: Thomas Gleixner
> Cc: Jonathan Corbet
> Cc: James Morse
> Cc: Mark Rutland
> Cc: Will Deacon
> Cc: Steve Capper
> Cc: Catalin Marinas
> Cc: Ard Biesheuvel
> Cc: Michael Ellerman
> Cc: Paul Mackerras
> Cc: Benjamin Herrenschmidt
> Cc: Dave Anderson
> Cc: Kazuhito Hagio
> Cc: John Donnelly
> Cc: scott.bran...@broadcom.com
> Cc: Amit Kachhap
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: ke...@lists.infradead.org
>
> Bhupesh Sharma (2):
>   crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
>   arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
>
>  Documentation/admin-guide/kdump/vmcoreinfo.rst | 16 ++++++++++++++++
>  arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
>  arch/arm64/kernel/crash_core.c                 | 10 ++++++++++
>  kernel/crash_core.c                            |  1 +
>  4 files changed, 28 insertions(+)

Ping. @James Morse and others, please share if you have any comments
regarding this patchset.

Thanks,
Bhupesh
[PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Apologies for the delayed update. It's been quite some time since I
posted the last version (v5), but I have been really caught up in some
other critical issues.

Changes since v5:

- v5 can be viewed here:
  http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
- Addressed review comments from James Morse and Boris.
- Added Tested-by received from John on the v5 patchset.
- Rebased against the arm64 (for-next/ptr-auth) branch, which has
  Amit's patchset for the ARMv8.3-A Pointer Authentication feature
  vmcoreinfo applied.

Changes since v4:

- v4 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
- Addressed comments from Dave and added patches for documenting
  new variables appended to the vmcoreinfo documentation.
- Added the testing report shared by Akashi for PATCH 2/5.

Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
  CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
- Updated the vmcoreinfo documentation to add a description of the
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64 arch,
  whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core
  kernel code (all archs).

This patchset primarily fixes the regressions reported in user-space
utilities like 'makedumpfile' and 'crash-utility' on the arm64
architecture with the availability of the 52-bit address space feature
in the underlying kernel. These regressions have been reported both on
CPUs which don't support the ARMv8.2 extensions (i.e. LVA, LPA) and are
running newer kernels, and also on prototype platforms (like the ARMv8
FVP simulator model) which support the ARMv8.2 extensions and are
running newer kernels.

The reason for these regressions is that right now user-space tools
have no direct access to these values (since these are not exported
from the kernel) and hence need to rely on a best-guess method of
determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
supported by the underlying kernel.

Exporting these values via vmcoreinfo will help user-land in such
cases. In addition, as per a suggestion from the makedumpfile
maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to
vmcoreinfo in the core code itself rather than in arm64 arch-specific
code, so that the user-space code for other archs can also benefit from
this addition to the vmcoreinfo and use it as a standard way of
determining the 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: John Donnelly
Cc: scott.bran...@broadcom.com
Cc: Amit Kachhap
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (2):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

 Documentation/admin-guide/kdump/vmcoreinfo.rst | 16 ++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
 arch/arm64/kernel/crash_core.c                 | 10 ++++++++++
 kernel/crash_core.c                            |  1 +
 4 files changed, 28 insertions(+)

-- 
2.7.4
[PATCH v6 1/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS'
supported by the underlying kernel. This value is used in user-space
code to calculate the bit-space required to store a section for
SPARSEMEM (similar to the existing calculation method used in the
kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like
'makedumpfile' and 'crash' on arm64, with the recently added kernel
support for the 52-bit physical address space, as there is no clear
method of determining this value in user-space (other than reading
kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes
more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code
itself rather than in arch-specific code, so that the user-space code
for other archs can also benefit from this addition to the vmcoreinfo
and use it as a standard way of determining the 'SECTIONS_SHIFT' value
in user-land.

A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion
is available here:

While at it, also update the vmcoreinfo documentation for the
'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo.
'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Tested-by: John Donnelly
Signed-off-by: Bhupesh Sharma
---
 Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 +++++
 kernel/crash_core.c                            | 1 +
 2 files changed, 6 insertions(+)

diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index e4ee8b2db604..2a632020f809 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat
 similar to the mem_map variable, both of them are used to translate
 an address.
 
+MAX_PHYSMEM_BITS
+----------------
+
+Defines the maximum supported physical address space memory.
+
 page
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
-- 
2.7.4
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi Amit,

On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap wrote:
>
> Hi Bhupesh,
>
> On 1/13/20 5:44 PM, Bhupesh Sharma wrote:
> > Hi James,
> >
> > On 01/11/2020 12:30 AM, Dave Anderson wrote:
> >>
> >> ----- Original Message -----
> >>> Hi Bhupesh,
> >>>
> >>> On 25/12/2019 19:01, Bhupesh Sharma wrote:
> >>>> On 12/12/2019 04:02 PM, James Morse wrote:
> >>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>>>>> vabits_actual variable on arm64 indicates the actual VA space
> >>>>>> size, and allows a single binary to support both 48-bit and
> >>>>>> 52-bit VA spaces.
> >>>>>>
> >>>>>> If the ARMv8.2-LVA optional feature is present, and we are
> >>>>>> running with a 64KB page size, then it is possible to use
> >>>>>> 52 bits of address space for both userspace and kernel
> >>>>>> addresses. However, any kernel binary that supports 52-bit must
> >>>>>> also be able to fall back to 48-bit at early boot time if the
> >>>>>> hardware feature is not present.
> >>>>>>
> >>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory
> >>>>>> region addressed by TTBR1_EL1 (and hence can be used for
> >>>>>> determining the vabits_actual value) it makes more sense to
> >>>>>> export the same in vmcoreinfo rather than the vabits_actual
> >>>>>> variable, as the name of the variable can change in future
> >>>>>> kernel versions, but architectural constructs like TCR_EL1.T1SZ
> >>>>>> can be used better to indicate intended specific fields to
> >>>>>> user-space.
> >>>>>>
> >>>>>> User-space utilities like makedumpfile and crash-utility need
> >>>>>> to read/write this value from/to vmcoreinfo
> >>>>>
> >>>>> (write?)
> >>>>
> >>>> Yes, also write, so that the vmcoreinfo from a (crashing) arm64
> >>>> system can be used for analysis of the root cause of the
> >>>> panic/crash on, say, an x86_64 host using utilities like
> >>>> crash-utility/gdb.
> >>>
> >>> I read this as "User-space [...] needs to write to vmcoreinfo".
> >
> > That's correct. But for writing to the vmcore dump in the kdump
> > kernel, we need to read the symbols from the vmcoreinfo in the
> > primary kernel.
> >
> >>>>>> for determining if a virtual address lies in the linear map
> >>>>>> range.
> >>>>>
> >>>>> I think this is a fragile example. The debugger shouldn't need
> >>>>> to know this.
> >>>>
> >>>> Well, that's the current user-space utility design, so I am not
> >>>> sure we can tweak that too much.
> >>>>
> >>>>>> The user-space computation for determining whether an address
> >>>>>> lies in the linear map range is the same as we have in
> >>>>>> kernel-space:
> >>>>>>
> >>>>>>   #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))
> >>>>>
> >>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA
> >>>>> space"). If user-space tools rely on 'knowing' the kernel memory
> >>>>> layout, they must have to constantly be fixed and updated. This
> >>>>> is a poor argument for adding this to something that ends up as
> >>>>> ABI.
> >>>>
> >>>> See above. The user-space has to rely on some ABI/guaranteed
> >>>> hardware-symbols which can be used for 'determining' the kernel
> >>>> memory layout.
> >>>
> >>> I disagree. Everything and anything in the kernel will change. The
> >>> ABI rules apply to stuff exposed via syscalls and kernel
> >>> filesystems. It does not apply to kernel internals, like the
> >>> memory layout we used yesterday. 14c127c957c1 is a case in point.
> >>>
> >>> A debugger trying to rely on this sort of thing would have to play
> >>> catchup whenever it changes.
> >>
> >> Exact
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi James,

On 01/11/2020 12:30 AM, Dave Anderson wrote:

----- Original Message -----

Hi Bhupesh,

On 25/12/2019 19:01, Bhupesh Sharma wrote:

On 12/12/2019 04:02 PM, James Morse wrote:

On 29/11/2019 19:59, Bhupesh Sharma wrote:

vabits_actual variable on arm64 indicates the actual VA space size, and
allows a single binary to support both 48-bit and 52-bit VA spaces.

If the ARMv8.2-LVA optional feature is present, and we are running with
a 64KB page size, then it is possible to use 52 bits of address space
for both userspace and kernel addresses. However, any kernel binary
that supports 52-bit must also be able to fall back to 48-bit at early
boot time if the hardware feature is not present.

Since TCR_EL1.T1SZ indicates the size offset of the memory region
addressed by TTBR1_EL1 (and hence can be used for determining the
vabits_actual value) it makes more sense to export the same in
vmcoreinfo rather than the vabits_actual variable, as the name of the
variable can change in future kernel versions, but architectural
constructs like TCR_EL1.T1SZ can be used better to indicate intended
specific fields to user-space.

User-space utilities like makedumpfile and crash-utility need to
read/write this value from/to vmcoreinfo

(write?)

Yes, also write, so that the vmcoreinfo from a (crashing) arm64 system
can be used for analysis of the root cause of the panic/crash on, say,
an x86_64 host using utilities like crash-utility/gdb.

I read this as "User-space [...] needs to write to vmcoreinfo".

That's correct. But for writing to the vmcore dump in the kdump kernel,
we need to read the symbols from the vmcoreinfo in the primary kernel.

for determining if a virtual address lies in the linear map range.

I think this is a fragile example. The debugger shouldn't need to know
this.

Well, that's the current user-space utility design, so I am not sure we
can tweak that too much.

The user-space computation for determining whether an address lies in
the linear map range is the same as we have in kernel-space:

  #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))

This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space").
If user-space tools rely on 'knowing' the kernel memory layout, they
must have to constantly be fixed and updated. This is a poor argument
for adding this to something that ends up as ABI.

See above. The user-space has to rely on some ABI/guaranteed
hardware-symbols which can be used for 'determining' the kernel memory
layout.

I disagree. Everything and anything in the kernel will change. The ABI
rules apply to stuff exposed via syscalls and kernel filesystems. It
does not apply to kernel internals, like the memory layout we used
yesterday. 14c127c957c1 is a case in point.

A debugger trying to rely on this sort of thing would have to play
catchup whenever it changes.

Exactly. That's the whole point. The crash utility and makedumpfile are
not in the same league as other user-space tools. They have always had
to "play catchup" precisely because they depend upon kernel internals,
which constantly change.

I agree with you and DaveA here. Software user-space debuggers are
dependent on kernel internals (which can change from time to time) and
will have to play catch-up (which has been the case since the very
start). Unfortunately we don't have any clear ABI for software
debugging tools - maybe something to look for in the future.

A case in point is gdb/kgdb, which still needs to run with KASLR turned
off (nokaslr) for debugging, as KASLR confuses gdb, which resolves
kernel symbol addresses from the symbol table of vmlinux. But we can
work around the same in makedumpfile/crash by reading the
'kaslr_offset' value. And I have several users telling me now that they
cannot use gdb on a KASLR-enabled kernel to debug panics, but can use
the makedumpfile + crash combination to achieve the same.

So, we should be looking to fix these utilities which have been broken
since the 52-bit changes for arm64. Accordingly, I will try to send the
v6 soon while incorporating the comments posted on the v5.

Thanks,
Bhupesh
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi James, On 12/12/2019 04:02 PM, James Morse wrote: Hi Bhupesh, On 29/11/2019 19:59, Bhupesh Sharma wrote: vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size; then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export the same in vmcoreinfo rather than vabits_actual variable, as the name of the variable can change in future kernel versions, but the architectural constructs like TCR_EL1.T1SZ can be used better to indicate intended specific fields to user-space. User-space utilities like makedumpfile and crash-utility, need to read/write this value from/to vmcoreinfo (write?) Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can be used for analysis of the root-cause of panic/crash on say an x86_64 host using utilities like crash-utility/gdb. for determining if a virtual address lies in the linear map range. I think this is a fragile example. The debugger shouldn't need to know this. Well that the current user-space utility design, so I am not sure we can tweak that too much. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr)(!(((u64)addr) & BIT(vabits_actual - 1))) This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If user-space tools rely on 'knowing' the kernel memory layout, they must have to constantly be fixed and updated. 
This is a poor argument for adding this to something that ends up as ABI. See above. User-space has to rely on some ABI/guaranteed hardware symbols which can be used for 'determining' the kernel memory layout. I think a better argument is walking the kernel page tables from the core dump. Core code's vmcoreinfo exports the location of the kernel page tables, but in the example above you can't walk them without knowing how T1SZ was configured. Sure, both makedumpfile and crash-utility (which walk the kernel page tables from the core dump) use this (and similar) information currently in the user-space. On older kernels, user-space that needs this would have to assume the value it computes from VA_BITS (also in vmcoreinfo) is the value in use. Yes, backward compatibility has been handled in the user-space already. ---%<--- I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining the vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html ---%<--- This probably belongs in the cover letter instead of the commit log. Ok. (From-memory: one of vmcore/kcore is virtually addressed, the other physically. Does this fix your problem in both cases?) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h for the macros you added. Ok. 
Will check, as I did not get any compilation errors without the same, and the build-bot also did not raise a flag for the missing include files. +static inline u64 get_tcr_el1_t1sz(void); Why do you need to do this? Without this I was getting a missing-declaration error while compiling the code. +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} (We don't modify this one, and it's always the same on every CPU, so this is fine. This function is only called once when the stringy vmcoreinfo elf_note is created...) Right. void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Hi James, On 12/12/2019 04:02 PM, James Morse wrote: Hi Bhupesh, I am sorry this review mail skipped my attention due to holidays and focus on other urgent issues. On 29/11/2019 19:59, Bhupesh Sharma wrote: Add documentation for the TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. used for determining random-internal-kernel-variable, that might not exist tomorrow. Could you describe how this is useful/necessary if a debugger wants to walk the page tables from the core file? I think this is a better argument. Wouldn't the documentation be better as part of the patch that adds the export? (... unless these have to go via different trees? ..) Ok, will fix the same in v6 version. diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. 'vabits_actual' may not exist when the next person comes to read this documentation (it's going to rot really quickly). I think the first half of this text is enough to say what this is for. You should include words to the effect that it's the hardware value that goes with swapper_pg_dir. You may want to point readers to the arm-arm for more details on what the value means. Ok, got it. Fixed this in v6, which should be on its way shortly. Thanks, Bhupesh
Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Boris, On Sat, Dec 14, 2019 at 5:57 PM Borislav Petkov wrote: > > On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote: > > Bhupesh Sharma (5): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Documentation/arm64: Fix a simple typo in memory.rst > > Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' > > Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ' > > why are those last two separate patches and not part of the patches > which export the respective variable/define? I remember there was a suggestion during the review of an earlier version to keep them as separate patches so that the documentation text is easier to review, but I have no strong preference either way. I can merge the documentation patches with the respective patches (which export the variables/defines to vmcoreinfo) in v6, unless other maintainers have any objections. Thanks, Bhupesh
Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Will, On Fri, Nov 29, 2019 at 3:54 PM Will Deacon wrote: > > On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote: > > Changes since v4: > > > > - v4 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-November/023961.html > > - Addressed comments from Dave and added patches for documenting > > new variables appended to vmcoreinfo documentation. > > - Added testing report shared by Akashi for PATCH 2/5. > > Please can you fix your mail setup? The last two times you've sent this > series it seems to get split into two threads, which is really hard to > track in my inbox: > > First thread: > > https://lore.kernel.org/lkml/1574972621-25750-1-git-send-email-bhsha...@redhat.com/ > > Second thread: > > https://lore.kernel.org/lkml/1574972716-25858-1-git-send-email-bhsha...@redhat.com/ There seems to be some issue with my server's msmtp settings. I have tried resending the v5 (see <http://lists.infradead.org/pipermail/linux-arm-kernel/2019-November/696833.html>). I hope the threading is ok this time. Thanks for your patience. Regards, Bhupesh
[RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Add documentation for TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. + arm === -- 2.7.4
[RESEND PATCH v5 4/5] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 007a6b86e0ee..447b64314f56 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[RESEND PATCH v5 3/5] Documentation/arm64: Fix a simple typo in memory.rst
Fix a simple typo in arm64/memory.rst Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/arm64/memory.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst index 02e02175e6f5..cf03b3290800 100644 --- a/Documentation/arm64/memory.rst +++ b/Documentation/arm64/memory.rst @@ -129,7 +129,7 @@ this logic. As a single binary will need to support both 48-bit and 52-bit VA spaces, the VMEMMAP must be sized large enough for 52-bit VAs and -also must be sized large enought to accommodate a fixed PAGE_OFFSET. +also must be sized large enough to accommodate a fixed PAGE_OFFSET. Most code in the kernel should not need to consider the VA_BITS, for code that does need to know the VA size the variables are -- 2.7.4
[RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
The vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size, then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export it in vmcoreinfo rather than the vabits_actual variable, as the name of the variable can change in future kernel versions, whereas architectural constructs like TCR_EL1.T1SZ are better suited to indicate specific fields to user-space. User-space utilities like makedumpfile and crash-utility need to read/write this value from/to vmcoreinfo for determining if a virtual address lies in the linear map range. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining the vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. 
http://lists.infradead.org/pipermail/kexec/2019-November/023992.html Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + 2 files changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index d9fbd433cc17..d2e7aff5821e 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -215,6 +215,7 @@ #define TCR_TxSZ(x)(TCR_T0SZ(x) | TCR_T1SZ(x)) #define TCR_TxSZ_WIDTH 6 #define TCR_T0SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T0SZ_OFFSET) +#define TCR_T1SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T1SZ_OFFSET) #define TCR_EPD0_SHIFT 7 #define TCR_EPD0_MASK (UL(1) << TCR_EPD0_SHIFT) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include +static inline u64 get_tcr_el1_t1sz(void); + +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} + void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", PHYS_OFFSET); + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", + get_tcr_el1_t1sz()); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); } -- 2.7.4
[RESEND PATCH v5 1/5] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 9f1557b98468..18175687133a 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void) VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS); VMCOREINFO_STRUCT_SIZE(mem_section); VMCOREINFO_OFFSET(mem_section, section_mem_map); + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); #endif VMCOREINFO_STRUCT_SIZE(page); VMCOREINFO_STRUCT_SIZE(pglist_data); -- 2.7.4
[RESEND PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
- Resending the v5 version as Will Deacon reported that the patchset was split into two separate threads while sending out. It was an issue with my 'msmtp' settings which now seems to be fixed. Please ignore all previous v5 versions. Changes since v4: - v4 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-November/023961.html - Addressed comments from Dave and added patches for documenting new variables appended to vmcoreinfo documentation. - Added testing report shared by Akashi for PATCH 2/5. Changes since v3: - v3 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022590.html - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo instead of PTRS_PER_PGD. - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in 'Documentation/arm64/memory.rst' Changes since v2: - v2 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022531.html - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu. - Updated vmcoreinfo documentation to add description about 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of the 52-bit address space feature in the underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like the ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. 
The reason for these regressions is that right now user-space tools have no direct access to these values (since they are not exported from the kernel) and hence need to rely on a best-guess method of determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by the underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (5): crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo Documentation/arm64: Fix a simple typo in memory.rst Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ' Documentation/admin-guide/kdump/vmcoreinfo.rst | 11 +++ Documentation/arm64/memory.rst | 2 +- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + kernel/crash_core.c| 1 + 5 files changed, 23 insertions(+), 1 deletion(-) -- 2.7.4
[PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Add documentation for TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. + arm === -- 2.7.4
[PATCH v5 4/5] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 007a6b86e0ee..447b64314f56 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[PATCH v5 3/5] Documentation/arm64: Fix a simple typo in memory.rst
Fix a simple typo in arm64/memory.rst Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/arm64/memory.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst index 02e02175e6f5..cf03b3290800 100644 --- a/Documentation/arm64/memory.rst +++ b/Documentation/arm64/memory.rst @@ -129,7 +129,7 @@ this logic. As a single binary will need to support both 48-bit and 52-bit VA spaces, the VMEMMAP must be sized large enough for 52-bit VAs and -also must be sized large enought to accommodate a fixed PAGE_OFFSET. +also must be sized large enough to accommodate a fixed PAGE_OFFSET. Most code in the kernel should not need to consider the VA_BITS, for code that does need to know the VA size the variables are -- 2.7.4
[PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size; then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export the same in vmcoreinfo rather than vabits_actual variable, as the name of the variable can change in future kernel versions, but the architectural constructs like TCR_EL1.T1SZ can be used better to indicate intended specific fields to user-space. User-space utilities like makedumpfile and crash-utility, need to read/write this value from/to vmcoreinfo for determining if a virtual address lies in the linear map range. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. 
http://lists.infradead.org/pipermail/kexec/2019-November/023992.html Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + 2 files changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index d9fbd433cc17..d2e7aff5821e 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -215,6 +215,7 @@ #define TCR_TxSZ(x)(TCR_T0SZ(x) | TCR_T1SZ(x)) #define TCR_TxSZ_WIDTH 6 #define TCR_T0SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T0SZ_OFFSET) +#define TCR_T1SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T1SZ_OFFSET) #define TCR_EPD0_SHIFT 7 #define TCR_EPD0_MASK (UL(1) << TCR_EPD0_SHIFT) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include +static inline u64 get_tcr_el1_t1sz(void); + +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} + void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", PHYS_OFFSET); + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", + get_tcr_el1_t1sz()); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); } -- 2.7.4
[PATCH v5 1/5] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for the 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here:

[0].
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma
---
 kernel/crash_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
--
2.7.4
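A minimal sketch of the user-space computation this export enables. Note that SECTION_SIZE_BITS below is an assumed per-arch constant chosen purely for illustration; a real consumer must use the correct value for its target architecture:

```c
#include <stdint.h>

/* Hypothetical sketch: with MAX_PHYSMEM_BITS available from vmcoreinfo,
 * user-space can derive SECTIONS_SHIFT exactly as the kernel does.
 * SECTION_SIZE_BITS is an assumption here, for illustration only. */
#define SECTION_SIZE_BITS 30

static inline unsigned int sections_shift(unsigned int max_physmem_bits)
{
	/* Mirrors: #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) */
	return max_physmem_bits - SECTION_SIZE_BITS;
}
```

With this sketch, a 48-bit physical address space gives a section shift of 18, and the 52-bit case gives 22, without guessing from kernel CONFIG flags.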
[PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v4:

- v4 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
- Addressed comments from Dave and added patches for documenting new
  variables appended to vmcoreinfo documentation.
- Added testing report shared by Akashi for PATCH 2/5.

Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
  ifdef sections, as suggested by Kazu.
- Updated vmcoreinfo documentation to add description about
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
  [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code
  (all archs)

This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels.
The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases.

In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (5):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  Documentation/arm64: Fix a simple typo in memory.rst
  Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
  Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'

 Documentation/admin-guide/kdump/vmcoreinfo.rst | 11 +++
 Documentation/arm64/memory.rst                 |  2 +-
 arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
 arch/arm64/kernel/crash_core.c                 |  9 +
 kernel/crash_core.c                            |  1 +
 5 files changed, 23 insertions(+), 1 deletion(-)

--
2.7.4
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Dave, On Thu, Nov 21, 2019 at 8:51 AM Dave Young wrote: > > On 11/11/19 at 01:31pm, Bhupesh Sharma wrote: > > Changes since v3: > > > > - v3 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > instead of PTRS_PER_PGD. > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > 'Documentation/arm64/memory.rst' > > > > Changes since v2: > > > > - v2 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM > > ifdef sections, as suggested by Kazu. > > - Updated vmcoreinfo documentation to add description about > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). > > > > Changes since v1: > > > > - v1 was sent out as a single patch which can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > - v2 breaks the single patch into two independent patches: > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code > > (all archs) > > > > This patchset primarily fixes the regression reported in user-space > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > with the availability of 52-bit address space feature in underlying > > kernel. These regressions have been reported both on CPUs which don't > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > support ARMv8.2 extensions and are running newer kernels. 
> > > > The reason for these regressions is that right now user-space tools > > have no direct access to these values (since these are not exported > > from the kernel) and hence need to rely on a best-guess method of > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > by underlying kernel. > > > > Exporting these values via vmcoreinfo will help user-land in such cases. > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > vmcoreinfo in the core code itself rather than in arm64 arch-specific > > code, so that the user-space code for other archs can also benefit from > > this addition to the vmcoreinfo and use it as a standard way of > > determining 'SECTIONS_SHIFT' value in user-land. > > > > Cc: Boris Petkov > > Cc: Ingo Molnar > > Cc: Thomas Gleixner > > Cc: Jonathan Corbet > > Cc: James Morse > > Cc: Mark Rutland > > Cc: Will Deacon > > Cc: Steve Capper > > Cc: Catalin Marinas > > Cc: Ard Biesheuvel > > Cc: Michael Ellerman > > Cc: Paul Mackerras > > Cc: Benjamin Herrenschmidt > > Cc: Dave Anderson > > Cc: Kazuhito Hagio > > Cc: x...@kernel.org > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-ker...@vger.kernel.org > > Cc: linux-...@vger.kernel.org > > Cc: ke...@lists.infradead.org > > > > Bhupesh Sharma (3): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Soft reminder: the new introduced vmcoreinfo needs documentation > > Please check Documentation/admin-guide/kdump/vmcoreinfo.rst Sure, will send a v5 to address the same. Thanks, Bhupesh
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
On Tue, Nov 19, 2019 at 12:03 PM Prabhakar Kushwaha wrote: > > Hi Akashi, > > On Fri, Nov 15, 2019 at 7:29 AM AKASHI Takahiro > wrote: > > > > Bhupesh, > > > > On Fri, Nov 15, 2019 at 01:24:17AM +0530, Bhupesh Sharma wrote: > > > Hi Akashi, > > > > > > On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro > > > wrote: > > > > > > > > Hi Bhupesh, > > > > > > > > Do you have a corresponding patch for userspace tools, > > > > including crash util and/or makedumpfile? > > > > Otherwise, we can't verify that a generated core file is > > > > correctly handled. > > > > > > Sure. I am still working on the crash-utility related changes, but you > > > can find the makedumpfile changes I posted a couple of days ago here > > > (see [0]) and the github link for the makedumpfile changes can be seen > > > via [1]. > > > > > > I will post the crash-util changes shortly as well. > > > Thanks for having a look at the same. > > > > Thank you. > > I have tested my kdump patch with a hacked version of crash > > where VA_BITS_ACTUAL is calculated from tcr_el1_t1sz in vmcoreinfo. > > > > I also did hack to calculate VA_BITS_ACTUAL is calculated from > tcr_el1_t1sz in vmcoreinfo. Now i am getting error same as mentioned > by you in other thread last month. > https://www.mail-archive.com/crash-utility@redhat.com/msg07385.html > > how this error was overcome? > > I am using > - crashkernel: https://github.com/crash-utility/crash.git commit: > babd7ae62d4e8fd6f93fd30b88040d9376522aa3 > and > - Linux: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > commit: af42d3466bdc8f39806b26f593604fdc54140bcb I will post a formal change for crash-utility shortly that fixes the same. Right now we are having issues with emails bouncing off 'crash-util...@redhat.com', so my patches sent to the same are in undelivered state at-the-moment. For easy testing I will share the link to my github tree (off-line) [which contains the changes] as well. Regards, Bhupesh
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Akashi, On Fri, Nov 15, 2019 at 7:29 AM AKASHI Takahiro wrote: > > Bhupesh, > > On Fri, Nov 15, 2019 at 01:24:17AM +0530, Bhupesh Sharma wrote: > > Hi Akashi, > > > > On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro > > wrote: > > > > > > Hi Bhupesh, > > > > > > Do you have a corresponding patch for userspace tools, > > > including crash util and/or makedumpfile? > > > Otherwise, we can't verify that a generated core file is > > > correctly handled. > > > > Sure. I am still working on the crash-utility related changes, but you > > can find the makedumpfile changes I posted a couple of days ago here > > (see [0]) and the github link for the makedumpfile changes can be seen > > via [1]. > > > > I will post the crash-util changes shortly as well. > > Thanks for having a look at the same. > > Thank you. > I have tested my kdump patch with a hacked version of crash > where VA_BITS_ACTUAL is calculated from tcr_el1_t1sz in vmcoreinfo. Thanks a lot for testing the changes. I will push the crash utility changes for review shortly and also Cc you to the patches. It would be great to have your Tested-by for this patch-set, if the user-space works fine for you with these changes. Regards, Bhupesh > -Takahiro Akashi > > > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023963.html > > [1]. > > https://github.com/bhupesh-sharma/makedumpfile/tree/52-bit-va-support-via-vmcore-upstream-v4 > > > > Regards, > > Bhupesh > > > > > > > > Thanks, > > > -Takahiro Akashi > > > > > > On Mon, Nov 11, 2019 at 01:31:19PM +0530, Bhupesh Sharma wrote: > > > > Changes since v3: > > > > > > > > - v3 can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > > > instead of PTRS_PER_PGD. 
> > > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > > > 'Documentation/arm64/memory.rst' > > > > > > > > Changes since v2: > > > > > > > > - v2 can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under > > > > CONFIG_SPARSEMEM > > > > ifdef sections, as suggested by Kazu. > > > > - Updated vmcoreinfo documentation to add description about > > > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). > > > > > > > > Changes since v1: > > > > > > > > - v1 was sent out as a single patch which can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > > > > > - v2 breaks the single patch into two independent patches: > > > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, > > > > whereas > > > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel > > > > code (all archs) > > > > > > > > This patchset primarily fixes the regression reported in user-space > > > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > > > with the availability of 52-bit address space feature in underlying > > > > kernel. These regressions have been reported both on CPUs which don't > > > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > > > support ARMv8.2 extensions and are running newer kernels. > > > > > > > > The reason for these regressions is that right now user-space tools > > > > have no direct access to these values (since these are not exported > > > > from the kernel) and hence need to rely on a best-guess method of > > > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > > > by underlying kernel. > > > > > > > > Exporting these values via vmcoreinfo will help user-land in such cases. 
> > > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > > > vmcoreinfo in the core code itself rather than in arm64
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Akashi, On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro wrote: > > Hi Bhupesh, > > Do you have a corresponding patch for userspace tools, > including crash util and/or makedumpfile? > Otherwise, we can't verify that a generated core file is > correctly handled. Sure. I am still working on the crash-utility related changes, but you can find the makedumpfile changes I posted a couple of days ago here (see [0]) and the github link for the makedumpfile changes can be seen via [1]. I will post the crash-util changes shortly as well. Thanks for having a look at the same. [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023963.html [1]. https://github.com/bhupesh-sharma/makedumpfile/tree/52-bit-va-support-via-vmcore-upstream-v4 Regards, Bhupesh > > Thanks, > -Takahiro Akashi > > On Mon, Nov 11, 2019 at 01:31:19PM +0530, Bhupesh Sharma wrote: > > Changes since v3: > > > > - v3 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > instead of PTRS_PER_PGD. > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > 'Documentation/arm64/memory.rst' > > > > Changes since v2: > > > > - v2 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM > > ifdef sections, as suggested by Kazu. > > - Updated vmcoreinfo documentation to add description about > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). 
> > > > Changes since v1: > > > > - v1 was sent out as a single patch which can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > - v2 breaks the single patch into two independent patches: > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code > > (all archs) > > > > This patchset primarily fixes the regression reported in user-space > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > with the availability of 52-bit address space feature in underlying > > kernel. These regressions have been reported both on CPUs which don't > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > support ARMv8.2 extensions and are running newer kernels. > > > > The reason for these regressions is that right now user-space tools > > have no direct access to these values (since these are not exported > > from the kernel) and hence need to rely on a best-guess method of > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > by underlying kernel. > > > > Exporting these values via vmcoreinfo will help user-land in such cases. > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > vmcoreinfo in the core code itself rather than in arm64 arch-specific > > code, so that the user-space code for other archs can also benefit from > > this addition to the vmcoreinfo and use it as a standard way of > > determining 'SECTIONS_SHIFT' value in user-land. 
> > > > Cc: Boris Petkov > > Cc: Ingo Molnar > > Cc: Thomas Gleixner > > Cc: Jonathan Corbet > > Cc: James Morse > > Cc: Mark Rutland > > Cc: Will Deacon > > Cc: Steve Capper > > Cc: Catalin Marinas > > Cc: Ard Biesheuvel > > Cc: Michael Ellerman > > Cc: Paul Mackerras > > Cc: Benjamin Herrenschmidt > > Cc: Dave Anderson > > Cc: Kazuhito Hagio > > Cc: x...@kernel.org > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-ker...@vger.kernel.org > > Cc: linux-...@vger.kernel.org > > Cc: ke...@lists.infradead.org > > > > Bhupesh Sharma (3): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Documentation/arm64: Fix a simple typo in memory.rst > > > > Documentation/arm64/memory.rst | 2 +- > > arch/arm64/include/asm/pgtable-hwdef.h | 1 + > > arch/arm64/kernel/crash_core.c | 9 + > > kernel/crash_core.c| 1 + > > 4 files changed, 12 insertions(+), 1 deletion(-) > > > > -- > > 2.7.4 > > >
[PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
  ifdef sections, as suggested by Kazu.
- Updated vmcoreinfo documentation to add description about
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
  [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code
  (all archs)

This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels.

The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by underlying kernel.

Exporting these values via vmcoreinfo will help user-land in such cases.
In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (3):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  Documentation/arm64: Fix a simple typo in memory.rst

 Documentation/arm64/memory.rst         | 2 +-
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 arch/arm64/kernel/crash_core.c         | 9 +
 kernel/crash_core.c                    | 1 +
 4 files changed, 12 insertions(+), 1 deletion(-)

--
2.7.4
[PATCH v4 1/3] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for the 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here:

[0].
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma
---
 kernel/crash_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
--
2.7.4
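The exported values reach user-space as plain "NUMBER(name)=value" strings in the vmcoreinfo note. A hypothetical parser for one such line (an illustrative sketch, not the actual makedumpfile or crash code) could look like this:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: parse one "NUMBER(name)=value" line of the kind
 * appended to vmcoreinfo.  Handles both decimal values (as emitted by
 * VMCOREINFO_NUMBER) and "0x"-prefixed hex values.  Returns 1 on a
 * successful match of the requested name, 0 otherwise. */
static int parse_vmcoreinfo_number(const char *line, const char *name,
				   uint64_t *val)
{
	char key[64];
	long long v;

	/* %lli auto-detects the base, so "46" and "0x10" both work */
	if (sscanf(line, "NUMBER(%63[^)])=%lli", key, &v) != 2)
		return 0;
	if (strcmp(key, name) != 0)
		return 0;
	*val = (uint64_t)v;
	return 1;
}
```

A consumer would iterate over the vmcoreinfo lines and call this once per variable of interest, e.g. for "MAX_PHYSMEM_BITS" or "tcr_el1_t1sz".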
[RESEND PATCH] Documentation/stackprotector: powerpc supports stack protector
The powerpc architecture (both 64-bit and 32-bit) has supported the stack protector mechanism for some time now [see commit 06ec27aea9fc ("powerpc/64: add stack protector support")]. Update the stackprotector arch support documentation to reflect the same.

Cc: Jonathan Corbet
Cc: Michael Ellerman
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Bhupesh Sharma
---
Resend, this time Cc'ing Jonathan and doc-list.

 Documentation/features/debug/stackprotector/arch-support.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt
index ea521f3e..32bbdfc64c32 100644
--- a/Documentation/features/debug/stackprotector/arch-support.txt
+++ b/Documentation/features/debug/stackprotector/arch-support.txt
@@ -22,7 +22,7 @@
 |       nios2: | TODO |
 |    openrisc: | TODO |
 |      parisc: | TODO |
-|     powerpc: | TODO |
+|     powerpc: |  ok  |
 |       riscv: | TODO |
 |        s390: | TODO |
 |          sh: |  ok  |
--
2.7.4
Re: [PATCH] Documentation/stackprotector: powerpc supports stack protector
Hi Jonathan, On Fri, May 31, 2019 at 8:44 PM Michael Ellerman wrote: > > Jonathan Corbet writes: > > On Thu, 30 May 2019 18:37:46 +0530 > > Bhupesh Sharma wrote: > > > >> > This should probably go via the documentation tree? > >> > > >> > Acked-by: Michael Ellerman > >> > >> Thanks for the review Michael. > >> I am ok with this going through the documentation tree as well. > > > > Works for me too, but I don't seem to find the actual patch anywhere I > > look. Can you send me a copy? > > You can get it from lore: > > > https://lore.kernel.org/linuxppc-dev/1559212177-7072-1-git-send-email-bhsha...@redhat.com/raw > > Or patchwork (automatically adds my ack): > > https://patchwork.ozlabs.org/patch/1107706/mbox/ > > Or Bhupesh can send it to you :) Please let me know if I should send out the patch again, this time Cc'ing you and the doc-list. Thanks, Bhupesh
Re: [PATCH 22/22] docs: fix broken documentation links
or details, see Documentation/x86/intel_mpx.rst > > If unsure, say N. > > @@ -1911,7 +1911,7 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS > page-based protections, but without requiring modification of the > page tables when an application changes protection domains. > > - For details, see Documentation/x86/protection-keys.txt > + For details, see Documentation/x86/protection-keys.rst > > If unsure, say y. > > diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug > index f730680dc818..59f598543203 100644 > --- a/arch/x86/Kconfig.debug > +++ b/arch/x86/Kconfig.debug > @@ -156,7 +156,7 @@ config IOMMU_DEBUG > code. When you use it make sure you have a big enough > IOMMU/AGP aperture. Most of the options enabled by this can > be set more finegrained using the iommu= command line > - options. See Documentation/x86/x86_64/boot-options.txt for more > + options. See Documentation/x86/x86_64/boot-options.rst for more > details. > > config IOMMU_LEAK > diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S > index 850b8762e889..90d791ca1a95 100644 > --- a/arch/x86/boot/header.S > +++ b/arch/x86/boot/header.S > @@ -313,7 +313,7 @@ start_sys_seg:.word SYSSEG # obsolete and > meaningless, but just > > type_of_loader: .byte 0 # 0 means ancient bootloader, > newer > # bootloaders know to change this. > - # See Documentation/x86/boot.txt for > + # See Documentation/x86/boot.rst for > # assigned ids > > # flags, unused bits must be zero (RFU) bit within loadflags > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S > index 11aa3b2afa4d..33f9fc38d014 100644 > --- a/arch/x86/entry/entry_64.S > +++ b/arch/x86/entry/entry_64.S > @@ -8,7 +8,7 @@ >* >* entry.S contains the system-call and fault low-level handling routines. 
>* > - * Some of this is documented in Documentation/x86/entry_64.txt > + * Some of this is documented in Documentation/x86/entry_64.rst >* >* A note on terminology: >* - iret frame:Architecture defined interrupt frame from SS to RIP > diff --git a/arch/x86/include/asm/bootparam_utils.h > b/arch/x86/include/asm/bootparam_utils.h > index f6f6ef436599..101eb944f13c 100644 > --- a/arch/x86/include/asm/bootparam_utils.h > +++ b/arch/x86/include/asm/bootparam_utils.h > @@ -24,7 +24,7 @@ static void sanitize_boot_params(struct boot_params > *boot_params) >* IMPORTANT NOTE TO BOOTLOADER AUTHORS: do not simply clear >* this field. The purpose of this field is to guarantee >* compliance with the x86 boot spec located in > - * Documentation/x86/boot.txt . That spec says that the > + * Documentation/x86/boot.rst . That spec says that the >* *whole* structure should be cleared, after which only the >* portion defined by struct setup_header (boot_params->hdr) >* should be copied in. > diff --git a/arch/x86/include/asm/page_64_types.h > b/arch/x86/include/asm/page_64_types.h > index 793c14c372cb..288b065955b7 100644 > --- a/arch/x86/include/asm/page_64_types.h > +++ b/arch/x86/include/asm/page_64_types.h > @@ -48,7 +48,7 @@ > > #define __START_KERNEL_map _AC(0x8000, UL) > > -/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. > */ > +/* See Documentation/x86/x86_64/mm.rst for a description of the memory map. > */ > > #define __PHYSICAL_MASK_SHIFT 52 > > diff --git a/arch/x86/include/asm/pgtable_64_types.h > b/arch/x86/include/asm/pgtable_64_types.h > index 88bca456da99..52e5f5f2240d 100644 > --- a/arch/x86/include/asm/pgtable_64_types.h > +++ b/arch/x86/include/asm/pgtable_64_types.h > @@ -103,7 +103,7 @@ extern unsigned int ptrs_per_p4d; > #define PGDIR_MASK (~(PGDIR_SIZE - 1)) > > /* > - * See Documentation/x86/x86_64/mm.txt for a description of the memory map. > + * See Documentation/x86/x86_64/mm.rst for a description of the memory map. 
>* >* Be very careful vs. KASLR when changing anything here. The KASLR address >* range must not overlap with anything except the KASAN shadow area, which > diff --git a/arch/x86/kernel/cpu/microcode/amd.c > b/arch/x86/kernel/cpu/microcode/amd.c > index e1f3ba19ba54..06d4e67f31ab 100644 > --- a/arch/x86/kernel/cpu/microcode/amd.c > +++ b/arch/x86/kernel/cpu/microcode/amd.c > @@ -61,7 +61,7 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE]; > > /* >* Microcode patch container file is prepended to the initrd in cpio > - * format. See Documentation/x86/microcode.txt > + * format. See Documentation/x86/microcode.rst >*/ > static const char > ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin"; > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index 22f60dd26460..b07e7069b09e 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -416,7 +416,7 @@ static void *bzImage64_load(struct kimage *image, char > *kernel, > efi_map_offset = params_cmdline_sz; > efi_setup_data_offset = efi_map_offset + ALIGN(efi_map_sz, 16); > > - /* Copy setup header onto bootparams. Documentation/x86/boot.txt */ > + /* Copy setup header onto bootparams. Documentation/x86/boot.rst */ > setup_header_size = 0x0202 + kernel[0x0201] - setup_hdr_offset; For the arm, arm64 and x86 kexec bits: Reviewed-by: Bhupesh Sharma Thanks, Bhupesh
Re: [PATCH] Documentation/stackprotector: powerpc supports stack protector
On Thu, May 30, 2019 at 6:25 PM Michael Ellerman wrote: > > Bhupesh Sharma writes: > > powerpc architecture (both 64-bit and 32-bit) supports stack protector > > mechanism since some time now [see commit 06ec27aea9fc ("powerpc/64: > > add stack protector support")]. > > > > Update stackprotector arch support documentation to reflect the same. > > > > Signed-off-by: Bhupesh Sharma > > --- > > Documentation/features/debug/stackprotector/arch-support.txt | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/Documentation/features/debug/stackprotector/arch-support.txt > > b/Documentation/features/debug/stackprotector/arch-support.txt > > index ea521f3e..32bbdfc64c32 100644 > > --- a/Documentation/features/debug/stackprotector/arch-support.txt > > +++ b/Documentation/features/debug/stackprotector/arch-support.txt > > @@ -22,7 +22,7 @@ > > | nios2: | TODO | > > |openrisc: | TODO | > > | parisc: | TODO | > > -| powerpc: | TODO | > > +| powerpc: | ok | > > | riscv: | TODO | > > |s390: | TODO | > > | sh: | ok | > > -- > > 2.7.4 > > Thanks. > > This should probably go via the documentation tree? > > Acked-by: Michael Ellerman Thanks for the review Michael. I am ok with this going through the documentation tree as well. Regards, Bhupesh
[PATCH] Documentation/stackprotector: powerpc supports stack protector
The powerpc architecture (both 64-bit and 32-bit) has supported the stack protector mechanism for some time now [see commit 06ec27aea9fc ("powerpc/64: add stack protector support")]. Update stackprotector arch support documentation to reflect the same. Signed-off-by: Bhupesh Sharma --- Documentation/features/debug/stackprotector/arch-support.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt index ea521f3e..32bbdfc64c32 100644 --- a/Documentation/features/debug/stackprotector/arch-support.txt +++ b/Documentation/features/debug/stackprotector/arch-support.txt @@ -22,7 +22,7 @@ | nios2: | TODO | |openrisc: | TODO | | parisc: | TODO | -| powerpc: | TODO | +| powerpc: | ok | | riscv: | TODO | |s390: | TODO | | sh: | ok | -- 2.7.4
[PATCH] include/kcore: Remove left-over instances of 'kclist_add_remap()'
Commit bf904d2762ee ("x86/pti/64: Remove the SYSCALL64 entry trampoline") removed the sole usage of 'kclist_add_remap()' from 'arch/x86/mm/cpu_entry_area.c', but it did not remove the left-over definition from the include file. Fix the same. Cc: James Morse Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Dave Anderson Cc: Dave Young Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- include/linux/kcore.h | 11 --- 1 file changed, 11 deletions(-) diff --git a/include/linux/kcore.h b/include/linux/kcore.h index c843f4a9c512..da676cdbd727 100644 --- a/include/linux/kcore.h +++ b/include/linux/kcore.h @@ -38,12 +38,6 @@ struct vmcoredd_node { #ifdef CONFIG_PROC_KCORE void __init kclist_add(struct kcore_list *, void *, size_t, int type); -static inline -void kclist_add_remap(struct kcore_list *m, void *addr, void *vaddr, size_t sz) -{ - m->vaddr = (unsigned long)vaddr; - kclist_add(m, addr, sz, KCORE_REMAP); -} extern int __init register_mem_pfn_is_ram(int (*fn)(unsigned long pfn)); #else @@ -51,11 +45,6 @@ static inline void kclist_add(struct kcore_list *new, void *addr, size_t size, int type) { } - -static inline -void kclist_add_remap(struct kcore_list *m, void *addr, void *vaddr, size_t sz) -{ -} #endif #endif /* _LINUX_KCORE_H */ -- 2.7.4
[PATCH v3 3/3] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/kdump/vmcoreinfo.txt | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/kdump/vmcoreinfo.txt b/Documentation/kdump/vmcoreinfo.txt index bb94a4bd597a..f5a11388dc49 100644 --- a/Documentation/kdump/vmcoreinfo.txt +++ b/Documentation/kdump/vmcoreinfo.txt @@ -95,6 +95,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[PATCH v3 2/3] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..495f09084696 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -415,6 +415,7 @@ static int __init crash_save_vmcoreinfo_init(void) VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS); VMCOREINFO_STRUCT_SIZE(mem_section); VMCOREINFO_OFFSET(mem_section, section_mem_map); + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); #endif VMCOREINFO_STRUCT_SIZE(page); VMCOREINFO_STRUCT_SIZE(pglist_data); -- 2.7.4
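The user-space consumption this patch enables can be sketched as follows: read the `NUMBER(MAX_PHYSMEM_BITS)=` line from the vmcoreinfo note and derive SECTIONS_SHIFT exactly as the kernel macro does. The sample note text and the SECTION_SIZE_BITS value below are illustrative assumptions, not details taken from the referenced makedumpfile implementation.

```python
# Sketch of how a user-space tool such as makedumpfile could use the new
# export: parse NUMBER(MAX_PHYSMEM_BITS)= out of the vmcoreinfo note and
# compute SECTIONS_SHIFT the same way the kernel does.

def parse_vmcoreinfo_number(note_text, name):
    """Return the integer value of a NUMBER(name)=value vmcoreinfo line."""
    prefix = "NUMBER(%s)=" % name
    for line in note_text.splitlines():
        if line.startswith(prefix):
            return int(line[len(prefix):], 0)
    return None

def sections_shift(max_physmem_bits, section_size_bits):
    # Mirrors: #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
    return max_physmem_bits - section_size_bits

sample = "PAGESIZE=65536\nNUMBER(MAX_PHYSMEM_BITS)=52\n"  # assumed note contents
bits = parse_vmcoreinfo_number(sample, "MAX_PHYSMEM_BITS")
print(sections_shift(bits, 30))  # with an assumed SECTION_SIZE_BITS of 30 -> 22
```

With the value exported, no per-arch guessing (or reading of kernel CONFIG flags) is needed on the tool side.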
[PATCH v3 0/3] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v2: - v2 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022531.html - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu. - Updated vmcoreinfo documentation to add description about 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (3): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' Documentation/kdump/vmcoreinfo.txt | 5 + arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 3 files changed, 7 insertions(+) -- 2.7.4
Re: [PATCH v2 2/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Hi Kazu, On 03/13/2019 01:17 AM, Kazuhito Hagio wrote: Hi Bhupesh, -Original Message- Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..44b90368e183 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -467,6 +467,7 @@ static int __init crash_save_vmcoreinfo_init(void) #define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline) VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE); #endif + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); Some architectures define MAX_PHYSMEM_BITS only with CONFIG_SPARSEMEM, so we need to move this to the #ifdef section that exports some mem_section things. Thanks! Kazu Sorry for the late response, I wanted to make sure I check almost all archs to understand if a proposal would work for all. As per my current understanding, we can protect the export of 'MAX_PHYSMEM_BITS' via a #ifdef section against CONFIG_SPARSEMEM, and it should work for all archs. Here are some arguments to support the same; I would request maintainers of various archs (in Cc) to correct me if I am missing something here: 1. SPARSEMEM is dependent upon (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL: config SPARSEMEM def_bool y depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL 2. For a couple of archs, this option is already turned on by default in their respective defconfigs: $ grep -nrw "CONFIG_SPARSEMEM_MANUAL" * arch/ia64/configs/gensparse_defconfig:18:CONFIG_SPARSEMEM_MANUAL=y arch/powerpc/configs/ppc64e_defconfig:30:CONFIG_SPARSEMEM_MANUAL=y 3. 
Note that other archs use ARCH_SPARSEMEM_DEFAULT to define if CONFIG_SPARSEMEM_MANUAL is set by default: choice prompt "Memory model" .. default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT 3a. $ grep -nrw -A 2 "ARCH_SPARSEMEM_DEFAULT" * arch/s390/Kconfig:621:config ARCH_SPARSEMEM_DEFAULT arch/s390/Kconfig-622- def_bool y -- arch/x86/Kconfig:1623:config ARCH_SPARSEMEM_DEFAULT arch/x86/Kconfig-1624- def_bool y arch/x86/Kconfig-1625- depends on X86_64 -- arch/powerpc/Kconfig:614:config ARCH_SPARSEMEM_DEFAULT arch/powerpc/Kconfig-615- def_bool y arch/powerpc/Kconfig-616- depends on PPC_BOOK3S_64 -- arch/arm64/Kconfig:850:config ARCH_SPARSEMEM_DEFAULT arch/arm64/Kconfig-851- def_bool ARCH_SPARSEMEM_ENABLE -- arch/sh/mm/Kconfig:138:config ARCH_SPARSEMEM_DEFAULT arch/sh/mm/Kconfig-139- def_bool y -- arch/sparc/Kconfig:315:config ARCH_SPARSEMEM_DEFAULT arch/sparc/Kconfig-316- def_bool y if SPARC64 -- arch/arm/Kconfig:1591:config ARCH_SPARSEMEM_DEFAULT arch/arm/Kconfig-1592- def_bool ARCH_SPARSEMEM_ENABLE Since most archs (except MIPS) set CONFIG_ARCH_SPARSEMEM_DEFAULT/CONFIG_ARCH_SPARSEMEM_ENABLE to y in the default configurations, so even though they don't protect 'MAX_PHYSMEM_BITS' define in CONFIG_SPARSEMEM ifdef sections, we still would be ok protecting the 'MAX_PHYSMEM_BITS' vmcoreinfo export inside a CONFIG_SPARSEMEM ifdef section. Thanks for your inputs, I will include this change in the v3. Regards, Bhupesh
Re: [PATCH v2 0/2] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Dave, On 03/11/2019 02:35 PM, Dave Young wrote: Hi Bhupesh, On 03/10/19 at 03:34pm, Bhupesh Sharma wrote: Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu) during v1 review, it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (2): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 2 files changed, 2 insertions(+) Lianbo's document patch has been merged, would you mind to add vmcoreinfo doc patch as well in your series? Thanks for the inputs. Will add it to the v3. Let's wait for other comments/reviews, before I spin a version 3. Regards, Bhupesh
[PATCH v2 2/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..44b90368e183 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -467,6 +467,7 @@ static int __init crash_save_vmcoreinfo_init(void) #define PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline) VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE); #endif + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); arch_crash_save_vmcoreinfo(); update_vmcoreinfo_note(); -- 2.7.4
[PATCH v2 0/2] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu) during v1 review, it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (2): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 2 files changed, 2 insertions(+) -- 2.7.4
Re: [PATCH v2 7/9] arm64: kdump: No need to mark crashkernel pages manually PG_reserved
Hi David, On Mon, Jan 14, 2019 at 6:30 PM David Hildenbrand wrote: > > The crashkernel is reserved via memblock_reserve(). memblock_free_all() > will call free_low_memory_core_early(), which will go over all reserved > memblocks, marking the pages as PG_reserved. > > So manually marking pages as PG_reserved is not necessary, they are > already in the desired state (otherwise they would have been handed over > to the buddy as free pages and bad things would happen). > > Cc: Catalin Marinas > Cc: Will Deacon > Cc: James Morse > Cc: Bhupesh Sharma > Cc: David Hildenbrand > Cc: Mark Rutland > Cc: Dave Kleikamp > Cc: Andrew Morton > Cc: Mike Rapoport > Cc: Michal Hocko > Cc: Florian Fainelli > Cc: Stefan Agner > Cc: Laura Abbott > Cc: Greg Hackmann > Cc: Johannes Weiner > Cc: Kristina Martsenko > Cc: CHANDAN VN > Cc: AKASHI Takahiro > Cc: Logan Gunthorpe > Reviewed-by: Matthias Brugger > Signed-off-by: David Hildenbrand > --- > arch/arm64/kernel/machine_kexec.c | 2 +- > arch/arm64/mm/init.c | 27 --- > 2 files changed, 1 insertion(+), 28 deletions(-) > > diff --git a/arch/arm64/kernel/machine_kexec.c > b/arch/arm64/kernel/machine_kexec.c > index 6f0587b5e941..66b5d697d943 100644 > --- a/arch/arm64/kernel/machine_kexec.c > +++ b/arch/arm64/kernel/machine_kexec.c > @@ -321,7 +321,7 @@ void crash_post_resume(void) > * but does not hold any data of loaded kernel image. > * > * Note that all the pages in crash dump kernel memory have been initially > - * marked as Reserved in kexec_reserve_crashkres_pages(). > + * marked as Reserved as memory was allocated via memblock_reserve(). > * > * In hibernation, the pages which are Reserved and yet "nosave" are excluded > * from the hibernation iamge. 
crash_is_nosave() does thich check for crash > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 7205a9085b4d..c38976b70069 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -118,35 +118,10 @@ static void __init reserve_crashkernel(void) > crashk_res.start = crash_base; > crashk_res.end = crash_base + crash_size - 1; > } > - > -static void __init kexec_reserve_crashkres_pages(void) > -{ > -#ifdef CONFIG_HIBERNATION > - phys_addr_t addr; > - struct page *page; > - > - if (!crashk_res.end) > - return; > - > - /* > -* To reduce the size of hibernation image, all the pages are > -* marked as Reserved initially. > -*/ > - for (addr = crashk_res.start; addr < (crashk_res.end + 1); > - addr += PAGE_SIZE) { > - page = phys_to_page(addr); > - SetPageReserved(page); > - } > -#endif > -} > #else > static void __init reserve_crashkernel(void) > { > } > - > -static void __init kexec_reserve_crashkres_pages(void) > -{ > -} > #endif /* CONFIG_KEXEC_CORE */ > > #ifdef CONFIG_CRASH_DUMP > @@ -586,8 +561,6 @@ void __init mem_init(void) > /* this will put all unused low memory onto the freelists */ > memblock_free_all(); > > - kexec_reserve_crashkres_pages(); > - > mem_init_print_info(NULL); > > /* > -- > 2.17.2 LGTM, so: Reviewed-by: Bhupesh Sharma
Re: [PATCH v2 6/9] arm64: kexec: no need to ClearPageReserved()
Hi David, Thanks for the patch. On Mon, Jan 14, 2019 at 6:29 PM David Hildenbrand wrote: > > This will be done by free_reserved_page(). > > Cc: Catalin Marinas > Cc: Will Deacon > Cc: Bhupesh Sharma > Cc: James Morse > Cc: Marc Zyngier > Cc: Dave Kleikamp > Cc: Mark Rutland > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Matthew Wilcox > Acked-by: James Morse > Signed-off-by: David Hildenbrand > --- > arch/arm64/kernel/machine_kexec.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/arm64/kernel/machine_kexec.c > b/arch/arm64/kernel/machine_kexec.c > index aa9c94113700..6f0587b5e941 100644 > --- a/arch/arm64/kernel/machine_kexec.c > +++ b/arch/arm64/kernel/machine_kexec.c > @@ -361,7 +361,6 @@ void crash_free_reserved_phys_range(unsigned long begin, > unsigned long end) > > for (addr = begin; addr < end; addr += PAGE_SIZE) { > page = phys_to_page(addr); > - ClearPageReserved(page); > free_reserved_page(page); > } > } > -- > 2.17.2 > Reviewed-by: Bhupesh Sharma
Re: [kernel-hardening] [PATCH] powerpc: Increase ELF_ET_DYN_BASE to 1TB for 64-bit applications
On Wed, Jun 7, 2017 at 2:59 PM, Michael Ellerman wrote: > Daniel Micay writes: > >> Rather than doing this, the base should just be split for an ELF >> interpreter like PaX. > > I don't quite parse that, I think you mean PaX uses a different base for > an ELF interpreter vs a regular ET_DYN? I am also not very conversant with PaX. AFAIU, we can use the following methods to print the shared object dependencies instead of ldd: 1. One can load the binary directly with LD_TRACE_LOADED_OBJECTS=1. So, instead of: # /lib64/ld-2.24.so ./large-bss-test-app Segmentation fault (core dumped) One can do: # LD_TRACE_LOADED_OBJECTS=1 ./large-bss-test-app linux-vdso64.so.1 (0x7fffa67a) libc.so.6 => /lib64/libc.so.6 (0x7fffa659) /lib64/ld64.so.2 (0x7fffa67c) 2. There are other utils like pax-utils etc that we can use. But, we generally cannot force a user to not use ldd to determine the shared object dependencies, especially when all the documentation points to it and it works well on the other archs like x86 and arm64. > That would be cool. How do you know that it's an ELF interpreter you're > loading? Is it just something that's PIE but doesn't request an > interpreter? > > Is the PaX code somewhere I can look at? > >> It makes sense for a standalone executable to be as low in the address >> space as possible. > > More or less. There are performance reasons why 1T could be good for us, > but I want to see some performance numbers to justify that change. And > it does mean you have a bit less address space to play with. Do you have any specific performance test(s) in mind which I can run to see how the 1TB impacts them? I am trying to run ltp after this change and will be able to share the results shortly, but I am not sure it provides the right data to validate such a change. Regards, Bhupesh
[PATCH] powerpc: Increase ELF_ET_DYN_BASE to 1TB for 64-bit applications
Since 7e60d1b427c51cf2525e5d736a71780978cfb828, the ELF_ET_DYN_BASE for powerpc applications has been set to 512MB. Recently there have been several reports of applications SEGV'ing and newer versions of glibc also SEGV'ing (while testing) when using the following test method: LD_LIBRARY_PATH=/XXX/lib /XXX/lib/ld-2.24.so For reproducing the above, consider the following test application which uses a larger bss: 1. # cat large-bss-test-app.c #include <stdio.h> #include <stdlib.h> #define VECSIZE (1024 * 1024 * 100) float p[VECSIZE], v1[VECSIZE], v2[VECSIZE]; void vec_mult(long int N) { long int i; for (i = 0; i < N; i++) p[i] = v1[i] * v2[i]; } int main() { char command[1024]; sprintf(command,"cat /proc/%d/maps",getpid()); system(command); vec_mult(VECSIZE/100); printf ("Done\n"); return 0; } 2. Compile it using gcc (I am using gcc-6.3.1): # gcc -g -o large-bss-test-app large-bss-test-app.c 3. Running the compiled application with ld.so directly is enough to trigger the SEGV on ppc64le/ppc64: # /lib64/ld-2.24.so ./large-bss-test-app Segmentation fault (core dumped) 4. Notice it's random depending on the layout changes, so it passes on some occasions as well: # /lib64/ld-2.24.so ./large-bss-test-app 1000-1001 r-xp fd:00 2883597 /root/large-bss-test-app 1001-1002 r--p fd:00 2883597 /root/large-bss-test-app 1002-1003 rw-p 0001 fd:00 2883597 /root/large-bss-test-app 1003-5b03 rw-p 00:00 0 5e95-5e99 r-xp fd:00 1180301 /usr/lib64/ld-2.24.so 5e99-5e9a r--p 0003 fd:00 1180301 /usr/lib64/ld-2.24.so 5e9a-5e9b rw-p 0004 fd:00 1180301 /usr/lib64/ld-2.24.so 3fffa368-3fffa386 r-xp fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa386-3fffa387 r--p 001d fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa387-3fffa388 rw-p 001e fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa389-3fffa38b r-xp 00:00 0 [vdso] 3fffc674-3fffc677 rw-p 00:00 0 [stack] Done One way to fix this is to move ELF_ET_DYN_BASE from 0x2000 (512MB) to 0x100 (1TB), at least for 64-bit applications. 
This allows hopefully enough space for most of the applications without causing them to trample upon the ld.so, leading to a SEGV. ELF_ET_DYN_BASE is still kept as 0x2000 (512MB) for 32-bit applications to preserve their compatibility. After this change, the layout for the 'large-bss-test-app' changes as shown below: # /lib64/ld-2.24.so ./large-bss-test-app 1000-1001 r-xp fd:00 2107527 /root/large-bss-test-app 1001-1002 r--p fd:00 2107527 /root/large-bss-test-app 1002-1003 rw-p 0001 fd:00 2107527 /root/large-bss-test-app 1003-5b03 rw-p 00:00 0 100283b-100283f r-xp fd:00 1835645 /usr/lib64/ld-2.24.so 100283f-1002840 r--p 0003 fd:00 1835645 /usr/lib64/ld-2.24.so 1002840-1002841 rw-p 0004 fd:00 1835645 /usr/lib64/ld-2.24.so 7fff8a47-7fff8a65 r-xp fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a65-7fff8a66 r--p 001d fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a66-7fff8a67 rw-p 001e fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a68-7fff8a6a r-xp 00:00 0 [vdso] 7fffc6d9-7fffc6dc rw-p 00:00 0 [stack] Done Cc: Anton Blanchard Cc: Daniel Cashman Cc: Kees Cook Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Signed-off-by: Bhupesh Sharma --- arch/powerpc/include/asm/elf.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h index 09bde6e..683230c 100644 --- a/arch/powerpc/include/asm/elf.h +++ b/arch/powerpc/include/asm/elf.h @@ -28,7 +28,9 @@ the loader. We need to make sure that it is out of the way of the program that it will "exec", and that there is sufficient room for the brk. */ -#define ELF_ET_DYN_BASE0x2000 +/* Keep thi
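The collision being fixed can be sanity-checked with back-of-the-envelope arithmetic: the test program's BSS (three float arrays of 100M elements each) is roughly 1.2 GB, so an interpreter mapped at a 512 MB ELF_ET_DYN_BASE can land inside it, while a 1 TB base cannot. The executable load address used below is an assumed value for illustration, not one taken from the maps output.

```python
# Overlap check between the test program's BSS and the ELF interpreter base.
VECSIZE = 1024 * 1024 * 100   # from large-bss-test-app.c
BSS_BYTES = 3 * VECSIZE * 4   # p, v1 and v2 are arrays of 4-byte floats

def interp_collides(exe_base, bss_bytes, elf_et_dyn_base):
    """True if the interpreter base falls inside [exe_base, exe_base + bss)."""
    return exe_base <= elf_et_dyn_base < exe_base + bss_bytes

exe_base = 16 * 1024 * 1024   # assumed low load address of the executable

print(interp_collides(exe_base, BSS_BYTES, 512 * 1024 * 1024))   # 512 MB base -> True
print(interp_collides(exe_base, BSS_BYTES, 1 << 40))             # 1 TB base   -> False
```

This matches the observed behaviour: with the old 512 MB base the layout only SEGVs when the mappings happen to overlap, and moving the base to 1 TB removes the overlap entirely for this class of program.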
Re: [PATCH v2] powerpc/mm: Add support for runtime configuration of ASLR limits
Hi Michael, Thanks for the v2. It looks good. On Thu, Apr 20, 2017 at 8:06 PM, Michael Ellerman wrote: > Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two > sysctls that allow a user to configure the number of bits of randomness used > for > ASLR. > > Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to > construct at least the MIN value in Kconfig, vs in a header which would be > more > natural. Given that we just go ahead and do it all in Kconfig. > > At least according to the code (the documentation makes no mention of it), the > value is defined as the number of bits of randomisation *of the page*, not the > address. This makes some sense, with larger page sizes more of the low bits > are > forced to zero, which would reduce the randomisation if we didn't take the > PAGE_SIZE into account. However it does mean the min/max values have to change > depending on the PAGE_SIZE in order to actually limit the amount of address > space consumed by the randomisation. > > The result of that is that we have to define the default values based on both > 32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we > have 128TB address space support on Book3S, we also have to take that into > account. > > Finally we can wire up the value in arch_mmap_rnd(). > > Signed-off-by: Michael Ellerman > Signed-off-by: Bhupesh Sharma > --- > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 11 ++- > 2 files changed, 50 insertions(+), 5 deletions(-) > > v2: Fix the 32-bit MAX value incorrectly using MIN as spotted by Kees. > > Kees/Bhupesh, would love a Review/Ack/Tested-by from you, I'll plan to merge > this later today (Friday) my time. 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8a095c..6f0503951e94 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MAX > + # On Book3S 64, the default virtual address space for 64-bit processes > + # is 2^47 (128TB). As a maximum, allow randomisation to consume up to > + # 32T of address space (2^45), which should ensure a reasonable gap > + # between bottom-up and top-down allocations for applications that > + # consume "normal" amounts of address space. Book3S 64 only supports > 64K > + # and 4K page sizes. > + default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 > (64K) > + default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K) > + # > + # On all other 64-bit platforms (currently only Book3E), the virtual > + # address space is 2^46 (64TB). Allow randomisation to consume up to > 16T > + # of address space (2^44). Only 4K page sizes are supported. > + default 32 if 64BIT # 32 = 44 (16T) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MAX > + > +config ARCH_MMAP_RND_BITS_MIN > + # Allow randomisation to consume up to 1GB of address space (2^30). > + default 14 if 64BIT && PPC_64K_PAGES# 14 = 30 (1GB) - 16 (64K) > + default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 512MB of address space (2^29). 
> + default 11 if PPC_256K_PAGES# 11 = 29 (512MB) - 18 (256K) > + default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K) > + default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K) > + default 17 # 17 = 29 (512MB) - 12 (4K) > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 8MB of address space (2^23). > + default 5 if PPC_256K_PAGES # 5 = 23 (8MB) - 18 (256K) > + default 7 if PPC_64K_PAGES # 7 = 23 (8MB) - 16 (64K) > + default 9 if PPC_16K_PAGES # 9 = 23 (8MB) - 14 (16K) > + default 11 # 11 = 23 (8MB) - 12 (4K) > + > config HAVE_SETUP_PER_CPU_AREA > def_bool PPC64 > > @@ -120,6 +162,8 @@ config PPC > select HAVE_ARCH_HARDENED_USERCOPY > select HAVE_ARCH_JUMP_LABEL > select HAVE_ARCH_KGDB >
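The arithmetic in the quoted Kconfig comments can be sanity-checked with a small userspace sketch (not kernel code; the `rnd_bits` helper is hypothetical): each `default N` is just the log2 of the randomisation budget in bytes minus the page shift.

```python
# Hypothetical check of the Kconfig comment arithmetic quoted above:
# each "default N" equals budget_bits - page_shift, where the randomisation
# budget is 2**budget_bits bytes and a page is 2**page_shift bytes.

def rnd_bits(budget_bits, page_shift):
    return budget_bits - page_shift

# Book3S 64: randomisation budget of 32T (2^45)
assert rnd_bits(45, 16) == 29          # 64K pages
assert rnd_bits(45, 12) == 33          # 4K pages
# Book3E 64: budget of 16T (2^44), 4K pages only
assert rnd_bits(44, 12) == 32
# 32-bit compat MAX: budget of 512MB (2^29)
assert [rnd_bits(29, s) for s in (18, 16, 14, 12)] == [11, 13, 15, 17]
# 32-bit compat MIN: budget of 8MB (2^23)
assert [rnd_bits(23, s) for s in (18, 16, 14, 12)] == [5, 7, 9, 11]
print("Kconfig defaults consistent")
```

This also shows why the defaults must vary with PAGE_SIZE: the budget is fixed in bytes, so larger pages leave fewer bits to randomise.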
Re: [PATCH] powerpc/mm: Add support for runtime configuration of ASLR limits
Hi Michael, On Wed, Apr 19, 2017 at 7:59 PM, Michael Ellerman wrote: > Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two > sysctls that allow a user to configure the number of bits of randomness used > for > ASLR. > > Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to > construct at least the MIN value in Kconfig, vs in a header which would be > more > natural. Given that we just go ahead and do it all in Kconfig. > > At least according to the code (the documentation makes no mention of it), the > value is defined as the number of bits of randomisation *of the page*, not the > address. This makes some sense, with larger page sizes more of the low bits > are > forced to zero, which would reduce the randomisation if we didn't take the > PAGE_SIZE into account. However it does mean the min/max values have to change > depending on the PAGE_SIZE in order to actually limit the amount of address > space consumed by the randomisation. > > The result of that is that we have to define the default values based on both > 32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we > have 128TB address space support on Book3S, we also have to take that into > account. Thanks for the patch. I have a couple of comments: (A) As Aneesh noted in the review of my v2 patch (see [1]), we need to handle the configurable 512TB case as well. Right? > Finally we can wire up the value in arch_mmap_rnd(). > > Signed-off-by: Michael Ellerman (B) I am wondering if I missed your comments on my v2 on the same subject - maybe you missed my reminder message (see [2]). 
I am just starting off on PPC related enhancements that we find useful while working on Redhat/Fedora PPC systems (I have mainly been associated with ARM and peripheral driver development in the past), and would have been motivated further if I could get responses to my queries which I had raised earlier on the list (see [3]) - especially the branch to base newer version of patches on. Also I am not sure how the PPC subsystem handles S-O-Bs of earlier contributions on the same subject (as it varies from one maintainer/subsystem to the other), so I leave it up to you. That being said, I will try to improve any new patches I plan to send out on PPC mailing list in future. [1] https://lkml.org/lkml/2017/4/13/57 [2] https://lkml.org/lkml/2017/4/10/796 [3] https://lkml.org/lkml/2017/4/17/3 Thanks, Bhupesh > --- > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 11 ++- > 2 files changed, 50 insertions(+), 5 deletions(-) > > > This is based on my next branch which has the 128TB changes: > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=next > > I would definitely appreciate someone checking my math, and any test results. > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8a095c..608ee0b7b79f 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MAX > + # On Book3S 64, the default virtual address space for 64-bit processes > + # is 2^47 (128TB). As a maximum, allow randomisation to consume up to > + # 32T of address space (2^45), which should ensure a reasonable gap > + # between bottom-up and top-down allocations for applications that > + # consume "normal" amounts of address space. Book3S 64 only supports > 64K > + # and 4K page sizes. 
> + default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 > (64K) > + default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K) > + # > + # On all other 64-bit platforms (currently only Book3E), the virtual > + # address space is 2^46 (64TB). Allow randomisation to consume up to > 16T > + # of address space (2^44). Only 4K page sizes are supported. > + default 32 if 64BIT # 32 = 44 (16T) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_BITS_MIN > + # Allow randomisation to consume up to 1GB of address space (2^30). > + default 14 if 64BIT && PPC_64K_PAGES# 14 = 30 (1GB) - 16 (64K) > + default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 512MB of address space (2^29). > + default 11 if PPC_256K_PAGES# 11 = 29 (512MB) - 18 (256K) > + default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K) > + default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K) > + default
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
On Thu, Apr 13, 2017 at 12:39 PM, Balbir Singh wrote: >>> >>> Yes. It was derived from TASK_SIZE : >>> >>> http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 >>> >> >> That is getting update to 128TB by default and conditionally to 512TB >> > > Since this is compile time, we should probably keep the scope to 128TB > for now and see if we want to change things at run time later, since > the expansion is based on a hint. Suggestions? > I think this makes sense. If the conditional expansion to 512TB is protected by a kconfig symbol, we can use the same to have separate ranges for 128TB and 512TB, using the kconfig symbol as the differentiating factor. Also, please let me know which branch/tree to use once the change making the default 128TB is done, so that I can spin the v3 accordingly. My v2 was based on git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (master branch), where TASK_SIZE is set to 46 bits inside 'arch/powerpc/include/asm/processor.h'. Thanks, Bhupesh
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
On Thu, Apr 13, 2017 at 12:28 PM, Aneesh Kumar K.V wrote: > > > On Thursday 13 April 2017 12:22 PM, Bhupesh Sharma wrote: >> >> Hi Aneesh, >> >> On Thu, Apr 13, 2017 at 12:06 PM, Aneesh Kumar K.V >> wrote: >>> >>> Bhupesh Sharma writes: >>> >>>> powerpc arch_mmap_rnd() currently uses hard-coded values - >>>> (23-PAGE_SHIFT) for >>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>> for the mmap base address for a ASLR ELF. >>>> >>>> This patch makes sure that powerpc mmap arch_mmap_rnd() implementation >>>> is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits >>>> and helpers to generate the mmap address randomization. >>>> >>>> The maximum and minimum randomization range values represent >>>> a compromise between increased ASLR effectiveness and avoiding >>>> address-space fragmentation. >>>> >>>> Using the Kconfig option and suitable /proc tunable, platform >>>> developers may choose where to place this compromise. >>>> >>>> Also this patch keeps the default values as new minimums. >>>> >>>> Signed-off-by: Bhupesh Sharma >>>> Reviewed-by: Kees Cook >>>> --- >>>> * Changes since v2: >>>> v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) >>>> - Changed a few minimum and maximum randomization ranges as per >>>> Michael's suggestion. >>>> - Corrected Kees's email address in the Reviewed-by line. >>>> - Added further comments in kconfig to explain how the address >>>> ranges were worked out. >>>> >>>> * Changes since v1: >>>> v1 can be seen here >>>> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >>>> - No functional change in this patch. >>>> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
>>>> >>>> arch/powerpc/Kconfig | 44 >>>> >>>> arch/powerpc/mm/mmap.c | 7 --- >>>> 2 files changed, 48 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>> index 97a8bc8..84aae67 100644 >>>> --- a/arch/powerpc/Kconfig >>>> +++ b/arch/powerpc/Kconfig >>>> @@ -22,6 +22,48 @@ config MMU >>>> bool >>>> default y >>>> >>>> +# min bits determined by the following formula: >>>> +# VA_BITS - PAGE_SHIFT - CONSTANT >>>> +# where, >>>> +#VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT >>> >>> >>> >>> Where did we derive that 46 bits from ? is that based on TASK_SIZE ? >> >> >> Yes. It was derived from TASK_SIZE : >> >> http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 >> > > That is getting update to 128TB by default and conditionally to 512TB Can't find the relevant patch in linus's master branch. Please share the appropriate patch/discussion link. Regards, Bhupesh
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Aneesh, On Thu, Apr 13, 2017 at 12:06 PM, Aneesh Kumar K.V wrote: > Bhupesh Sharma writes: > >> powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) >> for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address for a ASLR ELF. >> >> This patch makes sure that powerpc mmap arch_mmap_rnd() implementation >> is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits >> and helpers to generate the mmap address randomization. >> >> The maximum and minimum randomization range values represent >> a compromise between increased ASLR effectiveness and avoiding >> address-space fragmentation. >> >> Using the Kconfig option and suitable /proc tunable, platform >> developers may choose where to place this compromise. >> >> Also this patch keeps the default values as new minimums. >> >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook >> --- >> * Changes since v2: >> v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) >> - Changed a few minimum and maximum randomization ranges as per >> Michael's suggestion. >> - Corrected Kees's email address in the Reviewed-by line. >> - Added further comments in kconfig to explain how the address ranges >> were worked out. >> >> * Changes since v1: >> v1 can be seen here >> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >> - No functional change in this patch. >> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
>> >> arch/powerpc/Kconfig | 44 >> arch/powerpc/mm/mmap.c | 7 --- >> 2 files changed, 48 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 97a8bc8..84aae67 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,48 @@ config MMU >> bool >> default y >> >> +# min bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - CONSTANT >> +# where, >> +#VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT > > > Where did we derive that 46 bits from ? is that based on TASK_SIZE ? Yes. It was derived from TASK_SIZE : http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 Regards, Bhupesh > >> +#CONSTANT = 16 for 64BIT and 8 for 32BIT >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 >> + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 >> + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 >> + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 >> + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 >> + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 >> + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 >> + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 >> + >> +# max bits determined by the following formula: > > > -aneesh >
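The v3 formulas quoted in this exchange (`min = VA_BITS - PAGE_SHIFT - CONSTANT`, `max = VA_BITS - PAGE_SHIFT - 2`, with VA_BITS derived from TASK_SIZE) can be recomputed in a short sketch. The helper names are hypothetical; the values come straight from the patch's Kconfig comments.

```python
# Recomputing the v3 min/max formulas from the thread:
#   min = VA_BITS - PAGE_SHIFT - CONSTANT  (CONSTANT = 16 for 64BIT, 8 for 32BIT)
#   max = VA_BITS - PAGE_SHIFT - 2
# VA_BITS = 46 for 64-bit (TASK_SIZE = 2^46), 31 for 32-bit (4GB - 1 page).

def min_bits(page_shift, is64):
    va, const = (46, 16) if is64 else (31, 8)
    return va - page_shift - const

def max_bits(page_shift, is64):
    va = 46 if is64 else 31
    return va - page_shift - 2

assert (min_bits(16, True), max_bits(16, True)) == (14, 28)    # 64K, 64BIT
assert (min_bits(12, True), max_bits(12, True)) == (18, 32)    # 4K, 64BIT
assert (min_bits(18, False), max_bits(18, False)) == (5, 11)   # 256K, 32BIT
assert (min_bits(12, False), max_bits(12, False)) == (11, 17)  # 4K, 32BIT
```

Aneesh's point is that once TASK_SIZE moves to 128TB (2^47) by default, the VA_BITS = 46 input to these formulas no longer holds, which is why the branch question matters.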
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Wed, Mar 29, 2017 at 1:15 AM, Bhupesh Sharma wrote: > powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) for > 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset > for the mmap base address for a ASLR ELF. > > This patch makes sure that powerpc mmap arch_mmap_rnd() implementation > is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits > and helpers to generate the mmap address randomization. > > The maximum and minimum randomization range values represent > a compromise between increased ASLR effectiveness and avoiding > address-space fragmentation. > > Using the Kconfig option and suitable /proc tunable, platform > developers may choose where to place this compromise. > > Also this patch keeps the default values as new minimums. > > Signed-off-by: Bhupesh Sharma > Reviewed-by: Kees Cook > --- > * Changes since v2: > v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) > - Changed a few minimum and maximum randomization ranges as per Michael's > suggestion. > - Corrected Kees's email address in the Reviewed-by line. > - Added further comments in kconfig to explain how the address ranges > were worked out. > > * Changes since v1: > v1 can be seen here > (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) > - No functional change in this patch. > - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
> > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 7 --- > 2 files changed, 48 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8..84aae67 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +# min bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - CONSTANT > +# where, > +# VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT > +# CONSTANT = 16 for 64BIT and 8 for 32BIT > +config ARCH_MMAP_RND_BITS_MIN > + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 > + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 > + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 > + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 > + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 > + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 > + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 > + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 > + > +# max bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - CONSTANT > +# where, > +# VA_BITS = 46 bits for 64BIT, and 4GB - 1 Page = 31 bits for 32BIT > +# CONSTANT = 2, both for 64BIT and 32BIT > +config ARCH_MMAP_RND_BITS_MAX > + default 11 if PPC_256K_PAGES && 32BIT # 31 - 18 - 2 = 11 > + default 13 if PPC_64K_PAGES && 32BIT # 31 - 16 - 2 = 13 > + default 15 if PPC_16K_PAGES && 32BIT # 31 - 14 - 2 = 15 > + default 17 if PPC_4K_PAGES && 32BIT # 31 - 12 - 2 = 17 > + default 26 if PPC_256K_PAGES && 64BIT # 46 - 18 - 2 = 26 > + default 28 if PPC_64K_PAGES && 64BIT # 46 - 16 - 2 = 28 > + default 30 if PPC_16K_PAGES && 64BIT # 46 - 14 - 2 = 30 > + default 32 if PPC_4K_PAGES && 64BIT # 46 - 12 - 2 = 32 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + default 5 if PPC_256K_PAGES > + default 7 if PPC_64K_PAGES > + default 9 if PPC_16K_PAGES > + default 11 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + default 11 if 
PPC_256K_PAGES > + default 13 if PPC_64K_PAGES > + default 15 if PPC_16K_PAGES > + default 17 > + > config HAVE_SETUP_PER_CPU_AREA > def_bool PPC64 > > @@ -142,6 +184,8 @@ config PPC > select HAVE_IRQ_EXIT_ON_IRQ_STACK > select HAVE_KERNEL_GZIP > select HAVE_KPROBES > + select HAVE_ARCH_MMAP_RND_BITS > + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT > select HAVE_KRETPROBES > select HAVE_LIVEPATCH if > HAVE_DYNAMIC_FTRACE_WITH_REGS > select HAVE_MEMBLOCK > diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c > index a5d9ef5..92a9355 100644 > --- a/arch/powerpc/mm/mmap.c > +++ b/arch/powerpc/mm/mmap.c > @@ -61,11 +61,12 @@ unsigned long arch_mmap_rnd(void) > { > unsigned long rnd; > > - /* 8MB for 32bit
[PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address for a ASLR ELF. This patch makes sure that powerpc mmap arch_mmap_rnd() implementation is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits and helpers to generate the mmap address randomization. The maximum and minimum randomization range values represent a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Using the Kconfig option and suitable /proc tunable, platform developers may choose where to place this compromise. Also this patch keeps the default values as new minimums. Signed-off-by: Bhupesh Sharma Reviewed-by: Kees Cook --- * Changes since v2: v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) - Changed a few minimum and maximum randomization ranges as per Michael's suggestion. - Corrected Kees's email address in the Reviewed-by line. - Added further comments in kconfig to explain how the address ranges were worked out. * Changes since v1: v1 can be seen here (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) - No functional change in this patch. - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
arch/powerpc/Kconfig | 44 arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 48 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 97a8bc8..84aae67 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,48 @@ config MMU bool default y +# min bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - CONSTANT +# where, +# VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT +# CONSTANT = 16 for 64BIT and 8 for 32BIT +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - CONSTANT +# where, +# VA_BITS = 46 bits for 64BIT, and 4GB - 1 Page = 31 bits for 32BIT +# CONSTANT = 2, both for 64BIT and 32BIT +config ARCH_MMAP_RND_BITS_MAX + default 11 if PPC_256K_PAGES && 32BIT # 31 - 18 - 2 = 11 + default 13 if PPC_64K_PAGES && 32BIT # 31 - 16 - 2 = 13 + default 15 if PPC_16K_PAGES && 32BIT # 31 - 14 - 2 = 15 + default 17 if PPC_4K_PAGES && 32BIT # 31 - 12 - 2 = 17 + default 26 if PPC_256K_PAGES && 64BIT # 46 - 18 - 2 = 26 + default 28 if PPC_64K_PAGES && 64BIT # 46 - 16 - 2 = 28 + default 30 if PPC_16K_PAGES && 64BIT # 46 - 14 - 2 = 30 + default 32 if PPC_4K_PAGES && 64BIT # 46 - 12 - 2 = 32 + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 11 if PPC_256K_PAGES + default 13 if PPC_64K_PAGES + default 15 if PPC_16K_PAGES + default 17 + config 
HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -142,6 +184,8 @@ config PPC select HAVE_IRQ_EXIT_ON_IRQ_STACK select HAVE_KERNEL_GZIP select HAVE_KPROBES + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index a5d9ef5..92a9355 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -61,11 +61,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
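To make the before/after behaviour of the `arch_mmap_rnd()` change above concrete, here is a userspace sketch (not kernel code; it assumes 64K pages, i.e. PAGE_SHIFT = 16, and uses Python's `getrandbits` as a stand-in for `get_random_long()`):

```python
import random

PAGE_SHIFT = 16  # assuming 64K pages

def old_arch_mmap_rnd(is_32bit_task):
    # Old behaviour: hard-coded 8MB (2^23) window for 32-bit tasks,
    # 1GB (2^30) for 64-bit tasks.
    if is_32bit_task:
        rnd = random.getrandbits(64) % (1 << (23 - PAGE_SHIFT))
    else:
        rnd = random.getrandbits(64) % (1 << (30 - PAGE_SHIFT))
    return rnd << PAGE_SHIFT

def new_arch_mmap_rnd(mmap_rnd_bits):
    # New behaviour: mask down to the configured number of page-granular
    # bits (from the mmap_rnd_bits / mmap_rnd_compat_bits sysctls).
    rnd = random.getrandbits(64) & ((1 << mmap_rnd_bits) - 1)
    return rnd << PAGE_SHIFT

# With mmap_rnd_bits = 14 (the 64K-page 64-bit minimum) the new code keeps
# the old 1GB window, and all offsets stay page aligned.
for _ in range(1000):
    off = new_arch_mmap_rnd(14)
    assert off < (1 << 30) and off % (1 << PAGE_SHIFT) == 0
    assert old_arch_mmap_rnd(False) < (1 << 30)
    assert old_arch_mmap_rnd(True) < (1 << 23)
```

The key difference is that the mask width is now a runtime variable rather than a compile-time constant, which is what lets the sysctls widen the window up to the Kconfig maximum.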
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Thu, Feb 16, 2017 at 10:19 AM, Bhupesh Sharma wrote: > Hi Michael, > > On Fri, Feb 10, 2017 at 4:41 PM, Bhupesh Sharma wrote: >> On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman >> wrote: >>> Bhupesh Sharma writes: >>> >>>> HI Michael, >>>> >>>> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman >>>> wrote: >>>>> Bhupesh Sharma writes: >>>>> >>>>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>>>> for the mmap base address. >>>>>> >>>>>> This value represents a compromise between increased >>>>>> ASLR effectiveness and avoiding address-space fragmentation. >>>>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>>>> platform developers may choose where to place this compromise. >>>>>> Keep default values as new minimums. >>>>>> >>>>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>>>> is similar to other ARCHs like x86, arm64 and arm. >>>>> >>>>> Thanks for looking at this, it's been on my TODO for a while. >>>>> >>>>> I have a half completed version locally, but never got around to testing >>>>> it thoroughly. 
>>>> >>>> Sure :) >>>> >>>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>>>> index a8ee573fe610..b4a843f68705 100644 >>>>>> --- a/arch/powerpc/Kconfig >>>>>> +++ b/arch/powerpc/Kconfig >>>>>> @@ -22,6 +22,38 @@ config MMU >>>>>> bool >>>>>> default y >>>>>> >>>>>> +config ARCH_MMAP_RND_BITS_MIN >>>>>> + default 5 if PPC_256K_PAGES && 32BIT >>>>>> + default 12 if PPC_256K_PAGES && 64BIT >>>>>> + default 7 if PPC_64K_PAGES && 32BIT >>>>>> + default 14 if PPC_64K_PAGES && 64BIT >>>>>> + default 9 if PPC_16K_PAGES && 32BIT >>>>>> + default 16 if PPC_16K_PAGES && 64BIT >>>>>> + default 11 if PPC_4K_PAGES && 32BIT >>>>>> + default 18 if PPC_4K_PAGES && 64BIT >>>>>> + >>>>>> +# max bits determined by the following formula: >>>>>> +# VA_BITS - PAGE_SHIFT - 4 >>>>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>>>> +config ARCH_MMAP_RND_BITS_MAX >>>>>> + default 10 if PPC_256K_PAGES && 32BIT >>>>>> + default 26 if PPC_256K_PAGES && 64BIT >>>>>> + default 12 if PPC_64K_PAGES && 32BIT >>>>>> + default 28 if PPC_64K_PAGES && 64BIT >>>>>> + default 14 if PPC_16K_PAGES && 32BIT >>>>>> + default 30 if PPC_16K_PAGES && 64BIT >>>>>> + default 16 if PPC_4K_PAGES && 32BIT >>>>>> + default 32 if PPC_4K_PAGES && 64BIT >>>>>> + >>>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>>>> + default 5 if PPC_256K_PAGES >>>>>> + default 7 if PPC_64K_PAGES >>>>>> + default 9 if PPC_16K_PAGES >>>>>> + default 11 >>>>>> + >>>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>>>> + default 16 >>>>>> + >>>>> >>>>> This is what I have below, which is a bit neater I think because each >>>>> value is only there once (by defaulting to the COMPAT value). >>>>> >>>>> My max values are different to yours, I don't really remember why I >>>>> chose those values, so we can argue about which is right. >>>> >>>> I am not sure how you derived these values, but I am not sure there >>>> should be differences between 64-BIT x86/ARM64 and PPC values for the >>>> MAX values. 
>>> >>> But your values *are* different to x86 and arm64. >>> >>> And why would they be the same anyway? x86 has a 47 bit address space, >>> 64-bit powerpc is 46 bits, and arm64 is configurable fro
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Fri, Feb 10, 2017 at 4:41 PM, Bhupesh Sharma wrote: > On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman wrote: >> Bhupesh Sharma writes: >> >>> HI Michael, >>> >>> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman >>> wrote: >>>> Bhupesh Sharma writes: >>>> >>>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>>> for the mmap base address. >>>>> >>>>> This value represents a compromise between increased >>>>> ASLR effectiveness and avoiding address-space fragmentation. >>>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>>> platform developers may choose where to place this compromise. >>>>> Keep default values as new minimums. >>>>> >>>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>>> is similar to other ARCHs like x86, arm64 and arm. >>>> >>>> Thanks for looking at this, it's been on my TODO for a while. >>>> >>>> I have a half completed version locally, but never got around to testing >>>> it thoroughly. 
>>> >>> Sure :) >>> >>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>>> index a8ee573fe610..b4a843f68705 100644 >>>>> --- a/arch/powerpc/Kconfig >>>>> +++ b/arch/powerpc/Kconfig >>>>> @@ -22,6 +22,38 @@ config MMU >>>>> bool >>>>> default y >>>>> >>>>> +config ARCH_MMAP_RND_BITS_MIN >>>>> + default 5 if PPC_256K_PAGES && 32BIT >>>>> + default 12 if PPC_256K_PAGES && 64BIT >>>>> + default 7 if PPC_64K_PAGES && 32BIT >>>>> + default 14 if PPC_64K_PAGES && 64BIT >>>>> + default 9 if PPC_16K_PAGES && 32BIT >>>>> + default 16 if PPC_16K_PAGES && 64BIT >>>>> + default 11 if PPC_4K_PAGES && 32BIT >>>>> + default 18 if PPC_4K_PAGES && 64BIT >>>>> + >>>>> +# max bits determined by the following formula: >>>>> +# VA_BITS - PAGE_SHIFT - 4 >>>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>>> +config ARCH_MMAP_RND_BITS_MAX >>>>> + default 10 if PPC_256K_PAGES && 32BIT >>>>> + default 26 if PPC_256K_PAGES && 64BIT >>>>> + default 12 if PPC_64K_PAGES && 32BIT >>>>> + default 28 if PPC_64K_PAGES && 64BIT >>>>> + default 14 if PPC_16K_PAGES && 32BIT >>>>> + default 30 if PPC_16K_PAGES && 64BIT >>>>> + default 16 if PPC_4K_PAGES && 32BIT >>>>> + default 32 if PPC_4K_PAGES && 64BIT >>>>> + >>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>>> + default 5 if PPC_256K_PAGES >>>>> + default 7 if PPC_64K_PAGES >>>>> + default 9 if PPC_16K_PAGES >>>>> + default 11 >>>>> + >>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>>> + default 16 >>>>> + >>>> >>>> This is what I have below, which is a bit neater I think because each >>>> value is only there once (by defaulting to the COMPAT value). >>>> >>>> My max values are different to yours, I don't really remember why I >>>> chose those values, so we can argue about which is right. >>> >>> I am not sure how you derived these values, but I am not sure there >>> should be differences between 64-BIT x86/ARM64 and PPC values for the >>> MAX values. >> >> But your values *are* different to x86 and arm64. 
>> >> And why would they be the same anyway? x86 has a 47 bit address space, >> 64-bit powerpc is 46 bits, and arm64 is configurable from 36 to 48 bits. >> >> So your calculations above using VA_BITS = 48 should be using 46 bits. >> >> But if you fixed that, your formula basically gives 1/16th of the >> address space as the maximum range. Why is that the right amount? >> >> x86 uses 1/8th, and arm64 uses a mixture of 1/8th and 1/32nd (though >> those might be bugs). >> >> My values were more libera
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> HI Michael, >> >> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman wrote: >>> Bhupesh Sharma writes: >>> >>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>> for the mmap base address. >>>> >>>> This value represents a compromise between increased >>>> ASLR effectiveness and avoiding address-space fragmentation. >>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>> platform developers may choose where to place this compromise. >>>> Keep default values as new minimums. >>>> >>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>> is similar to other ARCHs like x86, arm64 and arm. >>> >>> Thanks for looking at this, it's been on my TODO for a while. >>> >>> I have a half completed version locally, but never got around to testing >>> it thoroughly. >> >> Sure :) >> >>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>> index a8ee573fe610..b4a843f68705 100644 >>>> --- a/arch/powerpc/Kconfig >>>> +++ b/arch/powerpc/Kconfig >>>> @@ -22,6 +22,38 @@ config MMU >>>> bool >>>> default y >>>> >>>> +config ARCH_MMAP_RND_BITS_MIN >>>> + default 5 if PPC_256K_PAGES && 32BIT >>>> + default 12 if PPC_256K_PAGES && 64BIT >>>> + default 7 if PPC_64K_PAGES && 32BIT >>>> + default 14 if PPC_64K_PAGES && 64BIT >>>> + default 9 if PPC_16K_PAGES && 32BIT >>>> + default 16 if PPC_16K_PAGES && 64BIT >>>> + default 11 if PPC_4K_PAGES && 32BIT >>>> + default 18 if PPC_4K_PAGES && 64BIT >>>> + >>>> +# max bits determined by the following formula: >>>> +# VA_BITS - PAGE_SHIFT - 4 >>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>> +config ARCH_MMAP_RND_BITS_MAX >>>> + default 10 if PPC_256K_PAGES && 32BIT >>>> + default 26 if PPC_256K_PAGES && 64BIT >>>> + default 12 if PPC_64K_PAGES && 32BIT >>>> + default 28 if PPC_64K_PAGES && 64BIT >>>> 
+ default 14 if PPC_16K_PAGES && 32BIT >>>> + default 30 if PPC_16K_PAGES && 64BIT >>>> + default 16 if PPC_4K_PAGES && 32BIT >>>> + default 32 if PPC_4K_PAGES && 64BIT >>>> + >>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>> + default 5 if PPC_256K_PAGES >>>> + default 7 if PPC_64K_PAGES >>>> + default 9 if PPC_16K_PAGES >>>> + default 11 >>>> + >>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>> + default 16 >>>> + >>> >>> This is what I have below, which is a bit neater I think because each >>> value is only there once (by defaulting to the COMPAT value). >>> >>> My max values are different to yours, I don't really remember why I >>> chose those values, so we can argue about which is right. >> >> I am not sure how you derived these values, but I am not sure there >> should be differences between 64-BIT x86/ARM64 and PPC values for the >> MAX values. > > But your values *are* different to x86 and arm64. > > And why would they be the same anyway? x86 has a 47 bit address space, > 64-bit powerpc is 46 bits, and arm64 is configurable from 36 to 48 bits. > > So your calculations above using VA_BITS = 48 should be using 46 bits. > > But if you fixed that, your formula basically gives 1/16th of the > address space as the maximum range. Why is that the right amount? > > x86 uses 1/8th, and arm64 uses a mixture of 1/8th and 1/32nd (though > those might be bugs). > > My values were more liberal, giving up to half the address space for 32 > & 64-bit. Maybe that's too generous, but my rationale was it's up to the > sysadmin to tweak the values and they get to keep the pieces if it > breaks. I am not sure why would one want to use more than the practical limits of 1/8th used by x86 - this causes additional burden of address space fragmentation. So we need to balance between the randomness increase and the address
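The fraction-of-address-space comparison in this exchange can be made explicit with a small sketch (hypothetical helper; the x86-64 figures of ARCH_MMAP_RND_BITS_MAX = 32, 4K pages, and a 47-bit space are as discussed in the thread):

```python
# The maximum randomisation window is 2**(max_bits + page_shift) bytes out
# of a 2**va_bits-byte address space.
from fractions import Fraction

def max_fraction(max_bits, page_shift, va_bits):
    return Fraction(2 ** (max_bits + page_shift), 2 ** va_bits)

# The proposed formula max = VA_BITS - PAGE_SHIFT - 4 (with VA_BITS = 48 as
# in the Kconfig comment) gives 1/16th of the space regardless of page size:
assert max_fraction(28, 16, 48) == Fraction(1, 16)
assert max_fraction(32, 12, 48) == Fraction(1, 16)
# x86-64 (32 max bits, 4K pages, 47-bit space) gives the 1/8th Michael cites:
assert max_fraction(32, 12, 47) == Fraction(1, 8)
```

This is the crux of the disagreement: the constant subtracted in the formula directly sets what fraction of the address space the randomisation may consume.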
Re: [kernel-hardening] [PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Tue, Feb 7, 2017 at 7:57 AM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook >> --- >> Changes since v1: >> v1 can be seen here >> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >> - No functional change in this patch. >> - Added R-B from Kees. >> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. > > Thanks for v2. > > But I replied to your v1 with some comments, did you see them? > I have replied to your comments on the original thread. Please share your views and if possible share your test results on the PPC setups you might have at your end. Thanks, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
HI Michael, On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. > > Thanks for looking at this, it's been on my TODO for a while. > > I have a half completed version locally, but never got around to testing > it thoroughly. Sure :) >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index a8ee573fe610..b4a843f68705 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,38 @@ config MMU >> bool >> default y >> >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT >> + default 12 if PPC_256K_PAGES && 64BIT >> + default 7 if PPC_64K_PAGES && 32BIT >> + default 14 if PPC_64K_PAGES && 64BIT >> + default 9 if PPC_16K_PAGES && 32BIT >> + default 16 if PPC_16K_PAGES && 64BIT >> + default 11 if PPC_4K_PAGES && 32BIT >> + default 18 if PPC_4K_PAGES && 64BIT >> + >> +# max bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - 4 >> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >> +config ARCH_MMAP_RND_BITS_MAX >> + default 10 if PPC_256K_PAGES && 32BIT >> + default 26 if PPC_256K_PAGES && 64BIT >> + default 12 if PPC_64K_PAGES && 32BIT >> + default 28 if PPC_64K_PAGES && 64BIT >> + default 14 if PPC_16K_PAGES && 32BIT >> + default 30 if PPC_16K_PAGES && 64BIT >> + default 16 if PPC_4K_PAGES && 32BIT >> + default 32 if PPC_4K_PAGES && 64BIT >> + >> +config 
ARCH_MMAP_RND_COMPAT_BITS_MIN >> + default 5 if PPC_256K_PAGES >> + default 7 if PPC_64K_PAGES >> + default 9 if PPC_16K_PAGES >> + default 11 >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >> + default 16 >> + > > This is what I have below, which is a bit neater I think because each > value is only there once (by defaulting to the COMPAT value). > > My max values are different to yours, I don't really remember why I > chose those values, so we can argue about which is right. I am not sure how you derived these values, but I am not sure there should be differences between 64-BIT x86/ARM64 and PPC values for the MAX values. > > +config ARCH_MMAP_RND_BITS_MIN > + # On 64-bit up to 1G of address space (2^30) > + default 12 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 30 - 18 = 12 > + default 14 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 30 - 16 = 14 > + default 16 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 30 - 14 = 16 > + default 18 if 64BIT # 4K (2^12), = 30 - 12 = 18 > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_BITS_MAX > + # On 64-bit up to 32T of address space (2^45) > + default 27 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 45 - 18 = 27 > + default 29 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 45 - 16 = 29 > + default 31 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 45 - 14 = 31 > + default 33 if 64BIT # 4K (2^12), = 45 - 12 = 33 > + default ARCH_MMAP_RND_COMPAT_BITS_MAX > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + # Up to 8MB of address space (2^23) > + default 5 if PPC_256K_PAGES # 256K (2^18), = 23 - 18 = 5 > + default 7 if PPC_64K_PAGES # 64K (2^16), = 23 - 16 = 7 > + default 9 if PPC_16K_PAGES # 16K (2^14), = 23 - 14 = 9 > + default 11 # 4K (2^12), = 23 - 12 = 11 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Up to 2G of address space (2^31) > + default 13 if PPC_256K_PAGES# 256K (2^18), = 31 - 18 = 13 > + default 15 if PPC_64K_PAGES # 64K (2^16), = 31 - 16 = 15 > + default 1
Re: [PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
On Sat, Feb 4, 2017 at 6:13 AM, Kees Cook wrote: > On Thu, Feb 2, 2017 at 9:11 PM, Bhupesh Sharma wrote: >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook > > This " at " should be "@", but otherwise, yay v2! :) > Noted. Sorry for the typo :( Regards, Bhupesh
[PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. This patch makes sure that now powerpc mmap arch_mmap_rnd() approach is similar to other ARCHs like x86, arm64 and arm. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Signed-off-by: Bhupesh Sharma Reviewed-by: Kees Cook --- Changes since v1: v1 can be seen here (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) - No functional change in this patch. - Added R-B from Kees. - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
arch/powerpc/Kconfig | 34 ++ arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index a8ee573fe610..b4a843f68705 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,38 @@ config MMU bool default y +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT + default 12 if PPC_256K_PAGES && 64BIT + default 7 if PPC_64K_PAGES && 32BIT + default 14 if PPC_64K_PAGES && 64BIT + default 9 if PPC_16K_PAGES && 32BIT + default 16 if PPC_16K_PAGES && 64BIT + default 11 if PPC_4K_PAGES && 32BIT + default 18 if PPC_4K_PAGES && 64BIT + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - 4 +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 +config ARCH_MMAP_RND_BITS_MAX + default 10 if PPC_256K_PAGES && 32BIT + default 26 if PPC_256K_PAGES && 64BIT + default 12 if PPC_64K_PAGES && 32BIT + default 28 if PPC_64K_PAGES && 64BIT + default 14 if PPC_16K_PAGES && 32BIT + default 30 if PPC_16K_PAGES && 64BIT + default 16 if PPC_4K_PAGES && 32BIT + default 32 if PPC_4K_PAGES && 64BIT + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 16 + config HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -100,6 +132,8 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_ARCH_KGDB + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index 2f1e44362198..babf59faab3b 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = 
get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
On 3 Feb 2017 00:49, "Kees Cook" wrote: On Thu, Feb 2, 2017 at 10:08 AM, Bhupesh Sharma wrote: > On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote: >> On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >>> hardcoded value of 0x2000_0000 to something more practical, >>> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >>> 64-bit platforms which would like to utilize more randomization >>> in the load address of a PIE elf). >> >> I don't think you want this second patch. Moving ELF_ET_DYN_BASE to >> the top of TASK_SIZE means you'll be constantly colliding with stack >> and mmap randomization. 0x20000000 is way better since it randomizes >> up from there towards the mmap area. >> >> Is there a reason to avoid the 32-bit memory range for the ELF addresses? > > I think you are right. Hmm, I think I was going by my particular use > case which might not be required for generic PPC platforms. > > I have one doubt though - I have primarily worked on arm64 and x86 > architectures and there I see 64-bit user space applications > using the 64-bit load addresses/ranges. I am not sure why PPC64 is > different historically. x86's ELF_ET_DYN_BASE is (TASK_SIZE / 3 * 2), so it puts the ET_DYN base at the top third of the address space. (In theory, this is to avoid interpreter collisions, but I'm working on removing that restriction, as it seems pointless.) Other architectures have small ELF_ET_DYN_BASEs, which is good: it allows them to have larger entropy for ET_DYN. Fair enough. I will drop the 2nd patch then and spin a v2. Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Balbir, On Thu, Feb 2, 2017 at 12:14 PM, Balbir Singh wrote: > On Thu, Feb 02, 2017 at 11:12:46AM +0530, Bhupesh Sharma wrote: >> This RFC patchset tries to make the powerpc ASLR elf randomness >> implementation similar to other ARCHs (like x86). >> >> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc >> mmap implementation to allow a sane balance between increased randomness >> in the mmap address of ASLR elfs and increased address space >> fragmentation. >> > > From what I see we get 28 bits of entropy right for 64k pages > bits as compared to 14 bits earlier? That's correct. We can go up to 28 bits of entropy for 64BIT platforms using 64K pages with the current approach. I see arm64 using > 28 bits of entropy randomness in some cases, but I think 28-bit MAX entropy is sensible for the 64BIT/64K combination on PPC. >> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >> hardcoded value of 0x2000_0000 to something more practical, >> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >> 64-bit platforms which would like to utilize more randomization >> in the load address of a PIE elf). >> > > This helps PIE executables as such and leaves other not impacted? It basically affects all shared object files (as noted in [1]). However, as Kees noted in one of his reviews, I think this 2nd patch might not be needed for all generic ppc platforms. [1] http://lxr.free-electrons.com/source/arch/powerpc/include/asm/elf.h#L26. Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Kees, Thanks for the review. Please see my comments inline. On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote: > On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >> This RFC patchset tries to make the powerpc ASLR elf randomness >> implementation similar to other ARCHs (like x86). >> >> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc >> mmap implementation to allow a sane balance between increased randomness >> in the mmap address of ASLR elfs and increased address space >> fragmentation. >> >> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >> hardcoded value of 0x2000_0000 to something more practical, >> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >> 64-bit platforms which would like to utilize more randomization >> in the load address of a PIE elf). > > I don't think you want this second patch. Moving ELF_ET_DYN_BASE to > the top of TASK_SIZE means you'll be constantly colliding with stack > and mmap randomization. 0x20000000 is way better since it randomizes > up from there towards the mmap area. > > Is there a reason to avoid the 32-bit memory range for the ELF addresses? > > -Kees I think you are right. Hmm, I think I was going by my particular use case which might not be required for generic PPC platforms. I have one doubt though - I have primarily worked on arm64 and x86 architectures and there I see 64-bit user space applications using the 64-bit load addresses/ranges. I am not sure why PPC64 is different historically. Regards, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Balbir, On Thu, Feb 2, 2017 at 2:41 PM, Balbir Singh wrote: >> @@ -100,6 +132,8 @@ config PPC >> select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && >> POWER7_CPU) >> select HAVE_KPROBES >> select HAVE_ARCH_KGDB >> + select HAVE_ARCH_MMAP_RND_BITS >> + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT > > COMPAT is on for ppc64 by default, so we'll end up with COMPAT_BITS same > as before all the time. No, actually the 'ARCH_MMAP_RND_COMPAT_BITS' values can be changed after boot using the '/proc/sys/vm/mmap_rnd_compat_bits' tunable: http://lxr.free-electrons.com/source/arch/Kconfig#L624 Regards, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Kees, On Thu, Feb 2, 2017 at 7:55 PM, Kees Cook wrote: > On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Cc: Kees Cook >> Signed-off-by: Bhupesh Sharma >> --- >> arch/powerpc/Kconfig | 34 ++ >> arch/powerpc/mm/mmap.c | 7 --- >> 2 files changed, 38 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index a8ee573fe610..b4a843f68705 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,38 @@ config MMU >> bool >> default y >> >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT >> + default 12 if PPC_256K_PAGES && 64BIT >> + default 7 if PPC_64K_PAGES && 32BIT >> + default 14 if PPC_64K_PAGES && 64BIT >> + default 9 if PPC_16K_PAGES && 32BIT >> + default 16 if PPC_16K_PAGES && 64BIT >> + default 11 if PPC_4K_PAGES && 32BIT >> + default 18 if PPC_4K_PAGES && 64BIT >> + >> +# max bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - 4 >> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >> +config ARCH_MMAP_RND_BITS_MAX >> + default 10 if PPC_256K_PAGES && 32BIT >> + default 26 if PPC_256K_PAGES && 64BIT >> 
+ default 12 if PPC_64K_PAGES && 32BIT >> + default 28 if PPC_64K_PAGES && 64BIT >> + default 14 if PPC_16K_PAGES && 32BIT >> + default 30 if PPC_16K_PAGES && 64BIT >> + default 16 if PPC_4K_PAGES && 32BIT >> + default 32 if PPC_4K_PAGES && 64BIT >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >> + default 5 if PPC_256K_PAGES >> + default 7 if PPC_64K_PAGES >> + default 9 if PPC_16K_PAGES >> + default 11 >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >> + default 16 >> + >> config HAVE_SETUP_PER_CPU_AREA >> def_bool PPC64 >> >> @@ -100,6 +132,8 @@ config PPC >> select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && >> POWER7_CPU) >> select HAVE_KPROBES >> select HAVE_ARCH_KGDB >> + select HAVE_ARCH_MMAP_RND_BITS >> + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT >> select HAVE_KRETPROBES >> select HAVE_ARCH_TRACEHOOK >> select HAVE_MEMBLOCK >> diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c >> index 2f1e44362198..babf59faab3b 100644 >> --- a/arch/powerpc/mm/mmap.c >> +++ b/arch/powerpc/mm/mmap.c >> @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) >> { >> unsigned long rnd; >> >> - /* 8MB for 32bit, 1GB for 64bit */ >> +#ifdef CONFIG_COMPAT >> if (is_32bit_task()) >> - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); >> + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - >> 1); >> else >> - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); >> +#endif >> + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); >> >> return rnd << PAGE_SHIFT; >> } > > Awesome! This looks good to me based on my earlier analysis. > > Reviewed-by: Kees Cook Many thanks. Regards, Bhupesh
Re: Query regarding randomization bits for a ASLR elf on PPC64
Hi Kees, On Thu, Jan 26, 2017 at 7:08 AM, Kees Cook wrote: > On Sun, Jan 22, 2017 at 9:34 PM, Bhupesh Sharma wrote: >> I was recently looking at ways to extend the randomization range for a >> ASLR elf on a PPC64LE system. >> >> I basically have been using 28-bits of randomization on x86_64 for an >> ASLR elf using appropriate ARCH_MMAP_RND_BITS_MIN and >> ARCH_MMAP_RND_BITS_MAX values: >> >> http://lxr.free-electrons.com/source/arch/x86/Kconfig#L192 >> >> And I understand from looking at the PPC64 code base that both >> ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX are not used in the >> current upstream code. > > Yeah, looks like PPC could use it. If you've got hardware to test > with, please add it. :) > >> I am looking at ways to randomize the mmap, stack and brk ranges for a >> ALSR elf on PPC64LE. Currently I am using a PAGE SIZE of 64K in my >> config file and hence the randomization usually translates to >> something like this for me: > > Just to be clear: 64K pages will lose you 4 bits of entropy when > compared to 4K on x86_64. (Assuming I'm doing the math right...) > >> mmap: >> --- >> http://lxr.free-electrons.com/source/arch/powerpc/mm/mmap.c#L67 >> >> rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); >> >> Since PAGE_SHIFT is 16 for 64K page size, this computation reduces to: >> rnd = get_random_long() % (1UL<<(14)); >> >> If I compare this to x86_64, I see there: >> >> http://lxr.free-electrons.com/source/arch/x86/mm/mmap.c#L79 >> >> rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); >> >> So, if mmap_rnd_bits = 28, this equates to: >> rnd = get_random_long() & ((1UL << 28) - 1); >> >> Observations and Queries: >> -- >> >> - So, x86_64 gives approx twice number of random bits for a ASLR elf >> running on it as compared to PPC64 although both use a 48-bit VA. >> >> - I also see this comment for PPC at various places, regarding 1GB >> randomness spread for PPC64. 
Is this restricted by the hardware or the >> kernel usage?: >> >> /* 8MB for 32bit, 1GB for 64bit */ >> 64 if (is_32bit_task()) >> 65 rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); >> 66 else >> 67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); > > Yeah, I'm not sure about this. The comments above the MIN_GAP* macros > seem to talk about making sure there is the 1GB stack gap, but that > shouldn't limit mmap. > > Stack base is randomized in fs/binfmt_elf.c randomize_stack_top() > which uses STACK_RND_MASK (and PAGE_SHIFT). > > x86: > /* 1GB for 64bit, 8MB for 32bit */ > #define STACK_RND_MASK (test_thread_flag(TIF_ADDR32) ? 0x7ff : 0x3fffff) > > powerpc: > /* 1GB for 64bit, 8MB for 32bit */ > #define STACK_RND_MASK (is_32bit_task() ? \ > (0x7ff >> (PAGE_SHIFT - 12)) : \ > (0x3ffff >> (PAGE_SHIFT - 12))) > > So, in the 64k page case, stack randomization entropy is reduced, but > otherwise identical to x86. > > x86 and powerpc both use arch_mmap_rnd() for both mmap and ET_DYN > (with different bases). > > x86 uses ELF_ET_DYN_BASE as TASK_SIZE / 3 * 2 (which the ELF loader > pushes back up the nearest PAGE_SIZE alignment: 0x5000), > though powerpc uses 0x20000000, so it should have significantly more > space for mmap and ET_DYN ASLR than x86. >> - I tried to increase the randomness to 28 bits for PPC as well by >> making the PPC mmap, brk code equivalent to x86_64 and it works fine >> for my use case. > > The PPC brk randomization on powerpc doesn't use the more common > randomize_page() way other archs do it... > > /* 8MB for 32bit, 1GB for 64bit */ > if (is_32bit_task()) > rnd = (get_random_long() % (1UL<<(23-PAGE_SHIFT))); > else > rnd = (get_random_long() % (1UL<<(30-PAGE_SHIFT))); > > return rnd << PAGE_SHIFT; > > x86 uses 0x02000000 (via randomize_page()), which, if I'm doing the > math right is 14 bits, regardless of 32/64-bit. arm64 uses 0x40000000 > (20 bits) on 64-bit processes and the same as x86 (14) for 32-bit > processes.
Looks like powerpc uses either 13 or 20 for 4k pages, which > is close to the same. > >> - But, I am not sure this is the right thing to do and whether the >> PPC64 also supports the MIN and MAX ranges for randomization. > > It can support it once you implement the Kconfigs for it. :) > >> - If it does I would
[PATCH 2/2] powerpc: Redefine ELF_ET_DYN_BASE
Currently the powerpc arch uses an ELF_ET_DYN_BASE value of 0x20000000 which ends up pushing an elf to a load address which is 32-bit. On 64-bit platforms, this might be too low, especially when one is trying to increase the randomness of the load address of the ASLR elfs on such platforms. This patch makes the powerpc platforms mimic the x86 ones, by ensuring that the ELF_ET_DYN_BASE is calculated on the basis of the current task's TASK_SIZE. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Signed-off-by: Bhupesh Sharma --- arch/powerpc/include/asm/elf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h index ee46ffef608e..dd035f6dd782 100644 --- a/arch/powerpc/include/asm/elf.h +++ b/arch/powerpc/include/asm/elf.h @@ -28,7 +28,7 @@ the loader. We need to make sure that it is out of the way of the program that it will "exec", and that there is sufficient room for the brk. */ -#define ELF_ET_DYN_BASE 0x20000000 +#define ELF_ET_DYN_BASE (TASK_SIZE - PAGE_SIZE) #define ELF_CORE_EFLAGS (is_elf2_task() ? 2 : 0) -- 2.7.4
[PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. This patch makes sure that now powerpc mmap arch_mmap_rnd() approach is similar to other ARCHs like x86, arm64 and arm. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Signed-off-by: Bhupesh Sharma --- arch/powerpc/Kconfig | 34 ++ arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index a8ee573fe610..b4a843f68705 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,38 @@ config MMU bool default y +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT + default 12 if PPC_256K_PAGES && 64BIT + default 7 if PPC_64K_PAGES && 32BIT + default 14 if PPC_64K_PAGES && 64BIT + default 9 if PPC_16K_PAGES && 32BIT + default 16 if PPC_16K_PAGES && 64BIT + default 11 if PPC_4K_PAGES && 32BIT + default 18 if PPC_4K_PAGES && 64BIT + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - 4 +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 +config ARCH_MMAP_RND_BITS_MAX + default 10 if PPC_256K_PAGES && 32BIT + default 26 if PPC_256K_PAGES && 64BIT + default 12 if PPC_64K_PAGES && 32BIT + default 28 if PPC_64K_PAGES && 64BIT + default 14 if PPC_16K_PAGES && 32BIT + default 30 if PPC_16K_PAGES && 64BIT + default 16 if PPC_4K_PAGES && 32BIT + default 32 if PPC_4K_PAGES && 64BIT + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 
5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 16 + config HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -100,6 +132,8 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_ARCH_KGDB + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index 2f1e44362198..babf59faab3b 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
[PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
This RFC patchset tries to make the powerpc ASLR elf randomness implementation similar to other ARCHs (like x86). The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc mmap implementation to allow a sane balance between increased randomness in the mmap address of ASLR elfs and increased address space fragmentation. The 2nd patch increases the ELF_ET_DYN_BASE value from the current hardcoded value of 0x2000_0000 to something more practical, i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for 64-bit platforms which would like to utilize more randomization in the load address of a PIE elf). I have tested this patchset on 64-bit Fedora and RHEL7 machines/VMs. Here are the test results and details of the test environment: 1. Create a test PIE program which shows its own memory map: $ cat show_mmap_pie.c #include <stdio.h> #include <unistd.h> int main(void){ char command[1024]; sprintf(command,"cat /proc/%d/maps",getpid()); system(command); return 0; } 2. Compile it as a PIE: $ gcc -o show_mmap_pie -fpie -pie show_mmap_pie.c 3. Before this patchset (on a Fedora-25 PPC64 POWER7 machine): # ./show_mmap_pie 33dd-33de r-xp fd:00 1724816 /root/git/linux/show_mmap_pie 33de-33df r--p fd:00 1724816 /root/git/linux/show_mmap_pie 33df-33e0 rw-p 0001 fd:00 1724816 /root/git/linux/show_mmap_pie 3fff9d75-3fff9d94 r-xp fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d94-3fff9d95 ---p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d95-3fff9d96 r--p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d96-3fff9d97 rw-p 0020 fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d98-3fff9d9a r-xp 00:00 0 [vdso] 3fff9d9a-3fff9d9e r-xp fd:00 2625136 /usr/lib64/ld-2.23.so 3fff9d9e-3fff9d9f r--p 0003 fd:00 2625136 /usr/lib64/ld-2.23.so 3fff9d9f-3fff9da0 rw-p 0004 fd:00 2625136 /usr/lib64/ld-2.23.so 3528-352b rw-p 00:00 0 [stack] As one can notice, the load address even for a 64-bit binary (show_mmap_pie) is within the 32-bit range. 4.
After this patchset (on a Fedora-25 PPC64 POWER7 machine): # ./show_mmap_pie 3fffad25-3fffad44 r-xp fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad44-3fffad45 ---p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad45-3fffad46 r--p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad46-3fffad47 rw-p 0020 fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad48-3fffad4a r-xp 00:00 0 [vdso] 3fffad4a-3fffad4e r-xp fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad4e-3fffad4f r--p 0003 fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad4f-3fffad50 rw-p 0004 fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad50-3fffad51 r-xp fd:00 1724816 /root/git/linux/show_mmap_pie 3fffad51-3fffad52 r--p fd:00 1724816 /root/git/linux/show_mmap_pie 3fffad52-3fffad53 rw-p 0001 fd:00 1724816 /root/git/linux/show_mmap_pie 3fffe311-3fffe314 rw-p 00:00 0 [stack] The load address of the elf is now pushed to be in a 64-bit range. As I have access to limited number of powerpc machines, request folks having powerpc platforms to try this patchset and share their test results/issues as well. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Bhupesh Sharma (2): powerpc: mm: support ARCH_MMAP_RND_BITS powerpc: Redefine ELF_ET_DYN_BASE arch/powerpc/Kconfig | 34 ++ arch/powerpc/include/asm/elf.h | 2 +- arch/powerpc/mm/mmap.c | 7 --- 3 files changed, 39 insertions(+), 4 deletions(-) -- 2.7.4
Query regarding randomization bits for a ASLR elf on PPC64
Hi Experts, I was recently looking at ways to extend the randomization range for a ASLR elf on a PPC64LE system. I basically have been using 28-bits of randomization on x86_64 for an ASLR elf using appropriate ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX values: http://lxr.free-electrons.com/source/arch/x86/Kconfig#L192 And I understand from looking at the PPC64 code base that both ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX are not used in the current upstream code. I am looking at ways to randomize the mmap, stack and brk ranges for a ALSR elf on PPC64LE. Currently I am using a PAGE SIZE of 64K in my config file and hence the randomization usually translates to something like this for me: mmap: --- http://lxr.free-electrons.com/source/arch/powerpc/mm/mmap.c#L67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); Since PAGE_SHIFT is 16 for 64K page size, this computation reduces to: rnd = get_random_long() % (1UL<<(14)); If I compare this to x86_64, I see there: http://lxr.free-electrons.com/source/arch/x86/mm/mmap.c#L79 rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); So, if mmap_rnd_bits = 28, this equates to: rnd = get_random_long() & ((1UL << 28) - 1); Observations and Queries: -- - So, x86_64 gives approx twice number of random bits for a ASLR elf running on it as compared to PPC64 although both use a 48-bit VA. - I also see this comment for PPC at various places, regarding 1GB randomness spread for PPC64. Is this restricted by the hardware or the kernel usage?: /* 8MB for 32bit, 1GB for 64bit */ 64 if (is_32bit_task()) 65 rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); 66 else 67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); - I tried to increase the randomness to 28 bits for PPC as well by making the PPC mmap, brk code equivalent to x86_64 and it works fine for my use case. - But, I am not sure this is the right thing to do and whether the PPC64 also supports the MIN and MAX ranges for randomization. 
- If it does, I would like to understand, test, and push a patch implementing the same for PPC64 upstream. Sorry for the long mail, but I would really appreciate it if someone could help me understand the details here. Thanks, Bhupesh