Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
On Thu, Jul 2, 2020 at 10:45 PM Catalin Marinas wrote:
>
> On Thu, 14 May 2020 00:22:35 +0530, Bhupesh Sharma wrote:
> > Apologies for the delayed update. It's been quite some time since I
> > posted the last version (v5), but I have been really caught up in some
> > other critical issues.
> >
> > Changes since v5:
> >
> > - v5 can be viewed here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> > - Addressed review comments from James Morse and Boris.
> > - Added Tested-by received from John on the v5 patchset.
> > - Rebased against the arm64 (for-next/ptr-auth) branch, which has
> >   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
> >   vmcoreinfo applied.
> >
> > [...]
>
> Applied to arm64 (for-next/vmcoreinfo), thanks!
>
> [1/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
>       https://git.kernel.org/arm64/c/1d50e5d0c505
> [2/2] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
>       https://git.kernel.org/arm64/c/bbdbc11804ff

Thanks Catalin for pulling in the changes. Dave and James, many thanks
for reviewing the patches as well.

Regards,
Bhupesh
Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hello Bharat,

On Wed, Jun 10, 2020 at 10:17 PM Bharat Gooty wrote:
>
> Hello Bhupesh,
> The v6 patch set on Linux 5.7 did not help. I have also applied the
> makedumpfile changes from
> http://lists.infradead.org/pipermail/kexec/2019-November/023963.html
> (makedumpfile 1.6.6). I tried to apply them on makedumpfile 1.6.7, but
> patch set_2 failed. I would like to know if you have a v5 patch set for
> the makedumpfile changes. With makedumpfile 1.6.6, I am able to collect
> the vmcore file.
> I used the latest crash utility (the
> https://www.redhat.com/archives/crash-utility/2019-November/msg00014.html
> changes are present).
> When I used the crash utility, the following is the error:
>
> Thanks,
> -Bharat
>
>
> -----Original Message-----
> From: Scott Branden [mailto:scott.bran...@broadcom.com]
> Sent: Thursday, April 30, 2020 4:34 AM
> To: Bhupesh Sharma; Amit Kachhap
> Cc: Mark Rutland; x...@kernel.org; Will Deacon; Linux Doc Mailing List;
> Catalin Marinas; Ard Biesheuvel; kexec mailing list; Linux Kernel
> Mailing List; Kazuhito Hagio; James Morse; Dave Anderson; bhupesh
> linux; linuxppc-dev@lists.ozlabs.org; linux-arm-kernel; Steve Capper;
> Ray Jui; Bharat Gooty
> Subject: Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export
> TCR_EL1.T1SZ in vmcoreinfo
>
> Hi Bhupesh,
>
> On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote:
> > Hi Amit,
> >
> > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap wrote:
> >> Hi Bhupesh,
> >>
> >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote:
> >>> Hi James,
> >>>
> >>> On 01/11/2020 12:30 AM, Dave Anderson wrote:
> >>>> ----- Original Message -----
> >>>>> Hi Bhupesh,
> >>>>>
> >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote:
> >>>>>> On 12/12/2019 04:02 PM, James Morse wrote:
> >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space
> >>>>>>>> size, and allows a single binary to support both 48-bit and
> >>>>>>>> 52-bit VA spaces.
> >>>>>>>>
> >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are
> >>>>>>>> running with a 64KB page size, then it is possible to use
> >>>>>>>> 52 bits of address space for both userspace and kernel
> >>>>>>>> addresses. However, any kernel binary that supports 52-bit
> >>>>>>>> must also be able to fall back to 48-bit at early boot time
> >>>>>>>> if the hardware feature is not present.
> >>>>>>>>
> >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory
> >>>>>>>> region addressed by TTBR1_EL1 (and hence can be used for
> >>>>>>>> determining the vabits_actual value) it makes more sense to
> >>>>>>>> export the same in vmcoreinfo rather than the vabits_actual
> >>>>>>>> variable, as the name of the variable can change in future
> >>>>>>>> kernel versions, but architectural constructs like
> >>>>>>>> TCR_EL1.T1SZ can be used better to indicate intended specific
> >>>>>>>> fields to user-space.
> >>>>>>>>
> >>>>>>>> User-space utilities like makedumpfile and crash-utility need
> >>>>>>>> to read/write this value from/to vmcoreinfo
> >>>>>>>
> >>>>>>> (write?)
> >>>>>>
> >>>>>> Yes, also write, so that the vmcoreinfo from a (crashing) arm64
> >>>>>> system can be used for analysis of the root cause of the
> >>>>>> panic/crash on, say, an x86_64 host using utilities like
> >>>>>> crash-utility/gdb.
> >>>>>
> >>>>> I read this as "User-space [...] needs to write to vmcoreinfo".
> >>>
> >>> That's correct. But for writing to the vmcore dump in the kdump
> >>> kernel, we need to read the symbols from the vmcoreinfo in the
> >>> primary kernel.
> >>>
> >>>>>>>> for determining if a virtual address lies in the linear map
> >>>>>>>> range.
> >>>>>>>
> >>>>>>> I think this is a fragile example. The debugger shouldn't need
> >>>>>>> to know this.
> >>>>>>
> >>>>>> Well that the c
Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hello Catalin, Will,

On Tue, Jun 2, 2020 at 10:54 AM Bhupesh Sharma wrote:
>
> Hello,
>
> On Thu, May 14, 2020 at 12:22 AM Bhupesh Sharma wrote:
> >
> > Apologies for the delayed update. It's been quite some time since I
> > posted the last version (v5), but I have been really caught up in
> > some other critical issues.
> >
> > Changes since v5:
> >
> > - v5 can be viewed here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> > - Addressed review comments from James Morse and Boris.
> > - Added Tested-by received from John on the v5 patchset.
> > - Rebased against the arm64 (for-next/ptr-auth) branch, which has
> >   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
> >   vmcoreinfo applied.
> >
> > Changes since v4:
> >
> > - v4 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> > - Addressed comments from Dave and added patches for documenting
> >   new variables appended to the vmcoreinfo documentation.
> > - Added the testing report shared by Akashi for PATCH 2/5.
> >
> > Changes since v3:
> >
> > - v3 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> > - Addressed comments from James and exported TCR_EL1.T1SZ in
> >   vmcoreinfo instead of PTRS_PER_PGD.
> > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
> >   'Documentation/arm64/memory.rst'
> >
> > Changes since v2:
> >
> > - v2 can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> > - Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
> >   CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
> > - Updated the vmcoreinfo documentation to add a description of the
> >   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
> >
> > Changes since v1:
> >
> > - v1 was sent out as a single patch which can be seen here:
> >   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> > - v2 breaks the single patch into two independent patches:
> >   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64
> >   arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo
> >   in core kernel code (all archs).
> >
> > This patchset primarily fixes the regressions reported in user-space
> > utilities like 'makedumpfile' and 'crash-utility' on the arm64
> > architecture with the availability of the 52-bit address space
> > feature in the underlying kernel. These regressions have been
> > reported both on CPUs which don't support the ARMv8.2 extensions
> > (i.e. LVA, LPA) and are running newer kernels, and also on prototype
> > platforms (like the ARMv8 FVP simulator model) which support the
> > ARMv8.2 extensions and are running newer kernels.
> >
> > The reason for these regressions is that right now user-space tools
> > have no direct access to these values (since these are not exported
> > from the kernel) and hence need to rely on a best-guess method of
> > determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
> > supported by the underlying kernel.
> >
> > Exporting these values via vmcoreinfo will help user-land in such
> > cases. In addition, as per a suggestion from the makedumpfile
> > maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS'
> > to vmcoreinfo in the core code itself rather than in arm64
> > arch-specific code, so that the user-space code for other archs can
> > also benefit from this addition to the vmcoreinfo and use it as a
> > standard way of determining the 'SECTIONS_SHIFT' value in user-land.
> >
> > Cc: Boris Petkov
> > Cc: Ingo Molnar
> > Cc: Thomas Gleixner
> > Cc: Jonathan Corbet
> > Cc: James Morse
> > Cc: Mark Rutland
> > Cc: Will Deacon
> > Cc: Steve Capper
> > Cc: Catalin Marinas
> > Cc: Ard Biesheuvel
> > Cc: Michael Ellerman
> > Cc: Paul Mackerras
> > Cc: Benjamin Herrenschmidt
> > Cc: Dave Anderson
> > Cc: Kazuhito Hagio
> > Cc: John Donnelly
> > Cc: scott.bran...@broadcom.com
> > Cc: Amit Kachhap
> > Cc: x...@kernel.org
> > Cc: linuxppc-dev@lists.ozlabs.org
> > Cc: linux-arm-ker...@lists.infradead.org
> > Cc: linux-ker...@vger.kernel.org
> > Cc: linux-...@vger.kernel.org
> > Cc: ke...@lists.infradead.org
> >
> > Bhupesh Sharma (2):
> >   crash_
Re: Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Hello Scott,

On Thu, Jun 4, 2020 at 12:17 AM Scott Branden wrote:
>
> Hi Bhupesh,
>
> It would be great to get this patch series upstreamed.
>
> On 2019-12-25 10:49 a.m., Bhupesh Sharma wrote:
> > Hi James,
> >
> > I am sorry this review mail skipped my attention due to holidays and
> > focus on other urgent issues.
> >
> > On 12/12/2019 04:02 PM, James Morse wrote:
> >> Hi Bhupesh,
> >>
> >> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>> Add documentation for the TCR_EL1.T1SZ variable being added to
> >>> vmcoreinfo.
> >>>
> >>> It indicates the size offset of the memory region addressed by
> >>> TTBR1_EL1 and hence can be used for determining the vabits_actual
> >>> value.
> >>
> >> used for determining random-internal-kernel-variable, that might not
> >> exist tomorrow.
> >>
> >> Could you describe how this is useful/necessary if a debugger wants
> >> to walk the page tables from the core file? I think this is a better
> >> argument.
> >>
> >> Wouldn't the documentation be better as part of the patch that adds
> >> the export? (... unless these have to go via different trees? ..)
> >
> > Ok, will fix the same in the v6 version.
> >
> >>> diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> index 447b64314f56..f9349f9d3345 100644
> >>> --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
> >>> @@ -398,6 +398,12 @@ KERNELOFFSET
> >>>  The kernel randomization offset. Used to compute the page offset.
> >>>  If KASLR is disabled, this value is zero.
> >>>
> >>> +TCR_EL1.T1SZ
> >>> +------------
> >>> +
> >>> +Indicates the size offset of the memory region addressed by
> >>> +TTBR1_EL1 and hence can be used for determining the vabits_actual
> >>> +value.
> >>
> >> 'vabits_actual' may not exist when the next person comes to read
> >> this documentation (it's going to rot really quickly).
> >>
> >> I think the first half of this text is enough to say what this is
> >> for. You should include words to the effect that it's the hardware
> >> value that goes with swapper_pg_dir. You may want to point readers
> >> to the arm-arm for more details on what the value means.
> >
> > Ok, got it. Fixed this in v6, which should be on its way shortly.
>
> I can't seem to find v6?

Oops. I remember Cc'ing you on the v6 patchset (maybe my email client
messed up); anyway, here is the v6 patchset for your reference:
<http://lists.infradead.org/pipermail/kexec/2020-May/025095.html>

Do share your review/test comments on the same.

Thanks,
Bhupesh
Re: [PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hello,

On Thu, May 14, 2020 at 12:22 AM Bhupesh Sharma wrote:
>
> Apologies for the delayed update. It's been quite some time since I
> posted the last version (v5), but I have been really caught up in some
> other critical issues.
>
> Changes since v5:
>
> - v5 can be viewed here:
>   http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
> - Addressed review comments from James Morse and Boris.
> - Added Tested-by received from John on the v5 patchset.
> - Rebased against the arm64 (for-next/ptr-auth) branch, which has
>   Amit's patchset for the ARMv8.3-A Pointer Authentication feature
>   vmcoreinfo applied.
>
> Changes since v4:
>
> - v4 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
> - Addressed comments from Dave and added patches for documenting
>   new variables appended to the vmcoreinfo documentation.
> - Added the testing report shared by Akashi for PATCH 2/5.
>
> Changes since v3:
>
> - v3 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
> - Addressed comments from James and exported TCR_EL1.T1SZ in
>   vmcoreinfo instead of PTRS_PER_PGD.
> - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
>   'Documentation/arm64/memory.rst'
>
> Changes since v2:
>
> - v2 can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
> - Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
>   CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
> - Updated the vmcoreinfo documentation to add a description of the
>   'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).
>
> Changes since v1:
>
> - v1 was sent out as a single patch which can be seen here:
>   http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
> - v2 breaks the single patch into two independent patches:
>   [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64 arch,
>   whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core
>   kernel code (all archs).
>
> This patchset primarily fixes the regressions reported in user-space
> utilities like 'makedumpfile' and 'crash-utility' on the arm64
> architecture with the availability of the 52-bit address space feature
> in the underlying kernel. These regressions have been reported both on
> CPUs which don't support the ARMv8.2 extensions (i.e. LVA, LPA) and
> are running newer kernels, and also on prototype platforms (like the
> ARMv8 FVP simulator model) which support the ARMv8.2 extensions and
> are running newer kernels.
>
> The reason for these regressions is that right now user-space tools
> have no direct access to these values (since these are not exported
> from the kernel) and hence need to rely on a best-guess method of
> determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
> supported by the underlying kernel.
>
> Exporting these values via vmcoreinfo will help user-land in such
> cases. In addition, as per a suggestion from the makedumpfile
> maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to
> vmcoreinfo in the core code itself rather than in arm64 arch-specific
> code, so that the user-space code for other archs can also benefit
> from this addition to the vmcoreinfo and use it as a standard way of
> determining the 'SECTIONS_SHIFT' value in user-land.
>
> Cc: Boris Petkov
> Cc: Ingo Molnar
> Cc: Thomas Gleixner
> Cc: Jonathan Corbet
> Cc: James Morse
> Cc: Mark Rutland
> Cc: Will Deacon
> Cc: Steve Capper
> Cc: Catalin Marinas
> Cc: Ard Biesheuvel
> Cc: Michael Ellerman
> Cc: Paul Mackerras
> Cc: Benjamin Herrenschmidt
> Cc: Dave Anderson
> Cc: Kazuhito Hagio
> Cc: John Donnelly
> Cc: scott.bran...@broadcom.com
> Cc: Amit Kachhap
> Cc: x...@kernel.org
> Cc: linuxppc-dev@lists.ozlabs.org
> Cc: linux-arm-ker...@lists.infradead.org
> Cc: linux-ker...@vger.kernel.org
> Cc: linux-...@vger.kernel.org
> Cc: ke...@lists.infradead.org
>
> Bhupesh Sharma (2):
>   crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
>   arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
>
>  Documentation/admin-guide/kdump/vmcoreinfo.rst | 16 ++++++++++++++++
>  arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
>  arch/arm64/kernel/crash_core.c                 | 10 ++++++++++
>  kernel/crash_core.c                            |  1 +
>  4 files changed, 28 insertions(+)

Ping. @James Morse and others, please share if you have any comments
regarding this patchset.

Thanks,
Bhupesh
[PATCH v6 0/2] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Apologies for the delayed update. It's been quite some time since I
posted the last version (v5), but I have been really caught up in some
other critical issues.

Changes since v5:

- v5 can be viewed here:
  http://lists.infradead.org/pipermail/kexec/2019-November/024055.html
- Addressed review comments from James Morse and Boris.
- Added Tested-by received from John on the v5 patchset.
- Rebased against the arm64 (for-next/ptr-auth) branch, which has
  Amit's patchset for the ARMv8.3-A Pointer Authentication feature
  vmcoreinfo applied.

Changes since v4:

- v4 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
- Addressed comments from Dave and added patches for documenting
  new variables appended to the vmcoreinfo documentation.
- Added the testing report shared by Akashi for PATCH 2/5.

Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected the 'MAX_PHYSMEM_BITS' vmcoreinfo variable under
  CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu.
- Updated the vmcoreinfo documentation to add a description of the
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for the arm64 arch,
  whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core
  kernel code (all archs).

This patchset primarily fixes the regressions reported in user-space
utilities like 'makedumpfile' and 'crash-utility' on the arm64
architecture with the availability of the 52-bit address space feature
in the underlying kernel. These regressions have been reported both on
CPUs which don't support the ARMv8.2 extensions (i.e. LVA, LPA) and are
running newer kernels, and also on prototype platforms (like the ARMv8
FVP simulator model) which support the ARMv8.2 extensions and are
running newer kernels.

The reason for these regressions is that right now user-space tools
have no direct access to these values (since these are not exported
from the kernel) and hence need to rely on a best-guess method of
determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS'
supported by the underlying kernel.

Exporting these values via vmcoreinfo will help user-land in such
cases. In addition, as per a suggestion from the makedumpfile
maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to
vmcoreinfo in the core code itself rather than in arm64 arch-specific
code, so that the user-space code for other archs can also benefit from
this addition to the vmcoreinfo and use it as a standard way of
determining the 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: John Donnelly
Cc: scott.bran...@broadcom.com
Cc: Amit Kachhap
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (2):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo

 Documentation/admin-guide/kdump/vmcoreinfo.rst | 16 ++++++++++++++++
 arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
 arch/arm64/kernel/crash_core.c                 | 10 ++++++++++
 kernel/crash_core.c                            |  1 +
 4 files changed, 28 insertions(+)

-- 
2.7.4
[PATCH v6 1/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely
on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS'
supported by the underlying kernel. This value is used in user-space
code to calculate the bit-space required to store a section for
SPARSEMEM (similar to the existing calculation method used in the
kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like
'makedumpfile' and 'crash' on arm64, with the recently added kernel
support for the 52-bit physical address space, as there is no clear
method of determining this value in user-space (other than reading
kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes
more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code
itself rather than in arch-specific code, so that the user-space code
for other archs can also benefit from this addition to the vmcoreinfo
and use it as a standard way of determining the 'SECTIONS_SHIFT' value
in user-land.

A reference 'makedumpfile' implementation which reads the
'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion
is available here:

While at it, also update the vmcoreinfo documentation for the
'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo.
'MAX_PHYSMEM_BITS' defines the maximum supported physical address
space memory.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Tested-by: John Donnelly
Signed-off-by: Bhupesh Sharma
---
 Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 +++++
 kernel/crash_core.c                            | 1 +
 2 files changed, 6 insertions(+)

diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst
index e4ee8b2db604..2a632020f809 100644
--- a/Documentation/admin-guide/kdump/vmcoreinfo.rst
+++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst
@@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat
 similar to the mem_map variable, both of them are used to translate
 an address.
 
+MAX_PHYSMEM_BITS
+----------------
+
+Defines the maximum supported physical address space memory.
+
 page
diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
-- 
2.7.4
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi Amit,

On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap wrote:
>
> Hi Bhupesh,
>
> On 1/13/20 5:44 PM, Bhupesh Sharma wrote:
> > Hi James,
> >
> > On 01/11/2020 12:30 AM, Dave Anderson wrote:
> >>
> >> ----- Original Message -----
> >>> Hi Bhupesh,
> >>>
> >>> On 25/12/2019 19:01, Bhupesh Sharma wrote:
> >>>> On 12/12/2019 04:02 PM, James Morse wrote:
> >>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote:
> >>>>>> vabits_actual variable on arm64 indicates the actual VA space
> >>>>>> size, and allows a single binary to support both 48-bit and
> >>>>>> 52-bit VA spaces.
> >>>>>>
> >>>>>> If the ARMv8.2-LVA optional feature is present, and we are
> >>>>>> running with a 64KB page size, then it is possible to use
> >>>>>> 52 bits of address space for both userspace and kernel
> >>>>>> addresses. However, any kernel binary that supports 52-bit must
> >>>>>> also be able to fall back to 48-bit at early boot time if the
> >>>>>> hardware feature is not present.
> >>>>>>
> >>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory
> >>>>>> region addressed by TTBR1_EL1 (and hence can be used for
> >>>>>> determining the vabits_actual value) it makes more sense to
> >>>>>> export the same in vmcoreinfo rather than the vabits_actual
> >>>>>> variable, as the name of the variable can change in future
> >>>>>> kernel versions, but architectural constructs like TCR_EL1.T1SZ
> >>>>>> can be used better to indicate intended specific fields to
> >>>>>> user-space.
> >>>>>>
> >>>>>> User-space utilities like makedumpfile and crash-utility need
> >>>>>> to read/write this value from/to vmcoreinfo
> >>>>>
> >>>>> (write?)
> >>>>
> >>>> Yes, also write, so that the vmcoreinfo from a (crashing) arm64
> >>>> system can be used for analysis of the root cause of the
> >>>> panic/crash on, say, an x86_64 host using utilities like
> >>>> crash-utility/gdb.
> >>>
> >>> I read this as "User-space [...] needs to write to vmcoreinfo".
> >
> > That's correct. But for writing to the vmcore dump in the kdump
> > kernel, we need to read the symbols from the vmcoreinfo in the
> > primary kernel.
> >
> >>>>>> for determining if a virtual address lies in the linear map
> >>>>>> range.
> >>>>>
> >>>>> I think this is a fragile example. The debugger shouldn't need
> >>>>> to know this.
> >>>>
> >>>> Well, that's the current user-space utility design, so I am not
> >>>> sure we can tweak that too much.
> >>>>
> >>>>>> The user-space computation for determining whether an address
> >>>>>> lies in the linear map range is the same as we have in
> >>>>>> kernel-space:
> >>>>>>
> >>>>>>   #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))
> >>>>>
> >>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA
> >>>>> space"). If user-space tools rely on 'knowing' the kernel memory
> >>>>> layout, they must have to constantly be fixed and updated. This
> >>>>> is a poor argument for adding this to something that ends up as
> >>>>> ABI.
> >>>>
> >>>> See above. The user-space has to rely on some ABI/guaranteed
> >>>> hardware-symbols which can be used for 'determining' the kernel
> >>>> memory layout.
> >>>
> >>> I disagree. Everything and anything in the kernel will change. The
> >>> ABI rules apply to stuff exposed via syscalls and kernel
> >>> filesystems. It does not apply to kernel internals, like the
> >>> memory layout we used yesterday. 14c127c957c1 is a case in point.
> >>>
> >>> A debugger trying to rely on this sort of thing would have to play
> >>> catchup whenever it changes.
> >>
> >> Exact
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi James,

On 01/11/2020 12:30 AM, Dave Anderson wrote:

----- Original Message -----

Hi Bhupesh,

On 25/12/2019 19:01, Bhupesh Sharma wrote:

On 12/12/2019 04:02 PM, James Morse wrote:

On 29/11/2019 19:59, Bhupesh Sharma wrote:

vabits_actual variable on arm64 indicates the actual VA space size, and
allows a single binary to support both 48-bit and 52-bit VA spaces.

If the ARMv8.2-LVA optional feature is present, and we are running with
a 64KB page size, then it is possible to use 52 bits of address space
for both userspace and kernel addresses. However, any kernel binary
that supports 52-bit must also be able to fall back to 48-bit at early
boot time if the hardware feature is not present.

Since TCR_EL1.T1SZ indicates the size offset of the memory region
addressed by TTBR1_EL1 (and hence can be used for determining the
vabits_actual value) it makes more sense to export the same in
vmcoreinfo rather than the vabits_actual variable, as the name of the
variable can change in future kernel versions, but architectural
constructs like TCR_EL1.T1SZ can be used better to indicate intended
specific fields to user-space.

User-space utilities like makedumpfile and crash-utility need to
read/write this value from/to vmcoreinfo

(write?)

Yes, also write, so that the vmcoreinfo from a (crashing) arm64 system
can be used for analysis of the root cause of the panic/crash on, say,
an x86_64 host using utilities like crash-utility/gdb.

I read this as "User-space [...] needs to write to vmcoreinfo".

That's correct. But for writing to the vmcore dump in the kdump kernel,
we need to read the symbols from the vmcoreinfo in the primary kernel.

for determining if a virtual address lies in the linear map range.

I think this is a fragile example. The debugger shouldn't need to know
this.

Well, that's the current user-space utility design, so I am not sure we
can tweak that too much.

The user-space computation for determining whether an address lies in
the linear map range is the same as we have in kernel-space:

  #define __is_lm_address(addr)	(!(((u64)addr) & BIT(vabits_actual - 1)))

This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space").
If user-space tools rely on 'knowing' the kernel memory layout, they
must have to constantly be fixed and updated. This is a poor argument
for adding this to something that ends up as ABI.

See above. The user-space has to rely on some ABI/guaranteed
hardware-symbols which can be used for 'determining' the kernel memory
layout.

I disagree. Everything and anything in the kernel will change. The ABI
rules apply to stuff exposed via syscalls and kernel filesystems. It
does not apply to kernel internals, like the memory layout we used
yesterday. 14c127c957c1 is a case in point.

A debugger trying to rely on this sort of thing would have to play
catchup whenever it changes.

Exactly. That's the whole point. The crash utility and makedumpfile are
not in the same league as other user-space tools. They have always had
to "play catchup" precisely because they depend upon kernel internals,
which constantly change.

I agree with you and DaveA here. Software user-space debuggers are
dependent on kernel internals (which can change from time to time) and
will have to play catch-up (which has been the case since the very
start). Unfortunately we don't have any clear ABI for software
debugging tools - maybe something to look for in the future.

A case in point is gdb/kgdb, which still needs to run with KASLR turned
off (nokaslr) for debugging, as KASLR confuses gdb, which resolves
kernel symbol addresses from the symbol table of vmlinux. But we can
work around the same in makedumpfile/crash by reading the
'kaslr_offset' value. And I have several users telling me now that they
cannot use gdb on a KASLR-enabled kernel to debug panics, but can use
the makedumpfile + crash combination to achieve the same.

So, we should be looking to fix these utilities which have been broken
since the 52-bit changes for arm64. Accordingly, I will try to send the
v6 soon while incorporating the comments posted on the v5.

Thanks,
Bhupesh
Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
Hi James, On 12/12/2019 04:02 PM, James Morse wrote: Hi Bhupesh, On 29/11/2019 19:59, Bhupesh Sharma wrote: vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size; then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export the same in vmcoreinfo rather than vabits_actual variable, as the name of the variable can change in future kernel versions, but the architectural constructs like TCR_EL1.T1SZ can be used better to indicate intended specific fields to user-space. User-space utilities like makedumpfile and crash-utility, need to read/write this value from/to vmcoreinfo (write?) Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can be used for analysis of the root-cause of panic/crash on say an x86_64 host using utilities like crash-utility/gdb. for determining if a virtual address lies in the linear map range. I think this is a fragile example. The debugger shouldn't need to know this. Well that the current user-space utility design, so I am not sure we can tweak that too much. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr)(!(((u64)addr) & BIT(vabits_actual - 1))) This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If user-space tools rely on 'knowing' the kernel memory layout, they must have to constantly be fixed and updated. 
This is a poor argument for adding this to something that ends up as ABI. See above. User-space has to rely on some ABI/guaranteed hardware symbols which can be used for 'determining' the kernel memory layout. I think a better argument is walking the kernel page tables from the core dump. Core code's vmcoreinfo exports the location of the kernel page tables, but in the example above you can't walk them without knowing how T1SZ was configured. Sure, both makedumpfile and crash-utility (which walk the kernel page tables from the core dump) use this (and similar) information currently in the user-space. On older kernels, user-space that needs this would have to assume the value it computes from VA_BITS (also in vmcoreinfo) is the value in use. Yes, backward compatibility has been handled in the user-space already. ---%<--- I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining the vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html ---%<--- This probably belongs in the cover letter instead of the commit log. Ok. (From-memory: one of vmcore/kcore is virtually addressed, the other physically. Does this fix your problem in both cases?) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h for the macros you added. Ok. 
Will check, as I did not get any compilation errors without the same, and the build-bot also did not raise a flag for the missing include files. +static inline u64 get_tcr_el1_t1sz(void); Why do you need to do this? Without this I was getting a missing-declaration error while compiling the code. +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} (We don't modify this one, and it's always the same on every CPU, so this is fine. This function is only called once when the stringy vmcoreinfo elf_note is created...) Right. void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n",
Re: [RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Hi James, On 12/12/2019 04:02 PM, James Morse wrote: Hi Bhupesh, I am sorry this review mail skipped my attention due to holidays and focus on other urgent issues. On 29/11/2019 19:59, Bhupesh Sharma wrote: Add documentation for the TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. used for determining random-internal-kernel-variable, that might not exist tomorrow. Could you describe how this is useful/necessary if a debugger wants to walk the page tables from the core file? I think this is a better argument. Wouldn't the documentation be better as part of the patch that adds the export? (... unless these have to go via different trees? ..) Ok, will fix the same in v6 version. diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. 'vabits_actual' may not exist when the next person comes to read this documentation (it's going to rot really quickly). I think the first half of this text is enough to say what this is for. You should include words to the effect that it's the hardware value that goes with swapper_pg_dir. You may want to point readers to the arm-arm for more details on what the value means. Ok, got it. Fixed this in v6, which should be on its way shortly. Thanks, Bhupesh
Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Boris, On Sat, Dec 14, 2019 at 5:57 PM Borislav Petkov wrote: > > On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote: > > Bhupesh Sharma (5): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Documentation/arm64: Fix a simple typo in memory.rst > > Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' > > Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ' > > why are those last two separate patches and not part of the patches > which export the respective variable/define? I remember there was a suggestion during the review of an earlier version to keep them as separate patches so that the documentation text is easier to review, but I have no strong preference either way. I can merge the documentation patches with the respective patches (which export the variables/defines to vmcoreinfo) in v6, unless other maintainers have any objections. Thanks, Bhupesh
Re: [PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Will, On Fri, Nov 29, 2019 at 3:54 PM Will Deacon wrote: > > On Fri, Nov 29, 2019 at 01:53:36AM +0530, Bhupesh Sharma wrote: > > Changes since v4: > > > > - v4 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-November/023961.html > > - Addressed comments from Dave and added patches for documenting > > new variables appended to vmcoreinfo documentation. > > - Added testing report shared by Akashi for PATCH 2/5. > > Please can you fix your mail setup? The last two times you've sent this > series it seems to get split into two threads, which is really hard to > track in my inbox: > > First thread: > > https://lore.kernel.org/lkml/1574972621-25750-1-git-send-email-bhsha...@redhat.com/ > > Second thread: > > https://lore.kernel.org/lkml/1574972716-25858-1-git-send-email-bhsha...@redhat.com/ There seems to be some issue with my server's msmtp settings. I have tried resending the v5 (see <http://lists.infradead.org/pipermail/linux-arm-kernel/2019-November/696833.html>). I hope the threading is ok this time. Thanks for your patience. Regards, Bhupesh
[RESEND PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Add documentation for TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. + arm === -- 2.7.4
[RESEND PATCH v5 4/5] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 007a6b86e0ee..447b64314f56 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[RESEND PATCH v5 3/5] Documentation/arm64: Fix a simple typo in memory.rst
Fix a simple typo in arm64/memory.rst Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/arm64/memory.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst index 02e02175e6f5..cf03b3290800 100644 --- a/Documentation/arm64/memory.rst +++ b/Documentation/arm64/memory.rst @@ -129,7 +129,7 @@ this logic. As a single binary will need to support both 48-bit and 52-bit VA spaces, the VMEMMAP must be sized large enough for 52-bit VAs and -also must be sized large enought to accommodate a fixed PAGE_OFFSET. +also must be sized large enough to accommodate a fixed PAGE_OFFSET. Most code in the kernel should not need to consider the VA_BITS, for code that does need to know the VA size the variables are -- 2.7.4
[RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
The vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size, then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export it in vmcoreinfo rather than the vabits_actual variable, as the name of the variable can change in future kernel versions, whereas architectural constructs like TCR_EL1.T1SZ are better suited to indicate specific fields to user-space. User-space utilities like makedumpfile and crash-utility need to read/write this value from/to vmcoreinfo for determining if a virtual address lies in the linear map range. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining the vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. 
http://lists.infradead.org/pipermail/kexec/2019-November/023992.html Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + 2 files changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index d9fbd433cc17..d2e7aff5821e 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -215,6 +215,7 @@ #define TCR_TxSZ(x)(TCR_T0SZ(x) | TCR_T1SZ(x)) #define TCR_TxSZ_WIDTH 6 #define TCR_T0SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T0SZ_OFFSET) +#define TCR_T1SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T1SZ_OFFSET) #define TCR_EPD0_SHIFT 7 #define TCR_EPD0_MASK (UL(1) << TCR_EPD0_SHIFT) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include +static inline u64 get_tcr_el1_t1sz(void); + +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} + void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", PHYS_OFFSET); + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", + get_tcr_el1_t1sz()); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); } -- 2.7.4
[RESEND PATCH v5 1/5] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 9f1557b98468..18175687133a 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void) VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS); VMCOREINFO_STRUCT_SIZE(mem_section); VMCOREINFO_OFFSET(mem_section, section_mem_map); + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); #endif VMCOREINFO_STRUCT_SIZE(page); VMCOREINFO_STRUCT_SIZE(pglist_data); -- 2.7.4
[RESEND PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
- Resending the v5 version as Will Deacon reported that the patchset was split into two separate threads while sending out. It was an issue with my 'msmtp' settings which now seems to be fixed. Please ignore all previous v5 versions. Changes since v4: - v4 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-November/023961.html - Addressed comments from Dave and added patches for documenting new variables appended to vmcoreinfo documentation. - Added testing report shared by Akashi for PATCH 2/5. Changes since v3: - v3 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022590.html - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo instead of PTRS_PER_PGD. - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in 'Documentation/arm64/memory.rst' Changes since v2: - v2 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022531.html - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu. - Updated vmcoreinfo documentation to add description about 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of the 52-bit address space feature in the underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like the ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. 
The reason for these regressions is that right now user-space tools have no direct access to these values (since they are not exported from the kernel) and hence need to rely on a best-guess method of determining the values of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by the underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: linux-...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (5): crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo Documentation/arm64: Fix a simple typo in memory.rst Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ' Documentation/admin-guide/kdump/vmcoreinfo.rst | 11 +++ Documentation/arm64/memory.rst | 2 +- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + kernel/crash_core.c| 1 + 5 files changed, 23 insertions(+), 1 deletion(-) -- 2.7.4
[PATCH v5 5/5] Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'
Add documentation for TCR_EL1.T1SZ variable being added to vmcoreinfo. It indicates the size offset of the memory region addressed by TTBR1_EL1 and hence can be used for determining the vabits_actual value. Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 6 ++ 1 file changed, 6 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 447b64314f56..f9349f9d3345 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -398,6 +398,12 @@ KERNELOFFSET The kernel randomization offset. Used to compute the page offset. If KASLR is disabled, this value is zero. +TCR_EL1.T1SZ + + +Indicates the size offset of the memory region addressed by TTBR1_EL1 +and hence can be used for determining the vabits_actual value. + arm === -- 2.7.4
[PATCH v5 4/5] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/admin-guide/kdump/vmcoreinfo.rst | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/admin-guide/kdump/vmcoreinfo.rst b/Documentation/admin-guide/kdump/vmcoreinfo.rst index 007a6b86e0ee..447b64314f56 100644 --- a/Documentation/admin-guide/kdump/vmcoreinfo.rst +++ b/Documentation/admin-guide/kdump/vmcoreinfo.rst @@ -93,6 +93,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[PATCH v5 3/5] Documentation/arm64: Fix a simple typo in memory.rst
Fix a simple typo in arm64/memory.rst Cc: Jonathan Corbet Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: linux-...@vger.kernel.org Cc: linux-ker...@vger.kernel.org Cc: linux-arm-ker...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/arm64/memory.rst | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/arm64/memory.rst b/Documentation/arm64/memory.rst index 02e02175e6f5..cf03b3290800 100644 --- a/Documentation/arm64/memory.rst +++ b/Documentation/arm64/memory.rst @@ -129,7 +129,7 @@ this logic. As a single binary will need to support both 48-bit and 52-bit VA spaces, the VMEMMAP must be sized large enough for 52-bit VAs and -also must be sized large enought to accommodate a fixed PAGE_OFFSET. +also must be sized large enough to accommodate a fixed PAGE_OFFSET. Most code in the kernel should not need to consider the VA_BITS, for code that does need to know the VA size the variables are -- 2.7.4
[PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size; then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export the same in vmcoreinfo rather than vabits_actual variable, as the name of the variable can change in future kernel versions, but the architectural constructs like TCR_EL1.T1SZ can be used better to indicate intended specific fields to user-space. User-space utilities like makedumpfile and crash-utility, need to read/write this value from/to vmcoreinfo for determining if a virtual address lies in the linear map range. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. 
http://lists.infradead.org/pipermail/kexec/2019-November/023992.html Cc: James Morse Cc: Mark Rutland Cc: Will Deacon Cc: Steve Capper Cc: Catalin Marinas Cc: Ard Biesheuvel Cc: Dave Anderson Cc: Kazuhito Hagio Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 + 2 files changed, 10 insertions(+) diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index d9fbd433cc17..d2e7aff5821e 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -215,6 +215,7 @@ #define TCR_TxSZ(x)(TCR_T0SZ(x) | TCR_T1SZ(x)) #define TCR_TxSZ_WIDTH 6 #define TCR_T0SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T0SZ_OFFSET) +#define TCR_T1SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T1SZ_OFFSET) #define TCR_EPD0_SHIFT 7 #define TCR_EPD0_MASK (UL(1) << TCR_EPD0_SHIFT) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include #include +static inline u64 get_tcr_el1_t1sz(void); + +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} + void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", PHYS_OFFSET); + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", + get_tcr_el1_t1sz()); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); } -- 2.7.4
[PATCH v5 1/5] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for the 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here:

[0].
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma
---
 kernel/crash_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
--
2.7.4
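A minimal sketch of the user-space computation this export enables. Note that SECTION_SIZE_BITS below is an assumed per-arch constant chosen purely for illustration; a real consumer must use the correct value for its target architecture:

```c
#include <stdint.h>

/* Hypothetical sketch: with MAX_PHYSMEM_BITS available from vmcoreinfo,
 * user-space can derive SECTIONS_SHIFT exactly as the kernel does.
 * SECTION_SIZE_BITS is an assumption here, for illustration only. */
#define SECTION_SIZE_BITS 30

static inline unsigned int sections_shift(unsigned int max_physmem_bits)
{
	/* Mirrors: #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) */
	return max_physmem_bits - SECTION_SIZE_BITS;
}
```

With this sketch, a 48-bit physical address space gives a section shift of 18, and the 52-bit case gives 22, without guessing from kernel CONFIG flags.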
[PATCH v5 0/5] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v4:

- v4 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-November/023961.html
- Addressed comments from Dave and added patches for documenting new
  variables appended to vmcoreinfo documentation.
- Added testing report shared by Akashi for PATCH 2/5.

Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
  ifdef sections, as suggested by Kazu.
- Updated vmcoreinfo documentation to add description about
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
  [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code
  (all archs)

This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels.
The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases.

In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (5):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  Documentation/arm64: Fix a simple typo in memory.rst
  Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
  Documentation/vmcoreinfo: Add documentation for 'TCR_EL1.T1SZ'

 Documentation/admin-guide/kdump/vmcoreinfo.rst | 11 +++
 Documentation/arm64/memory.rst                 |  2 +-
 arch/arm64/include/asm/pgtable-hwdef.h         |  1 +
 arch/arm64/kernel/crash_core.c                 |  9 +
 kernel/crash_core.c                            |  1 +
 5 files changed, 23 insertions(+), 1 deletion(-)

--
2.7.4
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Dave, On Thu, Nov 21, 2019 at 8:51 AM Dave Young wrote: > > On 11/11/19 at 01:31pm, Bhupesh Sharma wrote: > > Changes since v3: > > > > - v3 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > instead of PTRS_PER_PGD. > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > 'Documentation/arm64/memory.rst' > > > > Changes since v2: > > > > - v2 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM > > ifdef sections, as suggested by Kazu. > > - Updated vmcoreinfo documentation to add description about > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). > > > > Changes since v1: > > > > - v1 was sent out as a single patch which can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > - v2 breaks the single patch into two independent patches: > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code > > (all archs) > > > > This patchset primarily fixes the regression reported in user-space > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > with the availability of 52-bit address space feature in underlying > > kernel. These regressions have been reported both on CPUs which don't > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > support ARMv8.2 extensions and are running newer kernels. 
> > > > The reason for these regressions is that right now user-space tools > > have no direct access to these values (since these are not exported > > from the kernel) and hence need to rely on a best-guess method of > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > by underlying kernel. > > > > Exporting these values via vmcoreinfo will help user-land in such cases. > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > vmcoreinfo in the core code itself rather than in arm64 arch-specific > > code, so that the user-space code for other archs can also benefit from > > this addition to the vmcoreinfo and use it as a standard way of > > determining 'SECTIONS_SHIFT' value in user-land. > > > > Cc: Boris Petkov > > Cc: Ingo Molnar > > Cc: Thomas Gleixner > > Cc: Jonathan Corbet > > Cc: James Morse > > Cc: Mark Rutland > > Cc: Will Deacon > > Cc: Steve Capper > > Cc: Catalin Marinas > > Cc: Ard Biesheuvel > > Cc: Michael Ellerman > > Cc: Paul Mackerras > > Cc: Benjamin Herrenschmidt > > Cc: Dave Anderson > > Cc: Kazuhito Hagio > > Cc: x...@kernel.org > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-ker...@vger.kernel.org > > Cc: linux-...@vger.kernel.org > > Cc: ke...@lists.infradead.org > > > > Bhupesh Sharma (3): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Soft reminder: the new introduced vmcoreinfo needs documentation > > Please check Documentation/admin-guide/kdump/vmcoreinfo.rst Sure, will send a v5 to address the same. Thanks, Bhupesh
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
On Tue, Nov 19, 2019 at 12:03 PM Prabhakar Kushwaha wrote: > > Hi Akashi, > > On Fri, Nov 15, 2019 at 7:29 AM AKASHI Takahiro > wrote: > > > > Bhupesh, > > > > On Fri, Nov 15, 2019 at 01:24:17AM +0530, Bhupesh Sharma wrote: > > > Hi Akashi, > > > > > > On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro > > > wrote: > > > > > > > > Hi Bhupesh, > > > > > > > > Do you have a corresponding patch for userspace tools, > > > > including crash util and/or makedumpfile? > > > > Otherwise, we can't verify that a generated core file is > > > > correctly handled. > > > > > > Sure. I am still working on the crash-utility related changes, but you > > > can find the makedumpfile changes I posted a couple of days ago here > > > (see [0]) and the github link for the makedumpfile changes can be seen > > > via [1]. > > > > > > I will post the crash-util changes shortly as well. > > > Thanks for having a look at the same. > > > > Thank you. > > I have tested my kdump patch with a hacked version of crash > > where VA_BITS_ACTUAL is calculated from tcr_el1_t1sz in vmcoreinfo. > > > > I also did hack to calculate VA_BITS_ACTUAL is calculated from > tcr_el1_t1sz in vmcoreinfo. Now i am getting error same as mentioned > by you in other thread last month. > https://www.mail-archive.com/crash-utility@redhat.com/msg07385.html > > how this error was overcome? > > I am using > - crashkernel: https://github.com/crash-utility/crash.git commit: > babd7ae62d4e8fd6f93fd30b88040d9376522aa3 > and > - Linux: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git > commit: af42d3466bdc8f39806b26f593604fdc54140bcb I will post a formal change for crash-utility shortly that fixes the same. Right now we are having issues with emails bouncing off 'crash-util...@redhat.com', so my patches sent to the same are in undelivered state at-the-moment. For easy testing I will share the link to my github tree (off-line) [which contains the changes] as well. Regards, Bhupesh
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Akashi, On Fri, Nov 15, 2019 at 7:29 AM AKASHI Takahiro wrote: > > Bhupesh, > > On Fri, Nov 15, 2019 at 01:24:17AM +0530, Bhupesh Sharma wrote: > > Hi Akashi, > > > > On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro > > wrote: > > > > > > Hi Bhupesh, > > > > > > Do you have a corresponding patch for userspace tools, > > > including crash util and/or makedumpfile? > > > Otherwise, we can't verify that a generated core file is > > > correctly handled. > > > > Sure. I am still working on the crash-utility related changes, but you > > can find the makedumpfile changes I posted a couple of days ago here > > (see [0]) and the github link for the makedumpfile changes can be seen > > via [1]. > > > > I will post the crash-util changes shortly as well. > > Thanks for having a look at the same. > > Thank you. > I have tested my kdump patch with a hacked version of crash > where VA_BITS_ACTUAL is calculated from tcr_el1_t1sz in vmcoreinfo. Thanks a lot for testing the changes. I will push the crash utility changes for review shortly and also Cc you to the patches. It would be great to have your Tested-by for this patch-set, if the user-space works fine for you with these changes. Regards, Bhupesh > -Takahiro Akashi > > > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023963.html > > [1]. > > https://github.com/bhupesh-sharma/makedumpfile/tree/52-bit-va-support-via-vmcore-upstream-v4 > > > > Regards, > > Bhupesh > > > > > > > > Thanks, > > > -Takahiro Akashi > > > > > > On Mon, Nov 11, 2019 at 01:31:19PM +0530, Bhupesh Sharma wrote: > > > > Changes since v3: > > > > > > > > - v3 can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > > > instead of PTRS_PER_PGD. 
> > > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > > > 'Documentation/arm64/memory.rst' > > > > > > > > Changes since v2: > > > > > > > > - v2 can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under > > > > CONFIG_SPARSEMEM > > > > ifdef sections, as suggested by Kazu. > > > > - Updated vmcoreinfo documentation to add description about > > > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). > > > > > > > > Changes since v1: > > > > > > > > - v1 was sent out as a single patch which can be seen here: > > > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > > > > > - v2 breaks the single patch into two independent patches: > > > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, > > > > whereas > > > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel > > > > code (all archs) > > > > > > > > This patchset primarily fixes the regression reported in user-space > > > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > > > with the availability of 52-bit address space feature in underlying > > > > kernel. These regressions have been reported both on CPUs which don't > > > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > > > support ARMv8.2 extensions and are running newer kernels. > > > > > > > > The reason for these regressions is that right now user-space tools > > > > have no direct access to these values (since these are not exported > > > > from the kernel) and hence need to rely on a best-guess method of > > > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > > > by underlying kernel. > > > > > > > > Exporting these values via vmcoreinfo will help user-land in such cases. 
> > > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > > > vmcoreinfo in the core code itself rather than in arm64
Re: [PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Akashi, On Wed, Nov 13, 2019 at 12:11 PM AKASHI Takahiro wrote: > > Hi Bhupesh, > > Do you have a corresponding patch for userspace tools, > including crash util and/or makedumpfile? > Otherwise, we can't verify that a generated core file is > correctly handled. Sure. I am still working on the crash-utility related changes, but you can find the makedumpfile changes I posted a couple of days ago here (see [0]) and the github link for the makedumpfile changes can be seen via [1]. I will post the crash-util changes shortly as well. Thanks for having a look at the same. [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023963.html [1]. https://github.com/bhupesh-sharma/makedumpfile/tree/52-bit-va-support-via-vmcore-upstream-v4 Regards, Bhupesh > > Thanks, > -Takahiro Akashi > > On Mon, Nov 11, 2019 at 01:31:19PM +0530, Bhupesh Sharma wrote: > > Changes since v3: > > > > - v3 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022590.html > > - Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo > > instead of PTRS_PER_PGD. > > - Added a new patch (via [PATCH 3/3]), which fixes a simple typo in > > 'Documentation/arm64/memory.rst' > > > > Changes since v2: > > > > - v2 can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-March/022531.html > > - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM > > ifdef sections, as suggested by Kazu. > > - Updated vmcoreinfo documentation to add description about > > 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). 
> > > > Changes since v1: > > > > - v1 was sent out as a single patch which can be seen here: > > http://lists.infradead.org/pipermail/kexec/2019-February/022411.html > > > > - v2 breaks the single patch into two independent patches: > > [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas > > [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code > > (all archs) > > > > This patchset primarily fixes the regression reported in user-space > > utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture > > with the availability of 52-bit address space feature in underlying > > kernel. These regressions have been reported both on CPUs which don't > > support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels > > and also on prototype platforms (like ARMv8 FVP simulator model) which > > support ARMv8.2 extensions and are running newer kernels. > > > > The reason for these regressions is that right now user-space tools > > have no direct access to these values (since these are not exported > > from the kernel) and hence need to rely on a best-guess method of > > determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported > > by underlying kernel. > > > > Exporting these values via vmcoreinfo will help user-land in such cases. > > In addition, as per suggestion from makedumpfile maintainer (Kazu), > > it makes more sense to append 'MAX_PHYSMEM_BITS' to > > vmcoreinfo in the core code itself rather than in arm64 arch-specific > > code, so that the user-space code for other archs can also benefit from > > this addition to the vmcoreinfo and use it as a standard way of > > determining 'SECTIONS_SHIFT' value in user-land. 
> > > > Cc: Boris Petkov > > Cc: Ingo Molnar > > Cc: Thomas Gleixner > > Cc: Jonathan Corbet > > Cc: James Morse > > Cc: Mark Rutland > > Cc: Will Deacon > > Cc: Steve Capper > > Cc: Catalin Marinas > > Cc: Ard Biesheuvel > > Cc: Michael Ellerman > > Cc: Paul Mackerras > > Cc: Benjamin Herrenschmidt > > Cc: Dave Anderson > > Cc: Kazuhito Hagio > > Cc: x...@kernel.org > > Cc: linuxppc-dev@lists.ozlabs.org > > Cc: linux-arm-ker...@lists.infradead.org > > Cc: linux-ker...@vger.kernel.org > > Cc: linux-...@vger.kernel.org > > Cc: ke...@lists.infradead.org > > > > Bhupesh Sharma (3): > > crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo > > arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo > > Documentation/arm64: Fix a simple typo in memory.rst > > > > Documentation/arm64/memory.rst | 2 +- > > arch/arm64/include/asm/pgtable-hwdef.h | 1 + > > arch/arm64/kernel/crash_core.c | 9 + > > kernel/crash_core.c| 1 + > > 4 files changed, 12 insertions(+), 1 deletion(-) > > > > -- > > 2.7.4 > > >
[PATCH v4 0/3] Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v3:

- v3 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022590.html
- Addressed comments from James and exported TCR_EL1.T1SZ in vmcoreinfo
  instead of PTRS_PER_PGD.
- Added a new patch (via [PATCH 3/3]), which fixes a simple typo in
  'Documentation/arm64/memory.rst'

Changes since v2:

- v2 can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-March/022531.html
- Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM
  ifdef sections, as suggested by Kazu.
- Updated vmcoreinfo documentation to add description about
  'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]).

Changes since v1:

- v1 was sent out as a single patch which can be seen here:
  http://lists.infradead.org/pipermail/kexec/2019-February/022411.html
- v2 breaks the single patch into two independent patches:
  [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas
  [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code
  (all archs)

This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels.

The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'vabits_actual' and 'MAX_PHYSMEM_BITS' supported by underlying kernel.

Exporting these values via vmcoreinfo will help user-land in such cases.
In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land.

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: Jonathan Corbet
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: linux-...@vger.kernel.org
Cc: ke...@lists.infradead.org

Bhupesh Sharma (3):
  crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
  arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo
  Documentation/arm64: Fix a simple typo in memory.rst

 Documentation/arm64/memory.rst         | 2 +-
 arch/arm64/include/asm/pgtable-hwdef.h | 1 +
 arch/arm64/kernel/crash_core.c         | 9 +
 kernel/crash_core.c                    | 1 +
 4 files changed, 12 insertions(+), 1 deletion(-)

--
2.7.4
[PATCH v4 1/3] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation):

  #define SECTIONS_SHIFT	(MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)

Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for the 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags).

As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here:

[0].
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471

Cc: Boris Petkov
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: James Morse
Cc: Mark Rutland
Cc: Will Deacon
Cc: Steve Capper
Cc: Catalin Marinas
Cc: Ard Biesheuvel
Cc: Michael Ellerman
Cc: Paul Mackerras
Cc: Benjamin Herrenschmidt
Cc: Dave Anderson
Cc: Kazuhito Hagio
Cc: x...@kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: linux-arm-ker...@lists.infradead.org
Cc: linux-ker...@vger.kernel.org
Cc: ke...@lists.infradead.org
Signed-off-by: Bhupesh Sharma
---
 kernel/crash_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/crash_core.c b/kernel/crash_core.c
index 9f1557b98468..18175687133a 100644
--- a/kernel/crash_core.c
+++ b/kernel/crash_core.c
@@ -413,6 +413,7 @@ static int __init crash_save_vmcoreinfo_init(void)
 	VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS);
 	VMCOREINFO_STRUCT_SIZE(mem_section);
 	VMCOREINFO_OFFSET(mem_section, section_mem_map);
+	VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS);
 #endif
 	VMCOREINFO_STRUCT_SIZE(page);
 	VMCOREINFO_STRUCT_SIZE(pglist_data);
--
2.7.4
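The exported values reach user-space as plain "NUMBER(name)=value" strings in the vmcoreinfo note. A hypothetical parser for one such line (an illustrative sketch, not the actual makedumpfile or crash code) could look like this:

```c
#include <stdio.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch: parse one "NUMBER(name)=value" line of the kind
 * appended to vmcoreinfo.  Handles both decimal values (as emitted by
 * VMCOREINFO_NUMBER) and "0x"-prefixed hex values.  Returns 1 on a
 * successful match of the requested name, 0 otherwise. */
static int parse_vmcoreinfo_number(const char *line, const char *name,
				   uint64_t *val)
{
	char key[64];
	long long v;

	/* %lli auto-detects the base, so "46" and "0x10" both work */
	if (sscanf(line, "NUMBER(%63[^)])=%lli", key, &v) != 2)
		return 0;
	if (strcmp(key, name) != 0)
		return 0;
	*val = (uint64_t)v;
	return 1;
}
```

A consumer would iterate over the vmcoreinfo lines and call this once per variable of interest, e.g. for "MAX_PHYSMEM_BITS" or "tcr_el1_t1sz".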
[RESEND PATCH] Documentation/stackprotector: powerpc supports stack protector
The powerpc architecture (both 64-bit and 32-bit) has supported the stack protector mechanism for some time now [see commit 06ec27aea9fc ("powerpc/64: add stack protector support")]. Update the stackprotector arch support documentation to reflect the same.

Cc: Jonathan Corbet
Cc: Michael Ellerman
Cc: linux-...@vger.kernel.org
Cc: linux-ker...@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Bhupesh Sharma
---
Resend, this time Cc'ing Jonathan and doc-list.

 Documentation/features/debug/stackprotector/arch-support.txt | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt
index ea521f3e..32bbdfc64c32 100644
--- a/Documentation/features/debug/stackprotector/arch-support.txt
+++ b/Documentation/features/debug/stackprotector/arch-support.txt
@@ -22,7 +22,7 @@
 |       nios2: | TODO |
 |    openrisc: | TODO |
 |      parisc: | TODO |
-|     powerpc: | TODO |
+|     powerpc: |  ok  |
 |       riscv: | TODO |
 |        s390: | TODO |
 |          sh: |  ok  |
--
2.7.4
Re: [PATCH] Documentation/stackprotector: powerpc supports stack protector
Hi Jonathan, On Fri, May 31, 2019 at 8:44 PM Michael Ellerman wrote: > > Jonathan Corbet writes: > > On Thu, 30 May 2019 18:37:46 +0530 > > Bhupesh Sharma wrote: > > > >> > This should probably go via the documentation tree? > >> > > >> > Acked-by: Michael Ellerman > >> > >> Thanks for the review Michael. > >> I am ok with this going through the documentation tree as well. > > > > Works for me too, but I don't seem to find the actual patch anywhere I > > look. Can you send me a copy? > > You can get it from lore: > > > https://lore.kernel.org/linuxppc-dev/1559212177-7072-1-git-send-email-bhsha...@redhat.com/raw > > Or patchwork (automatically adds my ack): > > https://patchwork.ozlabs.org/patch/1107706/mbox/ > > Or Bhupesh can send it to you :) Please let me know if I should send out the patch again, this time Cc'ing you and the doc-list. Thanks, Bhupesh
Re: [PATCH 22/22] docs: fix broken documentation links
or details, see Documentation/x86/intel_mpx.rst > > If unsure, say N. > > @@ -1911,7 +1911,7 @@ config X86_INTEL_MEMORY_PROTECTION_KEYS > page-based protections, but without requiring modification of the > page tables when an application changes protection domains. > > - For details, see Documentation/x86/protection-keys.txt > + For details, see Documentation/x86/protection-keys.rst > > If unsure, say y. > > diff --git a/arch/x86/Kconfig.debug b/arch/x86/Kconfig.debug > index f730680dc818..59f598543203 100644 > --- a/arch/x86/Kconfig.debug > +++ b/arch/x86/Kconfig.debug > @@ -156,7 +156,7 @@ config IOMMU_DEBUG > code. When you use it make sure you have a big enough > IOMMU/AGP aperture. Most of the options enabled by this can > be set more finegrained using the iommu= command line > - options. See Documentation/x86/x86_64/boot-options.txt for more > + options. See Documentation/x86/x86_64/boot-options.rst for more > details. > > config IOMMU_LEAK > diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S > index 850b8762e889..90d791ca1a95 100644 > --- a/arch/x86/boot/header.S > +++ b/arch/x86/boot/header.S > @@ -313,7 +313,7 @@ start_sys_seg:.word SYSSEG # obsolete and > meaningless, but just > > type_of_loader: .byte 0 # 0 means ancient bootloader, > newer > # bootloaders know to change this. > - # See Documentation/x86/boot.txt for > + # See Documentation/x86/boot.rst for > # assigned ids > > # flags, unused bits must be zero (RFU) bit within loadflags > diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S > index 11aa3b2afa4d..33f9fc38d014 100644 > --- a/arch/x86/entry/entry_64.S > +++ b/arch/x86/entry/entry_64.S > @@ -8,7 +8,7 @@ >* >* entry.S contains the system-call and fault low-level handling routines. 
>* > - * Some of this is documented in Documentation/x86/entry_64.txt > + * Some of this is documented in Documentation/x86/entry_64.rst >* >* A note on terminology: >* - iret frame:Architecture defined interrupt frame from SS to RIP > diff --git a/arch/x86/include/asm/bootparam_utils.h > b/arch/x86/include/asm/bootparam_utils.h > index f6f6ef436599..101eb944f13c 100644 > --- a/arch/x86/include/asm/bootparam_utils.h > +++ b/arch/x86/include/asm/bootparam_utils.h > @@ -24,7 +24,7 @@ static void sanitize_boot_params(struct boot_params > *boot_params) >* IMPORTANT NOTE TO BOOTLOADER AUTHORS: do not simply clear >* this field. The purpose of this field is to guarantee >* compliance with the x86 boot spec located in > - * Documentation/x86/boot.txt . That spec says that the > + * Documentation/x86/boot.rst . That spec says that the >* *whole* structure should be cleared, after which only the >* portion defined by struct setup_header (boot_params->hdr) >* should be copied in. > diff --git a/arch/x86/include/asm/page_64_types.h > b/arch/x86/include/asm/page_64_types.h > index 793c14c372cb..288b065955b7 100644 > --- a/arch/x86/include/asm/page_64_types.h > +++ b/arch/x86/include/asm/page_64_types.h > @@ -48,7 +48,7 @@ > > #define __START_KERNEL_map _AC(0x8000, UL) > > -/* See Documentation/x86/x86_64/mm.txt for a description of the memory map. > */ > +/* See Documentation/x86/x86_64/mm.rst for a description of the memory map. > */ > > #define __PHYSICAL_MASK_SHIFT 52 > > diff --git a/arch/x86/include/asm/pgtable_64_types.h > b/arch/x86/include/asm/pgtable_64_types.h > index 88bca456da99..52e5f5f2240d 100644 > --- a/arch/x86/include/asm/pgtable_64_types.h > +++ b/arch/x86/include/asm/pgtable_64_types.h > @@ -103,7 +103,7 @@ extern unsigned int ptrs_per_p4d; > #define PGDIR_MASK (~(PGDIR_SIZE - 1)) > > /* > - * See Documentation/x86/x86_64/mm.txt for a description of the memory map. > + * See Documentation/x86/x86_64/mm.rst for a description of the memory map. 
>* >* Be very careful vs. KASLR when changing anything here. The KASLR address >* range must not overlap with anything except the KASAN shadow area, which > diff --git a/arch/x86/kernel/cpu/microcode/amd.c > b/arch/x86/kernel/cpu/microcode/amd.c > index e1f3ba19ba54..06d4e67f31ab 100644 > --- a/arch/x86/kernel/cpu/microcode/amd.c > +++ b/arch/x86/kernel/cpu/microcode/amd.c > @@ -61,7 +61,7 @@ static u8 amd_ucode_patch[PATCH_MAX_SIZE]; > > /* >* Microcode patch container file is prepended to the initrd in cpio > - * format. See Documentation/x86/microcode.txt > + * format. See Documentation/x86/microcode.rst >*/ > static const char > ucode_path[] __maybe_unused = "kernel/x86/microcode/AuthenticAMD.bin"; > diff --git a/arch/x86/kernel/kexec-bzimage64.c > b/arch/x86/kernel/kexec-bzimage64.c > index 22f60dd26460..b07e7069b09e 100644 > --- a/arch/x86/kernel/kexec-bzimage64.c > +++ b/arch/x86/kernel/kexec-bzimage64.c > @@ -416,7 +416,7 @@ static void *bzImage64_load(struct kimage *image, char > *kernel, > efi_map_offset = params_cmdline_sz; > efi_setup_data_offset = efi_map_offset + ALIGN(efi_map_sz, 16); > > - /* Copy setup header onto bootparams. Documentation/x86/boot.txt */ > + /* Copy setup header onto bootparams. Documentation/x86/boot.rst */ > setup_header_size = 0x0202 + kernel[0x0201] - setup_hdr_offset; For the arm, arm64 and x86 kexec bits: Reviewed-by: Bhupesh Sharma Thanks, Bhupesh
Re: [PATCH] Documentation/stackprotector: powerpc supports stack protector
On Thu, May 30, 2019 at 6:25 PM Michael Ellerman wrote: > > Bhupesh Sharma writes: > > powerpc architecture (both 64-bit and 32-bit) supports stack protector > > mechanism since some time now [see commit 06ec27aea9fc ("powerpc/64: > > add stack protector support")]. > > > > Update stackprotector arch support documentation to reflect the same. > > > > Signed-off-by: Bhupesh Sharma > > --- > > Documentation/features/debug/stackprotector/arch-support.txt | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/Documentation/features/debug/stackprotector/arch-support.txt > > b/Documentation/features/debug/stackprotector/arch-support.txt > > index ea521f3e..32bbdfc64c32 100644 > > --- a/Documentation/features/debug/stackprotector/arch-support.txt > > +++ b/Documentation/features/debug/stackprotector/arch-support.txt > > @@ -22,7 +22,7 @@ > > | nios2: | TODO | > > |openrisc: | TODO | > > | parisc: | TODO | > > -| powerpc: | TODO | > > +| powerpc: | ok | > > | riscv: | TODO | > > |s390: | TODO | > > | sh: | ok | > > -- > > 2.7.4 > > Thanks. > > This should probably go via the documentation tree? > > Acked-by: Michael Ellerman Thanks for the review Michael. I am ok with this going through the documentation tree as well. Regards, Bhupesh
[PATCH] Documentation/stackprotector: powerpc supports stack protector
The powerpc architecture (both 64-bit and 32-bit) has supported the stack protector mechanism for some time now [see commit 06ec27aea9fc ("powerpc/64: add stack protector support")]. Update stackprotector arch support documentation to reflect the same. Signed-off-by: Bhupesh Sharma --- Documentation/features/debug/stackprotector/arch-support.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Documentation/features/debug/stackprotector/arch-support.txt b/Documentation/features/debug/stackprotector/arch-support.txt index ea521f3e..32bbdfc64c32 100644 --- a/Documentation/features/debug/stackprotector/arch-support.txt +++ b/Documentation/features/debug/stackprotector/arch-support.txt @@ -22,7 +22,7 @@ | nios2: | TODO | |openrisc: | TODO | | parisc: | TODO | -| powerpc: | TODO | +| powerpc: | ok | | riscv: | TODO | |s390: | TODO | | sh: | ok | -- 2.7.4
[PATCH] include/kcore: Remove left-over instances of 'kclist_add_remap()'
Commit bf904d2762ee ("x86/pti/64: Remove the SYSCALL64 entry trampoline") removed the sole usage of 'kclist_add_remap()' from 'arch/x86/mm/cpu_entry_area.c', but it did not remove the left-over definition from the include file. Fix the same. Cc: James Morse Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Dave Anderson Cc: Dave Young Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- include/linux/kcore.h | 11 --- 1 file changed, 11 deletions(-) diff --git a/include/linux/kcore.h b/include/linux/kcore.h index c843f4a9c512..da676cdbd727 100644 --- a/include/linux/kcore.h +++ b/include/linux/kcore.h @@ -38,12 +38,6 @@ struct vmcoredd_node { #ifdef CONFIG_PROC_KCORE void __init kclist_add(struct kcore_list *, void *, size_t, int type); -static inline -void kclist_add_remap(struct kcore_list *m, void *addr, void *vaddr, size_t sz) -{ - m->vaddr = (unsigned long)vaddr; - kclist_add(m, addr, sz, KCORE_REMAP); -} extern int __init register_mem_pfn_is_ram(int (*fn)(unsigned long pfn)); #else @@ -51,11 +45,6 @@ static inline void kclist_add(struct kcore_list *new, void *addr, size_t size, int type) { } - -static inline -void kclist_add_remap(struct kcore_list *m, void *addr, void *vaddr, size_t sz) -{ -} #endif #endif /* _LINUX_KCORE_H */ -- 2.7.4
[PATCH v3 3/3] Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS'
Add documentation for 'MAX_PHYSMEM_BITS' variable being added to vmcoreinfo. 'MAX_PHYSMEM_BITS' defines the maximum supported physical address space memory. Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- Documentation/kdump/vmcoreinfo.txt | 5 + 1 file changed, 5 insertions(+) diff --git a/Documentation/kdump/vmcoreinfo.txt b/Documentation/kdump/vmcoreinfo.txt index bb94a4bd597a..f5a11388dc49 100644 --- a/Documentation/kdump/vmcoreinfo.txt +++ b/Documentation/kdump/vmcoreinfo.txt @@ -95,6 +95,11 @@ It exists in the sparse memory mapping model, and it is also somewhat similar to the mem_map variable, both of them are used to translate an address. +MAX_PHYSMEM_BITS + + +Defines the maximum supported physical address space memory. + page -- 2.7.4
[PATCH v3 2/3] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..495f09084696 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -415,6 +415,7 @@ static int __init crash_save_vmcoreinfo_init(void) VMCOREINFO_LENGTH(mem_section, NR_SECTION_ROOTS); VMCOREINFO_STRUCT_SIZE(mem_section); VMCOREINFO_OFFSET(mem_section, section_mem_map); + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); #endif VMCOREINFO_STRUCT_SIZE(page); VMCOREINFO_STRUCT_SIZE(pglist_data); -- 2.7.4
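The user-space consumption this patch enables can be sketched as follows: read the `NUMBER(MAX_PHYSMEM_BITS)=` line from the vmcoreinfo note and derive SECTIONS_SHIFT exactly as the kernel macro does. The sample note text and the SECTION_SIZE_BITS value below are illustrative assumptions, not details taken from the referenced makedumpfile implementation.

```python
# Sketch of how a user-space tool such as makedumpfile could use the new
# export: parse NUMBER(MAX_PHYSMEM_BITS)= out of the vmcoreinfo note and
# compute SECTIONS_SHIFT the same way the kernel does.

def parse_vmcoreinfo_number(note_text, name):
    """Return the integer value of a NUMBER(name)=value vmcoreinfo line."""
    prefix = "NUMBER(%s)=" % name
    for line in note_text.splitlines():
        if line.startswith(prefix):
            return int(line[len(prefix):], 0)
    return None

def sections_shift(max_physmem_bits, section_size_bits):
    # Mirrors: #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS)
    return max_physmem_bits - section_size_bits

sample = "PAGESIZE=65536\nNUMBER(MAX_PHYSMEM_BITS)=52\n"  # assumed note contents
bits = parse_vmcoreinfo_number(sample, "MAX_PHYSMEM_BITS")
print(sections_shift(bits, 30))  # with an assumed SECTION_SIZE_BITS of 30 -> 22
```

With the value exported, no per-arch guessing (or reading of kernel CONFIG flags) is needed on the tool side.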
[PATCH v3 0/3] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v2: - v2 can be seen here: http://lists.infradead.org/pipermail/kexec/2019-March/022531.html - Protected 'MAX_PHYSMEM_BITS' vmcoreinfo variable under CONFIG_SPARSEMEM ifdef sections, as suggested by Kazu. - Updated vmcoreinfo documentation to add description about 'MAX_PHYSMEM_BITS' variable (via [PATCH 3/3]). Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (3): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo Documentation/vmcoreinfo: Add documentation for 'MAX_PHYSMEM_BITS' Documentation/kdump/vmcoreinfo.txt | 5 + arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 3 files changed, 7 insertions(+) -- 2.7.4
Re: [PATCH v2 2/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Hi Kazu, On 03/13/2019 01:17 AM, Kazuhito Hagio wrote: Hi Bhupesh, -Original Message- Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..44b90368e183 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -467,6 +467,7 @@ static int __init crash_save_vmcoreinfo_init(void) #define PAGE_OFFLINE_MAPCOUNT_VALUE (~PG_offline) VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE); #endif + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); Some architectures define MAX_PHYSMEM_BITS only with CONFIG_SPARSEMEM, so we need to move this to the #ifdef section that exports some mem_section things. Thanks! Kazu Sorry for the late response, I wanted to make sure I check almost all archs to understand if a proposal would work for all. As per my current understanding, we can protect the export of 'MAX_PHYSMEM_BITS' via a #ifdef section against CONFIG_SPARSEMEM, and it should work for all archs. Here are some arguments to support the same; I would request maintainers of various archs (in Cc) to correct me if I am missing something here: 1. SPARSEMEM is dependent upon (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL: config SPARSEMEM def_bool y depends on (!SELECT_MEMORY_MODEL && ARCH_SPARSEMEM_ENABLE) || SPARSEMEM_MANUAL 2. For a couple of archs, this option is already turned on by default in their respective defconfigs: $ grep -nrw "CONFIG_SPARSEMEM_MANUAL" * arch/ia64/configs/gensparse_defconfig:18:CONFIG_SPARSEMEM_MANUAL=y arch/powerpc/configs/ppc64e_defconfig:30:CONFIG_SPARSEMEM_MANUAL=y 3. 
Note that other archs use ARCH_SPARSEMEM_DEFAULT to define if CONFIG_SPARSEMEM_MANUAL is set by default: choice prompt "Memory model" .. default SPARSEMEM_MANUAL if ARCH_SPARSEMEM_DEFAULT 3a. $ grep -nrw -A 2 "ARCH_SPARSEMEM_DEFAULT" * arch/s390/Kconfig:621:config ARCH_SPARSEMEM_DEFAULT arch/s390/Kconfig-622- def_bool y -- arch/x86/Kconfig:1623:config ARCH_SPARSEMEM_DEFAULT arch/x86/Kconfig-1624- def_bool y arch/x86/Kconfig-1625- depends on X86_64 -- arch/powerpc/Kconfig:614:config ARCH_SPARSEMEM_DEFAULT arch/powerpc/Kconfig-615- def_bool y arch/powerpc/Kconfig-616- depends on PPC_BOOK3S_64 -- arch/arm64/Kconfig:850:config ARCH_SPARSEMEM_DEFAULT arch/arm64/Kconfig-851- def_bool ARCH_SPARSEMEM_ENABLE -- arch/sh/mm/Kconfig:138:config ARCH_SPARSEMEM_DEFAULT arch/sh/mm/Kconfig-139- def_bool y -- arch/sparc/Kconfig:315:config ARCH_SPARSEMEM_DEFAULT arch/sparc/Kconfig-316- def_bool y if SPARC64 -- arch/arm/Kconfig:1591:config ARCH_SPARSEMEM_DEFAULT arch/arm/Kconfig-1592- def_bool ARCH_SPARSEMEM_ENABLE Since most archs (except MIPS) set CONFIG_ARCH_SPARSEMEM_DEFAULT/CONFIG_ARCH_SPARSEMEM_ENABLE to y in the default configurations, so even though they don't protect 'MAX_PHYSMEM_BITS' define in CONFIG_SPARSEMEM ifdef sections, we still would be ok protecting the 'MAX_PHYSMEM_BITS' vmcoreinfo export inside a CONFIG_SPARSEMEM ifdef section. Thanks for your inputs, I will include this change in the v3. Regards, Bhupesh
Re: [PATCH v2 0/2] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Hi Dave, On 03/11/2019 02:35 PM, Dave Young wrote: Hi Bhupesh, On 03/10/19 at 03:34pm, Bhupesh Sharma wrote: Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu) during v1 review, it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (2): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 2 files changed, 2 insertions(+) Lianbo's document patch has been merged, would you mind to add vmcoreinfo doc patch as well in your series? Thanks for the inputs. Will add it to the v3. Let's wait for other comments/reviews, before I spin a version 3. Regards, Bhupesh
[PATCH v2 2/2] crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo
Right now user-space tools like 'makedumpfile' and 'crash' need to rely on a best-guess method of determining the value of 'MAX_PHYSMEM_BITS' supported by the underlying kernel. This value is used in user-space code to calculate the bit-space required to store a section for SPARSEMEM (similar to the existing calculation method used in the kernel implementation): #define SECTIONS_SHIFT (MAX_PHYSMEM_BITS - SECTION_SIZE_BITS) Now, regressions have been reported in user-space utilities like 'makedumpfile' and 'crash' on arm64, with the recently added kernel support for 52-bit physical address space, as there is no clear method of determining this value in user-space (other than reading kernel CONFIG flags). As per a suggestion from the makedumpfile maintainer (Kazu), it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining the 'SECTIONS_SHIFT' value in user-land. A reference 'makedumpfile' implementation which reads the 'MAX_PHYSMEM_BITS' value from vmcoreinfo in an arch-independent fashion is available here: [0]. 
https://github.com/bhupesh-sharma/makedumpfile/blob/remove-max-phys-mem-bit-v1/arch/ppc64.c#L471 Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: James Morse Cc: Will Deacon Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Signed-off-by: Bhupesh Sharma --- kernel/crash_core.c | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/crash_core.c b/kernel/crash_core.c index 093c9f917ed0..44b90368e183 100644 --- a/kernel/crash_core.c +++ b/kernel/crash_core.c @@ -467,6 +467,7 @@ static int __init crash_save_vmcoreinfo_init(void) #define PAGE_OFFLINE_MAPCOUNT_VALUE(~PG_offline) VMCOREINFO_NUMBER(PAGE_OFFLINE_MAPCOUNT_VALUE); #endif + VMCOREINFO_NUMBER(MAX_PHYSMEM_BITS); arch_crash_save_vmcoreinfo(); update_vmcoreinfo_note(); -- 2.7.4
[PATCH v2 0/2] Append new variables to vmcoreinfo (PTRS_PER_PGD for arm64 and MAX_PHYSMEM_BITS for all archs)
Changes since v1: - v1 was sent out as a single patch which can be seen here: http://lists.infradead.org/pipermail/kexec/2019-February/022411.html - v2 breaks the single patch into two independent patches: [PATCH 1/2] appends 'PTRS_PER_PGD' to vmcoreinfo for arm64 arch, whereas [PATCH 2/2] appends 'MAX_PHYSMEM_BITS' to vmcoreinfo in core kernel code (all archs) This patchset primarily fixes the regression reported in user-space utilities like 'makedumpfile' and 'crash-utility' on arm64 architecture with the availability of 52-bit address space feature in underlying kernel. These regressions have been reported both on CPUs which don't support ARMv8.2 extensions (i.e. LVA, LPA) and are running newer kernels and also on prototype platforms (like ARMv8 FVP simulator model) which support ARMv8.2 extensions and are running newer kernels. The reason for these regressions is that right now user-space tools have no direct access to these values (since these are not exported from the kernel) and hence need to rely on a best-guess method of determining value of 'PTRS_PER_PGD' and 'MAX_PHYSMEM_BITS' supported by underlying kernel. Exporting these values via vmcoreinfo will help user-land in such cases. In addition, as per suggestion from makedumpfile maintainer (Kazu) during v1 review, it makes more sense to append 'MAX_PHYSMEM_BITS' to vmcoreinfo in the core code itself rather than in arm64 arch-specific code, so that the user-space code for other archs can also benefit from this addition to the vmcoreinfo and use it as a standard way of determining 'SECTIONS_SHIFT' value in user-land. 
Cc: Mark Rutland Cc: James Morse Cc: Will Deacon Cc: Boris Petkov Cc: Ingo Molnar Cc: Thomas Gleixner Cc: Michael Ellerman Cc: Paul Mackerras Cc: Benjamin Herrenschmidt Cc: Dave Anderson Cc: Kazuhito Hagio Cc: x...@kernel.org Cc: linuxppc-dev@lists.ozlabs.org Cc: linux-arm-ker...@lists.infradead.org Cc: linux-ker...@vger.kernel.org Cc: ke...@lists.infradead.org Bhupesh Sharma (2): arm64, vmcoreinfo : Append 'PTRS_PER_PGD' to vmcoreinfo crash_core, vmcoreinfo: Append 'MAX_PHYSMEM_BITS' to vmcoreinfo arch/arm64/kernel/crash_core.c | 1 + kernel/crash_core.c| 1 + 2 files changed, 2 insertions(+) -- 2.7.4
Re: [PATCH v2 7/9] arm64: kdump: No need to mark crashkernel pages manually PG_reserved
Hi David, On Mon, Jan 14, 2019 at 6:30 PM David Hildenbrand wrote: > > The crashkernel is reserved via memblock_reserve(). memblock_free_all() > will call free_low_memory_core_early(), which will go over all reserved > memblocks, marking the pages as PG_reserved. > > So manually marking pages as PG_reserved is not necessary, they are > already in the desired state (otherwise they would have been handed over > to the buddy as free pages and bad things would happen). > > Cc: Catalin Marinas > Cc: Will Deacon > Cc: James Morse > Cc: Bhupesh Sharma > Cc: David Hildenbrand > Cc: Mark Rutland > Cc: Dave Kleikamp > Cc: Andrew Morton > Cc: Mike Rapoport > Cc: Michal Hocko > Cc: Florian Fainelli > Cc: Stefan Agner > Cc: Laura Abbott > Cc: Greg Hackmann > Cc: Johannes Weiner > Cc: Kristina Martsenko > Cc: CHANDAN VN > Cc: AKASHI Takahiro > Cc: Logan Gunthorpe > Reviewed-by: Matthias Brugger > Signed-off-by: David Hildenbrand > --- > arch/arm64/kernel/machine_kexec.c | 2 +- > arch/arm64/mm/init.c | 27 --- > 2 files changed, 1 insertion(+), 28 deletions(-) > > diff --git a/arch/arm64/kernel/machine_kexec.c > b/arch/arm64/kernel/machine_kexec.c > index 6f0587b5e941..66b5d697d943 100644 > --- a/arch/arm64/kernel/machine_kexec.c > +++ b/arch/arm64/kernel/machine_kexec.c > @@ -321,7 +321,7 @@ void crash_post_resume(void) > * but does not hold any data of loaded kernel image. > * > * Note that all the pages in crash dump kernel memory have been initially > - * marked as Reserved in kexec_reserve_crashkres_pages(). > + * marked as Reserved as memory was allocated via memblock_reserve(). > * > * In hibernation, the pages which are Reserved and yet "nosave" are excluded > * from the hibernation iamge. 
crash_is_nosave() does thich check for crash > diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c > index 7205a9085b4d..c38976b70069 100644 > --- a/arch/arm64/mm/init.c > +++ b/arch/arm64/mm/init.c > @@ -118,35 +118,10 @@ static void __init reserve_crashkernel(void) > crashk_res.start = crash_base; > crashk_res.end = crash_base + crash_size - 1; > } > - > -static void __init kexec_reserve_crashkres_pages(void) > -{ > -#ifdef CONFIG_HIBERNATION > - phys_addr_t addr; > - struct page *page; > - > - if (!crashk_res.end) > - return; > - > - /* > -* To reduce the size of hibernation image, all the pages are > -* marked as Reserved initially. > -*/ > - for (addr = crashk_res.start; addr < (crashk_res.end + 1); > - addr += PAGE_SIZE) { > - page = phys_to_page(addr); > - SetPageReserved(page); > - } > -#endif > -} > #else > static void __init reserve_crashkernel(void) > { > } > - > -static void __init kexec_reserve_crashkres_pages(void) > -{ > -} > #endif /* CONFIG_KEXEC_CORE */ > > #ifdef CONFIG_CRASH_DUMP > @@ -586,8 +561,6 @@ void __init mem_init(void) > /* this will put all unused low memory onto the freelists */ > memblock_free_all(); > > - kexec_reserve_crashkres_pages(); > - > mem_init_print_info(NULL); > > /* > -- > 2.17.2 LGTM, so: Reviewed-by: Bhupesh Sharma
Re: [PATCH v2 6/9] arm64: kexec: no need to ClearPageReserved()
Hi David, Thanks for the patch. On Mon, Jan 14, 2019 at 6:29 PM David Hildenbrand wrote: > > This will be done by free_reserved_page(). > > Cc: Catalin Marinas > Cc: Will Deacon > Cc: Bhupesh Sharma > Cc: James Morse > Cc: Marc Zyngier > Cc: Dave Kleikamp > Cc: Mark Rutland > Cc: Andrew Morton > Cc: Michal Hocko > Cc: Matthew Wilcox > Acked-by: James Morse > Signed-off-by: David Hildenbrand > --- > arch/arm64/kernel/machine_kexec.c | 1 - > 1 file changed, 1 deletion(-) > > diff --git a/arch/arm64/kernel/machine_kexec.c > b/arch/arm64/kernel/machine_kexec.c > index aa9c94113700..6f0587b5e941 100644 > --- a/arch/arm64/kernel/machine_kexec.c > +++ b/arch/arm64/kernel/machine_kexec.c > @@ -361,7 +361,6 @@ void crash_free_reserved_phys_range(unsigned long begin, > unsigned long end) > > for (addr = begin; addr < end; addr += PAGE_SIZE) { > page = phys_to_page(addr); > - ClearPageReserved(page); > free_reserved_page(page); > } > } > -- > 2.17.2 > Reviewed-by: Bhupesh Sharma
Re: [kernel-hardening] [PATCH] powerpc: Increase ELF_ET_DYN_BASE to 1TB for 64-bit applications
On Wed, Jun 7, 2017 at 2:59 PM, Michael Ellerman wrote: > Daniel Micay writes: > >> Rather than doing this, the base should just be split for an ELF >> interpreter like PaX. > > I don't quite parse that, I think you mean PaX uses a different base for > an ELF interpreter vs a regular ET_DYN? I am also not very conversant with PaX. AFAIU, we can use the following methods to print the shared object dependencies instead of ldd: 1. One can load the binary directly with LD_TRACE_LOADED_OBJECTS=1. So, instead of: # /lib64/ld-2.24.so ./large-bss-test-app Segmentation fault (core dumped) One can do: # LD_TRACE_LOADED_OBJECTS=1 ./large-bss-test-app linux-vdso64.so.1 (0x7fffa67a) libc.so.6 => /lib64/libc.so.6 (0x7fffa659) /lib64/ld64.so.2 (0x7fffa67c) 2. There are other utils like pax-utils etc that we can use. But, we generally cannot force a user to not use ldd to determine the shared object dependencies, especially when all the documentation points to it and it works well on the other archs like x86 and arm64. > That would be cool. How do you know that it's an ELF interpreter you're > loading? Is it just something that's PIE but doesn't request an > interpreter? > > Is the PaX code somewhere I can look at? > >> It makes sense for a standalone executable to be as low in the address >> space as possible. > > More or less. There are performance reasons why 1T could be good for us, > but I want to see some performance numbers to justify that change. And > it does mean you have a bit less address space to play with. Do you have any specific performance test(s) in mind which I can run to see how the 1TB impacts them? I am trying to run ltp after this change and will be able to share the results shortly, but I am not sure it provides the right data to validate such a change. Regards, Bhupesh
[PATCH] powerpc: Increase ELF_ET_DYN_BASE to 1TB for 64-bit applications
Since 7e60d1b427c51cf2525e5d736a71780978cfb828, the ELF_ET_DYN_BASE for powerpc applications has been set to 512MB. Recently there have been several reports of applications SEGV'ing and newer versions of glibc also SEGV'ing (while testing) when using the following test method: LD_LIBRARY_PATH=/XXX/lib /XXX/lib/ld-2.24.so For reproducing the above, consider the following test application which uses a larger bss: 1. # cat large-bss-test-app.c #include <stdio.h> #include <stdlib.h> #define VECSIZE (1024 * 1024 * 100) float p[VECSIZE], v1[VECSIZE], v2[VECSIZE]; void vec_mult(long int N) { long int i; for (i = 0; i < N; i++) p[i] = v1[i] * v2[i]; } int main() { char command[1024]; sprintf(command,"cat /proc/%d/maps",getpid()); system(command); vec_mult(VECSIZE/100); printf ("Done\n"); return 0; } 2. Compile it using gcc (I am using gcc-6.3.1): # gcc -g -o large-bss-test-app large-bss-test-app.c 3. Running the compiled application with ld.so directly is enough to trigger the SEGV on ppc64le/ppc64: # /lib64/ld-2.24.so ./large-bss-test-app Segmentation fault (core dumped) 4. Notice it's random depending on the layout changes, so it passes on some occasions as well: # /lib64/ld-2.24.so ./large-bss-test-app 1000-1001 r-xp fd:00 2883597 /root/large-bss-test-app 1001-1002 r--p fd:00 2883597 /root/large-bss-test-app 1002-1003 rw-p 0001 fd:00 2883597 /root/large-bss-test-app 1003-5b03 rw-p 00:00 0 5e95-5e99 r-xp fd:00 1180301 /usr/lib64/ld-2.24.so 5e99-5e9a r--p 0003 fd:00 1180301 /usr/lib64/ld-2.24.so 5e9a-5e9b rw-p 0004 fd:00 1180301 /usr/lib64/ld-2.24.so 3fffa368-3fffa386 r-xp fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa386-3fffa387 r--p 001d fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa387-3fffa388 rw-p 001e fd:00 1180308 /usr/lib64/libc-2.24.so 3fffa389-3fffa38b r-xp 00:00 0 [vdso] 3fffc674-3fffc677 rw-p 00:00 0 [stack] Done One way to fix this is to move ELF_ET_DYN_BASE from 0x2000 (512MB) to 0x100 (1TB), at least for 64-bit applications. 
This allows hopefully enough space for most of the applications without causing them to trample upon the ld.so, leading to a SEGV. ELF_ET_DYN_BASE is still kept as 0x2000 (512MB) for 32-bit applications to preserve their compatibility. After this change, the layout for the 'large-bss-test-app' changes as shown below: # /lib64/ld-2.24.so ./large-bss-test-app 1000-1001 r-xp fd:00 2107527 /root/large-bss-test-app 1001-1002 r--p fd:00 2107527 /root/large-bss-test-app 1002-1003 rw-p 0001 fd:00 2107527 /root/large-bss-test-app 1003-5b03 rw-p 00:00 0 100283b-100283f r-xp fd:00 1835645 /usr/lib64/ld-2.24.so 100283f-1002840 r--p 0003 fd:00 1835645 /usr/lib64/ld-2.24.so 1002840-1002841 rw-p 0004 fd:00 1835645 /usr/lib64/ld-2.24.so 7fff8a47-7fff8a65 r-xp fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a65-7fff8a66 r--p 001d fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a66-7fff8a67 rw-p 001e fd:00 1835652 /usr/lib64/libc-2.24.so 7fff8a68-7fff8a6a r-xp 00:00 0 [vdso] 7fffc6d9-7fffc6dc rw-p 00:00 0 [stack] Done Cc: Anton Blanchard Cc: Daniel Cashman Cc: Kees Cook Cc: Michael Ellerman Cc: Benjamin Herrenschmidt Signed-off-by: Bhupesh Sharma --- arch/powerpc/include/asm/elf.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h index 09bde6e..683230c 100644 --- a/arch/powerpc/include/asm/elf.h +++ b/arch/powerpc/include/asm/elf.h @@ -28,7 +28,9 @@ the loader. We need to make sure that it is out of the way of the program that it will "exec", and that there is sufficient room for the brk. */ -#define ELF_ET_DYN_BASE0x2000 +/* Keep thi
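The collision being fixed can be sanity-checked with back-of-the-envelope arithmetic: the test program's BSS (three float arrays of 100M elements each) is roughly 1.2 GB, so an interpreter mapped at a 512 MB ELF_ET_DYN_BASE can land inside it, while a 1 TB base cannot. The executable load address used below is an assumed value for illustration, not one taken from the maps output.

```python
# Overlap check between the test program's BSS and the ELF interpreter base.
VECSIZE = 1024 * 1024 * 100   # from large-bss-test-app.c
BSS_BYTES = 3 * VECSIZE * 4   # p, v1 and v2 are arrays of 4-byte floats

def interp_collides(exe_base, bss_bytes, elf_et_dyn_base):
    """True if the interpreter base falls inside [exe_base, exe_base + bss)."""
    return exe_base <= elf_et_dyn_base < exe_base + bss_bytes

exe_base = 16 * 1024 * 1024   # assumed low load address of the executable

print(interp_collides(exe_base, BSS_BYTES, 512 * 1024 * 1024))   # 512 MB base -> True
print(interp_collides(exe_base, BSS_BYTES, 1 << 40))             # 1 TB base   -> False
```

This matches the observed behaviour: with the old 512 MB base the layout only SEGVs when the mappings happen to overlap, and moving the base to 1 TB removes the overlap entirely for this class of program.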
Re: [PATCH v2] powerpc/mm: Add support for runtime configuration of ASLR limits
Hi Michael, Thanks for the v2. It looks good. On Thu, Apr 20, 2017 at 8:06 PM, Michael Ellerman wrote: > Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two > sysctls that allow a user to configure the number of bits of randomness used > for > ASLR. > > Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to > construct at least the MIN value in Kconfig, vs in a header which would be > more > natural. Given that we just go ahead and do it all in Kconfig. > > At least according to the code (the documentation makes no mention of it), the > value is defined as the number of bits of randomisation *of the page*, not the > address. This makes some sense, with larger page sizes more of the low bits > are > forced to zero, which would reduce the randomisation if we didn't take the > PAGE_SIZE into account. However it does mean the min/max values have to change > depending on the PAGE_SIZE in order to actually limit the amount of address > space consumed by the randomisation. > > The result of that is that we have to define the default values based on both > 32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we > have 128TB address space support on Book3S, we also have to take that into > account. > > Finally we can wire up the value in arch_mmap_rnd(). > > Signed-off-by: Michael Ellerman > Signed-off-by: Bhupesh Sharma > --- > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 11 ++- > 2 files changed, 50 insertions(+), 5 deletions(-) > > v2: Fix the 32-bit MAX value incorrectly using MIN as spotted by Kees. > > Kees/Bhupesh, would love a Review/Ack/Tested-by from you, I'll plan to merge > this later today (Friday) my time. 
> > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8a095c..6f0503951e94 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MAX > + # On Book3S 64, the default virtual address space for 64-bit processes > + # is 2^47 (128TB). As a maximum, allow randomisation to consume up to > + # 32T of address space (2^45), which should ensure a reasonable gap > + # between bottom-up and top-down allocations for applications that > + # consume "normal" amounts of address space. Book3S 64 only supports > 64K > + # and 4K page sizes. > + default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 > (64K) > + default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K) > + # > + # On all other 64-bit platforms (currently only Book3E), the virtual > + # address space is 2^46 (64TB). Allow randomisation to consume up to > 16T > + # of address space (2^44). Only 4K page sizes are supported. > + default 32 if 64BIT # 32 = 44 (16T) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MAX > + > +config ARCH_MMAP_RND_BITS_MIN > + # Allow randomisation to consume up to 1GB of address space (2^30). > + default 14 if 64BIT && PPC_64K_PAGES# 14 = 30 (1GB) - 16 (64K) > + default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 512MB of address space (2^29). 
> + default 11 if PPC_256K_PAGES# 11 = 29 (512MB) - 18 (256K) > + default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K) > + default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K) > + default 17 # 17 = 29 (512MB) - 12 (4K) > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 8MB of address space (2^23). > + default 5 if PPC_256K_PAGES # 5 = 23 (8MB) - 18 (256K) > + default 7 if PPC_64K_PAGES # 7 = 23 (8MB) - 16 (64K) > + default 9 if PPC_16K_PAGES # 9 = 23 (8MB) - 14 (16K) > + default 11 # 11 = 23 (8MB) - 12 (4K) > + > config HAVE_SETUP_PER_CPU_AREA > def_bool PPC64 > > @@ -120,6 +162,8 @@ config PPC > select HAVE_ARCH_HARDENED_USERCOPY > select HAVE_ARCH_JUMP_LABEL > select HAVE_ARCH_KGDB >
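The arithmetic in the quoted Kconfig comments can be sanity-checked with a small userspace sketch (not kernel code; the `rnd_bits` helper is hypothetical): each `default N` is just the log2 of the randomisation budget in bytes minus the page shift.

```python
# Hypothetical check of the Kconfig comment arithmetic quoted above:
# each "default N" equals budget_bits - page_shift, where the randomisation
# budget is 2**budget_bits bytes and a page is 2**page_shift bytes.

def rnd_bits(budget_bits, page_shift):
    return budget_bits - page_shift

# Book3S 64: randomisation budget of 32T (2^45)
assert rnd_bits(45, 16) == 29          # 64K pages
assert rnd_bits(45, 12) == 33          # 4K pages
# Book3E 64: budget of 16T (2^44), 4K pages only
assert rnd_bits(44, 12) == 32
# 32-bit compat MAX: budget of 512MB (2^29)
assert [rnd_bits(29, s) for s in (18, 16, 14, 12)] == [11, 13, 15, 17]
# 32-bit compat MIN: budget of 8MB (2^23)
assert [rnd_bits(23, s) for s in (18, 16, 14, 12)] == [5, 7, 9, 11]
print("Kconfig defaults consistent")
```

This also shows why the defaults must vary with PAGE_SIZE: the budget is fixed in bytes, so larger pages leave fewer bits to randomise.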
Re: [PATCH] powerpc/mm: Add support for runtime configuration of ASLR limits
Hi Michael, On Wed, Apr 19, 2017 at 7:59 PM, Michael Ellerman wrote: > Add powerpc support for mmap_rnd_bits and mmap_rnd_compat_bits, which are two > sysctls that allow a user to configure the number of bits of randomness used > for > ASLR. > > Because of the way the Kconfig for ARCH_MMAP_RND_BITS is defined, we have to > construct at least the MIN value in Kconfig, vs in a header which would be > more > natural. Given that we just go ahead and do it all in Kconfig. > > At least according to the code (the documentation makes no mention of it), the > value is defined as the number of bits of randomisation *of the page*, not the > address. This makes some sense, with larger page sizes more of the low bits > are > forced to zero, which would reduce the randomisation if we didn't take the > PAGE_SIZE into account. However it does mean the min/max values have to change > depending on the PAGE_SIZE in order to actually limit the amount of address > space consumed by the randomisation. > > The result of that is that we have to define the default values based on both > 32-bit vs 64-bit, but also the configured PAGE_SIZE. Furthermore now that we > have 128TB address space support on Book3S, we also have to take that into > account. Thanks for the patch. I have a couple of comments: (A) As Aneesh noted in the review of my v2 patch (see [1]), we need to handle the configurable 512TB case as well. Right? > Finally we can wire up the value in arch_mmap_rnd(). > > Signed-off-by: Michael Ellerman (B) I am wondering if I missed your comments on my v2 on the same subject - maybe you missed my reminder message (see [2]). 
I am just starting off on PPC related enhancements that we find useful while working on Redhat/Fedora PPC systems (I have mainly been associated with ARM and peripheral driver development in the past), and would have been motivated further if I could get responses to my queries which I had raised earlier on the list (see [3]) - especially the branch to base newer version of patches on. Also I am not sure how the PPC subsystem handles S-O-Bs of earlier contributions on the same subject (as it varies from one maintainer/subsystem to the other), so I leave it up to you. That being said, I will try to improve any new patches I plan to send out on PPC mailing list in future. [1] https://lkml.org/lkml/2017/4/13/57 [2] https://lkml.org/lkml/2017/4/10/796 [3] https://lkml.org/lkml/2017/4/17/3 Thanks, Bhupesh > --- > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 11 ++- > 2 files changed, 50 insertions(+), 5 deletions(-) > > > This is based on my next branch which has the 128TB changes: > > https://git.kernel.org/pub/scm/linux/kernel/git/powerpc/linux.git/log/?h=next > > I would definitely appreciate someone checking my math, and any test results. > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8a095c..608ee0b7b79f 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +config ARCH_MMAP_RND_BITS_MAX > + # On Book3S 64, the default virtual address space for 64-bit processes > + # is 2^47 (128TB). As a maximum, allow randomisation to consume up to > + # 32T of address space (2^45), which should ensure a reasonable gap > + # between bottom-up and top-down allocations for applications that > + # consume "normal" amounts of address space. Book3S 64 only supports > 64K > + # and 4K page sizes. 
> + default 29 if PPC_BOOK3S_64 && PPC_64K_PAGES # 29 = 45 (32T) - 16 > (64K) > + default 33 if PPC_BOOK3S_64 # 33 = 45 (32T) - 12 (4K) > + # > + # On all other 64-bit platforms (currently only Book3E), the virtual > + # address space is 2^46 (64TB). Allow randomisation to consume up to > 16T > + # of address space (2^44). Only 4K page sizes are supported. > + default 32 if 64BIT # 32 = 44 (16T) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_BITS_MIN > + # Allow randomisation to consume up to 1GB of address space (2^30). > + default 14 if 64BIT && PPC_64K_PAGES# 14 = 30 (1GB) - 16 (64K) > + default 18 if 64BIT # 18 = 30 (1GB) - 12 (4K) > + # > + # For 32-bit, use the compat values, as they're the same. > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Total virtual address space for 32-bit processes is 2^31 (2GB). > + # Allow randomisation to consume up to 512MB of address space (2^29). > + default 11 if PPC_256K_PAGES# 11 = 29 (512MB) - 18 (256K) > + default 13 if PPC_64K_PAGES # 13 = 29 (512MB) - 16 (64K) > + default 15 if PPC_16K_PAGES # 15 = 29 (512MB) - 14 (16K) > + default
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
On Thu, Apr 13, 2017 at 12:39 PM, Balbir Singh wrote: >>> >>> Yes. It was derived from TASK_SIZE : >>> >>> http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 >>> >> >> That is getting update to 128TB by default and conditionally to 512TB >> > > Since this is compile time, we should probably keep the scope to 128TB > for now and see if we want to change things at run time later, since > the expansion is based on a hint. Suggestions? > I think this makes sense. If the conditional expansion to 512TB is protected by a kconfig symbol, we can use the same to have separate ranges for 128TB and 512TB, using the kconfig symbol as the differentiating factor. Also, please let me know which branch/tree to use once the change making the default 128TB is done, so that I can spin the v3 accordingly. My v2 was based on git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git (master branch), where TASK_SIZE is set to 46 bits inside 'arch/powerpc/include/asm/processor.h'. Thanks, Bhupesh
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
On Thu, Apr 13, 2017 at 12:28 PM, Aneesh Kumar K.V wrote: > > > On Thursday 13 April 2017 12:22 PM, Bhupesh Sharma wrote: >> >> Hi Aneesh, >> >> On Thu, Apr 13, 2017 at 12:06 PM, Aneesh Kumar K.V >> wrote: >>> >>> Bhupesh Sharma writes: >>> >>>> powerpc arch_mmap_rnd() currently uses hard-coded values - >>>> (23-PAGE_SHIFT) for >>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>> for the mmap base address for a ASLR ELF. >>>> >>>> This patch makes sure that powerpc mmap arch_mmap_rnd() implementation >>>> is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits >>>> and helpers to generate the mmap address randomization. >>>> >>>> The maximum and minimum randomization range values represent >>>> a compromise between increased ASLR effectiveness and avoiding >>>> address-space fragmentation. >>>> >>>> Using the Kconfig option and suitable /proc tunable, platform >>>> developers may choose where to place this compromise. >>>> >>>> Also this patch keeps the default values as new minimums. >>>> >>>> Signed-off-by: Bhupesh Sharma >>>> Reviewed-by: Kees Cook >>>> --- >>>> * Changes since v2: >>>> v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) >>>> - Changed a few minimum and maximum randomization ranges as per >>>> Michael's suggestion. >>>> - Corrected Kees's email address in the Reviewed-by line. >>>> - Added further comments in kconfig to explain how the address >>>> ranges were worked out. >>>> >>>> * Changes since v1: >>>> v1 can be seen here >>>> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >>>> - No functional change in this patch. >>>> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
>>>> >>>> arch/powerpc/Kconfig | 44 >>>> >>>> arch/powerpc/mm/mmap.c | 7 --- >>>> 2 files changed, 48 insertions(+), 3 deletions(-) >>>> >>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>> index 97a8bc8..84aae67 100644 >>>> --- a/arch/powerpc/Kconfig >>>> +++ b/arch/powerpc/Kconfig >>>> @@ -22,6 +22,48 @@ config MMU >>>> bool >>>> default y >>>> >>>> +# min bits determined by the following formula: >>>> +# VA_BITS - PAGE_SHIFT - CONSTANT >>>> +# where, >>>> +#VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT >>> >>> >>> >>> Where did we derive that 46 bits from ? is that based on TASK_SIZE ? >> >> >> Yes. It was derived from TASK_SIZE : >> >> http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 >> > > That is getting update to 128TB by default and conditionally to 512TB Can't find the relevant patch in linus's master branch. Please share the appropriate patch/discussion link. Regards, Bhupesh
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Aneesh, On Thu, Apr 13, 2017 at 12:06 PM, Aneesh Kumar K.V wrote: > Bhupesh Sharma writes: > >> powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) >> for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address for a ASLR ELF. >> >> This patch makes sure that powerpc mmap arch_mmap_rnd() implementation >> is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits >> and helpers to generate the mmap address randomization. >> >> The maximum and minimum randomization range values represent >> a compromise between increased ASLR effectiveness and avoiding >> address-space fragmentation. >> >> Using the Kconfig option and suitable /proc tunable, platform >> developers may choose where to place this compromise. >> >> Also this patch keeps the default values as new minimums. >> >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook >> --- >> * Changes since v2: >> v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) >> - Changed a few minimum and maximum randomization ranges as per >> Michael's suggestion. >> - Corrected Kees's email address in the Reviewed-by line. >> - Added further comments in kconfig to explain how the address ranges >> were worked out. >> >> * Changes since v1: >> v1 can be seen here >> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >> - No functional change in this patch. >> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
>> >> arch/powerpc/Kconfig | 44 >> arch/powerpc/mm/mmap.c | 7 --- >> 2 files changed, 48 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 97a8bc8..84aae67 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,48 @@ config MMU >> bool >> default y >> >> +# min bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - CONSTANT >> +# where, >> +#VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT > > > Where did we derive that 46 bits from ? is that based on TASK_SIZE ? Yes. It was derived from TASK_SIZE : http://lxr.free-electrons.com/source/arch/powerpc/include/asm/processor.h#L105 Regards, Bhupesh > >> +#CONSTANT = 16 for 64BIT and 8 for 32BIT >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 >> + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 >> + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 >> + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 >> + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 >> + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 >> + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 >> + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 >> + >> +# max bits determined by the following formula: > > > -aneesh >
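The v3 formulas quoted in this exchange (`min = VA_BITS - PAGE_SHIFT - CONSTANT`, `max = VA_BITS - PAGE_SHIFT - 2`, with VA_BITS derived from TASK_SIZE) can be recomputed in a short sketch. The helper names are hypothetical; the values come straight from the patch's Kconfig comments.

```python
# Recomputing the v3 min/max formulas from the thread:
#   min = VA_BITS - PAGE_SHIFT - CONSTANT  (CONSTANT = 16 for 64BIT, 8 for 32BIT)
#   max = VA_BITS - PAGE_SHIFT - 2
# VA_BITS = 46 for 64-bit (TASK_SIZE = 2^46), 31 for 32-bit (4GB - 1 page).

def min_bits(page_shift, is64):
    va, const = (46, 16) if is64 else (31, 8)
    return va - page_shift - const

def max_bits(page_shift, is64):
    va = 46 if is64 else 31
    return va - page_shift - 2

assert (min_bits(16, True), max_bits(16, True)) == (14, 28)    # 64K, 64BIT
assert (min_bits(12, True), max_bits(12, True)) == (18, 32)    # 4K, 64BIT
assert (min_bits(18, False), max_bits(18, False)) == (5, 11)   # 256K, 32BIT
assert (min_bits(12, False), max_bits(12, False)) == (11, 17)  # 4K, 32BIT
```

Aneesh's point is that once TASK_SIZE moves to 128TB (2^47) by default, the VA_BITS = 46 input to these formulas no longer holds, which is why the branch question matters.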
Re: [PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Wed, Mar 29, 2017 at 1:15 AM, Bhupesh Sharma wrote: > powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) for > 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset > for the mmap base address for a ASLR ELF. > > This patch makes sure that powerpc mmap arch_mmap_rnd() implementation > is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits > and helpers to generate the mmap address randomization. > > The maximum and minimum randomization range values represent > a compromise between increased ASLR effectiveness and avoiding > address-space fragmentation. > > Using the Kconfig option and suitable /proc tunable, platform > developers may choose where to place this compromise. > > Also this patch keeps the default values as new minimums. > > Signed-off-by: Bhupesh Sharma > Reviewed-by: Kees Cook > --- > * Changes since v2: > v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) > - Changed a few minimum and maximum randomization ranges as per Michael's > suggestion. > - Corrected Kees's email address in the Reviewed-by line. > - Added further comments in kconfig to explain how the address ranges > were worked out. > > * Changes since v1: > v1 can be seen here > (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) > - No functional change in this patch. > - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
> > arch/powerpc/Kconfig | 44 > arch/powerpc/mm/mmap.c | 7 --- > 2 files changed, 48 insertions(+), 3 deletions(-) > > diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig > index 97a8bc8..84aae67 100644 > --- a/arch/powerpc/Kconfig > +++ b/arch/powerpc/Kconfig > @@ -22,6 +22,48 @@ config MMU > bool > default y > > +# min bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - CONSTANT > +# where, > +# VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT > +# CONSTANT = 16 for 64BIT and 8 for 32BIT > +config ARCH_MMAP_RND_BITS_MIN > + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 > + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 > + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 > + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 > + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 > + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 > + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 > + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 > + > +# max bits determined by the following formula: > +# VA_BITS - PAGE_SHIFT - CONSTANT > +# where, > +# VA_BITS = 46 bits for 64BIT, and 4GB - 1 Page = 31 bits for 32BIT > +# CONSTANT = 2, both for 64BIT and 32BIT > +config ARCH_MMAP_RND_BITS_MAX > + default 11 if PPC_256K_PAGES && 32BIT # 31 - 18 - 2 = 11 > + default 13 if PPC_64K_PAGES && 32BIT # 31 - 16 - 2 = 13 > + default 15 if PPC_16K_PAGES && 32BIT # 31 - 14 - 2 = 15 > + default 17 if PPC_4K_PAGES && 32BIT # 31 - 12 - 2 = 17 > + default 26 if PPC_256K_PAGES && 64BIT # 46 - 18 - 2 = 26 > + default 28 if PPC_64K_PAGES && 64BIT # 46 - 16 - 2 = 28 > + default 30 if PPC_16K_PAGES && 64BIT # 46 - 14 - 2 = 30 > + default 32 if PPC_4K_PAGES && 64BIT # 46 - 12 - 2 = 32 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + default 5 if PPC_256K_PAGES > + default 7 if PPC_64K_PAGES > + default 9 if PPC_16K_PAGES > + default 11 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + default 11 if 
PPC_256K_PAGES > + default 13 if PPC_64K_PAGES > + default 15 if PPC_16K_PAGES > + default 17 > + > config HAVE_SETUP_PER_CPU_AREA > def_bool PPC64 > > @@ -142,6 +184,8 @@ config PPC > select HAVE_IRQ_EXIT_ON_IRQ_STACK > select HAVE_KERNEL_GZIP > select HAVE_KPROBES > + select HAVE_ARCH_MMAP_RND_BITS > + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT > select HAVE_KRETPROBES > select HAVE_LIVEPATCH if > HAVE_DYNAMIC_FTRACE_WITH_REGS > select HAVE_MEMBLOCK > diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c > index a5d9ef5..92a9355 100644 > --- a/arch/powerpc/mm/mmap.c > +++ b/arch/powerpc/mm/mmap.c > @@ -61,11 +61,12 @@ unsigned long arch_mmap_rnd(void) > { > unsigned long rnd; > > - /* 8MB for 32bit
[PATCH v3] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc arch_mmap_rnd() currently uses hard-coded values - (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address for a ASLR ELF. This patch makes sure that powerpc mmap arch_mmap_rnd() implementation is similar to other ARCHs (like x86, arm64) and uses mmap_rnd_bits and helpers to generate the mmap address randomization. The maximum and minimum randomization range values represent a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Using the Kconfig option and suitable /proc tunable, platform developers may choose where to place this compromise. Also this patch keeps the default values as new minimums. Signed-off-by: Bhupesh Sharma Reviewed-by: Kees Cook --- * Changes since v2: v2 can be seen here (https://patchwork.kernel.org/patch/9551509/) - Changed a few minimum and maximum randomization ranges as per Michael's suggestion. - Corrected Kees's email address in the Reviewed-by line. - Added further comments in kconfig to explain how the address ranges were worked out. * Changes since v1: v1 can be seen here (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) - No functional change in this patch. - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
arch/powerpc/Kconfig | 44 arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 48 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 97a8bc8..84aae67 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,48 @@ config MMU bool default y +# min bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - CONSTANT +# where, +# VA_BITS = 46 bits for 64BIT and 4GB - 1 Page = 31 bits for 32BIT +# CONSTANT = 16 for 64BIT and 8 for 32BIT +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT # 31 - 18 - 8 = 5 + default 7 if PPC_64K_PAGES && 32BIT # 31 - 16 - 8 = 7 + default 9 if PPC_16K_PAGES && 32BIT # 31 - 14 - 8 = 9 + default 11 if PPC_4K_PAGES && 32BIT # 31 - 12 - 8 = 11 + default 12 if PPC_256K_PAGES && 64BIT # 46 - 18 - 16 = 12 + default 14 if PPC_64K_PAGES && 64BIT # 46 - 16 - 16 = 14 + default 16 if PPC_16K_PAGES && 64BIT # 46 - 14 - 16 = 16 + default 18 if PPC_4K_PAGES && 64BIT # 46 - 12 - 16 = 18 + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - CONSTANT +# where, +# VA_BITS = 46 bits for 64BIT, and 4GB - 1 Page = 31 bits for 32BIT +# CONSTANT = 2, both for 64BIT and 32BIT +config ARCH_MMAP_RND_BITS_MAX + default 11 if PPC_256K_PAGES && 32BIT # 31 - 18 - 2 = 11 + default 13 if PPC_64K_PAGES && 32BIT # 31 - 16 - 2 = 13 + default 15 if PPC_16K_PAGES && 32BIT # 31 - 14 - 2 = 15 + default 17 if PPC_4K_PAGES && 32BIT # 31 - 12 - 2 = 17 + default 26 if PPC_256K_PAGES && 64BIT # 46 - 18 - 2 = 26 + default 28 if PPC_64K_PAGES && 64BIT # 46 - 16 - 2 = 28 + default 30 if PPC_16K_PAGES && 64BIT # 46 - 14 - 2 = 30 + default 32 if PPC_4K_PAGES && 64BIT # 46 - 12 - 2 = 32 + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 11 if PPC_256K_PAGES + default 13 if PPC_64K_PAGES + default 15 if PPC_16K_PAGES + default 17 + config 
HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -142,6 +184,8 @@ config PPC select HAVE_IRQ_EXIT_ON_IRQ_STACK select HAVE_KERNEL_GZIP select HAVE_KPROBES + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_LIVEPATCH if HAVE_DYNAMIC_FTRACE_WITH_REGS select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index a5d9ef5..92a9355 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -61,11 +61,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
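To make the before/after behaviour of the `arch_mmap_rnd()` change above concrete, here is a userspace sketch (not kernel code; it assumes 64K pages, i.e. PAGE_SHIFT = 16, and uses Python's `getrandbits` as a stand-in for `get_random_long()`):

```python
import random

PAGE_SHIFT = 16  # assuming 64K pages

def old_arch_mmap_rnd(is_32bit_task):
    # Old behaviour: hard-coded 8MB (2^23) window for 32-bit tasks,
    # 1GB (2^30) for 64-bit tasks.
    if is_32bit_task:
        rnd = random.getrandbits(64) % (1 << (23 - PAGE_SHIFT))
    else:
        rnd = random.getrandbits(64) % (1 << (30 - PAGE_SHIFT))
    return rnd << PAGE_SHIFT

def new_arch_mmap_rnd(mmap_rnd_bits):
    # New behaviour: mask down to the configured number of page-granular
    # bits (from the mmap_rnd_bits / mmap_rnd_compat_bits sysctls).
    rnd = random.getrandbits(64) & ((1 << mmap_rnd_bits) - 1)
    return rnd << PAGE_SHIFT

# With mmap_rnd_bits = 14 (the 64K-page 64-bit minimum) the new code keeps
# the old 1GB window, and all offsets stay page aligned.
for _ in range(1000):
    off = new_arch_mmap_rnd(14)
    assert off < (1 << 30) and off % (1 << PAGE_SHIFT) == 0
    assert old_arch_mmap_rnd(False) < (1 << 30)
    assert old_arch_mmap_rnd(True) < (1 << 23)
```

The key difference is that the mask width is now a runtime variable rather than a compile-time constant, which is what lets the sysctls widen the window up to the Kconfig maximum.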
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Thu, Feb 16, 2017 at 10:19 AM, Bhupesh Sharma wrote: > Hi Michael, > > On Fri, Feb 10, 2017 at 4:41 PM, Bhupesh Sharma wrote: >> On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman >> wrote: >>> Bhupesh Sharma writes: >>> >>>> HI Michael, >>>> >>>> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman >>>> wrote: >>>>> Bhupesh Sharma writes: >>>>> >>>>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>>>> for the mmap base address. >>>>>> >>>>>> This value represents a compromise between increased >>>>>> ASLR effectiveness and avoiding address-space fragmentation. >>>>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>>>> platform developers may choose where to place this compromise. >>>>>> Keep default values as new minimums. >>>>>> >>>>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>>>> is similar to other ARCHs like x86, arm64 and arm. >>>>> >>>>> Thanks for looking at this, it's been on my TODO for a while. >>>>> >>>>> I have a half completed version locally, but never got around to testing >>>>> it thoroughly. 
>>>> >>>> Sure :) >>>> >>>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>>>> index a8ee573fe610..b4a843f68705 100644 >>>>>> --- a/arch/powerpc/Kconfig >>>>>> +++ b/arch/powerpc/Kconfig >>>>>> @@ -22,6 +22,38 @@ config MMU >>>>>> bool >>>>>> default y >>>>>> >>>>>> +config ARCH_MMAP_RND_BITS_MIN >>>>>> + default 5 if PPC_256K_PAGES && 32BIT >>>>>> + default 12 if PPC_256K_PAGES && 64BIT >>>>>> + default 7 if PPC_64K_PAGES && 32BIT >>>>>> + default 14 if PPC_64K_PAGES && 64BIT >>>>>> + default 9 if PPC_16K_PAGES && 32BIT >>>>>> + default 16 if PPC_16K_PAGES && 64BIT >>>>>> + default 11 if PPC_4K_PAGES && 32BIT >>>>>> + default 18 if PPC_4K_PAGES && 64BIT >>>>>> + >>>>>> +# max bits determined by the following formula: >>>>>> +# VA_BITS - PAGE_SHIFT - 4 >>>>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>>>> +config ARCH_MMAP_RND_BITS_MAX >>>>>> + default 10 if PPC_256K_PAGES && 32BIT >>>>>> + default 26 if PPC_256K_PAGES && 64BIT >>>>>> + default 12 if PPC_64K_PAGES && 32BIT >>>>>> + default 28 if PPC_64K_PAGES && 64BIT >>>>>> + default 14 if PPC_16K_PAGES && 32BIT >>>>>> + default 30 if PPC_16K_PAGES && 64BIT >>>>>> + default 16 if PPC_4K_PAGES && 32BIT >>>>>> + default 32 if PPC_4K_PAGES && 64BIT >>>>>> + >>>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>>>> + default 5 if PPC_256K_PAGES >>>>>> + default 7 if PPC_64K_PAGES >>>>>> + default 9 if PPC_16K_PAGES >>>>>> + default 11 >>>>>> + >>>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>>>> + default 16 >>>>>> + >>>>> >>>>> This is what I have below, which is a bit neater I think because each >>>>> value is only there once (by defaulting to the COMPAT value). >>>>> >>>>> My max values are different to yours, I don't really remember why I >>>>> chose those values, so we can argue about which is right. >>>> >>>> I am not sure how you derived these values, but I am not sure there >>>> should be differences between 64-BIT x86/ARM64 and PPC values for the >>>> MAX values. 
>>> >>> But your values *are* different to x86 and arm64. >>> >>> And why would they be the same anyway? x86 has a 47 bit address space, >>> 64-bit powerpc is 46 bits, and arm64 is configurable fro
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Fri, Feb 10, 2017 at 4:41 PM, Bhupesh Sharma wrote: > On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman wrote: >> Bhupesh Sharma writes: >> >>> HI Michael, >>> >>> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman >>> wrote: >>>> Bhupesh Sharma writes: >>>> >>>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>>> for the mmap base address. >>>>> >>>>> This value represents a compromise between increased >>>>> ASLR effectiveness and avoiding address-space fragmentation. >>>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>>> platform developers may choose where to place this compromise. >>>>> Keep default values as new minimums. >>>>> >>>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>>> is similar to other ARCHs like x86, arm64 and arm. >>>> >>>> Thanks for looking at this, it's been on my TODO for a while. >>>> >>>> I have a half completed version locally, but never got around to testing >>>> it thoroughly. 
>>> >>> Sure :) >>> >>>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>>> index a8ee573fe610..b4a843f68705 100644 >>>>> --- a/arch/powerpc/Kconfig >>>>> +++ b/arch/powerpc/Kconfig >>>>> @@ -22,6 +22,38 @@ config MMU >>>>> bool >>>>> default y >>>>> >>>>> +config ARCH_MMAP_RND_BITS_MIN >>>>> + default 5 if PPC_256K_PAGES && 32BIT >>>>> + default 12 if PPC_256K_PAGES && 64BIT >>>>> + default 7 if PPC_64K_PAGES && 32BIT >>>>> + default 14 if PPC_64K_PAGES && 64BIT >>>>> + default 9 if PPC_16K_PAGES && 32BIT >>>>> + default 16 if PPC_16K_PAGES && 64BIT >>>>> + default 11 if PPC_4K_PAGES && 32BIT >>>>> + default 18 if PPC_4K_PAGES && 64BIT >>>>> + >>>>> +# max bits determined by the following formula: >>>>> +# VA_BITS - PAGE_SHIFT - 4 >>>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>>> +config ARCH_MMAP_RND_BITS_MAX >>>>> + default 10 if PPC_256K_PAGES && 32BIT >>>>> + default 26 if PPC_256K_PAGES && 64BIT >>>>> + default 12 if PPC_64K_PAGES && 32BIT >>>>> + default 28 if PPC_64K_PAGES && 64BIT >>>>> + default 14 if PPC_16K_PAGES && 32BIT >>>>> + default 30 if PPC_16K_PAGES && 64BIT >>>>> + default 16 if PPC_4K_PAGES && 32BIT >>>>> + default 32 if PPC_4K_PAGES && 64BIT >>>>> + >>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>>> + default 5 if PPC_256K_PAGES >>>>> + default 7 if PPC_64K_PAGES >>>>> + default 9 if PPC_16K_PAGES >>>>> + default 11 >>>>> + >>>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>>> + default 16 >>>>> + >>>> >>>> This is what I have below, which is a bit neater I think because each >>>> value is only there once (by defaulting to the COMPAT value). >>>> >>>> My max values are different to yours, I don't really remember why I >>>> chose those values, so we can argue about which is right. >>> >>> I am not sure how you derived these values, but I am not sure there >>> should be differences between 64-BIT x86/ARM64 and PPC values for the >>> MAX values. >> >> But your values *are* different to x86 and arm64. 
>> >> And why would they be the same anyway? x86 has a 47 bit address space, >> 64-bit powerpc is 46 bits, and arm64 is configurable from 36 to 48 bits. >> >> So your calculations above using VA_BITS = 48 should be using 46 bits. >> >> But if you fixed that, your formula basically gives 1/16th of the >> address space as the maximum range. Why is that the right amount? >> >> x86 uses 1/8th, and arm64 uses a mixture of 1/8th and 1/32nd (though >> those might be bugs). >> >> My values were more libera
Re: [kernel-hardening] Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
On Fri, Feb 10, 2017 at 4:31 PM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> HI Michael, >> >> On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman wrote: >>> Bhupesh Sharma writes: >>> >>>> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >>>> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >>>> for the mmap base address. >>>> >>>> This value represents a compromise between increased >>>> ASLR effectiveness and avoiding address-space fragmentation. >>>> Replace it with a Kconfig option, which is sensibly bounded, so that >>>> platform developers may choose where to place this compromise. >>>> Keep default values as new minimums. >>>> >>>> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >>>> is similar to other ARCHs like x86, arm64 and arm. >>> >>> Thanks for looking at this, it's been on my TODO for a while. >>> >>> I have a half completed version locally, but never got around to testing >>> it thoroughly. >> >> Sure :) >> >>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >>>> index a8ee573fe610..b4a843f68705 100644 >>>> --- a/arch/powerpc/Kconfig >>>> +++ b/arch/powerpc/Kconfig >>>> @@ -22,6 +22,38 @@ config MMU >>>> bool >>>> default y >>>> >>>> +config ARCH_MMAP_RND_BITS_MIN >>>> + default 5 if PPC_256K_PAGES && 32BIT >>>> + default 12 if PPC_256K_PAGES && 64BIT >>>> + default 7 if PPC_64K_PAGES && 32BIT >>>> + default 14 if PPC_64K_PAGES && 64BIT >>>> + default 9 if PPC_16K_PAGES && 32BIT >>>> + default 16 if PPC_16K_PAGES && 64BIT >>>> + default 11 if PPC_4K_PAGES && 32BIT >>>> + default 18 if PPC_4K_PAGES && 64BIT >>>> + >>>> +# max bits determined by the following formula: >>>> +# VA_BITS - PAGE_SHIFT - 4 >>>> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >>>> +config ARCH_MMAP_RND_BITS_MAX >>>> + default 10 if PPC_256K_PAGES && 32BIT >>>> + default 26 if PPC_256K_PAGES && 64BIT >>>> + default 12 if PPC_64K_PAGES && 32BIT >>>> + default 28 if PPC_64K_PAGES && 64BIT >>>> 
+ default 14 if PPC_16K_PAGES && 32BIT >>>> + default 30 if PPC_16K_PAGES && 64BIT >>>> + default 16 if PPC_4K_PAGES && 32BIT >>>> + default 32 if PPC_4K_PAGES && 64BIT >>>> + >>>> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >>>> + default 5 if PPC_256K_PAGES >>>> + default 7 if PPC_64K_PAGES >>>> + default 9 if PPC_16K_PAGES >>>> + default 11 >>>> + >>>> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >>>> + default 16 >>>> + >>> >>> This is what I have below, which is a bit neater I think because each >>> value is only there once (by defaulting to the COMPAT value). >>> >>> My max values are different to yours, I don't really remember why I >>> chose those values, so we can argue about which is right. >> >> I am not sure how you derived these values, but I am not sure there >> should be differences between 64-BIT x86/ARM64 and PPC values for the >> MAX values. > > But your values *are* different to x86 and arm64. > > And why would they be the same anyway? x86 has a 47 bit address space, > 64-bit powerpc is 46 bits, and arm64 is configurable from 36 to 48 bits. > > So your calculations above using VA_BITS = 48 should be using 46 bits. > > But if you fixed that, your formula basically gives 1/16th of the > address space as the maximum range. Why is that the right amount? > > x86 uses 1/8th, and arm64 uses a mixture of 1/8th and 1/32nd (though > those might be bugs). > > My values were more liberal, giving up to half the address space for 32 > & 64-bit. Maybe that's too generous, but my rationale was it's up to the > sysadmin to tweak the values and they get to keep the pieces if it > breaks. I am not sure why would one want to use more than the practical limits of 1/8th used by x86 - this causes additional burden of address space fragmentation. So we need to balance between the randomness increase and the address
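The fraction-of-address-space comparison in this exchange can be made explicit with a small sketch (hypothetical helper; the x86-64 figures of ARCH_MMAP_RND_BITS_MAX = 32, 4K pages, and a 47-bit space are as discussed in the thread):

```python
# The maximum randomisation window is 2**(max_bits + page_shift) bytes out
# of a 2**va_bits-byte address space.
from fractions import Fraction

def max_fraction(max_bits, page_shift, va_bits):
    return Fraction(2 ** (max_bits + page_shift), 2 ** va_bits)

# The proposed formula max = VA_BITS - PAGE_SHIFT - 4 (with VA_BITS = 48 as
# in the Kconfig comment) gives 1/16th of the space regardless of page size:
assert max_fraction(28, 16, 48) == Fraction(1, 16)
assert max_fraction(32, 12, 48) == Fraction(1, 16)
# x86-64 (32 max bits, 4K pages, 47-bit space) gives the 1/8th Michael cites:
assert max_fraction(32, 12, 47) == Fraction(1, 8)
```

This is the crux of the disagreement: the constant subtracted in the formula directly sets what fraction of the address space the randomisation may consume.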
Re: [kernel-hardening] [PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Michael, On Tue, Feb 7, 2017 at 7:57 AM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook >> --- >> Changes since v1: >> v1 can be seen here >> (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) >> - No functional change in this patch. >> - Added R-B from Kees. >> - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. > > Thanks for v2. > > But I replied to your v1 with some comments, did you see them? > I have replied to your comments on the original thread. Please share your views and if possible share your test results on the PPC setups you might have at your end. Thanks, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
HI Michael, On Thu, Feb 2, 2017 at 3:53 PM, Michael Ellerman wrote: > Bhupesh Sharma writes: > >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. > > Thanks for looking at this, it's been on my TODO for a while. > > I have a half completed version locally, but never got around to testing > it thoroughly. Sure :) >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index a8ee573fe610..b4a843f68705 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,38 @@ config MMU >> bool >> default y >> >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT >> + default 12 if PPC_256K_PAGES && 64BIT >> + default 7 if PPC_64K_PAGES && 32BIT >> + default 14 if PPC_64K_PAGES && 64BIT >> + default 9 if PPC_16K_PAGES && 32BIT >> + default 16 if PPC_16K_PAGES && 64BIT >> + default 11 if PPC_4K_PAGES && 32BIT >> + default 18 if PPC_4K_PAGES && 64BIT >> + >> +# max bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - 4 >> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >> +config ARCH_MMAP_RND_BITS_MAX >> + default 10 if PPC_256K_PAGES && 32BIT >> + default 26 if PPC_256K_PAGES && 64BIT >> + default 12 if PPC_64K_PAGES && 32BIT >> + default 28 if PPC_64K_PAGES && 64BIT >> + default 14 if PPC_16K_PAGES && 32BIT >> + default 30 if PPC_16K_PAGES && 64BIT >> + default 16 if PPC_4K_PAGES && 32BIT >> + default 32 if PPC_4K_PAGES && 64BIT >> + >> +config 
ARCH_MMAP_RND_COMPAT_BITS_MIN >> + default 5 if PPC_256K_PAGES >> + default 7 if PPC_64K_PAGES >> + default 9 if PPC_16K_PAGES >> + default 11 >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >> + default 16 >> + > > This is what I have below, which is a bit neater I think because each > value is only there once (by defaulting to the COMPAT value). > > My max values are different to yours, I don't really remember why I > chose those values, so we can argue about which is right. I am not sure how you derived these values, but I am not sure there should be differences between 64-BIT x86/ARM64 and PPC values for the MAX values. > > +config ARCH_MMAP_RND_BITS_MIN > + # On 64-bit up to 1G of address space (2^30) > + default 12 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 30 - 18 = 12 > + default 14 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 30 - 16 = 14 > + default 16 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 30 - 14 = 16 > + default 18 if 64BIT # 4K (2^12), = 30 - 12 = 18 > + default ARCH_MMAP_RND_COMPAT_BITS_MIN > + > +config ARCH_MMAP_RND_BITS_MAX > + # On 64-bit up to 32T of address space (2^45) > + default 27 if 64BIT && PPC_256K_PAGES # 256K (2^18), = 45 - 18 = 27 > + default 29 if 64BIT && PPC_64K_PAGES# 64K (2^16), = 45 - 16 = 29 > + default 31 if 64BIT && PPC_16K_PAGES# 16K (2^14), = 45 - 14 = 31 > + default 33 if 64BIT # 4K (2^12), = 45 - 12 = 33 > + default ARCH_MMAP_RND_COMPAT_BITS_MAX > + > +config ARCH_MMAP_RND_COMPAT_BITS_MIN > + # Up to 8MB of address space (2^23) > + default 5 if PPC_256K_PAGES # 256K (2^18), = 23 - 18 = 5 > + default 7 if PPC_64K_PAGES # 64K (2^16), = 23 - 16 = 7 > + default 9 if PPC_16K_PAGES # 16K (2^14), = 23 - 14 = 9 > + default 11 # 4K (2^12), = 23 - 12 = 11 > + > +config ARCH_MMAP_RND_COMPAT_BITS_MAX > + # Up to 2G of address space (2^31) > + default 13 if PPC_256K_PAGES# 256K (2^18), = 31 - 18 = 13 > + default 15 if PPC_64K_PAGES # 64K (2^16), = 31 - 16 = 15 > + default 1
Re: [PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
On Sat, Feb 4, 2017 at 6:13 AM, Kees Cook wrote: > On Thu, Feb 2, 2017 at 9:11 PM, Bhupesh Sharma wrote: >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Signed-off-by: Bhupesh Sharma >> Reviewed-by: Kees Cook > > This " at " should be "@", but otherwise, yay v2! :) > Noted. Sorry for the typo :( Regards, Bhupesh
[PATCH v2 1/1] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. This patch makes sure that now powerpc mmap arch_mmap_rnd() approach is similar to other ARCHs like x86, arm64 and arm. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Signed-off-by: Bhupesh Sharma Reviewed-by: Kees Cook --- Changes since v1: v1 can be seen here (https://lists.ozlabs.org/pipermail/linuxppc-dev/2017-February/153594.html) - No functional change in this patch. - Added R-B from Kees. - Dropped PATCH 2/2 from v1 as recommended by Kees Cook. 
arch/powerpc/Kconfig | 34 ++ arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index a8ee573fe610..b4a843f68705 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,38 @@ config MMU bool default y +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT + default 12 if PPC_256K_PAGES && 64BIT + default 7 if PPC_64K_PAGES && 32BIT + default 14 if PPC_64K_PAGES && 64BIT + default 9 if PPC_16K_PAGES && 32BIT + default 16 if PPC_16K_PAGES && 64BIT + default 11 if PPC_4K_PAGES && 32BIT + default 18 if PPC_4K_PAGES && 64BIT + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - 4 +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 +config ARCH_MMAP_RND_BITS_MAX + default 10 if PPC_256K_PAGES && 32BIT + default 26 if PPC_256K_PAGES && 64BIT + default 12 if PPC_64K_PAGES && 32BIT + default 28 if PPC_64K_PAGES && 64BIT + default 14 if PPC_16K_PAGES && 32BIT + default 30 if PPC_16K_PAGES && 64BIT + default 16 if PPC_4K_PAGES && 32BIT + default 32 if PPC_4K_PAGES && 64BIT + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 16 + config HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -100,6 +132,8 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_ARCH_KGDB + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index 2f1e44362198..babf59faab3b 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = 
get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
On 3 Feb 2017 00:49, "Kees Cook" wrote: On Thu, Feb 2, 2017 at 10:08 AM, Bhupesh Sharma wrote: > On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote: >> On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >>> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >>> hardcoded value of 0x2000_0000 to something more practical, >>> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >>> 64-bit platforms which would like to utilize more randomization >>> in the load address of a PIE elf). >> >> I don't think you want this second patch. Moving ELF_ET_DYN_BASE to >> the top of TASK_SIZE means you'll be constantly colliding with stack >> and mmap randomization. 0x20000000 is way better since it randomizes >> up from there towards the mmap area. >> >> Is there a reason to avoid the 32-bit memory range for the ELF addresses? > > I think you are right. Hmm, I think I was going by my particular use > case which might not be required for generic PPC platforms. > > I have one doubt though - I have primarily worked on arm64 and x86 > architectures and there I see 64-bit user space applications > using the 64-bit load addresses/ranges. I am not sure why PPC64 is > different historically. x86's ELF_ET_DYN_BASE is (TASK_SIZE / 3 * 2), so it puts the ET_DYN base at the top third of the address space. (In theory, this is to avoid interpreter collisions, but I'm working on removing that restriction, as it seems pointless.) Other architectures have small ELF_ET_DYN_BASEs, which is good: it allows them to have larger entropy for ET_DYN. Fair enough. I will drop the 2nd patch then and spin a v2. Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Balbir, On Thu, Feb 2, 2017 at 12:14 PM, Balbir Singh wrote: > On Thu, Feb 02, 2017 at 11:12:46AM +0530, Bhupesh Sharma wrote: >> This RFC patchset tries to make the powerpc ASLR elf randomness >> implementation similar to other ARCHs (like x86). >> >> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc >> mmap implementation to allow a sane balance between increased randomness >> in the mmap address of ASLR elfs and increased address space >> fragmentation. >> > > From what I see we get 28 bits of entropy right for 64k pages > bits as compared to 14 bits earlier? That's correct. We can go up to 28 bits of entropy for 64BIT platforms using 64K pages with the current approach. I see arm64 using > 28 bits of entropy randomness in some cases, but I think 28-bit MAX entropy is sensible for the 64BIT/64K combination on PPC. >> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >> hardcoded value of 0x2000_0000 to something more practical, >> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >> 64-bit platforms which would like to utilize more randomization >> in the load address of a PIE elf). >> > > This helps PIE executables as such and leaves other not impacted? It basically affects all shared object files (as noted in [1]). However, as Kees noted in one of his reviews, I think this 2nd patch might not be needed for all generic ppc platforms. [1] http://lxr.free-electrons.com/source/arch/powerpc/include/asm/elf.h#L26. Regards, Bhupesh
Re: [PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
Hi Kees, Thanks for the review. Please see my comments inline. On Thu, Feb 2, 2017 at 7:51 PM, Kees Cook wrote: > On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >> This RFC patchset tries to make the powerpc ASLR elf randomness >> implementation similar to other ARCHs (like x86). >> >> The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc >> mmap implementation to allow a sane balance between increased randomness >> in the mmap address of ASLR elfs and increased address space >> fragmentation. >> >> The 2nd patch increases the ELF_ET_DYN_BASE value from the current >> hardcoded value of 0x2000_0000 to something more practical, >> i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for >> 64-bit platforms which would like to utilize more randomization >> in the load address of a PIE elf). > > I don't think you want this second patch. Moving ELF_ET_DYN_BASE to > the top of TASK_SIZE means you'll be constantly colliding with stack > and mmap randomization. 0x20000000 is way better since it randomizes > up from there towards the mmap area. > > Is there a reason to avoid the 32-bit memory range for the ELF addresses? > > -Kees I think you are right. Hmm, I think I was going by my particular use case which might not be required for generic PPC platforms. I have one doubt though - I have primarily worked on arm64 and x86 architectures and there I see 64-bit user space applications using the 64-bit load addresses/ranges. I am not sure why PPC64 is different historically. Regards, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Balbir, On Thu, Feb 2, 2017 at 2:41 PM, Balbir Singh wrote: >> @@ -100,6 +132,8 @@ config PPC >> select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && >> POWER7_CPU) >> select HAVE_KPROBES >> select HAVE_ARCH_KGDB >> + select HAVE_ARCH_MMAP_RND_BITS >> + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT > > COMPAT is on for ppc64 by default, so we'll end up with COMPAT_BITS same > as before all the time. No, actually the 'ARCH_MMAP_RND_COMPAT_BITS' values can be changed after boot using the '/proc/sys/vm/mmap_rnd_compat_bits' tunable: http://lxr.free-electrons.com/source/arch/Kconfig#L624 Regards, Bhupesh
Re: [PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
Hi Kees, On Thu, Feb 2, 2017 at 7:55 PM, Kees Cook wrote: > On Wed, Feb 1, 2017 at 9:42 PM, Bhupesh Sharma wrote: >> powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for >> 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset >> for the mmap base address. >> >> This value represents a compromise between increased >> ASLR effectiveness and avoiding address-space fragmentation. >> Replace it with a Kconfig option, which is sensibly bounded, so that >> platform developers may choose where to place this compromise. >> Keep default values as new minimums. >> >> This patch makes sure that now powerpc mmap arch_mmap_rnd() approach >> is similar to other ARCHs like x86, arm64 and arm. >> >> Cc: Alexander Graf >> Cc: Benjamin Herrenschmidt >> Cc: Paul Mackerras >> Cc: Michael Ellerman >> Cc: Anatolij Gustschin >> Cc: Alistair Popple >> Cc: Matt Porter >> Cc: Vitaly Bordug >> Cc: Scott Wood >> Cc: Kumar Gala >> Cc: Daniel Cashman >> Cc: Kees Cook >> Signed-off-by: Bhupesh Sharma >> --- >> arch/powerpc/Kconfig | 34 ++ >> arch/powerpc/mm/mmap.c | 7 --- >> 2 files changed, 38 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index a8ee573fe610..b4a843f68705 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -22,6 +22,38 @@ config MMU >> bool >> default y >> >> +config ARCH_MMAP_RND_BITS_MIN >> + default 5 if PPC_256K_PAGES && 32BIT >> + default 12 if PPC_256K_PAGES && 64BIT >> + default 7 if PPC_64K_PAGES && 32BIT >> + default 14 if PPC_64K_PAGES && 64BIT >> + default 9 if PPC_16K_PAGES && 32BIT >> + default 16 if PPC_16K_PAGES && 64BIT >> + default 11 if PPC_4K_PAGES && 32BIT >> + default 18 if PPC_4K_PAGES && 64BIT >> + >> +# max bits determined by the following formula: >> +# VA_BITS - PAGE_SHIFT - 4 >> +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 >> +config ARCH_MMAP_RND_BITS_MAX >> + default 10 if PPC_256K_PAGES && 32BIT >> + default 26 if PPC_256K_PAGES && 64BIT >> 
+ default 12 if PPC_64K_PAGES && 32BIT >> + default 28 if PPC_64K_PAGES && 64BIT >> + default 14 if PPC_16K_PAGES && 32BIT >> + default 30 if PPC_16K_PAGES && 64BIT >> + default 16 if PPC_4K_PAGES && 32BIT >> + default 32 if PPC_4K_PAGES && 64BIT >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MIN >> + default 5 if PPC_256K_PAGES >> + default 7 if PPC_64K_PAGES >> + default 9 if PPC_16K_PAGES >> + default 11 >> + >> +config ARCH_MMAP_RND_COMPAT_BITS_MAX >> + default 16 >> + >> config HAVE_SETUP_PER_CPU_AREA >> def_bool PPC64 >> >> @@ -100,6 +132,8 @@ config PPC >> select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && >> POWER7_CPU) >> select HAVE_KPROBES >> select HAVE_ARCH_KGDB >> + select HAVE_ARCH_MMAP_RND_BITS >> + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT >> select HAVE_KRETPROBES >> select HAVE_ARCH_TRACEHOOK >> select HAVE_MEMBLOCK >> diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c >> index 2f1e44362198..babf59faab3b 100644 >> --- a/arch/powerpc/mm/mmap.c >> +++ b/arch/powerpc/mm/mmap.c >> @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) >> { >> unsigned long rnd; >> >> - /* 8MB for 32bit, 1GB for 64bit */ >> +#ifdef CONFIG_COMPAT >> if (is_32bit_task()) >> - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); >> + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - >> 1); >> else >> - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); >> +#endif >> + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); >> >> return rnd << PAGE_SHIFT; >> } > > Awesome! This looks good to me based on my earlier analysis. > > Reviewed-by: Kees Cook Many thanks. Regards, Bhupesh
Re: Query regarding randomization bits for a ASLR elf on PPC64
Hi Kees, On Thu, Jan 26, 2017 at 7:08 AM, Kees Cook wrote: > On Sun, Jan 22, 2017 at 9:34 PM, Bhupesh Sharma wrote: >> I was recently looking at ways to extend the randomization range for a >> ASLR elf on a PPC64LE system. >> >> I basically have been using 28-bits of randomization on x86_64 for an >> ASLR elf using appropriate ARCH_MMAP_RND_BITS_MIN and >> ARCH_MMAP_RND_BITS_MAX values: >> >> http://lxr.free-electrons.com/source/arch/x86/Kconfig#L192 >> >> And I understand from looking at the PPC64 code base that both >> ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX are not used in the >> current upstream code. > > Yeah, looks like PPC could use it. If you've got hardware to test > with, please add it. :) > >> I am looking at ways to randomize the mmap, stack and brk ranges for a >> ALSR elf on PPC64LE. Currently I am using a PAGE SIZE of 64K in my >> config file and hence the randomization usually translates to >> something like this for me: > > Just to be clear: 64K pages will lose you 4 bits of entropy when > compared to 4K on x86_64. (Assuming I'm doing the math right...) > >> mmap: >> --- >> http://lxr.free-electrons.com/source/arch/powerpc/mm/mmap.c#L67 >> >> rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); >> >> Since PAGE_SHIFT is 16 for 64K page size, this computation reduces to: >> rnd = get_random_long() % (1UL<<(14)); >> >> If I compare this to x86_64, I see there: >> >> http://lxr.free-electrons.com/source/arch/x86/mm/mmap.c#L79 >> >> rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); >> >> So, if mmap_rnd_bits = 28, this equates to: >> rnd = get_random_long() & ((1UL << 28) - 1); >> >> Observations and Queries: >> -- >> >> - So, x86_64 gives approx twice number of random bits for a ASLR elf >> running on it as compared to PPC64 although both use a 48-bit VA. >> >> - I also see this comment for PPC at various places, regarding 1GB >> randomness spread for PPC64. 
Is this restricted by the hardware or the >> kernel usage?: >> >> /* 8MB for 32bit, 1GB for 64bit */ >> 64 if (is_32bit_task()) >> 65 rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); >> 66 else >> 67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); > > Yeah, I'm not sure about this. The comments above the MIN_GAP* macros > seem to talk about making sure there is the 1GB stack gap, but that > shouldn't limit mmap. > > Stack base is randomized in fs/binfmt_elf.c randomize_stack_top() > which uses STACK_RND_MASK (and PAGE_SHIFT). > > x86: > /* 1GB for 64bit, 8MB for 32bit */ > #define STACK_RND_MASK (test_thread_flag(TIF_ADDR32) ? 0x7ff : 0x3fffff) > > powerpc: > /* 1GB for 64bit, 8MB for 32bit */ > #define STACK_RND_MASK (is_32bit_task() ? \ > (0x7ff >> (PAGE_SHIFT - 12)) : \ > (0x3ffff >> (PAGE_SHIFT - 12))) > > So, in the 64k page case, stack randomization entropy is reduced, but > otherwise identical to x86. > > x86 and powerpc both use arch_mmap_rnd() for both mmap and ET_DYN > (with different bases). > > x86 uses ELF_ET_DYN_BASE as TASK_SIZE / 3 * 2 (which the ELF loader > pushes back up the nearest PAGE_SIZE alignment: 0x5000), > though powerpc uses 0x20000000, so it should have significantly more > space for mmap and ET_DYN ASLR than x86. >> - I tried to increase the randomness to 28 bits for PPC as well by >> making the PPC mmap, brk code equivalent to x86_64 and it works fine >> for my use case. > > The PPC brk randomization on powerpc doesn't use the more common > randomize_page() way other archs do it... > > /* 8MB for 32bit, 1GB for 64bit */ > if (is_32bit_task()) > rnd = (get_random_long() % (1UL<<(23-PAGE_SHIFT))); > else > rnd = (get_random_long() % (1UL<<(30-PAGE_SHIFT))); > > return rnd << PAGE_SHIFT; > > x86 uses 0x02000000 (via randomize_page()), which, if I'm doing the > math right is 14 bits, regardless of 32/64-bit. arm64 uses 0x40000000 > (20 bits) on 64-bit processes and the same as x86 (14) for 32-bit > processes.
Looks like powerpc uses either 13 or 20 for 4k pages, which > is close to the same. > >> - But, I am not sure this is the right thing to do and whether the >> PPC64 also supports the MIN and MAX ranges for randomization. > > It can support it once you implement the Kconfigs for it. :) > >> - If it does I would
[PATCH 2/2] powerpc: Redefine ELF_ET_DYN_BASE
Currently the powerpc arch uses an ELF_ET_DYN_BASE value of 0x20000000 which ends up pushing an elf to a load address which is 32-bit. On 64-bit platforms, this might be too low, especially when one is trying to increase the randomness of the load address of the ASLR elfs on such platforms. This patch makes the powerpc platforms mimic the x86 ones, by ensuring that the ELF_ET_DYN_BASE is calculated on the basis of the current task's TASK_SIZE. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Signed-off-by: Bhupesh Sharma --- arch/powerpc/include/asm/elf.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/powerpc/include/asm/elf.h b/arch/powerpc/include/asm/elf.h index ee46ffef608e..dd035f6dd782 100644 --- a/arch/powerpc/include/asm/elf.h +++ b/arch/powerpc/include/asm/elf.h @@ -28,7 +28,7 @@ the loader. We need to make sure that it is out of the way of the program that it will "exec", and that there is sufficient room for the brk. */ -#define ELF_ET_DYN_BASE 0x20000000 +#define ELF_ET_DYN_BASE (TASK_SIZE - PAGE_SIZE) #define ELF_CORE_EFLAGS (is_elf2_task() ? 2 : 0) -- 2.7.4
[PATCH 1/2] powerpc: mm: support ARCH_MMAP_RND_BITS
powerpc: arch_mmap_rnd() uses hard-coded values, (23-PAGE_SHIFT) for 32-bit and (30-PAGE_SHIFT) for 64-bit, to generate the random offset for the mmap base address. This value represents a compromise between increased ASLR effectiveness and avoiding address-space fragmentation. Replace it with a Kconfig option, which is sensibly bounded, so that platform developers may choose where to place this compromise. Keep default values as new minimums. This patch makes sure that now powerpc mmap arch_mmap_rnd() approach is similar to other ARCHs like x86, arm64 and arm. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Signed-off-by: Bhupesh Sharma --- arch/powerpc/Kconfig | 34 ++ arch/powerpc/mm/mmap.c | 7 --- 2 files changed, 38 insertions(+), 3 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index a8ee573fe610..b4a843f68705 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -22,6 +22,38 @@ config MMU bool default y +config ARCH_MMAP_RND_BITS_MIN + default 5 if PPC_256K_PAGES && 32BIT + default 12 if PPC_256K_PAGES && 64BIT + default 7 if PPC_64K_PAGES && 32BIT + default 14 if PPC_64K_PAGES && 64BIT + default 9 if PPC_16K_PAGES && 32BIT + default 16 if PPC_16K_PAGES && 64BIT + default 11 if PPC_4K_PAGES && 32BIT + default 18 if PPC_4K_PAGES && 64BIT + +# max bits determined by the following formula: +# VA_BITS - PAGE_SHIFT - 4 +# for e.g for 64K page and 64BIT = 48 - 16 - 4 = 28 +config ARCH_MMAP_RND_BITS_MAX + default 10 if PPC_256K_PAGES && 32BIT + default 26 if PPC_256K_PAGES && 64BIT + default 12 if PPC_64K_PAGES && 32BIT + default 28 if PPC_64K_PAGES && 64BIT + default 14 if PPC_16K_PAGES && 32BIT + default 30 if PPC_16K_PAGES && 64BIT + default 16 if PPC_4K_PAGES && 32BIT + default 32 if PPC_4K_PAGES && 64BIT + +config ARCH_MMAP_RND_COMPAT_BITS_MIN + default 
5 if PPC_256K_PAGES + default 7 if PPC_64K_PAGES + default 9 if PPC_16K_PAGES + default 11 + +config ARCH_MMAP_RND_COMPAT_BITS_MAX + default 16 + config HAVE_SETUP_PER_CPU_AREA def_bool PPC64 @@ -100,6 +132,8 @@ config PPC select HAVE_EFFICIENT_UNALIGNED_ACCESS if !(CPU_LITTLE_ENDIAN && POWER7_CPU) select HAVE_KPROBES select HAVE_ARCH_KGDB + select HAVE_ARCH_MMAP_RND_BITS + select HAVE_ARCH_MMAP_RND_COMPAT_BITS if COMPAT select HAVE_KRETPROBES select HAVE_ARCH_TRACEHOOK select HAVE_MEMBLOCK diff --git a/arch/powerpc/mm/mmap.c b/arch/powerpc/mm/mmap.c index 2f1e44362198..babf59faab3b 100644 --- a/arch/powerpc/mm/mmap.c +++ b/arch/powerpc/mm/mmap.c @@ -60,11 +60,12 @@ unsigned long arch_mmap_rnd(void) { unsigned long rnd; - /* 8MB for 32bit, 1GB for 64bit */ +#ifdef CONFIG_COMPAT if (is_32bit_task()) - rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); + rnd = get_random_long() & ((1UL << mmap_rnd_compat_bits) - 1); else - rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); +#endif + rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); return rnd << PAGE_SHIFT; } -- 2.7.4
[PATCH 0/2] RFC: Adjust powerpc ASLR elf randomness
This RFC patchset tries to make the powerpc ASLR elf randomness implementation similar to other ARCHs (like x86). The 1st patch introduces the support of ARCH_MMAP_RND_BITS in powerpc mmap implementation to allow a sane balance between increased randomness in the mmap address of ASLR elfs and increased address space fragmentation. The 2nd patch increases the ELF_ET_DYN_BASE value from the current hardcoded value of 0x2000_0000 to something more practical, i.e. TASK_SIZE - PAGE_SIZE (which makes sense especially for 64-bit platforms which would like to utilize more randomization in the load address of a PIE elf). I have tested this patchset on 64-bit Fedora and RHEL7 machines/VMs. Here are the test results and details of the test environment: 1. Create a test PIE program which shows its own memory map: $ cat show_mmap_pie.c #include <stdio.h> #include <unistd.h> int main(void){ char command[1024]; sprintf(command,"cat /proc/%d/maps",getpid()); system(command); return 0; } 2. Compile it as a PIE: $ gcc -o show_mmap_pie -fpie -pie show_mmap_pie.c 3. Before this patchset (on a Fedora-25 PPC64 POWER7 machine): # ./show_mmap_pie 33dd-33de r-xp fd:00 1724816 /root/git/linux/show_mmap_pie 33de-33df r--p fd:00 1724816 /root/git/linux/show_mmap_pie 33df-33e0 rw-p 0001 fd:00 1724816 /root/git/linux/show_mmap_pie 3fff9d75-3fff9d94 r-xp fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d94-3fff9d95 ---p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d95-3fff9d96 r--p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d96-3fff9d97 rw-p 0020 fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fff9d98-3fff9d9a r-xp 00:00 0 [vdso] 3fff9d9a-3fff9d9e r-xp fd:00 2625136 /usr/lib64/ld-2.23.so 3fff9d9e-3fff9d9f r--p 0003 fd:00 2625136 /usr/lib64/ld-2.23.so 3fff9d9f-3fff9da0 rw-p 0004 fd:00 2625136 /usr/lib64/ld-2.23.so 3528-352b rw-p 00:00 0 [stack] As one can notice, the load address even for a 64-bit binary (show_mmap_pie) is within the 32-bit range. 4.
After this patchset (on a Fedora-25 PPC64 POWER7 machine): # ./show_mmap_pie 3fffad25-3fffad44 r-xp fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad44-3fffad45 ---p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad45-3fffad46 r--p 001f fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad46-3fffad47 rw-p 0020 fd:00 2753176 /usr/lib64/power7/libc-2.23.so 3fffad48-3fffad4a r-xp 00:00 0 [vdso] 3fffad4a-3fffad4e r-xp fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad4e-3fffad4f r--p 0003 fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad4f-3fffad50 rw-p 0004 fd:00 2625136 /usr/lib64/ld-2.23.so 3fffad50-3fffad51 r-xp fd:00 1724816 /root/git/linux/show_mmap_pie 3fffad51-3fffad52 r--p fd:00 1724816 /root/git/linux/show_mmap_pie 3fffad52-3fffad53 rw-p 0001 fd:00 1724816 /root/git/linux/show_mmap_pie 3fffe311-3fffe314 rw-p 00:00 0 [stack] The load address of the elf is now pushed to be in a 64-bit range. As I have access to limited number of powerpc machines, request folks having powerpc platforms to try this patchset and share their test results/issues as well. Cc: Alexander Graf Cc: Benjamin Herrenschmidt Cc: Paul Mackerras Cc: Michael Ellerman Cc: Anatolij Gustschin Cc: Alistair Popple Cc: Matt Porter Cc: Vitaly Bordug Cc: Scott Wood Cc: Kumar Gala Cc: Daniel Cashman Cc: Kees Cook Bhupesh Sharma (2): powerpc: mm: support ARCH_MMAP_RND_BITS powerpc: Redefine ELF_ET_DYN_BASE arch/powerpc/Kconfig | 34 ++ arch/powerpc/include/asm/elf.h | 2 +- arch/powerpc/mm/mmap.c | 7 --- 3 files changed, 39 insertions(+), 4 deletions(-) -- 2.7.4
Query regarding randomization bits for a ASLR elf on PPC64
Hi Experts, I was recently looking at ways to extend the randomization range for a ASLR elf on a PPC64LE system. I basically have been using 28-bits of randomization on x86_64 for an ASLR elf using appropriate ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX values: http://lxr.free-electrons.com/source/arch/x86/Kconfig#L192 And I understand from looking at the PPC64 code base that both ARCH_MMAP_RND_BITS_MIN and ARCH_MMAP_RND_BITS_MAX are not used in the current upstream code. I am looking at ways to randomize the mmap, stack and brk ranges for a ALSR elf on PPC64LE. Currently I am using a PAGE SIZE of 64K in my config file and hence the randomization usually translates to something like this for me: mmap: --- http://lxr.free-electrons.com/source/arch/powerpc/mm/mmap.c#L67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); Since PAGE_SHIFT is 16 for 64K page size, this computation reduces to: rnd = get_random_long() % (1UL<<(14)); If I compare this to x86_64, I see there: http://lxr.free-electrons.com/source/arch/x86/mm/mmap.c#L79 rnd = get_random_long() & ((1UL << mmap_rnd_bits) - 1); So, if mmap_rnd_bits = 28, this equates to: rnd = get_random_long() & ((1UL << 28) - 1); Observations and Queries: -- - So, x86_64 gives approx twice number of random bits for a ASLR elf running on it as compared to PPC64 although both use a 48-bit VA. - I also see this comment for PPC at various places, regarding 1GB randomness spread for PPC64. Is this restricted by the hardware or the kernel usage?: /* 8MB for 32bit, 1GB for 64bit */ 64 if (is_32bit_task()) 65 rnd = get_random_long() % (1<<(23-PAGE_SHIFT)); 66 else 67 rnd = get_random_long() % (1UL<<(30-PAGE_SHIFT)); - I tried to increase the randomness to 28 bits for PPC as well by making the PPC mmap, brk code equivalent to x86_64 and it works fine for my use case. - But, I am not sure this is the right thing to do and whether the PPC64 also supports the MIN and MAX ranges for randomization. 
- If it does, I would like to understand, test, and push a patch implementing the same for PPC64 upstream. Sorry for the long mail, but I would really appreciate it if someone could help me understand the details here. Thanks, Bhupesh