Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-09-02 Thread Bamvor Jian Zhang
Based on the off-list discussion, the community cares about a possible
performance regression for aarch64 LP64 and aarch32 after ILP32
is merged.

Given that there is no big open issue in the kernel part of ILP32, I am
trying to address this concern. It is reasonable that we should run many
test suites (such as LKP) to ensure there is no performance regression.
But I am not an expert in this, so I started by running lmbench on
aarch64 LP64 and comparing the difference between a kernel with ILP32
enabled and one without the ILP32 patches.

The branch I used is ilp32-4.8 on [1]. I compared the results between
two commits: "d3746f1 arm64:ilp32: add ARM64_ILP32 to Kconfig" (defconfig
with CONFIG_ARM64_ILP32) and "3054de8 fiz set_personality by Catalin"
(defconfig).

The results show there is no big difference. Most of the differences are
less than 5%. Only two differences are more than 10%:
1.  Context switching 2p/16K: 13.16% (ILP32 is bigger than No_ILP32;
smaller is better).
2.  *Local* Communication bandwidths, TCP: -10.77% (ILP32 is smaller than
No_ILP32; bigger is better).
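
For readers who want to redo the arithmetic: the percentages in this mail are
relative differences, (ILP32 - No_ILP32)/No_ILP32, as stated in [2]. The sketch
below shows that formula and how the sign is read for "smaller is better" and
"bigger is better" benchmarks. The function names and the raw timings are
hypothetical and are not taken from the lmbench run; only the two percentages
13.16 and -10.77 come from the list above.

```python
def relative_diff(ilp32: float, no_ilp32: float) -> float:
    """Return (ILP32 - No_ILP32) / No_ILP32 as a percentage."""
    if no_ilp32 == 0:
        raise ValueError("baseline measurement must be non-zero")
    return (ilp32 - no_ilp32) / no_ilp32 * 100.0


def is_regression(diff_percent: float, smaller_is_better: bool) -> bool:
    """A positive diff is worse when smaller is better, and vice versa."""
    if smaller_is_better:
        return diff_percent > 0
    return diff_percent < 0


# Hypothetical raw timings in microseconds (not measured values).
baseline_us = 8.0
patched_us = 10.0
print(f"{relative_diff(patched_us, baseline_us):+.2f}%")

# The two outliers reported above, classified by benchmark direction.
print(is_regression(13.16, smaller_is_better=True))    # context switching
print(is_regression(-10.77, smaller_is_better=False))  # TCP bandwidth
```

Both outliers count as regressions under this reading, which is why they are
called out separately from the results under 5%.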


If this makes sense to the community, I could continue with more testing.

Thanks

Bamvor

[1] https://github.com/norov/linux.git
[2] The full result: (ILP32 - No_ILP32)/No_ILP32

 L M B E N C H  3 . 0   S U M M A R Y
 
 (Alpha software, do not distribute)

Basic system parameters
--
Host OS Description  Mhz  tlb  cache  mem   scal
 pages line   par   load
   bytes
- - ---  - - -- 
buildroot Linux 4.8.0-r A64_ILP32_diff_No_ILP32 102432   128 0.23% 1

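Processor, Processes - times in microseconds - smaller is better
------------------------------------------------------------------------------
Host      OS            Mhz    null   null   stat   open   slct   sig    sig    fork   exec   sh
                               call   I/O           clos   TCP    inst   hndl   proc   proc   proc
--------- ------------- ------ ------ ------ ------ ------ ------ ------ ------ ------ ------ ------
buildroot Linux 4.8.0-r 0.00%  0.00%  0.00%  -3.03% -0.42% -1.96% 0.00%  -0.67% 2.29%  -6.34% 0.85%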

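Basic integer operations - times in nanoseconds - smaller is better
-------------------------------------------------------------------
Host      OS            intgr  intgr  intgr  intgr  intgr
                        bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------
buildroot Linux 4.8.0-r 0.00%  0.00%  0.00%  0.00%  0.00%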

Basic uint64 operations - times in nanoseconds - smaller is better
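------------------------------------------------------------------
Host      OS            int64  int64  int64  int64  int64
                        bit    add    mul    div    mod
--------- ------------- ------ ------ ------ ------ ------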
buildroot Linux 4.8.0-r  0.00%0.00%  0.00%  0.00%

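Basic float operations - times in nanoseconds - smaller is better
-----------------------------------------------------------------
Host      OS            float  float  float  float
                        add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
buildroot Linux 4.8.0-r 0.00%  0.00%  0.04%  0.00%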

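Basic double operations - times in nanoseconds - smaller is better
------------------------------------------------------------------
Host      OS            double double double double
                        add    mul    div    bogo
--------- ------------- ------ ------ ------ ------
buildroot Linux 4.8.0-r 0.00%  0.00%  0.00%  0.00%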

Context switching - times in microseconds - smaller is better
-------------------------------------------------------------------------
Host      OS            2p/0K  2p/16K 2p/64K 8p/16K 8p/64K 16p/16K 16p/64K
                        ctxsw  ctxsw  ctxsw  ctxsw  ctxsw  ctxsw   ctxsw
--------- ------------- ------ ------ ------ ------ ------ ------- -------
buildroot Linux 4.8.0-r -6.00% 13.16% -1.83% 3.80%  9.94%  -6.17%  2.72%

*Local* Communication latencies in microseconds - smaller is better
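---------------------------------------------------------------------
Host      OS            2p/0K  Pipe   AF     UDP    RPC/   TCP    RPC/   TCP
                        ctxsw         UNIX          UDP           TCP    conn
--------- ------------- ------ ------ ------ ------ ------ ------ ------ ------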
buildroot Linux 4.8.0-r -6.00% -4.08% 1.95% -5.02%4.87% 0.00%


File & VM system latencies in microseconds - smaller is better
---
Host OS   0K File  10K File MmapProt   Page   100fd
Create Delete Create Delete Latency Fault  Fault  selct
- - -- -- -- -- --- - --- -
buildroot 

Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-18 Thread Yury Norov
On Wed, Aug 17, 2016 at 04:26:42PM +0100, Catalin Marinas wrote:
> On Wed, Aug 17, 2016 at 04:32:23PM +0200, Dr. Philipp Tomsich wrote:
> > On 17 Aug 2016, at 16:29, Catalin Marinas  wrote:
> > > On Wed, Aug 17, 2016 at 02:54:59PM +0200, Dr. Philipp Tomsich wrote:
> > >> On 17 Aug 2016, at 14:48, Yury Norov  wrote:
> > >>> On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
> > >>>> On 17 Aug 2016, at 13:46, Yury Norov  wrote:
> > >>>>> This series enables aarch64 with ilp32 mode, and as supporting work,
> > >>>>> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > >>>>> existing 32-bit architectures but disabled for new arches (so 64-bit
> > >>>>> off_t is is used by new userspace).
> > >>>>> 
> > >>>>> This version is based on kernel v4.8-rc2.
> > >>>>> It works with glibc-2.23, and tested with LTP.
> > >>>>> 
> > >>>>> This is RFC because there is still no solid understanding what type of registers
> > >>>>> top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
> > >>>>> syscall in assembler entry. The alternative approach is in introducing compat
> > >>>>> wrappers which is little faster for natively routed syscalls (~2.6% for syscall
> > >>>>> with no payload) but much more complicated.
> > >>>> 
> > >>>> So you’re saying there are 2 options:
> > >>>> 
> > >>>> 1) easy to get right, slightly slower, same ABI to user space as 2
> > >>>> 2) harder to get right, minor performance benefit
> > >>> 
> > >>> No, ABI is little different. If 1) we pass off_t in a pair to syscalls,
> > >>> if 2) - in a single register. So if 1, we 'd take some wrappers from aarch32.
> > >>> See patch 12 here.
> > >> 
> > >> From our experience with ILP32, I’d prefer to have off_t (and similar)
> > >> in a single register whenever possible (i.e. option #2).  It feels
> > >> more natural to use the full 64bit registers whenever possible, as
> > >> ILP32 on ARMv8 should really be understood as a 64bit ABI with a 32bit
> > >> memory model.
> > > 
> > > I think we are well past the point where we considered ILP32 a 64-bit
> > > ABI. It would have been nice but we decided that breaking POSIX
> > > compatibility is a bad idea, so we went back (again) to a 32-bit ABI for
> > > ILP32. While there are 64-bit arguments that, at a first look, would
> > > make sense to be passed in 64-bit registers, the kernel maintenance cost
> > > is significant with changes to generic files.
> > > 
> > > Allowing 64-bit wide registers at the ILP32 syscall interface means that
> > > the kernel would have to zero/sign-extend the upper half of the 32-bit
> > > arguments for the cases where they are passed directly to a native
> > > syscall that expects a 64-bit argument. This (a) adds a significant
> > > number of wrappers to the generic code together additional annotations
> > > to the generic unistd.h and (b) it adds a small overhead to the AArch32
> > > (compat) ABI since it doesn't need such generic wrapping (the upper half
> > > of 64-bit registers is guaranteed to be zero/preserved by the
> > > architecture when coming from the AArch32 mode).
> > 
> > Yes, I remember the discussions and just wanted to put option #2 in
> > context again.
> 
> I don't particularly like splitting 64-bit arguments in two 32-bit
> values either but I don't see a better alternative. To keep this
> mostly in the arch code we would need an additional table of syscall
> wrappers where the majority just use the default zero-extend everything
> with a few specific wrappers where we pass 64-bit arguments. Or we could
> set an extra bit in the syscall number for those syscalls that need
> special wrapping and avoid zero-extending. But neither of these look any
> nicer (well, maybe only from the user-space perspective).
> 

This is the discussion started by David Miller:
https://patchwork.kernel.org/patch/9132521/

After it, we switched to the current version.

> > Everything points to just going with the pair-of-registers and getting
> > this merged quickly then, I suppose.
> 
> I will refrain from commenting on how quickly we merge this ;) (it may
> be seen as binding by some).
> 
> -- 
> Catalin
--
To unsubscribe from this list: send the line "unsubscribe linux-doc" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Dr. Philipp Tomsich

> On 17 Aug 2016, at 16:29, Catalin Marinas  wrote:
> 
> On Wed, Aug 17, 2016 at 02:54:59PM +0200, Dr. Philipp Tomsich wrote:
>> On 17 Aug 2016, at 14:48, Yury Norov  wrote:
>>> On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
>>>> On 17 Aug 2016, at 13:46, Yury Norov  wrote:
>>>>> This series enables aarch64 with ilp32 mode, and as supporting work,
>>>>> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
>>>>> existing 32-bit architectures but disabled for new arches (so 64-bit
>>>>> off_t is is used by new userspace).
>>>>> 
>>>>> This version is based on kernel v4.8-rc2.
>>>>> It works with glibc-2.23, and tested with LTP.
>>>>> 
>>>>> This is RFC because there is still no solid understanding what type of registers
>>>>> top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
>>>>> syscall in assembler entry. The alternative approach is in introducing compat
>>>>> wrappers which is little faster for natively routed syscalls (~2.6% for syscall
>>>>> with no payload) but much more complicated.
>>>> 
>>>> So you’re saying there are 2 options:
>>>> 
>>>> 1) easy to get right, slightly slower, same ABI to user space as 2
>>>> 2) harder to get right, minor performance benefit
>>> 
>>> No, ABI is little different. If 1) we pass off_t in a pair to syscalls,
>>> if 2) - in a single register. So if 1, we 'd take some wrappers from aarch32.
>>> See patch 12 here.
>> 
>> From our experience with ILP32, I’d prefer to have off_t (and similar)
>> in a single register whenever possible (i.e. option #2).  It feels
>> more natural to use the full 64bit registers whenever possible, as
>> ILP32 on ARMv8 should really be understood as a 64bit ABI with a 32bit
>> memory model.
> 
> I think we are well past the point where we considered ILP32 a 64-bit
> ABI. It would have been nice but we decided that breaking POSIX
> compatibility is a bad idea, so we went back (again) to a 32-bit ABI for
> ILP32. While there are 64-bit arguments that, at a first look, would
> make sense to be passed in 64-bit registers, the kernel maintenance cost
> is significant with changes to generic files.
> 
> Allowing 64-bit wide registers at the ILP32 syscall interface means that
> the kernel would have to zero/sign-extend the upper half of the 32-bit
> arguments for the cases where they are passed directly to a native
> syscall that expects a 64-bit argument. This (a) adds a significant
> number of wrappers to the generic code together additional annotations
> to the generic unistd.h and (b) it adds a small overhead to the AArch32
> (compat) ABI since it doesn't need such generic wrapping (the upper half
> of 64-bit registers is guaranteed to be zero/preserved by the
> architecture when coming from the AArch32 mode).

Yes, I remember the discussions and just wanted to put option #2 in context
again.
Everything points to just going with the pair-of-registers and getting this
merged quickly then, I suppose.

Cheers,
Philipp.


Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Catalin Marinas
On Wed, Aug 17, 2016 at 02:54:59PM +0200, Dr. Philipp Tomsich wrote:
> On 17 Aug 2016, at 14:48, Yury Norov  wrote:
> > On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
> >> On 17 Aug 2016, at 13:46, Yury Norov  wrote:
> >>> This series enables aarch64 with ilp32 mode, and as supporting work,
> >>> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> >>> existing 32-bit architectures but disabled for new arches (so 64-bit
> >>> off_t is is used by new userspace).
> >>> 
> >>> This version is based on kernel v4.8-rc2.
> >>> It works with glibc-2.23, and tested with LTP.
> >>> 
> >>> This is RFC because there is still no solid understanding what type of registers
> >>> top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
> >>> syscall in assembler entry. The alternative approach is in introducing compat
> >>> wrappers which is little faster for natively routed syscalls (~2.6% for syscall
> >>> with no payload) but much more complicated.
> >> 
> >> So you’re saying there are 2 options:
> >> 
> >>  1) easy to get right, slightly slower, same ABI to user space as 2
> >>  2) harder to get right, minor performance benefit
> > 
> > No, ABI is little different. If 1) we pass off_t in a pair to syscalls,
> > if 2) - in a single register. So if 1, we 'd take some wrappers from aarch32.
> > See patch 12 here.
> 
> From our experience with ILP32, I’d prefer to have off_t (and similar)
> in a single register whenever possible (i.e. option #2).  It feels
> more natural to use the full 64bit registers whenever possible, as
> ILP32 on ARMv8 should really be understood as a 64bit ABI with a 32bit
> memory model.

I think we are well past the point where we considered ILP32 a 64-bit
ABI. It would have been nice but we decided that breaking POSIX
compatibility is a bad idea, so we went back (again) to a 32-bit ABI for
ILP32. While there are 64-bit arguments that, at a first look, would
make sense to be passed in 64-bit registers, the kernel maintenance cost
is significant with changes to generic files.

Allowing 64-bit wide registers at the ILP32 syscall interface means that
the kernel would have to zero/sign-extend the upper half of the 32-bit
arguments for the cases where they are passed directly to a native
syscall that expects a 64-bit argument. This (a) adds a significant
number of wrappers to the generic code, together with additional annotations
to the generic unistd.h and (b) it adds a small overhead to the AArch32
(compat) ABI since it doesn't need such generic wrapping (the upper half
of 64-bit registers is guaranteed to be zero/preserved by the
architecture when coming from the AArch32 mode).
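
To make the zero/sign-extension point concrete, here is a small model in
Python rather than kernel code. It treats a register as a 64-bit integer and
shows what clearing or sign-extending the upper half means for a 32-bit
argument that arrives with stale upper bits. The helper names and register
values are hypothetical, and the assumption that exactly the first eight
argument registers are cleaned is taken from the "w0-w7" wording in the cover
letter; this is a sketch, not the entry code from the series.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF
HIGH32 = MASK64 ^ MASK32  # upper 32 bits of a 64-bit register


def zero_extend32(reg: int) -> int:
    """Keep the low 32 bits of a 64-bit register and clear the upper half."""
    return reg & MASK32


def sign_extend32(reg: int) -> int:
    """Replicate bit 31 of the low half into the upper 32 bits."""
    low = reg & MASK32
    return low | HIGH32 if low & 0x8000_0000 else low


def delouse_args(regs: list[int]) -> list[int]:
    """Zero-extend the first eight argument registers; leave the rest as-is."""
    return [zero_extend32(r) for r in regs[:8]] + list(regs[8:])


# Hypothetical register file: x0 carries the 32-bit value 5 with stale
# data in its upper half, the remaining registers are zero.
regs = [0xDEAD_BEEF_0000_0005] + [0] * 8
print([hex(r) for r in delouse_args(regs)[:2]])
print(hex(sign_extend32(0xFFFF_FFFF)))
```

Doing this once for all argument registers at entry is the simple variant;
doing it per syscall, only for the arguments that need it, is what requires
the additional wrappers and annotations mentioned above.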

-- 
Catalin


Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Dr. Philipp Tomsich

> On 17 Aug 2016, at 14:48, Yury Norov  wrote:
> 
> On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
>> 
>>> On 17 Aug 2016, at 13:46, Yury Norov  wrote:
>>> 
>>> This series enables aarch64 with ilp32 mode, and as supporting work,
>>> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
>>> existing 32-bit architectures but disabled for new arches (so 64-bit
>>> off_t is is used by new userspace).
>>> 
>>> This version is based on kernel v4.8-rc2.
>>> It works with glibc-2.23, and tested with LTP.
>>> 
>>> This is RFC because there is still no solid understanding what type of registers
>>> top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
>>> syscall in assembler entry. The alternative approach is in introducing compat
>>> wrappers which is little faster for natively routed syscalls (~2.6% for syscall
>>> with no payload) but much more complicated.
>> 
>> So you’re saying there are 2 options:
>> 
>>  1) easy to get right, slightly slower, same ABI to user space as 2
>>  2) harder to get right, minor performance benefit
> 
> No, ABI is little different. If 1) we pass off_t in a pair to syscalls,
> if 2) - in a single register. So if 1, we 'd take some wrappers from aarch32.
> See patch 12 here.

From our experience with ILP32, I’d prefer to have off_t (and similar) in a
single register whenever possible (i.e. option #2).  It feels more natural to
use the full 64bit registers whenever possible, as ILP32 on ARMv8 should really
be understood as a 64bit ABI with a 32bit memory model.

Cheers,
Philipp.


Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Yury Norov
On Wed, Aug 17, 2016 at 02:28:50PM +0200, Alexander Graf wrote:
> 
> > On 17 Aug 2016, at 13:46, Yury Norov  wrote:
> > 
> > This series enables aarch64 with ilp32 mode, and as supporting work,
> > introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> > existing 32-bit architectures but disabled for new arches (so 64-bit
> > off_t is is used by new userspace).
> > 
> > This version is based on kernel v4.8-rc2.
> > It works with glibc-2.23, and tested with LTP.
> > 
> > This is RFC because there is still no solid understanding what type of registers
> > top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
> > syscall in assembler entry. The alternative approach is in introducing compat
> > wrappers which is little faster for natively routed syscalls (~2.6% for syscall
> > with no payload) but much more complicated.
> 
> So you’re saying there are 2 options:
> 
>   1) easy to get right, slightly slower, same ABI to user space as 2
>   2) harder to get right, minor performance benefit

No, the ABI is a little different. With 1) we pass off_t to syscalls in a
pair of registers; with 2), in a single register. So with 1) we'd take some
wrappers from aarch32. See patch 12 here.
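
As an illustration of the difference between the two options, the sketch
below models how a 64-bit off_t could be split into two 32-bit halves on the
user side and reassembled by a wrapper on the kernel side. It is a Python
model, not code from patch 12: the function names are hypothetical, and the
(low, high) ordering of the pair is an assumption made for the example only.

```python
MASK32 = 0xFFFF_FFFF
MASK64 = 0xFFFF_FFFF_FFFF_FFFF


def split_off_t(offset: int) -> tuple[int, int]:
    """Option 1: split a 64-bit offset into (low, high) 32-bit halves."""
    offset &= MASK64  # treat the offset as a 64-bit two's-complement pattern
    return offset & MASK32, offset >> 32


def join_off_t(low: int, high: int) -> int:
    """Wrapper model: rebuild the 64-bit offset from the register pair."""
    return ((high & MASK32) << 32) | (low & MASK32)


# Hypothetical offset just above 4 GiB, which does not fit in 32 bits.
offset = (1 << 32) + 0x1000
low, high = split_off_t(offset)
assert join_off_t(low, high) == offset
print(hex(low), hex(high))
```

With option 2 the offset would travel in one 64-bit register and no
join step would be needed, at the cost of the per-syscall handling of
upper halves discussed elsewhere in this thread.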

> That’s an obvious pick, no? Mark it non-RFC and stay with the clearing in 
> assembler entry. If anyone cares about those last few percent, they can still 
> push the harder path upstream later if they want to, but at least we’ll have 
> the ABI stable, so that you can start using and developing for ilp32 on 
> aarch64.
> 
> 
> Alex


Re: [RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Alexander Graf

> On 17 Aug 2016, at 13:46, Yury Norov  wrote:
> 
> This series enables aarch64 with ilp32 mode, and as supporting work,
> introduces ARCH_32BIT_OFF_T configuration option that is enabled for
> existing 32-bit architectures but disabled for new arches (so 64-bit
> off_t is is used by new userspace).
> 
> This version is based on kernel v4.8-rc2.
> It works with glibc-2.23, and tested with LTP.
> 
> This is RFC because there is still no solid understanding what type of registers
> top-halves delousing we prefer. In this patchset, w0-w7 are cleared for each
> syscall in assembler entry. The alternative approach is in introducing compat
> wrappers which is little faster for natively routed syscalls (~2.6% for syscall
> with no payload) but much more complicated.

So you’re saying there are 2 options:

  1) easy to get right, slightly slower, same ABI to user space as 2
  2) harder to get right, minor performance benefit

That’s an obvious pick, no? Mark it non-RFC and stay with the clearing in 
assembler entry. If anyone cares about those last few percent, they can still 
push the harder path upstream later if they want to, but at least we’ll have 
the ABI stable, so that you can start using and developing for ilp32 on aarch64.


Alex



[RFC2 nowrap: PATCH v7 00/18] ILP32 for ARM64

2016-08-17 Thread Yury Norov
This series enables aarch64 with ilp32 mode and, as supporting work,
introduces the ARCH_32BIT_OFF_T configuration option, which is enabled for
existing 32-bit architectures but disabled for new arches (so 64-bit
off_t is used by new userspace).

This version is based on kernel v4.8-rc2.
It works with glibc-2.23 and has been tested with LTP.

This is an RFC because there is still no solid understanding of what type of
register top-halves delousing we prefer. In this patchset, w0-w7 are cleared for
each syscall in assembler entry. The alternative approach is to introduce compat
wrappers, which is a little faster for natively routed syscalls (~2.6% for a
syscall with no payload) but much more complicated.

There are no major changes here compared to the previous submission, mostly
the rebase onto current master. All changes are listed in detail below.
No additional regression has been observed since the previous submission.

Patch 1 may be applied separately from the other patches of the series.

v3: https://lkml.org/lkml/2014/9/3/704
v4: https://lkml.org/lkml/2015/4/13/691
v5: https://lkml.org/lkml/2015/9/29/911
v6: https://lkml.org/lkml/2016/5/23/661
v7: RFC nowrap: https://lkml.org/lkml/2016/6/17/990
v7: RFC2 nowrap:
 - rebased on kernel 4.8-rc2;
 - setrlimit(), getrlimit() are handled by non-compat handlers to follow
   switching rlim_t to 64-bit in glibc, as pointed out by Andreas Schwab;
 - fixed {GET,SET}SIGMASK handling in ptrace(), as pointed out by Zhou Chengming;
 - removed put_sig{set,get)_t duplication;
 - patches 1 and 2 from previous submission are joined, missed chunk restored,
   found by Andreas Schwab.

Links:
Kernel: https://github.com/norov/linux/commits/ilp32-4.8
glibc:  https://github.com/norov/glibc/commits/ilp32-2.24-dev

Andrew Pinski (6):
  arm64: ensure the kernel is compiled for LP64
  arm64: rename COMPAT to AARCH32_EL0 in Kconfig
  arm64:uapi: set __BITS_PER_LONG correctly for ILP32 and LP64
  arm64: ilp32: add sys_ilp32.c and a separate table (in entry.S) to use
    it
  arm64: ilp32: introduce ilp32-specific handlers for sigframe and
    ucontext
  arm64:ilp32: add ARM64_ILP32 to Kconfig

Philipp Tomsich (1):
  arm64:ilp32: add vdso-ilp32 and use for signal return

Yury Norov (11):
  32-bit ABI: introduce ARCH_32BIT_OFF_T config option
  arm64: ilp32: add documentation on the ILP32 ABI for ARM64
  thread: move thread bits accessors to separated file
  arm64: introduce is_a32_task and is_a32_thread (for AArch32 compat)
  arm64: ilp32: add is_ilp32_compat_{task,thread} and TIF_32BIT_AARCH64
  arm64: introduce binfmt_elf32.c
  arm64: ilp32: introduce binfmt_ilp32.c
  arm64: ilp32: share aarch32 syscall handlers
  arm64: signal: share lp64 signal routines to ilp32
  arm64: signal32: move ilp32 and aarch32 common code to separated file
  arm64: ptrace: handle ptrace_request differently for aarch32 and ilp32

 Documentation/arm64/ilp32.txt |  54 
 arch/Kconfig  |   4 +
 arch/arc/Kconfig  |   1 +
 arch/arm/Kconfig  |   1 +
 arch/arm64/Kconfig|  19 ++-
 arch/arm64/Makefile   |   5 +
 arch/arm64/include/asm/compat.h   |  19 +--
 arch/arm64/include/asm/elf.h  |  29 +++--
 arch/arm64/include/asm/fpsimd.h   |   2 +-
 arch/arm64/include/asm/ftrace.h   |   2 +-
 arch/arm64/include/asm/hwcap.h|   6 +-
 arch/arm64/include/asm/is_compat.h|  90 ++
 arch/arm64/include/asm/memory.h   |   5 +-
 arch/arm64/include/asm/processor.h|  11 +-
 arch/arm64/include/asm/ptrace.h   |   2 +-
 arch/arm64/include/asm/signal32.h |   9 +-
 arch/arm64/include/asm/signal32_common.h  |  28 +
 arch/arm64/include/asm/signal_common.h|  33 +
 arch/arm64/include/asm/signal_ilp32.h |  38 ++
 arch/arm64/include/asm/syscall.h  |   2 +-
 arch/arm64/include/asm/thread_info.h  |   4 +-
 arch/arm64/include/asm/unistd.h   |   6 +-
 arch/arm64/include/asm/unistd32.h |   2 +-
 arch/arm64/include/asm/vdso.h |   6 +
 arch/arm64/include/uapi/asm/bitsperlong.h |   9 +-
 arch/arm64/kernel/Makefile|  18 ++-
 arch/arm64/kernel/asm-offsets.c   |   9 +-
 arch/arm64/kernel/binfmt_elf32.c  |  31 +
 arch/arm64/kernel/binfmt_ilp32.c  |  96 +++
 arch/arm64/kernel/cpufeature.c|   8 +-
 arch/arm64/kernel/cpuinfo.c   |  20 +--
 arch/arm64/kernel/entry.S |  34 -
 arch/arm64/kernel/entry32.S   |  65 --
 arch/arm64/kernel/entry32_common.S|  93 ++
 arch/arm64/kernel/entry_ilp32.S   |  23 
 arch/arm64/kernel/head.S  |   2 +-
 arch/arm64/kernel/hw_breakpoint.c |  10 +-