On Thu, Apr 14, 2016 at 04:08:23PM +0300, Maxim Kuvyrkov wrote:
> On Mar 14, 2016, at 11:14 AM, Li Bin <huawei.li...@huawei.com> wrote:
> >
> > As ARM64 is entering enterprise world, machines can not be stopped for
> > some critical enterprise production environment, that is, live patch as
> > one of the RAS features is increasing more important for ARM64 arch now.
> >
> > Now, the mainstream live patch implementation which has been merged in
> > Linux kernel (x86/s390) is based on the 'ftrace with regs' feature, and
> > this feature needs the help of gcc.
> >
> > This patch proposes a generic solution for arm64 gcc which called mfentry,
> > following the example of x86, mips, s390, etc. and on these archs, this
> > feature has been used to implement the ftrace feature 'ftrace with regs'
> > to support live patch.
> >
> > By now, there is an another solution from linaro [1], which proposes to
> > implement a new option -fprolog-pad=N that generate a pad of N nops at the
> > beginning of each function. This solution is a arch-independent way for gcc,
> > but there may be some limitations which have not been recognized for Linux
> > kernel to adapt to this solution besides the discussion on [2]
>
> It appears that implementing -fprolog-pad=N option in GCC will not enable
> kernel live-patching support for AArch64. The proposal for the option was to
> make GCC output a given number of NOPs at the beginning of each function, and
> then the kernel could use that NOP pad to insert whatever instructions it
> needs. The modification of kernel instruction stream needs to be done
> atomically, and, unfortunately, it seems the kernel can use only
> architecture-provided atomicity primitives -- i.e., changing at most 8 bytes
> at a time.

Let me clarify the issue with -fprolog-pad=N.

The kernel/ftrace has two chances of replacing prologue instructions:
  1) at boot time, for all the "C" functions
  2) at run time, for given functions

1) is done as part of kernel/ftrace initialization, while no other
threads (CPUs) are running, so we don't need atomicity there. See [1].
For 2), we only have to replace one instruction (nop <-> bl), as stated
in [1], so we can guarantee atomicity.

Therefore, I still believe that the -fprolog-pad=N approach will work for
AArch64.

> From the kernel discussion thread it appears that the pad needs to be more
> than 8 bytes, and that the kernel can't update that atomically. However if
> -mfentry approach is used, then we need to update only 4 (or 8) bytes of the
> pad, and we avoid the atomicity problem.
>
> Therefore, [unless there is a clever multi-stage update process to atomically
> change NOPs to whatever we need,] I think we have to go with Li's -mfentry
> approach.

The reason I gave up on the -fprolog-pad=N approach is that it is not as
generic as we had expected. At least PowerPC needs a specific instruction
(i.e. saving the TOC) before the NOPs. See the discussion in [2].

[1] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
[2] http://lkml.iu.edu//hypermail/linux/kernel/1602.0/02257.html

Thanks,
-Takahiro AKASHI

> Comments?
>
> --
> Maxim Kuvyrkov
> www.linaro.org
>
>
> > , typically
> > for powerpc archs. Furthermore I think there are no good reasons to promote
> > the other archs (such as x86) which have implemented the feature 'ftrace
> > with regs'
> > to replace the current method with the new option, which may bring heavily
> > target-dependent code adaption, as a result it becomes a arm64 dedicated
> > solution, leaving kernel with two different forms of implementation.
> >
> > [1] https://gcc.gnu.org/ml/gcc/2015-10/msg00090.html
> > [2] http://lists.infradead.org/pipermail/linux-arm-kernel/2016-January/401854.html
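
For illustration only: the run-time case 2) above (nop <-> bl) comes down to
rewriting one aligned 32-bit word, which is why a single atomic store is
enough. Below is a minimal user-space sketch in C of that idea. It is not
kernel code and not taken from any of the patches discussed here. The helper
names a64_encode_bl() and patch_site() are hypothetical, and the
compare-and-swap is just one assumed way to model "replace only if the slot
still holds the expected instruction". A real kernel would also have to make
the text writable and synchronize the instruction cache, none of which is
modelled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Fixed A64 instruction encodings. */
#define A64_NOP     0xd503201fu  /* NOP */
#define A64_BL_OPC  0x94000000u  /* BL with imm26 == 0 */
#define A64_IMM26   0x03ffffffu  /* mask for the 26-bit word offset */

/*
 * Hypothetical helper: encode "bl target" for an instruction placed at pc.
 * Returns false if either address is not 4-byte aligned or the target is
 * outside the +/-128 MiB range that BL can reach.
 */
static bool a64_encode_bl(uint64_t pc, uint64_t target, uint32_t *insn)
{
    int64_t off = (int64_t)(target - pc);

    if ((pc & 3) || (target & 3))
        return false;
    if (off < -(INT64_C(1) << 27) || off >= (INT64_C(1) << 27))
        return false;

    /* Shift as unsigned so negative offsets keep their two's-complement bits. */
    *insn = A64_BL_OPC | ((uint32_t)((uint64_t)off >> 2) & A64_IMM26);
    return true;
}

/*
 * Hypothetical model of one patch site: a single 32-bit instruction slot.
 * The slot is replaced only if it still contains the expected instruction,
 * and the update is one atomic 4-byte operation.
 */
static bool patch_site(_Atomic uint32_t *slot, uint32_t expected,
                       uint32_t replacement)
{
    assert(slot);
    return atomic_compare_exchange_strong(slot, &expected, replacement);
}
```

With these helpers, enabling a call site would be
patch_site(&slot, A64_NOP, bl_insn) and disabling it again would be
patch_site(&slot, bl_insn, A64_NOP). For a call from 0x1000 to 0x1004 the
encoder yields 0x94000001.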