Re: [PATCH] mm: add VM_STATIC flag to vmalloc and prevent from removing the areas

2017-04-12 Thread Ho-Eun Ryu

> On 13 Apr 2017, at 2:31 AM, Christoph Hellwig wrote:
> 
> On Wed, Apr 12, 2017 at 08:42:08PM +0900, Hoeun Ryu wrote:
>> 
>>> On Apr 12, 2017, at 3:02 PM, Christoph Hellwig wrote:
>>> 
>>>> On Wed, Apr 12, 2017 at 02:01:59PM +0900, Hoeun Ryu wrote:
>>>> vm_area_add_early()/vm_area_register_early() are used to reserve vmalloc
>>>> areas during the boot process, and those virtually mapped areas are never
>>>> unmapped. So OR the VM_STATIC flag into the areas in vmalloc_init() when
>>>> importing existing vmlist entries, and prevent those areas from being
>>>> removed from the rbtree by accident.
>>> 
>>> How would they be removed "by accident"?
>> 
>> I don't mean actual use-cases; I just want to make it robust against
>> things like programming errors.
> 
> Oh, ok.  The patch makes sense then, although the changelog could use
> a little update.

OK, I will.
Any other suggestions for the code itself?
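
For context, the shape of the guard being discussed is roughly the following.
This is a minimal sketch only: the flag bit value and both helper functions
are invented for illustration and are not the actual patch.

	/* Hypothetical vm_struct flag bit; a real patch must pick a value
	 * that does not clash with the VM_* flags in include/linux/vmalloc.h. */
	#define VM_STATIC	0x00000200

	/* vmalloc_init() would tag areas imported from the boot-time vmlist: */
	static void mark_static_area(struct vm_struct *vm)
	{
		vm->flags |= VM_STATIC;		/* never to be unmapped */
	}

	/* ... and the removal path would refuse to unlink tagged areas: */
	static bool may_remove_area(const struct vm_struct *vm)
	{
		if (vm->flags & VM_STATIC) {
			WARN_ONCE(1, "attempt to remove a static vmalloc area\n");
			return false;		/* keep it in the rbtree */
		}
		return true;
	}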



Re: [RFC v2][PATCH 01/11] Introduce rare_write() infrastructure

2017-04-07 Thread Ho-Eun Ryu

> On 30 Mar 2017, at 3:15 AM, Kees Cook wrote:
> 
> Several types of data storage exist in the kernel: read-write data (.data,
> .bss), read-only data (.rodata), and RO-after-init. This introduces the
> infrastructure for another type: write-rarely, which is intended for data
> that is either only rarely modified or especially security-sensitive. The
> goal is to further reduce the internal attack surface of the kernel by
> making this storage read-only when "at rest". This makes it much harder
> to be subverted by attackers who have a kernel-write flaw, since they
> cannot directly change these memory contents.
> 
> This work is heavily based on PaX and grsecurity's pax_{open,close}_kernel
> API, its __read_only annotations, its constify plugin, and the work done
> to identify sensitive structures that should be moved from .data into
> .rodata. This builds the initial infrastructure to support these kinds
> of changes, though the API and naming has been adjusted in places for
> clarity and maintainability.
> 
> Variables declared with the __wr_rare annotation will be moved to the
> .rodata section if an architecture supports CONFIG_HAVE_ARCH_RARE_WRITE.
> To change these variables, either a single rare_write() macro can be used,
> or multiple uses of __rare_write(), wrapped in a matching pair of
> rare_write_begin() and rare_write_end() macros can be used. These macros
> are expanded into the arch-specific functions that perform the actions
> needed to write to otherwise read-only memory.
> 
> As detailed in the Kconfig help, the arch-specific helpers have several
> requirements to make them sensible/safe for use by the kernel: they must
> not allow non-current CPUs to write the memory area, they must run
> non-preemptible to avoid accidentally leaving memory writable, and must
> be inline to avoid making them desirable ROP targets for attackers.
> 
> Signed-off-by: Kees Cook 
> ---
> arch/Kconfig | 25 +
> include/linux/compiler.h | 32 
> include/linux/preempt.h  |  6 --
> 3 files changed, 61 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/Kconfig b/arch/Kconfig
> index cd211a14a88f..5ebf62500b99 100644
> --- a/arch/Kconfig
> +++ b/arch/Kconfig
> @@ -847,4 +847,29 @@ config STRICT_MODULE_RWX
> config ARCH_WANT_RELAX_ORDER
>   bool
> 
> +config HAVE_ARCH_RARE_WRITE
> + def_bool n
> + help
> +   An arch should select this option if it has defined the functions
> +   __arch_rare_write_begin() and __arch_rare_write_end() to
> +   respectively enable and disable writing to read-only memory. The
> +   routines must meet the following requirements:
> +   - read-only memory writing must only be available on the current
> + CPU (to make sure other CPUs can't race to make changes too).
> +   - the routines must be declared inline (to discourage ROP use).
> +   - the routines must not be preemptible (likely they will call
> + preempt_disable() and preempt_enable_no_resched() respectively).
> +   - the routines must validate expected state (e.g. when enabling
> + writes, BUG() if writes are already enabled).
> +
> +config HAVE_ARCH_RARE_WRITE_MEMCPY
> + def_bool n
> + depends on HAVE_ARCH_RARE_WRITE
> + help
> +   An arch should select this option if a special accessor is needed
> +   to write to otherwise read-only memory, defined by the function
> +   __arch_rare_write_memcpy(). Without this, the write-rarely
> +   infrastructure will just attempt to write directly to the memory
> +   using a const-ignoring assignment.
> +
> source "kernel/gcov/Kconfig"
> diff --git a/include/linux/compiler.h b/include/linux/compiler.h
> index f8110051188f..274bd03cfe9e 100644
> --- a/include/linux/compiler.h
> +++ b/include/linux/compiler.h
> @@ -336,6 +336,38 @@ static __always_inline void __write_once_size(volatile 
> void *p, void *res, int s
>   __u.__val;  \
> })
> 
> +/*
> + * Build "write rarely" infrastructure for flipping memory r/w
> + * on a per-CPU basis.
> + */
> +#ifndef CONFIG_HAVE_ARCH_RARE_WRITE
> +# define __wr_rare
> +# define __wr_rare_type
> +# define __rare_write(__var, __val)  (__var = (__val))
> +# define rare_write_begin()  do { } while (0)
> +# define rare_write_end()    do { } while (0)
> +#else
> +# define __wr_rare   __ro_after_init
> +# define __wr_rare_type  const
> +# ifdef CONFIG_HAVE_ARCH_RARE_WRITE_MEMCPY
> +#  define __rare_write_n(dst, src, len)  ({  \
> + BUILD_BUG(!builtin_const(len)); \
> + __arch_rare_write_memcpy((dst), (src), (len));  \
> + })
> +#  define __rare_write(var, val)  __rare_write_n(&(var), &(val), sizeof(var))
> +# else
> +#  define __rare_write(var, val)  ((*(typeof((typeof(var))0) *)&(var)) = (val))
> +# endif
> +# define rare_write_begin()  __arch_rare_write_begin()
> +# define rare_write_end()    __arch_rare_write_end()
> +#endif
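
For illustration, typical usage of the macros quoted above would look like
this (a sketch; the example variables are invented, and on architectures
without CONFIG_HAVE_ARCH_RARE_WRITE the macros degrade to plain assignments):

	/* Write-rarely variables: const to normal code, placed in .rodata
	 * when the architecture supports CONFIG_HAVE_ARCH_RARE_WRITE. */
	static int min_widgets __wr_rare = 1;
	static int max_widgets __wr_rare = 16;

	/* Single update: rare_write() opens and closes the window itself. */
	static void set_max_widgets(int n)
	{
		rare_write(max_widgets, n);
	}

	/* Batched updates: open the window once around several
	 * __rare_write() calls so the mapping is flipped only one time. */
	static void set_widget_limits(int lo, int hi)
	{
		rare_write_begin();
		__rare_write(min_widgets, lo);
		__rare_write(max_widgets, hi);
		rare_write_end();
	}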

Re: [RFCv2] arm64: support HAVE_ARCH_RARE_WRITE and HAVE_ARCH_RARE_WRITE_MEMCPY

2017-04-04 Thread Ho-Eun Ryu

> On 3 Apr 2017, at 4:11 PM, Ard Biesheuvel wrote:
> 
> On 3 April 2017 at 05:17, Ho-Eun Ryu wrote:
>> 
>>> On 31 Mar 2017, at 6:25 PM, Ard Biesheuvel wrote:
>>> 
>>> On 30 March 2017 at 15:39, Hoeun Ryu wrote:
>>>> This patch might become part of Kees Cook's rare_write infrastructure
>>>> series [1] for the arm64 architecture.
>>>> 
>>>> This implementation is based on Mark Rutland's suggestion [2]: a special
>>>> userspace mm that maps only __start/end_rodata with RW permission is
>>>> prepared during early boot (paging_init), and __arch_rare_write_begin()
>>>> switches to that mm.
>>>> 
>>>> rare_write_mm address space is added for the special purpose and a page
>>>> global directory is also prepared for it. The mm remaps __start_rodata ~
>>>> __end_rodata to rodata_rw_alias_start which starts from TASK_SIZE_64 / 4
>>>> + kaslr_offset().
>>>> 
>>>> It passes LKDTM's rare write test.
>>>> 
>>>> [1] : http://www.openwall.com/lists/kernel-hardening/2017/02/27/5
>>>> [2] : https://lkml.org/lkml/2017/3/29/704
>>>> 
>>>> Signed-off-by: Hoeun Ryu 
>>>> ---
>>>> arch/arm64/Kconfig   |   2 +
>>>> arch/arm64/include/asm/pgtable.h |   4 ++
>>>> arch/arm64/mm/mmu.c  | 101 
>>>> +++
>>>> 3 files changed, 107 insertions(+)
>>>> 
>>>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>>>> index f2b0b52..6e2c592 100644
>>>> --- a/arch/arm64/Kconfig
>>>> +++ b/arch/arm64/Kconfig
>>>> @@ -102,6 +102,8 @@ config ARM64
>>>>   select HAVE_SYSCALL_TRACEPOINTS
>>>>   select HAVE_KPROBES
>>>>   select HAVE_KRETPROBES
>>>> +   select HAVE_ARCH_RARE_WRITE
>>>> +   select HAVE_ARCH_RARE_WRITE_MEMCPY
>>>>   select IOMMU_DMA if IOMMU_SUPPORT
>>>>   select IRQ_DOMAIN
>>>>   select IRQ_FORCED_THREADING
>>>> diff --git a/arch/arm64/include/asm/pgtable.h 
>>>> b/arch/arm64/include/asm/pgtable.h
>>>> index c213fdbd0..1514933 100644
>>>> --- a/arch/arm64/include/asm/pgtable.h
>>>> +++ b/arch/arm64/include/asm/pgtable.h
>>>> @@ -741,6 +741,10 @@ static inline void update_mmu_cache(struct 
>>>> vm_area_struct *vma,
>>>> #define kc_vaddr_to_offset(v)  ((v) & ~VA_START)
>>>> #define kc_offset_to_vaddr(o)  ((o) | VA_START)
>>>> 
>>>> +unsigned long __arch_rare_write_begin(void);
>>>> +unsigned long __arch_rare_write_end(void);
>>>> +void __arch_rare_write_memcpy(void *dst, const void *src, __kernel_size_t 
>>>> len);
>>>> +
>>> 
>>> If these hook into a generic framework, shouldn't these already be
>>> declared in a generic header file?
>> 
>> the generic header file doesn't declare the arch-specific version of the API.
>> 
>>> 
>>>> #endif /* !__ASSEMBLY__ */
>>>> 
>>>> #endif /* __ASM_PGTABLE_H */
>>>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>>>> index 91502e3..86b25c9 100644
>>>> --- a/arch/arm64/mm/mmu.c
>>>> +++ b/arch/arm64/mm/mmu.c
>>>> @@ -570,6 +570,105 @@ static void __init map_kernel(pgd_t *pgd)
>>>>   kasan_copy_shadow(pgd);
>>>> }
>>>> 
>>>> +struct mm_struct rare_write_mm = {
>>> 
>>> static please
>> 
>> OK, add static.
>> 
>>> 
>>>> +   .mm_rb  = RB_ROOT,
>>>> +   .mm_users   = ATOMIC_INIT(2),
>>>> +   .mm_count   = ATOMIC_INIT(1),
>>>> +   .mmap_sem   = __RWSEM_INITIALIZER(rare_write_mm.mmap_sem),
>>>> +   .page_table_lock = __SPIN_LOCK_UNLOCKED(rare_write_mm.page_table_lock),
>>>> +   .mmlist = LIST_HEAD_INIT(rare_write_mm.mmlist),
>>>> +};
>>>> +
>>>> +#ifdef CONFIG_ARM64_PTDUMP_DEBUGFS
>>>> +#include <asm/ptdump.h>
>>>> +
>>>> +static struct ptdump_info rare_write_ptdump_info = {
>>>> +   .mm = &rare_write_mm,
>>>> +   .markers = (struct addr_marker[]){
>>>> +   { 0,"rare-write start" },
>>>> +   { TASK_SIZE_64, "rare-write end" }
>>

Re: [RFCv2] arm64: support HAVE_ARCH_RARE_WRITE and HAVE_ARCH_RARE_WRITE_MEMCPY

2017-04-02 Thread Ho-Eun Ryu

> On 31 Mar 2017, at 6:25 PM, Ard Biesheuvel wrote:
> 
> On 30 March 2017 at 15:39, Hoeun Ryu wrote:
>> This patch might become part of Kees Cook's rare_write infrastructure
>> series [1] for the arm64 architecture.
>> 
>> This implementation is based on Mark Rutland's suggestion [2]: a special
>> userspace mm that maps only __start/end_rodata with RW permission is
>> prepared during early boot (paging_init), and __arch_rare_write_begin()
>> switches to that mm.
>> 
>> rare_write_mm address space is added for the special purpose and a page
>> global directory is also prepared for it. The mm remaps __start_rodata ~
>> __end_rodata to rodata_rw_alias_start which starts from TASK_SIZE_64 / 4
>> + kaslr_offset().
>> 
>> It passes LKDTM's rare write test.
>> 
>> [1] : http://www.openwall.com/lists/kernel-hardening/2017/02/27/5
>> [2] : https://lkml.org/lkml/2017/3/29/704
>> 
>> Signed-off-by: Hoeun Ryu 
>> ---
>> arch/arm64/Kconfig   |   2 +
>> arch/arm64/include/asm/pgtable.h |   4 ++
>> arch/arm64/mm/mmu.c  | 101 
>> +++
>> 3 files changed, 107 insertions(+)
>> 
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index f2b0b52..6e2c592 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -102,6 +102,8 @@ config ARM64
>>select HAVE_SYSCALL_TRACEPOINTS
>>select HAVE_KPROBES
>>select HAVE_KRETPROBES
>> +   select HAVE_ARCH_RARE_WRITE
>> +   select HAVE_ARCH_RARE_WRITE_MEMCPY
>>select IOMMU_DMA if IOMMU_SUPPORT
>>select IRQ_DOMAIN
>>select IRQ_FORCED_THREADING
>> diff --git a/arch/arm64/include/asm/pgtable.h 
>> b/arch/arm64/include/asm/pgtable.h
>> index c213fdbd0..1514933 100644
>> --- a/arch/arm64/include/asm/pgtable.h
>> +++ b/arch/arm64/include/asm/pgtable.h
>> @@ -741,6 +741,10 @@ static inline void update_mmu_cache(struct 
>> vm_area_struct *vma,
>> #define kc_vaddr_to_offset(v)  ((v) & ~VA_START)
>> #define kc_offset_to_vaddr(o)  ((o) | VA_START)
>> 
>> +unsigned long __arch_rare_write_begin(void);
>> +unsigned long __arch_rare_write_end(void);
>> +void __arch_rare_write_memcpy(void *dst, const void *src, __kernel_size_t 
>> len);
>> +
> 
> If these hook into a generic framework, shouldn't these already be
> declared in a generic header file?

the generic header file doesn't declare the arch-specific version of the API.

> 
>> #endif /* !__ASSEMBLY__ */
>> 
>> #endif /* __ASM_PGTABLE_H */
>> diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
>> index 91502e3..86b25c9 100644
>> --- a/arch/arm64/mm/mmu.c
>> +++ b/arch/arm64/mm/mmu.c
>> @@ -570,6 +570,105 @@ static void __init map_kernel(pgd_t *pgd)
>>kasan_copy_shadow(pgd);
>> }
>> 
>> +struct mm_struct rare_write_mm = {
> 
> static please

OK, add static. 

> 
>> +   .mm_rb  = RB_ROOT,
>> +   .mm_users   = ATOMIC_INIT(2),
>> +   .mm_count   = ATOMIC_INIT(1),
>> +   .mmap_sem   = __RWSEM_INITIALIZER(rare_write_mm.mmap_sem),
>> +   .page_table_lock = __SPIN_LOCK_UNLOCKED(rare_write_mm.page_table_lock),
>> +   .mmlist = LIST_HEAD_INIT(rare_write_mm.mmlist),
>> +};
>> +
>> +#ifdef CONFIG_ARM64_PTDUMP_DEBUGFS
>> +#include <asm/ptdump.h>
>> +
>> +static struct ptdump_info rare_write_ptdump_info = {
>> +   .mm = &rare_write_mm,
>> +   .markers = (struct addr_marker[]){
>> +   { 0,"rare-write start" },
>> +   { TASK_SIZE_64, "rare-write end" }
>> +   },
>> +   .base_addr  = 0,
>> +};
>> +
>> +static int __init ptdump_init(void)
>> +{
>> +   return ptdump_debugfs_register(&rare_write_ptdump_info,
>> +  "rare_write_page_tables");
>> +}
>> +device_initcall(ptdump_init);
>> +
>> +#endif
>> +
>> +__always_inline unsigned long __arch_rare_write_begin(void)
> 
> These functions are not called from the same compilation unit, so what
> do you think __always_inline buys you here?

The first patch of Kees's series ([PATCH 01/11] Introduce rare_write()
infrastructure) [1] lists some requirements, and one of them is that the
arch-specific helpers should be inline to avoid making them desirable ROP
targets.

So I'd like to make sure the helpers are inline.

[1] : http://www.openwall.com/lists/kernel-hardening/2017/03/29/16

> 
>> +{
>> +   struct mm_struct *mm = &rare_write_mm;
>> +
>> +   preempt_disable();
>> +
>> +   __switch_mm(mm);
>> +
>> +   if (system_uses_ttbr0_pan()) {
>> +   update_saved_ttbr0(current, mm);
>> +   cpu_switch_mm(mm->pgd, mm);
>> +   }
>> +
>> +   return 0;
>> +}
>> +
>> +__always_inline unsigned long __arch_rare_write_end(void)
>> +{
>> +   struct mm_struct *mm = current->active_mm;
>> +
>> +   __switch_mm(mm);
>> +
>> +   if (system_uses_ttbr0_pan()) {
>> +   cpu_set_reserved_ttbr0();
>> +   if (mm != &init_mm)
>> +   update_saved_ttbr0(current, mm);
>> +   }
>> +
>> 
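
Tying the pieces together, one rare write on arm64 under this patch would
flow roughly as follows (an editorial sketch: the wrapper function is
invented, and the alias translation inside __arch_rare_write_memcpy() is
inferred from the changelog rather than shown in the quoted hunks).

	static void example_rare_write(int *rodata_ptr, int new_val)
	{
		/* Disable preemption and switch TTBR0 to rare_write_mm, which
		 * maps __start_rodata..__end_rodata RW at rodata_rw_alias_start. */
		__arch_rare_write_begin();

		/* The memcpy helper is expected to redirect the destination
		 * into the RW alias before copying. */
		__arch_rare_write_memcpy(rodata_ptr, &new_val, sizeof(*rodata_ptr));

		/* Switch back to current->active_mm and re-enable preemption. */
		__arch_rare_write_end();
	}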

Re: [RFC v2][PATCH 01/11] Introduce rare_write() infrastructure

2017-03-30 Thread Ho-Eun Ryu

> On 30 Mar 2017, at 3:23 AM, Kees Cook wrote:
> 
> On Wed, Mar 29, 2017 at 11:15 AM, Kees Cook wrote:
>> +/*
>> + * Build "write rarely" infrastructure for flipping memory r/w
>> + * on a per-CPU basis.
>> + */
>> +#ifndef CONFIG_HAVE_ARCH_RARE_WRITE
>> +# define __wr_rare
>> +# define __wr_rare_type
>> +# define __rare_write(__var, __val)  (__var = (__val))
>> +# define rare_write_begin()  do { } while (0)
>> +# define rare_write_end()    do { } while (0)
>> +#else
>> +# define __wr_rare           __ro_after_init
>> +# define __wr_rare_type      const
>> +# ifdef CONFIG_HAVE_ARCH_RARE_WRITE_MEMCPY
>> +#  define __rare_write_n(dst, src, len)  ({  \
>> +   BUILD_BUG(!builtin_const(len)); \
>> +   __arch_rare_write_memcpy((dst), (src), (len));  \
>> +   })
>> +#  define __rare_write(var, val)  __rare_write_n(&(var), &(val), sizeof(var))
>> +# else
>> +#  define __rare_write(var, val)  ((*(typeof((typeof(var))0) *)&(var)) = (val))
>> +# endif
>> +# define rare_write_begin()  __arch_rare_write_begin()
>> +# define rare_write_end()    __arch_rare_write_end()
>> +#endif
>> +#define rare_write(__var, __val) ({\
>> +   rare_write_begin(); \
>> +   __rare_write(__var, __val); \
>> +   rare_write_end();   \
>> +   __var;  \
>> +})
>> +
> 
> Of course, only after sending this do I realize that the MEMCPY case
> will need to be further adjusted, since it currently can't take
> literals. I guess something like this needs to be done:
> 
> #define __rare_write(var, val) ({ \
>typeof(var) __src = (val); \
>__rare_write_n(&(var), &(__src), sizeof(var)); \
> })
> 

Right, and it has a problem with BUILD_BUG, which causes a compilation error
when CONFIG_HAVE_ARCH_RARE_WRITE_MEMCPY is true: BUILD_BUG is defined in
<linux/bug.h>, but <linux/bug.h> in turn includes <linux/compiler.h>.


Please see the following.

diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index 3334fa9..3fa50e1 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -350,11 +350,11 @@ static __always_inline void __write_once_size(volatile void *p, void *res, int s
 # define __wr_rare           __ro_after_init
 # define __wr_rare_type      const
 # ifdef CONFIG_HAVE_ARCH_RARE_WRITE_MEMCPY
-#  define __rare_write_n(dst, src, len)  ({  \
-   BUILD_BUG(!builtin_const(len)); \
-   __arch_rare_write_memcpy((dst), (src), (len));  \
+#  define __rare_write_n(var, val, len)  ({  \
+   typeof(val) __val = val;                \
+   __arch_rare_write_memcpy(&(var), &(__val), (len));  \
    })
-#  define __rare_write(var, val)  __rare_write_n(&(var), &(val), sizeof(var))
+#  define __rare_write(var, val)  __rare_write_n((var), (val), sizeof(var))
 # else
 #  define __rare_write(var, val)  ((*(typeof((typeof(var))0) *)&(var)) = (val))
 # endif


> -Kees
> 
> -- 
> Kees Cook
> Pixel Security
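
To spell out the literal-argument problem being fixed above (illustrative
only; max_widgets is an invented __wr_rare variable):

	static int max_widgets __wr_rare = 16;

	static void demo(void)
	{
		/* With the original macro this expands to
		 *   __arch_rare_write_memcpy(&(max_widgets), &(42), ...);
		 * and &(42) does not compile: a literal is not an
		 * addressable lvalue. */
		rare_write(max_widgets, 42);

		/* The corrected macro copies the value into a temporary
		 * first, so the memcpy gets an addressable source:
		 *   typeof(val) __val = val;
		 *   __arch_rare_write_memcpy(&(var), &(__val), (len));
		 */
	}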



Re: [kernel-hardening] [RFC 3/7] module: modify memory attrs for __ro_mostly_after_init during module_init/exit

2017-02-21 Thread Ho-Eun Ryu

> On 20 Feb 2017, at 7:30 PM, Mark Rutland wrote:
> 
> On Sun, Feb 19, 2017 at 07:04:06PM +0900, Hoeun Ryu wrote:
>> `__ro_mostly_after_init` is almost like `__ro_after_init`: the section is
>> read-only after kernel init, just as with `__ro_after_init`. This patch
>> makes the `__ro_mostly_after_init` section temporarily read-write only
>> during module_init/module_exit.
>> 
>> Signed-off-by: Hoeun Ryu 
>> ---
>> kernel/module.c | 10 --
>> 1 file changed, 8 insertions(+), 2 deletions(-)
>> 
>> diff --git a/kernel/module.c b/kernel/module.c
>> index 7eba6de..3b25e0e 100644
>> --- a/kernel/module.c
>> +++ b/kernel/module.c
>> @@ -987,8 +987,11 @@ SYSCALL_DEFINE2(delete_module, const char __user *, 
>> name_user,
>> 
>>  mutex_unlock(&module_mutex);
>>  /* Final destruction now no one is using it. */
>> -if (mod->exit != NULL)
>> +if (mod->exit != NULL) {
>> +set_ro_mostly_after_init_rw();
>>  mod->exit();
>> +set_ro_mostly_after_init_ro();
>> +}
>>  blocking_notifier_call_chain(&module_notify_list,
>>   MODULE_STATE_GOING, mod);
>>  klp_module_going(mod);
>> @@ -3396,8 +3399,11 @@ static noinline int do_init_module(struct module *mod)
>> 
>>  do_mod_ctors(mod);
>>  /* Start the module */
>> -if (mod->init != NULL)
>> +if (mod->init != NULL) {
>> +set_ro_mostly_after_init_rw();
>>  ret = do_one_initcall(mod->init);
>> +set_ro_mostly_after_init_ro();
>> +}
> 
> This looks very much like the pax_{open,close}_kernel() approach for
> write-rarely data.

I read the discussion [1], and I agree that the __ro_mostly_after_init marker
looks very similar to __write_rarely.

> 
> I think it would be better to implement a first class write-rarely
> mechanism rather than trying to extend __ro_after_init to cover this
> case.

I'm not extending __ro_after_init, though __ro_mostly_after_init resides in
the same rodata section.

> 
> As mentioned previously, I *think* we can have a generic implementation
> that uses an mm to temporarily map a (thread/cpu-local) RW alias of the
> data in question in what would otherwise be the user half of the address
> space. Regardless, we can have a generic interface [1] that can cater
> for that style of approach and/or something like ARM's domains or x86's
> pkeys.
> 

I'm still learning CPU/kernel architectures; I would be very grateful if you
could tell me more about the details of the implementation itself.

The mm that maps the temporary RW alias would be something like the following
(see the sketch after this list):
* a special mm, like idmap/init_mm, which has its own page tables?
* the page tables have the same content as init_mm's swapper_pg_dir, except
for RW permissions on a specific section (let's say __write_rarely)
* then use switch_mm(special_rw_mm) to change the address space before the
access to the section happens
* then use switch_mm(current->mm) to change the address space back to the
original after the access is done
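
A sketch of that sequence (assuming the outline above; special_rw_mm and the
open/close wrappers are invented names):

	/* An aliasing mm with its own pgd; the target section is RW there. */
	static struct mm_struct special_rw_mm;

	static void write_rarely_open(void)
	{
		preempt_disable();	/* keep the RW window local to this CPU */
		switch_mm(current->active_mm, &special_rw_mm, current);
	}

	static void write_rarely_close(void)
	{
		switch_mm(&special_rw_mm, current->active_mm, current);
		preempt_enable();
	}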

And about the interface itself, rare_write(__var, __val): is it a
single-value access interface?
I intend to make the data in the __ro_mostly_after_init section RW across
multiple accesses, such as during module_init/exit, and the
__rare_rw_map()/unmap() used in rare_write() seem to work like an open/close
API.

How could __rare_rw_ptr() be implemented, and what happens when
`__rw_var = __rare_rw_ptr(&(__var))` is done?

However the interface ends up looking, do we still need a special data
section that is mapped RO in general but RW in some cases? If so, doesn't
the __ro_mostly_after_init marker itself still make sense, and don't we
still need it?

> Thanks,
> Mark.
> 
> [1] http://www.openwall.com/lists/kernel-hardening/2016/11/18/3



Re: [kernel-hardening] [RFC 2/7] init: add set_ro_mostly_after_init_rw/ro function

2017-02-20 Thread Ho-Eun Ryu

> On 20 Feb 2017, at 7:22 PM, Mark Rutland wrote:
> 
> On Sun, Feb 19, 2017 at 07:04:05PM +0900, Hoeun Ryu wrote:
>> Add the set_ro_mostly_after_init_rw/ro pair to modify memory attributes
>> for memory marked as `ro_mostly_after_init`.
>> 
>> I am doubtful that this is the right place for these functions to reside,
>> and whether these functions are suitable for all architectures for
>> memory-attribute modification. Please comment.
> 
> These won't work for arm64, since set_memory_* only work on
> page-granular mappings in the vmalloc area.
> 
> The "real" kernel mappings can use larger block mappings, and would need
> to be split (which cannot be done at runtime) before permissions could
> be changed at page granularity.

So I sent RFC 6/7 [1] and 7/7 [2], which split the block mappings to page
granularity.
I think you and Ard Biesheuvel don't like that approach anyway.

[1] : https://lkml.org/lkml/2017/2/19/38
[2] : https://lkml.org/lkml/2017/2/19/39

> 
> Thanks,
> Mark.
> 
>> Signed-off-by: Hoeun Ryu 
>> ---
>> include/linux/init.h |  6 ++
>> init/main.c  | 24 
>> 2 files changed, 30 insertions(+)
>> 
>> diff --git a/include/linux/init.h b/include/linux/init.h
>> index 79af096..d68e4f7 100644
>> --- a/include/linux/init.h
>> +++ b/include/linux/init.h
>> @@ -131,6 +131,12 @@ extern bool rodata_enabled;
>> #endif
>> #ifdef CONFIG_STRICT_KERNEL_RWX
>> void mark_rodata_ro(void);
>> +
>> +void set_ro_mostly_after_init_rw(void);
>> +void set_ro_mostly_after_init_ro(void);
>> +#else
>> +static inline void set_ro_mostly_after_init_rw(void) { }
>> +static inline void set_ro_mostly_after_init_ro(void) { }
>> #endif
>> 
>> extern void (*late_time_init)(void);
>> diff --git a/init/main.c b/init/main.c
>> index 4719abf..a5d4873 100644
>> --- a/init/main.c
>> +++ b/init/main.c
>> @@ -941,6 +941,30 @@ static void mark_readonly(void)
>>  } else
>>  pr_info("Kernel memory protection disabled.\n");
>> }
>> +
>> +void set_ro_mostly_after_init_rw(void)
>> +{
>> +unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
>> +unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
>> +unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>> +
>> +if (!rodata_enabled)
>> +return;
>> +
>> +set_memory_rw(start, nr_pages);
>> +}
>> +
>> +void set_ro_mostly_after_init_ro(void)
>> +{
>> +unsigned long start = PFN_ALIGN(__start_data_ro_mostly_after_init);
>> +unsigned long end = PFN_ALIGN(&__end_data_ro_mostly_after_init);
>> +unsigned long nr_pages = (end - start) >> PAGE_SHIFT;
>> +
>> +if (!rodata_enabled)
>> +return;
>> +
>> +set_memory_ro(start, nr_pages);
>> +}
>> #else
>> static inline void mark_readonly(void)
>> {
>> -- 
>> 2.7.4
>> 
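
For illustration, a caller of this pair would bracket its writes like so (a
sketch; the hook table and the deregistration function are invented):

	static struct notifier_block *subsys_hooks[8] __ro_mostly_after_init;

	static void subsys_deregister_all(void)
	{
		set_ro_mostly_after_init_rw();	/* window open: section is RW */
		memset(subsys_hooks, 0, sizeof(subsys_hooks));
		set_ro_mostly_after_init_ro();	/* window closed: RO again */
	}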



Re: [kernel-hardening] [RFC 1/7] arch: add __ro_mostly_after_init section marker

2017-02-20 Thread Ho-Eun Ryu

> On 19 Feb 2017, at 8:24 PM, Ard Biesheuvel wrote:
> 
> On 19 February 2017 at 10:04, Hoeun Ryu wrote:
>> Since the `__ro_after_init` marker was included in the kernel, many kernel
>> data objects can be read-only after init. But there are many other places
>> that would benefit from being read-only after init where `__ro_after_init`
>> cannot simply be applied, because the objects must be writable at certain
>> points: during module_init/exit or dynamic de/registration for a specific
>> subsystem.
>> `__ro_mostly_after_init` is basically the same as `__ro_after_init`: the
>> section is mapped read-only after kernel init. The difference is that this
>> section is temporarily mapped read-write during module_init/exit and
>> de/registration of a subsystem, using the set_ro_mostly_after_init_rw/ro
>> pair.
>> Use `__ro_mostly_after_init` to mark such memory when `__ro_after_init` is
>> not applicable because the memory must be writable at the described points
>> in time. It is read-only right after kernel init and temporarily writable
>> only during module_init/exit and dynamic de/registration for a subsystem.
>> 
>> Signed-off-by: Hoeun Ryu 
>> ---
>> include/asm-generic/sections.h|  1 +
>> include/asm-generic/vmlinux.lds.h | 10 ++
>> include/linux/cache.h | 11 +++
>> 3 files changed, 22 insertions(+)
>> 
>> diff --git a/include/asm-generic/sections.h b/include/asm-generic/sections.h
>> index 4df64a1..16a6f21 100644
>> --- a/include/asm-generic/sections.h
>> +++ b/include/asm-generic/sections.h
>> @@ -34,6 +34,7 @@ extern char __bss_start[], __bss_stop[];
>> extern char __init_begin[], __init_end[];
>> extern char _sinittext[], _einittext[];
>> extern char __start_data_ro_after_init[], __end_data_ro_after_init[];
>> +extern char __start_data_ro_mostly_after_init[], 
>> __end_data_ro_mostly_after_init[];
>> extern char _end[];
>> extern char __per_cpu_load[], __per_cpu_start[], __per_cpu_end[];
>> extern char __kprobes_text_start[], __kprobes_text_end[];
>> diff --git a/include/asm-generic/vmlinux.lds.h 
>> b/include/asm-generic/vmlinux.lds.h
>> index 4e09b28..cc5f44e 100644
>> --- a/include/asm-generic/vmlinux.lds.h
>> +++ b/include/asm-generic/vmlinux.lds.h
>> @@ -265,6 +265,15 @@
>>__end_data_ro_after_init = .;
>> #endif
>> 
>> +#ifndef RO_MOSTLY_AFTER_INIT_DATA
>> +#define RO_MOSTLY_AFTER_INIT_DATA(align)   \
>> +   . = ALIGN(align);   \
>> +   VMLINUX_SYMBOL(__start_data_ro_mostly_after_init) = .;  \
>> +   *(.data..ro_mostly_after_init)  \
>> +   . = ALIGN(align);   \
>> +   VMLINUX_SYMBOL(__end_data_ro_mostly_after_init) = .;
>> +#endif
>> +
>> /*
>>  * Read only Data
>>  */
>> @@ -275,6 +284,7 @@
>>*(.rodata) *(.rodata.*) \
>>RO_AFTER_INIT_DATA  /* Read only after init */  \
>>KEEP(*(__vermagic)) /* Kernel version magic */  \
>> +   RO_MOSTLY_AFTER_INIT_DATA(align)\
> 
> You can't really drop this in the middle of a section like this. On
> arm64, we try very hard to use a minimal segment alignment of 64 KB
> (or 2 MB if DEBUG_ALIGN_RODATA=y), to ensure that the TLB footprint of
> the kernel image is minimized.
> 
> So this should be a separate section in the arm64 linker script.
> 

Could we achieve that by using ALIGN(SECTION_SIZE) instead of ALIGN(PAGE_SIZE)
for RO_DATA?

>>. = ALIGN(8);   \
>>VMLINUX_SYMBOL(__start___tracepoints_ptrs) = .; \
>>KEEP(*(__tracepoints_ptrs)) /* Tracepoints: pointer array */ \
>> diff --git a/include/linux/cache.h b/include/linux/cache.h
>> index 1be04f8..fd1cb9b 100644
>> --- a/include/linux/cache.h
>> +++ b/include/linux/cache.h
>> @@ -30,6 +30,17 @@
>> #define __ro_after_init __attribute__((__section__(".data..ro_after_init")))
>> #endif
>> 
>> +/*
>> + * __ro_mostly_after_init is almost like __ro_after_init.
>> + * but __ro_mostly_after_init section is temporarily writable only during
>> + * module_init/exit or dynamic de/registration of a subsystem using
>> + * set_ro_mostly_after_init_rw/ro pair.
>> + */
>> +#ifndef __ro_mostly_after_init
>> +#define __ro_mostly_after_init \
>> +   __attribute__((__section__(".data..ro_mostly_after_init")))
>> +#endif
>> +
>> #ifndef cacheline_aligned
>> #define cacheline_aligned __attribute__((__aligned__(SMP_CACHE_BYTES)))
>> #endif
>> --
>> 2.7.4
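
For context, annotating data with the new marker would look like this (a
sketch; the variables are invented):

	#include <linux/cache.h>

	/* Placed in .data..ro_mostly_after_init: read-only after boot,
	 * except inside explicit set_ro_mostly_after_init_rw()/_ro()
	 * windows. */
	static unsigned int subsys_hook_count __ro_mostly_after_init;
	static void (*subsys_hooks[8])(void) __ro_mostly_after_init;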



Re: [kernel-hardening] [PATCH 0/7] introduce __ro_mostly_after_init section marker

2017-02-20 Thread Ho-Eun Ryu

> On 19 Feb 2017, at 8:14 PM, Ard Biesheuvel wrote:
> 
> Hi Hoeun,
> 
> On 19 February 2017 at 10:03, Hoeun Ryu wrote:
>> Since the `__ro_after_init` marker was included in the kernel, many kernel
>> data objects can be read-only after init. But there are many other places
>> that would benefit from being read-only after init where `__ro_after_init`
>> cannot simply be applied, because the objects must be writable at certain
>> points: during module_init/exit or dynamic de/registration for a specific
>> subsystem.
> 
> The argument sounds reasonable, but it would really help if you could
> describe some use cases in more detail.

Please see RFC 4/7 [1] and RFC 5/7 [2].

RFC 4 shows cpu_ap/bp_states marked as __ro_mostly_after_init. They cannot
be marked __ro_after_init because some modules write to those objects via
the cpuhp_setup/remove_state API during module_init/exit.

RFC 5 shows selinux_hooks and selinux_nf_hooks marked as
__ro_mostly_after_init. Those hooks cannot be marked __ro_after_init because
of the selinux_disable() function: they must remain writable so
selinux_disable() can deregister the hooks.

[1] https://lkml.org/lkml/2017/2/19/37
[2] https://lkml.org/lkml/2017/2/19/35
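
A sketch of the RFC 5/7 case just described (the body is condensed from the
idea in that patch, not quoted from it):

	/* With selinux_hooks marked __ro_mostly_after_init, disabling
	 * SELinux at runtime needs the temporary RW window to unhook
	 * itself. */
	int selinux_disable(void)
	{
		set_ro_mostly_after_init_rw();
		security_delete_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks));
		set_ro_mostly_after_init_ro();
		return 0;
	}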

> 
>> `__ro_mostly_after_init` is basically the same as `__ro_after_init`: the
>> section is mapped read-only after kernel init. The difference is that this
>> section is temporarily mapped read-write during module_init/exit and during
>> de/registration of a subsystem using the set_ro_mostly_after_init_rw/ro pair.
>> 
>> - Tested only on arm64.
>> 
>> Description:
>>  0001 patch is `__ro_mostly_after_init` itself.
>>  0002 patch is to add set_ro_mostly_after_init_rw/ro pair using
>>set_memory_rw/ro.
>>  0003 patch is to make the section read-write in module_init/exit.
>>  0004 patch is an example for dynamic init/deinit of a subsystem.
>>  0005 patch is an example for __ro_mostly_after_init section modified during
>>module_init/exit.
>>  0006/0007 patches are fixes for arm64 kernel mapping.
>> 
>> Hoeun Ryu (7):
>>  arch: add __ro_mostly_after_init section marker
>>  init: add set_ro_mostly_after_init_rw/ro function
>>  module: modify memory attrs for __ro_mostly_after_init during
>>module_init/exit
>>  selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops
>>  cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states
>>  arm64: add __map_kernel_segment to accept additional vm flags
>>  arm64: map seperately rodata sections for __ro_mostly_after_init
>>section
>> 
>> arch/arm64/mm/mmu.c               | 44 ---
>> include/asm-generic/sections.h    |  1 +
>> include/asm-generic/vmlinux.lds.h | 10 +
>> include/linux/cache.h             | 11 ++
>> include/linux/init.h              |  6 ++
>> init/main.c                       | 24 +
>> kernel/cpu.c                      |  4 ++--
>> kernel/module.c                   | 10 +++--
>> security/selinux/hooks.c          |  8 +--
>> 9 files changed, 105 insertions(+), 13 deletions(-)
>> 
>> --
>> 2.7.4
>> 



Re: [kernel-hardening] [PATCH 0/7] introduce __ro_mostly_after_init section marker

2017-02-20 Thread Ho-Eun Ryu

> On 20 Feb 2017, at 7:02 PM, Mark Rutland  wrote:
> 
> On Sun, Feb 19, 2017 at 07:03:38PM +0900, Hoeun Ryu wrote:
>> After the `__ro_after_init` marker was added to the kernel, many kernel data
>> objects could be made read-only after init. But there are many other places
>> that would benefit from being read-only after init where `__ro_after_init`
>> cannot simply be applied, because they must be writable at certain points,
>> namely during module_init/exit or dynamic de/registration for a specific
>> subsystem.
> 
> Could you elaborate on this?
> 
> For modules, I assume that the __ro_after_init data structures are part
> of the module, and not part of the "real" kernel image. Is that the case?
> 

__ro_mostly_after_init is for kernel built-in core subsystems, not for
modules themselves. The section is writable only during kernel init and
module_init/exit. Hooks (or arrays of hooks) of a core subsystem can be
marked __ro_mostly_after_init in the same way as __ro_after_init. Modules
that write to those hooks to register/deregister something with the
subsystem can then safely access that section during their init/exit.
Please see RFC 3/7, which makes this section writable.

In addition, some subsystems may use this marker for their (arrays of) hooks
and make them writable only at specific points in time via the
set_ro_mostly_after_init_rw/ro pair. Please see RFC 4/7 for the selinux
example.
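For illustration, the selinux pattern looks roughly like this (a sketch based
on the RFC, not the exact patch; set_ro_mostly_after_init_rw/ro come from
RFC 2/7, selinux_hooks from RFC 4/7):

/* security/selinux/hooks.c -- sketch */
int selinux_disable(void)
{
	/* ... existing checks ... */
	set_ro_mostly_after_init_rw();	/* hooks temporarily writable */
	security_delete_hooks(selinux_hooks, ARRAY_SIZE(selinux_hooks));
	set_ro_mostly_after_init_ro();	/* and back to read-only */
	return 0;
}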

> Which specific subsystems wish to modify data structures that are
> __ro_after_init?

I'm not intending to make the __ro_after_init section writable; I'm
introducing a new section marker that works mostly like __ro_after_init but
can be written to at certain points.
Please see RFC 5/7 for the cpu hotplug example.
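And RFC 3/7 essentially brackets the module init call in do_init_module()
like this (sketch):

/* kernel/module.c -- sketch */
set_ro_mostly_after_init_rw();	/* open the write window */
ret = do_one_initcall(mod->init);
set_ro_mostly_after_init_ro();	/* close it again */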

> 
> This sounds like the proposed mostly-ro/rarely-rw stuff would be a
> better fit for that case.
> 
> Thanks,
> Mark.
> 
>> `__ro_mostly_after_init` is basically the same as `__ro_after_init`: the
>> section is mapped read-only after kernel init. The difference is that this
>> section is temporarily mapped read-write during module_init/exit and during
>> de/registration of a subsystem using the set_ro_mostly_after_init_rw/ro pair.
>> 
>> - Tested only on arm64.
>> 
>> Description:
>>  0001 patch is `__ro_mostly_after_init` itself.
>>  0002 patch is to add set_ro_mostly_after_init_rw/ro pair using
>>set_memory_rw/ro.
>>  0003 patch is to make the section read-write in module_init/exit.
>>  0004 patch is an example for dynamic init/deinit of a subsystem.
>>  0005 patch is an example for __ro_mostly_after_init section modified during
>>module_init/exit.
>>  0006/0007 patches are fixes for arm64 kernel mapping.
>> 
>> Hoeun Ryu (7):
>>  arch: add __ro_mostly_after_init section marker
>>  init: add set_ro_mostly_after_init_rw/ro function
>>  module: modify memory attrs for __ro_mostly_after_init during
>>module_init/exit
>>  selinux: mark __ro_mostly_after_init for selinux_hooks/selinux_nf_ops
>>  cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states
>>  arm64: add __map_kernel_segment to accept additional vm flags
>>  arm64: map seperately rodata sections for __ro_mostly_after_init
>>section
>> 
>> arch/arm64/mm/mmu.c               | 44 ---
>> include/asm-generic/sections.h    |  1 +
>> include/asm-generic/vmlinux.lds.h | 10 +
>> include/linux/cache.h             | 11 ++
>> include/linux/init.h              |  6 ++
>> init/main.c                       | 24 +
>> kernel/cpu.c                      |  4 ++--
>> kernel/module.c                   | 10 +++--
>> security/selinux/hooks.c          |  8 +--
>> 9 files changed, 105 insertions(+), 13 deletions(-)
>> 
>> -- 
>> 2.7.4
>> 



Re: [RFC 5/7] cpu: mark ro_mostly_after_init for cpuhp_ap/bp_states

2017-02-20 Thread Ho-Eun Ryu

> On 20 Feb 2017, at 5:20 PM, Sebastian Andrzej Siewior  
> wrote:
> 
> On 2017-02-19 19:04:08 [+0900], Hoeun Ryu wrote:
>> It would be good that `__ro_mostly_after_init` is marked to cpuhp state
>> objects. 
> why?
> 

I'm requesting comments on a new feature, the __ro_mostly_after_init section
marker. It's similar to __ro_after_init, but the section can be made
writable for certain periods of time.
Please see the cover letter [1] and the first patch [2].

[1] : https://lkml.org/lkml/2017/2/19/29
[2] : https://lkml.org/lkml/2017/2/19/32
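Concretely, RFC 5/7 annotates the cpuhp state tables roughly like this
(sketch; only the annotation is new, the entries stay as they are):

static struct cpuhp_step cpuhp_bp_states[] __ro_mostly_after_init = {
	/* ... existing entries unchanged ... */
};

static struct cpuhp_step cpuhp_ap_states[] __ro_mostly_after_init = {
	/* ... existing entries unchanged ... */
};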

> Sebastian


