Re: [PATCH v19 02/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-08-09 Thread Catalin Marinas
On Tue, Jul 23, 2019 at 07:58:39PM +0200, Andrey Konovalov wrote:
> From: Catalin Marinas 
> 
> It is not desirable to relax the ABI to allow tagged user addresses into
> the kernel indiscriminately. This patch introduces a prctl() interface
> for enabling or disabling the tagged ABI with a global sysctl control
> for preventing applications from enabling the relaxed ABI (meant for
> testing user-space prctl() return error checking without reconfiguring
> the kernel). The ABI properties are inherited by threads of the same
> application and fork()'ed children but cleared on execve(). A Kconfig
> option allows the overall disabling of the relaxed ABI.
> 
> The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> MTE-specific settings like imprecise vs precise exceptions.
> 
> Reviewed-by: Kees Cook 
> Signed-off-by: Catalin Marinas 
> Signed-off-by: Andrey Konovalov 

Following several discussions on the list and in private, I'm proposing
the update below. I can send it as a patch on top of the current series
since Will has already queued this.

---8<-
>From 1b3f57ab0c2c51f8b31c19fb34d270e1f3ee57fe Mon Sep 17 00:00:00 2001
From: Catalin Marinas 
Date: Fri, 9 Aug 2019 15:09:15 +0100
Subject: [PATCH] fixup! arm64: Introduce prctl() options to control the
 tagged user addresses ABI

Rename abi.tagged_addr sysctl control to abi.tagged_addr_disabled,
defaulting to 0. Only prevent prctl(PR_TAGGED_ADDR_ENABLE)from being
called when abi.tagged_addr_disabled==1.

Force unused arg* of the new prctl() to 0.

Signed-off-by: Catalin Marinas 
---
 arch/arm64/kernel/process.c | 17 ++---
 kernel/sys.c|  4 
 2 files changed, 14 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 76b7c55026aa..03689c0beb34 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -579,17 +579,22 @@ void arch_setup_new_exec(void)
 /*
  * Control the relaxed ABI allowing tagged user addresses into the kernel.
  */
-static unsigned int tagged_addr_prctl_allowed = 1;
+static unsigned int tagged_addr_disabled;
 
 long set_tagged_addr_ctrl(unsigned long arg)
 {
-   if (!tagged_addr_prctl_allowed)
-   return -EINVAL;
if (is_compat_task())
return -EINVAL;
if (arg & ~PR_TAGGED_ADDR_ENABLE)
return -EINVAL;
 
+   /*
+* Do not allow the enabling of the tagged address ABI if globally
+* disabled via sysctl abi.tagged_addr_disabled.
+*/
+   if (arg & PR_TAGGED_ADDR_ENABLE && tagged_addr_disabled)
+   return -EINVAL;
+
update_thread_flag(TIF_TAGGED_ADDR, arg & PR_TAGGED_ADDR_ENABLE);
 
return 0;
@@ -597,8 +602,6 @@ long set_tagged_addr_ctrl(unsigned long arg)
 
 long get_tagged_addr_ctrl(void)
 {
-   if (!tagged_addr_prctl_allowed)
-   return -EINVAL;
if (is_compat_task())
return -EINVAL;
 
@@ -618,9 +621,9 @@ static int one = 1;
 
 static struct ctl_table tagged_addr_sysctl_table[] = {
{
-   .procname   = "tagged_addr",
+   .procname   = "tagged_addr_disabled",
.mode   = 0644,
-   .data   = _addr_prctl_allowed,
+   .data   = _addr_disabled,
.maxlen = sizeof(int),
.proc_handler   = proc_dointvec_minmax,
.extra1 = ,
diff --git a/kernel/sys.c b/kernel/sys.c
index c6c4d5358bd3..ec48396b4943 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -2499,9 +2499,13 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, arg2, 
unsigned long, arg3,
error = PAC_RESET_KEYS(me, arg2);
break;
case PR_SET_TAGGED_ADDR_CTRL:
+   if (arg3 || arg4 || arg5)
+   return -EINVAL;
error = SET_TAGGED_ADDR_CTRL(arg2);
break;
case PR_GET_TAGGED_ADDR_CTRL:
+   if (arg2 || arg3 || arg4 || arg5)
+   return -EINVAL;
error = GET_TAGGED_ADDR_CTRL();
break;
default:


Re: [PATCH v19 04/15] mm: untag user pointers passed to memory syscalls

2019-08-09 Thread Catalin Marinas
On Tue, Jul 23, 2019 at 07:58:41PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends kernel ABI to allow to pass
> tagged user pointers (with the top byte set to something else other than
> 0x00) as syscall arguments.
> 
> This patch allows tagged pointers to be passed to the following memory
> syscalls: get_mempolicy, madvise, mbind, mincore, mlock, mlock2, mprotect,
> mremap, msync, munlock, move_pages.
> 
> The mmap and mremap syscalls do not currently accept tagged addresses.
> Architectures may interpret the tag as a background colour for the
> corresponding vma.
> 
> Reviewed-by: Khalid Aziz 
> Reviewed-by: Vincenzo Frascino 
> Reviewed-by: Catalin Marinas 
> Reviewed-by: Kees Cook 
> Signed-off-by: Andrey Konovalov 
> ---
>  mm/madvise.c   | 2 ++
>  mm/mempolicy.c | 3 +++
>  mm/migrate.c   | 2 +-
>  mm/mincore.c   | 2 ++
>  mm/mlock.c | 4 
>  mm/mprotect.c  | 2 ++
>  mm/mremap.c| 7 +++
>  mm/msync.c | 2 ++
>  8 files changed, 23 insertions(+), 1 deletion(-)

More back and forth discussions on how to specify the exceptions here.
I'm proposing just dropping the exceptions and folding in the diff
below.

Andrew, if you prefer a standalone patch instead, please let me know:

--8<----
>From 9a5286acaa638c6a917d96986bf28dad35e24a0c Mon Sep 17 00:00:00 2001
From: Catalin Marinas 
Date: Fri, 9 Aug 2019 14:21:33 +0100
Subject: [PATCH] fixup! mm: untag user pointers passed to memory syscalls

mmap, mremap, munmap, brk added to the list of syscalls that accept
tagged pointers.

Signed-off-by: Catalin Marinas 
---
 mm/mmap.c   | 5 +
 mm/mremap.c | 6 +-
 2 files changed, 6 insertions(+), 5 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index 7e8c3e8ae75f..b766b633b7ae 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -201,6 +201,8 @@ SYSCALL_DEFINE1(brk, unsigned long, brk)
bool downgraded = false;
LIST_HEAD(uf);
 
+   brk = untagged_addr(brk);
+
if (down_write_killable(>mmap_sem))
return -EINTR;
 
@@ -1573,6 +1575,8 @@ unsigned long ksys_mmap_pgoff(unsigned long addr, 
unsigned long len,
struct file *file = NULL;
unsigned long retval;
 
+   addr = untagged_addr(addr);
+
if (!(flags & MAP_ANONYMOUS)) {
audit_mmap_fd(fd, flags);
file = fget(fd);
@@ -2874,6 +2878,7 @@ EXPORT_SYMBOL(vm_munmap);
 
 SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len)
 {
+   addr = untagged_addr(addr);
profile_munmap(addr);
return __vm_munmap(addr, len, true);
 }
diff --git a/mm/mremap.c b/mm/mremap.c
index 64c9a3b8be0a..1fc8a29fbe3f 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -606,12 +606,8 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned 
long, old_len,
LIST_HEAD(uf_unmap_early);
LIST_HEAD(uf_unmap);
 
-   /*
-* Architectures may interpret the tag passed to mmap as a background
-* colour for the corresponding vma. For mremap we don't allow tagged
-* new_addr to preserve similar behaviour to mmap.
-*/
addr = untagged_addr(addr);
+   new_addr = untagged_addr(new_addr);
 
if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE))
return ret;


Re: [PATCH v19 00/15] arm64: untag user pointers passed to the kernel

2019-08-09 Thread Catalin Marinas
On Thu, Aug 08, 2019 at 04:09:04PM -0700, Kees Cook wrote:
> On Thu, Aug 08, 2019 at 03:33:00PM -0700, Andrew Morton wrote:
> > On Thu, 8 Aug 2019 14:12:19 -0700 Kees Cook  wrote:
> > 
> > > > The ones that are left are the mm ones: 4, 5, 6, 7 and 8.
> > > > 
> > > > Andrew, could you take a look and give your Acked-by or pick them up 
> > > > directly?
> > > 
> > > Given the subsystem Acks, it seems like 3-10 and 12 could all just go
> > > via Andrew? I hope he agrees. :)
> > 
> > I'll grab everything that has not yet appeared in linux-next.  If more
> > of these patches appear in linux-next I'll drop those as well.
> > 
> > The review discussion against " [PATCH v19 02/15] arm64: Introduce
> > prctl() options to control the tagged user addresses ABI" has petered
> > out inconclusively.  prctl() vs arch_prctl().
> 
> I've always disliked arch_prctl() existing at all. Given that tagging is
> likely to be a multi-architectural feature, it seems like the controls
> should live in prctl() to me.

It took a bit of grep'ing to figure out what Dave H meant by
arch_prctl(). It's an x86-specific syscall which we do not have on arm64
(and possibly any other architecture). Actually, we don't have any arm64
specific syscalls, only the generic unistd.h, hence the confusion. For
other arm64-specific prctls like SVE we used the generic sys_prctl() and
I can see x86 not being consistent either (PR_MPX_ENABLE_MANAGEMENT).

In general I disagree with adding any arm64-specific syscalls but in
this instance it can't even be justified. I'd rather see some clean-up
similar to arch_ptrace/ptrace_request than introducing new syscall
numbers (but as I suggested in my reply to Dave, that's for another
patch series).

-- 
Catalin


Re: [PATCH v19 02/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-08-02 Thread Catalin Marinas
On Thu, Aug 01, 2019 at 09:45:05AM -0700, Dave Hansen wrote:
> On 8/1/19 5:38 AM, Kevin Brodsky wrote:
> > This patch series only changes what is allowed or not at the syscall
> > interface. It does not change the address space size. On arm64, TBI (Top
> > Byte Ignore) has always been enabled for userspace, so it has never been
> > possible to use the upper 8 bits of user pointers for addressing.
> 
> Oh, so does the address space that's available already chop that out?

Yes. Currently the hardware only supports 52-bit virtual addresses. It
could be expanded (though it needs a 5th page table level) to 56-bit VA
but it's not currently on our (hardware) plans. Beyond 56-bit, it cannot
be done without breaking the software expectations (and hopefully I'll
retire before we need this ;)).

> > If other architectures were to support a similar functionality, then I
> > agree that a common and more generic interface (if needed) would be
> > helpful, but as it stands this is an arm64-specific prctl, and on arm64
> > the address tag is defined by the architecture as bits [63:56].
> 
> It should then be an arch_prctl(), no?

I guess you just want renaming SET_TAGGED_ADDR_CTRL() to
arch_prctl_tagged_addr_ctrl_set()? (similarly for 'get')

-- 
Catalin


Re: [PATCH v19 00/15] arm64: untag user pointers passed to the kernel

2019-08-02 Thread Catalin Marinas
On Thu, Aug 01, 2019 at 08:36:47AM -0700, Dave Hansen wrote:
> On 8/1/19 5:48 AM, Andrey Konovalov wrote:
> > On Thu, Aug 1, 2019 at 2:11 PM Kevin Brodsky  wrote:
> >> On 31/07/2019 17:50, Dave Hansen wrote:
> >>> On 7/23/19 10:58 AM, Andrey Konovalov wrote:
>  The mmap and mremap (only new_addr) syscalls do not currently accept
>  tagged addresses. Architectures may interpret the tag as a background
>  colour for the corresponding vma.
> >>>
> >>> What the heck is a "background colour"? :)
> >>
> >> Good point, this is some jargon that we started using for MTE, the idea 
> >> being that
> >> the kernel could set a tag value (specified during mmap()) as "background 
> >> colour" for
> >> anonymous pages allocated in that range.
> >>
> >> Anyway, this patch series is not about MTE. Andrey, for v20 (if any), I 
> >> think it's
> >> best to drop this last sentence to avoid any confusion.

Indeed, the part with the "background colour" and even the "currently"
adverb should be dropped.

Also, if we merge the patches via different trees anyway, I don't think
there is a need for Andrey to integrate them with his series. We can
pick them up directly in the arm64 tree (once the review finished).

> OK, but what does that mean for tagged addresses getting passed to
> mmap/mremap?  That sentence read to me like "architectures might allow
> tags for ...something...".  So do we accept tagged addresses into those
> syscalls?

If mmap() does not return a tagged address, the reasoning is that it
should not accept one as an address hint (with or without MAP_FIXED).
Note that these docs should only describe the top-byte-ignore ABI while
leaving the memory tagging for a future patchset.

In that future patchset, we may want to update the mmap() ABI to allow,
only in conjunction with PROT_MTE, a tagged pointer as an address
argument. In such case mmap() will return a tagged address and the pages
pre-coloured (on fault) with the tag requested by the user. As I said,
that's to be discussed later in the year.

-- 
Catalin


Re: [PATCH v18 00/15] arm64: untag user pointers passed to the kernel

2019-06-26 Thread Catalin Marinas
Hi Andrew,

On Mon, Jun 24, 2019 at 04:32:45PM +0200, Andrey Konovalov wrote:
> Andrey Konovalov (14):
>   arm64: untag user pointers in access_ok and __uaccess_mask_ptr
>   lib: untag user pointers in strn*_user
>   mm: untag user pointers passed to memory syscalls
>   mm: untag user pointers in mm/gup.c
>   mm: untag user pointers in get_vaddr_frames
>   fs/namespace: untag user pointers in copy_mount_options
>   userfaultfd: untag user pointers
>   drm/amdgpu: untag user pointers
>   drm/radeon: untag user pointers in radeon_gem_userptr_ioctl
>   IB/mlx4: untag user pointers in mlx4_get_umem_mr
>   media/v4l2-core: untag user pointers in videobuf_dma_contig_user_get
>   tee/shm: untag user pointers in tee_shm_register
>   vfio/type1: untag user pointers in vaddr_get_pfn
>   selftests, arm64: add a selftest for passing tagged pointers to kernel
> 
> Catalin Marinas (1):
>   arm64: Introduce prctl() options to control the tagged user addresses
> ABI
> 
>  arch/arm64/Kconfig|  9 +++
>  arch/arm64/include/asm/processor.h|  8 +++
>  arch/arm64/include/asm/thread_info.h  |  1 +
>  arch/arm64/include/asm/uaccess.h  | 12 +++-
>  arch/arm64/kernel/process.c   | 72 +++
>  .../gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c  |  2 +-
>  drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c   |  2 +
>  drivers/gpu/drm/radeon/radeon_gem.c   |  2 +
>  drivers/infiniband/hw/mlx4/mr.c   |  7 +-
>  drivers/media/v4l2-core/videobuf-dma-contig.c |  9 +--
>  drivers/tee/tee_shm.c |  1 +
>  drivers/vfio/vfio_iommu_type1.c   |  2 +
>  fs/namespace.c|  2 +-
>  fs/userfaultfd.c  | 22 +++---
>  include/uapi/linux/prctl.h|  5 ++
>  kernel/sys.c  | 12 
>  lib/strncpy_from_user.c   |  3 +-
>  lib/strnlen_user.c|  3 +-
>  mm/frame_vector.c |  2 +
>  mm/gup.c  |  4 ++
>  mm/madvise.c  |  2 +
>  mm/mempolicy.c|  3 +
>  mm/migrate.c  |  2 +-
>  mm/mincore.c  |  2 +
>  mm/mlock.c|  4 ++
>  mm/mprotect.c |  2 +
>  mm/mremap.c   |  7 ++
>  mm/msync.c|  2 +
>  tools/testing/selftests/arm64/.gitignore  |  1 +
>  tools/testing/selftests/arm64/Makefile| 11 +++
>  .../testing/selftests/arm64/run_tags_test.sh  | 12 
>  tools/testing/selftests/arm64/tags_test.c | 29 
>  32 files changed, 232 insertions(+), 25 deletions(-)

It looks like we got to an agreement on how to deal with tagged user
addresses between SPARC ADI and ARM Memory Tagging. If there are no
other objections, what's your preferred way of getting this series into
-next first and then mainline? Are you ok to merge them into the mm
tree?

Thanks.

-- 
Catalin


Re: [PATCH v18 08/15] userfaultfd: untag user pointers

2019-06-24 Thread Catalin Marinas
On Mon, Jun 24, 2019 at 04:32:53PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends kernel ABI to allow to pass
> tagged user pointers (with the top byte set to something else other than
> 0x00) as syscall arguments.
> 
> userfaultfd code use provided user pointers for vma lookups, which can
> only by done with untagged pointers.
> 
> Untag user pointers in validate_range().
> 
> Reviewed-by: Vincenzo Frascino 
> Reviewed-by: Catalin Marinas 
> Reviewed-by: Kees Cook 
> Signed-off-by: Andrey Konovalov 
> ---
>  fs/userfaultfd.c | 22 --
>  1 file changed, 12 insertions(+), 10 deletions(-)

Same here, it needs an ack from Al Viro.

> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index ae0b8b5f69e6..c2be36a168ca 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1261,21 +1261,23 @@ static __always_inline void wake_userfault(struct 
> userfaultfd_ctx *ctx,
>  }
>  
>  static __always_inline int validate_range(struct mm_struct *mm,
> -   __u64 start, __u64 len)
> +   __u64 *start, __u64 len)
>  {
>   __u64 task_size = mm->task_size;
>  
> - if (start & ~PAGE_MASK)
> + *start = untagged_addr(*start);
> +
> + if (*start & ~PAGE_MASK)
>   return -EINVAL;
>   if (len & ~PAGE_MASK)
>   return -EINVAL;
>   if (!len)
>   return -EINVAL;
> - if (start < mmap_min_addr)
> + if (*start < mmap_min_addr)
>   return -EINVAL;
> - if (start >= task_size)
> + if (*start >= task_size)
>   return -EINVAL;
> - if (len > task_size - start)
> + if (len > task_size - *start)
>   return -EINVAL;
>   return 0;
>  }
> @@ -1325,7 +1327,7 @@ static int userfaultfd_register(struct userfaultfd_ctx 
> *ctx,
>   goto out;
>   }
>  
> - ret = validate_range(mm, uffdio_register.range.start,
> + ret = validate_range(mm, _register.range.start,
>uffdio_register.range.len);
>   if (ret)
>   goto out;
> @@ -1514,7 +1516,7 @@ static int userfaultfd_unregister(struct 
> userfaultfd_ctx *ctx,
>   if (copy_from_user(_unregister, buf, sizeof(uffdio_unregister)))
>   goto out;
>  
> - ret = validate_range(mm, uffdio_unregister.start,
> + ret = validate_range(mm, _unregister.start,
>uffdio_unregister.len);
>   if (ret)
>   goto out;
> @@ -1665,7 +1667,7 @@ static int userfaultfd_wake(struct userfaultfd_ctx *ctx,
>   if (copy_from_user(_wake, buf, sizeof(uffdio_wake)))
>   goto out;
>  
> - ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len);
> + ret = validate_range(ctx->mm, _wake.start, uffdio_wake.len);
>   if (ret)
>   goto out;
>  
> @@ -1705,7 +1707,7 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx,
>  sizeof(uffdio_copy)-sizeof(__s64)))
>   goto out;
>  
> - ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len);
> + ret = validate_range(ctx->mm, _copy.dst, uffdio_copy.len);
>   if (ret)
>   goto out;
>   /*
> @@ -1761,7 +1763,7 @@ static int userfaultfd_zeropage(struct userfaultfd_ctx 
> *ctx,
>  sizeof(uffdio_zeropage)-sizeof(__s64)))
>   goto out;
>  
> - ret = validate_range(ctx->mm, uffdio_zeropage.range.start,
> + ret = validate_range(ctx->mm, _zeropage.range.start,
>uffdio_zeropage.range.len);
>   if (ret)
>   goto out;
> -- 
> 2.22.0.410.gd8fdbe21b5-goog


Re: [PATCH v18 07/15] fs/namespace: untag user pointers in copy_mount_options

2019-06-24 Thread Catalin Marinas
On Mon, Jun 24, 2019 at 04:32:52PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends kernel ABI to allow to pass
> tagged user pointers (with the top byte set to something else other than
> 0x00) as syscall arguments.
> 
> In copy_mount_options a user address is being subtracted from TASK_SIZE.
> If the address is lower than TASK_SIZE, the size is calculated to not
> allow the exact_copy_from_user() call to cross TASK_SIZE boundary.
> However if the address is tagged, then the size will be calculated
> incorrectly.
> 
> Untag the address before subtracting.
> 
> Reviewed-by: Khalid Aziz 
> Reviewed-by: Vincenzo Frascino 
> Reviewed-by: Kees Cook 
> Reviewed-by: Catalin Marinas 
> Signed-off-by: Andrey Konovalov 
> ---
>  fs/namespace.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 7660c2749c96..ec78f7223917 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2994,7 +2994,7 @@ void *copy_mount_options(const void __user * data)
>* the remainder of the page.
>*/
>   /* copy_from_user cannot cross TASK_SIZE ! */
> - size = TASK_SIZE - (unsigned long)data;
> + size = TASK_SIZE - (unsigned long)untagged_addr(data);
>   if (size > PAGE_SIZE)
>   size = PAGE_SIZE;

I think this patch needs an ack from Al Viro (cc'ed).

-- 
Catalin


Re: [PATCH v18 11/15] IB/mlx4: untag user pointers in mlx4_get_umem_mr

2019-06-24 Thread Catalin Marinas
On Mon, Jun 24, 2019 at 04:32:56PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends kernel ABI to allow to pass
> tagged user pointers (with the top byte set to something else other than
> 0x00) as syscall arguments.
> 
> mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
> only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/infiniband/hw/mlx4/mr.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)

Acked-by: Catalin Marinas 

This patch also needs an ack from the infiniband maintainers (Jason).

-- 
Catalin


Re: [PATCH v18 15/15] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-06-24 Thread Catalin Marinas
On Mon, Jun 24, 2019 at 04:33:00PM +0200, Andrey Konovalov wrote:
> --- /dev/null
> +++ b/tools/testing/selftests/arm64/tags_test.c
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0
> +
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +#include 
> +
> +#define SHIFT_TAG(tag)   ((uint64_t)(tag) << 56)
> +#define SET_TAG(ptr, tag)(((uint64_t)(ptr) & ~SHIFT_TAG(0xff)) | \
> + SHIFT_TAG(tag))
> +
> +int main(void)
> +{
> + static int tbi_enabled = 0;
> + struct utsname *ptr, *tagged_ptr;
> + int err;
> +
> + if (prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0) == 0)
> + tbi_enabled = 1;

Nitpick: with the latest prctl() patch, you can skip the last three
arguments as they are ignored.

Either way:

Reviewed-by: Catalin Marinas 


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-19 Thread Catalin Marinas
On Wed, Jun 19, 2019 at 04:45:02PM +0200, Andrey Konovalov wrote:
> On Wed, Jun 12, 2019 at 1:43 PM Andrey Konovalov  
> wrote:
> > From: Catalin Marinas 
> >
> > It is not desirable to relax the ABI to allow tagged user addresses into
> > the kernel indiscriminately. This patch introduces a prctl() interface
> > for enabling or disabling the tagged ABI with a global sysctl control
> > for preventing applications from enabling the relaxed ABI (meant for
> > testing user-space prctl() return error checking without reconfiguring
> > the kernel). The ABI properties are inherited by threads of the same
> > application and fork()'ed children but cleared on execve().
> >
> > The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> > MTE-specific settings like imprecise vs precise exceptions.
> >
> > Signed-off-by: Catalin Marinas 
> 
> Catalin, would you like to do the requested changes to this patch
> yourself and send it to me or should I do that?

I'll send you an updated version this week.

-- 
Catalin


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-17 Thread Catalin Marinas
On Mon, Jun 17, 2019 at 09:57:36AM -0700, Evgenii Stepanov wrote:
> On Mon, Jun 17, 2019 at 6:56 AM Catalin Marinas  
> wrote:
> > On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> > > From: Catalin Marinas 
> > >
> > > It is not desirable to relax the ABI to allow tagged user addresses into
> > > the kernel indiscriminately. This patch introduces a prctl() interface
> > > for enabling or disabling the tagged ABI with a global sysctl control
> > > for preventing applications from enabling the relaxed ABI (meant for
> > > testing user-space prctl() return error checking without reconfiguring
> > > the kernel). The ABI properties are inherited by threads of the same
> > > application and fork()'ed children but cleared on execve().
> > >
> > > The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> > > MTE-specific settings like imprecise vs precise exceptions.
> > >
> > > Signed-off-by: Catalin Marinas 
> >
> > A question for the user-space folk: if an application opts in to this
> > ABI, would you want the sigcontext.fault_address and/or siginfo.si_addr
> > to contain the tag? We currently clear it early in the arm64 entry.S but
> > we could find a way to pass it down if needed.
> 
> For HWASan this would not be useful because we instrument memory
> accesses with explicit checks anyway. For MTE, on the other hand, it
> would be very convenient to know the fault address tag without
> disassembling the code.

I could as this differently: does anything break if, once the user
opts in to TBI, fault_address and/or si_addr have non-zero top byte?

Alternatively, we could present the original FAR_EL1 register as a
separate field as we do with ESR_EL1, independently of whether the user
opted in to TBI or not.

-- 
Catalin


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-17 Thread Catalin Marinas
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> From: Catalin Marinas 
> 
> It is not desirable to relax the ABI to allow tagged user addresses into
> the kernel indiscriminately. This patch introduces a prctl() interface
> for enabling or disabling the tagged ABI with a global sysctl control
> for preventing applications from enabling the relaxed ABI (meant for
> testing user-space prctl() return error checking without reconfiguring
> the kernel). The ABI properties are inherited by threads of the same
> application and fork()'ed children but cleared on execve().
> 
> The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> MTE-specific settings like imprecise vs precise exceptions.
> 
> Signed-off-by: Catalin Marinas 

A question for the user-space folk: if an application opts in to this
ABI, would you want the sigcontext.fault_address and/or siginfo.si_addr
to contain the tag? We currently clear it early in the arm64 entry.S but
we could find a way to pass it down if needed.

-- 
Catalin


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-13 Thread Catalin Marinas
On Thu, Jun 13, 2019 at 04:45:54PM +0100, Vincenzo Frascino wrote:
> On 13/06/2019 16:35, Catalin Marinas wrote:
> > On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
> >> On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> >>> +
> >>> +/*
> >>> + * Control the relaxed ABI allowing tagged user addresses into the 
> >>> kernel.
> >>> + */
> >>> +static unsigned int tagged_addr_prctl_allowed = 1;
> >>> +
> >>> +long set_tagged_addr_ctrl(unsigned long arg)
> >>> +{
> >>> + if (!tagged_addr_prctl_allowed)
> >>> + return -EINVAL;
> >>
> >> So, tagging can actually be locked on by having a process enable it and
> >> then some possibly unrelated process clearing tagged_addr_prctl_allowed.
> >> That feels a bit weird.
> > 
> > The problem is that if you disable the ABI globally, lots of
> > applications would crash. This sysctl is meant as a way to disable the
> > opt-in to the TBI ABI. Another option would be a kernel command line
> > option (I'm not keen on a Kconfig option).
> 
> Why you are not keen on a Kconfig option?

Because I don't want to rebuild the kernel/reboot just to be able to
test how user space handles the ABI opt-in. I'm ok with a Kconfig option
to disable this globally in addition to a run-time option (if actually
needed, I'm not sure).

-- 
Catalin


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-13 Thread Catalin Marinas
On Thu, Jun 13, 2019 at 12:16:59PM +0100, Dave P Martin wrote:
> On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> > From: Catalin Marinas 
> > 
> > It is not desirable to relax the ABI to allow tagged user addresses into
> > the kernel indiscriminately. This patch introduces a prctl() interface
> > for enabling or disabling the tagged ABI with a global sysctl control
> > for preventing applications from enabling the relaxed ABI (meant for
> > testing user-space prctl() return error checking without reconfiguring
> > the kernel). The ABI properties are inherited by threads of the same
> > application and fork()'ed children but cleared on execve().
> > 
> > The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> > MTE-specific settings like imprecise vs precise exceptions.
> > 
> > Signed-off-by: Catalin Marinas 
> > ---
> >  arch/arm64/include/asm/processor.h   |  6 +++
> >  arch/arm64/include/asm/thread_info.h |  1 +
> >  arch/arm64/include/asm/uaccess.h |  3 +-
> >  arch/arm64/kernel/process.c  | 67 
> >  include/uapi/linux/prctl.h   |  5 +++
> >  kernel/sys.c | 16 +++
> >  6 files changed, 97 insertions(+), 1 deletion(-)
> > 
> > diff --git a/arch/arm64/include/asm/processor.h 
> > b/arch/arm64/include/asm/processor.h
> > index fcd0e691b1ea..fee457456aa8 100644
> > --- a/arch/arm64/include/asm/processor.h
> > +++ b/arch/arm64/include/asm/processor.h
> > @@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
> >  /* PR_PAC_RESET_KEYS prctl */
> >  #define PAC_RESET_KEYS(tsk, arg)   ptrauth_prctl_reset_keys(tsk, arg)
> >  
> > +/* PR_TAGGED_ADDR prctl */
> 
> (A couple of comments I missed in my last reply:)
> 
> Name mismatch?

Yeah, it went through several names but it seems that I didn't update
all places.

> > +long set_tagged_addr_ctrl(unsigned long arg);
> > +long get_tagged_addr_ctrl(void);
> > +#define SET_TAGGED_ADDR_CTRL(arg)  set_tagged_addr_ctrl(arg)
> > +#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl()
> > +
> 
> [...]
> 
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 3767fb21a5b8..69d0be1fc708 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -30,6 +30,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -323,6 +324,7 @@ void flush_thread(void)
> > fpsimd_flush_thread();
> > tls_thread_flush();
> > flush_ptrace_hw_breakpoint(current);
> > +   clear_thread_flag(TIF_TAGGED_ADDR);
> >  }
> >  
> >  void release_thread(struct task_struct *dead_task)
> > @@ -552,3 +554,68 @@ void arch_setup_new_exec(void)
> >  
> > ptrauth_thread_init_user(current);
> >  }
> > +
> > +/*
> > + * Control the relaxed ABI allowing tagged user addresses into the kernel.
> > + */
> > +static unsigned int tagged_addr_prctl_allowed = 1;
> > +
> > +long set_tagged_addr_ctrl(unsigned long arg)
> > +{
> > +   if (!tagged_addr_prctl_allowed)
> > +   return -EINVAL;
> 
> So, tagging can actually be locked on by having a process enable it and
> then some possibly unrelated process clearing tagged_addr_prctl_allowed.
> That feels a bit weird.

The problem is that if you disable the ABI globally, lots of
applications would crash. This sysctl is meant as a way to disable the
opt-in to the TBI ABI. Another option would be a kernel command line
option (I'm not keen on a Kconfig option).

> Do we want to allow a process that has tagging on to be able to turn
> it off at all?  Possibly things like CRIU might want to do that.

I left it in for symmetry but I don't expect it to be used. A potential
use-case is doing it per subsequent threads in an application.

> > +   if (is_compat_task())
> > +   return -EINVAL;
> > +   if (arg & ~PR_TAGGED_ADDR_ENABLE)
> > +   return -EINVAL;
> 
> How do we expect this argument to be extended in the future?

Yes, for MTE. That's why I wouldn't allow random bits here.

> I'm wondering whether this is really a bitmask or an enum, or a mixture
> of the two.  Maybe it doesn't matter.

User may want to set PR_TAGGED_ADDR_ENABLE | PR_MTE_PRECISE in a single
call.

> > +   if (arg & PR_TAGGED_ADDR_ENABLE)
> > +   set_thread_flag(TIF_TAGGED_ADDR);
> > +   else
> > +   clear_thread_flag(TIF_TAGGED_ADDR);
> 
> I think update_thread_flag() could be used here.

Yes. I forgot you added this.

-- 
Catalin


Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-13 Thread Catalin Marinas
Hi Dave,

On Thu, Jun 13, 2019 at 12:02:35PM +0100, Dave P Martin wrote:
> On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> > +/*
> > + * Global sysctl to disable the tagged user addresses support. This control
> > + * only prevents the tagged address ABI enabling via prctl() and does not
> > + * disable it for tasks that already opted in to the relaxed ABI.
> > + */
> > +static int zero;
> > +static int one = 1;
> 
> !!!
> 
> And these can't even be const without a cast.  Yuk.
> 
> (Not your fault though, but it would be nice to have a proc_dobool() to
> avoid this.)

I had the same reaction. Maybe for another patch sanitising this pattern
across the kernel.

> > --- a/include/uapi/linux/prctl.h
> > +++ b/include/uapi/linux/prctl.h
> > @@ -229,4 +229,9 @@ struct prctl_mm_map {
> >  # define PR_PAC_APDBKEY(1UL << 3)
> >  # define PR_PAC_APGAKEY(1UL << 4)
> >  
> > +/* Tagged user address controls for arm64 */
> > +#define PR_SET_TAGGED_ADDR_CTRL55
> > +#define PR_GET_TAGGED_ADDR_CTRL56
> > +# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
> > +
> 
> Do we expect this prctl to be applicable to other arches, or is it
> strictly arm64-specific?

I kept it generic, at least the tagged address part. The MTE bits later
on would be arm64-specific.

> > @@ -2492,6 +2498,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, 
> > arg2, unsigned long, arg3,
> > return -EINVAL;
> > error = PAC_RESET_KEYS(me, arg2);
> > break;
> > +   case PR_SET_TAGGED_ADDR_CTRL:
> > +   if (arg3 || arg4 || arg5)
> 
> 
> 
> How do you anticipate these arguments being used in the future?

I don't expect them to be used at all. But since I'm not sure, I'd force
them as zero for now rather than ignored. The GET is supposed to return
the SET arg2, hence I'd rather not used the other arguments.

> For the SVE prctls I took the view that "get" could only ever mean one
> thing, and "put" already had a flags argument with spare bits for future
> expansion anyway, so forcing the extra arguments to zero would be
> unnecessary.
> 
> Opinions seem to differ on whether requiring surplus arguments to be 0
> is beneficial for hygiene, but the glibc prototype for prctl() is
> 
>   int prctl (int __option, ...);
> 
> so it seemed annoying to have to pass extra arguments to it just for the
> sake of it.  IMHO this also makes the code at the call site less
> readable, since it's not immediately apparent that all those 0s are
> meaningless.

It's fine by me to ignore the other arguments. I just followed the
pattern of some existing prctl options. I don't have a strong opinion
either way.

> > +   return -EINVAL;
> > +   error = SET_TAGGED_ADDR_CTRL(arg2);
> > +   break;
> > +   case PR_GET_TAGGED_ADDR_CTRL:
> > +   if (arg2 || arg3 || arg4 || arg5)
> > +   return -EINVAL;
> > +   error = GET_TAGGED_ADDR_CTRL();
> 
> Having a "get" prctl is probably a good idea, but is there a clear
> usecase for it?

Not sure, maybe some other library (e.g. a JIT compiler) would like to
check whether tagged addresses have been enabled during application
start and decide to generate tagged pointers for itself. It seemed
pretty harmless, unless we add more complex things to the prctl() that
cannot be returned in one request).

-- 
Catalin


Re: [PATCH v17 15/15] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-06-12 Thread Catalin Marinas
On Wed, Jun 12, 2019 at 01:30:36PM +0100, Szabolcs Nagy wrote:
> On 12/06/2019 12:43, Andrey Konovalov wrote:
> > --- /dev/null
> > +++ b/tools/testing/selftests/arm64/tags_lib.c
> > @@ -0,0 +1,62 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +
> > +#include 
> > +#include 
> > +
> > +#define TAG_SHIFT  (56)
> > +#define TAG_MASK   (0xffUL << TAG_SHIFT)
> > +
> > +#define PR_SET_TAGGED_ADDR_CTRL55
> > +#define PR_GET_TAGGED_ADDR_CTRL56
> > +#define PR_TAGGED_ADDR_ENABLE  (1UL << 0)
> > +
> > +void *__libc_malloc(size_t size);
> > +void __libc_free(void *ptr);
> > +void *__libc_realloc(void *ptr, size_t size);
> > +void *__libc_calloc(size_t nmemb, size_t size);
> 
> this does not work on at least musl.
> 
> the most robust solution would be to implement
> the malloc apis with mmap/munmap/mremap, if that's
> too cumbersome then use dlsym RTLD_NEXT (although
> that has the slight wart that in glibc it may call
> calloc so wrapping calloc that way is tricky).
> 
> in simple linux tests i'd just use static or
> stack allocations or mmap.
> 
> if a generic preloadable lib solution is needed
> then do it properly with pthread_once to avoid
> races etc.

Thanks for the feedback Szabolcs. I guess we can go back to the initial
simple test that Andrey had and drop the whole LD_PRELOAD hack (I'll
just use it for my internal testing).

BTW, when you get some time, please review Vincenzo's ABI documentation
patches from a user/libc perspective. Once agreed, they should become
part of this series.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v17 03/15] arm64: Introduce prctl() options to control the tagged user addresses ABI

2019-06-12 Thread Catalin Marinas
On Wed, Jun 12, 2019 at 01:43:20PM +0200, Andrey Konovalov wrote:
> From: Catalin Marinas 
> 
> It is not desirable to relax the ABI to allow tagged user addresses into
> the kernel indiscriminately. This patch introduces a prctl() interface
> for enabling or disabling the tagged ABI with a global sysctl control
> for preventing applications from enabling the relaxed ABI (meant for
> testing user-space prctl() return error checking without reconfiguring
> the kernel). The ABI properties are inherited by threads of the same
> application and fork()'ed children but cleared on execve().
> 
> The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
> MTE-specific settings like imprecise vs precise exceptions.
> 
> Signed-off-by: Catalin Marinas 

You need your signed-off-by here since you are contributing it. And
thanks for adding the comment to the TIF definition.

-- 
Catalin


Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-12 Thread Catalin Marinas
On Wed, Jun 12, 2019 at 01:03:10PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 11, 2019 at 7:39 PM Catalin Marinas  
> wrote:
> > On Tue, Jun 11, 2019 at 07:09:46PM +0200, Andrey Konovalov wrote:
> > > Should I drop access_ok() change from my patch, since yours just reverts 
> > > it?
> >
> > Not necessary, your patch just relaxes the ABI for all apps, mine
> > tightens it. You could instead move the untagging to __range_ok() and
> > rebase my patch accordingly.
> 
> OK, will do. I'll also add a comment next to TIF_TAGGED_ADDR as Vincenzo 
> asked.

Thanks.

-- 
Catalin


Re: [PATCH v16 12/16] IB, arm64: untag user pointers in ib_uverbs_(re)reg_mr()

2019-06-12 Thread Catalin Marinas
On Tue, Jun 04, 2019 at 03:09:26PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 4, 2019 at 3:02 PM Jason Gunthorpe  wrote:
> > On Tue, Jun 04, 2019 at 02:45:32PM +0200, Andrey Konovalov wrote:
> > > On Tue, Jun 4, 2019 at 2:27 PM Jason Gunthorpe  wrote:
> > > > On Tue, Jun 04, 2019 at 02:18:19PM +0200, Andrey Konovalov wrote:
> > > > > On Mon, Jun 3, 2019 at 7:46 PM Jason Gunthorpe  wrote:
> > > > > > On Mon, Jun 03, 2019 at 06:55:14PM +0200, Andrey Konovalov wrote:
> > > > > > > This patch is a part of a series that extends arm64 kernel ABI to 
> > > > > > > allow to
> > > > > > > pass tagged user pointers (with the top byte set to something 
> > > > > > > else other
> > > > > > > than 0x00) as syscall arguments.
> > > > > > >
> > > > > > > ib_uverbs_(re)reg_mr() use provided user pointers for vma lookups 
> > > > > > > (through
> > > > > > > e.g. mlx4_get_umem_mr()), which can only by done with untagged 
> > > > > > > pointers.
> > > > > > >
> > > > > > > Untag user pointers in these functions.
> > > > > > >
> > > > > > > Signed-off-by: Andrey Konovalov 
> > > > > > >  drivers/infiniband/core/uverbs_cmd.c | 4 
> > > > > > >  1 file changed, 4 insertions(+)
> > > > > > >
> > > > > > > diff --git a/drivers/infiniband/core/uverbs_cmd.c 
> > > > > > > b/drivers/infiniband/core/uverbs_cmd.c
> > > > > > > index 5a3a1780ceea..f88ee733e617 100644
> > > > > > > +++ b/drivers/infiniband/core/uverbs_cmd.c
> > > > > > > @@ -709,6 +709,8 @@ static int ib_uverbs_reg_mr(struct 
> > > > > > > uverbs_attr_bundle *attrs)
> > > > > > >   if (ret)
> > > > > > >   return ret;
> > > > > > >
> > > > > > > + cmd.start = untagged_addr(cmd.start);
> > > > > > > +
> > > > > > >   if ((cmd.start & ~PAGE_MASK) != (cmd.hca_va & ~PAGE_MASK))
> > > > > > >   return -EINVAL;
> > > > > >
> > > > > > I feel like we shouldn't thave to do this here, surely the cmd.start
> > > > > > should flow unmodified to get_user_pages, and gup should untag it?
> > > > > >
> > > > > > ie, this sort of direction for the IB code (this would be a giant
> > > > > > patch, so I didn't have time to write it all, but I think it is much
> > > > > > saner):
> > > > >
> > > > > ib_uverbs_reg_mr() passes cmd.start to mlx4_get_umem_mr(), which calls
> > > > > find_vma(), which only accepts untagged addresses. Could you explain
> > > > > how your patch helps?
> > > >
> > > > That mlx4 is just a 'weird duck', it is not the normal flow, and I
> > > > don't think the core code should be making special consideration for
> > > > it.
> > >
> > > How do you think we should do untagging (or something else) to deal
> > > with this 'weird duck' case?
> >
> > mlx4 should handle it around the call to find_vma like other patches
> > do, ideally as part of the cast from a void __user * to the unsigned
> > long that find_vma needs
> 
> So essentially what we had a few versions ago
> (https://lkml.org/lkml/2019/4/30/785) plus changing unsigned longs to
> __user * across all IB code? I think the second part is something
> that's not related to this series and needs to be done separately. I
> can move untagging back to mlx4_get_umem_mr() though.
> 
> Catalin, you've initially asked to to move untagging out of
> mlx4_get_umem_mr(), do you have any comments on this?

It's fine by me either way. My original reasoning was to untag this at
the higher level as tags may not be relevant to the mlx4 code. If that's
what Jason prefers, go for it.

-- 
Catalin


Re: [PATCH v16 09/16] fs, arm64: untag user pointers in fs/userfaultfd.c

2019-06-12 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:11PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> userfaultfd code use provided user pointers for vma lookups, which can
> only by done with untagged pointers.
> 
> Untag user pointers in validate_range().
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v16 15/16] vfio/type1, arm64: untag user pointers in vaddr_get_pfn

2019-06-12 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:17PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> vaddr_get_pfn() uses provided user pointers for vma lookups, which can
> only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-12 Thread Catalin Marinas
Hi Vincenzo,

On Tue, Jun 11, 2019 at 06:09:10PM +0100, Vincenzo Frascino wrote:
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 3767fb21a5b8..69d0be1fc708 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -30,6 +30,7 @@
> >  #include 
> >  #include 
> >  #include 
> > +#include 
> >  #include 
> >  #include 
> >  #include 
> > @@ -323,6 +324,7 @@ void flush_thread(void)
> > fpsimd_flush_thread();
> > tls_thread_flush();
> > flush_ptrace_hw_breakpoint(current);
> > +   clear_thread_flag(TIF_TAGGED_ADDR);
> 
> Nit: in line we the other functions in thread_flush we could have something 
> like
> "tagged_addr_thread_flush", maybe inlined.

The other functions do a lot more than clearing a TIF flag, so they
deserved their own place. We could do this when adding MTE support. I
think we also need to check what other TIF flags we may inadvertently
pass on execve(), maybe have a mask clearing.

> > diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
> > index 094bb03b9cc2..2e927b3e9d6c 100644
> > --- a/include/uapi/linux/prctl.h
> > +++ b/include/uapi/linux/prctl.h
> > @@ -229,4 +229,9 @@ struct prctl_mm_map {
> >  # define PR_PAC_APDBKEY(1UL << 3)
> >  # define PR_PAC_APGAKEY(1UL << 4)
> >  
> > +/* Tagged user address controls for arm64 */
> > +#define PR_SET_TAGGED_ADDR_CTRL55
> > +#define PR_GET_TAGGED_ADDR_CTRL56
> > +# define PR_TAGGED_ADDR_ENABLE (1UL << 0)
> > +
> >  #endif /* _LINUX_PRCTL_H */
> > diff --git a/kernel/sys.c b/kernel/sys.c
> > index 2969304c29fe..ec48396b4943 100644
> > --- a/kernel/sys.c
> > +++ b/kernel/sys.c
> > @@ -124,6 +124,12 @@
> >  #ifndef PAC_RESET_KEYS
> >  # define PAC_RESET_KEYS(a, b)  (-EINVAL)
> >  #endif
> > +#ifndef SET_TAGGED_ADDR_CTRL
> > +# define SET_TAGGED_ADDR_CTRL(a)   (-EINVAL)
> > +#endif
> > +#ifndef GET_TAGGED_ADDR_CTRL
> > +# define GET_TAGGED_ADDR_CTRL()(-EINVAL)
> > +#endif
> >  
> >  /*
> >   * this is where the system-wide overflow UID and GID are defined, for
> > @@ -2492,6 +2498,16 @@ SYSCALL_DEFINE5(prctl, int, option, unsigned long, 
> > arg2, unsigned long, arg3,
> > return -EINVAL;
> > error = PAC_RESET_KEYS(me, arg2);
> > break;
> > +   case PR_SET_TAGGED_ADDR_CTRL:
> > +   if (arg3 || arg4 || arg5)
> > +   return -EINVAL;
> > +   error = SET_TAGGED_ADDR_CTRL(arg2);
> > +   break;
> > +   case PR_GET_TAGGED_ADDR_CTRL:
> > +   if (arg2 || arg3 || arg4 || arg5)
> > +   return -EINVAL;
> > +   error = GET_TAGGED_ADDR_CTRL();
> > +   break;
> 
> Why do we need two prctl here? We could have only one and use arg2 as set/get
> and arg3 as a parameter. What do you think?

This follows the other PR_* options, e.g. PR_SET_VL/GET_VL,
PR_*_FP_MODE. We will use other bits in arg2, for example to set the
precise vs imprecise MTE trapping.

-- 
Catalin


Re: [PATCH v16 16/16] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-06-11 Thread Catalin Marinas
On Tue, Jun 11, 2019 at 07:18:04PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 11, 2019 at 5:01 PM Catalin Marinas  
> wrote:
> > static void *tag_ptr(void *ptr)
> > {
> > static int tagged_addr_err = 1;
> > unsigned long tag = 0;
> >
> > if (tagged_addr_err == 1)
> > tagged_addr_err = prctl(PR_SET_TAGGED_ADDR_CTRL,
> > PR_TAGGED_ADDR_ENABLE, 0, 0, 0);
> 
> I think this requires atomics. malloc() can be called from multiple threads.

It's slightly racy but I assume in a real libc it can be initialised
earlier than the hook calls while still in single-threaded mode (I had
a quick attempt with __attribute__((constructor)) but didn't get far).

Even with the race, under normal circumstances calling the prctl() twice
is not a problem. I think the risk here is that someone disables the ABI
via sysctl and the ABI is enabled for some of the threads only.

-- 
Catalin


Re: [PATCH v16 05/16] arm64: untag user pointers passed to memory syscalls

2019-06-11 Thread Catalin Marinas
On Tue, Jun 11, 2019 at 05:35:31PM +0200, Andrey Konovalov wrote:
> On Mon, Jun 10, 2019 at 4:28 PM Catalin Marinas  
> wrote:
> > On Mon, Jun 03, 2019 at 06:55:07PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > >
> > > This patch allows tagged pointers to be passed to the following memory
> > > syscalls: get_mempolicy, madvise, mbind, mincore, mlock, mlock2, mprotect,
> > > mremap, msync, munlock.
> > >
> > > Signed-off-by: Andrey Konovalov 
> >
> > I would add in the commit log (and possibly in the code with a comment)
> > that mremap() and mmap() do not currently accept tagged hint addresses.
> > Architectures may interpret the hint tag as a background colour for the
> > corresponding vma. With this:
> 
> I'll change the commit log. Where do you you think I should put this
> comment? Before mmap and mremap definitions in mm/?

On arm64 we use our own sys_mmap(). I'd say just add a comment on the
generic mremap() just before the untagged_addr() along the lines that
new_address is not untagged for preserving similar behaviour to mmap().

-- 
Catalin


Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-11 Thread Catalin Marinas
On Tue, Jun 11, 2019 at 07:09:46PM +0200, Andrey Konovalov wrote:
> On Tue, Jun 11, 2019 at 4:57 PM Catalin Marinas  
> wrote:
> >
> > On Mon, Jun 10, 2019 at 06:53:27PM +0100, Catalin Marinas wrote:
> > > On Mon, Jun 03, 2019 at 06:55:04PM +0200, Andrey Konovalov wrote:
> > > > diff --git a/arch/arm64/include/asm/uaccess.h 
> > > > b/arch/arm64/include/asm/uaccess.h
> > > > index e5d5f31c6d36..9164ecb5feca 100644
> > > > --- a/arch/arm64/include/asm/uaccess.h
> > > > +++ b/arch/arm64/include/asm/uaccess.h
> > > > @@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void 
> > > > __user *addr, unsigned long si
> > > > return ret;
> > > >  }
> > > >
> > > > -#define access_ok(addr, size)  __range_ok(addr, size)
> > > > +#define access_ok(addr, size)  __range_ok(untagged_addr(addr), 
> > > > size)
> > >
> > > I'm going to propose an opt-in method here (RFC for now). We can't have
> > > a check in untagged_addr() since this is already used throughout the
> > > kernel for both user and kernel addresses (khwasan) but we can add one
> > > in __range_ok(). The same prctl() option will be used for controlling
> > > the precise/imprecise mode of MTE later on. We can use a TIF_ flag here
> > > assuming that this will be called early on and any cloned thread will
> > > inherit this.
> >
> > Updated patch, inlining it below. Once we agreed on the approach, I
> > think Andrey can insert in in this series, probably after patch 2. The
> > differences from the one I posted yesterday:
> >
> > - renamed PR_* macros together with get/set variants and the possibility
> >   to disable the relaxed ABI
> >
> > - sysctl option - /proc/sys/abi/tagged_addr to disable the ABI globally
> >   (just the prctl() opt-in, tasks already using it won't be affected)
> >
> > And, of course, it needs more testing.
> 
> Sure, I'll add it to the series.
> 
> Should I drop access_ok() change from my patch, since yours just reverts it?

Not necessary, your patch just relaxes the ABI for all apps, mine
tightens it. You could instead move the untagging to __range_ok() and
rebase my patch accordingly.

-- 
Catalin


Re: [PATCH v16 16/16] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-06-11 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:18PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch adds a simple test, that calls the uname syscall with a
> tagged user pointer as an argument. Without the kernel accepting tagged
> user pointers the test fails with EFAULT.
> 
> Signed-off-by: Andrey Konovalov 

BTW, you could add

Co-developed-by: Catalin Marinas 

since I wrote the malloc() etc. hooks.


> +static void *tag_ptr(void *ptr)
> +{
> + unsigned long tag = rand() & 0xff;
> + if (!ptr)
> + return ptr;
> + return (void *)((unsigned long)ptr | (tag << TAG_SHIFT));
> +}

With the prctl() option, this function becomes (if you have a better
idea, fine by me):

--8<---
#include 
#include 

#define TAG_SHIFT   (56)
#define TAG_MASK(0xffUL << TAG_SHIFT)

#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_GET_TAGGED_ADDR_CTRL 56
# define PR_TAGGED_ADDR_ENABLE  (1UL << 0)

void *__libc_malloc(size_t size);
void __libc_free(void *ptr);
void *__libc_realloc(void *ptr, size_t size);
void *__libc_calloc(size_t nmemb, size_t size);

static void *tag_ptr(void *ptr)
{
static int tagged_addr_err = 1;
unsigned long tag = 0;

if (tagged_addr_err == 1)
tagged_addr_err = prctl(PR_SET_TAGGED_ADDR_CTRL,
PR_TAGGED_ADDR_ENABLE, 0, 0, 0);

if (!ptr)
return ptr;
if (!tagged_addr_err)
tag = rand() & 0xff;

return (void *)((unsigned long)ptr | (tag << TAG_SHIFT));
}

-- 
Catalin


Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-11 Thread Catalin Marinas
On Mon, Jun 10, 2019 at 06:53:27PM +0100, Catalin Marinas wrote:
> On Mon, Jun 03, 2019 at 06:55:04PM +0200, Andrey Konovalov wrote:
> > diff --git a/arch/arm64/include/asm/uaccess.h 
> > b/arch/arm64/include/asm/uaccess.h
> > index e5d5f31c6d36..9164ecb5feca 100644
> > --- a/arch/arm64/include/asm/uaccess.h
> > +++ b/arch/arm64/include/asm/uaccess.h
> > @@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void __user 
> > *addr, unsigned long si
> > return ret;
> >  }
> >  
> > -#define access_ok(addr, size)  __range_ok(addr, size)
> > +#define access_ok(addr, size)  __range_ok(untagged_addr(addr), size)
> 
> I'm going to propose an opt-in method here (RFC for now). We can't have
> a check in untagged_addr() since this is already used throughout the
> kernel for both user and kernel addresses (khwasan) but we can add one
> in __range_ok(). The same prctl() option will be used for controlling
> the precise/imprecise mode of MTE later on. We can use a TIF_ flag here
> assuming that this will be called early on and any cloned thread will
> inherit this.

Updated patch, inlining it below. Once we agreed on the approach, I
think Andrey can insert in in this series, probably after patch 2. The
differences from the one I posted yesterday:

- renamed PR_* macros together with get/set variants and the possibility
  to disable the relaxed ABI

- sysctl option - /proc/sys/abi/tagged_addr to disable the ABI globally
  (just the prctl() opt-in, tasks already using it won't be affected)

And, of course, it needs more testing.

-----8<----
>From 7c624777a4e545522dec1b34e60f0229cb2bd59f Mon Sep 17 00:00:00 2001
From: Catalin Marinas 
Date: Tue, 11 Jun 2019 13:03:38 +0100
Subject: [PATCH] arm64: Introduce prctl() options to control the tagged user
 addresses ABI

It is not desirable to relax the ABI to allow tagged user addresses into
the kernel indiscriminately. This patch introduces a prctl() interface
for enabling or disabling the tagged ABI with a global sysctl control
for preventing applications from enabling the relaxed ABI (meant for
testing user-space prctl() return error checking without reconfiguring
the kernel). The ABI properties are inherited by threads of the same
application and fork()'ed children but cleared on execve().

The PR_SET_TAGGED_ADDR_CTRL will be expanded in the future to handle
MTE-specific settings like imprecise vs precise exceptions.

Signed-off-by: Catalin Marinas 
---
 arch/arm64/include/asm/processor.h   |  6 +++
 arch/arm64/include/asm/thread_info.h |  1 +
 arch/arm64/include/asm/uaccess.h |  5 ++-
 arch/arm64/kernel/process.c  | 67 
 include/uapi/linux/prctl.h   |  5 +++
 kernel/sys.c | 16 +++
 6 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/include/asm/processor.h 
b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..fee457456aa8 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,12 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)   ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_TAGGED_ADDR prctl */
+long set_tagged_addr_ctrl(unsigned long arg);
+long get_tagged_addr_ctrl(void);
+#define SET_TAGGED_ADDR_CTRL(arg)  set_tagged_addr_ctrl(arg)
+#define GET_TAGGED_ADDR_CTRL() get_tagged_addr_ctrl()
+
 /*
  * For CONFIG_GCC_PLUGIN_STACKLEAK
  *
diff --git a/arch/arm64/include/asm/thread_info.h 
b/arch/arm64/include/asm/thread_info.h
index c285d1ce7186..7263d4c973ce 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -101,6 +101,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_SVE23  /* Scalable Vector Extension in 
use */
 #define TIF_SVE_VL_INHERIT 24  /* Inherit sve_vl_onexec across exec */
 #define TIF_SSBD   25  /* Wants SSB mitigation */
+#define TIF_TAGGED_ADDR26
 
 #define _TIF_SIGPENDING(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 9164ecb5feca..995b9ea11a89 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -73,6 +73,9 @@ static inline unsigned long __range_ok(const void __user 
*addr, unsigned long si
 {
unsigned long ret, limit = current_thread_info()->addr_limit;
 
+   if (test_thread_flag(TIF_TAGGED_ADDR))
+   addr = untagged_addr(addr);
+
__chk_user_ptr(addr);
asm volatile(
// A + B <= C + 1 for all A,B,C, in four easy steps:
@@ -94,7 +97,7 @@ static inline unsigned long __range_ok(const void __user 
*addr, unsigned long si
return

Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-10 Thread Catalin Marinas
On Mon, Jun 10, 2019 at 11:07:03AM -0700, Kees Cook wrote:
> On Mon, Jun 10, 2019 at 06:53:27PM +0100, Catalin Marinas wrote:
> > On Mon, Jun 03, 2019 at 06:55:04PM +0200, Andrey Konovalov wrote:
> > > diff --git a/arch/arm64/include/asm/uaccess.h 
> > > b/arch/arm64/include/asm/uaccess.h
> > > index e5d5f31c6d36..9164ecb5feca 100644
> > > --- a/arch/arm64/include/asm/uaccess.h
> > > +++ b/arch/arm64/include/asm/uaccess.h
> > > @@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void 
> > > __user *addr, unsigned long si
> > >   return ret;
> > >  }
> > >  
> > > -#define access_ok(addr, size)__range_ok(addr, size)
> > > +#define access_ok(addr, size)__range_ok(untagged_addr(addr), size)
[...]
> > diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
> > index 3767fb21a5b8..fd191c5b92aa 100644
> > --- a/arch/arm64/kernel/process.c
> > +++ b/arch/arm64/kernel/process.c
> > @@ -552,3 +552,18 @@ void arch_setup_new_exec(void)
> >  
> > ptrauth_thread_init_user(current);
> >  }
> > +
> > +/*
> > + * Enable the relaxed ABI allowing tagged user addresses into the kernel.
> > + */
> > +int untagged_uaddr_set_mode(unsigned long arg)
> > +{
> > +   if (is_compat_task())
> > +   return -ENOTSUPP;
> > +   if (arg)
> > +   return -EINVAL;
> > +
> > +   set_thread_flag(TIF_UNTAGGED_UADDR);
> > +
> > +   return 0;
> > +}
> 
> I think this should be paired with a flag clearing in copy_thread(),
> yes? (i.e. each binary needs to opt in)

It indeed needs clearing though not in copy_thread() as that's used on
clone/fork but rather in flush_thread(), called on the execve() path.

And a note to myself: I think PR_UNTAGGED_ADDR (not UADDR) looks better
in a uapi header, the user doesn't differentiate between uaddr and
kaddr.

-- 
Catalin


Re: [PATCH v16 02/16] arm64: untag user pointers in access_ok and __uaccess_mask_ptr

2019-06-10 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:04PM +0200, Andrey Konovalov wrote:
> diff --git a/arch/arm64/include/asm/uaccess.h 
> b/arch/arm64/include/asm/uaccess.h
> index e5d5f31c6d36..9164ecb5feca 100644
> --- a/arch/arm64/include/asm/uaccess.h
> +++ b/arch/arm64/include/asm/uaccess.h
> @@ -94,7 +94,7 @@ static inline unsigned long __range_ok(const void __user 
> *addr, unsigned long si
>   return ret;
>  }
>  
> -#define access_ok(addr, size)__range_ok(addr, size)
> +#define access_ok(addr, size)__range_ok(untagged_addr(addr), size)

I'm going to propose an opt-in method here (RFC for now). We can't have
a check in untagged_addr() since this is already used throughout the
kernel for both user and kernel addresses (khwasan) but we can add one
in __range_ok(). The same prctl() option will be used for controlling
the precise/imprecise mode of MTE later on. We can use a TIF_ flag here
assuming that this will be called early on and any cloned thread will
inherit this.

Anyway, it's easier to paste some diff than explain but Vincenzo can
fold them into his ABI patches that should really go together with
these. I added a couple of MTE definitions for prctl() as an example,
not used currently:

--8<-
diff --git a/arch/arm64/include/asm/processor.h 
b/arch/arm64/include/asm/processor.h
index fcd0e691b1ea..2d4cb7e4edab 100644
--- a/arch/arm64/include/asm/processor.h
+++ b/arch/arm64/include/asm/processor.h
@@ -307,6 +307,10 @@ extern void __init minsigstksz_setup(void);
 /* PR_PAC_RESET_KEYS prctl */
 #define PAC_RESET_KEYS(tsk, arg)   ptrauth_prctl_reset_keys(tsk, arg)
 
+/* PR_UNTAGGED_UADDR prctl */
+int untagged_uaddr_set_mode(unsigned long arg);
+#define SET_UNTAGGED_UADDR_MODE(arg)   untagged_uaddr_set_mode(arg)
+
 /*
  * For CONFIG_GCC_PLUGIN_STACKLEAK
  *
diff --git a/arch/arm64/include/asm/thread_info.h 
b/arch/arm64/include/asm/thread_info.h
index c285d1ce7186..89ce3c49 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -101,6 +101,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define TIF_SVE23  /* Scalable Vector Extension in 
use */
 #define TIF_SVE_VL_INHERIT 24  /* Inherit sve_vl_onexec across exec */
 #define TIF_SSBD   25  /* Wants SSB mitigation */
+#define TIF_UNTAGGED_UADDR 26
 
 #define _TIF_SIGPENDING(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED  (1 << TIF_NEED_RESCHED)
@@ -116,6 +117,7 @@ void arch_release_task_struct(struct task_struct *tsk);
 #define _TIF_FSCHECK   (1 << TIF_FSCHECK)
 #define _TIF_32BIT (1 << TIF_32BIT)
 #define _TIF_SVE   (1 << TIF_SVE)
+#define _TIF_UNTAGGED_UADDR(1 << TIF_UNTAGGED_UADDR)
 
 #define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
 _TIF_NOTIFY_RESUME | _TIF_FOREIGN_FPSTATE | \
diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h
index 9164ecb5feca..54f5bbaebbc4 100644
--- a/arch/arm64/include/asm/uaccess.h
+++ b/arch/arm64/include/asm/uaccess.h
@@ -73,6 +73,9 @@ static inline unsigned long __range_ok(const void __user 
*addr, unsigned long si
 {
unsigned long ret, limit = current_thread_info()->addr_limit;
 
+   if (test_thread_flag(TIF_UNTAGGED_UADDR))
+   addr = untagged_addr(addr);
+
__chk_user_ptr(addr);
asm volatile(
// A + B <= C + 1 for all A,B,C, in four easy steps:
@@ -94,7 +97,7 @@ static inline unsigned long __range_ok(const void __user 
*addr, unsigned long si
return ret;
 }
 
-#define access_ok(addr, size)  __range_ok(untagged_addr(addr), size)
+#define access_ok(addr, size)  __range_ok(addr, size)
 #define user_addr_max  get_fs
 
 #define _ASM_EXTABLE(from, to) \
diff --git a/arch/arm64/kernel/process.c b/arch/arm64/kernel/process.c
index 3767fb21a5b8..fd191c5b92aa 100644
--- a/arch/arm64/kernel/process.c
+++ b/arch/arm64/kernel/process.c
@@ -552,3 +552,18 @@ void arch_setup_new_exec(void)
 
ptrauth_thread_init_user(current);
 }
+
+/*
+ * Enable the relaxed ABI allowing tagged user addresses into the kernel.
+ */
+int untagged_uaddr_set_mode(unsigned long arg)
+{
+   if (is_compat_task())
+   return -ENOTSUPP;
+   if (arg)
+   return -EINVAL;
+
+   set_thread_flag(TIF_UNTAGGED_UADDR);
+
+   return 0;
+}
diff --git a/include/uapi/linux/prctl.h b/include/uapi/linux/prctl.h
index 094bb03b9cc2..4afd5e2980ee 100644
--- a/include/uapi/linux/prctl.h
+++ b/include/uapi/linux/prctl.h
@@ -229,4 +229,9 @@ struct prctl_mm_map {
 # define PR_PAC_APDBKEY(1UL << 3)
 # define PR_PAC_APGAKEY(1UL << 4)
 
+/* Untagged user addresses for arm64 */
+#define PR_UNTAGGED_UADDR  55
+# define PR_MTE_IMPRECISE_CHECK   

Re: [PATCH v16 07/16] mm, arm64: untag user pointers in get_vaddr_frames

2019-06-10 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:09PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> get_vaddr_frames uses provided user pointers for vma lookups, which can
> only by done with untagged pointers. Instead of locating and changing
> all callers of this function, perform untagging in it.
> 
> Signed-off-by: Andrey Konovalov 

Acked-by: Catalin Marinas 


Re: [PATCH v16 05/16] arm64: untag user pointers passed to memory syscalls

2019-06-10 Thread Catalin Marinas
On Mon, Jun 03, 2019 at 06:55:07PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch allows tagged pointers to be passed to the following memory
> syscalls: get_mempolicy, madvise, mbind, mincore, mlock, mlock2, mprotect,
> mremap, msync, munlock.
> 
> Signed-off-by: Andrey Konovalov 

I would add in the commit log (and possibly in the code with a comment)
that mremap() and mmap() do not currently accept tagged hint addresses.
Architectures may interpret the hint tag as a background colour for the
corresponding vma. With this:

Reviewed-by: Catalin Marinas 

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-31 Thread Catalin Marinas
On Fri, May 31, 2019 at 06:24:06PM +0200, Andrey Konovalov wrote:
> On Fri, May 31, 2019 at 6:20 PM Catalin Marinas  
> wrote:
> > On Fri, May 31, 2019 at 04:29:10PM +0200, Andrey Konovalov wrote:
> > > On Thu, May 30, 2019 at 7:15 PM Catalin Marinas  
> > > wrote:
> > > > On Tue, May 28, 2019 at 04:14:45PM +0200, Andrey Konovalov wrote:
> > > > > Thanks for a lot of valuable input! I've read through all the replies
> > > > > and got somewhat lost. What are the changes I need to do to this
> > > > > series?
> > > > >
> > > > > 1. Should I move untagging for memory syscalls back to the generic
> > > > > code so other arches would make use of it as well, or should I keep
> > > > > the arm64 specific memory syscalls wrappers and address the comments
> > > > > on that patch?
> > > >
> > > > Keep them generic again but make sure we get agreement with Khalid on
> > > > the actual ABI implications for sparc.
> > >
> > > OK, will do. I find it hard to understand what the ABI implications
> > > are. I'll post the next version without untagging in brk, mmap,
> > > munmap, mremap (for new_address), mmap_pgoff, remap_file_pages, shmat
> > > and shmdt.
> >
> > It's more about not relaxing the ABI to accept non-zero top-byte unless
> > we have a use-case for it. For mmap() etc., I don't think that's needed
> > but if you think otherwise, please raise it.
> >
> > > > > 2. Should I make untagging opt-in and controlled by a command line 
> > > > > argument?
> > > >
> > > > Opt-in, yes, but per task rather than kernel command line option.
> > > > prctl() is a possibility of opting in.
> > >
> > > OK. Should I store a flag somewhere in task_struct? Should it be
> > > inheritable on clone?
> >
> > A TIF flag would do but I'd say leave it out for now (default opted in)
> > until we figure out the best way to do this (can be a patch on top of
> > this series).
> 
> You mean leave the whole opt-in/prctl part out? So the only change
> would be to move untagging for memory syscalls into generic code?

Yes (or just wait until next week to see if the discussion settles
down).

-- 
Catalin


Re: [PATCH v15 17/17] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-05-31 Thread Catalin Marinas
On Fri, May 31, 2019 at 04:21:48PM +0200, Andrey Konovalov wrote:
> On Wed, May 22, 2019 at 4:16 PM Catalin Marinas  
> wrote:
> > On Mon, May 06, 2019 at 06:31:03PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > >
> > > This patch adds a simple test, that calls the uname syscall with a
> > > tagged user pointer as an argument. Without the kernel accepting tagged
> > > user pointers the test fails with EFAULT.
> >
> > That's probably sufficient for a simple example. Something we could add
> > to Documentation maybe is a small library that can be LD_PRELOAD'ed so
> > that you can run a lot more tests like LTP.
> 
> Should I add this into this series, or should this go into Vincenzo's 
> patchset?

If you can tweak the selftest Makefile to build a library and force it
with LD_PRELOAD, you can keep it with this patch. It would be easier to
extend to other syscall tests, signal handling etc.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-30 Thread Catalin Marinas
On Tue, May 28, 2019 at 04:14:45PM +0200, Andrey Konovalov wrote:
> Thanks for a lot of valuable input! I've read through all the replies
> and got somewhat lost. What are the changes I need to do to this
> series?
> 
> 1. Should I move untagging for memory syscalls back to the generic
> code so other arches would make use of it as well, or should I keep
> the arm64 specific memory syscalls wrappers and address the comments
> on that patch?

Keep them generic again but make sure we get agreement with Khalid on
the actual ABI implications for sparc.

> 2. Should I make untagging opt-in and controlled by a command line argument?

Opt-in, yes, but per task rather than kernel command line option.
prctl() is a possibility of opting in.

> 3. Should I "add Documentation/core-api/user-addresses.rst to describe
> proper care and handling of user space pointers with untagged_addr(),
> with examples based on all the cases seen so far in this series"?
> Which examples specifically should it cover?

I think we can leave 3 for now as not too urgent. What I'd like is for
Vincenzo's TBI user ABI document to go into a more common place since we
can expand it to cover both sparc and arm64. We'd need an arm64-specific
doc as well for things like prctl() and later MTE that sparc may support
differently.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-30 Thread Catalin Marinas
On Thu, May 30, 2019 at 10:05:55AM -0600, Khalid Aziz wrote:
> On 5/30/19 9:11 AM, Catalin Marinas wrote:
> > So if a database program is doing an anonymous mmap(PROT_TBI) of 100GB,
> > IIUC for sparc the faulted-in pages will have random colours (on 64-byte
> > granularity). Ignoring the information leak from prior uses of such
> > pages, it would be the responsibility of the db program to issue the
> > stxa. On arm64, since we also want to do this via malloc(), any large
> > allocation would require all pages to be faulted in so that malloc() can
> > set the write colour before being handed over to the user. That's what
> > we want to avoid and the user is free to repaint the memory as it likes.
> 
> On sparc, any newly allocated page is cleared along with any old tags on
> it. Since clearing tag happens automatically when page is cleared on
> sparc, clear_user_page() will need to execute additional stxa
> instructions to set a new tag. It is doable. In a way it is done already
> if page is being pre-colored with tag 0 always ;)

Ah, good to know. On arm64 we'd have to use different instructions,
although the same loop.

> Where would the pre-defined tag be stored - as part of address stored
> in vm_start or a new field in vm_area_struct?

I think we can discuss the details when we post the actual MTE patches.
In our internal hack we overloaded the VM_HIGH_ARCH_* flags and selected
CONFIG_ARCH_USES_HIGH_VMA_FLAGS (used for pkeys on x86).

For the time being, I'd rather restrict tagged addresses passed to
mmap() until we agreed that they have any meaning. If we allowed them
now but get ignored (though probably no-one would be doing this), I feel
it's slightly harder to change the semantics afterwards.

Thanks.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-30 Thread Catalin Marinas
On Wed, May 29, 2019 at 01:16:37PM -0600, Khalid Aziz wrote:
> On 5/29/19 8:20 AM, Catalin Marinas wrote:
> > On Tue, May 28, 2019 at 05:33:04PM -0600, Khalid Aziz wrote:
> >> Steps 1 and 2 are accomplished by userspace by calling mprotect() with
> >> PROT_ADI. Tags are set by storing tags in a loop, for example:
> >>
> >> version = 10;
> >> tmp_addr = shmaddr;
> >> end = shmaddr + BUFFER_SIZE;
> >> while (tmp_addr < end) {
> >> asm volatile(
> >> "stxa %1, [%0]0x90\n\t"
> >> :
> >> : "r" (tmp_addr), "r" (version));
> >> tmp_addr += adi_blksz;
> >> }
> > 
> > On arm64, a sequence similar to the above would live in the libc. So a
> > malloc() call will tag the memory and return the tagged address to 
> > thePre-coloring could easily be done by 
> > user.
> > 
> > We were not planning for a PROT_ADI/MTE but rather have MTE enabled for
> > all user memory ranges. We may revisit this before we upstream the MTE
> > support (probably some marginal benefit for the hardware not fetching
> > the tags from memory if we don't need to, e.g. code sections).
> > 
> > Given that we already have the TBI feature and with MTE enabled the top
> > byte is no longer ignored, we are planning for an explicit opt-in by the
> > user via prctl() to enable MTE.
> 
> OK. I had initially proposed enabling ADI for a process using prctl().
> Feedback I got was prctl was not a desirable interface and I ended up
> making mprotect() with PROT_ADI enable ADI on the process instead. Just
> something to keep in mind.

Thanks for the feedback. We'll keep this in mind when adding MTE
support. In the way we plan to deploy this, it would be a libc decision
to invoke the mmap() with the right flag.

This could actually simplify the automatic page faulting below brk(),
basically no tagged/coloured memory allowed implicitly. It needs
feedback from the bionic/glibc folk.

> >> With these semantics, giving mmap() or shamat() a tagged address is
> >> meaningless since no tags have been stored at the addresses mmap() will
> >> allocate and one can not store tags before memory range has been
> >> allocated. If we choose to allow tagged addresses to come into mmap()
> >> and shmat(), sparc code can strip the tags unconditionally and that may
> >> help simplify ABI and/or code.
> > 
> > We could say that with TBI (pre-MTE support), the top byte is actually
> > ignored on mmap(). Now, if you pass a MAP_FIXED with a tagged address,
> > should the user expect the same tagged address back or stripping the tag
> > is acceptable? If we want to keep the current mmap() semantics, I'd say
> > the same tag is returned. However, with MTE this also implies that the
> > memory was coloured.
> 
> Is assigning a tag aprivileged operationon ARM64? I am thinking not
> since you mentioned libc could do it in a loop for malloc'd memory.

Indeed it's not, the user can do it.

> mmap() can return the same tagged address but I am uneasy about kernel
> pre-coloring the pages. Database can mmap 100's of GB of memory. That is
> lot of work being offloaded to the kernel to pre-color the page even if
> done in batches as pages are faulted in.

For anonymous mmap() for example, the kernel would have to zero the
faulted in pages anyway. We can handle the colouring at the same time in
clear_user_page() (as I said below, we have to clear the colour anyway
from previous uses, so it's simply extending this to support something
other than tag/colour 0 by default with no additional overhead).

> > Since the user can probe the pre-existing colour in a faulted-in page
> > (either with some 'ldxa' instruction or by performing a tag-checked
> > access), the kernel should always pre-colour (even if colour 0) any
> > allocated page. There might not be an obvious security risk but I feel
> > uneasy about letting colours leak between address spaces (different user
> > processes or between kernel and user).
> 
> On sparc, tags 0 and 15 are special in that 0 means untagged memory and
> 15 means match any tag in the address. Colour 0 is the default for any
> newly faulted in page on sparc.

With MTE we don't have match-all/any tag in memory, only in the virtual
address/pointer. So if we turn on MTE for all pages and the user
accesses an address with a 0 tag, the underlying memory needs to be
coloured with the same 0 value.

> > Since we already need such loop in the kernel, we might as well allow
> > user space to require a certain 

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Catalin Marinas
Hi Khalid,

On Tue, May 28, 2019 at 05:33:04PM -0600, Khalid Aziz wrote:
> On Tue, 2019-05-28 at 16:40 +0100, Catalin Marinas wrote:
> > I think another aspect is how we define the ABI. Is allowing tags to
> > mlock() for example something specific to arm64 or would sparc ADI
> > need the same? In the absence of other architectures defining such
> > ABI, my preference would be to keep the wrappers in the arch code.
> > 
> > Assuming sparc won't implement untagged_addr(), we can place the
> > macros back in the generic code but, as per the review here, we need
> > to be more restrictive on where we allow tagged addresses. For
> > example, if mmap() gets a tagged address with MAP_FIXED, is it
> > expected to return the tag?
> 
> I would recommend against any ABI differences between ARM64 MTE/TBI and
> sparc ADI unless it simply can not be helped. My understanding of
> MTE/TBI is limited, so I will explain how sparc ADI works. On sparc, a
> tagged address has no meaning until following steps happen:

Before we go into the MTE/ADI similarities or differences, just to
clarify that TBI is something that we supported from the start of the
arm64 kernel port. TBI (top byte ignore) allows a user pointer to have
non-zero top byte and dereference it without causing a fault (the
hardware masks it out). The user/kernel ABI does not allow such tagged
pointers into the kernel, nor would the kernel return any such tagged
addresses.

With MTE (memory tagging extensions), the top-byte meaning is changed
from no longer being ignored to actually being checked against a tag in
the physical RAM (we call it allocation tag).

> 1. set the user mode PSTATE.mcde bit. This acts as the master switch to
> enable ADI for a process.
> 
> 2. set TTE.mcd bit on TLB entries that match the address range ADI is
> being enabled on.

Something close enough for MTE, with the difference that enabling it is
not a PSTATE bit but rather a system control bit (SCTLR_EL1 register),
so only the kernel can turn it on/off for the user.

> 3. Store version tag for the range of addresses userspace wants ADI
> enabled on using "stxa" instruction. These tags are stored in physical
> memory by the memory controller.

Do you have an "ldxa" instruction to load the tags from physical memory?

> Steps 1 and 2 are accomplished by userspace by calling mprotect() with
> PROT_ADI. Tags are set by storing tags in a loop, for example:
> 
> version = 10;
> tmp_addr = shmaddr;
> end = shmaddr + BUFFER_SIZE;
> while (tmp_addr < end) {
> asm volatile(
> "stxa %1, [%0]0x90\n\t"
> :
> : "r" (tmp_addr), "r" (version));
> tmp_addr += adi_blksz;
> }

On arm64, a sequence similar to the above would live in the libc. So a
malloc() call will tag the memory and return the tagged address to the
user.

We were not planning for a PROT_ADI/MTE but rather have MTE enabled for
all user memory ranges. We may revisit this before we upstream the MTE
support (probably some marginal benefit for the hardware not fetching
the tags from memory if we don't need to, e.g. code sections).

Given that we already have the TBI feature and with MTE enabled the top
byte is no longer ignored, we are planning for an explicit opt-in by the
user via prctl() to enable MTE.

> With these semantics, giving mmap() or shamat() a tagged address is
> meaningless since no tags have been stored at the addresses mmap() will
> allocate and one can not store tags before memory range has been
> allocated. If we choose to allow tagged addresses to come into mmap()
> and shmat(), sparc code can strip the tags unconditionally and that may
> help simplify ABI and/or code.

We could say that with TBI (pre-MTE support), the top byte is actually
ignored on mmap(). Now, if you pass a MAP_FIXED with a tagged address,
should the user expect the same tagged address back or stripping the tag
is acceptable? If we want to keep the current mmap() semantics, I'd say
the same tag is returned. However, with MTE this also implies that the
memory was coloured.

> > My thoughts on allowing tags (quick look):
> > 
> > brk - no
> > get_mempolicy - yes
> > madvise - yes
> > mbind - yes
> > mincore - yes
> > mlock, mlock2, munlock - yes
> > mmap - no (we may change this with MTE but not for TBI)
> > mmap_pgoff - not used on arm64
> > mprotect - yes
> > mremap - yes for old_address, no for new_address (on par with mmap)
> > msync - yes
> > munmap - probably no (mmap does not return tagged ptrs)
> > remap_file_pages - no (also deprecated syscall)
> > shmat, shmdt - shall we allow tagged addresses on shar

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-29 Thread Catalin Marinas
On Wed, May 29, 2019 at 01:42:25PM +0100, Dave P Martin wrote:
> On Tue, May 28, 2019 at 05:34:00PM +0100, Catalin Marinas wrote:
> > On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> > > On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> > > 
> > > [...]
> > > 
> > > > My thoughts on allowing tags (quick look):
> > > >
> > > > brk - no
> > > 
> > > [...]
> > > 
> > > > mlock, mlock2, munlock - yes
> > > > mmap - no (we may change this with MTE but not for TBI)
> > > 
> > > [...]
> > > 
> > > > mprotect - yes
> > > 
> > > I haven't following this discussion closely... what's the rationale for
> > > the inconsistencies here (feel free to refer me back to the discussion
> > > if it's elsewhere).
> > 
> > _My_ rationale (feel free to disagree) is that mmap() by default would
> > not return a tagged address (ignoring MTE for now). If it gets passed a
> > tagged address or a "tagged NULL" (for lack of a better name) we don't
> > have clear semantics of whether the returned address should be tagged in
> > this ABI relaxation. I'd rather reserve this specific behaviour if we
> > overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
> > for mremap(), at least on the new_address argument (not entirely sure
> > about old_address).
> > 
> > munmap() should probably follow the mmap() rules.
> > 
> > As for brk(), I don't see why the user would need to pass a tagged
> > address, we can't associate any meaning to this tag.
> > 
> > For the rest, since it's likely such addresses would have been tagged by
> > malloc() in user space, we should allow tagged pointers.
> 
> Those arguments seem reasonable.  We should try to capture this
> somewhere when documenting the ABI.
> 
> To be clear, I'm not sure that we should guarantee anywhere that a
> tagged pointer is rejected: rather the behaviour should probably be
> left unspecified.  Then we can tidy it up incrementally.
> 
> (The behaviour is unspecified today, in any case.)

What is specified (or rather de-facto ABI) today is that passing a user
address above TASK_SIZE (e.g. non-zero top byte) would fail in most
cases. If we relax this with the TBI we may end up with some de-facto
ABI before we actually get MTE hardware. Tightening it afterwards may be
slightly more problematic, although MTE needs to be an explicit opt-in.

IOW, I wouldn't want to unnecessarily relax the ABI if we don't need to.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-29 Thread Catalin Marinas
On Tue, May 28, 2019 at 11:11:26PM -0700, Christoph Hellwig wrote:
> On Tue, May 28, 2019 at 04:14:45PM +0200, Andrey Konovalov wrote:
> > Thanks for a lot of valuable input! I've read through all the replies
> > and got somewhat lost. What are the changes I need to do to this
> > series?
> > 
> > 1. Should I move untagging for memory syscalls back to the generic
> > code so other arches would make use of it as well, or should I keep
> > the arm64 specific memory syscalls wrappers and address the comments
> > on that patch?
> 
> It absolutely needs to move to common code.  Having arch code leads
> to pointless (often unintentional) semantic difference between
> architectures, and lots of boilerplate code.

That's fine by me as long as we agree on the semantics (which shouldn't
be hard; Khalid already following up). We should probably also move the
proposed ABI document [1] into a common place (or part of since we'll
have arm64-specifics like prctl() calls to explicitly opt in to memory
tagging).

[1] 
https://lore.kernel.org/lkml/20190318163533.26838-1-vincenzo.frasc...@arm.com/T/#u

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-28 Thread Catalin Marinas
On Thu, May 23, 2019 at 02:31:16PM -0700, Kees Cook wrote:
> syzkaller already attempts to randomly inject non-canonical and
> 0x addresses for user pointers in syscalls in an effort to
> find bugs like CVE-2017-5123 where waitid() via unchecked put_user() was
> able to write directly to kernel memory[1].
> 
> It seems that using TBI by default and not allowing a switch back to
> "normal" ABI without a reboot actually means that userspace cannot inject
> kernel pointers into syscalls any more, since they'll get universally
> stripped now. Is my understanding correct, here? i.e. exploiting
> CVE-2017-5123 would be impossible under TBI?
> 
> If so, then I think we should commit to the TBI ABI and have a boot
> flag to disable it, but NOT have a process flag, as that would allow
> attackers to bypass the masking. The only flag should be "TBI or MTE".
> 
> If so, can I get top byte masking for other architectures too? Like,
> just to strip high bits off userspace addresses? ;)

Just for fun, hack/attempt at your idea which should not interfere with
TBI. Only briefly tested on arm64 (and the s390 __TYPE_IS_PTR macro is
pretty weird ;)):

--8<-
diff --git a/arch/s390/include/asm/compat.h b/arch/s390/include/asm/compat.h
index 63b46e30b2c3..338455a74eff 100644
--- a/arch/s390/include/asm/compat.h
+++ b/arch/s390/include/asm/compat.h
@@ -11,9 +11,6 @@
 
 #include 
 
-#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p( \
-   typeof(0?(__force t)0:0ULL), u64))
-
 #define __SC_DELOUSE(t,v) ({ \
BUILD_BUG_ON(sizeof(t) > 4 && !__TYPE_IS_PTR(t)); \
(__force t)(__TYPE_IS_PTR(t) ? ((v) & 0x7fff) : (v)); \
diff --git a/include/linux/syscalls.h b/include/linux/syscalls.h
index e2870fe1be5b..b1b9fe8502da 100644
--- a/include/linux/syscalls.h
+++ b/include/linux/syscalls.h
@@ -119,8 +119,15 @@ struct io_uring_params;
 #define __TYPE_IS_L(t) (__TYPE_AS(t, 0L))
 #define __TYPE_IS_UL(t)(__TYPE_AS(t, 0UL))
 #define __TYPE_IS_LL(t) (__TYPE_AS(t, 0LL) || __TYPE_AS(t, 0ULL))
+#define __TYPE_IS_PTR(t) (!__builtin_types_compatible_p(typeof(0 ? (__force 
t)0 : 0ULL), u64))
 #define __SC_LONG(t, a) __typeof(__builtin_choose_expr(__TYPE_IS_LL(t), 0LL, 
0L)) a
+#ifdef CONFIG_64BIT
+#define __SC_CAST(t, a)(__TYPE_IS_PTR(t) \
+   ? (__force t) ((__u64)a & ~(1UL << 55)) \
+   : (__force t) a)
+#else
 #define __SC_CAST(t, a)(__force t) a
+#endif
 #define __SC_ARGS(t, a)a
 #define __SC_TEST(t, a) (void)BUILD_BUG_ON_ZERO(!__TYPE_IS_LL(t) && sizeof(t) 
> sizeof(long))
 

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Tue, May 28, 2019 at 04:56:45PM +0100, Dave P Martin wrote:
> On Tue, May 28, 2019 at 04:40:58PM +0100, Catalin Marinas wrote:
> 
> [...]
> 
> > My thoughts on allowing tags (quick look):
> >
> > brk - no
> 
> [...]
> 
> > mlock, mlock2, munlock - yes
> > mmap - no (we may change this with MTE but not for TBI)
> 
> [...]
> 
> > mprotect - yes
> 
> I haven't following this discussion closely... what's the rationale for
> the inconsistencies here (feel free to refer me back to the discussion
> if it's elsewhere).

_My_ rationale (feel free to disagree) is that mmap() by default would
not return a tagged address (ignoring MTE for now). If it gets passed a
tagged address or a "tagged NULL" (for lack of a better name) we don't
have clear semantics of whether the returned address should be tagged in
this ABI relaxation. I'd rather reserve this specific behaviour if we
overload the non-zero tag meaning of mmap() for MTE. Similar reasoning
for mremap(), at least on the new_address argument (not entirely sure
about old_address).

munmap() should probably follow the mmap() rules.

As for brk(), I don't see why the user would need to pass a tagged
address, we can't associate any meaning to this tag.

For the rest, since it's likely such addresses would have been tagged by
malloc() in user space, we should allow tagged pointers.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Tue, May 28, 2019 at 03:54:11PM +0100, Andrew Murray wrote:
> On Mon, May 27, 2019 at 03:37:20PM +0100, Catalin Marinas wrote:
> > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > > 
> > > This patch allows tagged pointers to be passed to the following memory
> > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > > remap_file_pages, shmat and shmdt.
> > > 
> > > This is done by untagging pointers passed to these syscalls in the
> > > prologues of their handlers.
> > > 
> > > Signed-off-by: Andrey Konovalov 
> > 
> > Actually, I don't think any of these wrappers get called (have you
> > tested this patch?). Following commit 4378a7d4be30 ("arm64: implement
> > syscall wrappers"), I think we have other macro names for overriding the
> > sys_* ones.
> 
> What is the value in adding these wrappers?

Not much value, initially proposed just to keep the core changes small.
I'm fine with moving them back in the generic code (but see below).

I think another aspect is how we define the ABI. Is allowing tags to
mlock() for example something specific to arm64 or would sparc ADI need
the same? In the absence of other architectures defining such ABI, my
preference would be to keep the wrappers in the arch code.

Assuming sparc won't implement untagged_addr(), we can place the macros
back in the generic code but, as per the review here, we need to be more
restrictive on where we allow tagged addresses. For example, if mmap()
gets a tagged address with MAP_FIXED, is it expected to return the tag?

My thoughts on allowing tags (quick look):

brk - no
get_mempolicy - yes
madvise - yes
mbind - yes
mincore - yes
mlock, mlock2, munlock - yes
mmap - no (we may change this with MTE but not for TBI)
mmap_pgoff - not used on arm64
mprotect - yes
mremap - yes for old_address, no for new_address (on par with mmap)
msync - yes
munmap - probably no (mmap does not return tagged ptrs)
remap_file_pages - no (also deprecated syscall)
shmat, shmdt - shall we allow tagged addresses on shared memory?

The above is only about the TBI ABI while ignoring hardware MTE. For the
latter, we may want to change the mmap() to allow pre-colouring on page
fault which means that munmap()/mprotect() should also support tagged
pointers. Possibly mremap() as well but we need to decide whether it
should allow re-colouring the page (probably no, in which case
old_address and new_address should have the same tag). For some of these
we'll end up with arm64 specific wrappers again, unless sparc ADI adopts
exactly the same ABI restrictions.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-28 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
>  /*
>   * Wrappers to pass the pt_regs argument.
>   */
>  #define sys_personality  sys_arm64_personality
> +#define sys_mmap_pgoff   sys_arm64_mmap_pgoff
> +#define sys_mremap   sys_arm64_mremap
> +#define sys_munmap   sys_arm64_munmap
> +#define sys_brk  sys_arm64_brk
> +#define sys_get_mempolicysys_arm64_get_mempolicy
> +#define sys_madvise  sys_arm64_madvise
> +#define sys_mbindsys_arm64_mbind
> +#define sys_mlocksys_arm64_mlock
> +#define sys_mlock2   sys_arm64_mlock2
> +#define sys_munlock  sys_arm64_munlock
> +#define sys_mprotect sys_arm64_mprotect
> +#define sys_msyncsys_arm64_msync
> +#define sys_mincore  sys_arm64_mincore
> +#define sys_remap_file_pages sys_arm64_remap_file_pages
> +#define sys_shmatsys_arm64_shmat
> +#define sys_shmdtsys_arm64_shmdt

This hunk should be (I sent a separate patch for sys_personality):

@@ -160,23 +163,23 @@ SYSCALL_DEFINE1(arm64_shmdt, char __user *, shmaddr)
 /*
  * Wrappers to pass the pt_regs argument.
  */
-#define sys_personalitysys_arm64_personality
-#define sys_mmap_pgoff sys_arm64_mmap_pgoff
-#define sys_mremap sys_arm64_mremap
-#define sys_munmap sys_arm64_munmap
-#define sys_brksys_arm64_brk
-#define sys_get_mempolicy  sys_arm64_get_mempolicy
-#define sys_madvisesys_arm64_madvise
-#define sys_mbind  sys_arm64_mbind
-#define sys_mlock  sys_arm64_mlock
-#define sys_mlock2 sys_arm64_mlock2
-#define sys_munlocksys_arm64_munlock
-#define sys_mprotect   sys_arm64_mprotect
-#define sys_msync  sys_arm64_msync
-#define sys_mincoresys_arm64_mincore
-#define sys_remap_file_pages   sys_arm64_remap_file_pages
-#define sys_shmat  sys_arm64_shmat
-#define sys_shmdt  sys_arm64_shmdt
+#define __arm64_sys_personality__arm64_sys_arm64_personality
+#define __arm64_sys_mmap_pgoff __arm64_sys_arm64_mmap_pgoff
+#define __arm64_sys_mremap __arm64_sys_arm64_mremap
+#define __arm64_sys_munmap __arm64_sys_arm64_munmap
+#define __arm64_sys_brk__arm64_sys_arm64_brk
+#define __arm64_sys_get_mempolicy  __arm64_sys_arm64_get_mempolicy
+#define __arm64_sys_madvise__arm64_sys_arm64_madvise
+#define __arm64_sys_mbind  __arm64_sys_arm64_mbind
+#define __arm64_sys_mlock  __arm64_sys_arm64_mlock
+#define __arm64_sys_mlock2 __arm64_sys_arm64_mlock2
+#define __arm64_sys_munlock__arm64_sys_arm64_munlock
+#define __arm64_sys_mprotect   __arm64_sys_arm64_mprotect
+#define __arm64_sys_msync  __arm64_sys_arm64_msync
+#define __arm64_sys_mincore__arm64_sys_arm64_mincore
+#define __arm64_sys_remap_file_pages   __arm64_sys_arm64_remap_file_pages
+#define __arm64_sys_shmat  __arm64_sys_arm64_shmat
+#define __arm64_sys_shmdt  __arm64_sys_arm64_shmdt
 
 asmlinkage long sys_ni_syscall(const struct pt_regs *);
 #define __arm64_sys_ni_syscall sys_ni_syscall

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-27 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch allows tagged pointers to be passed to the following memory
> syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> remap_file_pages, shmat and shmdt.
> 
> This is done by untagging pointers passed to these syscalls in the
> prologues of their handlers.
> 
> Signed-off-by: Andrey Konovalov 

Actually, I don't think any of these wrappers get called (have you
tested this patch?). Following commit 4378a7d4be30 ("arm64: implement
syscall wrappers"), I think we have other macro names for overriding the
sys_* ones.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-27 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> +SYSCALL_DEFINE2(arm64_mlock2, unsigned long, start, size_t, len)
> +{
> + start = untagged_addr(start);
> + return ksys_mlock(start, len, VM_LOCKED);
> +}

Copy/paste error: sys_mlock2() has 3 arguments and should call
ksys_mlock2().

Still tracking down an LTP failure on test mlock01.

-- 
Catalin


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-25 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> +SYSCALL_DEFINE5(arm64_get_mempolicy, int __user *, policy,
> + unsigned long __user *, nmask, unsigned long, maxnode,
> + unsigned long, addr, unsigned long, flags)
> +{
> + addr = untagged_addr(addr);
> + return ksys_get_mempolicy(policy, nmask, maxnode, addr, flags);
> +}
[...]
> +SYSCALL_DEFINE6(arm64_mbind, unsigned long, start, unsigned long, len,
> + unsigned long, mode, const unsigned long __user *, nmask,
> + unsigned long, maxnode, unsigned int, flags)
> +{
> + start = untagged_addr(start);
> + return ksys_mbind(start, len, mode, nmask, maxnode, flags);
> +}

The kernel fails to build with CONFIG_NUMA disabled because the above
are in mm/mempolicy.c which is no longer compiled in.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-24 Thread Catalin Marinas
On Thu, May 23, 2019 at 02:31:16PM -0700, Kees Cook wrote:
> On Thu, May 23, 2019 at 06:43:46PM +0100, Catalin Marinas wrote:
> > On Thu, May 23, 2019 at 09:38:19AM -0700, Kees Cook wrote:
> > > What about testing tools that intentionally insert high bits for syscalls
> > > and are _expecting_ them to fail? It seems the TBI series will break them?
> > > In that case, do we need to opt into TBI as well?
> > 
> > If there are such tools, then we may need a per-process control. It's
> > basically an ABI change.
> 
> syzkaller already attempts to randomly inject non-canonical and
> 0x addresses for user pointers in syscalls in an effort to
> find bugs like CVE-2017-5123 where waitid() via unchecked put_user() was
> able to write directly to kernel memory[1].
> 
> It seems that using TBI by default and not allowing a switch back to
> "normal" ABI without a reboot actually means that userspace cannot inject
> kernel pointers into syscalls any more, since they'll get universally
> stripped now. Is my understanding correct, here? i.e. exploiting
> CVE-2017-5123 would be impossible under TBI?

Unless the kernel is also using TBI (khwasan), in which case masking out
the top byte wouldn't help. Anyway, as per this discussion, we want the
tagged pointer to remain intact all the way to put_user(), so nothing
gets masked out. I don't think this would have helped with the waitid()
bug.

> If so, then I think we should commit to the TBI ABI and have a boot
> flag to disable it, but NOT have a process flag, as that would allow
> attackers to bypass the masking. The only flag should be "TBI or MTE".
> 
> If so, can I get top byte masking for other architectures too? Like,
> just to strip high bits off userspace addresses? ;)

But you didn't like my option 2 shim proposal which strips the tag on
kernel entry because it lowers the value of MTE ;).

> (Oh, in looking I see this is implemented with sign-extension... why
> not just a mask? So it'll either be valid userspace address or forced
> into the non-canonical range?)

The TTBR0/1 selection on memory accesses is done based on bit 63 if TBI
is disabled and bit 55 when enabled. Sign-extension allows us to use the
same macro for both user and kernel tagged pointers. With MTE tag 0
would be match-all for TTBR0 and 0xff for TTBR1 (so that we don't modify
the virtual address space of the kernel; I need to check the latest spec
to be sure). Note that the VA space for both user and kernel is limited
to 52-bit architecturally so, on access, bits 55..52 must be the same, 0
or 1, otherwise you get a fault.

Since the syzkaller tests would also need to set bits 55-52 (actually 48
for kernel addresses, we haven't merged the 52-bit kernel VA patches
yet) to hit a valid kernel address, I don't think ignoring the top byte
makes much difference to the expected failure scenario.

> > > Alright, the tl;dr appears to be:
> > > - you want more assurances that we can find __user stripping in the
> > >   kernel more easily. (But this seems like a parallel problem.)
> > 
> > Yes, and that we found all (most) cases now. The reason I don't see it
> > as a parallel problem is that, as maintainer, I promise an ABI to user
> > and I'd rather stick to it. I don't want, for example, ncurses to stop
> > working because of some ioctl() rejecting tagged pointers.
> 
> But this is what I don't understand: it would need to be ncurses _using
> TBI_, that would stop working (having started to work before, but then
> regress due to a newly added one-off bug). Regular ncurses will be fine
> because it's not using TBI. So The Golden Rule isn't violated,

Once we introduced TBI and the libc starts tagging heap allocations,
this becomes the new "regular" user space behaviour (i.e. using TBI). So
a new bug would break the golden rule. It could also be an old bug that
went unnoticed (i.e. you changed the graphics card and its driver gets
confused by tagged pointers coming from user-space).

> and by definition, it's a specific regression caused by some bug
> (since TBI would have had to have worked _before_ in the situation to
> be considered a regression now). Which describes the normal path for
> kernel development... add feature, find corner cases where it doesn't
> work, fix them, encounter new regressions, fix those, repeat forever.
> 
> > If it's just the occasional one-off bug I'm fine to deal with it. But
> > no-one convinced me yet that this is the case.
> 
> You believe there still to be some systemic cases that haven't been
> found yet? And even if so -- isn't it better to work on that
> incrementally?

I want some way to systematically identify potential issues (sparse?).
Since problems are most likely in drivers, I don't have all device

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-24 Thread Catalin Marinas
On Thu, May 23, 2019 at 03:49:05PM -0600, Khalid Aziz wrote:
> On 5/23/19 2:11 PM, Catalin Marinas wrote:
> > On Thu, May 23, 2019 at 11:51:40AM -0600, Khalid Aziz wrote:
> >> On 5/21/19 6:04 PM, Kees Cook wrote:
> >>> As an aside: I think Sparc ADI support in Linux actually side-stepped
> >>> this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must
> >>> be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI
> >>> for kernel code.") I think this was a mistake we should not repeat for
> >>> arm64 (we do seem to be at least in agreement about this, I think).
> >>>
> >>> [1] https://lore.kernel.org/patchwork/patch/654481/
> >>
> >> That is a very early version of the sparc ADI patch. Support for tagged
> >> addresses in syscalls was added in later versions and is in the patch
> >> that is in the kernel.
> > 
> > I tried to figure out but I'm not familiar with the sparc port. How did
> > you solve the tagged address going into various syscall implementations
> > in the kernel (e.g. sys_write)? Is the tag removed on kernel entry or it
> > ends up deeper in the core code?
> 
> Another spot I should point out in ADI patch - Tags are not stored in
> VMAs and IOMMU does not support ADI tags on M7. ADI tags are stripped
> before userspace addresses are passed to IOMMU in the following snippet
> from the patch:
> 
> diff --git a/arch/sparc/mm/gup.c b/arch/sparc/mm/gup.c
> index 5335ba3c850e..357b6047653a 100644
> --- a/arch/sparc/mm/gup.c
> +++ b/arch/sparc/mm/gup.c
> @@ -201,6 +202,24 @@ int __get_user_pages_fast(unsigned long start, int
> nr_pages
> , int write,
> pgd_t *pgdp;
> int nr = 0;
> 
> +#ifdef CONFIG_SPARC64
> +   if (adi_capable()) {
> +   long addr = start;
> +
> +   /* If userspace has passed a versioned address, kernel
> +* will not find it in the VMAs since it does not store
> +* the version tags in the list of VMAs. Storing version
> +* tags in list of VMAs is impractical since they can be
> +* changed any time from userspace without dropping into
> +* kernel. Any address search in VMAs will be done with
> +* non-versioned addresses. Ensure the ADI version bits
> +* are dropped here by sign extending the last bit before
> +* ADI bits. IOMMU does not implement version tags.
> +*/
> +   addr = (addr << (long)adi_nbits()) >> (long)adi_nbits();
> +   start = addr;
> +   }
> +#endif
> start &= PAGE_MASK;
> addr = start;
> len = (unsigned long) nr_pages << PAGE_SHIFT;

Thanks Khalid. I missed that sparc does not enable HAVE_GENERIC_GUP, so
you fix this case here. If we add the generic untagged_addr() macro in
the generic code, I think sparc can start making use of it rather than
open-coding the shifts.

There are a few other other places where tags can leak and the core code
would get confused (for example, madvise()). I presume your user space
doesn't exercise them. On arm64 we plan to just allow the C library to
tag any new memory allocation, so those core code paths would need to be
covered.

And similarly, devices, IOMMU, any DMA would ignore tags.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
Hi Khalid,

On Thu, May 23, 2019 at 11:51:40AM -0600, Khalid Aziz wrote:
> On 5/21/19 6:04 PM, Kees Cook wrote:
> > As an aside: I think Sparc ADI support in Linux actually side-stepped
> > this[1] (i.e. chose "solution 1"): "All addresses passed to kernel must
> > be non-ADI tagged addresses." (And sadly, "Kernel does not enable ADI
> > for kernel code.") I think this was a mistake we should not repeat for
> > arm64 (we do seem to be at least in agreement about this, I think).
> > 
> > [1] https://lore.kernel.org/patchwork/patch/654481/
> 
> That is a very early version of the sparc ADI patch. Support for tagged
> addresses in syscalls was added in later versions and is in the patch
> that is in the kernel.

I tried to figure out but I'm not familiar with the sparc port. How did
you solve the tagged address going into various syscall implementations
in the kernel (e.g. sys_write)? Is the tag removed on kernel entry or it
ends up deeper in the core code?

Thanks.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Thu, May 23, 2019 at 09:38:19AM -0700, Kees Cook wrote:
> On Thu, May 23, 2019 at 03:44:49PM +0100, Catalin Marinas wrote:
> > There is also the obvious requirement which I didn't mention: new user
> > space continues to run on new/subsequent kernel versions. That's one of
> > the points of contention for this series (ignoring MTE) with the
> > maintainers having to guarantee this without much effort. IOW, do the
> > 500K+ new lines in a subsequent kernel version break any user space out
> > there? I'm only talking about the relaxed TBI ABI. Are the usual LTP,
> > syskaller sufficient? Better static analysis would definitely help.
> 
> We can't have perfect coverage of people actively (or accidentally)
> working to trick static analyzers (and the compiler) into "forgetting"
> about a __user annotation. We can certainly improve analysis (I see
> the other sub-thread on that), but I'd like that work not to block
> this series.

I don't want to block this series either as it's a prerequisite for MTE
but at the moment I'm not confident we tracked down all the places where
things can break and what the future effort will be to analyse new
kernel changes.

We've made lots of progress since March last year (I think when the
first version was posted) by both identifying new places in the kernel
that need untagging and limiting the ranges that user space can tag
(e.g. an mmap() on a device should not allow tagged pointers). Can we
say we are done or more investigation is needed?

> What on this front would you be comfortable with? Given it's a new
> feature isn't it sufficient to have a CONFIG (and/or boot option)?

I'd rather avoid re-building kernels. A boot option would do, unless we
see value in a per-process (inherited) personality or prctl. The
per-process option has the slight advantage that I can boot my machine
normally while being able to run LTP with a relaxed ABI (and a new libc
which tags pointers). I admit I don't have a very strong argument on a
per-process opt-in.

Another option would be a sysctl control together with a cmdline option.

> > Or, if we ever want MTE to be turned on by default (i.e. tag checking),
> > even if everything is tagged with 0, we have to disallow TBI for user
> > and this includes hwasan. There were a small number of programs using
> > the TBI (I think some JavaScript compilers tried this). But now we are
> > bringing in the hwasan support and this can be a large user base. Shall
> > we add an ELF note for such binaries that use TBI/hwasan?
> 
> Just to be clear, you say "disallow TBI for user" -- you mean a
> particular process, yes? i.e. there is no architectural limitation that
> says once we're using MTE nothing can switch to TBI. i.e. a process is
> either doing MTE or TBI (or nothing, but that's the same as TBI).

I may have answered this below. The architecture does not allow TBI
(i.e. faults) if MTE is enabled for a process and address range.

> > > So there needs to be some way to let the kernel know which of three
> > > things it should be doing:
> > > 1- leaving userspace addresses as-is (present)
> > > 2- wiping the top bits before using (this series)
> > 
> > (I'd say tolerating rather than wiping since get_user still uses the tag
> > in the current series)
> > 
> > The current series does not allow any choice between 1 and 2, the
> > default ABI basically becomes option 2.
> 
> What about testing tools that intentionally insert high bits for syscalls
> and are _expecting_ them to fail? It seems the TBI series will break them?
> In that case, do we need to opt into TBI as well?

If there are such tools, then we may need a per-process control. It's
basically an ABI change.

> > > 3- wiping the top bits for most things, but retaining them for MTE as
> > >needed (the future)
> > 
> > 2 and 3 are not entirely compatible as a tagged pointer may be checked
> > against the memory colour by the hardware. So you can't have hwasan
> > binary with MTE enabled.
> 
> Right: a process must be either MTE or TBI, not both.

Indeed.

> > > I expect MTE to be the "default" in the future. Once a system's libc has
> > > grown support for it, everything will be trying to use MTE. TBI will be
> > > the special case (but TBI is effectively a prerequisite).
> > 
> > The kernel handling of tagged pointers is indeed a prerequisite. The ABI
> > distinction between the above 2 and 3 needs to be solved.
> 
> Does that need solving now or when the MTE series appears? As there is
> no reason to distinguish between "normal" and "TBI", that doesn't seem
> to need solving at this point?

We don't need to solve 2 vs 3 as long as op

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Thu, May 23, 2019 at 08:44:12AM -0700, enh wrote:
> On Thu, May 23, 2019 at 7:45 AM Catalin Marinas  
> wrote:
> > On Wed, May 22, 2019 at 01:47:36PM -0700, Kees Cook wrote:
> > > For userspace, how would a future binary choose TBI over MTE? If it's
> > > a library issue, we can't use an ELF bit, since the choice may be
> > > "late" after ELF load (this implies the need for a prctl().) If it's
> > > binary-only ("built with HWKASan") then an ELF bit seems sufficient.
> > > And without the marking, I'd expect the kernel to enforce MTE when
> > > there are high bits.
> >
> > The current plan is that a future binary issues a prctl(), after
> > checking the HWCAP_MTE bit (as I replied to Elliot, the MTE instructions
> > are not in the current NOP space). I'd expect this to be done by the
> > libc or dynamic loader under the assumption that the binaries it loads
> > do _not_ use the top pointer byte for anything else.
> 
> yeah, it sounds like to support hwasan and MTE, the dynamic linker
> will need to not use either itself.
> 
> > With hwasan compiled objects this gets more confusing (any ELF note
> > to identify them?).
> 
> no, at the moment code that wants to know checks for the presence of
> __hwasan_init. (and bionic doesn't actually look at any ELF notes
> right now.) but we can always add something if we need to.

It's a userspace decision to make. In the kernel, we are proposing that
bionic calls a prctl() to enable MTE explicitly. It could first check
for the presence of __hwasan_init.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Thu, May 23, 2019 at 11:42:57AM +0100, Dave P Martin wrote:
> On Wed, May 22, 2019 at 09:20:52PM -0300, Jason Gunthorpe wrote:
> > On Wed, May 22, 2019 at 02:49:28PM +0100, Dave Martin wrote:
> > > If multiple people will care about this, perhaps we should try to
> > > annotate types more explicitly in SYSCALL_DEFINEx() and ABI data
> > > structures.
> > > 
> > > For example, we could have a couple of mutually exclusive modifiers
> > > 
> > > T __object *
> > > T __vaddr * (or U __vaddr)
> > > 
> > > In the first case the pointer points to an object (in the C sense)
> > > that the call may dereference but not use for any other purpose.
> > 
> > How would you use these two differently?
> > 
> > So far the kernel has worked that __user should tag any pointer that
> > is from userspace and then you can't do anything with it until you
> > transform it into a kernel something
> 
> Ultimately it would be good to disallow casting __object pointers execpt
> to compatible __object pointer types, and to make get_user etc. demand
> __object.
> 
> __vaddr pointers / addresses would be freely castable, but not to
> __object and so would not be dereferenceable even indirectly.

I think it gets too complicated and there are ambiguous cases that we
may not be able to distinguish. For example copy_from_user() may be used
to copy a user data structure into the kernel, hence __object would
work, while the same function may be used to copy opaque data to a file,
so __vaddr may be a better option (unless I misunderstood your
proposal).

We currently have T __user * and I think it's a good starting point. The
prior attempt [1] was shut down because it was just hiding the cast
using __force. We'd need to work through those cases again and rather
start changing the function prototypes to avoid unnecessary casting in
the callers (e.g. get_user_pages(void __user *) or come up with a new
type) while changing the explicit casting to a macro where it needs to
be obvious that we are converting a user pointer, potentially typed
(tagged), to an untyped address range. We may need a user_ptr_to_ulong()
macro or similar (it seems that we have a u64_to_user_ptr, wasn't aware
of it).

It may actually not be far from what you suggested but I'd keep the
current T __user * to denote possible dereference.

[1] 
https://lore.kernel.org/lkml/5d54526e5ff2e5ad63d0dfdd9ab17cf359afa4f2.1535629099.git.andreyk...@google.com/

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Wed, May 22, 2019 at 09:58:22AM -0700, enh wrote:
> i was questioning the argument about the ioctl issues, and saying that
> from my perspective, untagging bugs are not really any different than
> any other kind of kernel bug.

Once this series gets in, they are indeed just kernel bugs. What I want
is an easier way to identify them, ideally before they trigger in the
field.

> i still don't see how this isn't just a regular testing/CI issue, the
> same as any other kind of kernel bug. it's already the case that i can
> get a bad kernel...

The testing would have a smaller code coverage in terms of drivers,
filesystems than something like a static checker (though one does not
exclude the other).

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Wed, May 22, 2019 at 12:21:27PM -0700, Kees Cook wrote:
> If a process wants to not tag, that's also up to the allocator where
> it can decide not to ask the kernel, and just not tag. Nothing breaks in
> userspace if a process is NOT tagging and untagged_addr() exists or is
> missing. This, I think, is the core way this doesn't trip over the
> golden rule: an old system image will run fine (because it's not
> tagging). A *new* system may encounter bugs with tagging because it's a
> new feature: this is The Way Of Things. But we don't break old userspace
> because old userspace isn't using tags.

With this series and hwasan binaries, at some point in the future they
will be considered "old userspace" and they do use pointer tags which
expect to be ignored by both the hardware and the kernel. MTE breaks
this assumption.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Wed, May 22, 2019 at 01:47:36PM -0700, Kees Cook wrote:
> On Wed, May 22, 2019 at 05:35:27PM +0100, Catalin Marinas wrote:
> > The two hard requirements I have for supporting any new hardware feature
> > in Linux are (1) a single kernel image binary continues to run on old
> > hardware while making use of the new feature if available and (2) old
> > user space continues to run on new hardware while new user space can
> > take advantage of the new feature.
> 
> Agreed! And I think the series meets these requirements, yes?

Yes. I mentioned this just to make sure people don't expect different
kernel builds for different hardware features.

There is also the obvious requirement which I didn't mention: new user
space continues to run on new/subsequent kernel versions. That's one of
the points of contention for this series (ignoring MTE) with the
maintainers having to guarantee this without much effort. IOW, do the
500K+ new lines in a subsequent kernel version break any user space out
there? I'm only talking about the relaxed TBI ABI. Are the usual LTP,
syskaller sufficient? Better static analysis would definitely help.

> > For MTE, we just can't enable it by default since there are applications
> > who use the top byte of a pointer and expect it to be ignored rather
> > than failing with a mismatched tag. Just think of a hwasan compiled
> > binary where TBI is expected to work and you try to run it with MTE
> > turned on.
> 
> Ah! Okay, here's the use-case I wasn't thinking of: the concern is TBI
> conflicting with MTE. And anything that starts using TBI suddenly can't
> run in the future because it's being interpreted as MTE bits? (Is that
> the ABI concern?

That's another aspect to figure out when we add the MTE support. I don't
think we'd be able to do this without an explicit opt-in by the user.

Or, if we ever want MTE to be turned on by default (i.e. tag checking),
even if everything is tagged with 0, we have to disallow TBI for user
and this includes hwasan. There were a small number of programs using
the TBI (I think some JavaScript compilers tried this). But now we are
bringing in the hwasan support and this can be a large user base. Shall
we add an ELF note for such binaries that use TBI/hwasan?

This series is still required for MTE but we may decide not to relax the
ABI blindly, therefore the opt-in (prctl) or personality idea.

> I feel like we got into the weeds about ioctl()s and one-off bugs...)

This needs solving as well. Most driver developers won't know why
untagged_addr() is needed unless we have more rigorous types or type
annotations and a tool to check them (we should probably revive the old
sparse thread).

> So there needs to be some way to let the kernel know which of three
> things it should be doing:
> 1- leaving userspace addresses as-is (present)
> 2- wiping the top bits before using (this series)

(I'd say tolerating rather than wiping since get_user still uses the tag
in the current series)

The current series does not allow any choice between 1 and 2, the
default ABI basically becomes option 2.

> 3- wiping the top bits for most things, but retaining them for MTE as
>needed (the future)

2 and 3 are not entirely compatible as a tagged pointer may be checked
against the memory colour by the hardware. So you can't have hwasan
binary with MTE enabled.

> I expect MTE to be the "default" in the future. Once a system's libc has
> grown support for it, everything will be trying to use MTE. TBI will be
> the special case (but TBI is effectively a prerequisite).

The kernel handling of tagged pointers is indeed a prerequisite. The ABI
distinction between the above 2 and 3 needs to be solved.

> AFAICT, the only difference I see between 2 and 3 will be the tag handling
> in usercopy (all other places will continue to ignore the top bits). Is
> that accurate?

Yes, mostly (for the kernel). If MTE is enabled by default for a hwasan
binary, it will SEGFAULT (either in user space or in kernel uaccess).
How does the kernel choose between 2 and 3?

> Is "1" a per-process state we want to keep? (I assume not, but rather it
> is available via no TBI/MTE CONFIG or a boot-time option, if at all?)

Possibly, though not necessarily per process. For testing or if
something goes wrong during boot, a command line option with a static
label would do. The AT_FLAGS bit needs to be checked by user space. My
preference would be per-process.

> To choose between "2" and "3", it seems we need a per-process flag to
> opt into TBI (and out of MTE).

Or leave option 2 the default and get it to opt in to MTE.

> For userspace, how would a future binary choose TBI over MTE? If it's
> a library issue, we can't use an ELF bit, since the choice may be
> "late" after ELF load (this implies the need for a prct

Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-23 Thread Catalin Marinas
On Wed, May 22, 2019 at 02:16:57PM -0700, Evgenii Stepanov wrote:
> On Wed, May 22, 2019 at 4:49 AM Catalin Marinas  
> wrote:
> > On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> > > This patch is a part of a series that extends arm64 kernel ABI to allow to
> > > pass tagged user pointers (with the top byte set to something else other
> > > than 0x00) as syscall arguments.
> > >
> > > This patch allows tagged pointers to be passed to the following memory
> > > syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> > > mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> > > remap_file_pages, shmat and shmdt.
> > >
> > > This is done by untagging pointers passed to these syscalls in the
> > > prologues of their handlers.
> >
> > I'll go through them one by one to see if we can tighten the expected
> > ABI while having the MTE in mind.
> >
> > > diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
> > > index b44065fb1616..933bb9f3d6ec 100644
> > > --- a/arch/arm64/kernel/sys.c
> > > +++ b/arch/arm64/kernel/sys.c
> > > @@ -35,10 +35,33 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned 
> > > long, len,
> > >  {
> > >   if (offset_in_page(off) != 0)
> > >   return -EINVAL;
> > > -
> > > + addr = untagged_addr(addr);
> > >   return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> 
> > > PAGE_SHIFT);
> > >  }
> >
> > If user passes a tagged pointer to mmap() and the address is honoured
> > (or MAP_FIXED is given), what is the expected return pointer? Does it
> > need to be tagged with the value from the hint?
> 
> For HWASan the most convenient would be to use the tag from the hint.
> But since in the TBI (not MTE) mode the kernel has no idea what
> meaning userspace assigns to pointer tags, perhaps it should not try
> to guess, and should return raw (zero-tagged) address instead.

Then, just to relax the ABI for hwasan, shall we simply disallow tagged
pointers on mmap() arguments? We can leave them in for
mremap(old_address), madvise().

> > With MTE, we may want to use this as a request for the default colour of
> > the mapped pages (still under discussion).
> 
> I like this - and in that case it would make sense to return the
> pointer that can be immediately dereferenced without crashing the
> process, i.e. with the matching tag.

This came up from the Android investigation work where large memory
allocations (using mmap) could be more efficiently pre-tagged by the
kernel on page fault. Not sure about the implementation details yet.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-23 Thread Catalin Marinas
On Wed, May 22, 2019 at 04:09:31PM -0700, enh wrote:
> On Wed, May 22, 2019 at 4:03 PM Evgenii Stepanov  wrote:
> > On Wed, May 22, 2019 at 1:47 PM Kees Cook  wrote:
> > > On Wed, May 22, 2019 at 05:35:27PM +0100, Catalin Marinas wrote:
> > > > I would also expect the C library or dynamic loader to check for the
> > > > presence of a HWCAP_MTE bit before starting to tag memory allocations,
> > > > otherwise it would get SIGILL on the first MTE instruction it tries to
> > > > execute.
> > >
> > > I've got the same question as Elliot: aren't MTE instructions just NOP
> > > to older CPUs? I.e. if the CPU (or kernel) don't support it, it just
> > > gets entirely ignored: checking is only needed to satisfy curiosity
> > > or behavioral expectations.
> >
> > MTE instructions are not NOP. Most of them have side effects (changing
> > register values, zeroing memory).
> 
> no, i meant "they're encoded in a space that was previously no-ops, so
> running on MTE code on old hardware doesn't cause SIGILL".

It does result in SIGILL, there wasn't enough encoding left in the NOP
space for old/current CPU implementations (in hindsight, we should have
reserved a bigger NOP space).

As Evgenii said, the libc needs to be careful when tagging the heap as
it would cause a SIGILL if the HWCAP_MTE is not set. The standard
application doesn't need to be recompiled as it would not issue MTE
colouring instructions, just standard LDR/STR.

Stack tagging is problematic if you want to colour each frame
individually, the function prologue would need the non-NOP MTE
instructions. The best we can do here is just having the (thread) stacks
of different colours.

-- 
Catalin


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-22 Thread Catalin Marinas
On Wed, May 22, 2019 at 08:30:21AM -0700, enh wrote:
> On Wed, May 22, 2019 at 3:11 AM Catalin Marinas  
> wrote:
> > On Tue, May 21, 2019 at 05:04:39PM -0700, Kees Cook wrote:
> > > I just want to make sure I fully understand your concern about this
> > > being an ABI break, and I work best with examples. The closest situation
> > > I can see would be:
> > >
> > > - some program has no idea about MTE
> >
> > Apart from some libraries like libc (and maybe those that handle
> > specific device ioctls), I think most programs should have no idea about
> > MTE. I wouldn't expect programmers to have to change their app just
> > because we have a new feature that colours heap allocations.
> 
> obviously i'm biased as a libc maintainer, but...
> 
> i don't think it helps to move this to libc --- now you just have an
> extra dependency where to have a guaranteed working system you need to
> update your kernel and libc together. (or at least update your libc to
> understand new ioctls etc _before_ you can update your kernel.)

That's not what I meant (or I misunderstood you). If we have a relaxed
ABI in the kernel and a libc that returns tagged pointers on malloc() I
wouldn't expect the programmer to do anything different in the
application code like explicit untagging. Basically the program would
continue to run unmodified irrespective of whether you use an old libc
without tagged pointers or a new one which tags heap allocations.

What I do expect is that the libc checks for the presence of the relaxed
ABI, currently proposed as an AT_FLAGS bit (for MTE we'd have a
HWCAP_MTE), and only tag the malloc() pointers if the kernel supports
the relaxed ABI. As you said, you shouldn't expect that the C library
and kernel are upgraded together, so they should be able to work in any
new/old version combination.

> > > The trouble I see with this is that it is largely theoretical and
> > > requires part of userspace to collude to start using a new CPU feature
> > > that tickles a bug in the kernel. As I understand the golden rule,
> > > this is a bug in the kernel (a missed ioctl() or such) to be fixed,
> > > not a global breaking of some userspace behavior.
> >
> > Yes, we should follow the rule that it's a kernel bug but it doesn't
> > help the user that a newly installed kernel causes user space to no
> > longer reach a prompt. Hence the proposal of an opt-in via personality
> > (for MTE we would need an explicit opt-in by the user anyway since the
> > top byte is no longer ignored but checked against the allocation tag).
> 
> but realistically would this actually get used in this way? or would
> any given system either be MTE or non-MTE. in which case a kernel
> configuration option would seem to make more sense. (because either
> way, the hypothetical user basically needs to recompile the kernel to
> get back on their feet. or all of userspace.)

The two hard requirements I have for supporting any new hardware feature
in Linux are (1) a single kernel image binary continues to run on old
hardware while making use of the new feature if available and (2) old
user space continues to run on new hardware while new user space can
take advantage of the new feature.

The distro user space usually has a hard requirement that it continues
to run on (certain) old hardware. We can't enforce this in the kernel
but we offer the option to user space developers of checking feature
availability through HWCAP bits.

The Android story may be different as you have more control about which
kernel configurations are deployed on specific SoCs. I'm looking more
from a Linux distro angle where you just get an off-the-shelf OS image
and install it on your hardware, either taking advantage of new features
or just not using them if the software was not updated. Or, if updated
software is installed on old hardware, it would just run.

For MTE, we just can't enable it by default since there are applications
who use the top byte of a pointer and expect it to be ignored rather
than failing with a mismatched tag. Just think of a hwasan compiled
binary where TBI is expected to work and you try to run it with MTE
turned on.

I would also expect the C library or dynamic loader to check for the
presence of a HWCAP_MTE bit before starting to tag memory allocations,
otherwise it would get SIGILL on the first MTE instruction it tries to
execute.

> i'm not sure i see this new way for a kernel update to break my system
> and need to be fixed forward/rolled back as any different from any of
> the existing ways in which this can happen :-) as an end-user i have
> to rely on whoever's sending me software updates to test adequately
> enough that they find the problems. as an end user, there isn't any
> difference between "my phone 

Re: [PATCH v15 17/17] selftests, arm64: add a selftest for passing tagged pointers to kernel

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:31:03PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch adds a simple test, that calls the uname syscall with a
> tagged user pointer as an argument. Without the kernel accepting tagged
> user pointers the test fails with EFAULT.

That's probably sufficient for a simple example. Something we could add
to Documentation maybe is a small library that can be LD_PRELOAD'ed so
that you can run a lot more tests like LTP.

We could add this to selftests but I think it's too glibc specific.

8<
#include 

#define TAG_SHIFT   (56)
#define TAG_MASK(0xffUL << TAG_SHIFT)

void *__libc_malloc(size_t size);
void __libc_free(void *ptr);
void *__libc_realloc(void *ptr, size_t size);
void *__libc_calloc(size_t nmemb, size_t size);

static void *tag_ptr(void *ptr)
{
unsigned long tag = rand() & 0xff;
if (!ptr)
return ptr;
return (void *)((unsigned long)ptr | (tag << TAG_SHIFT));
}

static void *untag_ptr(void *ptr)
{
return (void *)((unsigned long)ptr & ~TAG_MASK);
}

void *malloc(size_t size)
{
return tag_ptr(__libc_malloc(size));
}

void free(void *ptr)
{
__libc_free(untag_ptr(ptr));
}

void *realloc(void *ptr, size_t size)
{
return tag_ptr(__libc_realloc(untag_ptr(ptr), size));
}

void *calloc(size_t nmemb, size_t size)
{
return tag_ptr(__libc_calloc(nmemb, size));
}


Re: [PATCH v15 09/17] fs, arm64: untag user pointers in copy_mount_options

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:55PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> In copy_mount_options a user address is being subtracted from TASK_SIZE.
> If the address is lower than TASK_SIZE, the size is calculated to not
> allow the exact_copy_from_user() call to cross TASK_SIZE boundary.
> However if the address is tagged, then the size will be calculated
> incorrectly.
> 
> Untag the address before subtracting.
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v15 07/17] mm, arm64: untag user pointers in mm/gup.c

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:53PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> mm/gup.c provides a kernel interface that accepts user addresses and
> manipulates user pages directly (for example get_user_pages, that is used
> by the futex syscall). Since a user can provided tagged addresses, we need
> to handle this case.
> 
> Add untagging to gup.c functions that use user addresses for vma lookups.
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v15 06/17] mm: untag user pointers in do_pages_move

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:52PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> do_pages_move() is used in the implementation of the move_pages syscall.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v15 05/17] arms64: untag user pointers passed to memory syscalls

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:51PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch allows tagged pointers to be passed to the following memory
> syscalls: brk, get_mempolicy, madvise, mbind, mincore, mlock, mlock2,
> mmap, mmap_pgoff, mprotect, mremap, msync, munlock, munmap,
> remap_file_pages, shmat and shmdt.
> 
> This is done by untagging pointers passed to these syscalls in the
> prologues of their handlers.

I'll go through them one by one to see if we can tighten the expected
ABI while having the MTE in mind.

> diff --git a/arch/arm64/kernel/sys.c b/arch/arm64/kernel/sys.c
> index b44065fb1616..933bb9f3d6ec 100644
> --- a/arch/arm64/kernel/sys.c
> +++ b/arch/arm64/kernel/sys.c
> @@ -35,10 +35,33 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, unsigned long, 
> len,
>  {
>   if (offset_in_page(off) != 0)
>   return -EINVAL;
> -
> + addr = untagged_addr(addr);
>   return ksys_mmap_pgoff(addr, len, prot, flags, fd, off >> PAGE_SHIFT);
>  }

If user passes a tagged pointer to mmap() and the address is honoured
(or MAP_FIXED is given), what is the expected return pointer? Does it
need to be tagged with the value from the hint?

With MTE, we may want to use this as a request for the default colour of
the mapped pages (still under discussion).

> +SYSCALL_DEFINE6(arm64_mmap_pgoff, unsigned long, addr, unsigned long, len,
> + unsigned long, prot, unsigned long, flags,
> + unsigned long, fd, unsigned long, pgoff)
> +{
> + addr = untagged_addr(addr);
> + return ksys_mmap_pgoff(addr, len, prot, flags, fd, pgoff);
> +}

We don't have __NR_mmap_pgoff on arm64.

> +SYSCALL_DEFINE5(arm64_mremap, unsigned long, addr, unsigned long, old_len,
> + unsigned long, new_len, unsigned long, flags,
> + unsigned long, new_addr)
> +{
> + addr = untagged_addr(addr);
> + new_addr = untagged_addr(new_addr);
> + return ksys_mremap(addr, old_len, new_len, flags, new_addr);
> +}

Similar comment as for mmap(), do we want the tag from new_addr to be
preserved? In addition, should we check that the two tags are identical
or mremap() should become a way to repaint a memory region?

> +SYSCALL_DEFINE2(arm64_munmap, unsigned long, addr, size_t, len)
> +{
> + addr = untagged_addr(addr);
> + return ksys_munmap(addr, len);
> +}

This looks fine.

> +SYSCALL_DEFINE1(arm64_brk, unsigned long, brk)
> +{
> + brk = untagged_addr(brk);
> + return ksys_brk(brk);
> +}

I wonder whether brk() should simply not accept tags, and should not
return them (similar to the prctl(PR_SET_MM) discussion). We could
document this in the ABI requirements.

> +SYSCALL_DEFINE5(arm64_get_mempolicy, int __user *, policy,
> + unsigned long __user *, nmask, unsigned long, maxnode,
> + unsigned long, addr, unsigned long, flags)
> +{
> + addr = untagged_addr(addr);
> + return ksys_get_mempolicy(policy, nmask, maxnode, addr, flags);
> +}
> +
> +SYSCALL_DEFINE3(arm64_madvise, unsigned long, start,
> + size_t, len_in, int, behavior)
> +{
> + start = untagged_addr(start);
> + return ksys_madvise(start, len_in, behavior);
> +}
> +
> +SYSCALL_DEFINE6(arm64_mbind, unsigned long, start, unsigned long, len,
> + unsigned long, mode, const unsigned long __user *, nmask,
> + unsigned long, maxnode, unsigned int, flags)
> +{
> + start = untagged_addr(start);
> + return ksys_mbind(start, len, mode, nmask, maxnode, flags);
> +}
> +
> +SYSCALL_DEFINE2(arm64_mlock, unsigned long, start, size_t, len)
> +{
> + start = untagged_addr(start);
> + return ksys_mlock(start, len, VM_LOCKED);
> +}
> +
> +SYSCALL_DEFINE2(arm64_mlock2, unsigned long, start, size_t, len)
> +{
> + start = untagged_addr(start);
> + return ksys_mlock(start, len, VM_LOCKED);
> +}
> +
> +SYSCALL_DEFINE2(arm64_munlock, unsigned long, start, size_t, len)
> +{
> + start = untagged_addr(start);
> + return ksys_munlock(start, len);
> +}
> +
> +SYSCALL_DEFINE3(arm64_mprotect, unsigned long, start, size_t, len,
> + unsigned long, prot)
> +{
> + start = untagged_addr(start);
> + return ksys_mprotect_pkey(start, len, prot, -1);
> +}
> +
> +SYSCALL_DEFINE3(arm64_msync, unsigned long, start, size_t, len, int, flags)
> +{
> + start = untagged_addr(start);
> + return ksys_msync(start, len, flags);
> +}
> +
> +SYSCALL_DEFINE3(arm64_mincore, unsigned long, start, size_t, len,
> + unsigned char __user *, vec)
> +{
> + start = untagged_addr(start);
> + return ksys_mincore(start, len, vec);
> +}

These look fine.

> +SYSCALL_DEFINE5(arm64_remap_file_pages, unsigned long, start,
> + unsigned long, size, unsigned long, prot,
> + unsigned long, pgoff, unsigned long, flags)
> 

Re: [PATCH v15 04/17] mm: add ksys_ wrappers to memory syscalls

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:50PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch adds ksys_ wrappers to the following memory syscalls:
> 
> brk, get_mempolicy (renamed kernel_get_mempolicy -> ksys_get_mempolicy),
> madvise, mbind (renamed kernel_mbind -> ksys_mbind), mincore,
> mlock (renamed do_mlock -> ksys_mlock), mlock2, mmap_pgoff,
> mprotect (renamed do_mprotect_pkey -> ksys_mprotect_pkey), mremap, msync,
> munlock, munmap, remap_file_pages, shmat, shmdt.
> 
> The next patch in this series will add a custom implementation for these
> syscalls that makes them accept tagged pointers on arm64.
> 
> Signed-off-by: Andrey Konovalov 

Reviewed-by: Catalin Marinas 


Re: [PATCH v15 03/17] lib, arm64: untag user pointers in strn*_user

2019-05-22 Thread Catalin Marinas
On Mon, May 06, 2019 at 06:30:49PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> strncpy_from_user and strnlen_user accept user addresses as arguments, and
> do not go through the same path as copy_from_user and others, so here we
> need to handle the case of tagged user addresses separately.
> 
> Untag user pointers passed to these functions.
> 
> Note, that this patch only temporarily untags the pointers to perform
> validity checks, but then uses them as is to perform user memory accesses.
> 
> Signed-off-by: Andrey Konovalov 

Just to keep track of where I am with the reviews while the ABI
discussion continues:

Reviewed-by: Catalin Marinas 


Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-22 Thread Catalin Marinas
Hi Kees,

Thanks for joining the thread ;).

On Tue, May 21, 2019 at 05:04:39PM -0700, Kees Cook wrote:
> On Tue, May 21, 2019 at 07:29:33PM +0100, Catalin Marinas wrote:
> > On Mon, May 20, 2019 at 04:53:07PM -0700, Evgenii Stepanov wrote:
> > > On Fri, May 17, 2019 at 7:49 AM Catalin Marinas  
> > > wrote:
> > > > IMO (RFC for now), I see two ways forward:
> > > > [...]
> > > > 2. Similar shim to the above libc wrapper but inside the kernel
> > > >(arch/arm64 only; most pointer arguments could be covered with an
> > > >__SC_CAST similar to the s390 one). There are two differences from
> > > >what we've discussed in the past:
> > > >
> > > >a) this is an opt-in by the user which would have to explicitly call
> > > >   prctl(). If it returns -ENOTSUPP etc., the user won't be allowed
> > > >   to pass tagged pointers to the kernel. This would probably be the
> > > >   responsibility of the C lib to make sure it doesn't tag heap
> > > >   allocations. If the user did not opt-in, the syscalls are routed
> > > >   through the normal path (no untagging address shim).
> > > >
> > > >b) ioctl() and other blacklisted syscalls (prctl) will not accept
> > > >   tagged pointers (to be documented in Vicenzo's ABI patches).
> > >
> > > The way I see it, a patch that breaks handling of tagged pointers is
> > > not that different from, say, a patch that adds a wild pointer
> > > dereference. Both are bugs; the difference is that (a) the former
> > > breaks a relatively uncommon target and (b) it's arguably an easier
> > > mistake to make. If MTE adoption goes well, (a) will not be the case
> > > for long.
> > 
> > It's also the fact such patch would go unnoticed for a long time until
> > someone exercises that code path. And when they do, the user would be
> > pretty much in the dark trying to figure what what went wrong, why a
> > SIGSEGV or -EFAULT happened. What's worse, we can't even say we fixed
> > all the places where it matters in the current kernel codebase (ignoring
> > future patches).
> 
> So, looking forward a bit, this isn't going to be an ARM-specific issue
> for long.

I do hope so.

> In fact, I think we shouldn't have arm-specific syscall wrappers
> in this series: I think untagged_addr() should likely be added at the
> top-level and have it be a no-op for other architectures.

That's what the current patchset does, so we have this as a starting
point. Kostya raised another potential issue with the syscall wrappers:
with MTE the kernel will be forced to enable the match-all (wildcard)
pointers for user space accesses since copy_from_user() would only get a
0 tag. So it has wider implications than just uaccess routines not
checking the colour.

> So given this becoming a kernel-wide multi-architecture issue (under
> the assumption that x86, RISC-V, and others will gain similar TBI or
> MTE things), we should solve it in a way that we can re-use.

Can we do any better to aid the untagged_addr() placement (e.g. better
type annotations, better static analysis)? We have to distinguish
between user pointers that may be dereferenced by the kernel (I think
almost fully covered with this patchset) and user addresses represented
as ulong that may:

a) be converted to a user pointer and dereferenced; I think that's the
   case for many overloaded ulong/u64 arguments

b) used for address space management, rbtree look-ups etc. where the tag
   is no longer relevant and it even gets in the way

We tried last year to identify void __user * casts to unsigned long
using sparse on the assumption that pointers can be tagged while ulong
is about address space management and needs to lose such tag. I think we
could have pushed this further. For example, get_user_pages() takes an
unsigned long but it is perfectly capable of untagging the address
itself. Shall we change its first argument to void __user * (together
with all its callers)?

find_vma(), OTOH, could untag the address but it doesn't help since
vm_start/end don't have such information (that's more about the content
or type that the user decided) and the callers check against it.

Are there any other places where this matters? These patches tracked
down find_vma() as some heuristics but we may need better static
analysis to identify other cases.

> We need something that is going to work everywhere. And it needs to be
> supported by the kernel for the simple reason that the kernel needs to
> do MTE checks during copy_from_user(): having that information stripped
> means we lose any userspace-assigned MTE protections if they get handled
> by the kernel, which 

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-21 Thread Catalin Marinas
On Mon, May 20, 2019 at 04:53:07PM -0700, Evgenii Stepanov wrote:
> On Fri, May 17, 2019 at 7:49 AM Catalin Marinas  
> wrote:
> > IMO (RFC for now), I see two ways forward:
> >
> > 1. Make this a user space problem and do not allow tagged pointers into
> >the syscall ABI. A libc wrapper would have to convert structures,
> >parameters before passing them into the kernel. Note that we can
> >still support the hardware MTE in the kernel by enabling tagged
> >memory ranges, saving/restoring tags etc. but not allowing tagged
> >addresses at the syscall boundary.
> >
> > 2. Similar shim to the above libc wrapper but inside the kernel
> >(arch/arm64 only; most pointer arguments could be covered with an
> >__SC_CAST similar to the s390 one). There are two differences from
> >what we've discussed in the past:
> >
> >a) this is an opt-in by the user which would have to explicitly call
> >   prctl(). If it returns -ENOTSUPP etc., the user won't be allowed
> >   to pass tagged pointers to the kernel. This would probably be the
> >   responsibility of the C lib to make sure it doesn't tag heap
> >   allocations. If the user did not opt-in, the syscalls are routed
> >   through the normal path (no untagging address shim).
> >
> >b) ioctl() and other blacklisted syscalls (prctl) will not accept
> >   tagged pointers (to be documented in Vicenzo's ABI patches).
[...]
> Any userspace shim approach is problematic for Android because of the
> apps that use raw system calls. AFAIK, all apps written in Go are in
> that camp - I'm not sure how common they are, but getting them all
> recompiled is probably not realistic.

That's a fair point (I wasn't expecting it would get much traction
anyway ;)). OTOH, it allows upstreaming of the MTE patches while we
continue the discussions around TBI.

> The way I see it, a patch that breaks handling of tagged pointers is
> not that different from, say, a patch that adds a wild pointer
> dereference. Both are bugs; the difference is that (a) the former
> breaks a relatively uncommon target and (b) it's arguably an easier
> mistake to make. If MTE adoption goes well, (a) will not be the case
> for long.

It's also the fact such patch would go unnoticed for a long time until
someone exercises that code path. And when they do, the user would be
pretty much in the dark trying to figure what what went wrong, why a
SIGSEGV or -EFAULT happened. What's worse, we can't even say we fixed
all the places where it matters in the current kernel codebase (ignoring
future patches).

I think we should revisit the static checking discussions we had last
year. Run-time checking (even with compiler instrumentation and
syzkaller fuzzing) would only cover the code paths specific to a Linux
or Android installation.

> This is a bit of a chicken-and-egg problem. In a world where memory
> allocators on one or several popular platforms generate pointers with
> non-zero tags, any such breakage will be caught in testing.
> Unfortunately to reach that state we need the kernel to start
> accepting tagged pointers first, and then hold on for a couple of
> years until userspace catches up.

Would the kernel also catch up with providing a stable ABI? Because we
have two moving targets.

On one hand, you have Android or some Linux distro that stick to a
stable kernel version for some time, so they have better chance of
clearing most of the problems. On the other hand, we have mainline
kernel that gets over 500K lines every release. As maintainer, I can't
rely on my testing alone as this is on a limited number of platforms. So
my concern is that every kernel release has a significant chance of
breaking the ABI, unless we have a better way of identifying potential
issues.

> Perhaps we can start by whitelisting ioctls by driver?

This was also raised by Ruben in private but without a (static) tool to
to check, manually going through all the drivers doesn't scale. It's
very likely that most drivers don't care, just a get_user/put_user is
already handled by these patches. Searching for find_vma() was
identifying one such use-case but is this sufficient? Are there other
cases we need to explicitly untag a pointer?


The other point I'd like feedback on is 2.a above. I see _some_ value
into having the user opt-in to this relaxed ABI rather than blinding
exposing it to all applications. Dave suggested (in private) a new
personality (e.g. PER_LINUX_TBI) inherited by children. It would be the
responsibility of the C library to check the current personality bits
and only tag pointers on allocation *if* the kernel allowed it. The
kernel could provide the AT_FLAGS bit as in Vincenzo's patches if the
personality was set but can't set it retrospectively if the user called
sy

Re: [PATCH v15 00/17] arm64: untag user pointers passed to the kernel

2019-05-17 Thread Catalin Marinas
Hi Andrey,

On Mon, May 06, 2019 at 06:30:46PM +0200, Andrey Konovalov wrote:
> One of the alternative approaches to untagging that was considered is to
> completely strip the pointer tag as the pointer enters the kernel with
> some kind of a syscall wrapper, but that won't work with the countless
> number of different ioctl calls. With this approach we would need a custom
> wrapper for each ioctl variation, which doesn't seem practical.

The more I look at this problem, the less convinced I am that we can
solve it in a way that results in a stable ABI covering ioctls(). While
for the Android kernel codebase it could be simpler as you don't upgrade
the kernel version every 2.5 months, for the mainline kernel this
doesn't scale. Any run-time checks are relatively limited in terms of
drivers covered. Better static checking would be nice as a long term
solution but we didn't get anywhere with the discussion last year.

IMO (RFC for now), I see two ways forward:

1. Make this a user space problem and do not allow tagged pointers into
   the syscall ABI. A libc wrapper would have to convert structures,
   parameters before passing them into the kernel. Note that we can
   still support the hardware MTE in the kernel by enabling tagged
   memory ranges, saving/restoring tags etc. but not allowing tagged
   addresses at the syscall boundary.

2. Similar shim to the above libc wrapper but inside the kernel
   (arch/arm64 only; most pointer arguments could be covered with an
   __SC_CAST similar to the s390 one). There are two differences from
   what we've discussed in the past:

   a) this is an opt-in by the user which would have to explicitly call
  prctl(). If it returns -ENOTSUPP etc., the user won't be allowed
  to pass tagged pointers to the kernel. This would probably be the
  responsibility of the C lib to make sure it doesn't tag heap
  allocations. If the user did not opt-in, the syscalls are routed
  through the normal path (no untagging address shim).

   b) ioctl() and other blacklisted syscalls (prctl) will not accept
  tagged pointers (to be documented in Vicenzo's ABI patches).

It doesn't solve the problems we are trying to address but 2.a saves us
from blindly relaxing the ABI without knowing how to easily assess new
code being merged (over 500K lines between kernel versions). Existing
applications (who don't opt-in) won't inadvertently start using the new
ABI which could risk becoming de-facto ABI that we need to support on
the long run.

Option 1 wouldn't solve the ioctl() problem either and while it makes
things simpler for the kernel, I am aware that it's slightly more
complicated in user space (but I really don't mind if you prefer option
1 ;)).

The tagged pointers (whether hwasan or MTE) should ideally be a
transparent feature for the application writer but I don't think we can
solve it entirely and make it seamless for the multitude of ioctls().
I'd say you only opt in to such feature if you know what you are doing
and the user code takes care of specific cases like ioctl(), hence the
prctl() proposal even for the hwasan.

Comments welcomed.

-- 
Catalin


Re: [PATCH v14 13/17] IB/mlx4, arm64: untag user pointers in mlx4_get_umem_mr

2019-05-03 Thread Catalin Marinas
On Tue, Apr 30, 2019 at 03:25:09PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> mlx4_get_umem_mr() uses provided user pointers for vma lookups, which can
> only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> Reviewed-by: Leon Romanovsky 
> ---
>  drivers/infiniband/hw/mlx4/mr.c | 7 ---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c
> index 395379a480cb..9a35ed2c6a6f 100644
> --- a/drivers/infiniband/hw/mlx4/mr.c
> +++ b/drivers/infiniband/hw/mlx4/mr.c
> @@ -378,6 +378,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata 
> *udata, u64 start,
>* again
>*/
>   if (!ib_access_writable(access_flags)) {
> + unsigned long untagged_start = untagged_addr(start);
>   struct vm_area_struct *vma;
>  
>   down_read(>mm->mmap_sem);
> @@ -386,9 +387,9 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_udata 
> *udata, u64 start,
>* cover the memory, but for now it requires a single vma to
>* entirely cover the MR to support RO mappings.
>*/
> - vma = find_vma(current->mm, start);
> - if (vma && vma->vm_end >= start + length &&
> - vma->vm_start <= start) {
> + vma = find_vma(current->mm, untagged_start);
> + if (vma && vma->vm_end >= untagged_start + length &&
> + vma->vm_start <= untagged_start) {
>   if (vma->vm_flags & VM_WRITE)
>   access_flags |= IB_ACCESS_LOCAL_WRITE;
>   } else {

Discussion ongoing on the previous version of the patch but I'm more
inclined to do this in ib_uverbs_(re)reg_mr() on cmd.start.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v14 10/17] fs, arm64: untag user pointers in fs/userfaultfd.c

2019-05-03 Thread Catalin Marinas
On Tue, Apr 30, 2019 at 03:25:06PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> userfaultfd_register() and userfaultfd_unregister() use provided user
> pointers for vma lookups, which can only by done with untagged pointers.
> 
> Untag user pointers in these functions.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  fs/userfaultfd.c | 5 +
>  1 file changed, 5 insertions(+)
> 
> diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c
> index f5de1e726356..fdee0db0e847 100644
> --- a/fs/userfaultfd.c
> +++ b/fs/userfaultfd.c
> @@ -1325,6 +1325,9 @@ static int userfaultfd_register(struct userfaultfd_ctx 
> *ctx,
>   goto out;
>   }
>  
> + uffdio_register.range.start =
> + untagged_addr(uffdio_register.range.start);
> +
>   ret = validate_range(mm, uffdio_register.range.start,
>uffdio_register.range.len);
>   if (ret)
> @@ -1514,6 +1517,8 @@ static int userfaultfd_unregister(struct 
> userfaultfd_ctx *ctx,
>   if (copy_from_user(_unregister, buf, sizeof(uffdio_unregister)))
>   goto out;
>  
> + uffdio_unregister.start = untagged_addr(uffdio_unregister.start);
> +
>   ret = validate_range(mm, uffdio_unregister.start,
>uffdio_unregister.len);
>   if (ret)

Wouldn't it be easier to do this in validate_range()? There are a few
more calls in this file, though I didn't check whether a tagged address
would cause issues.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v14 08/17] mm, arm64: untag user pointers in get_vaddr_frames

2019-05-03 Thread Catalin Marinas
On Tue, Apr 30, 2019 at 03:25:04PM +0200, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> get_vaddr_frames uses provided user pointers for vma lookups, which can
> only by done with untagged pointers. Instead of locating and changing
> all callers of this function, perform untagging in it.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  mm/frame_vector.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/frame_vector.c b/mm/frame_vector.c
> index c64dca6e27c2..c431ca81dad5 100644
> --- a/mm/frame_vector.c
> +++ b/mm/frame_vector.c
> @@ -46,6 +46,8 @@ int get_vaddr_frames(unsigned long start, unsigned int 
> nr_frames,
>   if (WARN_ON_ONCE(nr_frames > vec->nr_allocated))
>   nr_frames = vec->nr_allocated;
>  
> + start = untagged_addr(start);
> +
>   down_read(>mmap_sem);
>   locked = 1;
>   vma = find_vma_intersection(mm, start, start + 1);

Is this some buffer that the user may have malloc'ed? I got lost when
trying to track down the provenience of this buffer.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 10/20] kernel, arm64: untag user pointers in prctl_set_mm*

2019-04-26 Thread Catalin Marinas
On Mon, Apr 01, 2019 at 06:44:34PM +0200, Andrey Konovalov wrote:
> On Fri, Mar 22, 2019 at 4:41 PM Catalin Marinas  
> wrote:
> > On Wed, Mar 20, 2019 at 03:51:24PM +0100, Andrey Konovalov wrote:
> > > @@ -2120,13 +2135,14 @@ static int prctl_set_mm(int opt, unsigned long 
> > > addr,
> > >   if (opt == PR_SET_MM_AUXV)
> > >   return prctl_set_auxv(mm, addr, arg4);
> > >
> > > - if (addr >= TASK_SIZE || addr < mmap_min_addr)
> > > + if (untagged_addr(addr) >= TASK_SIZE ||
> > > + untagged_addr(addr) < mmap_min_addr)
> > >   return -EINVAL;
> > >
> > >   error = -EINVAL;
> > >
> > >   down_write(>mmap_sem);
> > > - vma = find_vma(mm, addr);
> > > + vma = find_vma(mm, untagged_addr(addr));
> > >
> > >   prctl_map.start_code= mm->start_code;
> > >   prctl_map.end_code  = mm->end_code;
> >
> > Does this mean that we are left with tagged addresses for the
> > mm->start_code etc. values? I really don't think we should allow this,
> > I'm not sure what the implications are in other parts of the kernel.
> >
> > Arguably, these are not even pointer values but some address ranges. I
> > know we decided to relax this notion for mmap/mprotect/madvise() since
> > the user function prototypes take pointer as arguments but it feels like
> > we are overdoing it here (struct prctl_mm_map doesn't even have
> > pointers).
> >
> > What is the use-case for allowing tagged addresses here? Can user space
> > handle untagging?
> 
> I don't know any use cases for this. I did it because it seems to be
> covered by the relaxed ABI. I'm not entirely sure what to do here,
> should I just drop this patch?

If we allow tagged addresses to be passed here, we'd have to untag them
before they end up in the mm->start_code etc. members.

I know we are trying to relax the ABI here w.r.t. address ranges but
mostly because we couldn't figure out a way to document unambiguously
the difference between a user pointer that may be dereferenced by the
kernel (tags allowed) and an address typically used for managing the
address space layout. Suggestions welcomed.

I'd say just drop this patch and capture it in the ABI document.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 18/20] tee/optee, arm64: untag user pointers in check_mem_type

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:32PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> check_mem_type() uses provided user pointers for vma lookups (via
> __check_mem_type()), which can only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/tee/optee/call.c | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/tee/optee/call.c b/drivers/tee/optee/call.c
> index a5afbe6dee68..e3be20264092 100644
> --- a/drivers/tee/optee/call.c
> +++ b/drivers/tee/optee/call.c
> @@ -563,6 +563,7 @@ static int check_mem_type(unsigned long start, size_t 
> num_pages)
>   int rc;
>  
>   down_read(>mmap_sem);
> + start = untagged_addr(start);
>   rc = __check_mem_type(find_vma(mm, start),
> start + num_pages * PAGE_SIZE);
>   up_read(>mmap_sem);

I guess we could just untag this in tee_shm_register(). The tag is not
relevant to a TEE implementation (firmware) anyway.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 10/20] kernel, arm64: untag user pointers in prctl_set_mm*

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:24PM +0100, Andrey Konovalov wrote:
> @@ -2120,13 +2135,14 @@ static int prctl_set_mm(int opt, unsigned long addr,
>   if (opt == PR_SET_MM_AUXV)
>   return prctl_set_auxv(mm, addr, arg4);
>  
> - if (addr >= TASK_SIZE || addr < mmap_min_addr)
> + if (untagged_addr(addr) >= TASK_SIZE ||
> + untagged_addr(addr) < mmap_min_addr)
>   return -EINVAL;
>  
>   error = -EINVAL;
>  
>   down_write(>mmap_sem);
> - vma = find_vma(mm, addr);
> + vma = find_vma(mm, untagged_addr(addr));
>  
>   prctl_map.start_code= mm->start_code;
>   prctl_map.end_code  = mm->end_code;

Does this mean that we are left with tagged addresses for the
mm->start_code etc. values? I really don't think we should allow this,
I'm not sure what the implications are in other parts of the kernel.

Arguably, these are not even pointer values but some address ranges. I
know we decided to relax this notion for mmap/mprotect/madvise() since
the user function prototypes take pointer as arguments but it feels like
we are overdoing it here (struct prctl_mm_map doesn't even have
pointers).

What is the use-case for allowing tagged addresses here? Can user space
handle untagging?

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 13/20] bpf, arm64: untag user pointers in stack_map_get_build_id_offset

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:27PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> stack_map_get_build_id_offset() uses provided user pointers for vma
> lookups, which can only by done with untagged pointers.
> 
> Untag user pointers in this function for doing the lookup and
> calculating the offset, but save as is in the bpf_stack_build_id
> struct.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  kernel/bpf/stackmap.c | 6 --
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/bpf/stackmap.c b/kernel/bpf/stackmap.c
> index 950ab2f28922..bb89341d3faf 100644
> --- a/kernel/bpf/stackmap.c
> +++ b/kernel/bpf/stackmap.c
> @@ -320,7 +320,9 @@ static void stack_map_get_build_id_offset(struct 
> bpf_stack_build_id *id_offs,
>   }
>  
>   for (i = 0; i < trace_nr; i++) {
> - vma = find_vma(current->mm, ips[i]);
> + u64 untagged_ip = untagged_addr(ips[i]);
> +
> + vma = find_vma(current->mm, untagged_ip);
>   if (!vma || stack_map_get_build_id(vma, id_offs[i].build_id)) {
>   /* per entry fall back to ips */
>   id_offs[i].status = BPF_STACK_BUILD_ID_IP;
> @@ -328,7 +330,7 @@ static void stack_map_get_build_id_offset(struct 
> bpf_stack_build_id *id_offs,
>   memset(id_offs[i].build_id, 0, BPF_BUILD_ID_SIZE);
>   continue;
>   }
> - id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + ips[i]
> + id_offs[i].offset = (vma->vm_pgoff << PAGE_SHIFT) + untagged_ip
>   - vma->vm_start;
>   id_offs[i].status = BPF_STACK_BUILD_ID_VALID;
>   }

Can the ips[*] here ever be tagged?

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 15/20] drm/radeon, arm64: untag user pointers in radeon_ttm_tt_pin_userptr

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:29PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> radeon_ttm_tt_pin_userptr() uses provided user pointers for vma
> lookups, which can only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/gpu/drm/radeon/radeon_ttm.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/radeon/radeon_ttm.c 
> b/drivers/gpu/drm/radeon/radeon_ttm.c
> index 9920a6fc11bf..872a98796117 100644
> --- a/drivers/gpu/drm/radeon/radeon_ttm.c
> +++ b/drivers/gpu/drm/radeon/radeon_ttm.c
> @@ -497,9 +497,10 @@ static int radeon_ttm_tt_pin_userptr(struct ttm_tt *ttm)
>   if (gtt->userflags & RADEON_GEM_USERPTR_ANONONLY) {
>   /* check that we only pin down anonymous memory
>  to prevent problems with writeback */
> - unsigned long end = gtt->userptr + ttm->num_pages * PAGE_SIZE;
> + unsigned long userptr = untagged_addr(gtt->userptr);
> + unsigned long end = userptr + ttm->num_pages * PAGE_SIZE;
>   struct vm_area_struct *vma;
> - vma = find_vma(gtt->usermm, gtt->userptr);
> + vma = find_vma(gtt->usermm, userptr);
>   if (!vma || vma->vm_file || vma->vm_end < end)
>   return -EPERM;
>   }

Same comment as on the previous patch.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 11/20] tracing, arm64: untag user pointers in seq_print_user_ip

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:25PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> seq_print_user_ip() uses provided user pointers for vma lookups, which
> can only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  kernel/trace/trace_output.c | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
> 
> diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c
> index 54373d93e251..6376bee93c84 100644
> --- a/kernel/trace/trace_output.c
> +++ b/kernel/trace/trace_output.c
> @@ -370,6 +370,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct 
> mm_struct *mm,
>  {
>   struct file *file = NULL;
>   unsigned long vmstart = 0;
> + unsigned long untagged_ip = untagged_addr(ip);
>   int ret = 1;
>  
>   if (s->full)
> @@ -379,7 +380,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct 
> mm_struct *mm,
>   const struct vm_area_struct *vma;
>  
>   down_read(>mmap_sem);
> - vma = find_vma(mm, ip);
> + vma = find_vma(mm, untagged_ip);
>   if (vma) {
>   file = vma->vm_file;
>   vmstart = vma->vm_start;
> @@ -388,7 +389,7 @@ static int seq_print_user_ip(struct trace_seq *s, struct 
> mm_struct *mm,
>   ret = trace_seq_path(s, >f_path);
>   if (ret)
>   trace_seq_printf(s, "[+0x%lx]",
> -  ip - vmstart);
> +  untagged_ip - vmstart);
>   }
>   up_read(>mmap_sem);
>   }

How would we end up with a tagged address here? Does "ip" here imply
instruction pointer, which we wouldn't tag?

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 17/20] media/v4l2-core, arm64: untag user pointers in videobuf_dma_contig_user_get

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:31PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> videobuf_dma_contig_user_get() uses provided user pointers for vma
> lookups, which can only by done with untagged pointers.
> 
> Untag the pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  drivers/media/v4l2-core/videobuf-dma-contig.c | 9 +
>  1 file changed, 5 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c 
> b/drivers/media/v4l2-core/videobuf-dma-contig.c
> index e1bf50df4c70..8a1ddd146b17 100644
> --- a/drivers/media/v4l2-core/videobuf-dma-contig.c
> +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c
> @@ -160,6 +160,7 @@ static void videobuf_dma_contig_user_put(struct 
> videobuf_dma_contig_memory *mem)
>  static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory 
> *mem,
>   struct videobuf_buffer *vb)
>  {
> + unsigned long untagged_baddr = untagged_addr(vb->baddr);
>   struct mm_struct *mm = current->mm;
>   struct vm_area_struct *vma;
>   unsigned long prev_pfn, this_pfn;
> @@ -167,22 +168,22 @@ static int videobuf_dma_contig_user_get(struct 
> videobuf_dma_contig_memory *mem,
>   unsigned int offset;
>   int ret;
>  
> - offset = vb->baddr & ~PAGE_MASK;
> + offset = untagged_baddr & ~PAGE_MASK;
>   mem->size = PAGE_ALIGN(vb->size + offset);
>   ret = -EINVAL;
>  
>   down_read(>mmap_sem);
>  
> - vma = find_vma(mm, vb->baddr);
> + vma = find_vma(mm, untagged_baddr);
>   if (!vma)
>   goto out_up;
>  
> - if ((vb->baddr + mem->size) > vma->vm_end)
> + if ((untagged_baddr + mem->size) > vma->vm_end)
>   goto out_up;
>  
>   pages_done = 0;
>   prev_pfn = 0; /* kill warning */
> - user_address = vb->baddr;
> + user_address = untagged_baddr;
>  
>   while (pages_done < (mem->size >> PAGE_SHIFT)) {
>   ret = follow_pfn(vma, user_address, _pfn);

I don't think vb->baddr here is anonymous mmap() but worth checking the
call paths.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 12/20] uprobes, arm64: untag user pointers in find_active_uprobe

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:26PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> find_active_uprobe() uses user pointers (obtained via
> instruction_pointer(regs)) for vma lookups, which can only by done with
> untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  kernel/events/uprobes.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
> index c5cde87329c7..d3a2716a813a 100644
> --- a/kernel/events/uprobes.c
> +++ b/kernel/events/uprobes.c
> @@ -1992,6 +1992,8 @@ static struct uprobe *find_active_uprobe(unsigned long 
> bp_vaddr, int *is_swbp)
>   struct uprobe *uprobe = NULL;
>   struct vm_area_struct *vma;
>  
> + bp_vaddr = untagged_addr(bp_vaddr);
> +
>   down_read(>mmap_sem);
>   vma = find_vma(mm, bp_vaddr);
>   if (vma && vma->vm_start <= bp_vaddr) {

Similarly here, that's a breakpoint address, hence instruction pointer
(PC) which is untagged.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 04/20] mm, arm64: untag user pointers passed to memory syscalls

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:18PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> This patch allows tagged pointers to be passed to the following memory
> syscalls: madvise, mbind, get_mempolicy, mincore, mlock, mlock2, brk,
> mmap_pgoff, old_mmap, munmap, remap_file_pages, mprotect, pkey_mprotect,
> mremap, msync and shmdt.
> 
> This is done by untagging pointers passed to these syscalls in the
> prologues of their handlers.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  ipc/shm.c  | 2 ++
>  mm/madvise.c   | 2 ++
>  mm/mempolicy.c | 5 +
>  mm/migrate.c   | 1 +
>  mm/mincore.c   | 2 ++
>  mm/mlock.c | 5 +
>  mm/mmap.c  | 7 +++
>  mm/mprotect.c  | 1 +
>  mm/mremap.c| 2 ++
>  mm/msync.c | 2 ++
>  10 files changed, 29 insertions(+)

I wonder whether it's better to keep these as wrappers in the arm64
code.

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx

Re: [PATCH v13 09/20] net, arm64: untag user pointers in tcp_zerocopy_receive

2019-03-22 Thread Catalin Marinas
On Wed, Mar 20, 2019 at 03:51:23PM +0100, Andrey Konovalov wrote:
> This patch is a part of a series that extends arm64 kernel ABI to allow to
> pass tagged user pointers (with the top byte set to something else other
> than 0x00) as syscall arguments.
> 
> tcp_zerocopy_receive() uses provided user pointers for vma lookups, which
> can only by done with untagged pointers.
> 
> Untag user pointers in this function.
> 
> Signed-off-by: Andrey Konovalov 
> ---
>  net/ipv4/tcp.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 6baa6dc1b13b..855a1f68c1ea 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -1761,6 +1761,8 @@ static int tcp_zerocopy_receive(struct sock *sk,
>   if (address & (PAGE_SIZE - 1) || address != zc->address)
>   return -EINVAL;
>  
> + address = untagged_addr(address);
> +
>   if (sk->sk_state == TCP_LISTEN)
>   return -ENOTCONN;

I don't think we need this patch if we stick to Vincenzo's ABI
restrictions. Can zc->address be an anonymous mmap()? My understanding
of TCP_ZEROCOPY_RECEIVE is that this is an mmap() on a socket, so user
should not tag such pointer.

We want to allow tagged pointers to work transparently only for heap and
stack, hence the restriction to anonymous mmap() and those addresses
below sbrk(0).

-- 
Catalin
___
amd-gfx mailing list
amd-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/amd-gfx