Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-20 Thread Michal Hocko
On Wed 19-10-16 10:23:55, Dave Hansen wrote:
> On 10/19/2016 10:01 AM, Michal Hocko wrote:
> > The question I had earlier was whether this has to be an explicit FOLL
> > flag used by g-u-p users or we can just use it internally when mm !=
> > current->mm
> 
> The reason I chose not to do that was that deferred work gets run under
> a basically random 'current'.  If we just use 'mm != current->mm', then
> the deferred work will sometimes have pkeys enforced and sometimes not,
> basically randomly.

OK, I see (async_pf_execute and ksm). That makes more sense to me. Thanks
for the clarification.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-19 Thread Dave Hansen
On 10/19/2016 10:01 AM, Michal Hocko wrote:
> The question I had earlier was whether this has to be an explicit FOLL
> flag used by g-u-p users or we can just use it internally when mm !=
> current->mm

The reason I chose not to do that was that deferred work gets run under
a basically random 'current'.  If we just use 'mm != current->mm', then
the deferred work will sometimes have pkeys enforced and sometimes not,
basically randomly.

We want to be consistent with whether they are enforced or not, so we
explicitly indicate that by calling the remote variant vs. plain.
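
To make the inconsistency concrete, here is a small userspace sketch (not
kernel code) that compares the two approaches. The struct names, the
FOLL_REMOTE value and both helper functions are stand-ins invented for this
example; only the idea of testing 'mm != current->mm' versus passing an
explicit flag comes from the discussion above.

```c
#include <stdbool.h>

/* Placeholder value for this sketch; see include/linux/mm.h for the real one. */
#define FOLL_REMOTE 0x2000u

struct mm { int id; };            /* stand-in for struct mm_struct */
struct task { struct mm *mm; };   /* stand-in for struct task_struct */

/*
 * Heuristic under discussion: treat the access as remote only when the
 * target mm differs from the mm of whichever task happens to be running.
 */
static bool remote_by_heuristic(const struct task *running,
				const struct mm *target)
{
	return running->mm != target;
}

/*
 * Explicit approach: the caller states its intent, so the answer does not
 * depend on which task a deferred work item ends up running under.
 */
static bool remote_by_flag(unsigned int gup_flags)
{
	return (gup_flags & FOLL_REMOTE) != 0;
}
```

If the same deferred job targets one mm but is picked up once under the
owning task and once under an unrelated one, remote_by_heuristic() gives two
different answers, while remote_by_flag() gives the same answer both times.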


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-19 Thread Michal Hocko
On Wed 19-10-16 09:49:43, Dave Hansen wrote:
> On 10/19/2016 02:07 AM, Michal Hocko wrote:
> > On Wed 19-10-16 09:58:15, Lorenzo Stoakes wrote:
> >> On Tue, Oct 18, 2016 at 05:30:50PM +0200, Michal Hocko wrote:
> >>> I am wondering whether we can go further. E.g. it is not really clear to
> >>> me whether we need an explicit FOLL_REMOTE when we can in fact check
> >>> mm != current->mm and imply that. Maybe there are some contexts which
> >>> wouldn't work, I haven't checked.
> >>
> >> This flag is set even when /proc/self/mem is used. I've not looked deeply into
> >> this flag but perhaps accessing your own memory this way can be considered
> >> 'remote' since you're not accessing it directly. On the other hand, perhaps this
> >> is just mistaken in this case?
> > 
> > My understanding of the flag is quite limited as well. All I know is that it is
> > related to protection keys and that it is needed to bypass the protection check.
> > See arch_vma_access_permitted. See also 1b2ee1266ea6 ("mm/core: Do not
> > enforce PKEY permissions on remote mm access").
> 
> Yeah, we need the flag to tell us when PKEYs should be applied or not.
> The current task's PKRU (pkey rights register) should really only be
> used to impact access to the task's memory, but has no bearing on how a
> given task should access remote memory.

The question I had earlier was whether this has to be an explicit FOLL
flag used by g-u-p users, or whether we can just use it internally when
mm != current->mm.

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-19 Thread Dave Hansen
On 10/19/2016 02:07 AM, Michal Hocko wrote:
> On Wed 19-10-16 09:58:15, Lorenzo Stoakes wrote:
>> On Tue, Oct 18, 2016 at 05:30:50PM +0200, Michal Hocko wrote:
>>> I am wondering whether we can go further. E.g. it is not really clear to
>>> me whether we need an explicit FOLL_REMOTE when we can in fact check
>>> mm != current->mm and imply that. Maybe there are some contexts which
>>> wouldn't work, I haven't checked.
>>
>> This flag is set even when /proc/self/mem is used. I've not looked deeply into
>> this flag but perhaps accessing your own memory this way can be considered
>> 'remote' since you're not accessing it directly. On the other hand, perhaps this
>> is just mistaken in this case?
> 
> My understanding of the flag is quite limited as well. All I know is that it is
> related to protection keys and that it is needed to bypass the protection check.
> See arch_vma_access_permitted. See also 1b2ee1266ea6 ("mm/core: Do not
> enforce PKEY permissions on remote mm access").

Yeah, we need the flag to tell us when PKEYs should be applied or not.
The current task's PKRU (pkey rights register) should really only be
used to impact access to the task's memory, but has no bearing on how a
given task should access remote memory.
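
As a rough model of that rule, the check below skips the pkey test entirely
for a foreign (remote) access and otherwise consults the two per-key bits in
PKRU. This is a userspace sketch under stated assumptions: the function name
and signature are made up for illustration, and it only mirrors the general
shape of what arch_vma_access_permitted does, not its actual code.

```c
#include <stdbool.h>
#include <stdint.h>

#define PKRU_AD_BIT 0x1u        /* access-disable bit for a key */
#define PKRU_WD_BIT 0x2u        /* write-disable bit for a key */
#define PKRU_BITS_PER_PKEY 2

/*
 * Hypothetical helper: decide whether an access to memory tagged with
 * 'pkey' is allowed, given the current task's PKRU value.
 */
static bool pkey_access_permitted(uint32_t pkru, unsigned int pkey,
				  bool write, bool foreign)
{
	uint32_t bits = (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & 0x3u;

	/* The current task's PKRU has no bearing on remote memory. */
	if (foreign)
		return true;
	/* Access-disable blocks reads and writes alike. */
	if (bits & PKRU_AD_BIT)
		return false;
	/* Write-disable blocks writes only. */
	if (write && (bits & PKRU_WD_BIT))
		return false;
	return true;
}
```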



Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-19 Thread Lorenzo Stoakes
On Tue, Oct 18, 2016 at 05:30:50PM +0200, Michal Hocko wrote:
> I am wondering whether we can go further. E.g. it is not really clear to
> me whether we need an explicit FOLL_REMOTE when we can in fact check
> mm != current->mm and imply that. Maybe there are some contexts which
> wouldn't work, I haven't checked.

This flag is set even when /proc/self/mem is used. I've not looked deeply into
this flag but perhaps accessing your own memory this way can be considered
'remote' since you're not accessing it directly. On the other hand, perhaps this
is just mistaken in this case?

> I guess there is more work in that area and I do not want to impose all
> that work on you, but I couldn't resist once I saw you playing in that
> area ;) Definitely a good start!

Thanks, I am more than happy to go as far down this rabbit hole as is helpful,
no imposition at all :)


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-19 Thread Michal Hocko
On Wed 19-10-16 09:58:15, Lorenzo Stoakes wrote:
> On Tue, Oct 18, 2016 at 05:30:50PM +0200, Michal Hocko wrote:
> > I am wondering whether we can go further. E.g. it is not really clear to
> > me whether we need an explicit FOLL_REMOTE when we can in fact check
> > mm != current->mm and imply that. Maybe there are some contexts which
> > wouldn't work, I haven't checked.
> 
> This flag is set even when /proc/self/mem is used. I've not looked deeply into
> this flag but perhaps accessing your own memory this way can be considered
> 'remote' since you're not accessing it directly. On the other hand, perhaps this
> is just mistaken in this case?

My understanding of the flag is quite limited as well. All I know is that it is
related to protection keys and that it is needed to bypass the protection check.
See arch_vma_access_permitted. See also 1b2ee1266ea6 ("mm/core: Do not
enforce PKEY permissions on remote mm access").

-- 
Michal Hocko
SUSE Labs


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-18 Thread Michal Hocko
On Thu 13-10-16 01:20:10, Lorenzo Stoakes wrote:
> This patch series adjusts functions in the get_user_pages* family such that
> desired FOLL_* flags are passed as an argument rather than implied by flags.
> 
> The purpose of this change is to make the use of FOLL_FORCE explicit so it is
> easier to grep for and clearer to callers that this flag is being used. The use
> of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the
> VMA whose pages we are reading from/writing to, which can result in surprising
> behaviour.
> 
> The patch series came out of the discussion around commit 38e0885, which
> addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE
> set but having been overridden by FOLL_FORCE. do_numa_page() was run on the
> assumption the page _must_ be one marked for NUMA node migration as an actual
> PROT_NONE page would have been dealt with prior to this code path, however
> FOLL_FORCE introduced a situation where this assumption did not hold.
> 
> See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal.

I like this cleanup. Tracking FOLL_FORCE users was always a nightmare
and the flag behavior is really subtle, so we had better be explicit
about it. I haven't gone through each patch separately but rather
applied the whole series and checked the resulting diff. This all seems
OK to me, so feel free to add
Acked-by: Michal Hocko 

I am wondering whether we can go further. E.g. it is not really clear to
me whether we need an explicit FOLL_REMOTE when we can in fact check
mm != current->mm and imply that. Maybe there are some contexts which
wouldn't work, I haven't checked.

Then I am also wondering about FOLL_TOUCH behavior.
__get_user_pages_unlocked has only a few callers, which used to use
get_user_pages_unlocked before 1e9877902dc7e ("mm/gup: Introduce
get_user_pages_remote()"). To me a dropped FOLL_TOUCH seems
unintentional. Now that get_user_pages_unlocked has a gup_flags argument I
guess we might want to get rid of the __g-u-p-u version altogether, no?

__get_user_pages is quite low level and imho shouldn't be exported. Its
only user, kvm, should rather pull those two functions into gup instead
and export them. There is nothing really KVM specific in them.

I also cannot say I would be entirely thrilled about get_user_pages_locked;
we only have one user, which can simply do lock, g-u-p, unlock AFAICS.

I guess there is more work in that area and I do not want to impose all
that work on you, but I couldn't resist once I saw you playing in that
area ;) Definitely a good start!
-- 
Michal Hocko
SUSE Labs


Re: [PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-13 Thread Christian König

On 13.10.2016 at 02:20, Lorenzo Stoakes wrote:

> This patch series adjusts functions in the get_user_pages* family such that
> desired FOLL_* flags are passed as an argument rather than implied by flags.
>
> The purpose of this change is to make the use of FOLL_FORCE explicit so it is
> easier to grep for and clearer to callers that this flag is being used. The use
> of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the
> VMA whose pages we are reading from/writing to, which can result in surprising
> behaviour.
>
> The patch series came out of the discussion around commit 38e0885, which
> addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE
> set but having been overridden by FOLL_FORCE. do_numa_page() was run on the
> assumption the page _must_ be one marked for NUMA node migration as an actual
> PROT_NONE page would have been dealt with prior to this code path, however
> FOLL_FORCE introduced a situation where this assumption did not hold.
>
> See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal.
>
> Lorenzo Stoakes (10):
>    mm: remove write/force parameters from __get_user_pages_locked()
>    mm: remove write/force parameters from __get_user_pages_unlocked()
>    mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
>    mm: replace get_user_pages_locked() write/force parameters with gup_flags
>    mm: replace get_vaddr_frames() write/force parameters with gup_flags
>    mm: replace get_user_pages() write/force parameters with gup_flags
>    mm: replace get_user_pages_remote() write/force parameters with gup_flags
>    mm: replace __access_remote_vm() write parameter with gup_flags
>    mm: replace access_remote_vm() write parameter with gup_flags
>    mm: replace access_process_vm() write parameter with gup_flags


Patch number 6 in this series (which touches drivers I co-maintain) is
Acked-by: Christian König.


In general this looks like a very nice cleanup to me, but I'm not enlightened
enough to fully judge.


Regards,
Christian.




[PATCH 00/10] mm: adjust get_user_pages* functions to explicitly pass FOLL_* flags

2016-10-12 Thread Lorenzo Stoakes
This patch series adjusts functions in the get_user_pages* family such that
desired FOLL_* flags are passed as an argument rather than implied by separate
write/force parameters.

The purpose of this change is to make the use of FOLL_FORCE explicit so it is
easier to grep for and clearer to callers that this flag is being used. The use
of FOLL_FORCE is an issue as it overrides missing VM_READ/VM_WRITE flags for the
VMA whose pages we are reading from/writing to, which can result in surprising
behaviour.
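
For readers unfamiliar with the flag, the following is a deliberately
simplified userspace model of how FOLL_FORCE lets an access through when the
VMA lacks the matching permission bit. The flag values are placeholders and
vma_access_check() is a hypothetical name; the real check in mm/gup.c applies
further conditions that are not modelled here.

```c
#include <errno.h>

/* Placeholder bit values for this sketch only. */
#define VM_READ    0x1u
#define VM_WRITE   0x2u
#define FOLL_WRITE 0x01u
#define FOLL_FORCE 0x10u

/*
 * Hypothetical, simplified permission check: returns 0 when the access
 * may proceed and -EFAULT when it may not.
 */
static int vma_access_check(unsigned int vm_flags, unsigned int gup_flags)
{
	unsigned int needed = (gup_flags & FOLL_WRITE) ? VM_WRITE : VM_READ;

	/* The VMA already grants the requested kind of access. */
	if (vm_flags & needed)
		return 0;
	/* Permission is missing, but the caller asked to override it. */
	if (gup_flags & FOLL_FORCE)
		return 0;
	return -EFAULT;
}
```

In this model a VMA with neither bit set (the PROT_NONE case) is rejected for
an ordinary read but accepted as soon as FOLL_FORCE is passed, which is the
surprising behaviour referred to above.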

The patch series came out of the discussion around commit 38e0885, which
addressed a BUG_ON() being triggered when a page was faulted in with PROT_NONE
set but having been overridden by FOLL_FORCE. do_numa_page() was run on the
assumption the page _must_ be one marked for NUMA node migration as an actual
PROT_NONE page would have been dealt with prior to this code path, however
FOLL_FORCE introduced a situation where this assumption did not hold.

See https://marc.info/?l=linux-mm&m=147585445805166 for the patch proposal.

Lorenzo Stoakes (10):
  mm: remove write/force parameters from __get_user_pages_locked()
  mm: remove write/force parameters from __get_user_pages_unlocked()
  mm: replace get_user_pages_unlocked() write/force parameters with gup_flags
  mm: replace get_user_pages_locked() write/force parameters with gup_flags
  mm: replace get_vaddr_frames() write/force parameters with gup_flags
  mm: replace get_user_pages() write/force parameters with gup_flags
  mm: replace get_user_pages_remote() write/force parameters with gup_flags
  mm: replace __access_remote_vm() write parameter with gup_flags
  mm: replace access_remote_vm() write parameter with gup_flags
  mm: replace access_process_vm() write parameter with gup_flags
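
As a rough illustration of what the conversion amounts to at a call site, the
old boolean pair maps onto flag bits as sketched below. This is a userspace
sketch, not code from the series: the flag values are placeholders chosen for
the example, and legacy_to_gup_flags() is a hypothetical helper that does not
exist in the tree.

```c
#include <stdbool.h>

/* Placeholder bit values; the real definitions live in include/linux/mm.h. */
#define FOLL_WRITE 0x01u
#define FOLL_FORCE 0x10u

/*
 * Hypothetical helper: translate the old (write, force) booleans into the
 * gup_flags word that converted callers now spell out themselves.
 */
static unsigned int legacy_to_gup_flags(bool write, bool force)
{
	unsigned int gup_flags = 0;

	if (write)
		gup_flags |= FOLL_WRITE;
	if (force)
		gup_flags |= FOLL_FORCE;

	return gup_flags;
}
```

A caller that used to pass write = 1, force = 0 would now pass FOLL_WRITE,
and any remaining use of FOLL_FORCE shows up by name at the call site.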

 arch/alpha/kernel/ptrace.c |  9 ++--
 arch/blackfin/kernel/ptrace.c  |  5 ++-
 arch/cris/arch-v32/drivers/cryptocop.c |  4 +-
 arch/cris/arch-v32/kernel/ptrace.c |  4 +-
 arch/ia64/kernel/err_inject.c  |  2 +-
 arch/ia64/kernel/ptrace.c  | 14 +++---
 arch/m32r/kernel/ptrace.c  | 15 ---
 arch/mips/kernel/ptrace32.c|  5 ++-
 arch/mips/mm/gup.c |  2 +-
 arch/powerpc/kernel/ptrace32.c |  5 ++-
 arch/s390/mm/gup.c |  3 +-
 arch/score/kernel/ptrace.c | 10 +++--
 arch/sh/mm/gup.c   |  3 +-
 arch/sparc/kernel/ptrace_64.c  | 24 +++
 arch/sparc/mm/gup.c|  3 +-
 arch/x86/kernel/step.c |  3 +-
 arch/x86/mm/gup.c  |  2 +-
 arch/x86/mm/mpx.c  |  5 +--
 arch/x86/um/ptrace_32.c|  3 +-
 arch/x86/um/ptrace_64.c|  3 +-
 drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c|  7 ++-
 drivers/gpu/drm/etnaviv/etnaviv_gem.c  |  7 ++-
 drivers/gpu/drm/exynos/exynos_drm_g2d.c|  3 +-
 drivers/gpu/drm/i915/i915_gem_userptr.c|  6 ++-
 drivers/gpu/drm/radeon/radeon_ttm.c|  3 +-
 drivers/gpu/drm/via/via_dmablit.c  |  4 +-
 drivers/infiniband/core/umem.c |  6 ++-
 drivers/infiniband/core/umem_odp.c |  7 ++-
 drivers/infiniband/hw/mthca/mthca_memfree.c|  2 +-
 drivers/infiniband/hw/qib/qib_user_pages.c |  3 +-
 drivers/infiniband/hw/usnic/usnic_uiom.c   |  5 ++-
 drivers/media/pci/ivtv/ivtv-udma.c |  4 +-
 drivers/media/pci/ivtv/ivtv-yuv.c  |  5 ++-
 drivers/media/platform/omap/omap_vout.c|  2 +-
 drivers/media/v4l2-core/videobuf-dma-sg.c  |  7 ++-
 drivers/media/v4l2-core/videobuf2-memops.c |  6 ++-
 drivers/misc/mic/scif/scif_rma.c   |  3 +-
 drivers/misc/sgi-gru/grufault.c|  2 +-
 drivers/platform/goldfish/goldfish_pipe.c  |  3 +-
 drivers/rapidio/devices/rio_mport_cdev.c   |  3 +-
 drivers/scsi/st.c  |  5 +--
 .../interface/vchiq_arm/vchiq_2835_arm.c   |  3 +-
 .../vc04_services/interface/vchiq_arm/vchiq_arm.c  |  3 +-
 drivers/video/fbdev/pvr2fb.c   |  4 +-
 drivers/virt/fsl_hypervisor.c  |  4 +-
 fs/exec.c  |  9 +++-
 fs/proc/base.c | 19 +---
 include/linux/mm.h | 18 
 kernel/events/uprobes.c|  6 ++-
 kernel/ptrace.c| 16 ---
 mm/frame_vector.c  |  9 ++--
 mm/gup.c   | 50 ++
 mm/memory.c| 16 ---