Re: [PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-08 Thread Eric B Munson
On Tue, 07 Jul 2015, Andrew Morton wrote:

 On Tue,  7 Jul 2015 13:03:38 -0400 Eric B Munson emun...@akamai.com wrote:
 
  mlock() allows a user to control page out of program memory, but this
  comes at the cost of faulting in the entire mapping when it is
  allocated.  For large mappings where the entire area is not necessary
  this is not ideal.  Instead of forcing all locked pages to be present
  when they are allocated, this set creates a middle ground.  Pages are
  marked to be placed on the unevictable LRU (locked) when they are first
  used, but they are not faulted in by the mlock call.
  
  This series introduces a new mlock() system call that takes a flags
  argument along with the start address and size.  This flags argument
  gives the caller the ability to request memory be locked in the
  traditional way, or to be locked after the page is faulted in.  New
  calls are added for munlock() and munlockall() which give the caller a
  way to specify which flags are supposed to be cleared.  A new MCL flag
  is added to mirror the lock on fault behavior from mlock() in
  mlockall().  Finally, a flag for mmap() is added that allows a user to
  specify that the covered area should not be paged out, but only after the
  memory has been used the first time.
 
 Thanks for sticking with this.  Adding new syscalls is a bit of a
 hassle but I do think we end up with a better interface - the existing
 mlock/munlock/mlockall interfaces just aren't appropriate for these
 things.
 
 I don't know whether these syscalls should be documented via new
 manpages, or if we should instead add them to the existing
 mlock/munlock/mlockall manpages.  Michael, could you please advise?
 

Thanks for adding the series.  I owe you several updates (getting the
new syscall right for all architectures and a set of tests for the new
syscalls).  Would you prefer a new pair of patches, or should I update this set?

Eric


___
Linuxppc-dev mailing list
Linuxppc-dev@lists.ozlabs.org
https://lists.ozlabs.org/listinfo/linuxppc-dev

Re: [PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-08 Thread Andrew Morton
On Wed, 8 Jul 2015 09:23:02 -0400 Eric B Munson emun...@akamai.com wrote:

  I don't know whether these syscalls should be documented via new
  manpages, or if we should instead add them to the existing
  mlock/munlock/mlockall manpages.  Michael, could you please advise?
  
 
 Thanks for adding the series.  I owe you several updates (getting the
 new syscall right for all architectures and a set of tests for the new
 syscalls).  Would you prefer a new pair of patches, or should I update this set?

It doesn't matter much.  I guess a full update will be more convenient
at your end.


[PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-07 Thread Eric B Munson
mlock() allows a user to control page out of program memory, but this
comes at the cost of faulting in the entire mapping when it is
allocated.  For large mappings where the entire area is not necessary
this is not ideal.  Instead of forcing all locked pages to be present
when they are allocated, this set creates a middle ground.  Pages are
marked to be placed on the unevictable LRU (locked) when they are first
used, but they are not faulted in by the mlock call.

This series introduces a new mlock() system call that takes a flags
argument along with the start address and size.  This flags argument
gives the caller the ability to request memory be locked in the
traditional way, or to be locked after the page is faulted in.  New
calls are added for munlock() and munlockall() which give the caller a
way to specify which flags are supposed to be cleared.  A new MCL flag
is added to mirror the lock on fault behavior from mlock() in
mlockall().  Finally, a flag for mmap() is added that allows a user to
specify that the covered area should not be paged out, but only after the
memory has been used the first time.
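
As a rough illustration of the lock-on-fault idea, the sketch below uses
mlock2() with MLOCK_ONFAULT, which is how this behaviour is exposed by
later mainline kernels (4.4 and newer); the call names and flags in this
V3 series may differ.  The helper names lock_on_fault() and
alloc_locked_buffer() are made up for the example, and the fallback to a
plain mlock() is only an assumption about what a caller might want.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Value used by mainline Linux for mlock2(); defined here in case the
 * installed libc headers predate it. */
#ifndef MLOCK_ONFAULT
#define MLOCK_ONFAULT 0x01
#endif

/*
 * Hypothetical helper: ask the kernel to lock pages in [addr, addr+len)
 * as they are faulted in, without populating the range up front.
 * Returns 0 on success, -1 with errno set otherwise.
 */
static int lock_on_fault(void *addr, size_t len)
{
#ifdef SYS_mlock2
    return (int)syscall(SYS_mlock2, addr, len, MLOCK_ONFAULT);
#else
    (void)addr;
    (void)len;
    errno = ENOSYS;
    return -1;
#endif
}

/*
 * Hypothetical helper: reserve a private anonymous buffer of max_len
 * bytes that should never reach swap.  Lock-on-fault is tried first;
 * if it is unavailable, fall back to locking the whole range at once.
 * Returns NULL when the buffer cannot be mapped or locked at all.
 */
static void *alloc_locked_buffer(size_t max_len)
{
    void *p = mmap(NULL, max_len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;
    if (lock_on_fault(p, max_len) != 0 && mlock(p, max_len) != 0) {
        munmap(p, max_len);
        return NULL;
    }
    return p;
}
```

A caller would size the buffer for the worst case with
alloc_locked_buffer(max_len) and then touch only what it needs; pages
that are never touched are never faulted in.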

There are two main use cases that this set covers.  The first is the
security focussed mlock case.  A buffer is needed that cannot be written
to swap.  The maximum size is known, but on average the memory used is
significantly less than this maximum.  With lock on fault, the buffer
is guaranteed to never be paged out without consuming the maximum size
every time such a buffer is created.

The second use case is focussed on performance.  Portions of a large
file are needed and we want to keep the used portions in memory once
accessed.  This is the case for large graphical models where the path
through the graph is not known until run time.  The entire graph is
unlikely to be used in a given invocation, but once a node has been
used it needs to stay resident for further processing.  Given these
constraints we have a number of options.  We can potentially waste a
large amount of memory by mlocking the entire region (this can also
cause a significant stall at startup as the entire file is read in).
We can mlock every page as we access them without tracking if the page
is already resident but this introduces large overhead for each access.
The third option is mapping the entire region with PROT_NONE and using
a signal handler for SIGSEGV to mprotect(PROT_READ) and mlock() the
needed page.  Doing this page at a time adds a significant performance
penalty.  Batching can be used to mitigate this overhead, but in order
to safely avoid trying to mprotect pages outside of the mapping, the
boundaries of each mapping to be used in this way must be tracked and
available to the signal handler.  This is precisely what the mm system
in the kernel should already be doing.
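
For reference, the third option can be sketched roughly as follows.
This is not the handler used for the measurements in this mail: it
tracks the bounds of a single mapping in a global (the bookkeeping
burden described above), uses an anonymous mapping instead of a file so
that it is self-contained, and treats mlock() as best effort.  The names
lazy_map, lazy_segv() and lazy_map_create() are invented for the
example.

```c
#define _GNU_SOURCE
#include <signal.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical bookkeeping: bounds of the one mapping the handler owns. */
static struct {
    uintptr_t start;
    uintptr_t end;      /* one past the last byte, page aligned */
    size_t page;        /* page size in bytes */
    size_t batch;       /* pages to unprotect and lock per fault */
} lazy_map;

static void lazy_segv(int sig, siginfo_t *si, void *ctx)
{
    (void)sig;
    (void)ctx;
    uintptr_t addr = (uintptr_t)si->si_addr;

    /* A fault outside the tracked mapping is a genuine crash: restore
     * the default action so the retried access kills the process. */
    if (addr < lazy_map.start || addr >= lazy_map.end) {
        signal(SIGSEGV, SIG_DFL);
        return;
    }

    /* Cover the faulting page plus the rest of the batch, clamped to
     * the end of the mapping so neighbouring mappings are not touched. */
    uintptr_t first = addr & ~(uintptr_t)(lazy_map.page - 1);
    uintptr_t last = first + lazy_map.batch * lazy_map.page;
    if (last > lazy_map.end)
        last = lazy_map.end;

    if (mprotect((void *)first, last - first, PROT_READ) != 0) {
        signal(SIGSEGV, SIG_DFL);
        return;
    }
    /* Best effort: locking fails once RLIMIT_MEMLOCK is exhausted. */
    (void)mlock((void *)first, last - first);
}

/* Map len bytes (rounded up to whole pages) with PROT_NONE and install
 * the handler.  Returns the mapping, or NULL on failure. */
static void *lazy_map_create(size_t len, size_t batch_pages)
{
    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t rounded = (len + page - 1) & ~(page - 1);
    struct sigaction sa;

    void *p = mmap(NULL, rounded, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    lazy_map.start = (uintptr_t)p;
    lazy_map.end = lazy_map.start + rounded;
    lazy_map.page = page;
    lazy_map.batch = batch_pages ? batch_pages : 1;

    memset(&sa, 0, sizeof sa);
    sa.sa_sigaction = lazy_segv;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGSEGV, &sa, NULL) != 0) {
        munmap(p, rounded);
        return NULL;
    }
    return p;
}
```

Even in this reduced form, every first touch costs a signal delivery
plus two system calls, and the program has to duplicate range
information the kernel already holds.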

For mlock(MLOCK_ONFAULT) and mmap(MAP_LOCKONFAULT) the user is charged
against RLIMIT_MEMLOCK as if mlock(MLOCK_LOCKED) or mmap(MAP_LOCKED) had
been used: the charge is taken when the VMA is created, not when the
pages are faulted in.  For mlockall(MCL_ONFAULT) the user is charged as
if MCL_FUTURE was used.
This decision was made to keep the accounting checks out of the page
fault path.
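
One practical consequence is that a caller has to budget for the full
length of the range, not for the pages it expects to touch.  A minimal
sketch of such a check from user space, assuming the caller keeps its
own count of bytes already locked (fits_memlock_limit() is a
hypothetical helper, and privileged processes holding CAP_IPC_LOCK are
not bound by the limit at all):

```c
#include <stddef.h>
#include <sys/resource.h>

/*
 * Hypothetical helper: would locking `len` more bytes, on top of
 * `already_locked` bytes, stay within the soft RLIMIT_MEMLOCK?
 * Returns 1 if yes, 0 if no, -1 if the limit cannot be read.
 * `len` must be the size of the whole range, since the charge is made
 * up front rather than per faulted page.
 */
static int fits_memlock_limit(size_t already_locked, size_t len)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_MEMLOCK, &rl) != 0)
        return -1;
    if (rl.rlim_cur == RLIM_INFINITY)
        return 1;
    if (already_locked > rl.rlim_cur)
        return 0;
    return len <= rl.rlim_cur - already_locked;
}
```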

To illustrate the benefit of this set I wrote a test program that mmaps
a 5 GB file filled with random data and then makes 15,000,000 accesses
to random addresses in that mapping.  The test program was run 20 times
for each setup.  Results are reported for two program portions, setup
and execution.  The setup phase is calling mmap and optionally mlock on
the entire region.  For most experiments this is trivial, but it
highlights the cost of faulting in the entire region.  Results are
averages across the 20 runs in milliseconds.
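
The processing phase is essentially a timed loop of random reads.  The
sketch below shows the shape of such a loop; it is not the test program
itself.  The xorshift generator, the function names and the byte-wise
reads are all assumptions made to keep the example short.

```c
#define _GNU_SOURCE
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Monotonic clock reading in milliseconds. */
static double now_ms(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000.0 + ts.tv_nsec / 1e6;
}

/* Small pseudo-random generator (xorshift64); state must be non-zero. */
static uint64_t xorshift64(uint64_t *state)
{
    uint64_t x = *state;

    x ^= x << 13;
    x ^= x >> 7;
    x ^= x << 17;
    return *state = x;
}

/*
 * Read `accesses` pseudo-random bytes from [base, base+len) and return
 * the elapsed time in milliseconds.  len must be non-zero.  The sum of
 * the bytes read is stored in *sum_out (if non-NULL) so the reads
 * cannot be optimised away.
 */
static double time_random_reads(const volatile unsigned char *base,
                                size_t len, size_t accesses,
                                uint64_t seed, unsigned long *sum_out)
{
    uint64_t state = seed ? seed : 1;
    unsigned long sum = 0;
    double start = now_ms();

    for (size_t i = 0; i < accesses; i++)
        sum += base[xorshift64(&state) % len];

    double elapsed = now_ms() - start;
    if (sum_out)
        *sum_out = sum;
    return elapsed;
}
```

The setup phase would be timed the same way, around the mmap and
optional mlock calls.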

mmap with mlock(MLOCK_LOCKED) on entire range:
Setup avg:  8228.666
Processing avg: 8274.257

mmap with mlock(MLOCK_LOCKED) before each access:
Setup avg:  0.113
Processing avg: 90993.552

mmap with PROT_NONE and signal handler and batch size of 1 page:
With the default value of max_map_count, this gets ENOMEM as I attempt
to change the permissions.  After raising the sysctl significantly I get:
Setup avg:  0.058
Processing avg: 69488.073

mmap with PROT_NONE and signal handler and batch size of 8 pages:
Setup avg:  0.068
Processing avg: 38204.116

mmap with PROT_NONE and signal handler and batch size of 16 pages:
Setup avg:  0.044
Processing avg: 29671.180

mmap with mlock(MLOCK_ONFAULT) on entire range:
Setup avg:  0.189
Processing avg: 17904.899
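
To put the configurations on one scale, the averages above can be
tabulated and combined.  The sketch below only restates the reported
numbers; struct run, total_ms() and processing_ratio() are names
invented for the example.

```c
#include <stddef.h>

/* One row per configuration; times are the reported averages in ms. */
struct run {
    const char *name;
    double setup_ms;
    double processing_ms;
};

static const struct run runs[] = {
    { "mlock(MLOCK_LOCKED) on entire range",     8228.666,  8274.257 },
    { "mlock(MLOCK_LOCKED) before each access",     0.113, 90993.552 },
    { "PROT_NONE + handler, batch of 1 page",       0.058, 69488.073 },
    { "PROT_NONE + handler, batch of 8 pages",      0.068, 38204.116 },
    { "PROT_NONE + handler, batch of 16 pages",     0.044, 29671.180 },
    { "mlock(MLOCK_ONFAULT) on entire range",       0.189, 17904.899 },
};
static const size_t n_runs = sizeof runs / sizeof runs[0];

/* Setup plus processing time for one configuration. */
static double total_ms(const struct run *r)
{
    return r->setup_ms + r->processing_ms;
}

/* Processing time of r relative to a baseline configuration. */
static double processing_ratio(const struct run *r,
                               const struct run *baseline)
{
    return r->processing_ms / baseline->processing_ms;
}
```

For example, processing_ratio(&runs[1], &runs[5]) compares locking
before each access against MLOCK_ONFAULT on the entire range.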

The signal handler in the batch cases faulted in memory in two steps to
avoid having to know the start and end of the faulting mapping.  The
first step covers the page that caused the fault as we know that it will
be possible to lock.  The second step speculatively tries to mlock and
mprotect the batch size - 1 pages that follow.  There may be a clever
way to avoid this without having the program track each mapping to be
covered by this handler in a globally accessible structure, but I could
not 

Re: [PATCH V3 0/5] Allow user to request memory to be locked on page fault

2015-07-07 Thread Andrew Morton
On Tue,  7 Jul 2015 13:03:38 -0400 Eric B Munson emun...@akamai.com wrote:

 mlock() allows a user to control page out of program memory, but this
 comes at the cost of faulting in the entire mapping when it is
 allocated.  For large mappings where the entire area is not necessary
 this is not ideal.  Instead of forcing all locked pages to be present
 when they are allocated, this set creates a middle ground.  Pages are
 marked to be placed on the unevictable LRU (locked) when they are first
 used, but they are not faulted in by the mlock call.
 
 This series introduces a new mlock() system call that takes a flags
 argument along with the start address and size.  This flags argument
 gives the caller the ability to request memory be locked in the
 traditional way, or to be locked after the page is faulted in.  New
 calls are added for munlock() and munlockall() which give the caller a
 way to specify which flags are supposed to be cleared.  A new MCL flag
 is added to mirror the lock on fault behavior from mlock() in
 mlockall().  Finally, a flag for mmap() is added that allows a user to
 specify that the covered area should not be paged out, but only after the
 memory has been used the first time.

Thanks for sticking with this.  Adding new syscalls is a bit of a
hassle but I do think we end up with a better interface - the existing
mlock/munlock/mlockall interfaces just aren't appropriate for these
things.

I don't know whether these syscalls should be documented via new
manpages, or if we should instead add them to the existing
mlock/munlock/mlockall manpages.  Michael, could you please advise?
