[PATCH 25/31] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2016-01-06 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <

[PATCH 25/32] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2015-12-14 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <

[PATCH 00/32] x86: Memory Protection Keys (v7)

2015-12-14 Thread Dave Hansen
nted fully enough to be usable. If you are interested in running this for real, please get in touch with me. Hardware is available to a very small but nonzero number of people. This set is also available here: git://git.kernel.org/pub/scm/linux/kernel/git/daveh/x86-pkeys.git pkeys-v018 ===

Re: [PATCH 26/34] mm: implement new mprotect_key() system call

2015-12-09 Thread Dave Hansen
On 12/09/2015 08:45 AM, Michael Kerrisk (man-pages) wrote: >>> >> * Explanation of what a protection domain is. >> > >> > A protection domain is a unique view of memory and is represented by the >> > value in the PKRU register. > Out something about this in pkey(7), but explain what you mean by a

Re: [PATCH 28/34] x86: wire up mprotect_key() system call

2015-12-08 Thread Dave Hansen
On 12/08/2015 10:44 AM, Thomas Gleixner wrote: > On Thu, 3 Dec 2015, Dave Hansen wrote: >> #include >> diff -puN mm/Kconfig~pkeys-16-x86-mprotect_key mm/Kconfig >> --- a/mm/Kconfig~pkeys-16-x86-mprotect_key 2015-12-03 16:21:31.114920208 >> -0800 >> +++

Re: [PATCH 26/34] mm: implement new mprotect_key() system call

2015-12-07 Thread Dave Hansen
On 12/04/2015 10:50 PM, Michael Kerrisk (man-pages) wrote: > On 12/04/2015 02:15 AM, Dave Hansen wrote: >> From: Dave Hansen <dave.han...@linux.intel.com> >> >> mprotect_key() is just like mprotect, except it also takes a >> protection key as an argument

Re: [PATCH 00/34] x86: Memory Protection Keys (v5)

2015-12-04 Thread Dave Hansen
On 12/04/2015 03:31 PM, Andy Lutomirski wrote: > On Thu, Dec 3, 2015 at 5:14 PM, Dave Hansen <d...@sr71.net> wrote: >> Memory Protection Keys for User pages is a CPU feature which will >> first appear on Skylake Servers, but will also be supported on >> future non

[PATCH 26/34] mm: implement new mprotect_key() system call

2015-12-03 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> mprotect_key() is just like mprotect, except it also takes a protection key as an argument. On systems that do not support protection keys, it still works, but requires that key=0. Otherwise it does exactly what mprotect does. I expect it
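
The rule stated in this snippet (key=0 is required when protection keys are unsupported) can be sketched as a small argument check. This is only an illustration: `check_mprotect_key_arg()` is a hypothetical helper, not code from the patch, and both the `-EINVAL` error code and the limit of 16 keys are assumptions.

```c
#include <errno.h>
#include <stdbool.h>

#define NR_PKEYS 16 /* assumption: 16 keys, as in the x86 PKRU register */

/*
 * Hypothetical helper, not taken from the patch: decide whether a key
 * passed to an mprotect_key()-style call is acceptable.
 * Returns 0 if the key may be used, or a negative errno value.
 * The choice of -EINVAL is an assumption.
 */
static int check_mprotect_key_arg(bool pkeys_supported, int key)
{
    /* Without hardware support only key 0 is accepted. */
    if (!pkeys_supported)
        return key == 0 ? 0 : -EINVAL;

    /* With support, the key must be within the hardware range. */
    if (key < 0 || key >= NR_PKEYS)
        return -EINVAL;

    return 0;
}
```

With key 0 the call then behaves like a plain mprotect() on any system, which matches the description above.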

[PATCH 24/34] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2015-12-03 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <

[PATCH 28/34] x86: wire up mprotect_key() system call

2015-12-03 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This is all that we need to get the new system call itself working on x86. Signed-off-by: Dave Hansen <dave.han...@linux.intel.com> Cc: linux-api@vger.kernel.org --- b/arch/x86/entry/syscalls/syscall_32.tbl |1 + b/arch/x86/en

[PATCH 00/34] x86: Memory Protection Keys (v5)

2015-12-03 Thread Dave Hansen
x86-pkeys.git pkeys-v014 === diffstat === Dave Hansen (34): mm, gup: introduce concept of "foreign" get_user_pages() x86, fpu: add placeholder for Processor Trace XSAVE state x86, pkeys: Add Kconfig option x86, pkeys: cpuid bit definition x86, pkeys: defin

[PATCH 31/34] x86, pkeys: allocation/free syscalls

2015-12-03 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This patch adds two new system calls: int pkey_alloc(unsigned long flags, unsigned long init_access_rights) int pkey_free(int pkey); These establish which protection keys are valid for use by userspace. A key
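
One way to picture "which protection keys are valid for use by userspace" is a per-process bitmap of allocated keys. The sketch below is a userspace model under stated assumptions, not the kernel implementation: `struct pkey_map` and the `model_*` functions are hypothetical names, the flags and initial access rights arguments are left out, key 0 is assumed to be always allocated, and the error codes are guesses.

```c
#include <errno.h>
#include <stdint.h>

#define NR_PKEYS 16 /* assumption: 16 keys, as in the x86 PKRU register */

/* Hypothetical model: bit n set means key n is allocated. */
struct pkey_map {
    uint16_t allocated;
};

/* Assumption: key 0 is the default key and is always allocated. */
static void model_pkey_map_init(struct pkey_map *m)
{
    m->allocated = 1u;
}

/* Hand out the lowest free key, or -ENOSPC if none is left. */
static int model_pkey_alloc(struct pkey_map *m)
{
    for (int k = 1; k < NR_PKEYS; k++) {
        if (!(m->allocated & (1u << k))) {
            m->allocated |= (uint16_t)(1u << k);
            return k;
        }
    }
    return -ENOSPC;
}

/* Release a key; key 0 and unallocated keys are rejected. */
static int model_pkey_free(struct pkey_map *m, int pkey)
{
    if (pkey <= 0 || pkey >= NR_PKEYS)
        return -EINVAL;
    if (!(m->allocated & (1u << pkey)))
        return -EINVAL;
    m->allocated &= (uint16_t)~(1u << pkey);
    return 0;
}
```

A freed key becomes available again to the next allocation, so callers in this model must stop using a key before freeing it.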

[PATCH 32/34] x86, pkeys: add pkey set/get syscalls

2015-12-03 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This establishes two more system calls for protection key management: unsigned long pkey_get(int pkey); int pkey_set(int pkey, unsigned long access_rights); The return value from pkey_get() and the 'access_rights'
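
The snippet is cut off before it describes the access-rights encoding, so the following is only a sketch based on the hardware PKRU layout: two bits per key, with the low bit disabling all access and the high bit disabling writes. The helper names are hypothetical, and mapping the syscall's `access_rights` argument onto these bits is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define NR_PKEYS            16   /* PKRU holds 16 keys */
#define PKRU_BITS_PER_PKEY  2    /* two rights bits per key */
#define PKEY_DISABLE_ACCESS 0x1u /* low bit: no reads or writes */
#define PKEY_DISABLE_WRITE  0x2u /* high bit: no writes */

/* Hypothetical helper: read the two rights bits for one key. */
static uint32_t pkru_get_rights(uint32_t pkru, int pkey)
{
    assert(pkey >= 0 && pkey < NR_PKEYS);
    return (pkru >> (pkey * PKRU_BITS_PER_PKEY)) & 0x3u;
}

/* Hypothetical helper: return a PKRU value with one key's bits replaced. */
static uint32_t pkru_set_rights(uint32_t pkru, int pkey, uint32_t rights)
{
    assert(pkey >= 0 && pkey < NR_PKEYS);
    assert((rights & ~0x3u) == 0);

    unsigned int shift = (unsigned int)pkey * PKRU_BITS_PER_PKEY;

    pkru &= ~(0x3u << shift);        /* clear the old rights bits */
    return pkru | (rights << shift); /* install the new ones */
}
```

Because only a register value changes, switching a key's rights in this model touches no page tables, which is the point made in the cover letters listed here.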

[PATCH 34/37] x86, pkeys: allocation/free syscalls

2015-11-16 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This patch adds two new system calls: int pkey_alloc(unsigned long flags, unsigned long init_access_rights) int pkey_free(int pkey); These establish which protection keys are valid for use by userspace. A key

[PATCH 29/37] mm: implement new mprotect_key() system call

2015-11-16 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> mprotect_key() is just like mprotect, except it also takes a protection key as an argument. On systems that do not support protection keys, it still works, but requires that key=0. Otherwise it does exactly what mprotect does. I expect it

[PATCH 35/37] x86, pkeys: add pkey set/get syscalls

2015-11-16 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This establishes two more system calls for protection key management: unsigned long pkey_get(int pkey); int pkey_set(int pkey, unsigned long access_rights); The return value from pkey_get() and the 'access_rights'

[PATCH 31/37] x86: wire up mprotect_key() system call

2015-11-16 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This is all that we need to get the new system call itself working on x86. Signed-off-by: Dave Hansen <dave.han...@linux.intel.com> Cc: linux-api@vger.kernel.org --- b/arch/x86/entry/syscalls/syscall_32.tbl |1 + b/arch/x86/en

[PATCH 27/37] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2015-11-16 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <

[PATCH 00/37] x86: Memory Protection Keys

2015-11-16 Thread Dave Hansen
Memory Protection Keys for User pages is a CPU feature which will first appear on Skylake Servers, but will also be supported on future non-server parts. It provides a mechanism for enforcing page-based protections, but without requiring modification of the page tables when an application changes

Re: [PATCH 21/25] mm: implement new mprotect_key() system call

2015-09-29 Thread Dave Hansen
On 09/28/2015 11:39 PM, Michael Ellerman wrote: > On Mon, 2015-09-28 at 12:18 -0700, Dave Hansen wrote: >> From: Dave Hansen <dave.han...@linux.intel.com> >> >> mprotect_key() is just like mprotect, except it also takes a >> protection key as an argument

[PATCH 21/25] mm: implement new mprotect_key() system call

2015-09-28 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> mprotect_key() is just like mprotect, except it also takes a protection key as an argument. On systems that do not support protection keys, it still works, but requires that key=0. Otherwise it does exactly what mprotect does. I expect it

[PATCH 20/25] mm, multi-arch: pass a protection key in to calc_vm_flag_bits()

2015-09-28 Thread Dave Hansen
From: Dave Hansen <dave.han...@linux.intel.com> This plumbs a protection key through calc_vm_flag_bits(). We could have done this in calc_vm_prot_bits(), but I did not feel super strongly which way to go. It was pretty arbitrary which one to use. Signed-off-by: Dave Hansen <

[PATCH 00/25] x86: Memory Protection Keys

2015-09-28 Thread Dave Hansen
I have addressed all known issues and review comments. I believe they are ready to be pulled in to the x86 tree. Note that this is also the first time anyone has seen the new 'selftests' code. If there are issues limited to it, I'd prefer to fix those up separately post-merge. Changes from

Re: [PATCH 10/23] userfaultfd: add new syscall to provide memory externalization

2015-06-23 Thread Dave Hansen
On 05/14/2015 10:31 AM, Andrea Arcangeli wrote: +static int userfaultfd_wake_function(wait_queue_t *wq, unsigned mode, + int wake_flags, void *key) +{ + struct userfaultfd_wake_range *range = key; + int ret; + struct userfaultfd_wait_queue *uwq;

Re: [PATCH v3 1/3] mm: introduce fincore()

2014-07-07 Thread Dave Hansen
+/* + * You can control how the buffer in userspace is filled with this mode + * parameters: I agree that we don't have any good mechanisms for looking at the page cache from userspace. I've hacked some things up using mincore() and they weren't pretty, so I welcome _something_ like this.

Re: [PATCH v3 3/3] man2/fincore.2: document general description about fincore(2)

2014-07-07 Thread Dave Hansen
On 07/07/2014 11:00 AM, Naoya Horiguchi wrote: +.SH RETURN VALUE +On success, +.BR fincore () +returns 0. +On error, \-1 is returned, and +.I errno +is set appropriately. Is this accurate? From reading the syscall itself, it looked like it did this: + * Return value is the number of

Re: [PATCH v3 1/3] mm: introduce fincore()

2014-07-07 Thread Dave Hansen
On 07/07/2014 01:21 PM, Naoya Horiguchi wrote: On Mon, Jul 07, 2014 at 12:01:41PM -0700, Dave Hansen wrote: But, is this trying to do too many things at once? Do we have solid use cases spelled out for each of these modes? Have we thought out how they will be used in practice? tools/vm

Re: [PATCH v3 3/3] man2/fincore.2: document general description about fincore(2)

2014-07-07 Thread Dave Hansen
On 07/07/2014 01:59 PM, Naoya Horiguchi wrote: On Mon, Jul 07, 2014 at 12:08:12PM -0700, Dave Hansen wrote: On 07/07/2014 11:00 AM, Naoya Horiguchi wrote: +.SH RETURN VALUE +On success, +.BR fincore () +returns 0. +On error, \-1 is returned, and +.I errno +is set appropriately

Re: [PATCH v21 001/100] eclone (1/11): Factor out code to allocate pidmap page

2010-05-03 Thread Dave Hansen
On Sat, 2010-05-01 at 15:10 -0700, David Miller wrote: NO WAY, there is no way in the world you should post 100 patches at a time to any mailing list, especially those at vger.kernel.org that have thousands upon thousands of subscribers. Post only small, well contained, sets of patches at a

Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

2009-03-13 Thread Dave Hansen
On Fri, 2009-03-13 at 14:01 -0700, Linus Torvalds wrote: On Fri, 13 Mar 2009, Alexey Dobriyan wrote: Let's face it, we're not going to _ever_ checkpoint any kind of general case process. Just TCP makes that fundamentally impossible in the general case, and there are lots and lots of

Re: How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

2009-02-27 Thread Dave Hansen
On Fri, 2009-02-27 at 01:31 +0300, Alexey Dobriyan wrote: I think the main question is: will we ever find ourselves in the future saying that C/R sucks, nobody but a small minority uses it, wish we had never merged it? I think the likelihood of that is very low. I think the current

Re: [RFC v13][PATCH 05/14] x86 support for checkpoint/restart

2009-02-24 Thread Dave Hansen
On Tue, 2009-02-24 at 01:47 -0600, Nathan Lynch wrote: But I think this has been pointed out before. If I understand the justification for cr_hbuf_get correctly, the allocations it services are somehow known to be bounded in size and nesting. But even if that is the case, it's not much of a

Re: Banning checkpoint (was: Re: What can OpenVZ do?)

2009-02-19 Thread Dave Hansen
On Thu, 2009-02-19 at 22:06 +0300, Alexey Dobriyan wrote: Inotify isn't supported yet? You do if (!list_empty(inode->inotify_watches)) return -E; without hooking into inotify syscalls. ptrace(2) isn't supported -- look at struct task_struct::ptraced and friends.

Re: What can OpenVZ do?

2009-02-18 Thread Dave Hansen
On Wed, 2009-02-18 at 19:16 +0100, Ingo Molnar wrote: Nothing motivates more than app designers complaining about the one-way flag. Furthermore, it's _far_ easier to make a one-way flag SMP-safe. We just set it and that's it. When we unset it, what do we do about SMP races with other

Re: What can OpenVZ do?

2009-02-17 Thread Dave Hansen
On Tue, 2009-02-17 at 23:23 +0100, Ingo Molnar wrote: * Dave Hansen d...@linux.vnet.ibm.com wrote: On Fri, 2009-02-13 at 11:53 +0100, Ingo Molnar wrote: In any case, by designing checkpointing to reuse the existing LSM callbacks, we'd hit multiple birds with the same stone. (One

Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-16 Thread Dave Hansen
On Fri, 2009-02-13 at 15:28 -0800, Andrew Morton wrote: For extra marks: - Will any of this involve non-trivial serialisation of kernel objects? If so, that's getting into the unacceptably-expensive-to-maintain space, I suspect. We have some structures that are certainly

Re: What can OpenVZ do?

2009-02-16 Thread Dave Hansen
On Fri, 2009-02-13 at 11:53 +0100, Ingo Molnar wrote: In any case, by designing checkpointing to reuse the existing LSM callbacks, we'd hit multiple birds with the same stone. (One of which is the constant complaints about the runtime costs of the LSM callbacks - with checkpointing we get an

What can OpenVZ do?

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote: On Thu, 12 Feb 2009 13:30:35 -0600 Matt Mackall m...@selenic.com wrote: On Thu, 2009-02-12 at 10:11 -0800, Dave Hansen wrote: - In bullet-point form, what features are missing, and should be added? * support for more

How much of a mess does OpenVZ make? ;) Was: What can OpenVZ do?

2009-02-12 Thread Dave Hansen
On Thu, 2009-02-12 at 14:10 -0800, Andrew Morton wrote: On Thu, 12 Feb 2009 13:51:23 -0800 Dave Hansen d...@linux.vnet.ibm.com wrote: On Thu, 2009-02-12 at 11:42 -0800, Andrew Morton wrote: On Thu, 12 Feb 2009 13:30:35 -0600 Matt Mackall m...@selenic.com wrote: On Thu, 2009-02

Re: [RFC v13][PATCH 00/14] Kernel based checkpoint/restart

2009-02-10 Thread Dave Hansen
On Tue, 2009-01-27 at 12:07 -0500, Oren Laadan wrote: Checkpoint-restart (c/r): a couple of fixes in preparation for 64bit architectures, and a couple of fixes for bugs (comments from Serge Hallyn, Sukadev Bhattiprolu and Nathan Lynch). Updated and tested against v2.6.28. Aiming for -mm.

Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Dave Hansen
On Thu, 2008-12-18 at 06:10 -0500, Oren Laadan wrote: +mutex_lock(&mm->context.lock); + +hh->ldt_entry_size = LDT_ENTRY_SIZE; +hh->nldt = mm->context.size; + +cr_debug("nldt %d\n", hh->nldt); + +ret = cr_write_obj(ctx, h, hh); +cr_hbuf_put(ctx, sizeof(*hh)); +

Re: [RFC v11][PATCH 05/13] Dump memory address space

2008-12-18 Thread Dave Hansen
not __get_free_page()? Hahaha .. well, it's a guaranteed method to keep Dave Hansen from barking about not using kmalloc ... Personally I prefer __get_free_page() here, but not enough to keep arguing with him. Let me know when the two of you settle it :) Alright, I just wasn't sure if it had been

Re: [RFC v11][PATCH 00/13] Kernel based checkpoint/restart

2008-12-16 Thread Dave Hansen
Andrew, I just realized that you weren't cc'd on these when they were posted. Can we give them a run in -mm? As far as I know, all review comments have been addressed and there's nothing outstanding. On Fri, 2008-12-05 at 12:31 -0500, Oren Laadan wrote: Checkpoint-restart (c/r): fixed races

Re: [RFC v11][PATCH 03/13] General infrastructure for checkpoint restart

2008-12-16 Thread Dave Hansen
On Tue, 2008-12-16 at 14:43 -0800, Mike Waychison wrote: Hmm, if I'm understanding you correctly, adding ref counts explicitly (like you suggest below) would be used to let a lower layer defer writes. Seems like this could be just as easily done with explicit kmallocs and transferring

Re: [RFC v10][PATCH 05/13] Dump memory address space

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 10:53 +, Al Viro wrote: +static int cr_ctx_checkpoint(struct cr_ctx *ctx, pid_t pid) +{ + ctx->root_pid = pid; + + /* + * assume checkpointer is in container's root vfs + * FIXME: this works for now, but will change with real containers

Re: [RFC v10][PATCH 02/13] Checkpoint/restart: initial documentation

2008-12-01 Thread Dave Hansen
On Fri, 2008-11-28 at 10:45 +, Al Viro wrote: On Wed, Nov 26, 2008 at 08:04:33PM -0500, Oren Laadan wrote: +Currently, namespaces are not saved or restored. They will be treated +as a class of a shared object. In particular, it is assumed that the +task's file system namespace is the

Re: [RFC v10][PATCH 09/13] Restore open file descriprtors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 15:41 -0500, Oren Laadan wrote: + fd = cr_attach_file(file); /* no need to cleanup 'file' below */ + if (fd < 0) { + filp_close(file, NULL); + ret = fd; + goto out; + } + + /* register new objref, file tuple in hash

Re: [RFC v10][PATCH 09/13] Restore open file descriprtors

2008-12-01 Thread Dave Hansen
On Mon, 2008-12-01 at 13:07 -0800, Dave Hansen wrote: When a shared object is inserted to the hash we automatically take another reference to it (according to its type) for as long as it remains in the hash. See: 'cr_obj_ref_grab()' and 'cr_obj_ref_drop()'. So by moving that call higher

Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-27 Thread Dave Hansen
On Mon, 2008-10-27 at 07:03 -0400, Oren Laadan wrote: In our implementation, we simply refused to checkpoint setid programs. True. And this works very well for HPC applications. However, it doesn't work so well for server applications, for instance. Also, you could use file system

Re: [RFC v7][PATCH 2/9] General infrastructure for checkpoint restart

2008-10-21 Thread Dave Hansen
On Tue, 2008-10-21 at 22:55 -0400, Daniel Jacobowitz wrote: I haven't been following - but why this whole container restriction? Checkpoint/restart of individual processes is very useful too. There are issues with e.g. IPC, but I'm not convinced they're substantially different than the issues