Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > > How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches? > > Those must have a defined state of the objects at all times and a > > constructor is > > required for that. And their use of RCU is required for numerous lockless > > lookup

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > If we want to get rid of the concept of constructors, it's doable, > but somebody needs to do the work to show what the effects will be. How do you envision dealing with the SLAB_TYPESAFE_BY_RCU slab caches? Those must have a defined state of the

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > Are you willing to have this kind of bug go uncaught for a while? There will be frequent allocations and this will show up at some point. Also you could put this into the debug only portions somewhere so we always catch it when debugging is on, '

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Christopher Lameter wrote: > On Tue, 10 Apr 2018, Matthew Wilcox wrote: > > > __GFP_ZERO requests that the object be initialised to all-zeroes, > > while the purpose of a constructor is to initialise an object to a > > particular pattern. We cannot

Re: [PATCH 1/2] slab: __GFP_ZERO is incompatible with a constructor

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Matthew Wilcox wrote: > __GFP_ZERO requests that the object be initialised to all-zeroes, > while the purpose of a constructor is to initialise an object to a > particular pattern. We cannot do both. Add a warning to catch any > users who mistakenly pass a __GFP_ZERO flag
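
The conflict described in the quoted patch text can be illustrated with a small userspace model. This is only a sketch under stated assumptions: every `model_*` and `MODEL_*` name is hypothetical, the flag values are arbitrary, and the constructor runs at allocation time for simplicity (the kernel slab allocators run it when a slab is set up). Warning and then ignoring the zero flag is a choice made for this sketch, not necessarily what the patch does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins for gfp flags; the values are arbitrary. */
#define MODEL_GFP_KERNEL 0x1u
#define MODEL_GFP_ZERO   0x2u

/* Toy cache: an object size plus an optional constructor. */
struct model_cache {
    size_t size;
    void (*ctor)(void *obj);
};

/* Example object whose constructor sets a non-zero pattern. */
struct model_item {
    int magic;
};

static void model_item_ctor(void *obj)
{
    ((struct model_item *)obj)->magic = 42;
}

/* True when zeroing is requested on a cache that has a constructor. */
static bool model_flags_conflict(const struct model_cache *c, unsigned int flags)
{
    return c->ctor != NULL && (flags & MODEL_GFP_ZERO) != 0;
}

/* Allocate one object; warn about a conflicting request and ignore the zero flag. */
static void *model_cache_alloc(const struct model_cache *c, unsigned int flags)
{
    void *obj = malloc(c->size);

    if (obj == NULL)
        return NULL;
    if (model_flags_conflict(c, flags)) {
        fprintf(stderr, "warning: zeroing requested on a cache with a constructor\n");
        flags &= ~MODEL_GFP_ZERO;
    }
    if (c->ctor != NULL)
        c->ctor(obj);               /* constructed pattern wins */
    else if (flags & MODEL_GFP_ZERO)
        memset(obj, 0, c->size);    /* plain cache: honour the zero request */
    return obj;
}
```

In this model an object can come back either zeroed or constructed, never both, which is why a request for both can only be reported rather than satisfied.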

Re: [RFC] mm, slab: reschedule cache_reap() on the same CPU

2018-04-10 Thread Christopher Lameter
On Tue, 10 Apr 2018, Vlastimil Babka wrote: > cache_reap() is initially scheduled in start_cpu_timer() via > schedule_delayed_work_on(). But then the next iterations are scheduled via > schedule_delayed_work(), thus using WORK_CPU_UNBOUND. That is a bug. cache_reap must run on the same cpu

Re: [PATCH 2/2] kfree_rcu() should use kfree_bulk() interface

2018-04-02 Thread Christopher Lameter
On Sun, 1 Apr 2018, rao.sho...@oracle.com wrote: > kfree_rcu() should use the new kfree_bulk() interface for freeing > rcu structures as it is more efficient. It would be even better if this approach could also use kmem_cache_free_bulk() or kfree_bulk()

Re: [RFC PATCH for 4.17 02/21] rseq: Introduce restartable sequences system call (v12)

2018-04-02 Thread Christopher Lameter
On Sun, 1 Apr 2018, Alan Cox wrote: > >Restartable sequences are atomic with respect to preemption > >(making it atomic with respect to other threads running on the > >same CPU), as well as signal delivery (user-space execution > >contexts nested over the

Re: [PATCH] slab, slub: skip unnecessary kasan_cache_shutdown()

2018-03-28 Thread Christopher Lameter
On Tue, 27 Mar 2018, Shakeel Butt wrote: > The kasan quarantine is designed to delay freeing slab objects to catch > use-after-free. The quarantine can be large (several percent of machine > memory size). When kmem_caches are deleted related objects are flushed > from the quarantine but this

Re: [PATCH] slab_common: remove test if cache name is accessible

2018-03-23 Thread Christopher Lameter
On Fri, 23 Mar 2018, Mikulas Patocka wrote: > Since the commit db265eca7700 ("mm/sl[aou]b: Move duping of slab name to > slab_common.c"), the kernel always duplicates the slab cache name when > creating a slab cache, so the test if the slab name is accessible is > useless. Acked-by: Christoph

Re: [PATCH] slab, slub: remove size disparity on debug kernel

2018-03-13 Thread Christopher Lameter
On Tue, 13 Mar 2018, Shakeel Butt wrote: > However for SLUB in debug kernel, the sizes were the same. On further > inspection it is found that SLUB always uses kmem_cache.object_size to > measure the kmem_cache.size while SLAB uses the given kmem_cache.size. In > the debug kernel the slab's size can be

Re: [PATCH] mm/slab.c: remove duplicated check of colour_next

2018-03-12 Thread Christopher Lameter
Acked-by: Christoph Lameter

Re: [PATCH v2] slub: use jitter-free reference while printing age

2018-03-08 Thread Christopher Lameter
On Thu, 8 Mar 2018, Chintan Pandya wrote: > In this case, object got freed later but 'age' > shows otherwise. This could be because, while > printing this info, we print allocation traces > first and free traces thereafter. In between, > if we get scheduled out or jiffies increment, > (jiffies -
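
One way to read the "jitter-free reference" in the subject line is to sample the clock once and compute every age against that single snapshot. The following userspace sketch shows that idea only; `model_track`, `model_age` and `model_print_ages` are hypothetical names, and the patch itself may take a different approach.

```c
#include <stdio.h>

/* Hypothetical stand-in for a per-object tracking record. */
struct model_track {
    unsigned long when;   /* tick count at which the event happened */
};

/* Age relative to one caller-supplied snapshot of the clock. */
static unsigned long model_age(unsigned long now, const struct model_track *t)
{
    return now - t->when;
}

/* Print both ages against the same snapshot so they stay comparable. */
static void model_print_ages(unsigned long now,
                             const struct model_track *alloc_track,
                             const struct model_track *free_track)
{
    printf("allocated: age=%lu\n", model_age(now, alloc_track));
    printf("freed:     age=%lu\n", model_age(now, free_track));
}
```

With one snapshot, an object freed after it was allocated can never show a larger free age than allocation age. If the clock is re-read between the two prints, a delay in between can invert that ordering.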

Re: [PATCH] slub: Fix misleading 'age' in verbose slub prints

2018-03-08 Thread Christopher Lameter
On Thu, 8 Mar 2018, Chintan Pandya wrote: > > If you print the raw value, then you can do the subtraction yourself; > > if you've subtracted it from jiffies each time, you've at least introduced > > jitter, and possibly enough jitter to confuse and mislead. > > > This is exactly what I was

Re: [PATCH] slub: Fix misleading 'age' in verbose slub prints

2018-03-07 Thread Christopher Lameter
On Wed, 7 Mar 2018, Chintan Pandya wrote: > In this case, object got freed later but 'age' shows > otherwise. This could be because, while printing > this info, we print allocation traces first and > free traces thereafter. In between, if we get scheduled > out, (jiffies - t->when) could become

Re: [PATCH v2 0/3] Directed kmem charging

2018-02-22 Thread Christopher Lameter
On Wed, 21 Feb 2018, Andrew Morton wrote: > What do others think? I think the changes to the hotpaths of the slab allocators increasing register pressure in some of the hottest paths of the kernel are problematic. It's better to do the allocation properly in the task context to which it is

Re: [PATCH v2 0/3] Directed kmem charging

2018-02-22 Thread Christopher Lameter
On Thu, 22 Feb 2018, Jan Kara wrote: > I don't see how task work can be used here. Firstly I don't know of a case > where task work would be used for something else than the current task - > and that is substantial because otherwise you have to deal with lots of > problems like races with task

Re: [PATCH v2 0/3] Directed kmem charging

2018-02-21 Thread Christopher Lameter
On Wed, 21 Feb 2018, Shakeel Butt wrote: > On Wed, Feb 21, 2018 at 8:09 AM, Christopher Lameter wrote: > > Another way to solve this is to switch the user context right? > > > > Isn't it possible to avoid these patches if we do the allocation in another > > task context

Re: [PATCH v2 3/3] fs: fsnotify: account fsnotify metadata to kmemcg

2018-02-21 Thread Christopher Lameter
On Tue, 20 Feb 2018, Shakeel Butt wrote: > diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c > index 6702a6a0bbb5..0d9493ebc7cd 100644 > --- a/fs/notify/fanotify/fanotify.c > +++ b/fs/notify/fanotify/fanotify.c > if (fanotify_is_perm_event(mask)) { >

Re: [PATCH v2 0/3] Directed kmem charging

2018-02-21 Thread Christopher Lameter
Another way to solve this is to switch the user context right? Isn't it possible to avoid these patches if we do the allocation in another task context instead? Are there really any other use cases beyond fsnotify? The charging of the memory works on a per page level but the allocation occur from

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-16 Thread Christopher Lameter
On Fri, 16 Feb 2018, Matthew Wilcox wrote: > On Fri, Feb 16, 2018 at 09:44:25AM -0600, Christopher Lameter wrote: > > On Thu, 15 Feb 2018, Matthew Wilcox wrote: > > > What I was proposing was an intermediate page allocator where slab would > > > request 2MB f

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-16 Thread Christopher Lameter
On Thu, 15 Feb 2018, Matthew Wilcox wrote: > > The inducing of releasing memory back is not there but you can run SLUB > > with MAX_ORDER allocations by passing "slab_min_order=9" or so on bootup. > > This is subtly different from the idea that I had. If you set > slub_min_order to 9, then slub

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-16 Thread Christopher Lameter
On Thu, 15 Feb 2018, Matthew Wilcox wrote: > On Thu, Feb 15, 2018 at 09:49:00AM -0600, Christopher Lameter wrote: > > On Thu, 15 Feb 2018, Matthew Wilcox wrote: > > > > > What if ... on startup, slab allocated a MAX_ORDER page for itself. > > > It would the

Re: [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct

2018-02-15 Thread Christopher Lameter
On Thu, 15 Feb 2018, Matthew Wilcox wrote: > I dunno. Yes, there's macro trickery going on here, but it certainly > resembles a function. It doesn't fail any of the rules laid out in that > chapter of coding-style about unacceptable uses of macros. It sure looks like a function but does magic

Re: [PATCH 1/3] percpu: match chunk allocator declarations with definitions

2018-02-15 Thread Christopher Lameter
On Thu, 15 Feb 2018, Dennis Zhou wrote: > At some point the function declaration parameters got out of sync with > the function definitions in percpu-vm.c and percpu-km.c. This patch > makes them match again. Acked-by: Christoph Lameter

Re: [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct

2018-02-15 Thread Christopher Lameter
On Wed, 14 Feb 2018, Matthew Wilcox wrote: > > Uppercase like the similar KMEM_CACHE related macros in > > include/linux/slab.h? > > Do you think that would look better in the users? Compare: Does looking matter? I thought we had the convention that macros are uppercase. There are some tricks

Re: [patch 1/2] mm, page_alloc: extend kernelcore and movablecore for percent

2018-02-15 Thread Christopher Lameter
On Thu, 15 Feb 2018, Matthew Wilcox wrote: > What if ... on startup, slab allocated a MAX_ORDER page for itself. > It would then satisfy its own page allocation requests from this giant > page. If we start to run low on memory in the rest of the system, slab > can be induced to return some of it

Re: [PATCH 2/2] mm: Add kvmalloc_ab_c and kvzalloc_struct

2018-02-14 Thread Christopher Lameter
On Wed, 14 Feb 2018, Matthew Wilcox wrote: > +#define kvzalloc_struct(p, member, n, gfp) \ > + (typeof(p))kvzalloc_ab_c(n, \ > + sizeof(*(p)->member) + __must_be_array((p)->member),\ > +
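
The quoted macro is cut off, so the following is only a userspace sketch of the general idea it appears to build on: size an allocation as a header plus `n` array elements, and refuse the request if `a * b + c` would overflow. Reading the `_ab_c` suffix as "a times b plus c" is an assumption, and `model_size_ab_c`, `model_zalloc_ab_c` and `struct model_vec` are hypothetical names that do not come from the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Store a * b + c in *out; return 0 on success, -1 if it would overflow size_t. */
static int model_size_ab_c(size_t a, size_t b, size_t c, size_t *out)
{
    /* a * b + c <= SIZE_MAX  <=>  a <= (SIZE_MAX - c) / b   (for b > 0) */
    if (b != 0 && a > (SIZE_MAX - c) / b)
        return -1;
    *out = a * b + c;
    return 0;
}

/* Zeroed allocation of a * b + c bytes, or NULL on overflow or failure. */
static void *model_zalloc_ab_c(size_t a, size_t b, size_t c)
{
    size_t bytes;

    if (model_size_ab_c(a, b, c, &bytes) != 0)
        return NULL;
    return calloc(1, bytes ? bytes : 1);
}

/* Example layout: a header followed by a flexible array member. */
struct model_vec {
    size_t len;
    int items[];
};
```

A caller would size a four-element vector as `model_zalloc_ab_c(4, sizeof(int), sizeof(struct model_vec))`, so the multiplication and addition are checked in one place instead of being open-coded at every call site.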

Re: [PATCH for-4.17] percpu: add Dennis Zhou as a percpu co-maintainer

2018-02-12 Thread Christopher Lameter
Acked-by: Christopher Lameter <c...@linux.com>

Re: [PATCH 0/2] rcu: Transform kfree_rcu() into kvfree_rcu()

2018-02-07 Thread Christopher Lameter
On Wed, 7 Feb 2018, Steven Rostedt wrote: > But a generic "malloc" or "free" that does things differently depending > on the size is a different story. They would not be used for cases with special requirements but for the throwaway allows where no one cares about these details. It's just a

Re: [PATCH 0/2] rcu: Transform kfree_rcu() into kvfree_rcu()

2018-02-07 Thread Christopher Lameter
On Tue, 6 Feb 2018, Matthew Wilcox wrote: > Personally, I would like us to rename kvfree() to just free(), and have > malloc(x) be an alias to kvmalloc(x, GFP_KERNEL), but I haven't won that > fight yet. Maybe let's implement malloc(), free() and realloc() in the kernel to be consistent with user

Re: [PATCH 0/2] rcu: Transform kfree_rcu() into kvfree_rcu()

2018-02-07 Thread Christopher Lameter
On Tue, 6 Feb 2018, Paul E. McKenney wrote: > So it is OK to kvmalloc() something and pass it to either kfree() or > kvfree(), and it had better be OK to kvmalloc() something and pass it > to kvfree(). kvfree() is fine but not kfree().

Re: [kernel-hardening] [PATCH 4/6] Protectable Memory

2018-02-05 Thread Christopher Lameter
On Sat, 3 Feb 2018, Igor Stoppa wrote: > > We could even do this in a more thorough way. Can we use a ring 1 / 2 > > distinction to create a hardened OS core that policies the rest of > > the ever expanding kernel with all its modules and this and that feature? > > What would be the

Re: [PATCH 3/6] struct page: add field for vm_struct

2018-02-05 Thread Christopher Lameter
On Sat, 3 Feb 2018, Igor Stoppa wrote: > - the property of the compound page will affect the property of all the > pages in the compound, so when one is write protected, it can generate a > lot of wasted memory, if there is too much slack (because of the order) > With vmalloc, I can allocate any

Re: [PATCH 3/6] struct page: add field for vm_struct

2018-02-02 Thread Christopher Lameter
On Thu, 1 Feb 2018, Igor Stoppa wrote: > > Would it not be better to use compound page allocations here? > > page_head(whatever) gets you the head page where you can store all sorts > > of information about the chunk of memory. > > Can you please point me to this function/macro? I don't seem to

Re: [kernel-hardening] [PATCH 4/6] Protectable Memory

2018-02-02 Thread Christopher Lameter
On Thu, 25 Jan 2018, Matthew Wilcox wrote: > It's worth having a discussion about whether we want the pmalloc API > or whether we want a slab-based API. We can have a separate discussion > about an API to remove pages from the physmap. We could even do this in a more thorough way. Can we use a

Re: [PATCH 3/6] struct page: add field for vm_struct

2018-01-31 Thread Christopher Lameter
On Tue, 30 Jan 2018, Igor Stoppa wrote: > @@ -1769,6 +1774,9 @@ void *__vmalloc_node_range(unsigned long size, unsigned > long align, > > kmemleak_vmalloc(area, size, gfp_mask); > > + for (page_counter = 0; page_counter < area->nr_pages; page_counter++) > +

Re: [PATCH] mm: numa: Do not trap faults on shared data section pages.

2018-01-19 Thread Christopher Lameter
On Thu, 18 Jan 2018, Henry Willard wrote: > If MPOL_MF_LAZY were allowed and specified things would not work > correctly. change_pte_range() is unaware of and can’t honor the > difference between MPOL_MF_MOVE_ALL and MPOL_MF_MOVE. Not sure how that relates to what I said earlier... Sorry. > >

Re: [PATCH] mm: numa: Do not trap faults on shared data section pages.

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Mel Gorman wrote: > My main source of discomfort is the fact that this is permanent as two > processes perfectly isolated but with a suitably shared COW mapping > will never migrate the data. A potential improvement to get the reported > bandwidth up in the test program would

Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Wed, 17 Jan 2018, Mike Galbraith wrote: > Domain connectivity very much is a property of a set of CPUs, a rather > important one, and one managed by cpusets.  NOHZ_FULL is a property of > a set of cpus, thus a most excellent fit.  Other things are as well. Not sure to what domain refers to in

Re: [GIT PULL] isolation: 1Hz residual tick offloading v3

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Mike Galbraith wrote: > > I tried to remove isolcpus or at least change the way it works so that its > > effects are reversible (ie: affine the init task instead of isolating > > domains) > > but that got nacked due to the behaviour's expectations for userspace. > > So we

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-17 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > On Tue, Jan 16, 2018 at 12:17:01PM -0600, Christopher Lameter wrote: > > Draft patch of how the data structs could change. kmem_cache_attr is read > > only. > > Looks good. Although I would add Kees' user feature: Sure I tried

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
Draft patch of how the data structs could change. kmem_cache_attr is read only. Index: linux/include/linux/slab.h === --- linux.orig/include/linux/slab.h +++ linux/include/linux/slab.h @@ -135,9 +135,17 @@ struct mem_cgroup; void

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > > Sure this data is never changed. It can be const. > > It's changed at initialisation. Look: > > kmem_cache_create(const char *name, size_t size, size_t align, > slab_flags_t flags, void (*ctor)(void *)) > s =

Re: kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Tue, 16 Jan 2018, Matthew Wilcox wrote: > I think that's a good thing! /proc/slabinfo really starts to get grotty > above 16 bytes. I'd like to chop off "_cache" from the name of every > single slab! If ext4_allocation_context has to become ext4_alloc_ctx, > I don't think we're going to

kmem_cache_attr (was Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting)

2018-01-16 Thread Christopher Lameter
On Sun, 14 Jan 2018, Matthew Wilcox wrote: > > Hmmm... At some point we should switch kmem_cache_create to pass a struct > > containing all the parameters. Otherwise the API will blow up with > > additional functions. > > Obviously I agree with you. I'm inclined to not let that delay Kees' >

RE: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting

2018-01-12 Thread Christopher Lameter
On Fri, 12 Jan 2018, David Laight wrote: > > Hmmm... At some point we should switch kmem_cache_create to pass a struct > > containing all the parameters. Otherwise the API will blow up with > > additional functions. > > Or add an extra function to 'configure' the kmem_cache with the > extra

Re: [PATCH 02/38] usercopy: Enhance and rename report_usercopy()

2018-01-11 Thread Christopher Lameter
On Wed, 10 Jan 2018, Kees Cook wrote: > diff --git a/mm/slab.h b/mm/slab.h > index ad657ffa44e5..7d29e69ac310 100644 > --- a/mm/slab.h > +++ b/mm/slab.h > @@ -526,4 +526,10 @@ static inline int cache_random_seq_create(struct > kmem_cache *cachep, > static inline void

Re: [PATCH 05/36] usercopy: WARN() on slab cache usercopy region violations

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > @@ -3823,11 +3825,9 @@ int __check_heap_object(const void *ptr, unsigned long > n, struct page *page, Could we do the check in mm/slab_common.c for all allocators and just have a small function in each allocator that gives you the metadata needed for the

Re: [PATCH 04/36] usercopy: Prepare for usercopy whitelisting

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > +struct kmem_cache *kmem_cache_create_usercopy(const char *name, > + size_t size, size_t align, slab_flags_t flags, > + size_t useroffset, size_t usersize, > + void (*ctor)(void *)); Hmmm... At

Re: [PATCH 02/36] usercopy: Include offset in overflow report

2018-01-10 Thread Christopher Lameter
On Tue, 9 Jan 2018, Kees Cook wrote: > -static void report_usercopy(unsigned long len, bool to_user, const char > *type) > +int report_usercopy(const char *name, const char *detail, bool to_user, > + unsigned long offset, unsigned long len) > { > - pr_emerg("kernel memory %s
