Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2018-01-23 Thread Konstantin Khlebnikov

On 22.01.2018 23:51, Andy Lutomirski wrote:

On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov
 wrote:

On 08.10.2017 12:16, Christoph Hellwig wrote:


This looks fine in general, but a few comments:

   - can you split adding the new function from switching over the fork
 codeok




   - at least kasan and vmalloc_user/vmalloc_32_user use very similar
 patterns, can you switch them over as well?



I don't see why VM_USERMAP cannot be set right at allocation.

I'll add vm_flags argument to __vmalloc_node() and
pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
in separate patch.

KASAN is different: it allocates shadow area for area allocated for module.
Pointer to module area must be pushed from module_alloc().
This isn't worth optimization.


   - the new __alloc_vm_area looks very different from alloc_vm_area,
 maybe it needs a better name?  vmalloc_range_area for example?



__vmalloc_area() is vacant - this most low-level, so I'll keep "__".


   - when you split an existing function please keep the more low-level
 function on top of the higher level one that calls it.ok


Did this ever get re-sent?



It seems not. Probably lost in race-condition with my vacation.
Will do.


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2018-01-23 Thread Konstantin Khlebnikov

On 22.01.2018 23:51, Andy Lutomirski wrote:

On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov
 wrote:

On 08.10.2017 12:16, Christoph Hellwig wrote:


This looks fine in general, but a few comments:

   - can you split adding the new function from switching over the fork
 codeok




   - at least kasan and vmalloc_user/vmalloc_32_user use very similar
 patterns, can you switch them over as well?



I don't see why VM_USERMAP cannot be set right at allocation.

I'll add vm_flags argument to __vmalloc_node() and
pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
in separate patch.

KASAN is different: it allocates shadow area for area allocated for module.
Pointer to module area must be pushed from module_alloc().
This isn't worth optimization.


   - the new __alloc_vm_area looks very different from alloc_vm_area,
 maybe it needs a better name?  vmalloc_range_area for example?



__vmalloc_area() is vacant - this most low-level, so I'll keep "__".


   - when you split an existing function please keep the more low-level
 function on top of the higher level one that calls it.ok


Did this ever get re-sent?



It seems not. Probably lost in race-condition with my vacation.
Will do.


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2018-01-22 Thread Andy Lutomirski
On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov
 wrote:
> On 08.10.2017 12:16, Christoph Hellwig wrote:
>>
>> This looks fine in general, but a few comments:
>>
>>   - can you split adding the new function from switching over the fork
>> codeok
>
>
>>   - at least kasan and vmalloc_user/vmalloc_32_user use very similar
>> patterns, can you switch them over as well?
>
>
> I don't see why VM_USERMAP cannot be set right at allocation.
>
> I'll add vm_flags argument to __vmalloc_node() and
> pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
> in separate patch.
>
> KASAN is different: it allocates shadow area for area allocated for module.
> Pointer to module area must be pushed from module_alloc().
> This isn't worth optimization.
>
>>   - the new __alloc_vm_area looks very different from alloc_vm_area,
>> maybe it needs a better name?  vmalloc_range_area for example?
>
>
> __vmalloc_area() is vacant - this most low-level, so I'll keep "__".
>
>>   - when you split an existing function please keep the more low-level
>> function on top of the higher level one that calls it.ok

Did this ever get re-sent?


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2018-01-22 Thread Andy Lutomirski
On Wed, Oct 11, 2017 at 6:32 AM, Konstantin Khlebnikov
 wrote:
> On 08.10.2017 12:16, Christoph Hellwig wrote:
>>
>> This looks fine in general, but a few comments:
>>
>>   - can you split adding the new function from switching over the fork
>> codeok
>
>
>>   - at least kasan and vmalloc_user/vmalloc_32_user use very similar
>> patterns, can you switch them over as well?
>
>
> I don't see why VM_USERMAP cannot be set right at allocation.
>
> I'll add vm_flags argument to __vmalloc_node() and
> pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
> in separate patch.
>
> KASAN is different: it allocates shadow area for area allocated for module.
> Pointer to module area must be pushed from module_alloc().
> This isn't worth optimization.
>
>>   - the new __alloc_vm_area looks very different from alloc_vm_area,
>> maybe it needs a better name?  vmalloc_range_area for example?
>
>
> __vmalloc_area() is vacant - this most low-level, so I'll keep "__".
>
>>   - when you split an existing function please keep the more low-level
>> function on top of the higher level one that calls it.ok

Did this ever get re-sent?


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-11 Thread Konstantin Khlebnikov

On 08.10.2017 12:16, Christoph Hellwig wrote:

This looks fine in general, but a few comments:

  - can you split adding the new function from switching over the fork
codeok



  - at least kasan and vmalloc_user/vmalloc_32_user use very similar
patterns, can you switch them over as well?


I don't see why VM_USERMAP cannot be set right at allocation.

I'll add vm_flags argument to __vmalloc_node() and
pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
in separate patch.

KASAN is different: it allocates shadow area for area allocated for module.
Pointer to module area must be pushed from module_alloc().
This isn't worth optimization.


  - the new __alloc_vm_area looks very different from alloc_vm_area,
maybe it needs a better name?  vmalloc_range_area for example?


__vmalloc_area() is vacant - this most low-level, so I'll keep "__".


  - when you split an existing function please keep the more low-level
function on top of the higher level one that calls it.ok


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-11 Thread Konstantin Khlebnikov

On 08.10.2017 12:16, Christoph Hellwig wrote:

This looks fine in general, but a few comments:

  - can you split adding the new function from switching over the fork
codeok



  - at least kasan and vmalloc_user/vmalloc_32_user use very similar
patterns, can you switch them over as well?


I don't see why VM_USERMAP cannot be set right at allocation.

I'll add vm_flags argument to __vmalloc_node() and
pass here VM_USERMAP from vmalloc_user/vmalloc_32_user
in separate patch.

KASAN is different: it allocates shadow area for area allocated for module.
Pointer to module area must be pushed from module_alloc().
This isn't worth optimization.


  - the new __alloc_vm_area looks very different from alloc_vm_area,
maybe it needs a better name?  vmalloc_range_area for example?


__vmalloc_area() is vacant - this most low-level, so I'll keep "__".


  - when you split an existing function please keep the more low-level
function on top of the higher level one that calls it.ok


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-08 Thread Christoph Hellwig
This looks fine in general, but a few comments:

 - can you split adding the new function from switching over the fork
   code?
 - at least kasan and vmalloc_user/vmalloc_32_user use very similar
   patterns, can you switch them over as well?
 - the new __alloc_vm_area looks very different from alloc_vm_area,
   maybe it needs a better name?  vmalloc_range_area for example?
 - when you split an existing function please keep the more low-level
   function on top of the higher level one that calls it.


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-08 Thread Christoph Hellwig
This looks fine in general, but a few comments:

 - can you split adding the new function from switching over the fork
   code?
 - at least kasan and vmalloc_user/vmalloc_32_user use very similar
   patterns, can you switch them over as well?
 - the new __alloc_vm_area looks very different from alloc_vm_area,
   maybe it needs a better name?  vmalloc_range_area for example?
 - when you split an existing function please keep the more low-level
   function on top of the higher level one that calls it.


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-06 Thread Andy Lutomirski
On Fri, Oct 6, 2017 at 4:35 AM, Konstantin Khlebnikov
 wrote:
> This same as __vmalloc_node_range() but returns vm_struct rather than
> virtual address. This allows to kill one call of find_vm_area() for
> each task stack allocation for CONFIG_VMAP_STACK=y.
>
> And fix comment about that task holds cache of vm area: this cache used
> for retrieving actual stack pages, freeing is done by vfree_deferred().

Nice!


Re: [PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-06 Thread Andy Lutomirski
On Fri, Oct 6, 2017 at 4:35 AM, Konstantin Khlebnikov
 wrote:
> This same as __vmalloc_node_range() but returns vm_struct rather than
> virtual address. This allows to kill one call of find_vm_area() for
> each task stack allocation for CONFIG_VMAP_STACK=y.
>
> And fix comment about that task holds cache of vm area: this cache used
> for retrieving actual stack pages, freeing is done by vfree_deferred().

Nice!


[PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-06 Thread Konstantin Khlebnikov
This same as __vmalloc_node_range() but returns vm_struct rather than
virtual address. This allows to kill one call of find_vm_area() for
each task stack allocation for CONFIG_VMAP_STACK=y.

And fix comment about that task holds cache of vm area: this cache used
for retrieving actual stack pages, freeing is done by vfree_deferred().

Signed-off-by: Konstantin Khlebnikov 
---
 include/linux/vmalloc.h |6 ++
 kernel/fork.c   |   26 ++
 mm/vmalloc.c|   33 -
 3 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 2d92dd002abd..ce94ab55d1d6 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -130,6 +130,12 @@ extern struct vm_struct *__get_vm_area_caller(unsigned 
long size,
unsigned long flags,
unsigned long start, unsigned long end,
const void *caller);
+extern struct vm_struct *__alloc_vm_area(unsigned long size,
+   unsigned long align,
+   unsigned long start, unsigned long end,
+   gfp_t gfp_mask, pgprot_t prot,
+   unsigned long vm_flags, int node,
+   const void *caller);
 extern struct vm_struct *remove_vm_area(const void *addr);
 extern struct vm_struct *find_vm_area(const void *addr);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index e702cb9ffbd8..c4ff0303b7c5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -204,12 +204,10 @@ static int free_vm_stack_cache(unsigned int cpu)
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int 
node)
 {
 #ifdef CONFIG_VMAP_STACK
-   void *stack;
+   struct vm_struct *s;
int i;
 
for (i = 0; i < NR_CACHED_STACKS; i++) {
-   struct vm_struct *s;
-
s = this_cpu_xchg(cached_stacks[i], NULL);
 
if (!s)
@@ -219,20 +217,16 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
return s->addr;
}
 
-   stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
-VMALLOC_START, VMALLOC_END,
-THREADINFO_GFP,
-PAGE_KERNEL,
-0, node, __builtin_return_address(0));
+   s = __alloc_vm_area(THREAD_SIZE, THREAD_ALIGN,
+   VMALLOC_START, VMALLOC_END,
+   THREADINFO_GFP, PAGE_KERNEL,
+   0, node, __builtin_return_address(0));
+   if (unlikely(!s))
+   return NULL;
 
-   /*
-* We can't call find_vm_area() in interrupt context, and
-* free_thread_stack() can be called in interrupt context,
-* so cache the vm_struct.
-*/
-   if (stack)
-   tsk->stack_vm_area = find_vm_area(stack);
-   return stack;
+   /* Cache the vm_struct for stack to page conversions. */
+   tsk->stack_vm_area = s;
+   return s->addr;
 #else
struct page *page = alloc_pages_node(node, THREADINFO_GFP,
 THREAD_SIZE_ORDER);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 8a43db6284eb..456652c0660f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1750,6 +1750,37 @@ void *__vmalloc_node_range(unsigned long size, unsigned 
long align,
const void *caller)
 {
struct vm_struct *area;
+
+   area = __alloc_vm_area(size, align, start, end, gfp_mask,
+  prot, vm_flags, node, caller);
+
+   return area ? area->addr : NULL;
+}
+
+/**
+ * __alloc_vm_area  -  allocate virtually contiguous memory
+ * @size:  allocation size
+ * @align: desired alignment
+ * @start: vm area range start
+ * @end:   vm area range end
+ * @gfp_mask:  flags for the page level allocator
+ * @prot:  protection mask for the allocated pages
+ * @vm_flags:  additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node:  node to use for allocation or NUMA_NO_NODE
+ * @caller:caller's return address
+ *
+ * Allocate enough pages to cover @size from the page level
+ * allocator with @gfp_mask flags.  Map them into contiguous
+ * kernel virtual space, using a pagetable protection of @prot
+ *
+ * Returns the area descriptor on success or %NULL on failure.
+ */
+struct vm_struct *__alloc_vm_area(unsigned long size, unsigned long align,
+   unsigned long start, unsigned long end, gfp_t gfp_mask,
+   pgprot_t prot, unsigned long vm_flags, int node,
+   

[PATCH] vmalloc: add __alloc_vm_area() for optimizing vmap stack

2017-10-06 Thread Konstantin Khlebnikov
This same as __vmalloc_node_range() but returns vm_struct rather than
virtual address. This allows to kill one call of find_vm_area() for
each task stack allocation for CONFIG_VMAP_STACK=y.

And fix comment about that task holds cache of vm area: this cache used
for retrieving actual stack pages, freeing is done by vfree_deferred().

Signed-off-by: Konstantin Khlebnikov 
---
 include/linux/vmalloc.h |6 ++
 kernel/fork.c   |   26 ++
 mm/vmalloc.c|   33 -
 3 files changed, 48 insertions(+), 17 deletions(-)

diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h
index 2d92dd002abd..ce94ab55d1d6 100644
--- a/include/linux/vmalloc.h
+++ b/include/linux/vmalloc.h
@@ -130,6 +130,12 @@ extern struct vm_struct *__get_vm_area_caller(unsigned 
long size,
unsigned long flags,
unsigned long start, unsigned long end,
const void *caller);
+extern struct vm_struct *__alloc_vm_area(unsigned long size,
+   unsigned long align,
+   unsigned long start, unsigned long end,
+   gfp_t gfp_mask, pgprot_t prot,
+   unsigned long vm_flags, int node,
+   const void *caller);
 extern struct vm_struct *remove_vm_area(const void *addr);
 extern struct vm_struct *find_vm_area(const void *addr);
 
diff --git a/kernel/fork.c b/kernel/fork.c
index e702cb9ffbd8..c4ff0303b7c5 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -204,12 +204,10 @@ static int free_vm_stack_cache(unsigned int cpu)
 static unsigned long *alloc_thread_stack_node(struct task_struct *tsk, int 
node)
 {
 #ifdef CONFIG_VMAP_STACK
-   void *stack;
+   struct vm_struct *s;
int i;
 
for (i = 0; i < NR_CACHED_STACKS; i++) {
-   struct vm_struct *s;
-
s = this_cpu_xchg(cached_stacks[i], NULL);
 
if (!s)
@@ -219,20 +217,16 @@ static unsigned long *alloc_thread_stack_node(struct 
task_struct *tsk, int node)
return s->addr;
}
 
-   stack = __vmalloc_node_range(THREAD_SIZE, THREAD_ALIGN,
-VMALLOC_START, VMALLOC_END,
-THREADINFO_GFP,
-PAGE_KERNEL,
-0, node, __builtin_return_address(0));
+   s = __alloc_vm_area(THREAD_SIZE, THREAD_ALIGN,
+   VMALLOC_START, VMALLOC_END,
+   THREADINFO_GFP, PAGE_KERNEL,
+   0, node, __builtin_return_address(0));
+   if (unlikely(!s))
+   return NULL;
 
-   /*
-* We can't call find_vm_area() in interrupt context, and
-* free_thread_stack() can be called in interrupt context,
-* so cache the vm_struct.
-*/
-   if (stack)
-   tsk->stack_vm_area = find_vm_area(stack);
-   return stack;
+   /* Cache the vm_struct for stack to page conversions. */
+   tsk->stack_vm_area = s;
+   return s->addr;
 #else
struct page *page = alloc_pages_node(node, THREADINFO_GFP,
 THREAD_SIZE_ORDER);
diff --git a/mm/vmalloc.c b/mm/vmalloc.c
index 8a43db6284eb..456652c0660f 100644
--- a/mm/vmalloc.c
+++ b/mm/vmalloc.c
@@ -1750,6 +1750,37 @@ void *__vmalloc_node_range(unsigned long size, unsigned 
long align,
const void *caller)
 {
struct vm_struct *area;
+
+   area = __alloc_vm_area(size, align, start, end, gfp_mask,
+  prot, vm_flags, node, caller);
+
+   return area ? area->addr : NULL;
+}
+
+/**
+ * __alloc_vm_area  -  allocate virtually contiguous memory
+ * @size:  allocation size
+ * @align: desired alignment
+ * @start: vm area range start
+ * @end:   vm area range end
+ * @gfp_mask:  flags for the page level allocator
+ * @prot:  protection mask for the allocated pages
+ * @vm_flags:  additional vm area flags (e.g. %VM_NO_GUARD)
+ * @node:  node to use for allocation or NUMA_NO_NODE
+ * @caller:caller's return address
+ *
+ * Allocate enough pages to cover @size from the page level
+ * allocator with @gfp_mask flags.  Map them into contiguous
+ * kernel virtual space, using a pagetable protection of @prot
+ *
+ * Returns the area descriptor on success or %NULL on failure.
+ */
+struct vm_struct *__alloc_vm_area(unsigned long size, unsigned long align,
+   unsigned long start, unsigned long end, gfp_t gfp_mask,
+   pgprot_t prot, unsigned long vm_flags, int node,
+   const void *caller)
+{
+