Re: [PATCH] riscv: fix bugon.cocci warnings

2021-02-28 Thread Pekka Enberg
On Sun, Feb 28, 2021 at 12:10:22PM +0100, Julia Lawall wrote:
> From: kernel test robot 
> 
> Use BUG_ON instead of an if condition followed by BUG.
> 
> Generated by: scripts/coccinelle/misc/bugon.cocci
> 
> Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported")
> CC: Guo Ren 
> Reported-by: kernel test robot 
> Signed-off-by: kernel test robot 
> Signed-off-by: Julia Lawall 

Reviewed-by: Pekka Enberg 
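
For reference, the shape of the transformation bugon.cocci suggests is
(a minimal sketch; the condition is illustrative, not the actual hunk):

	/* Before: an if condition followed by BUG() */
	if (ret < 0)
		BUG();

	/* After: the equivalent, more compact form */
	BUG_ON(ret < 0);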


Re: [PATCH] riscv: Fixup CONFIG_GENERIC_TIME_VSYSCALL

2021-01-03 Thread Pekka Enberg
On Sat, Jan 2, 2021 at 3:26 PM  wrote:
>
> From: Guo Ren 
>
> This patch fixes commit ad5d112 ("riscv: use vDSO common flow to
> reduce the latency of the time-related functions").
> 
> GENERIC_TIME_VSYSCALL should be CONFIG_GENERIC_TIME_VSYSCALL, or
> vgettimeofday won't work.
>
> Signed-off-by: Guo Ren 
> Cc: Atish Patra 
> Cc: Palmer Dabbelt 
> Cc: Vincent Chen 

Reviewed-by: Pekka Enberg 
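
For context, this is the classic Kconfig-symbol pitfall: in C code only
the CONFIG_-prefixed macro is defined (via the generated autoconf.h), so
a bare symbol silently compiles the guarded code out. A hedged sketch of
the failure mode, not the actual hunk:

	#ifdef GENERIC_TIME_VSYSCALL		/* never defined: guarded code is dead */
	...
	#endif

	#ifdef CONFIG_GENERIC_TIME_VSYSCALL	/* what the preprocessor actually sees */
	...
	#endif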


Re: [PATCH] riscv: mm: abort uaccess retries upon fatal signal

2020-12-30 Thread Pekka Enberg
On Wed, Dec 30, 2020 at 4:40 PM  wrote:
> From: Guo Ren 
>
> Pick up the patch from the 'Link' below, made by Mark Rutland. This
> keeps riscv in line with x86, arm, arm64, arc, sh, and powerpc.
>
> Link: https://lore.kernel.org/linux-arm-kernel/1499782763-31418-1-git-send-email-mark.rutl...@arm.com/
> Signed-off-by: Guo Ren 
> Cc: Mark Rutland 
> Cc: Pekka Enberg 
> Cc: Palmer Dabbelt 

Reviewed-by: Pekka Enberg 


Re: [PATCH v2 0/2] Let illegal access to user-space memory die

2020-12-02 Thread Pekka Enberg
Hi Eric,

On Thu, Dec 3, 2020 at 8:51 AM Eric Lin  wrote:
>
> Accesses to user-space memory without calling the uaccess routines
> lead to a hang in the page fault handler. Like arm64, we let such
> accesses die earlier in the page fault handler.
>
> Changes in v2:
> -Add a die_kernel_fault() helper
> -Split one long line code into two

Please also make no_context() use the new helper. Other than that:

Reviewed-by: Pekka Enberg 


Re: [PATCH] riscv/mm: Prevent kernel module access user-space memory without uaccess routines

2020-11-30 Thread Pekka Enberg
On Mon, Nov 30, 2020 at 7:33 AM Eric Lin  wrote:
>
> In the page fault handler, an access to user-space memory
> without the get/put_user() or copy_from/to_user() routines is
> not resolved properly. Like arm and other architectures,
> we need to let it die earlier in the page fault handler.

Fix looks good to me. Can you elaborate on how you found the issue and
how the bug manifests itself?

>
> Signed-off-by: Eric Lin 
> Cc: Alan Kao 
> ---
>  arch/riscv/mm/fault.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 3c8b9e433c67..a452cfa266a2 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -232,6 +232,9 @@ asmlinkage void do_page_fault(struct pt_regs *regs)
> if (user_mode(regs))
> flags |= FAULT_FLAG_USER;
>
> +   if (!user_mode(regs) && addr < TASK_SIZE && unlikely(!(regs->status & SR_SUM)))
> +   die(regs, "Accessing user space memory without uaccess routines\n");

Let's introduce a die_kernel_fault() helper (similar to arm64, for
example) to ensure same semantics for the different kernel faults. You
can extract the helper from no_context().
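A sketch of what I mean, modeled on arm64's helper (illustrative only,
not a final implementation):

	/* Sketch modeled on arm64's die_kernel_fault(); illustrative only. */
	static void die_kernel_fault(const char *msg, unsigned long addr,
				     struct pt_regs *regs)
	{
		bust_spinlocks(1);
		pr_alert("Unable to handle kernel %s at virtual address 0x%lx\n",
			 msg, addr);
		bust_spinlocks(0);
		die(regs, "Oops");
		do_exit(SIGKILL);
	}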

> +
> perf_sw_event(PERF_COUNT_SW_PAGE_FAULTS, 1, regs, addr);
>
> if (cause == EXC_STORE_PAGE_FAULT)
> --
> 2.17.0
>
>
> ___
> linux-riscv mailing list
> linux-ri...@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv


Re: [PATCH v3] riscv: fix pfn_to_virt err in do_page_fault().

2020-10-26 Thread Pekka Enberg
On Mon, Oct 26, 2020 at 08:26:54PM +0800, liush wrote:
> From: Liu Shaohua 
> 
> The argument to pfn_to_virt() should be the pfn, not the whole value of CSR_SATP.
> 
> Reviewed-by: Palmer Dabbelt 
> Reviewed-by: Anup Patel 
> Signed-off-by: liush 

Reviewed-by: Pekka Enberg 

> ---
>  arch/riscv/mm/fault.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 1359e21..3c8b9e4 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -86,6 +86,7 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
>   pmd_t *pmd, *pmd_k;
>   pte_t *pte_k;
>   int index;
> + unsigned long pfn;
>  
>   /* User mode accesses just cause a SIGSEGV */
>   if (user_mode(regs))
> @@ -100,7 +101,8 @@ static inline void vmalloc_fault(struct pt_regs *regs, int code, unsigned long a
>* of a task switch.
>*/
>   index = pgd_index(addr);
> - pgd = (pgd_t *)pfn_to_virt(csr_read(CSR_SATP)) + index;
> + pfn = csr_read(CSR_SATP) & SATP_PPN;
> + pgd = (pgd_t *)pfn_to_virt(pfn) + index;
>   pgd_k = init_mm.pgd + index;
>  
>   if (!pgd_present(*pgd_k)) {
> -- 
> 2.7.4
> 


Re: [PATCH -next] riscv/mm/fault: fix old-style-declaration warning

2020-09-09 Thread Pekka Enberg
Hi,

On Wed, Sep 9, 2020 at 2:20 PM Wei Yongjun  wrote:
>
> gcc reports a build warning as follows:
>
> arch/riscv/mm/fault.c:81:1: warning:
>  'inline' is not at beginning of declaration [-Wold-style-declaration]
>81 | static void inline vmalloc_fault(struct pt_regs *regs, int code, unsigned long addr)
>   | ^~
>
> This commit fixes it by moving 'inline' after 'static'.
>
> Signed-off-by: Wei Yongjun 

Thanks for the fix, but Palmer committed similar fix from me to for-next:

https://git.kernel.org/pub/scm/linux/kernel/git/riscv/linux.git/commit/?h=for-next&id=6f36a9635082e1d6910bc7853d0c9fd12d7890b5

- Pekka


Re: [PATCH v4 0/3] Get cache information from userland

2020-08-31 Thread Pekka Enberg
For the series:

Reviewed-by: Pekka Enberg 

- Pekka


Re: [PATCH v3 1/3] riscv: Set more data to cacheinfo

2020-08-31 Thread Pekka Enberg
Hi,

On Mon, Aug 31, 2020 at 9:15 AM Zong Li  wrote:
> If sets is one, it means that the cache is fully associative, so we
> don't need to fill in the number of ways; we just keep the way number
> as zero. Here we want to detect the fully associative case first and
> make the if expression fail at the beginning. We might also rewrite
> it as follows:

[snip]
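
The rewrite itself was snipped above; from the description it presumably
has roughly this shape (a reconstruction, not the actual patch):

	if (sets == 1) {
		/* Fully associative: leave ways_of_associativity at zero. */
	} else if (sets > 0 && size > 0 && line_size > 0) {
		this_leaf->ways_of_associativity = (size / sets) / line_size;
	}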

Thanks for the explanation! The rewritten version is much easier to
read, at least to me.

- Pekka


Re: [PATCH v3 3/3] riscv: Add cache information in AUX vector

2020-08-30 Thread Pekka Enberg
On Fri, Aug 28, 2020 at 10:09 AM Zong Li  wrote:
> +uintptr_t get_cache_geometry(u32 level, enum cache_type type)
> +{
> +   struct cacheinfo *this_leaf = get_cacheinfo(level, type);
> +   uintptr_t ret = (this_leaf->ways_of_associativity << 16 |
> +this_leaf->coherency_line_size);

You are dereferencing "this_leaf" without checking if it's NULL here.

> +
> +   return this_leaf ? ret : 0;
> +}
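
Moving the check before the dereference avoids that, e.g. (an untested
sketch):

	uintptr_t get_cache_geometry(u32 level, enum cache_type type)
	{
		struct cacheinfo *this_leaf = get_cacheinfo(level, type);

		if (!this_leaf)
			return 0;

		return this_leaf->ways_of_associativity << 16 |
		       this_leaf->coherency_line_size;
	}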

- Pekka


Re: [PATCH v3 1/3] riscv: Set more data to cacheinfo

2020-08-30 Thread Pekka Enberg
Hi,

On Fri, Aug 28, 2020 at 10:09 AM Zong Li  wrote:
>
> Set cacheinfo.{size,sets,line_size} for each cache node, then we can
> get these information from userland through auxiliary vector.
>
> Signed-off-by: Zong Li 
> ---
>  arch/riscv/kernel/cacheinfo.c | 59 ++-
>  1 file changed, 44 insertions(+), 15 deletions(-)
>
> diff --git a/arch/riscv/kernel/cacheinfo.c b/arch/riscv/kernel/cacheinfo.c
> index bd0f122965c3..8b85abfbd77a 100644
> --- a/arch/riscv/kernel/cacheinfo.c
> +++ b/arch/riscv/kernel/cacheinfo.c
> @@ -25,12 +25,46 @@ cache_get_priv_group(struct cacheinfo *this_leaf)
> return NULL;
>  }
>
> -static void ci_leaf_init(struct cacheinfo *this_leaf,
> -struct device_node *node,
> -enum cache_type type, unsigned int level)
> +static void ci_leaf_init(struct cacheinfo *this_leaf, enum cache_type type,
> +unsigned int level, unsigned int size,
> +unsigned int sets, unsigned int line_size)
>  {
> this_leaf->level = level;
> this_leaf->type = type;
> +   this_leaf->size = size;
> +   this_leaf->number_of_sets = sets;
> +   this_leaf->coherency_line_size = line_size;
> +
> +   /*
> +* If the cache is fully associative, there is no need to
> +* check the other properties.
> +*/
> +   if (!(sets == 1) && (sets > 0 && size > 0 && line_size > 0))

Can you explain what this is attempting to do? AFAICT, the if
expression can be reduced to "sets > 1 && size > 0 && line_size > 0",
but what do you mean with the comment about fully associative caches?

> +   this_leaf->ways_of_associativity = (size / sets) / line_size;
> +}


Re: [PATCH v2 17/23] riscv: use asm-generic/mmu_context.h for no-op implementations

2020-08-26 Thread Pekka Enberg
On Wed, Aug 26, 2020 at 5:54 PM Nicholas Piggin  wrote:
>
> Cc: Paul Walmsley 
> Cc: Palmer Dabbelt 
> Cc: Albert Ou 
> Cc: linux-ri...@lists.infradead.org
> Acked-by: Palmer Dabbelt 
> Signed-off-by: Nicholas Piggin 

Reviewed-by: Pekka Enberg 


[RFC PATCH] x86/mm/fault: Inline page fault paths to reduce kernel text size

2020-08-25 Thread Pekka Enberg
From: Pekka Enberg 

The commit 92181f190b649f7ef2b79cbf5c00f26ccc66da2a ("x86: optimise
x86's do_page_fault (C entry point for the page fault path)") from 2009
shows significant stack savings when infrequent page fault handling
paths are moved out of line with the "noinline" annotation, with some increase
in kernel text size.

However, a decade of GCC improvements (and changes in the code) has
eliminated such wins in stack usage:

With "noinline":

0b30 :
 b30:   41 57   push   %r15
 b32:   41 56   push   %r14
 b34:   41 55   push   %r13
 b36:   49 89 d5mov%rdx,%r13
 b39:   41 54   push   %r12
 b3b:   49 89 fcmov%rdi,%r12
 b3e:   55  push   %rbp
 b3f:   48 89 f5mov%rsi,%rbp
 b42:   53  push   %rbx
 b43:   65 48 8b 04 25 00 00mov%gs:0x0,%rax
 b4a:   00 00
 b4c:   48 83 ec 18 sub$0x18,%rsp

With "inline":

08a0 :
 8a0:   41 57   push   %r15
 8a2:   41 56   push   %r14
 8a4:   41 55   push   %r13
 8a6:   49 89 d5mov%rdx,%r13
 8a9:   41 54   push   %r12
 8ab:   49 89 fcmov%rdi,%r12
 8ae:   55  push   %rbp
 8af:   48 89 f5mov%rsi,%rbp
 8b2:   53  push   %rbx
 8b3:   65 48 8b 04 25 00 00mov%gs:0x0,%rax
 8ba:   00 00
 8bc:   48 83 ec 18 sub$0x18,%rsp

So all we're left with is the increase in kernel text:

  add/remove: 1/5 grow/shrink: 2/1 up/down: 1049/-1144 (-95)
  Function                      old     new   delta
  spurious_kernel_fault.part      -     559    +559
  do_user_addr_fault            988    1446    +458
  do_kern_addr_fault            103     135     +32
  bad_area_nosemaphore           13       -     -13
  bad_area                       66       -     -66
  pgtable_bad                   107       -    -107
  mm_fault_error                199       -    -199
  bad_area_access_error         226       -    -226
  spurious_kernel_fault         553      20    -533
  Total: Before=7359, After=7264, chg -1.29%

Therefore, let's inline the page fault handling paths to reduce kernel
text size.

Cc: Dave Hansen 
Cc: Andy Lutomirski 
Cc: Peter Zijlstra 
Cc: Thomas Gleixner 
Cc: Ingo Molnar 
Cc: Borislav Petkov 
Cc: "H. Peter Anvin" 
Signed-off-by: Pekka Enberg 
---
 arch/x86/mm/fault.c | 14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
index 35f1498e9832..b8893684d80c 100644
--- a/arch/x86/mm/fault.c
+++ b/arch/x86/mm/fault.c
@@ -533,7 +533,7 @@ show_fault_oops(struct pt_regs *regs, unsigned long error_code, unsigned long address)
dump_pagetable(address);
 }
 
-static noinline void
+static inline void
 pgtable_bad(struct pt_regs *regs, unsigned long error_code,
unsigned long address)
 {
@@ -577,7 +577,7 @@ static void set_signal_archinfo(unsigned long address,
tsk->thread.cr2 = address;
 }
 
-static noinline void
+static inline void
 no_context(struct pt_regs *regs, unsigned long error_code,
   unsigned long address, int signal, int si_code)
 {
@@ -788,7 +788,7 @@ __bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
no_context(regs, error_code, address, SIGSEGV, si_code);
 }
 
-static noinline void
+static inline void
 bad_area_nosemaphore(struct pt_regs *regs, unsigned long error_code,
 unsigned long address)
 {
@@ -809,7 +809,7 @@ __bad_area(struct pt_regs *regs, unsigned long error_code,
__bad_area_nosemaphore(regs, error_code, address, pkey, si_code);
 }
 
-static noinline void
+static inline void
 bad_area(struct pt_regs *regs, unsigned long error_code, unsigned long address)
 {
__bad_area(regs, error_code, address, 0, SEGV_MAPERR);
@@ -832,7 +832,7 @@ static inline bool bad_area_access_from_pkeys(unsigned long error_code,
return false;
 }
 
-static noinline void
+static inline void
 bad_area_access_error(struct pt_regs *regs, unsigned long error_code,
  unsigned long address, struct vm_area_struct *vma)
 {
@@ -905,7 +905,7 @@ do_sigbus(struct pt_regs *regs, unsigned long error_code, unsigned long address,
force_sig_fault(SIGBUS, BUS_ADRERR, (void __user *)address);
 }
 
-static noinline void
+static inline void
 mm_fault_error(struct pt_regs *regs, unsigned long error_code,
   unsigned long address

Re: [PATCH v2 0/3] mm/slub: Fix count_partial() problem

2020-08-20 Thread Pekka Enberg
On Mon, Aug 10, 2020 at 3:18 PM Xunlei Pang  wrote:
>
> v1->v2:
> - Improved changelog and variable naming for PATCH 1~2.
> - PATCH3 adds per-cpu counter to avoid performance regression
>   in concurrent __slab_free().
>
> [Testing]
> On my 32-cpu 2-socket physical machine:
> Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
> perf stat --null --repeat 10 -- hackbench 20 thread 2
>
> == original, not patched
>   19.211637055 seconds time elapsed  ( +- 0.57% )
>
> == patched with patch1~2
>  Performance counter stats for 'hackbench 20 thread 2' (10 runs):
>
>   21.731833146 seconds time elapsed  ( +- 0.17% )
>
> == patched with patch1~3
>  Performance counter stats for 'hackbench 20 thread 2' (10 runs):
>
>   19.112106847 seconds time elapsed  ( +- 0.64% )
>
>
> Xunlei Pang (3):
>   mm/slub: Introduce two counters for partial objects
>   mm/slub: Get rid of count_partial()
>   mm/slub: Use percpu partial free counter
>
>  mm/slab.h |   2 +
>  mm/slub.c | 124 +++---
>  2 files changed, 89 insertions(+), 37 deletions(-)

We probably need to wrap the counters under CONFIG_SLUB_DEBUG because
AFAICT all the code that uses them is also wrapped under it.

An alternative approach for this patch would be to somehow make the
lock in count_partial() more granular, but I don't know how feasible
that actually is.

Anyway, I am OK with this approach:

Reviewed-by: Pekka Enberg 

You still need to convince Christoph, though, because he had
objections over this approach.

- Pekka


Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects

2020-08-20 Thread Pekka Enberg
Hi Christopher,

On Tue, Aug 11, 2020 at 3:52 PM Christopher Lameter  wrote:
>
> On Fri, 7 Aug 2020, Pekka Enberg wrote:
>
> > Why do you consider this to be a fast path? This is all partial list
> > accounting when we allocate/deallocate a slab, no? Just like
> > ___slab_alloc() says, I assumed this to be the slow path... What am I
> > missing?
>
> I thought these were per object counters? If you just want to count the
> number of slabs then you do not need the lock at all. We already have a
> counter for the number of slabs.

The patch attempts to speed up count_partial(), which holds on to the
"n->list_lock" (with IRQs off) for the whole duration it takes to walk
the partial slab list:

spin_lock_irqsave(>list_lock, flags);
list_for_each_entry(page, >partial, slab_list)
x += get_count(page);
spin_unlock_irqrestore(>list_lock, flags);

It's counting the number of *objects*, but the counters are only
updated in bulk when we add/remove a slab to/from the partial list.
The counter updates are therefore *not* in the fast-path AFAICT.

Xunlei, please correct me if I'm reading your patches wrong.

On Tue, Aug 11, 2020 at 3:52 PM Christopher Lameter  wrote:
> > No objections to alternative fixes, of course, but wrapping the
> > counters under CONFIG_DEBUG seems like just hiding the actual issue...
>
> CONFIG_DEBUG is on by default. It just compiles in the debug code and
> disables it so we can enable it with a kernel boot option. This is because
> we have had numerous issues in the past with "production" kernels that
> could not be recompiled with debug options. So just running the prod
> kernel with another option will allow you to find hard to debug issues in
> a full scale production deployment with potentially proprietary modules
> etc.

Yeah, it's been too long since I last looked at the code and did not
realize even count_partial() is wrapped in CONFIG_DEBUG. So by all
means, let's also wrap the counters with that too.

- Pekka


Re: [PATCH] mm: sort freelist by rank number

2020-08-19 Thread Pekka Enberg
Hi KyongHo and David,

On 07.08.20 09:08, Pekka Enberg wrote:
> > > I think having more knowledge of DRAM controller details in the OS
> > > would be potentially beneficial for better page allocation policy, so
> > > maybe try come up with something more generic, even if the fallback to
> > > providing this information is a kernel command line option.

On Tue, Aug 18, 2020 at 12:02 PM Cho KyongHo  wrote:
> I don't find if there is a way to deliver detailed DRAM information
> through ACPI, ATAG or something similar. But I didn't find.

Perhaps that's still the case today then.

In the context of this patch, what I am hoping is that we'd turn the
"dram_rank_granule" parameter -- even if it's still a manually
configured thing -- into an abstraction that we can later extend. IOW,
something similar to what the "energy model"
(kernel/power/energy_model.c) today is.

On Mon, Aug 10, 2020 at 09:32:18AM +0200, David Hildenbrand wrote:
> I guess one universal approach is by measuring access times ... not what
> we might be looking for :)

I don't think it's an unreasonable approach, but measurement accuracy
and speed will determine if it's actually practical. There are
examples of this kind of approach elsewhere too. For example, MCTOP
[1] which aims to provide topology-aware OS abstractions, uses cache
latency measurements to infer the topology.

1. https://timharris.uk/papers/2017-eurosys-mctop.pdf

Regards,

- Pekka


Re: [PATCH -next] x86/mpparse: remove duplicate include

2020-08-19 Thread Pekka Enberg
On Wed, Aug 19, 2020 at 07:29:10PM +0800, Wang Hai wrote:
> Remove asm/io_apic.h which is included more than once
> 
> Reported-by: Hulk Robot 
> Signed-off-by: Wang Hai 

Reviewed-by: Pekka Enberg 


Re: [PATCH -next] mm: slab: Remove duplicate include

2020-08-18 Thread Pekka Enberg
On Tue, Aug 18, 2020 at 2:43 PM YueHaibing  wrote:
>
> Remove duplicate header which is included twice.
>
> Signed-off-by: YueHaibing 

Reviewed-by: Pekka Enberg 


Re: [PATCH] mm/slub: fix missing ALLOC_SLOWPATH stat when bulk alloc

2020-08-11 Thread Pekka Enberg
On Tue, Aug 11, 2020 at 5:25 AM  wrote:
>
> From: Abel Wu 
>
> The ALLOC_SLOWPATH statistic is currently missing for bulk allocation.
> Fix it by doing the accounting in the allocation slow path.
>
> Signed-off-by: Abel Wu 

Reviewed-by: Pekka Enberg 


Re: Odd-sized kmem_cache_alloc and slub_debug=Z

2020-08-07 Thread Pekka Enberg
Hi Marco and Kees,

On Fri, Aug 07, 2020 at 08:06PM +0300, Pekka Enberg wrote:
> > Anything interesting in your .config? The fault does not reproduce
> > with 5.8.0 + x86-64 defconfig.

On Fri, Aug 7, 2020 at 8:18 PM Marco Elver  wrote:
> It's quite close to defconfig, just some extra options for my test
> environment. But none that I'd imagine change this behaviour -- but
> maybe I missed something. I've attached my config. Also, just in case,
> I'm on mainline from Tuesday: 2324d50d051ec0f14a548e78554fb02513d6dcef.

Yeah, it reproduces with defconfig too, as long as you remember to
pass "slub_debug=Z"... :-/

The following seems to be the culprit:

commit 3202fa62fb43087387c65bfa9c100feffac74aa6
Author: Kees Cook 
Date:   Wed Apr 1 21:04:27 2020 -0700

slub: relocate freelist pointer to middle of object

Reverting this commit and one of it's follow up fixes from Kees from
v5.8 makes the issue go away for me. Btw, please note that caches with
size 24 and larger do not trigger this bug, so the issue is that with
small enough object size, we're stomping on allocator metadata (I
assume part of the freelist).

- Pekka


Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects

2020-08-07 Thread Pekka Enberg
Hi Christopher,

On Fri, 7 Aug 2020, Pekka Enberg wrote:
> > I think we can just default to the counters. After all, if I
> > understood correctly, we're talking about up to 100 ms time period
> > with IRQs disabled when count_partial() is called. As this is
> > triggerable from user space, that's a performance bug whatever way you
> > look at it.

On Fri, Aug 7, 2020 at 4:02 PM Christopher Lameter  wrote:
> Well yes under extreme conditions and this is only happening for sysfs
> counter retrieval.

You will likely get some stall even in less extreme conditions, and in
any case, the kernel should not allow user space to trigger such a
stall.

On Fri, Aug 7, 2020 at 4:02 PM Christopher Lameter  wrote:
> There could be other solutions to this. This solution here is penalizing
> every hotpath slab allocation for the sake of relatively infrequently
> used counter monitoring. There is the possibility of not traversing the
> list and simply estimating the value based on the number of slab pages
> allocated on that node.

Why do you consider this to be a fast path? This is all partial list
accounting when we allocate/deallocate a slab, no? Just like
___slab_alloc() says, I assumed this to be the slow path... What am I
missing?

No objections to alternative fixes, of course, but wrapping the
counters under CONFIG_DEBUG seems like just hiding the actual issue...

- Pekka


Re: Odd-sized kmem_cache_alloc and slub_debug=Z

2020-08-07 Thread Pekka Enberg
Hi Marco,

On Fri, Aug 7, 2020 at 7:07 PM Marco Elver  wrote:
> I found that the below debug-code using kmem_cache_alloc(), when using
> slub_debug=Z, results in the following crash:
>
> general protection fault, probably for non-canonical address 0xcca41caea170:  [#1] PREEMPT SMP PTI
> CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.8.0+ #1
> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1 04/01/2014
> RIP: 0010:freelist_dereference mm/slub.c:272 [inline]
> RIP: 0010:get_freepointer mm/slub.c:278 [inline]
> RIP: 0010:deactivate_slab+0x54/0x460 mm/slub.c:2111
> Code: 8b bc c7 e0 00 00 00 48 85 d2 0f 84 00 01 00 00 49 89 d5 31 c0 48 89 44 24 08 66 66 2e 0f 1f 84 00 00 00 00 00 90 44 8b 43 20 <4b> 8b 44 05 00 48 85 c0 0f 84 1e 01 00 00 4c 89 ed 49 89 c5 8b 43
> RSP: :a7e03e18 EFLAGS: 00010046
> RAX:  RBX: a3a41c972340 RCX: 
> RDX: cca41caea160 RSI: e7c6a072ba80 RDI: a3a41c972340
> RBP: a3a41caea008 R08: 0010 R09: a3a41caea01d
> R10: a7f8dc50 R11: a68f44c0 R12: a3a41c972340
> R13: cca41caea160 R14: e7c6a072ba80 R15: a3a41c96d540
> FS:  () GS:a3a41fc0() knlGS:
> CS:  0010 DS:  ES:  CR0: 80050033
> CR2: a3a051c01000 CR3: 00045140a001 CR4: 00770ef0
> DR0:  DR1:  DR2: 
> DR3:  DR6: fffe0ff0 DR7: 0400
> PKRU: 
> Call Trace:
>  ___slab_alloc+0x336/0x340 mm/slub.c:2690
>  __slab_alloc mm/slub.c:2714 [inline]
>  slab_alloc_node mm/slub.c:2788 [inline]
>  slab_alloc mm/slub.c:2832 [inline]
>  kmem_cache_alloc+0x135/0x200 mm/slub.c:2837
>  start_kernel+0x3d6/0x44e init/main.c:1049
>  secondary_startup_64+0xb6/0xc0 arch/x86/kernel/head_64.S:243
>
> Any ideas what might be wrong?
>
> This does not crash when redzones are not enabled.
>
> Thanks,
> -- Marco
>
> -- >8 --
>
> diff --git a/init/main.c b/init/main.c
> index 15bd0efff3df..f4aa5bb3f2ec 100644
> --- a/init/main.c
> +++ b/init/main.c
> @@ -1041,6 +1041,16 @@ asmlinkage __visible void __init start_kernel(void)
> sfi_init_late();
> kcsan_init();
>
> +   /* DEBUG CODE */
> +   {
> +   struct kmem_cache *c = kmem_cache_create("test", 21, 1, 0, NULL);
> +   char *buf;
> +   BUG_ON(!c);
> +   buf = kmem_cache_alloc(c, GFP_KERNEL);
> +   kmem_cache_free(c, buf);
> +   kmem_cache_destroy(c);
> +   }
> +
> /* Do the rest non-__init'ed, we're now alive */
> arch_call_rest_init();

Anything interesting in your .config? The fault does not reproduce
with 5.8.0 + x86-64 defconfig.

- Pekka


Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects

2020-08-07 Thread Pekka Enberg
On Thu, Aug 6, 2020 at 3:42 PM Vlastimil Babka  wrote:
>
> On 7/2/20 10:32 AM, Xunlei Pang wrote:
> > The node list_lock in count_partial() is held for a long time
> > iterating in case of large amounts of partial page lists, which can
> > cause a thundering herd effect on the list_lock contention, e.g. it
> > causes business response-time jitters when accessing "/proc/slabinfo"
> > in our production environments.
> >
> > This patch introduces two counters to maintain the actual number
> > of partial objects dynamically instead of iterating the partial
> > page lists with list_lock held.
> >
> > New counters of kmem_cache_node are: pfree_objects, ptotal_objects.
> > The main operations are under list_lock in slow path, its performance
> > impact is minimal.
> >
> > Co-developed-by: Wen Yang 
> > Signed-off-by: Xunlei Pang 
>
> This or similar things seem to be reported every few months now, last time was
> here [1] AFAIK. The solution was to just stop counting at some point.
>
> Shall we perhaps add these counters under CONFIG_SLUB_DEBUG then and be done
> with it? If anyone needs the extreme performance and builds without
> CONFIG_SLUB_DEBUG, I'd assume they also don't have userspace programs reading
> /proc/slabinfo periodically anyway?

I think we can just default to the counters. After all, if I
understood correctly, we're talking about up to 100 ms time period
with IRQs disabled when count_partial() is called. As this is
triggerable from user space, that's a performance bug whatever way you
look at it.

Whoever needs to eliminate these counters from fast-path, can wrap
them in a CONFIG_MAKE_SLABINFO_EXTREMELY_SLOW option.

So for this patch, with updated information about the severity of the
problem, and the hackbench numbers:

Acked-by: Pekka Enberg 

Christoph, others, any objections?

- Pekka


Re: [PATCH] mm: sort freelist by rank number

2020-08-07 Thread Pekka Enberg
Hi Cho and David,

On Mon, Aug 3, 2020 at 10:57 AM David Hildenbrand  wrote:
>
> On 03.08.20 08:10, pullip@samsung.com wrote:
> > From: Cho KyongHo 
> >
> > LPDDR5 introduces rank switch delay. If three successive DRAM accesses
> > happens and the first and the second ones access one rank and the last
> > access happens on the other rank, the latency of the last access will
> > be longer than the second one.
> > To address this penalty, we can sort the freelist so that a specific
> > rank is allocated prior to another rank. We expect the page allocator
> > can allocate the pages from the same rank successively with this
> > change. It will hopefully improve the proportion of consecutive
> > memory accesses to the same rank.
>
> This certainly needs performance numbers to justify ... and I am sorry,
> "hopefully improves" is not a valid justification :)
>
> I can imagine that this works well initially, when there hasn't been a
> lot of memory fragmentation going on. But quickly after your system is
> under stress, I doubt this will be very useful. Proof me wrong. ;)
>
> ... I dislike this manual setting of "dram_rank_granule". Yet another mm
> feature that can only be enabled by a magic command line parameter where
> users have to guess the right values.
>
> (side note, there have been similar research approaches to improve
> energy consumption by switching off ranks when not needed).

I was thinking of the exact same thing. PALLOC [1] comes to mind, but
perhaps there are more recent ones?

I also dislike the manual knob, but is there a way for the OS to
detect this by itself? My (perhaps outdated) understanding was that
the DRAM address mapping scheme, for example, is not exposed to the
OS.

I think having more knowledge of DRAM controller details in the OS
would be potentially beneficial for better page allocation policy, so
maybe try come up with something more generic, even if the fallback to
providing this information is a kernel command line option.

[1] http://cs-people.bu.edu/rmancuso/files/papers/palloc-rtas2014.pdf

- Pekka


Re: [PATCH v2 1/2] riscv: Fix build warning for mm/init

2020-07-16 Thread Pekka Enberg
On Thu, Jul 16, 2020 at 4:06 PM Zong Li  wrote:
>
> Add static keyword for resource_init, this function is only used in this
> object file.
>
> The warning message as follow (with W=1 build):
>
> arch/riscv/mm/init.c:520:13:
> warning: no previous prototype for 'resource_init' [-Wmissing-prototypes]
>
> Signed-off-by: Zong Li 

Reviewed-by: Pekka Enberg 

- Pekka


Re: [PATCH 2/2] riscv: fix build warning of mm/pageattr

2020-07-16 Thread Pekka Enberg
On Thu, Jul 16, 2020 at 10:11 AM Pekka Enberg  wrote:
>
> On Thu, Jul 16, 2020 at 9:16 AM Zong Li  wrote:
> >
> > Add header for the missing prototype. Also, the static keyword should
> > be at the beginning of the declaration.
> >
> > Signed-off-by: Zong Li 
>
> Which prototype is missing?

Aah, never mind, you mention them in the cover letter. I think patch
description would be a better place to ensure they end up in git logs.

For both patches:

Reviewed-by: Pekka Enberg 

- Pekka


Re: [PATCH 2/2] riscv: fix build warning of mm/pageattr

2020-07-16 Thread Pekka Enberg
On Thu, Jul 16, 2020 at 9:16 AM Zong Li  wrote:
>
> Add header for the missing prototype. Also, the static keyword should
> be at the beginning of the declaration.
>
> Signed-off-by: Zong Li 

Which prototype is missing?

- Pekka


Re: [PATCH 17/26] mm/riscv: Use general page fault accounting

2020-07-14 Thread Pekka Enberg
Hi Palmer,

On Sat, Jul 11, 2020 at 10:43 PM Palmer Dabbelt  wrote:
> This still slightly changes the accounting numbers, but I don't think it does
> so in a way that's meaningful enough to care about.  SIGBUS is the only one
> that might happen frequently enough to notice, I doubt anyone cares about
> whether faults are accounted for during OOM.

What do you mean?

AFAICT, this patch _fixes_ the accounting to be correct on page fault
restarts. I sent a patch to fix this up some time ago, but of course
this generic solutions is better:

http://lists.infradead.org/pipermail/linux-riscv/2020-June/000775.html

- Pekka


Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects

2020-07-07 Thread Pekka Enberg
Hi!

(Sorry for the delay, I missed your response.)

On Fri, Jul 3, 2020 at 12:38 PM xunlei  wrote:
>
> On 2020/7/2 PM 7:59, Pekka Enberg wrote:
> > On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang  wrote:
> >> The node list_lock in count_partial() is held for a long time
> >> iterating in case of large amounts of partial page lists, which can
> >> cause a thundering herd effect on the list_lock contention, e.g. it
> >> causes business response-time jitters when accessing "/proc/slabinfo"
> >> in our production environments.
> >
> > Would you have any numbers to share to quantify this jitter? I have no
>
> We have HSF RT (High-speed Service Framework Response-Time) monitors,
> and the RT figures fluctuated randomly. We then deployed a tool that
> detects "irq off" and "preempt off" to dump the culprit's calltrace,
> and captured the list_lock costing up to 100ms with IRQs off, triggered
> by "ss"; this also caused network timeouts.

Thanks for the follow up. This sounds like a good enough motivation
for this patch, but please include it in the changelog.

> > objections to this approach, but I think the original design
> > deliberately made reading "/proc/slabinfo" more expensive to avoid
> > atomic operations in the allocation/deallocation paths. It would be
> > good to understand what is the gain of this approach before we switch
> > to it. Maybe even run some slab-related benchmark (not sure if there's
> > something better than hackbench these days) to see if the overhead of
> > this approach shows up.
>
> I thought that before, but most atomic operations are serialized by the
> list_lock. Another possible way is to hold list_lock in __slab_free(),
> then these two counters can be changed from atomic to long.
>
> I also have no idea what's the standard SLUB benchmark for the
> regression test, any specific suggestion?

I don't know what people use these days. When I did benchmarking in
the past, hackbench and netperf were known to be slab-allocation
intensive macro-benchmarks. Christoph also had some SLUB
micro-benchmarks, but I don't think we ever merged them into the tree.

- Pekka


Re: [PATCH v2] mm/page_alloc: skip setting nodemask when we are in interrupt

2020-07-06 Thread Pekka Enberg
On Mon, Jul 6, 2020 at 5:59 AM Muchun Song  wrote:
> When we are in interrupt context, the allocation is unrelated to the
> current task's context. If we use the current task's mems_allowed, we
> can fail to allocate pages in the fast path and fall back to the slow
> path when the current nodes (i.e. the current task's mems_allowed) do
> not have enough memory. In this case, it slows down memory allocation
> in interrupt context. So we can skip setting the nodemask to allow any
> node to be used, so that the fast path allocation can succeed.
>
> Signed-off-by: Muchun Song 

Reviewed-by: Pekka Enberg 


Re: [PATCH V1 0/5] riscv: Add k/uprobe supported

2020-07-04 Thread Pekka Enberg
Hi Guo,

On Sat, Jul 4, 2020 at 6:34 AM  wrote:
> > > There is no single step exception in riscv ISA, so utilize ebreak to
> > > simulate. Some pc related instructions couldn't be executed out of line
> > > and some system/fence instructions couldn't be a trace site at all.
> > > So we give out a reject list and simulate list in decode-insn.c.

On Sat, Jul 4, 2020 at 2:40 PM Pekka Enberg  wrote:
> > Can you elaborate on what you mean by this? Why would you need a
> > single-step facility for kprobes? Is it for executing the instruction
> > that was replaced with a probe breakpoint?

On Sat, Jul 4, 2020 at 5:55 PM Guo Ren  wrote:
> It's the single-step exception, not single-step facility!

Aah, right, I didn't read the specification carefully enough. Thanks
for taking the time to clarify this!

FWIW, for the whole series:

Reviewed-by: Pekka Enberg 

- Pekka


Re: [PATCH V1 0/5] riscv: Add k/uprobe supported

2020-07-04 Thread Pekka Enberg
On Sat, Jul 4, 2020 at 6:34 AM  wrote:
> The patchset includes kprobe/uprobe support and some related fixups.

Nice!

On Sat, Jul 4, 2020 at 6:34 AM  wrote:
> There is no single step exception in riscv ISA, so utilize ebreak to
> simulate. Some pc related instructions couldn't be executed out of line
> and some system/fence instructions couldn't be a trace site at all.
> So we give out a reject list and simulate list in decode-insn.c.

Can you elaborate on what you mean by this? Why would you need a
single-step facility for kprobes? Is it for executing the instruction
that was replaced with a probe breakpoint?

Also, the "Debug Specification" [1] specifies a single-step facility
for RISC-V -- why is that not useful for implementing kprobes?

1. https://riscv.org/specifications/debug-specification/

- Pekka


Re: [PATCH RESEND] mm/page_alloc: skip setting nodemask when we are in interrupt

2020-07-03 Thread Pekka Enberg
On 03.07.20 08:34, Pekka Enberg wrote:
> > if (cpusets_enabled()) {
> > *alloc_mask |= __GFP_HARDWALL;
> > if (!in_interrupt() && !ac->nodemask)
> > ac->nodemask = _current_mems_allowed;
> > else
> > *alloc_flags |= ALLOC_CPUSET;
> > }

On Fri, Jul 3, 2020 at 10:20 AM David Hildenbrand  wrote:
> ^ looks much cleaner as well. Do we want to add a summarizing comment?

I see no harm in adding one. I'm sure the next person starting a
journey in the maze some call the page allocator will appreciate it.

- Pekka


Re: [PATCH RESEND] mm/page_alloc: skip setting nodemask when we are in interrupt

2020-07-03 Thread Pekka Enberg
On Fri, Jul 3, 2020 at 9:14 AM Muchun Song  wrote:
>
> When we are in interrupt context, the allocation is unrelated to the
> current task's context. If we use the current task's mems_allowed, we
> can fail to allocate pages in the fast path and fall back to the slow
> path when the current nodes (i.e. the current task's mems_allowed) do
> not have enough memory. In this case, it slows down memory allocation
> in interrupt context. So we can skip setting the nodemask to allow any
> node to be used, so that the fast path allocation can succeed.
>
> Signed-off-by: Muchun Song 
> ---
>  mm/page_alloc.c | 8 +---
>  1 file changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index b48336e20bdcd..a6c36cd557d1d 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -4726,10 +4726,12 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
>
> if (cpusets_enabled()) {
> *alloc_mask |= __GFP_HARDWALL;
> -   if (!ac->nodemask)
> -   ac->nodemask = _current_mems_allowed;
> -   else
> +   if (!ac->nodemask) {
> +   if (!in_interrupt())
> +   ac->nodemask = _current_mems_allowed;

If !ac->nodemask and in_interrupt(), the ALLOC_CPUSET flag is not set,
which bypasses the __cpuset_zone_allowed() check for allocations.
This works fine because, in the in_interrupt() case, the function
allows allocation on any zone/node.

> +   } else {
> *alloc_flags |= ALLOC_CPUSET;
> +   }
> }

However, if you write the condition as follows:

if (cpusets_enabled()) {
*alloc_mask |= __GFP_HARDWALL;
if (!in_interrupt() && !ac->nodemask)
ac->nodemask = _current_mems_allowed;
else
*alloc_flags |= ALLOC_CPUSET;
}

then the code is future-proof in case of __cpuset_zone_allowed() is
one day extended to support IRQ context too (it probably should
eventually respect IRQ SMP affinity).

>
> fs_reclaim_acquire(gfp_mask);
> --
> 2.11.0
>
>


Re: [PATCH 1/2] mm/slub: Introduce two counters for the partial objects

2020-07-02 Thread Pekka Enberg
On Thu, Jul 2, 2020 at 11:32 AM Xunlei Pang  wrote:
> The node list_lock in count_partial() is held for a long time
> iterating in case of large amounts of partial page lists, which can
> cause a thundering herd effect on the list_lock contention, e.g. it
> causes business response-time jitters when accessing "/proc/slabinfo"
> in our production environments.

Would you have any numbers to share to quantify this jitter? I have no
objections to this approach, but I think the original design
deliberately made reading "/proc/slabinfo" more expensive to avoid
atomic operations in the allocation/deallocation paths. It would be
good to understand what is the gain of this approach before we switch
to it. Maybe even run some slab-related benchmark (not sure if there's
something better than hackbench these days) to see if the overhead of
this approach shows up.

> This patch introduces two counters to maintain the actual number
> of partial objects dynamically instead of iterating the partial
> page lists with list_lock held.
>
> New counters of kmem_cache_node are: pfree_objects, ptotal_objects.
> The main operations are under list_lock in slow path, its performance
> impact is minimal.
>
> Co-developed-by: Wen Yang 
> Signed-off-by: Xunlei Pang 
> ---
>  mm/slab.h |  2 ++
>  mm/slub.c | 38 +-
>  2 files changed, 39 insertions(+), 1 deletion(-)
>
> diff --git a/mm/slab.h b/mm/slab.h
> index 7e94700..5935749 100644
> --- a/mm/slab.h
> +++ b/mm/slab.h
> @@ -616,6 +616,8 @@ struct kmem_cache_node {
>  #ifdef CONFIG_SLUB
> unsigned long nr_partial;
> struct list_head partial;
> +   atomic_long_t pfree_objects; /* partial free objects */
> +   atomic_long_t ptotal_objects; /* partial total objects */

You could rename these to "nr_partial_free_objs" and
"nr_partial_total_objs" for readability.

- Pekka


Re: [PATCH v3] mm, slab: Check GFP_SLAB_BUG_MASK before alloc_pages in kmalloc_order

2020-07-01 Thread Pekka Enberg
On Wed, Jul 1, 2020 at 6:19 PM Long Li  wrote:
>
> kmalloc cannot allocate memory from HIGHMEM.  Allocating large amounts
> of memory currently bypasses the check and will simply leak the memory
> when page_address() returns NULL.  To fix this, factor the
> GFP_SLAB_BUG_MASK check out of slab & slub, and call it from
> kmalloc_order() as well. In order to make the code clear, the warning
> message is put in one place.
>
> Signed-off-by: Long Li 

Reviewed-by: Pekka Enberg 
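
The factored-out check described in the changelog presumably looks
something like this (a sketch; the helper name and message are
illustrative, not necessarily what the patch adds):

	static gfp_t kmalloc_fix_flags(gfp_t flags)
	{
		gfp_t invalid_mask = flags & GFP_SLAB_BUG_MASK;

		/* Strip the invalid bits and warn about the bogus mask. */
		flags &= ~GFP_SLAB_BUG_MASK;
		pr_warn("Unexpected gfp: %#x (%pGg). Fixing up to gfp: %#x (%pGg)\n",
			invalid_mask, &invalid_mask, flags, &flags);

		return flags;
	}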


Re: [PATCH 17/26] mm/riscv: Use general page fault accounting

2020-07-01 Thread Pekka Enberg
Hi Peter,

On Sat, Jun 27, 2020 at 1:36 AM Peter Xu  wrote:
> Use the general page fault accounting by passing regs into handle_mm_fault().
> It naturally solve the issue of multiple page fault accounting when page fault
> retry happened.

I sent a patch to fix up riscv page fault accounting some days ago:

http://lists.infradead.org/pipermail/linux-riscv/2020-June/000775.html

However, your fix is obviously even better. For the generic and riscv parts:

Reviewed-by: Pekka Enberg 

- Pekka


Re: [PATCH 0/8] mm: cleanup usage of <asm/pgalloc.h>

2020-06-29 Thread Pekka Enberg
On Sat, Jun 27, 2020 at 5:35 PM Mike Rapoport  wrote:
> Most architectures have very similar versions of pXd_alloc_one() and
> pXd_free_one() for intermediate levels of page table.
> These patches add generic versions of these functions in
> <asm-generic/pgalloc.h> and enable use of the generic functions where
> appropriate.

Very nice cleanup series to the page table code!

FWIW:

Reviewed-by: Pekka Enberg 


Re: [PATCH] mm: Fix kobject memleak in SLUB

2019-04-28 Thread Pekka Enberg




On 28/04/2019 2.40, Tobin C. Harding wrote:

Currently, an error return from kobject_init_and_add() is not followed by a
call to kobject_put().  This means there is a memory leak.

Add call to kobject_put() in error path of kobject_init_and_add().

Signed-off-by: Tobin C. Harding 


Reviewed-by: Pekka Enberg 


Re: [PATCH 0/1] mm: Remove the SLAB allocator

2019-04-11 Thread Pekka Enberg

Hi,

On 4/11/19 10:55 AM, Michal Hocko wrote:

Please please have it more rigorous then what happened when SLUB was
forced to become a default


This is the hard part.

Even if you are able to show that SLUB is as fast as SLAB for all the 
benchmarks you run, there's bound to be that one workload where SLUB 
regresses. You will then have people complaining about that (rightly so) 
and you're again stuck with two allocators.


To move forward, I think we should look at possible *pathological* cases 
where we think SLAB might have an advantage. For example, SLUB had much 
more difficulties with remote CPU frees than SLAB. Now I don't know if 
this is the case, but it should be easy to construct a synthetic 
benchmark to measure this.


For example, have a userspace process that does networking, which is 
often memory allocation intensive, so that we know that SKBs traverse 
between CPUs. You can do this by making sure that the NIC queues are 
mapped to CPU N (so that network softirqs have to run on that CPU) but 
the process is pinned to CPU M.


It's, of course, worth thinking about other pathological cases too. 
Workloads that cause large allocations is one. Workloads that cause lots 
of slab cache shrinking is another.


- Pekka


Re: [PATCH v4] kmemleak: survive in a low-memory situation

2019-03-28 Thread Pekka Enberg

Hi Catalin,

On 27/03/2019 2.59, Qian Cai wrote:

Unless there is a brave soul to reimplement kmemleak to embed its
metadata into the tracked memory itself in the foreseeable future, this
provides a good balance between enabling kmemleak in a low-memory
situation and not introducing too much hackiness into the existing
code for now.


On Thu, Mar 28, 2019 at 08:05:31AM +0200, Pekka Enberg wrote:

Unfortunately I am not that brave soul, but I'm wondering what the
complication here is? It shouldn't be too hard to teach calculate_sizes() in
SLUB about a new SLAB_KMEMLEAK flag that reserves spaces for the metadata.


On 28/03/2019 12.30, Catalin Marinas wrote:

I don't think it's the calculate_sizes() that's the hard part. The way
kmemleak is designed assumes that the metadata has a longer lifespan
than the slab object it is tracking (and refcounted via
get_object/put_object()). We'd have to replace some of the
rcu_read_(un)lock() regions with a full kmemleak_lock together with a
few more tweaks to allow the release of kmemleak_lock during memory
scanning (which can take minutes; so it needs to be safe w.r.t. metadata
freeing, currently relying on a deferred RCU freeing).


Right.

I think SLUB already supports delaying object freeing because of KASAN 
(see the slab_free_freelist_hook() function) so the issue with metadata 
outliving object is solvable (although will consume more memory).


I can't say I remember enough details from kmemleak to comment on the 
locking complications you point out, though.


- Pekka


Re: [PATCH v4] kmemleak: survive in a low-memory situation

2019-03-28 Thread Pekka Enberg

Hi,

On 27/03/2019 2.59, Qian Cai wrote:

Unless there is a brave soul to reimplement kmemleak to embed its
metadata into the tracked memory itself in the foreseeable future, this
provides a good balance between enabling kmemleak in a low-memory
situation and not introducing too much hackiness into the existing
code for now.


Unfortunately I am not that brave soul, but I'm wondering what the 
complication here is? It shouldn't be too hard to teach 
calculate_sizes() in SLUB about a new SLAB_KMEMLEAK flag that reserves 
spaces for the metadata.


- Pekka


Re: [PATCH] slub: untag object before slab end

2019-02-12 Thread Pekka Enberg




On 13/02/2019 4.05, Qian Cai wrote:

get_freepointer() could return NULL if there are no more free objects in
the slab. However, it could return a tagged pointer (like
0x2200) with KASAN_SW_TAGS, which would escape the NULL
object check in check_valid_pointer() and trigger the errors below, so
untag the object before checking for a NULL object there.


Reviewed-by: Pekka Enberg 


Re: [PATCH] mm/slab: Increase width of first /proc/slabinfo column

2019-02-01 Thread Pekka Enberg

Hi,

On 01/02/2019 4.34, Christopher Lameter wrote:

On Fri, 1 Feb 2019, Tobin C. Harding wrote:


Currently, when displaying /proc/slabinfo, if any cache names are too long
then the output columns are not aligned.  We could do something fancy to
get the maximum length of any cache name in the system or we could just
increase the hardcoded width.  Currently it is 17 characters.  Monitors
are wide these days so let's just increase it to 30 characters.


Hmm.. I wonder if there are any tools that depend on the field width here?



It's possible, but it's more likely that userspace parses by whitespace 
because it's easier to write it that way.


At least procps, which is used by slabtop, is prepared to parse a cache 
name of 128 characters. See the scanf() call in parse_slabinfo20() 
function in proc/slab.c of procps:


  http://procps.sourceforge.net/

Of course, testing with slabtop would make sense.

- Pekka


Re: [PATCH 0/3] slub: Do trivial comments fixes

2019-01-30 Thread Pekka Enberg

On 31/01/2019 6.10, Tobin C. Harding wrote:

From: "Tobin C. Harding" 

Hi Christopher,

Here is a trivial patchset to wet my toes. This is my first patchset to
mm; if there are some mm-specific nuances in relation to when in the dev
cycle (if ever) that minor (*cough* trivial) patchsets are acceptable,
please say so.

This patchset fixes comments strings in the SLUB subsystem.

As per discussion at LCA I am working on getting my head around the SLUB
allocator.  If you specifically do *not* want me to do minor clean up
while I'm reading please say so, I will not be offended.


For the series:

Reviewed-by: Pekka Enberg 


Re: [PATCH] mm: Prevent mapping slab pages to userspace

2019-01-30 Thread Pekka Enberg

On 25/01/2019 19.38, Matthew Wilcox wrote:

It's never appropriate to map a page allocated by SLAB into userspace.
A buggy device driver might try this, or an attacker might be able to
find a way to make it happen.

Signed-off-by: Matthew Wilcox 


Acked-by: Pekka Enberg 

A WARN_ON_ONCE() would be nice here to let those buggy drivers know that 
they will no longer work.
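
E.g. something like (an untested sketch):

	if (WARN_ON_ONCE(PageAnon(page) || PageSlab(page)))
		goto out;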



---
  mm/memory.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/memory.c b/mm/memory.c
index e11ca9dd823f..ce8c90b752be 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1451,7 +1451,7 @@ static int insert_page(struct vm_area_struct *vma, unsigned long addr,
spinlock_t *ptl;
  
  	retval = -EINVAL;

-   if (PageAnon(page))
+   if (PageAnon(page) || PageSlab(page))
goto out;
retval = -ENOMEM;
flush_dcache_page(page);



Re: [PATCH] mm/slub.c: freelist is ensured to be NULL when new_slab() fails

2018-12-29 Thread Pekka Enberg

On 29/12/2018 8.25, Peng Wang wrote:

new_slab_objects() will return immediately if freelist is not NULL.

  if (freelist)
  return freelist;

One more assignment operation could be avoided.

Signed-off-by: Peng Wang 


Reviewed-by: Pekka Enberg 


---
  mm/slub.c | 3 +--
  1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/mm/slub.c b/mm/slub.c
index 36c0befeebd8..cf2ef4ababff 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2463,8 +2463,7 @@ static inline void *new_slab_objects(struct kmem_cache *s, gfp_t flags,
stat(s, ALLOC_SLAB);
c->page = page;
*pc = c;
-   } else
-   freelist = NULL;
+   }
  
  	return freelist;

  }



Re: [PATCH] mm, slab: reschedule cache_reap() on the same CPU

2018-04-11 Thread Pekka Enberg



On 11/04/2018 10.00, Vlastimil Babka wrote:

cache_reap() is initially scheduled in start_cpu_timer() via
schedule_delayed_work_on(). But then the next iterations are scheduled via
schedule_delayed_work(), i.e. using WORK_CPU_UNBOUND.

Thus since commit ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on
wq_unbound_cpumask CPUs") there is no guarantee the future iterations will run
on the originally intended cpu, although it's still preferred. I was able to
demonstrate this with /sys/module/workqueue/parameters/debug_force_rr_cpu.
IIUC, it may also happen due to migrating timers in nohz context. As a result,
> some cpus would be calling cache_reap() more frequently and others never.

This patch uses schedule_delayed_work_on() with the current cpu when scheduling
the next iteration.

Signed-off-by: Vlastimil Babka 
Fixes: ef557180447f ("workqueue: schedule WORK_CPU_UNBOUND work on 
wq_unbound_cpumask CPUs")
CC: 
Cc: Joonsoo Kim 
Cc: David Rientjes 
Cc: Pekka Enberg 
Cc: Christoph Lameter 
Cc: Tejun Heo 
Cc: Lai Jiangshan 
Cc: John Stultz 
Cc: Thomas Gleixner 
Cc: Stephen Boyd 


Acked-by: Pekka Enberg 


---
  mm/slab.c | 3 ++-
  1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/slab.c b/mm/slab.c
index 9095c3945425..a76006aae857 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4074,7 +4074,8 @@ static void cache_reap(struct work_struct *w)
next_reap_node();
  out:
/* Set up the next iteration */
-   schedule_delayed_work(work, round_jiffies_relative(REAPTIMEOUT_AC));
+   schedule_delayed_work_on(smp_processor_id(), work,
+   round_jiffies_relative(REAPTIMEOUT_AC));
  }
  
  void get_slabinfo(struct kmem_cache *cachep, struct slabinfo *sinfo)




Re: [PATCH 0/4] zswap: Optimize compressed pool memory utilization

2016-08-17 Thread Pekka Enberg
On Wed, Aug 17, 2016 at 1:03 PM, Srividya Desireddy
 wrote:
> This series of patches optimize the memory utilized by zswap for storing
> the swapped out pages.
>
> Zswap is a cache which compresses the pages that are being swapped out
> and stores them into a dynamically allocated RAM-based memory pool.
> Experiments have shown that around 10-15% of pages stored in zswap are
> duplicates which results in 10-12% more RAM required to store these
> duplicate compressed pages. Around 10-20% of pages stored in zswap
> are zero-filled pages, but these pages are handled as normal pages by
> compressing and allocating memory in the pool.
>
> The following patch-set optimizes memory utilized by zswap by avoiding the
> storage of duplicate pages and zero-filled pages in zswap compressed memory
> pool.
>
> Patch 1/4: zswap: Share zpool memory of duplicate pages
> This patch shares compressed pool memory of the duplicate pages. When a new
> page is requested for swap-out to zswap; search for an identical page in
> the pages already stored in zswap. If an identical page is found then share
> the compressed page data of the identical page with the new page. This
> avoids allocation of memory in the compressed pool for a duplicate page.
> This feature is tested on devices with 1GB, 2GB and 3GB RAM by executing
> performance test at low memory conditions. Around 15-20% of the pages
> swapped are duplicate of the pages existing in zswap, resulting in 15%
> saving of zswap memory pool when compared to the baseline version.
>
> Test Parameters         Baseline    With patch    Improvement
> Total RAM               955MB       955MB
> Available RAM           254MB       269MB         15MB
> Avg. App entry time     2.469sec    2.207sec      7%
> Avg. App close time     1.151sec    1.085sec      6%
> Apps launched in 1sec   5           12            7
>
> There is little overhead in the zswap store function due to the search
> operation for finding duplicate pages. However, if a duplicate page is
> found, it saves the compression and allocation time of the page. The
> average overhead per zswap_frontswap_store() function call in the
> experimental device is 9us. There is no overhead in the
> zswap_frontswap_load() operation.
>
> Patch 2/4: zswap: Enable/disable sharing of duplicate pages at runtime
> This patch adds a module parameter to enable or disable the sharing of
> duplicate zswap pages at runtime.
>
> Patch 3/4: zswap: Zero-filled pages handling
> This patch checks if a page to be stored in zswap is a zero-filled page
> (i.e. contents of the page are all zeros). If such page is found,
> compression and allocation of memory for the compressed page is avoided
> and instead the page is just marked as zero-filled page.
> Although the compressed size of a zero-filled page using the LZO
> compressor is very small (52 bytes including zswap_header), this patch
> saves the compression and allocation time during the store operation and
> the decompression time during the zswap load operation for zero-filled
> pages. Experiments have shown that around 10-20% of pages stored in
> zswap are zero-filled.

Aren't zero-filled pages already handled by patch 1/4 as their
contents match? So the overall memory saving is 52 bytes?

- Pekka


Re: [PATCH 3/4] zswap: Zero-filled pages handling

2016-08-17 Thread Pekka Enberg
On Wed, Aug 17, 2016 at 1:18 PM, Srividya Desireddy
<srividya...@samsung.com> wrote:
>> This patch adds a check in zswap_frontswap_store() to identify zero-filled
>> page before compression of the page. If the page is a zero-filled page, set
>> zswap_entry.zeroflag and skip the compression of the page and allocation
>> of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
>> set for the page in zswap_entry. If the flag is set, memset the page with
>> zero. This saves the decompression time during load.
>>
>> The overall overhead caused by the zero-filled page check is minimal
>> when compared to the time saved by avoiding compression and allocation in
>> case of zero-filled pages. The load time of a zero-filled page is reduced
>> by 80% when compared to baseline.

On Wed, Aug 17, 2016 at 3:25 PM, Pekka Enberg <penb...@kernel.org> wrote:
> AFAICT, that's an overall improvement only if there are a lot of
> zero-filled pages because it's just overhead for pages that we *need*
> to compress, no? So I suppose the question is, are there a lot of
> zero-filled pages that we need to swap and why is that the case?

I suppose reading your cover letter would have been helpful before
sending out my email:

"Experiments have shown that around 10-15% of pages stored in zswap are
duplicates which results in 10-12% more RAM required to store these
duplicate compressed pages."

But I still don't understand why we have zero-filled pages that we are
swapping out.

- Pekka


Re: [PATCH 3/4] zswap: Zero-filled pages handling

2016-08-17 Thread Pekka Enberg
On Wed, Aug 17, 2016 at 1:18 PM, Srividya Desireddy
 wrote:
> This patch adds a check in zswap_frontswap_store() to identify zero-filled
> page before compression of the page. If the page is a zero-filled page, set
> zswap_entry.zeroflag and skip the compression of the page and allocation
> of memory in zpool. In zswap_frontswap_load(), check if the zeroflag is
> set for the page in zswap_entry. If the flag is set, memset the page with
> zero. This saves the decompression time during load.
>
> The overall overhead caused by the zero-filled page check is minimal
> when compared to the time saved by avoiding compression and allocation in
> case of zero-filled pages. The load time of a zero-filled page is reduced
> by 80% when compared to baseline.

AFAICT, that's an overall improvement only if there are a lot of
zero-filled pages because it's just overhead for pages that we *need*
to compress, no? So I suppose the question is, are there a lot of
zero-filled pages that we need to swap and why is that the case?

> @@ -1314,6 +1347,13 @@ static int zswap_frontswap_load(unsigned type, pgoff_t 
> offset,
> }
> spin_unlock(&tree->lock);
>
> +   if (entry->zeroflag) {
> +   dst = kmap_atomic(page);
> +   memset(dst, 0, PAGE_SIZE);
> +   kunmap_atomic(dst);
> +   goto freeentry;
> +   }

Don't we need the same thing in zswap_writeback_entry() for the
ZSWAP_SWAPCACHE_NEW case?

> +
> /* decompress */
> dlen = PAGE_SIZE;
> src = (u8 *)zpool_map_handle(entry->pool->zpool, 
> entry->zhandle->handle,
> @@ -1327,6 +1367,7 @@ static int zswap_frontswap_load(unsigned type, pgoff_t 
> offset,
> zpool_unmap_handle(entry->pool->zpool, entry->zhandle->handle);
> BUG_ON(ret);

- Pekka
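
For readers following along, the zero-fill test under review is easy to
sketch (assumed shape, not the code from the patch): scan the page one
word at a time before deciding whether to compress it.

static bool page_is_zero_filled(void *ptr)
{
        unsigned long *page = ptr;
        int pos;

        /* Any non-zero word disqualifies the page. */
        for (pos = 0; pos < PAGE_SIZE / sizeof(*page); pos++) {
                if (page[pos])
                        return false;
        }
        return true;
}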


Re: [PATCH, REGRESSION v3] mm: make apply_to_page_range more robust

2016-01-21 Thread Pekka Enberg

On 01/22/2016 01:12 AM, David Rientjes wrote:
NACK to your patch as it is just covering up buggy code silently. The 
problem needs to be addressed in change_memory_common() to return if 
there is no size to change (numpages == 0). It's a two line fix to 
that function. 


So add a WARN_ON there to *warn* about the situation. There's really no 
need to BUG_ON here.


- Pekka
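
To make that concrete, the direction argued for above looks roughly like
this (a sketch assuming the arm64-style change_memory_common() signature,
not the fix as merged):

static int change_memory_common(unsigned long addr, int numpages,
                                pgprot_t set_mask, pgprot_t clear_mask)
{
        /* A zero-sized change is a caller bug: warn once and return
         * instead of letting apply_to_page_range() trip a BUG_ON(). */
        if (WARN_ON_ONCE(!numpages))
                return 0;

        /* ... validate the range and call apply_to_page_range() ... */
        return 0;
}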


Re: [PATCH] mm: slab: convert slab_is_available to boolean

2015-09-15 Thread Pekka Enberg

On 9/15/15 8:50 PM, Denis Kirjanov wrote:

A good candidate to return a boolean result

Signed-off-by: Denis Kirjanov <k...@linux-powerpc.org>


Reviewed-by: Pekka Enberg <penb...@kernel.org>
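
For context, the conversion being reviewed amounts to the following
(a minimal sketch, assuming the usual slab_state check from
mm/slab_common.c):

bool slab_is_available(void)
{
        return slab_state >= UP;
}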


[PATCH 04/14] cxgb4: Use kvfree() in t4_free_mem()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Hariprasad S 
Signed-off-by: Pekka Enberg 
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c 
b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
index 803d91b..9143917 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
@@ -1121,10 +1121,7 @@ void *t4_alloc_mem(size_t size)
  */
 void t4_free_mem(void *addr)
 {
-   if (is_vmalloc_addr(addr))
-   vfree(addr);
-   else
-   kfree(addr);
+   kvfree(addr);
 }
 
 /* Send a Work Request to write the filter at a specified index.  We construct
-- 
2.1.0
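
For reference, kvfree() in mm/util.c encapsulates exactly the open-coded
pattern this series removes, so the conversions across all fourteen
patches are behaviour-preserving:

void kvfree(const void *addr)
{
        if (is_vmalloc_addr(addr))
                vfree(addr);
        else
                kfree(addr);
}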



[PATCH 07/14] drm/nouveau/gem: Use kvfree() in u_free()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: David Airlie 
Signed-off-by: Pekka Enberg 
---
 drivers/gpu/drm/nouveau/nouveau_gem.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/nouveau/nouveau_gem.c 
b/drivers/gpu/drm/nouveau/nouveau_gem.c
index 0e690bf..af1ee51 100644
--- a/drivers/gpu/drm/nouveau/nouveau_gem.c
+++ b/drivers/gpu/drm/nouveau/nouveau_gem.c
@@ -555,10 +555,7 @@ nouveau_gem_pushbuf_validate(struct nouveau_channel *chan,
 static inline void
 u_free(void *addr)
 {
-   if (!is_vmalloc_addr(addr))
-   kfree(addr);
-   else
-   vfree(addr);
+   kvfree(addr);
 }
 
 static inline void *
-- 
2.1.0



[PATCH 03/14] cxgb3: Use kvfree() in cxgb_free_mem()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Santosh Raspatur 
Signed-off-by: Pekka Enberg 
---
 drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c 
b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
index b0cbb2b..76684dc 100644
--- a/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
+++ b/drivers/net/ethernet/chelsio/cxgb3/cxgb3_offload.c
@@ -1169,10 +1169,7 @@ void *cxgb_alloc_mem(unsigned long size)
  */
 void cxgb_free_mem(void *addr)
 {
-   if (is_vmalloc_addr(addr))
-   vfree(addr);
-   else
-   kfree(addr);
+   kvfree(addr);
 }
 
 /*
-- 
2.1.0



[PATCH 09/14] IB/ehca: Use kvfree() in ipz_queue_{cd}tor()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Hoang-Nam Nguyen 
Cc: Christoph Raisch 
Signed-off-by: Pekka Enberg 
---
 drivers/infiniband/hw/ehca/ipz_pt_fn.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/infiniband/hw/ehca/ipz_pt_fn.c 
b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
index 8d59451..7ffc748 100644
--- a/drivers/infiniband/hw/ehca/ipz_pt_fn.c
+++ b/drivers/infiniband/hw/ehca/ipz_pt_fn.c
@@ -245,10 +245,7 @@ int ipz_queue_ctor(struct ehca_pd *pd, struct ipz_queue 
*queue,
 ipz_queue_ctor_exit0:
ehca_gen_err("Couldn't alloc pages queue=%p "
 "nr_of_pages=%x",  queue, nr_of_pages);
-   if (is_vmalloc_addr(queue->queue_pages))
-   vfree(queue->queue_pages);
-   else
-   kfree(queue->queue_pages);
+   kvfree(queue->queue_pages);
 
return 0;
 }
@@ -270,10 +267,7 @@ int ipz_queue_dtor(struct ehca_pd *pd, struct ipz_queue 
*queue)
free_page((unsigned long)queue->queue_pages[i]);
}
 
-   if (is_vmalloc_addr(queue->queue_pages))
-   vfree(queue->queue_pages);
-   else
-   kfree(queue->queue_pages);
+   kvfree(queue->queue_pages);
 
return 1;
 }
-- 
2.1.0



[PATCH 08/14] drivers/input/evdev.c: Use kvfree() in evdev_release()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Dmitry Torokhov 
Signed-off-by: Pekka Enberg 
---
 drivers/input/evdev.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/input/evdev.c b/drivers/input/evdev.c
index a18f41b..9d35499 100644
--- a/drivers/input/evdev.c
+++ b/drivers/input/evdev.c
@@ -422,10 +422,7 @@ static int evdev_release(struct inode *inode, struct file 
*file)
 
evdev_detach_client(evdev, client);
 
-   if (is_vmalloc_addr(client))
-   vfree(client);
-   else
-   kfree(client);
+   kvfree(client);
 
evdev_close_device(evdev);
 
-- 
2.1.0



[PATCH 01/14] ipc/util.c: Use kvfree() in ipc_rcu_free()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg 
---
 ipc/util.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/ipc/util.c b/ipc/util.c
index ff3323e..537a41c 100644
--- a/ipc/util.c
+++ b/ipc/util.c
@@ -467,10 +467,7 @@ void ipc_rcu_free(struct rcu_head *head)
 {
struct ipc_rcu *p = container_of(head, struct ipc_rcu, rcu);
 
-   if (is_vmalloc_addr(p))
-   vfree(p);
-   else
-   kfree(p);
+   kvfree(p);
 }
 
 /**
-- 
2.1.0



[PATCH 10/14] target: Use kvfree() in session alloc and free

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: "Nicholas A. Bellinger" 
Signed-off-by: Pekka Enberg 
---
 drivers/target/target_core_transport.c | 10 ++
 1 file changed, 2 insertions(+), 8 deletions(-)

diff --git a/drivers/target/target_core_transport.c 
b/drivers/target/target_core_transport.c
index 3fe5cb2..9c058c0 100644
--- a/drivers/target/target_core_transport.c
+++ b/drivers/target/target_core_transport.c
@@ -280,10 +280,7 @@ int transport_alloc_session_tags(struct se_session 
*se_sess,
if (rc < 0) {
pr_err("Unable to init se_sess->sess_tag_pool,"
" tag_num: %u\n", tag_num);
-   if (is_vmalloc_addr(se_sess->sess_cmd_map))
-   vfree(se_sess->sess_cmd_map);
-   else
-   kfree(se_sess->sess_cmd_map);
+   kvfree(se_sess->sess_cmd_map);
se_sess->sess_cmd_map = NULL;
return -ENOMEM;
}
@@ -490,10 +487,7 @@ void transport_free_session(struct se_session *se_sess)
 {
if (se_sess->sess_cmd_map) {
	percpu_ida_destroy(&se_sess->sess_tag_pool);
-   if (is_vmalloc_addr(se_sess->sess_cmd_map))
-   vfree(se_sess->sess_cmd_map);
-   else
-   kfree(se_sess->sess_cmd_map);
+   kvfree(se_sess->sess_cmd_map);
}
kmem_cache_free(se_sess_cache, se_sess);
 }
-- 
2.1.0



[PATCH 12/14] dm: Use kvfree() in dm_kvfree()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Alasdair Kergon 
Cc: Mike Snitzer 
Signed-off-by: Pekka Enberg 
---
 drivers/md/dm-stats.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/md/dm-stats.c b/drivers/md/dm-stats.c
index f478a4c..492fe6a 100644
--- a/drivers/md/dm-stats.c
+++ b/drivers/md/dm-stats.c
@@ -160,10 +160,7 @@ static void dm_kvfree(void *ptr, size_t alloc_size)
 
free_shared_memory(alloc_size);
 
-   if (is_vmalloc_addr(ptr))
-   vfree(ptr);
-   else
-   kfree(ptr);
+   kvfree(ptr);
 }
 
 static void dm_stat_free(struct rcu_head *head)
-- 
2.1.0



[PATCH 11/14] libcxgbi: Use kvfree() in cxgbi_free_big_mem()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: "James E.J. Bottomley" 
Signed-off-by: Pekka Enberg 
---
 drivers/scsi/cxgbi/libcxgbi.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/drivers/scsi/cxgbi/libcxgbi.h b/drivers/scsi/cxgbi/libcxgbi.h
index aba1af7..c2eb7ea 100644
--- a/drivers/scsi/cxgbi/libcxgbi.h
+++ b/drivers/scsi/cxgbi/libcxgbi.h
@@ -685,10 +685,7 @@ static inline void *cxgbi_alloc_big_mem(unsigned int size,
 
 static inline void cxgbi_free_big_mem(void *addr)
 {
-   if (is_vmalloc_addr(addr))
-   vfree(addr);
-   else
-   kfree(addr);
+   kvfree(addr);
 }
 
 static inline void cxgbi_set_iscsi_ipv4(struct cxgbi_hba *chba, __be32 ipaddr)
-- 
2.1.0



[PATCH 13/14] bcache: Use kvfree() in various places

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Kent Overstreet 
Signed-off-by: Pekka Enberg 
---
 drivers/md/bcache/super.c | 10 ++
 drivers/md/bcache/util.h  | 10 ++
 2 files changed, 4 insertions(+), 16 deletions(-)

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 4dd2bb7..94980bf 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -760,14 +760,8 @@ static void bcache_device_free(struct bcache_device *d)
	bio_split_pool_free(&d->bio_split_hook);
if (d->bio_split)
bioset_free(d->bio_split);
-   if (is_vmalloc_addr(d->full_dirty_stripes))
-   vfree(d->full_dirty_stripes);
-   else
-   kfree(d->full_dirty_stripes);
-   if (is_vmalloc_addr(d->stripe_sectors_dirty))
-   vfree(d->stripe_sectors_dirty);
-   else
-   kfree(d->stripe_sectors_dirty);
+   kvfree(d->full_dirty_stripes);
+   kvfree(d->stripe_sectors_dirty);
 
	closure_debug_destroy(&d->cl);
 }
diff --git a/drivers/md/bcache/util.h b/drivers/md/bcache/util.h
index 98df757..1d04c48 100644
--- a/drivers/md/bcache/util.h
+++ b/drivers/md/bcache/util.h
@@ -52,10 +52,7 @@ struct closure;
 
 #define free_heap(heap)
\
 do {   \
-   if (is_vmalloc_addr((heap)->data))  \
-   vfree((heap)->data);\
-   else\
-   kfree((heap)->data);\
+   kvfree((heap)->data);   \
(heap)->data = NULL;\
 } while (0)
 
@@ -163,10 +160,7 @@ do {   
\
 
 #define free_fifo(fifo)
\
 do {   \
-   if (is_vmalloc_addr((fifo)->data))  \
-   vfree((fifo)->data);\
-   else\
-   kfree((fifo)->data);\
+   kvfree((fifo)->data);   \
(fifo)->data = NULL;\
 } while (0)
 
-- 
2.1.0



[PATCH 14/14] NTFS: Use kvfree() in ntfs_free()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: Anton Altaparmakov 
Signed-off-by: Pekka Enberg 
---
 fs/ntfs/malloc.h | 7 +--
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/fs/ntfs/malloc.h b/fs/ntfs/malloc.h
index a44b14c..ab172e5 100644
--- a/fs/ntfs/malloc.h
+++ b/fs/ntfs/malloc.h
@@ -85,12 +85,7 @@ static inline void *ntfs_malloc_nofs_nofail(unsigned long 
size)
 
 static inline void ntfs_free(void *addr)
 {
-   if (!is_vmalloc_addr(addr)) {
-   kfree(addr);
-   /* free_page((unsigned long)addr); */
-   return;
-   }
-   vfree(addr);
+   kvfree(addr);
 }
 
 #endif /* _LINUX_NTFS_MALLOC_H */
-- 
2.1.0



[PATCH 06/14] drm: Use kvfree() in drm_free_large()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Cc: David Airlie 
Signed-off-by: Pekka Enberg 
---
 include/drm/drm_mem_util.h | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/include/drm/drm_mem_util.h b/include/drm/drm_mem_util.h
index 19a2404..e42495a 100644
--- a/include/drm/drm_mem_util.h
+++ b/include/drm/drm_mem_util.h
@@ -56,10 +56,7 @@ static __inline__ void *drm_malloc_ab(size_t nmemb, size_t 
size)
 
 static __inline void drm_free_large(void *ptr)
 {
-   if (!is_vmalloc_addr(ptr))
-   return kfree(ptr);
-
-   vfree(ptr);
+   kvfree(ptr);
 }
 
 #endif
-- 
2.1.0



[PATCH 02/14] kernel/relay.c: Use kvfree() in relay_free_page_array()

2015-05-15 Thread Pekka Enberg
Use kvfree() instead of open-coding it.

Signed-off-by: Pekka Enberg 
---
 kernel/relay.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/kernel/relay.c b/kernel/relay.c
index e9dbaeb..0b4570c 100644
--- a/kernel/relay.c
+++ b/kernel/relay.c
@@ -81,10 +81,7 @@ static struct page **relay_alloc_page_array(unsigned int 
n_pages)
  */
 static void relay_free_page_array(struct page **array)
 {
-   if (is_vmalloc_addr(array))
-   vfree(array);
-   else
-   kfree(array);
+   kvfree(array);
 }
 
 /**
-- 
2.1.0



[PATCH 05/14] ceph: Use kvfree() in ceph_put_page_vector()

2015-05-15 Thread Pekka Enberg
Use kvfree instead of open-coding it.

Cc: "Yan, Zheng" 
Cc: Sage Weil 
Signed-off-by: Pekka Enberg 
---
 net/ceph/pagevec.c | 5 +
 1 file changed, 1 insertion(+), 4 deletions(-)

diff --git a/net/ceph/pagevec.c b/net/ceph/pagevec.c
index 096d914..d4f5f22 100644
--- a/net/ceph/pagevec.c
+++ b/net/ceph/pagevec.c
@@ -51,10 +51,7 @@ void ceph_put_page_vector(struct page **pages, int 
num_pages, bool dirty)
set_page_dirty_lock(pages[i]);
put_page(pages[i]);
}
-   if (is_vmalloc_addr(pages))
-   vfree(pages);
-   else
-   kfree(pages);
+   kvfree(pages);
 }
 EXPORT_SYMBOL(ceph_put_page_vector);
 
-- 
2.1.0



Re: [PATCH v2] perf kmem: Show warning when trying to run stat without record

2015-05-05 Thread Pekka Enberg

On 05/05/2015 05:07 PM, Arnaldo Carvalho de Melo wrote:

Em Tue, May 05, 2015 at 09:58:12AM +0900, Namhyung Kim escreveu:

Sometimes one can mistakenly run perf kmem stat without perf kmem
record before, or with a different configuration, like recording --slab and stat
--page.  Show a warning message like below to inform user:

   # perf kmem stat --page --caller
   Not found page events.  Have you run 'perf kmem record --page' before?

Acked-by: Pekka Enberg 
Signed-off-by: Namhyung Kim 

Thanks, applied.

I just found the messages a bit odd sounding, perhaps:

# perf kmem stat --page --caller
No page allocation events found.  Have you run 'perf kmem record --page'?

Pekka?


Sure, that sounds less confusing.

- Pekka


Re: [PATCH 0/13] Parallel struct page initialisation v4

2015-04-28 Thread Pekka Enberg
On Tue, Apr 28, 2015 at 5:36 PM, Mel Gorman  wrote:
> Struct page initialisation had been identified as one of the reasons why
> large machines take a long time to boot. Patches were posted a long time ago
> to defer initialisation until they were first used.  This was rejected on
> the grounds it should not be necessary to hurt the fast paths. This series
> reuses much of the work from that time but defers the initialisation of
> memory to kswapd so that one thread per node initialises memory local to
> that node.
>
> After applying the series and setting the appropriate Kconfig variable I
> see this in the boot log on a 64G machine
>
> [7.383764] kswapd 0 initialised deferred memory in 188ms
> [7.404253] kswapd 1 initialised deferred memory in 208ms
> [7.411044] kswapd 3 initialised deferred memory in 216ms
> [7.411551] kswapd 2 initialised deferred memory in 216ms
>
> On a 1TB machine, I see
>
> [8.406511] kswapd 3 initialised deferred memory in 1116ms
> [8.428518] kswapd 1 initialised deferred memory in 1140ms
> [8.435977] kswapd 0 initialised deferred memory in 1148ms
> [8.437416] kswapd 2 initialised deferred memory in 1148ms
>
> Once booted the machine appears to work as normal. Boot times were measured
> from the time shutdown was called until ssh was available again.  In the
> 64G case, the boot time savings are negligible. On the 1TB machine, the
> savings were 16 seconds.

FWIW,

Acked-by: Pekka Enberg 

for the whole series.

- Pekka
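
Conceptually, the deferred scheme has each node's kswapd walk its own
deferred range and run the usual one-page initialisation before entering
the normal reclaim loop. A sketch with hypothetical names
(node_deferred_start_pfn, init_single_deferred_page), not code from the
series:

static void deferred_init_memmap_node(int nid)
{
        unsigned long pfn;

        /* Only struct pages past the early-initialised head of the
         * node were deferred at boot; initialise the rest locally. */
        for (pfn = node_deferred_start_pfn(nid); pfn < node_end_pfn(nid); pfn++)
                init_single_deferred_page(pfn, nid);
}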


Re: [PATCHSET 0/6] perf kmem: Implement page allocation analysis (v7)

2015-04-13 Thread Pekka Enberg
On Tue, Apr 14, 2015 at 5:52 AM, Namhyung Kim  wrote:
> Currently perf kmem command only analyzes SLAB memory allocation.  And
> I'd like to introduce page allocation analysis also.  Users can use
>  --slab and/or --page option to select it.  If none of these options
>  are used, it does slab allocation analysis for backward compatibility.

Nice addition!

Acked-by: Pekka Enberg 

for the whole series.

- Pekka


Re: [PATCH v6 2/4] perf tools: add Java demangling support

2015-03-31 Thread Pekka Enberg
Hi Stephane,

On Tue, Mar 31, 2015 at 1:19 AM, Stephane Eranian  wrote:
> +#define BASE_ENT(c, n) [c-'A']=n
> +static const char *base_types['Z'-'A' + 1]={
> +   BASE_ENT('B', "byte" ),
> +   BASE_ENT('C', "char" ),
> +   BASE_ENT('D', "double" ),
> +   BASE_ENT('F', "float" ),
> +   BASE_ENT('I', "int" ),
> +   BASE_ENT('J', "long" ),
> +   BASE_ENT('S', "short" ),
> +   BASE_ENT('Z', "bool" ),

It's "boolean", not "bool" in JVM speak.

> +static char *
> +__demangle_java_sym(const char *str, const char *end, char *buf, int maxlen, 
> int mode)
> +{
> +   int rlen = 0;
> +   int array = 0;
> +   int narg = 0;
> +   const char *q;
> +
> +   if (!end)
> +   end = str + strlen(str);
> +
> +   for (q = str; q != end; q++) {
> +
> +   if (rlen == (maxlen - 1))
> +   break;
> +
> +   switch (*q) {
> +   case 'L':
> +   if (mode == MODE_PREFIX || mode == MODE_CTYPE) {
> +   if (mode == MODE_CTYPE) {
> +   if (narg)
> +   rlen += scnprintf(buf + rlen, 
> maxlen - rlen, ", ");
> +   narg++;
> +   }
> +   rlen += scnprintf(buf + rlen, maxlen - rlen, 
> "class ");
> +   if (mode == MODE_PREFIX)
> +   mode = MODE_CLASS;
> +   } else
> +   buf[rlen++] = *q;
> +   break;

This looks odd to me. "L" marks the beginning of a class name and
it's terminated by ";". You could just strchr() the terminator and
simply copy the name to "buf" and drop cases ';', '/', and the default
label from the switch statement.

> +char *
> +java_demangle_sym(const char *str, int flags)
> +{

[snip]

> +   /*
> +* expansion factor estimated to 3x
> +*/
> +   len = strlen(str) * 3 + 1;
> +   buf = malloc(len);
> +   if (!buf)
> +   return NULL;

Truncated symbols are lame. Can't you use realloc() to ensure that
never happens?

- Pekka
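
A quick sketch of the strchr() suggestion above, as a standalone helper
(hypothetical shape, reusing the quoted patch's buffer convention):

#include <stdio.h>
#include <string.h>

/* Copy an "L<classname>;" descriptor's name in one go. Returns a
 * pointer to the ';' terminator so the caller can resume scanning,
 * or NULL if the descriptor is unterminated. */
static const char *copy_class_name(const char *q, char *buf,
                                   size_t rlen, size_t maxlen)
{
        const char *sep = strchr(q + 1, ';');   /* q points at 'L' */

        if (!sep)
                return NULL;
        snprintf(buf + rlen, maxlen - rlen, "%.*s",
                 (int)(sep - q - 1), q + 1);
        return sep;
}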


Re: stand-alone kvmtool

2015-03-01 Thread Pekka Enberg

On 2/26/15 1:02 PM, Alex Bennée wrote:

If you can have it all it would be nice to preserve buildability all
through your history for bisecting (and the moon on a stick please ;-)

Is the dependency on the kernel sources something that has been stable
over the projects history or something that's been declining/increasing
over time?


AFAICT, it's been mostly stable after the initial sweep to use 
kernel-provided data structures such as lists and rbtrees.


- Pekka


Re: stand-alone kvmtool

2015-02-23 Thread Pekka Enberg

Hi,

On 2/18/15 5:50 PM, Will Deacon wrote:

Thanks for doing this. Since it looks unlikely that kvmtool will ever be
merged back into the kernel tree, it makes sense to cut the dependency
in my opinion.


I am certainly OK with a standalone repository which preserves the 
history. Will, would you like to take over the proposed new repository 
and put it somewhere on git.kernel.org, perhaps?


- Pekka


Re: [PATCH v3 1/5] slab: Correct size_index table before replacing the bootstrap kmem_cache_node.

2015-02-05 Thread Pekka Enberg


On 02/04/2015 11:06 PM, Daniel Sanders wrote:

This patch moves the initialization of the size_index table slightly
earlier so that the first few kmem_cache_node's can be safely allocated
when KMALLOC_MIN_SIZE is large.

There are currently two ways to generate indices into kmalloc_caches
(via kmalloc_index() and via the size_index table in slab_common.c)
and on some arches (possibly only MIPS) they potentially disagree with
each other until create_kmalloc_caches() has been called. It seems
that the intention is that the size_index table is a fast equivalent
to kmalloc_index() and that create_kmalloc_caches() patches the table
to return the correct value for the cases where kmalloc_index()'s
if-statements apply.

The failing sequence was:
* kmalloc_caches contains NULL elements
* kmem_cache_init initialises the element that 'struct
   kmem_cache_node' will be allocated to. For 32-bit Mips, this is a
   56-byte struct and kmalloc_index returns KMALLOC_SHIFT_LOW (7).
* init_list is called which calls kmalloc_node to allocate a 'struct
   kmem_cache_node'.
* kmalloc_slab selects the kmem_caches element using
   size_index[size_index_elem(size)]. For MIPS, size is 56, and the
   expression returns 6.
* This element of kmalloc_caches is NULL and allocation fails.
* If it had not already failed, it would have called
   create_kmalloc_caches() at this point which would have changed
   size_index[size_index_elem(size)] to 7.

Signed-off-by: Daniel Sanders 
Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Cc: Andrew Morton 
Cc: linux...@kvack.org
Cc: linux-kernel@vger.kernel.org


Acked-by: Pekka Enberg 
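
For readers without the source at hand, the table-based index path can be
sketched as follows (simplified from mm/slab_common.c of that era):

static inline int size_index_elem(size_t bytes)
{
        return (bytes - 1) / 8;
}

/* size_index_elem(56) == 6 for the 56-byte 'struct kmem_cache_node',
 * while kmalloc_index(56) returns KMALLOC_SHIFT_LOW (7) once
 * KMALLOC_MIN_SIZE is large; hence the NULL kmalloc_caches[6] lookup
 * described above, until create_kmalloc_caches() patches the table. */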
  1   2   3   4   5   6   7   8   9   10   >