Re: [PATCH] mm: don't invoke __alloc_pages_direct_compact when order 0

2012-07-07 Thread JoonSoo Kim
And in almost all invoking cases, order is 0, so it returns immediately. You can't be sure of that. Okay. Let's not invoke it when order is 0. Let's not ruin git blame. Hmm... When I do git blame, I can't find anything related to this. I mean if we merge the pointless patch, it could be

Re: [PATCH] mm: don't invoke __alloc_pages_direct_compact when order 0

2012-07-07 Thread JoonSoo Kim
2012/7/7 David Rientjes rient...@google.com: On Sat, 7 Jul 2012, Joonsoo Kim wrote: __alloc_pages_direct_compact has many arguments, so invoking it is very costly. And in almost all invoking cases, order is 0, so it returns immediately. If zero cost is very costly, then this might make sense

Re: WARNING: __GFP_FS allocations with IRQs disabled (kmemcheck_alloc_shadow)

2012-07-08 Thread JoonSoo Kim
2012/7/8 Fengguang Wu fengguang...@intel.com: Hi Vegard, This warning code is triggered for the attached config: __lockdep_trace_alloc(): /* * Oi! Can't be having __GFP_FS allocations with IRQs disabled. */ if

Re: [PATCH 3/3] slub: release a lock if freeing object with a lock is failed in __slab_free()

2012-07-08 Thread JoonSoo Kim
2012/7/7 Christoph Lameter c...@linux.com: On Fri, 6 Jul 2012, JoonSoo Kim wrote: At CPU2, we don't need the lock anymore, because this slab is already in the partial list. For that scenario we could also simply do a trylock there and redo the loop if we fail. But still what guarantees

Re: WARNING: __GFP_FS allocations with IRQs disabled (kmemcheck_alloc_shadow)

2012-07-09 Thread JoonSoo Kim
2012/7/9 David Rientjes rient...@google.com: On Mon, 9 Jul 2012, JoonSoo Kim wrote: diff --git a/mm/slub.c b/mm/slub.c index 8c691fa..5d41cad 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1324,8 +1324,14 @@ static struct page *allocate_slab(struct kmem_cache *s, gfp_t flags, int node

Re: [PATCH] mm: don't invoke __alloc_pages_direct_compact when order 0

2012-07-09 Thread JoonSoo Kim
2012/7/9 David Rientjes rient...@google.com: On Sun, 8 Jul 2012, JoonSoo Kim wrote: __alloc_pages_direct_compact has many arguments, so invoking it is very costly. And in almost all invoking cases, order is 0, so it returns immediately. If zero cost is very costly, then this might make sense

Re: [PATCH] mm: don't invoke __alloc_pages_direct_compact when order 0

2012-07-10 Thread JoonSoo Kim
2012/7/10 Mel Gorman mgor...@suse.de: You say that invoking the function is very costly. I agree that a function call with that many parameters is hefty but it is also in the slow path of the allocator. For order-0 allocations we are about to enter direct reclaim where I would expect the cost
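The thread above debates whether guarding this call is worth anything. The underlying idea can be sketched as a plain early-return guard: compaction can only create higher-order contiguity, so an order-0 request gains nothing from it. The function and names below are invented for illustration; the real kernel function takes far more parameters and sits in mm/page_alloc.c's slow path.

```c
#include <stddef.h>

/* Hypothetical stand-in for the compaction path discussed in the thread:
 * an order-0 request returns immediately because memory compaction only
 * helps allocations that need physically contiguous order >= 1 blocks. */
static void *direct_compact(unsigned int order, int *compacted)
{
    *compacted = 0;
    if (order == 0)
        return NULL;        /* nothing compaction can do for a single page */

    *compacted = 1;         /* pretend compaction ran and produced a block */
    return (void *)0x1000;  /* dummy non-NULL "page" for the sketch */
}
```

Mel's counterpoint is that this guard is nearly free either way, since the caller is already on the allocator's slow path.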

Re: [Patch 4/7] softirq: Use hotplug thread infrastructure

2012-07-25 Thread JoonSoo Kim
2012/7/16 Thomas Gleixner t...@linutronix.de: - static const struct sched_param param = { - .sched_priority = MAX_RT_PRIO-1 - }; - - p = per_cpu(ksoftirqd, hotcpu); - per_cpu(ksoftirqd, hotcpu) = NULL; -

Re: [Patch 0/7] Per cpu thread hotplug infrastructure - V3

2012-07-25 Thread JoonSoo Kim
2012/7/16 Thomas Gleixner t...@linutronix.de: The following series implements the infrastructure for parking and unparking kernel threads to avoid the full teardown and fork on cpu hotplug operations along with management infrastructure for hotplug and users. Changes vs. V2: Use callbacks

[RFC PATCH 0/8] remove vm_struct list management

2012-12-06 Thread Joonsoo Kim
static_vm for ARM-specific static mapped area' https://lkml.org/lkml/2012/11/27/356 But it runs properly on x86 without the ARM patchset. Joonsoo Kim (8): mm, vmalloc: change iterating a vmlist to find_vm_area() mm, vmalloc: move get_vmalloc_info() to vmalloc.c mm, vmalloc: protect va->vm

[RFC PATCH 2/8] mm, vmalloc: move get_vmalloc_info() to vmalloc.c

2012-12-06 Thread Joonsoo Kim
. So move the code to vmalloc.c Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/fs/proc/Makefile b/fs/proc/Makefile index 99349ef..88092c1 100644 --- a/fs/proc/Makefile +++ b/fs/proc/Makefile @@ -5,7 +5,7 @@ obj-y += proc.o proc-y := nommu.o task_nommu.o -proc

[RFC PATCH 5/8] mm, vmalloc: iterate vmap_area_list in get_vmalloc_info()

2012-12-06 Thread Joonsoo Kim
. For example, vm_map_ram() allocates an area in the vmalloc address space, but it doesn't make a link with vmlist. Providing full information about the vmalloc address space is the better idea, so we don't use va->vm and use vmap_area directly. This makes get_vmalloc_info() more precise. Signed-off-by: Joonsoo Kim js1

[RFC PATCH 7/8] mm, vmalloc: makes vmlist only for kexec

2012-12-06 Thread Joonsoo Kim
is sufficient. So use vmlist_early for full chain of vm_struct and assign a dummy_vm to vmlist for supporting kexec. Cc: Eric Biederman ebied...@xmission.com Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/vmalloc.c b/mm/vmalloc.c index f134950..8a1b959 100644 --- a/mm/vmalloc.c +++ b/mm

[RFC PATCH 8/8] mm, vmalloc: remove list management operation after initializing vmalloc

2012-12-06 Thread Joonsoo Kim
Now, there is no need to maintain vmlist_early after initializing vmalloc. So remove related code and data structure. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 698b1e5..10d19c9 100644 --- a/include/linux/vmalloc.h +++ b

[RFC PATCH 6/8] mm, vmalloc: iterate vmap_area_list, instead of vmlist, in vmallocinfo()

2012-12-06 Thread Joonsoo Kim
. So we need smp_[rw]mb for ensuring that proper values are assigned when we see that VM_UNLIST is removed. Therefore, this patch not only changes the iteration list, but also adds appropriate smp_[rw]mb in the right places. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/vmalloc.c b/mm

[RFC PATCH 4/8] mm, vmalloc: iterate vmap_area_list, instead of vmlist in vread/vwrite()

2012-12-06 Thread Joonsoo Kim
to lock, because vmlist_lock is a mutex, but vmap_area_lock is a spinlock. It may introduce a spinning overhead while vread/vwrite() is executing. But these are debug-oriented functions, so this overhead is not a real problem for the common case. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm

[RFC PATCH 1/8] mm, vmalloc: change iterating a vmlist to find_vm_area()

2012-12-06 Thread Joonsoo Kim
...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/arch/tile/mm/pgtable.c b/arch/tile/mm/pgtable.c index de0de0c..862782d 100644 --- a/arch/tile/mm/pgtable.c +++ b/arch/tile/mm/pgtable.c @@ -592,12 +592,7 @@ void iounmap

[RFC PATCH 3/8] mm, vmalloc: protect va->vm by vmap_area_lock

2012-12-06 Thread Joonsoo Kim
that, we should make sure that when we iterate a vmap_area_list, accessing va->vm doesn't cause a race condition. This patch ensures that when iterating a vmap_area_list, there is no race condition for accessing vm_struct. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/vmalloc.c b

Re: [PATCH v2 0/3] introduce static_vm for ARM-specific static mapped area

2012-12-06 Thread JoonSoo Kim
2012/11/28 Joonsoo Kim js1...@gmail.com: In the current implementation, we used an ARM-specific flag, that is, VM_ARM_STATIC_MAPPING, for distinguishing the ARM-specific static mapped area. The purpose of a static mapped area is to re-use the static mapped area when the entire physical address range of the ioremap

Re: [RFC PATCH 0/8] remove vm_struct list management

2012-12-07 Thread JoonSoo Kim
Hello, Andrew. 2012/12/7 Andrew Morton a...@linux-foundation.org: On Fri, 7 Dec 2012 01:09:27 +0900 Joonsoo Kim js1...@gmail.com wrote: This patchset removes vm_struct list management after initializing vmalloc. Adding and removing an entry to vmlist has linear time complexity, so

Re: [RFC PATCH 0/8] remove vm_struct list management

2012-12-07 Thread JoonSoo Kim
2012/12/7 Andrew Morton a...@linux-foundation.org: On Fri, 7 Dec 2012 01:09:27 +0900 Joonsoo Kim js1...@gmail.com wrote: I'm not sure that 7/8: makes vmlist only for kexec is fine. Because it is related to a userspace program. As far as I know, makedumpfile uses kexec's output information

Re: [RFC PATCH 0/8] remove vm_struct list management

2012-12-07 Thread JoonSoo Kim
Hello, Bob. 2012/12/7 Bob Liu lliu...@gmail.com: Hi Joonsoo, On Fri, Dec 7, 2012 at 12:09 AM, Joonsoo Kim js1...@gmail.com wrote: This patchset removes vm_struct list management after initializing vmalloc. Adding and removing an entry to vmlist has linear time complexity, so it is inefficient

Re: [RFC PATCH 1/8] mm, vmalloc: change iterating a vmlist to find_vm_area()

2012-12-07 Thread JoonSoo Kim
Hello, Pekka. 2012/12/7 Pekka Enberg penb...@kernel.org: On Thu, Dec 6, 2012 at 6:09 PM, Joonsoo Kim js1...@gmail.com wrote: The purpose of iterating a vmlist is finding vm area with specific virtual address. find_vm_area() is provided for this purpose and more efficient, because it uses

Re: [RFC PATCH 0/8] remove vm_struct list management

2012-12-10 Thread JoonSoo Kim
Hello, Vivek. 2012/12/7 Vivek Goyal vgo...@redhat.com: On Fri, Dec 07, 2012 at 10:16:55PM +0900, JoonSoo Kim wrote: 2012/12/7 Andrew Morton a...@linux-foundation.org: On Fri, 7 Dec 2012 01:09:27 +0900 Joonsoo Kim js1...@gmail.com wrote: I'm not sure that 7/8: makes vmlist only

[PATCH v3 1/2] scripts/tags.sh: Support subarch for ARM

2012-12-10 Thread Joonsoo Kim
cscope O=. SRCARCH=arm SUBARCH=xxx Signed-off-by: Joonsoo Kim js1...@gmail.com --- v2: change bash specific '[[]]' to 'case in' statement. v3: quote the patterns. diff --git a/scripts/tags.sh b/scripts/tags.sh index 79fdafb..8fb18d1 100755 --- a/scripts/tags.sh +++ b/scripts/tags.sh @@ -48,13 +48,14

[PATCH v3 2/2] scripts/tags.sh: Support compiled source

2012-12-10 Thread Joonsoo Kim
the kernel. Signed-off-by: Joonsoo Kim js1...@gmail.com --- v2: change bash specific '[[]]' to 'case in' statement. use COMPILED_SOURCE env var, instead of abusing SUBARCH v3: change [ $COMPILED_SOURCE=compiled ] to [ -n $COMPILED_SOURCE ] diff --git a/scripts/tags.sh b/scripts/tags.sh index

Re: [PATCH v2 4/5] mm, highmem: makes flush_all_zero_pkmaps() return index of first flushed entry

2012-11-27 Thread JoonSoo Kim
Hello, Andrew. 2012/11/20 Minchan Kim minc...@kernel.org: Hi Joonsoo, Sorry for the delay. On Thu, Nov 15, 2012 at 02:09:04AM +0900, JoonSoo Kim wrote: Hi, Minchan. 2012/11/14 Minchan Kim minc...@kernel.org: On Tue, Nov 13, 2012 at 11:12:28PM +0900, JoonSoo Kim wrote: 2012/11/13

[PATCH v2 0/3] introduce static_vm for ARM-specific static mapped area

2012-11-27 Thread Joonsoo Kim
and stat inevitably use vmlist and vmlist_lock. But it is preferable that they are used as little as possible outside of vmalloc.c Changelog v1->v2: [2/3]: patch description is improved. Rebased on v3.7-rc7 Joonsoo Kim (3): ARM: vmregion: remove vmregion code entirely ARM: static_vm

[PATCH v2 1/3] ARM: vmregion: remove vmregion code entirely

2012-11-27 Thread Joonsoo Kim
Now, there is no user for vmregion. So remove it. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index 8a9c4cb..4e333fa 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -6,7 +6,7 @@ obj-y := dma

[PATCH v2 2/3] ARM: static_vm: introduce an infrastructure for static mapped area

2012-11-27 Thread Joonsoo Kim
-by: Joonsoo Kim js1...@gmail.com diff --git a/arch/arm/include/asm/mach/static_vm.h b/arch/arm/include/asm/mach/static_vm.h new file mode 100644 index 000..1bb6604 --- /dev/null +++ b/arch/arm/include/asm/mach/static_vm.h @@ -0,0 +1,45 @@ +/* + * arch/arm/include/asm/mach/static_vm.h

[PATCH v2 3/3] ARM: mm: use static_vm for managing static mapped areas

2012-11-27 Thread Joonsoo Kim
. With it, we don't need to iterate over all mapped areas. Instead, we just iterate over static mapped areas. It helps to reduce the overhead of finding a matched area. And the architecture dependency on the vmalloc layer is removed, which helps the maintainability of the vmalloc layer. Signed-off-by: Joonsoo Kim js1

[PATCH] slub: assign refcount for kmalloc_caches

2012-12-25 Thread Joonsoo Kim
-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/slub.c b/mm/slub.c index a0d6984..321afab 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -3279,6 +3279,7 @@ static struct kmem_cache *__init create_kmalloc_cache(const char *name, if (kmem_cache_open(s, flags)) goto panic

Re: [PATCH] slub: assign refcount for kmalloc_caches

2012-12-25 Thread JoonSoo Kim
2012/12/26 Joonsoo Kim js1...@gmail.com: commit cce89f4f6911286500cf7be0363f46c9b0a12ce0('Move kmem_cache refcounting to common code') moves some refcount manipulation code to common code. Unfortunately, it also removed refcount assignment for kmalloc_caches. So, kmalloc_caches's refcount

Re: [PATCH 2/3] mm, bootmem: panic in bootmem alloc functions even if slab is available

2012-12-28 Thread JoonSoo Kim
Hello, Sasha. 2012/12/28 Sasha Levin sasha.le...@oracle.com: On 12/27/2012 06:04 PM, David Rientjes wrote: On Thu, 27 Dec 2012, Sasha Levin wrote: That's exactly what happens with the patch. Note that in the current upstream version there are several slab checks scattered all over. In

[PATCH] x86, reboot: skip reboot_fixups in early boot phase

2012-12-28 Thread Joonsoo Kim
into panic_smp_self_stop(), which prevents the system from restarting. To avoid a second panic, skip reboot_fixups in the early boot phase. This makes panic_timeout work in the early boot phase. Cc: Thomas Gleixner t...@linutronix.de Cc: Ingo Molnar mi...@redhat.com Cc: H. Peter Anvin h...@zytor.com Signed-off-by: Joonsoo
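A panic before the machine is fully initialized can recurse when the restart path pokes not-yet-ready devices. The shape of the fix, as far as the summary above describes it, is a boot-phase gate; the names below are invented for the sketch (the real patch presumably keys off the kernel's system_state):

```c
#include <stdbool.h>

/* Invented stand-in for the kernel's boot-phase tracking. */
enum sys_state { SYS_BOOTING, SYS_RUNNING };
static enum sys_state sys_state = SYS_BOOTING;

/* Board-specific reboot fixups may touch devices that are not initialized
 * in the early boot phase and panic a second time, which would defeat
 * panic_timeout's automatic reboot -- so skip them until boot completes. */
static bool should_run_reboot_fixups(void)
{
    return sys_state != SYS_BOOTING;
}
```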

Re: [RFC PATCH 0/8] remove vm_struct list management

2012-12-12 Thread JoonSoo Kim
Hello, Atsushi. 2012/12/12 Atsushi Kumagai kumagai-atsu...@mxc.nes.nec.co.jp: Hello, On Tue, 11 Dec 2012 17:17:05 -0500 (EST) Dave Anderson ander...@redhat.com wrote: - Original Message - On Mon, Dec 10, 2012 at 11:40:47PM +0900, JoonSoo Kim wrote: [..] So without

Re: [PATCH v2 0/3] introduce static_vm for ARM-specific static mapped area

2012-12-12 Thread JoonSoo Kim
2012/12/7 JoonSoo Kim js1...@gmail.com: 2012/11/28 Joonsoo Kim js1...@gmail.com: In the current implementation, we used an ARM-specific flag, that is, VM_ARM_STATIC_MAPPING, for distinguishing the ARM-specific static mapped area. The purpose of a static mapped area is to re-use the static mapped area when

[PATCH] mm: introduce numa_zero_pfn

2012-12-12 Thread Joonsoo Kim
reduce this overhead. This patch implements the basic infrastructure for numa_zero_pfn. It is disabled by default, because it doesn't provide page coloring and some architectures use page coloring for the zero page. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/Kconfig b/mm/Kconfig index a3f8ddd

Re: [PATCH] mm: introduce numa_zero_pfn

2012-12-17 Thread JoonSoo Kim
2012/12/13 Andi Kleen a...@firstfloor.org: I would expect a processor to fetch the zero page cachelines from the l3 cache from other sockets avoiding memory transactions altogether. The zero page is likely in use somewhere so no typically no memory accesses should occur in a system. It

[RFC PATCH 0/3] introduce static_vm for ARM-specific static mapped area

2012-11-14 Thread Joonsoo Kim
on v3.7-rc5. Thanks. Joonsoo Kim (3): ARM: vmregion: remove vmregion code entirely ARM: static_vm: introduce an infrastructure for static mapped area ARM: mm: use static_vm for managing static mapped areas arch/arm/include/asm/mach/static_vm.h | 51 arch/arm/mm/Makefile

[RFC PATCH 1/3] ARM: vmregion: remove vmregion code entirely

2012-11-14 Thread Joonsoo Kim
Now, there is no user for vmregion. So remove it. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index 8a9c4cb..4e333fa 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -6,7 +6,7 @@ obj-y := dma

[RFC PATCH 2/3] ARM: static_vm: introduce an infrastructure for static mapped area

2012-11-14 Thread Joonsoo Kim
and vmlist_lock. But it is preferable that they are used outside of vmalloc.c as little as possible. Now, I introduce an ARM-specific infrastructure for static mapped areas. In the following patch, we will use this and resolve the above-mentioned problem. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/arch

[RFC PATCH 3/3] ARM: mm: use static_vm for managing static mapped areas

2012-11-14 Thread Joonsoo Kim
. With it, we don't need to iterate over all mapped areas. Instead, we just iterate over static mapped areas. It helps to reduce the overhead of finding a matched area. And the architecture dependency on the vmalloc layer is removed, which helps the maintainability of the vmalloc layer. Signed-off-by: Joonsoo Kim js1

Re: [PATCH v2 4/5] mm, highmem: makes flush_all_zero_pkmaps() return index of first flushed entry

2012-11-14 Thread JoonSoo Kim
Hi, Minchan. 2012/11/14 Minchan Kim minc...@kernel.org: On Tue, Nov 13, 2012 at 11:12:28PM +0900, JoonSoo Kim wrote: 2012/11/13 Minchan Kim minc...@kernel.org: On Tue, Nov 13, 2012 at 09:30:57AM +0900, JoonSoo Kim wrote: 2012/11/3 Minchan Kim minc...@kernel.org: Hi Joonsoo, On Sat

[RFC PATCH] mm: WARN_ON_ONCE if f_op->mmap() change vma's start address

2012-11-14 Thread Joonsoo Kim
, it is a possible error situation, because we already prepared prev vma, rb_link and rb_parent, and these are related to the original address. So add WARN_ON_ONCE for finding out whether this situation really happens. Signed-off-by: Joonsoo Kim js1...@gmail.com diff --git a/mm/mmap.c b/mm/mmap.c index 2d94235..36567b7

Re: [RFC PATCH 0/3] introduce static_vm for ARM-specific static mapped area

2012-11-15 Thread JoonSoo Kim
Hello, Russell. Thanks for the review. 2012/11/15 Russell King - ARM Linux li...@arm.linux.org.uk: On Thu, Nov 15, 2012 at 01:55:51AM +0900, Joonsoo Kim wrote: In the current implementation, we used an ARM-specific flag, that is, VM_ARM_STATIC_MAPPING, for distinguishing the ARM-specific static mapped area

Re: [PATCH] slub: assign refcount for kmalloc_caches

2013-01-10 Thread Joonsoo Kim
, Dec 25, 2012 at 7:30 AM, JoonSoo Kim js1...@gmail.com wrote: 2012/12/26 Joonsoo Kim js1...@gmail.com: commit cce89f4f6911286500cf7be0363f46c9b0a12ce0('Move kmem_cache refcounting to common code') moves some refcount manipulation code to common code. Unfortunately, it also removed

[PATCH 1/3] slub: correct to calculate num of acquired objects in get_partial_node()

2013-01-14 Thread Joonsoo Kim
. After that, we don't need return value of put_cpu_partial(). So remove it. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com --- These are based on v3.8-rc3 and there is no dependency between each other. If rebase is needed, please notify me. diff --git a/mm/slub.c b/mm/slub.c index ba2ca53..abef30e

[PATCH 3/3] slub: add 'likely' macro to inc_slabs_node()

2013-01-14 Thread Joonsoo Kim
After the boot phase, 'n' always exists. So add the 'likely' macro to help the compiler. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/slub.c b/mm/slub.c index 830348b..6f82070 100644 --- a/mm/slub.c +++ b/mm/slub.c @@ -1005,7 +1005,7 @@ static inline void inc_slabs_node(struct
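likely()/unlikely() are thin wrappers over GCC's __builtin_expect; they never change the result of the condition, only which side the compiler lays out as the hot fall-through path. A minimal user-space sketch of the pattern this patch applies (the function name here is invented):

```c
#include <stddef.h>

#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Mirrors the shape of inc_slabs_node(): 'n' is NULL only during a short
 * early-boot window, so the pointer check is annotated as likely-true. */
static int inc_counter(int *n)
{
    if (likely(n)) {
        (*n)++;
        return 1;
    }
    return 0;   /* rare early-boot path */
}
```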

[PATCH 2/3] slub: correct bootstrap() for kmem_cache, kmem_cache_node

2013-01-14 Thread Joonsoo Kim
this slab. These didn't cause any error previously, because we normally don't free objects which come from kmem_cache's first slab or kmem_cache_node's. The problem will be solved if we consider a cpu slab in bootstrap(). This patch implements it. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com

[PATCH] tools, vm: add .gitignore to ignore built binaries

2013-01-14 Thread Joonsoo Kim
There is no .gitignore in tools/vm, so 'git status' always shows built binaries. To ignore these, add .gitignore. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/tools/vm/.gitignore b/tools/vm/.gitignore new file mode 100644 index 000..44f095f --- /dev/null +++ b/tools/vm

Re: [PATCH v2 2/3] sched: factor out code to should_we_balance()

2013-08-05 Thread Joonsoo Kim
On Fri, Aug 02, 2013 at 12:20:40PM +0200, Peter Zijlstra wrote: On Fri, Aug 02, 2013 at 06:05:51PM +0900, Joonsoo Kim wrote: What is with you people; have you never learned to trim emails? Seriously, I'm going to write a script which tests for too many quoted lines, too many nested quotes and

Re: [PATCH v2 2/3] sched: factor out code to should_we_balance()

2013-08-05 Thread Joonsoo Kim
On Mon, Aug 05, 2013 at 09:52:28AM +0530, Preeti U Murthy wrote: On 08/02/2013 04:02 PM, Peter Zijlstra wrote: On Fri, Aug 02, 2013 at 02:56:14PM +0530, Preeti U Murthy wrote: You need to iterate over all the groups of the sched domain env->sd and not just the first group of env->sd like you

Re: [PATCH v2 3/3] sched: clean-up struct sd_lb_stat

2013-08-05 Thread Joonsoo Kim
+ if (busiest->group_imb) { + busiest->sum_weighted_load = + min(busiest->sum_weighted_load, sds->sd_avg_load); Right here we get confused as to why the total load is being compared against load per task (although you are changing it to load per task above).

Re: [PATCH 17/18] mm, hugetlb: retry if we fail to allocate a hugepage with use_reserve

2013-08-05 Thread Joonsoo Kim
Any mapping that doesn't use the reserved pool, not just MAP_NORESERVE. For example, if a process makes a MAP_PRIVATE mapping, then fork()s, then the mapping is instantiated in the child, that will not draw from the reserved pool. Should we ensure that they can allocate the last hugepage? They

Re: [PATCH 2/4] mm, migrate: allocation new page lazyily in unmap_and_move()

2013-08-05 Thread Joonsoo Kim
get_new_page() sets up result to communicate error codes from the following checks. While the existing ones (page freed and thp split failed) don't change rc, somebody else might add a condition whose error code should be propagated back into *result but miss it. Please leave

Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

2013-08-05 Thread Joonsoo Kim
Hello, Michal. On Fri, Aug 02, 2013 at 11:36:07PM +0200, Michal Hocko wrote: On Fri 02-08-13 16:47:10, Johannes Weiner wrote: On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote: On Fri 02-08-13 11:07:56, Joonsoo Kim wrote: We rarely allocate a page with ALLOC_NO_WATERMARKS

Re: [PATCH 1/4] mm, page_alloc: add likely macro to help compiler optimization

2013-08-05 Thread Joonsoo Kim
On Mon, Aug 05, 2013 at 05:10:08PM +0900, Joonsoo Kim wrote: Hello, Michal. On Fri, Aug 02, 2013 at 11:36:07PM +0200, Michal Hocko wrote: On Fri 02-08-13 16:47:10, Johannes Weiner wrote: On Fri, Aug 02, 2013 at 06:27:22PM +0200, Michal Hocko wrote: On Fri 02-08-13 11:07:56, Joonsoo

[PATCH v3 0/3] optimization, clean-up about fair.c

2013-08-06 Thread Joonsoo Kim
, because I'm not sure they are right. Joonsoo Kim (3): sched: remove one division operation in find_buiest_queue() sched: factor out code to should_we_balance() sched: clean-up struct sd_lb_stat kernel/sched/fair.c | 326 +-- 1 file changed, 162

[PATCH v3 3/3] sched: clean-up struct sd_lb_stat

2013-08-06 Thread Joonsoo Kim
-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index c6732d2..f8a9660 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4232,36 +4232,6 @@ static unsigned long task_h_load(struct task_struct *p) /** Helpers for find_busiest_group

[PATCH v2 mmotm 3/3] swap: clean-up #ifdef in page_mapping()

2013-08-06 Thread Joonsoo Kim
PageSwapCache() is always false when !CONFIG_SWAP, so the compiler properly discards the related code. Therefore, we don't need the #ifdef explicitly. Acked-by: Johannes Weiner han...@cmpxchg.org Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/include/linux/swap.h b/include/linux/swap.h index
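The point of this clean-up is that an `if` on a compile-time constant-false predicate is just as cheap as an #ifdef: the compiler eliminates the dead branch entirely, and unlike #ifdef, the guarded code still gets syntax-checked in every configuration. A toy model (the config macro and function names are invented):

```c
#include <string.h>

#define CONFIG_SWAP_ENABLED 0   /* invented stand-in for CONFIG_SWAP */

/* Constant 0 when the feature is off, so the compiler can discard any
 * branch guarded by it -- no #ifdef needed at the call site. */
static inline int page_swap_cache(void)
{
    return CONFIG_SWAP_ENABLED;
}

static const char *mapping_kind(void)
{
    if (page_swap_cache())   /* dead code below when the constant is 0 */
        return "swap";
    return "file";
}
```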

[PATCH v2 mmotm 1/3] mm, page_alloc: add unlikely macro to help compiler optimization

2013-08-06 Thread Joonsoo Kim
right and nobody re-evaluates whether gcc does the proper optimization after their change; for example, it is not optimized properly on v3.10. So adding a compiler hint here is reasonable. Acked-by: Johannes Weiner han...@cmpxchg.org Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/page_alloc.c b/mm

[PATCH v3 2/3] sched: factor out code to should_we_balance()

2013-08-06 Thread Joonsoo Kim
354958aa7 kernel/sched/fair.o In addition, rename @balance to @should_balance in order to represent its purpose more clearly. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 52898dc..c6732d2 100644 --- a/kernel/sched/fair.c +++ b

[PATCH v2 mmotm 2/3] mm: move pgtable related functions to right place

2013-08-06 Thread Joonsoo Kim
pgtable related functions are mostly in pgtable-generic.c. So move remaining functions from memory.c to pgtable-generic.c. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/memory.c b/mm/memory.c index f2ab2a8..8fd4d42 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -374,30 +374,6

[PATCH v3 1/3] sched: remove one division operation in find_buiest_queue()

2013-08-06 Thread Joonsoo Kim
Remove one division operation in find_buiest_queue(). Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9565645..52898dc 100644 --- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -4968,7 +4968,7 @@ static struct rq
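The general trick behind removing a division in a load comparison is cross-multiplication: for positive denominators, a/b > c/d is equivalent to a*d > c*b, which trades a slow divide for two multiplies. The function below is an illustrative sketch of that idea, not the kernel's exact expressions:

```c
#include <stdbool.h>

/* Compare wl_a/power_a > wl_b/power_b without dividing: for positive
 * denominators this is equivalent to wl_a*power_b > wl_b*power_a.
 * 64-bit intermediates guard against overflow of the products. */
static bool heavier(unsigned long wl_a, unsigned long power_a,
                    unsigned long wl_b, unsigned long power_b)
{
    return (unsigned long long)wl_a * power_b >
           (unsigned long long)wl_b * power_a;
}
```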

[PATCH] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within MIGRATE_PCPTYPES iterations. In addition, batching to free more pages may be helpful to cache usage. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git

[PATCH 4/4] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within MIGRATE_PCPTYPES iterations. In addition, batching to free more pages may be helpful to cache usage. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git
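The arithmetic the patch description alludes to is a round-up division: derive the per-pass batch from the total count so that at most MIGRATE_PCPTYPES passes are needed. A sketch under that assumption (the helper name is invented; MIGRATE_PCPTYPES is 3 in kernels of this era):

```c
/* Round-up division: batches of this size cover 'count' pages within
 * MIGRATE_PCPTYPES passes of the free loop (illustrative, not the
 * exact kernel code). */
#define MIGRATE_PCPTYPES 3

static int batch_for(int count)
{
    return (count + MIGRATE_PCPTYPES - 1) / MIGRATE_PCPTYPES;
}
```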

[PATCH 3/4] mm, rmap: minimize lock hold when unlink_anon_vmas

2013-08-06 Thread Joonsoo Kim
Currently, we free the avc objects while holding a lock. To minimize lock hold time, we just move the avc objects to another list while holding the lock. Then, we iterate them and free the objects without holding the lock. This minimizes the lock hold time. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com
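The locking pattern described here is generic and worth a sketch: detach the whole list in O(1) while holding the lock, then do the expensive per-object freeing unlocked. This is an illustrative single-threaded user-space model, not the kernel code (which splices a list_head under the anon_vma lock); the lock stubs just mark where the critical section would be.

```c
#include <stdlib.h>

struct avc { struct avc *next; };

static struct avc *shared_list;   /* imagine: protected by 'lock' below */

/* No-op stand-ins for a real lock; they delimit the critical section. */
static void lock(void)   { }
static void unlock(void) { }

static int drain_and_free(void)
{
    struct avc *head, *next;
    int freed = 0;

    lock();
    head = shared_list;   /* O(1): detach the whole chain under the lock */
    shared_list = NULL;
    unlock();

    /* The potentially long free loop runs with the lock dropped. */
    for (; head; head = next) {
        next = head->next;
        free(head);
        freed++;
    }
    return freed;
}
```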

[PATCH 1/4] mm, rmap: do easy-job first in anon_vma_fork

2013-08-06 Thread Joonsoo Kim
If we fail due to some erroneous situation, it is better to quit without doing heavy work. So change the order of execution. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/rmap.c b/mm/rmap.c index a149e3a..c2f51cb 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -278,19 +278,19 @@ int

[PATCH 2/4] mm, rmap: allocate anon_vma_chain before starting to link anon_vma_chain

2013-08-06 Thread Joonsoo Kim
If we allocate anon_vma_chain before starting to link, we can reduce the lock hold time. This patch implement it. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/rmap.c b/mm/rmap.c index c2f51cb..1603f64 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -240,18 +240,21 @@ int

Re: [PATCH 4/4] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-06 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 05:43:40PM +0900, Joonsoo Kim wrote: If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within MIGRATE_PCPTYPES iterations. In addition, batching to free more pages may be helpful to cache usage

Re: [PATCH 1/4] mm, rmap: do easy-job first in anon_vma_fork

2013-08-07 Thread Joonsoo Kim
Hello, Johannes. On Tue, Aug 06, 2013 at 08:58:54AM -0400, Johannes Weiner wrote: if (anon_vma_clone(vma, pvma)) - return -ENOMEM; - - /* Then add our own anon_vma. */ - anon_vma = anon_vma_alloc(); - if (!anon_vma) - goto out_error; - avc =

Re: [PATCH 2/4] mm, rmap: allocate anon_vma_chain before starting to link anon_vma_chain

2013-08-07 Thread Joonsoo Kim
On Wed, Aug 07, 2013 at 02:08:03AM -0400, Johannes Weiner wrote: list_for_each_entry_reverse(pavc, src->anon_vma_chain, same_vma) { struct anon_vma *anon_vma; - avc = anon_vma_chain_alloc(GFP_NOWAIT | __GFP_NOWARN); - if (unlikely(!avc)) { -

Re: [PATCH 3/4] mm, rmap: minimize lock hold when unlink_anon_vmas

2013-08-07 Thread Joonsoo Kim
On Wed, Aug 07, 2013 at 02:11:38AM -0400, Johannes Weiner wrote: On Tue, Aug 06, 2013 at 05:43:39PM +0900, Joonsoo Kim wrote: Currently, we free the avc objects while holding a lock. To minimize lock hold time, we just move the avc objects to another list while holding the lock. Then, iterate

Re: [PATCH 17/18] mm, hugetlb: retry if we fail to allocate a hugepage with use_reserve

2013-08-07 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 06:38:49PM -0700, Davidlohr Bueso wrote: On Wed, 2013-08-07 at 11:03 +1000, David Gibson wrote: On Tue, Aug 06, 2013 at 05:18:44PM -0700, Davidlohr Bueso wrote: On Mon, 2013-08-05 at 16:36 +0900, Joonsoo Kim wrote: Any mapping that doesn't use the reserved pool

Re: [PATCH 0/2] hugepage: optimize page fault path locking

2013-08-07 Thread Joonsoo Kim
On Tue, Aug 06, 2013 at 05:08:04PM -0700, Davidlohr Bueso wrote: On Mon, 2013-07-29 at 15:18 +0900, Joonsoo Kim wrote: On Fri, Jul 26, 2013 at 07:27:23AM -0700, Davidlohr Bueso wrote: This patchset attempts to reduce the amount of contention we impose on the hugetlb_instantiation_mutex

Re: [PATCH] mm, page_alloc: optimize batch count in free_pcppages_bulk()

2013-08-07 Thread JoonSoo Kim
Hello, Andrew. 2013/8/7 Andrew Morton a...@linux-foundation.org: On Tue, 6 Aug 2013 17:40:40 +0900 Joonsoo Kim iamjoonsoo@lge.com wrote: If we use a division operation, we can compute a batch count closer to the ideal value. With this value, we can finish our job within

[PATCH v3 8/9] mm, hugetlb: remove decrement_hugepage_resv_vma()

2013-07-28 Thread Joonsoo Kim
. This patch implements it. Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index ca15854..4b1b043 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c

[PATCH v3 1/9] mm, hugetlb: move up the code which check availability of free huge page

2013-07-28 Thread Joonsoo Kim
-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index e2bfbf7..fc4988c 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -539,10 +539,6 @@ static struct page *dequeue_huge_page_vma(struct hstate *h

[PATCH v3 9/9] mm, hugetlb: decrement reserve count if VM_NORESERVE alloc page cache

2013-07-28 Thread Joonsoo Kim
, this patch solves the problem. Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 4b1b043..b3b8252 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c

[PATCH v3 6/9] mm, hugetlb: do not use a page in page cache for cow optimization

2013-07-28 Thread Joonsoo Kim
. If this page is not an AnonPage, we don't do the optimization. This turns off the optimization for a page cache. Acked-by: Michal Hocko mho...@suse.cz Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Naoya Horiguchi n-horigu...@ah.jp.nec.com Signed-off-by: Joonsoo Kim iamjoonsoo

[PATCH v3 7/9] mm, hugetlb: add VM_NORESERVE check in vma_has_reserves()

2013-07-28 Thread Joonsoo Kim
. With this change, the above test generates a SIGBUS, which is correct, because all free pages are reserved and a non-reserved shared mapping can't get a free page. Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo

[PATCH v3 0/9] mm, hugetlb: clean-up and possible bug fix

2013-07-28 Thread Joonsoo Kim
. Remove useless indentation changes in 'clean-up alloc_huge_page()' Fix new iteration code bug. Add reviewed-by or acked-by. Joonsoo Kim (9): mm, hugetlb: move up the code which check availability of free huge page mm, hugetlb: trivial commenting fix mm, hugetlb: clean-up alloc_huge_page

[PATCH v3 4/9] mm, hugetlb: fix and clean-up node iteration code to alloc or free

2013-07-28 Thread Joonsoo Kim
[alloc|free] and fix and clean up the node iteration code to alloc or free. This makes the code more understandable. Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 31d78c5..87d7637 100644 --- a/mm

[PATCH 05/18] mm, hugetlb: protect region tracking via newly introduced resv_map lock

2013-07-28 Thread Joonsoo Kim
it can be modified by two processes concurrently. To solve this, I introduce a lock to resv_map and make the region manipulation functions grab the lock before they do actual work. This makes region tracking safe. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/include/linux/hugetlb.h b

[PATCH 02/18] mm, hugetlb: change variable name reservations to resv

2013-07-28 Thread Joonsoo Kim
'reservations' is too long a name for a variable, and we use 'resv_map' to represent 'struct resv_map' elsewhere. To reduce confusion and improve readability, change it. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index d971233..12b6581 100644 --- a/mm

[PATCH v3 3/9] mm, hugetlb: clean-up alloc_huge_page()

2013-07-28 Thread Joonsoo Kim
This patch unifies successful allocation paths to make the code more readable. There are no functional changes. Acked-by: Michal Hocko mho...@suse.cz Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim

[PATCH v3 5/9] mm, hugetlb: remove redundant list_empty check in gather_surplus_pages()

2013-07-28 Thread Joonsoo Kim
If the list is empty, list_for_each_entry_safe() doesn't do anything. So, this check is redundant. Remove it. Acked-by: Michal Hocko mho...@suse.cz Reviewed-by: Wanpeng Li liw...@linux.vnet.ibm.com Reviewed-by: Aneesh Kumar K.V aneesh.ku...@linux.vnet.ibm.com Signed-off-by: Joonsoo Kim iamjoonsoo

[PATCH 00/18] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-07-28 Thread Joonsoo Kim
] mm/hugetlb: per-vma instantiation mutexes [2] https://lkml.org/lkml/2013/7/22/96 [PATCH v2 00/10] mm, hugetlb: clean-up and possible bug fix Joonsoo Kim (18): mm, hugetlb: protect reserved pages when softofflining requests the pages mm, hugetlb: change variable name reservations

[PATCH 04/18] mm, hugetlb: region manipulation functions take resv_map rather list_head

2013-07-28 Thread Joonsoo Kim
To change the protection method for region tracking to a fine-grained one, we pass the resv_map, instead of a list_head, to the region manipulation functions. This doesn't introduce any functional change; it just prepares for the next step. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff

[PATCH 16/18] mm, hugetlb: return a reserved page to a reserved pool if failed

2013-07-28 Thread Joonsoo Kim
need, because the reserve count is already decreased in dequeue_huge_page_vma(). This patch fixes this situation. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index bb8a45f..6a9ec69 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -649,6 +649,34 @@ struct hstate

[PATCH 18/18] mm, hugetlb: remove a hugetlb_instantiation_mutex

2013-07-28 Thread Joonsoo Kim
Now, we have prepared the infrastructure needed to remove this awkward mutex, which serializes all faulting tasks, so remove it. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 909075b..4fab047 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c

[PATCH 15/18] mm, hugetlb: move up anon_vma_prepare()

2013-07-28 Thread Joonsoo Kim
. So move up anon_vma_prepare(), which can fail in an OOM situation. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 683fd38..bb8a45f 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2536,6 +2536,15 @@ retry_avoidcopy: /* Drop

[PATCH 17/18] mm, hugetlb: retry if we fail to allocate a hugepage with use_reserve

2013-07-28 Thread Joonsoo Kim
. use_reserve represents that this user is a legitimate one who is ensured to have enough reserved pages. This prevents these threads from getting a SIGBUS signal and makes them retry fault handling. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 6a9ec69

[PATCH 14/18] mm, hugetlb: clean-up error handling in hugetlb_cow()

2013-07-28 Thread Joonsoo Kim
Current code includes 'Caller expects lock to be held' in every error path. We can clean this up by doing error handling in one place. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 255bd9e..683fd38 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c

[PATCH 01/18] mm, hugetlb: protect reserved pages when softofflining requests the pages

2013-07-28 Thread Joonsoo Kim
alloc_huge_page_node() uses dequeue_huge_page_node() without any validation check, so it can steal a reserved page unconditionally. To fix it, check the number of free huge pages in alloc_huge_page_node(). Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c

[PATCH 12/18] mm, hugetlb: remove a check for return value of alloc_huge_page()

2013-07-28 Thread Joonsoo Kim
Now, alloc_huge_page() only returns -ENOSPC if it fails. So, we don't need to worry about other return values. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 94173e0..35ccdad 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2562,7 +2562,6 @@ retry_avoidcopy

[PATCH 13/18] mm, hugetlb: grab a page_table_lock after page_cache_release

2013-07-28 Thread Joonsoo Kim
We don't need to hold the page_table_lock when we try to release a page. So, defer grabbing the page_table_lock. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 35ccdad..255bd9e 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2630,10 +2630,11

[PATCH 11/18] mm, hugetlb: move down outside_reserve check

2013-07-28 Thread Joonsoo Kim
Just move down the outside_reserve check. This makes the code more readable. There is no functional change. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 5f31ca5..94173e0 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -2530,20 +2530,6

[PATCH 10/18] mm, hugetlb: call vma_has_reserve() before entering alloc_huge_page()

2013-07-28 Thread Joonsoo Kim
handling and remove a hugetlb_instantiation_mutex. Signed-off-by: Joonsoo Kim iamjoonsoo@lge.com diff --git a/mm/hugetlb.c b/mm/hugetlb.c index a66226e..5f31ca5 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1123,12 +1123,12 @@ static void vma_commit_reservation(struct hstate *h, } static
