Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On Wed, Jun 15, 2016 at 11:27:31AM +0900, Joonsoo Kim wrote:
> On Tue, Jun 14, 2016 at 03:10:21PM -0400, Sasha Levin wrote:
> > On 06/14/2016 01:52 AM, Joonsoo Kim wrote:
> > > On Mon, Jun 13, 2016 at 04:31:15PM -0400, Sasha Levin wrote:
> > > > On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> > > > > From: Joonsoo Kim
> > > > >
> > > > > We don't need to split freepages with holding the zone lock. It
> > > > > will cause more contention on zone lock so not desirable.
> > > > >
> > > > > Signed-off-by: Joonsoo Kim
> > > >
> > > > Hey Joonsoo,
> > >
> > > Hello, Sasha.
> > >
> > > > I'm seeing the following corruption/crash which seems to be related to
> > > > this patch:
> > >
> > > Could you tell me why you think that following corruption is related
> > > to this patch? list_del() in __isolate_free_page() is unchanged part.
> > >
> > > Before this patch, we did it by split_free_page() ->
> > > __isolate_free_page() -> list_del(). With this patch, we do it by
> > > calling __isolate_free_page() directly.
> >
> > I haven't bisected it, but it's the first time I see this issue and this
> > commit seems to have done related changes that might cause this.
> >
> > I can go ahead with bisection if you don't think it's related.
>
> Hmm... I can't find a bug in this patch for now. There are more candidates
> in this area that were changed by me and it would be very helpful if you can
> do bisection.

Hello, Sasha.

You are right! Minchan found the bug in this patch!
I will send updated patch soon.

http://marc.info/?i=20160616100932.GS17127%40bbox

Thanks.
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On Tue, Jun 14, 2016 at 03:10:21PM -0400, Sasha Levin wrote:
> On 06/14/2016 01:52 AM, Joonsoo Kim wrote:
> > On Mon, Jun 13, 2016 at 04:31:15PM -0400, Sasha Levin wrote:
> > > On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> > > > From: Joonsoo Kim
> > > >
> > > > We don't need to split freepages with holding the zone lock. It will
> > > > cause more contention on zone lock so not desirable.
> > > >
> > > > Signed-off-by: Joonsoo Kim
> > >
> > > Hey Joonsoo,
> >
> > Hello, Sasha.
> >
> > > I'm seeing the following corruption/crash which seems to be related to
> > > this patch:
> >
> > Could you tell me why you think that following corruption is related
> > to this patch? list_del() in __isolate_free_page() is unchanged part.
> >
> > Before this patch, we did it by split_free_page() ->
> > __isolate_free_page() -> list_del(). With this patch, we do it by
> > calling __isolate_free_page() directly.
>
> I haven't bisected it, but it's the first time I see this issue and this
> commit seems to have done related changes that might cause this.
>
> I can go ahead with bisection if you don't think it's related.

Hmm... I can't find a bug in this patch for now. There are more candidates
in this area that were changed by me and it would be very helpful if you can
do bisection.

Thanks.
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On 06/14/2016 01:52 AM, Joonsoo Kim wrote:
> On Mon, Jun 13, 2016 at 04:31:15PM -0400, Sasha Levin wrote:
> > On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> > > From: Joonsoo Kim
> > >
> > > We don't need to split freepages with holding the zone lock. It will
> > > cause more contention on zone lock so not desirable.
> > >
> > > Signed-off-by: Joonsoo Kim
> >
> > Hey Joonsoo,
>
> Hello, Sasha.
>
> > I'm seeing the following corruption/crash which seems to be related to
> > this patch:
>
> Could you tell me why you think that following corruption is related
> to this patch? list_del() in __isolate_free_page() is unchanged part.
>
> Before this patch, we did it by split_free_page() ->
> __isolate_free_page() -> list_del(). With this patch, we do it by
> calling __isolate_free_page() directly.

I haven't bisected it, but it's the first time I see this issue and this
commit seems to have done related changes that might cause this.

I can go ahead with bisection if you don't think it's related.

Thanks,
Sasha
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On Mon, Jun 13, 2016 at 04:31:15PM -0400, Sasha Levin wrote:
> On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> > From: Joonsoo Kim
> >
> > We don't need to split freepages with holding the zone lock. It will cause
> > more contention on zone lock so not desirable.
> >
> > Signed-off-by: Joonsoo Kim
>
> Hey Joonsoo,

Hello, Sasha.

> I'm seeing the following corruption/crash which seems to be related to
> this patch:

Could you tell me why you think that following corruption is related
to this patch? list_del() in __isolate_free_page() is unchanged part.

Before this patch, we did it by split_free_page() ->
__isolate_free_page() -> list_del(). With this patch, we do it by
calling __isolate_free_page() directly.

Thanks.

> [ 3777.807224] [ cut here ]
> [ 3777.807834] WARNING: CPU: 5 PID: 3270 at lib/list_debug.c:62 __list_del_entry+0x14e/0x280
> [ 3777.808562] list_del corruption. next->prev should be ea0004a76120, but was ea0004a72120
> [ 3777.809498] Modules linked in:
> [ 3777.809923] CPU: 5 PID: 3270 Comm: khugepaged Tainted: GW 4.7.0-rc2-next-20160609-sasha-00024-g30ecaf6 #3101
> [ 3777.811014] 1100f9315d7b 0bb7299a 8807c98aec60 a0035b2b
> [ 3777.811816] 0005 fbfff5630bf4 41b58ab3 aaaf18e0
> [ 3777.812662] a00359bc 9e54d4a0 a8b2ade0 8807c98aece0
> [ 3777.813493] Call Trace:
> [ 3777.813796] dump_stack (lib/dump_stack.c:53)
> [ 3777.814310] ? arch_local_irq_restore (./arch/x86/include/asm/paravirt.h:134)
> [ 3777.814947] ? is_module_text_address (kernel/module.c:4185)
> [ 3777.815571] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
> [ 3777.816174] ? vprintk_default (kernel/printk/printk.c:1886)
> [ 3777.816761] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
> [ 3777.817381] __warn (kernel/panic.c:518)
> [ 3777.817867] warn_slowpath_fmt (kernel/panic.c:526)
> [ 3777.818428] ? __warn (kernel/panic.c:526)
> [ 3777.819001] ? __schedule (kernel/sched/core.c:2858 kernel/sched/core.c:3345)
> [ 3777.819541] __list_del_entry (lib/list_debug.c:60 (discriminator 1))
> [ 3777.820116] ? __list_add (lib/list_debug.c:45)
> [ 3777.820721] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
> [ 3777.821347] list_del (lib/list_debug.c:78)
> [ 3777.821829] __isolate_free_page (mm/page_alloc.c:2514)
> [ 3777.822400] ? __zone_watermark_ok (mm/page_alloc.c:2493)
> [ 3777.823007] isolate_freepages_block (mm/compaction.c:498)
> [ 3777.823629] ? compact_unlock_should_abort (mm/compaction.c:417)
> [ 3777.824312] compaction_alloc (mm/compaction.c:1112 mm/compaction.c:1156)
> [ 3777.824871] ? isolate_freepages_block (mm/compaction.c:1146)
> [ 3777.825512] ? __page_cache_release (mm/swap.c:73)
> [ 3777.826127] migrate_pages (mm/migrate.c:1079 mm/migrate.c:1325)
> [ 3777.826712] ? __reset_isolation_suitable (mm/compaction.c:1175)
> [ 3777.827398] ? isolate_freepages_block (mm/compaction.c:1146)
> [ 3777.828109] ? buffer_migrate_page (mm/migrate.c:1301)
> [ 3777.828727] compact_zone (mm/compaction.c:1555)
> [ 3777.829290] ? compaction_restarting (mm/compaction.c:1476)
> [ 3777.829969] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:92 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
> [ 3777.830607] compact_zone_order (mm/compaction.c:1653)
> [ 3777.831204] ? kick_process (kernel/sched/core.c:2692)
> [ 3777.831774] ? compact_zone (mm/compaction.c:1637)
> [ 3777.832336] ? io_schedule_timeout (kernel/sched/core.c:3266)
> [ 3777.832934] try_to_compact_pages (mm/compaction.c:1717)
> [ 3777.833550] ? compaction_zonelist_suitable (mm/compaction.c:1679)
> [ 3777.834265] __alloc_pages_direct_compact (mm/page_alloc.c:3180)
> [ 3777.834922] ? get_page_from_freelist (mm/page_alloc.c:3172)
> [ 3777.835549] __alloc_pages_slowpath (mm/page_alloc.c:3741)
> [ 3777.836210] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:84 arch/x86/kernel/kvmclock.c:92)
> [ 3777.836744] ? __alloc_pages_direct_compact (mm/page_alloc.c:3546)
> [ 3777.837429] ? get_page_from_freelist (mm/page_alloc.c:2950)
> [ 3777.838072] ? release_pages (mm/swap.c:731)
> [ 3777.838610] ? __isolate_free_page (mm/page_alloc.c:2883)
> [ 3777.839209] ? ___might_sleep (kernel/sched/core.c:7540 (discriminator 1))
> [ 3777.839826] ? __might_sleep (kernel/sched/core.c:7532 (discriminator 14))
> [ 3777.840427] __alloc_pages_nodemask (mm/page_alloc.c:3841)
> [ 3777.841071] ? rwsem_wake (kernel/locking/rwsem-xadd.c:580)
> [ 3777.841608] ? __alloc_pages_slowpath (mm/page_alloc.c:3757)
> [ 3777.842253] ? call_rwsem_wake (arch/x86/lib/rwsem.S:129)
> [ 3777.842839] ? up_write (kernel/locking/rwsem.c:112)
> [ 3777.843350] ? pmdp_huge_clear_flush (mm/pgtable-generic.c:131)
> [ 3777.844125] khugepaged_alloc_page (mm/khugepaged.c:752)
> [ 3777.844719] collapse_huge_page (mm/khugepaged.c:948)
> [ 3777.845332] ? khugepaged_scan_shmem
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On 05/25/2016 10:37 PM, js1...@gmail.com wrote:
> From: Joonsoo Kim
>
> We don't need to split freepages with holding the zone lock. It will cause
> more contention on zone lock so not desirable.
>
> Signed-off-by: Joonsoo Kim

Hey Joonsoo,

I'm seeing the following corruption/crash which seems to be related to
this patch:

[ 3777.807224] [ cut here ]
[ 3777.807834] WARNING: CPU: 5 PID: 3270 at lib/list_debug.c:62 __list_del_entry+0x14e/0x280
[ 3777.808562] list_del corruption. next->prev should be ea0004a76120, but was ea0004a72120
[ 3777.809498] Modules linked in:
[ 3777.809923] CPU: 5 PID: 3270 Comm: khugepaged Tainted: GW 4.7.0-rc2-next-20160609-sasha-00024-g30ecaf6 #3101
[ 3777.811014] 1100f9315d7b 0bb7299a 8807c98aec60 a0035b2b
[ 3777.811816] 0005 fbfff5630bf4 41b58ab3 aaaf18e0
[ 3777.812662] a00359bc 9e54d4a0 a8b2ade0 8807c98aece0
[ 3777.813493] Call Trace:
[ 3777.813796] dump_stack (lib/dump_stack.c:53)
[ 3777.814310] ? arch_local_irq_restore (./arch/x86/include/asm/paravirt.h:134)
[ 3777.814947] ? is_module_text_address (kernel/module.c:4185)
[ 3777.815571] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.816174] ? vprintk_default (kernel/printk/printk.c:1886)
[ 3777.816761] ? __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.817381] __warn (kernel/panic.c:518)
[ 3777.817867] warn_slowpath_fmt (kernel/panic.c:526)
[ 3777.818428] ? __warn (kernel/panic.c:526)
[ 3777.819001] ? __schedule (kernel/sched/core.c:2858 kernel/sched/core.c:3345)
[ 3777.819541] __list_del_entry (lib/list_debug.c:60 (discriminator 1))
[ 3777.820116] ? __list_add (lib/list_debug.c:45)
[ 3777.820721] ? __this_cpu_preempt_check (lib/smp_processor_id.c:63)
[ 3777.821347] list_del (lib/list_debug.c:78)
[ 3777.821829] __isolate_free_page (mm/page_alloc.c:2514)
[ 3777.822400] ? __zone_watermark_ok (mm/page_alloc.c:2493)
[ 3777.823007] isolate_freepages_block (mm/compaction.c:498)
[ 3777.823629] ? compact_unlock_should_abort (mm/compaction.c:417)
[ 3777.824312] compaction_alloc (mm/compaction.c:1112 mm/compaction.c:1156)
[ 3777.824871] ? isolate_freepages_block (mm/compaction.c:1146)
[ 3777.825512] ? __page_cache_release (mm/swap.c:73)
[ 3777.826127] migrate_pages (mm/migrate.c:1079 mm/migrate.c:1325)
[ 3777.826712] ? __reset_isolation_suitable (mm/compaction.c:1175)
[ 3777.827398] ? isolate_freepages_block (mm/compaction.c:1146)
[ 3777.828109] ? buffer_migrate_page (mm/migrate.c:1301)
[ 3777.828727] compact_zone (mm/compaction.c:1555)
[ 3777.829290] ? compaction_restarting (mm/compaction.c:1476)
[ 3777.829969] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:92 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 3777.830607] compact_zone_order (mm/compaction.c:1653)
[ 3777.831204] ? kick_process (kernel/sched/core.c:2692)
[ 3777.831774] ? compact_zone (mm/compaction.c:1637)
[ 3777.832336] ? io_schedule_timeout (kernel/sched/core.c:3266)
[ 3777.832934] try_to_compact_pages (mm/compaction.c:1717)
[ 3777.833550] ? compaction_zonelist_suitable (mm/compaction.c:1679)
[ 3777.834265] __alloc_pages_direct_compact (mm/page_alloc.c:3180)
[ 3777.834922] ? get_page_from_freelist (mm/page_alloc.c:3172)
[ 3777.835549] __alloc_pages_slowpath (mm/page_alloc.c:3741)
[ 3777.836210] ? kvm_clock_read (./arch/x86/include/asm/preempt.h:84 arch/x86/kernel/kvmclock.c:92)
[ 3777.836744] ? __alloc_pages_direct_compact (mm/page_alloc.c:3546)
[ 3777.837429] ? get_page_from_freelist (mm/page_alloc.c:2950)
[ 3777.838072] ? release_pages (mm/swap.c:731)
[ 3777.838610] ? __isolate_free_page (mm/page_alloc.c:2883)
[ 3777.839209] ? ___might_sleep (kernel/sched/core.c:7540 (discriminator 1))
[ 3777.839826] ? __might_sleep (kernel/sched/core.c:7532 (discriminator 14))
[ 3777.840427] __alloc_pages_nodemask (mm/page_alloc.c:3841)
[ 3777.841071] ? rwsem_wake (kernel/locking/rwsem-xadd.c:580)
[ 3777.841608] ? __alloc_pages_slowpath (mm/page_alloc.c:3757)
[ 3777.842253] ? call_rwsem_wake (arch/x86/lib/rwsem.S:129)
[ 3777.842839] ? up_write (kernel/locking/rwsem.c:112)
[ 3777.843350] ? pmdp_huge_clear_flush (mm/pgtable-generic.c:131)
[ 3777.844125] khugepaged_alloc_page (mm/khugepaged.c:752)
[ 3777.844719] collapse_huge_page (mm/khugepaged.c:948)
[ 3777.845332] ? khugepaged_scan_shmem (mm/khugepaged.c:922)
[ 3777.846020] ? __might_sleep (kernel/sched/core.c:7532 (discriminator 14))
[ 3777.846608] ? remove_wait_queue (kernel/sched/wait.c:292)
[ 3777.847181] khugepaged (mm/khugepaged.c:1724 mm/khugepaged.c:1799 mm/khugepaged.c:1848)
[ 3777.847704] ? _raw_spin_unlock_irq (./arch/x86/include/asm/preempt.h:92 include/linux/spinlock_api_smp.h:171 kernel/locking/spinlock.c:199)
[ 3777.848297] ? collapse_huge_page (mm/khugepaged.c:1840)
[ 3777.848950] ? io_schedule_timeout (kernel/sched/core.c:3266)
[ 3777.849555] ? default_wake_function (kernel/sched/core.c:3544)
[
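For readers unfamiliar with the warning in the trace above: the "next->prev should be ..., but was ..." message comes from a consistency check made before a list entry is unlinked. What follows is a minimal userspace sketch of that kind of check, not the kernel's lib/list_debug.c code. The names struct node, node_add(), node_del_checked(), demo_clean_delete() and demo_double_add() are hypothetical and exist only for this illustration. The demo shows one generic way such a mismatch can arise (an entry linked into a second list while it is still on the first) and makes no claim about the root cause in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct node {
	struct node *next, *prev;
};

enum del_result { DEL_OK, DEL_BAD_PREV, DEL_BAD_NEXT };

static void node_init(struct node *n)
{
	n->next = n;
	n->prev = n;
}

/* Insert n right after head; does not check whether n is already linked. */
static void node_add(struct node *n, struct node *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n only if both neighbours still point back at it. */
static enum del_result node_del_checked(struct node *n)
{
	if (n->prev->next != n) {
		fprintf(stderr, "del corruption. prev->next should be %p, but was %p\n",
			(void *)n, (void *)n->prev->next);
		return DEL_BAD_PREV;
	}
	if (n->next->prev != n) {
		fprintf(stderr, "del corruption. next->prev should be %p, but was %p\n",
			(void *)n, (void *)n->next->prev);
		return DEL_BAD_NEXT;
	}
	n->next->prev = n->prev;
	n->prev->next = n->next;
	node_init(n);
	return DEL_OK;
}

/* Well-formed case: add two entries to one list, then remove one. */
static enum del_result demo_clean_delete(void)
{
	struct node head, x, y;

	node_init(&head);
	node_add(&x, &head);
	node_add(&y, &head);	/* head -> y -> x */
	return node_del_checked(&y);
}

/* Broken case: x is added to list b while it is still linked on list a. */
static enum del_result demo_double_add(void)
{
	struct node a, b, x, y;

	node_init(&a);
	node_init(&b);
	node_add(&x, &a);
	node_add(&y, &a);	/* a -> y -> x */
	assert(y.next == &x);

	node_add(&x, &b);	/* x.prev now points at b, not at y */
	return node_del_checked(&y);
}
```

In this sketch demo_clean_delete() returns DEL_OK, while demo_double_add() returns DEL_BAD_NEXT because y's successor no longer points back at y.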
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On 06/03/2016 02:45 PM, Joonsoo Kim wrote:
> 2016-06-03 19:10 GMT+09:00 Vlastimil Babka:
> > On 05/26/2016 04:37 AM, js1...@gmail.com wrote:
> > > From: Joonsoo Kim
> > >
> > > We don't need to split freepages with holding the zone lock. It will cause
> > > more contention on zone lock so not desirable.
> > >
> > > Signed-off-by: Joonsoo Kim
> >
> > So it wasn't possible to at least move this code from compaction.c to
> > page_alloc.c? Or better, reuse prep_new_page() with some forged
> > gfp/alloc_flags? As we discussed in v1...
>
> Sorry for not mentioning that I did it as a separate patch.
> Please see below link which is the last one within this patchset.
>
> Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo@lge.com

Ah I see. In that case,

Acked-by: Vlastimil Babka

> Thanks.
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
2016-06-03 19:10 GMT+09:00 Vlastimil Babka:
> On 05/26/2016 04:37 AM, js1...@gmail.com wrote:
> > From: Joonsoo Kim
> >
> > We don't need to split freepages with holding the zone lock. It will cause
> > more contention on zone lock so not desirable.
> >
> > Signed-off-by: Joonsoo Kim
>
> So it wasn't possible to at least move this code from compaction.c to
> page_alloc.c? Or better, reuse prep_new_page() with some forged
> gfp/alloc_flags? As we discussed in v1...

Sorry for not mentioning that I did it as a separate patch.
Please see below link which is the last one within this patchset.

Link: http://lkml.kernel.org/r/1464230275-25791-7-git-send-email-iamjoonsoo@lge.com

Thanks.
Re: [PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
On 05/26/2016 04:37 AM, js1...@gmail.com wrote:
> From: Joonsoo Kim
>
> We don't need to split freepages with holding the zone lock. It will cause
> more contention on zone lock so not desirable.
>
> Signed-off-by: Joonsoo Kim

So it wasn't possible to at least move this code from compaction.c to
page_alloc.c? Or better, reuse prep_new_page() with some forged
gfp/alloc_flags? As we discussed in v1...

> ---
>  include/linux/mm.h |  1 -
>  mm/compaction.c    | 42 ++
>  mm/page_alloc.c    | 27 ---
>  3 files changed, 30 insertions(+), 40 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a00ec81..1a1782c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -537,7 +537,6 @@ void __put_page(struct page *page);
>  void put_pages_list(struct list_head *pages);
>
>  void split_page(struct page *page, unsigned int order);
> -int split_free_page(struct page *page);
>
>  /*
>   * Compound pages have a destructor function.  Provide a
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 1427366..8e013eb 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -65,13 +65,31 @@ static unsigned long release_freepages(struct list_head *freelist)
>
>  static void map_pages(struct list_head *list)
>  {
> -	struct page *page;
> +	unsigned int i, order, nr_pages;
> +	struct page *page, *next;
> +	LIST_HEAD(tmp_list);
> +
> +	list_for_each_entry_safe(page, next, list, lru) {
> +		list_del(&page->lru);
> +
> +		order = page_private(page);
> +		nr_pages = 1 << order;
> +		set_page_private(page, 0);
> +		set_page_refcounted(page);
> +
> +		arch_alloc_page(page, order);
> +		kernel_map_pages(page, nr_pages, 1);
> +		kasan_alloc_pages(page, order);
> +		if (order)
> +			split_page(page, order);
>
> -	list_for_each_entry(page, list, lru) {
> -		arch_alloc_page(page, 0);
> -		kernel_map_pages(page, 1, 1);
> -		kasan_alloc_pages(page, 0);
> +		for (i = 0; i < nr_pages; i++) {
> +			list_add(&page->lru, &tmp_list);
> +			page++;
> +		}
>  	}
> +
> +	list_splice(&tmp_list, list);
>  }
>
>  static inline bool migrate_async_suitable(int migratetype)
> @@ -368,12 +386,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  	unsigned long flags = 0;
>  	bool locked = false;
>  	unsigned long blockpfn = *start_pfn;
> +	unsigned int order;
>
>  	cursor = pfn_to_page(blockpfn);
>
>  	/* Isolate free pages. */
>  	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
> -		int isolated, i;
> +		int isolated;
>  		struct page *page = cursor;
>
>  		/*
> @@ -439,13 +458,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
>  				goto isolate_fail;
>  		}
>
> -		/* Found a free page, break it into order-0 pages */
> -		isolated = split_free_page(page);
> +		/* Found a free page, will break it into order-0 pages */
> +		order = page_order(page);
> +		isolated = __isolate_free_page(page, page_order(page));
> +		set_page_private(page, order);
>  		total_isolated += isolated;
> -		for (i = 0; i < isolated; i++) {
> -			list_add(&page->lru, freelist);
> -			page++;
> -		}
> +		list_add_tail(&page->lru, freelist);
>
>  		/* If a page was split, advance to the end of it */
>  		if (isolated) {
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index d27e8b9..5134f46 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -2525,33 +2525,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
>  }
>
>  /*
> - * Similar to split_page except the page is already free. As this is only
> - * being used for migration, the migratetype of the block also changes.
> - * As this is called with interrupts disabled, the caller is responsible
> - * for calling arch_alloc_page() and kernel_map_page() after interrupts
> - * are enabled.
> - *
> - * Note: this is probably too low level an operation for use in drivers.
> - * Please consult with lkml before using this in your driver.
> - */
> -int split_free_page(struct page *page)
> -{
> -	unsigned int order;
> -	int nr_pages;
> -
> -	order = page_order(page);
> -
> -	nr_pages = __isolate_free_page(page, order);
> -	if (!nr_pages)
> -		return 0;
> -
> -	/* Split into individual pages */
> -	set_page_refcounted(page);
> -	split_page(page, order);
> -	return nr_pages;
> -}
> -
> -/*
>   * Update NUMA hit/miss statistics
>   *
>   * Must be called with interrupts disabled.
[PATCH v2 1/7] mm/compaction: split freepages without holding the zone lock
From: Joonsoo Kim

We don't need to split freepages with holding the zone lock. It will cause
more contention on zone lock so not desirable.

Signed-off-by: Joonsoo Kim
---
 include/linux/mm.h |  1 -
 mm/compaction.c    | 42 ++++++++++++++++++++++++++++++------------
 mm/page_alloc.c    | 27 ---------------------------
 3 files changed, 30 insertions(+), 40 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a00ec81..1a1782c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -537,7 +537,6 @@ void __put_page(struct page *page);
 void put_pages_list(struct list_head *pages);
 
 void split_page(struct page *page, unsigned int order);
-int split_free_page(struct page *page);
 
 /*
  * Compound pages have a destructor function.  Provide a
diff --git a/mm/compaction.c b/mm/compaction.c
index 1427366..8e013eb 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -65,13 +65,31 @@ static unsigned long release_freepages(struct list_head *freelist)
 
 static void map_pages(struct list_head *list)
 {
-	struct page *page;
+	unsigned int i, order, nr_pages;
+	struct page *page, *next;
+	LIST_HEAD(tmp_list);
+
+	list_for_each_entry_safe(page, next, list, lru) {
+		list_del(&page->lru);
+
+		order = page_private(page);
+		nr_pages = 1 << order;
+		set_page_private(page, 0);
+		set_page_refcounted(page);
+
+		arch_alloc_page(page, order);
+		kernel_map_pages(page, nr_pages, 1);
+		kasan_alloc_pages(page, order);
+		if (order)
+			split_page(page, order);
 
-	list_for_each_entry(page, list, lru) {
-		arch_alloc_page(page, 0);
-		kernel_map_pages(page, 1, 1);
-		kasan_alloc_pages(page, 0);
+		for (i = 0; i < nr_pages; i++) {
+			list_add(&page->lru, &tmp_list);
+			page++;
+		}
 	}
+
+	list_splice(&tmp_list, list);
 }
 
 static inline bool migrate_async_suitable(int migratetype)
@@ -368,12 +386,13 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 	unsigned long flags = 0;
 	bool locked = false;
 	unsigned long blockpfn = *start_pfn;
+	unsigned int order;
 
 	cursor = pfn_to_page(blockpfn);
 
 	/* Isolate free pages. */
 	for (; blockpfn < end_pfn; blockpfn++, cursor++) {
-		int isolated, i;
+		int isolated;
 		struct page *page = cursor;
 
 		/*
@@ -439,13 +458,12 @@ static unsigned long isolate_freepages_block(struct compact_control *cc,
 				goto isolate_fail;
 		}
 
-		/* Found a free page, break it into order-0 pages */
-		isolated = split_free_page(page);
+		/* Found a free page, will break it into order-0 pages */
+		order = page_order(page);
+		isolated = __isolate_free_page(page, page_order(page));
+		set_page_private(page, order);
 		total_isolated += isolated;
-		for (i = 0; i < isolated; i++) {
-			list_add(&page->lru, freelist);
-			page++;
-		}
+		list_add_tail(&page->lru, freelist);
 
 		/* If a page was split, advance to the end of it */
 		if (isolated) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d27e8b9..5134f46 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2525,33 +2525,6 @@ int __isolate_free_page(struct page *page, unsigned int order)
 }
 
 /*
- * Similar to split_page except the page is already free. As this is only
- * being used for migration, the migratetype of the block also changes.
- * As this is called with interrupts disabled, the caller is responsible
- * for calling arch_alloc_page() and kernel_map_page() after interrupts
- * are enabled.
- *
- * Note: this is probably too low level an operation for use in drivers.
- * Please consult with lkml before using this in your driver.
- */
-int split_free_page(struct page *page)
-{
-	unsigned int order;
-	int nr_pages;
-
-	order = page_order(page);
-
-	nr_pages = __isolate_free_page(page, order);
-	if (!nr_pages)
-		return 0;
-
-	/* Split into individual pages */
-	set_page_refcounted(page);
-	split_page(page, order);
-	return nr_pages;
-}
-
-/*
  * Update NUMA hit/miss statistics
  *
  * Must be called with interrupts disabled.
-- 
1.9.1
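For readers following the thread, the idea in the patch (record the block's
order while the zone lock is held, and do the split into order-0 pages later,
outside the lock) can be modelled in userspace. The sketch below is only an
illustration under stated assumptions: struct fake_page, pages[], zone_lock,
isolate_block() and split_block() are hypothetical names invented here, a
pthread mutex stands in for the zone lock, and none of it is kernel code or
the author's implementation.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct page; not the kernel type. */
struct fake_page {
	unsigned long private;	/* holds the block order while isolated */
	int on_list;		/* 1 once the page sits on the order-0 list */
};

#define NR_FAKE_PAGES 16

static struct fake_page pages[NR_FAKE_PAGES];
static pthread_mutex_t zone_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Locked phase: only stash the order on the head page. No per-page work
 * is done here, so the critical section stays short.
 */
static unsigned int isolate_block(size_t head, unsigned int order)
{
	pthread_mutex_lock(&zone_lock);
	pages[head].private = order;
	pthread_mutex_unlock(&zone_lock);

	return 1u << order;	/* number of order-0 pages in the block */
}

/*
 * Unlocked phase: read the stashed order back, clear it, and expand the
 * block into 1 << order individual pages.
 */
static unsigned int split_block(size_t head)
{
	unsigned int order = (unsigned int)pages[head].private;
	unsigned int i, nr = 1u << order;

	pages[head].private = 0;
	for (i = 0; i < nr; i++)
		pages[head + i].on_list = 1;

	return nr;
}
```

Calling isolate_block(4, 2) and then split_block(4) marks pages 4 to 7 as
individual pages, with the per-page loop running after the lock has been
dropped. In the patch, page_private() carries the order between the two
phases in the same way.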