Re: [PATCH 01/11] mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating
> On Thu, Dec 29, 2011 at 2:39 PM, Marek Szyprowski m.szyprow...@samsung.com wrote:
>> From: Michal Nazarewicz min...@mina86.com
>>
>> When set_migratetype_isolate() sets pageblock's migrate type, it does
>> not change each page's page_private data.  This makes sense, as the
>> function has no way of knowing what kind of information page_private
>> stores.
>>
>> A side effect is that instead of draining pages from all zones,
>> set_migratetype_isolate() now drains only pages from the zone the
>> pageblock it operates on is in.
>>
>> +/* Caller must hold zone->lock. */
>> +static void __zone_drain_local_pages(void *arg)
>> +{
>> +	struct per_cpu_pages *pcp;
>> +	struct zone *zone = arg;
>> +	unsigned long flags;
>> +
>> +	local_irq_save(flags);
>> +	pcp = &per_cpu_ptr(zone->pageset, smp_processor_id())->pcp;
>> +	if (pcp->count) {
>> +		/* Caller holds zone->lock, no need to grab it. */
>> +		__free_pcppages_bulk(zone, pcp->count, pcp);
>> +		pcp->count = 0;
>> +	}
>> +	local_irq_restore(flags);
>> +}
>> +
>> +/*
>> + * Like drain_all_pages() but operates on a single zone.  Caller must
>> + * hold zone->lock.
>> + */
>> +static void __zone_drain_all_pages(struct zone *zone)
>> +{
>> +	on_each_cpu(__zone_drain_local_pages, zone, 1);
>> +}
>> +

On Sun, 01 Jan 2012 08:49:13 +0100, Gilad Ben-Yossef gi...@benyossef.com wrote:
> Please consider whether sending an IPI to all processors in the system
> and interrupting them is appropriate here.
>
> You seem to assume that it is probable that each CPU of the possibly
> 4,096 (MAXSMP on x86) has a per-cpu page for the specified zone,

I'm not really assuming that (in fact I expect what you fear, i.e. that
most CPUs won't have pages from the specified zone on their PCP list),
however, I really need to make sure to get them off all PCP lists.

> otherwise you're just interrupting them while they are doing something
> useful, or waking them from power-saving idle, for nothing.

Exactly what's happening now anyway.

> While that may or may not be a reasonable assumption for the general
> drain_all_pages that drains pcps from all zones, I feel it is less
> likely to be the right thing once you limit the drain to a single zone.

Currently, set_migratetype_isolate() seems to do more than it needs to,
i.e. it drains all the pages even though all it cares about is a single
zone.

> Some background on my attempt to reduce IPI noise in the system in this
> context is probably useful here as well:
> https://lkml.org/lkml/2011/11/22/133

Looks interesting, I'm not entirely sure why it does not end up a race
condition, but in case of __zone_drain_all_pages() we already hold
zone->lock, so my fears are somewhat gone..  I'll give it a try, and
prepare a patch for __zone_drain_all_pages().

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--
--
To unsubscribe from this list: send the line "unsubscribe linux-media" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH 01/11] mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating
2012/1/1 Michal Nazarewicz min...@mina86.com:
>> On Thu, Dec 29, 2011 at 2:39 PM, Marek Szyprowski m.szyprow...@samsung.com wrote:
>>> ...
>
> On Sun, 01 Jan 2012 08:49:13 +0100, Gilad Ben-Yossef gi...@benyossef.com wrote:
>> Please consider whether sending an IPI to all processors in the system
>> and interrupting them is appropriate here.
>>
>> You seem to assume that it is probable that each CPU of the possibly
>> 4,096 (MAXSMP on x86) has a per-cpu page for the specified zone,
>
> I'm not really assuming that (in fact I expect what you fear, i.e. that
> most CPUs won't have pages from the specified zone on their PCP list),
> however, I really need to make sure to get them off all PCP lists.

True, the question is whether or not you have to send a global IPI to do
that.

>> otherwise you're just interrupting them while they are doing something
>> useful, or waking them from power-saving idle, for nothing.
>
> Exactly what's happening now anyway.

Indeed.

>> While that may or may not be a reasonable assumption for the general
>> drain_all_pages that drains pcps from all zones, I feel it is less
>> likely to be the right thing once you limit the drain to a single zone.
>
> Currently, set_migratetype_isolate() seems to do more than it needs to,
> i.e. it drains all the pages even though all it cares about is a single
> zone.

I agree your patch is better than the current state. I just didn't want
to add yet another global IPI I'll have to chase afterwards.. :-)

>> Some background on my attempt to reduce IPI noise in the system in this
>> context is probably useful here as well:
>> https://lkml.org/lkml/2011/11/22/133
>
> Looks interesting, I'm not entirely sure why it does not end up a race
> condition, but in case of __zone_drain_all_pages() we already hold

If a page is in the PCP list when we check, you'll send the IPI and all
is well.

If it isn't there when we check and gets added later, you could just the
same have a situation where we send the IPI, try to drain an empty PCP
list, and then the page gets added.

So we are not adding a race condition that is not there already :-)

> zone->lock, so my fears are somewhat gone..  I'll give it a try, and
> prepare a patch for __zone_drain_all_pages().

I plan to send V5 of the IPI noise patch after some testing. It has a new
version of drain_all_pages, with no allocation in the reclaim path and
no locking. You might want to wait till that one is out to base on it.

Thank you for considering my feedback :-)

Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gi...@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML
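The race argument above can be illustrated with a small user-space model. This is only a sketch under stated assumptions: `struct pcp_model`, `NR_CPUS` as defined here, and all function names are hypothetical stand-ins, not kernel code, and the model is sequential and ignores locking. It compares two orderings: check-then-IPI with a page freed between the check and the drain, and a global IPI with a page freed just after the drain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical CPU count for the model */

/* Hypothetical model: pages on each CPU's PCP list for one zone. */
struct pcp_model {
	unsigned int count[NR_CPUS];
};

/* The "check": record which CPUs currently have a non-empty list. */
void snapshot_nonempty(const struct pcp_model *p, bool has_pages[NR_CPUS])
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		has_pages[cpu] = p->count[cpu] != 0;
}

/* The targeted IPIs: drain only the CPUs recorded by the check. */
void drain_marked(struct pcp_model *p, const bool has_pages[NR_CPUS])
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		if (has_pages[cpu])
			p->count[cpu] = 0;
}

/* The global IPI: every CPU drains, even if its list is empty. */
void drain_every_cpu(struct pcp_model *p)
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		p->count[cpu] = 0;
}

/* A page freed to one CPU's PCP list at some later point. */
void late_free(struct pcp_model *p, size_t cpu)
{
	assert(cpu < NR_CPUS);
	p->count[cpu]++;
}

/* True if both models hold the same number of pages on every CPU. */
bool same_state(const struct pcp_model *a, const struct pcp_model *b)
{
	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		if (a->count[cpu] != b->count[cpu])
			return false;
	return true;
}
```

In this model, running `snapshot_nonempty()`, then `late_free()`, then `drain_marked()` leaves the same per-CPU counts as running `drain_every_cpu()` followed by `late_free()` on the same CPU, which is the sense in which the targeted variant adds no outcome that the global one could not already produce.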
Re: [PATCH 01/11] mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating
On Sun, 01 Jan 2012 17:06:53 +0100, Gilad Ben-Yossef gi...@benyossef.com wrote:
> 2012/1/1 Michal Nazarewicz min...@mina86.com:
>> Looks interesting, I'm not entirely sure why it does not end up a race
>> condition, but in case of __zone_drain_all_pages() we already hold
>
> If a page is in the PCP list when we check, you'll send the IPI and all
> is well.
>
> If it isn't there when we check and gets added later, you could just the
> same have a situation where we send the IPI, try to drain an empty PCP
> list, and then the page gets added.
>
> So we are not adding a race condition that is not there already :-)

Right, makes sense.

>> zone->lock, so my fears are somewhat gone..  I'll give it a try, and
>> prepare a patch for __zone_drain_all_pages().
>
> I plan to send V5 of the IPI noise patch after some testing. It has a new
> version of drain_all_pages, with no allocation in the reclaim path and
> no locking. You might want to wait till that one is out to base on it.

This shouldn't be a problem for my case as set_migratetype_isolate() is
hardly ever called in the reclaim path. :)

The change so far seems rather obvious:

 mm/page_alloc.c |   14 +++++++++++++-
 1 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 424d36a..eaa686b 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -1181,7 +1181,19 @@ static void __zone_drain_local_pages(void *arg)
  */
 static void __zone_drain_all_pages(struct zone *zone)
 {
-	on_each_cpu(__zone_drain_local_pages, zone, 1);
+	struct per_cpu_pageset *pcp;
+	cpumask_var_t cpus;
+	int cpu;
+
+	if (likely(zalloc_cpumask_var(&cpus, GFP_ATOMIC | __GFP_NOWARN))) {
+		for_each_online_cpu(cpu)
+			if (per_cpu_ptr(zone->pageset, cpu)->pcp.count)
+				cpumask_set_cpu(cpu, cpus);
+		on_each_cpu_mask(cpus, __zone_drain_local_pages, zone, 1);
+		free_cpumask_var(cpus);
+	} else {
+		on_each_cpu(__zone_drain_local_pages, zone, 1);
+	}
 }
 
 #ifdef CONFIG_HIBERNATION

-- 
Best regards,                                         _     _
.o. | Liege of Serenely Enlightened Majesty of      o' \,=./ `o
..o | Computer Science,  Michał “mina86” Nazarewicz    (o o)
ooo +email/xmpp: m...@google.com--ooO--(_)--Ooo--
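The idea in the patch above (build a mask of CPUs whose PCP list for the zone is non-empty, then interrupt only those) can be tried out in user space. The following is a minimal sketch, not kernel code: `struct zone_model`, the 32-bit mask standing in for `cpumask_var_t`, and every function name are assumptions made for illustration, and the returned count merely models how many IPIs would be sent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 8 /* hypothetical CPU count; must not exceed 32 here */

/* Hypothetical stand-in for one zone and its per-CPU page lists. */
struct zone_model {
	unsigned int pcp_count[NR_CPUS]; /* pages cached on each CPU */
	unsigned int free_pages;         /* pages handed back to the zone */
};

/* Models the drain callback running on a single CPU. */
void drain_cpu(struct zone_model *z, size_t cpu)
{
	assert(cpu < NR_CPUS);
	z->free_pages += z->pcp_count[cpu];
	z->pcp_count[cpu] = 0;
}

/* Models the mask-building loop: one bit per CPU with cached pages. */
uint32_t cpus_with_pages(const struct zone_model *z)
{
	uint32_t mask = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++)
		if (z->pcp_count[cpu])
			mask |= (uint32_t)1 << cpu;
	return mask;
}

/* Drains only the CPUs in the mask; returns the number "interrupted". */
unsigned int drain_selected(struct zone_model *z, uint32_t mask)
{
	unsigned int interrupted = 0;

	for (size_t cpu = 0; cpu < NR_CPUS; cpu++) {
		if (mask & ((uint32_t)1 << cpu)) {
			drain_cpu(z, cpu);
			interrupted++;
		}
	}
	return interrupted;
}
```

With pages cached on only two of the eight modelled CPUs, `drain_selected(&z, cpus_with_pages(&z))` touches two CPUs instead of eight while returning the same pages to the zone, which is the saving the targeted variant is after.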
Re: [PATCH 01/11] mm: page_alloc: set_migratetype_isolate: drain PCP prior to isolating
On Thu, Dec 29, 2011 at 2:39 PM, Marek Szyprowski m.szyprow...@samsung.com wrote:
> From: Michal Nazarewicz min...@mina86.com
>
> When set_migratetype_isolate() sets pageblock's migrate type, it does
> not change each page's page_private data.  This makes sense, as the
> function has no way of knowing what kind of information page_private
> stores.
...
> A side effect is that instead of draining pages from all zones,
> set_migratetype_isolate() now drains only pages from the zone the
> pageblock it operates on is in.
...
> +/* Caller must hold zone->lock. */
> +static void __zone_drain_local_pages(void *arg)
> +{
> +	struct per_cpu_pages *pcp;
> +	struct zone *zone = arg;
> +	unsigned long flags;
> +
> +	local_irq_save(flags);
> +	pcp = &per_cpu_ptr(zone->pageset, smp_processor_id())->pcp;
> +	if (pcp->count) {
> +		/* Caller holds zone->lock, no need to grab it. */
> +		__free_pcppages_bulk(zone, pcp->count, pcp);
> +		pcp->count = 0;
> +	}
> +	local_irq_restore(flags);
> +}
> +
> +/*
> + * Like drain_all_pages() but operates on a single zone.  Caller must
> + * hold zone->lock.
> + */
> +static void __zone_drain_all_pages(struct zone *zone)
> +{
> +	on_each_cpu(__zone_drain_local_pages, zone, 1);
> +}
> +

Please consider whether sending an IPI to all processors in the system
and interrupting them is appropriate here.

You seem to assume that it is probable that each CPU of the possibly
4,096 (MAXSMP on x86) has a per-cpu page for the specified zone,
otherwise you're just interrupting them while they are doing something
useful, or waking them from power-saving idle, for nothing.

While that may or may not be a reasonable assumption for the general
drain_all_pages that drains pcps from all zones, I feel it is less
likely to be the right thing once you limit the drain to a single zone.

Some background on my attempt to reduce IPI noise in the system in this
context is probably useful here as well:
https://lkml.org/lkml/2011/11/22/133

Thanks :-)
Gilad

-- 
Gilad Ben-Yossef
Chief Coffee Drinker
gi...@benyossef.com
Israel Cell: +972-52-8260388
US Cell: +1-973-8260388
http://benyossef.com

"Unfortunately, cache misses are an equal opportunity pain provider."
-- Mike Galbraith, LKML