On Mon, 2010-05-17 at 16:25 +0200, Stephane Eranian wrote:
> > Right, but I think the default of inherit is right, and once you do that
> > you basically have to do the per-task-per-cpu thing, otherwise your
> > fancy 16-way will start spending most of its time in cacheline bounces.
> >
> In that case, don't you think you should also ensure that the buffer is
> allocated on the NUMA node of the designated per-thread-per-cpu?
> I don't think it is the case today.

Yeah, something like the below ought to do I guess..

Almost-Signed-off-by: Peter Zijlstra <a.p.zijls...@chello.nl>
---
 kernel/perf_event.c |   17 +++++++++++++++--
 1 files changed, 15 insertions(+), 2 deletions(-)

diff --git a/kernel/perf_event.c b/kernel/perf_event.c
index 9dbe8cd..85e2d32 100644
--- a/kernel/perf_event.c
+++ b/kernel/perf_event.c
@@ -2288,6 +2288,19 @@ perf_mmap_to_page(struct perf_mmap_data *data, unsigned long pgoff)
 	return virt_to_page(data->data_pages[pgoff - 1]);
 }
 
+static void *perf_mmap_alloc_page(int cpu)
+{
+	struct page *page;
+	int node;
+
+	node = (cpu == -1) ? cpu : cpu_to_node(cpu);
+	page = alloc_pages_node(node, GFP_KERNEL | __GFP_ZERO, 0);
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static struct perf_mmap_data *
 perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
 {
@@ -2304,12 +2317,12 @@ perf_mmap_data_alloc(struct perf_event *event, int nr_pages)
 	if (!data)
 		goto fail;
 
-	data->user_page = (void *)get_zeroed_page(GFP_KERNEL);
+	data->user_page = perf_mmap_alloc_page(event->cpu);
 	if (!data->user_page)
 		goto fail_user_page;
 
 	for (i = 0; i < nr_pages; i++) {
-		data->data_pages[i] = (void *)get_zeroed_page(GFP_KERNEL);
+		data->data_pages[i] = perf_mmap_alloc_page(event->cpu);
 		if (!data->data_pages[i])
 			goto fail_data_pages;
 	}
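
For anyone reading along, the node selection in perf_mmap_alloc_page() can be sketched in plain userspace C. This is only an illustration, not kernel code: the sketch_* names and the fixed 8-CPU, 4-CPUs-per-node topology are made up here, and sketch_cpu_to_node() is a stand-in for the kernel's cpu_to_node(). The point is just that an event bound to a CPU gets its buffer pages from that CPU's node, while an event with cpu == -1 (not bound to a CPU) passes -1 through and leaves the node choice to the allocator.

```c
#include <assert.h>

/* Hypothetical topology, for illustration only: 8 CPUs, 4 per NUMA node. */
#define SKETCH_NR_CPUS       8
#define SKETCH_CPUS_PER_NODE 4
#define SKETCH_NO_NODE       (-1)

/*
 * Stand-in for the kernel's cpu_to_node(). The real mapping comes from
 * the machine's topology; this fixed layout is an assumption.
 */
static int sketch_cpu_to_node(int cpu)
{
	assert(cpu >= 0 && cpu < SKETCH_NR_CPUS);
	return cpu / SKETCH_CPUS_PER_NODE;
}

/*
 * Same decision as in the patch: an event with cpu == -1 is not bound
 * to a CPU, so -1 is passed through as "no particular node"; otherwise
 * the buffer pages should come from the node of the event's CPU.
 */
static int sketch_buffer_node(int event_cpu)
{
	return (event_cpu == -1) ? SKETCH_NO_NODE
				 : sketch_cpu_to_node(event_cpu);
}
```

With this made-up layout, an event on CPU 5 would get its pages from node 1, and an event with cpu == -1 would express no node preference.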