Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-23 Thread Nikita Yushchenko
I currently don't have numbers for this patch taken alone. This patch originates from work done some years ago to reduce the cost of memory accounting, and an x86-only version of this patch has been in the virtuozzo/openvz kernel since then. Other patches from that work have been upstreamed, but this one was miss

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-18 Thread Dave Hansen
On 12/18/21 6:31 AM, Nikita Yushchenko wrote:
>>> This allows archs to optimize it, by
>>> freeing multiple tables in a single release_pages() call. This is
>>> faster than individual put_page() calls, especially with memcg
>>> accounting enabled.
>>
>> Could we quantify "faster"?  There's a non-tr

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-18 Thread Nikita Yushchenko
>> This allows archs to optimize it, by freeing multiple tables in a single release_pages() call. This is faster than individual put_page() calls, especially with memcg accounting enabled.
>
> Could we quantify "faster"? There's a non-trivial amount of code being added here and it would be nice to bac

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-18 Thread Nikita Yushchenko
17.12.2021 21:39, Sam Ravnborg wrote:
> Hi Nikita,
>
> How about adding the following to tlb.h:
>
> #ifndef __tlb_remove_tables
> static void __tlb_remove_tables(...)
> {
> }
> #endif
>
> And then the few archs that want to override __tlb_remove_tables need to do a
> #define __tlb_remove_tables __tlb_re

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-18 Thread Nikita Yushchenko
> Oh gawd, that's terrible. Never, ever duplicate code like that.

What the patch does is:
- formally shift the loop one level down in the call graph, adding instances of __tlb_remove_tables() exactly to locations where instances of __tlb_remove_table() already exist,
- on architectures where __tm

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-17 Thread Peter Zijlstra
On Fri, Dec 17, 2021 at 11:19:10AM +0300, Nikita Yushchenko wrote:
> When batched page table freeing via struct mmu_table_batch is used, the
> final freeing in __tlb_remove_table_free() executes a loop, calling
> arch hook __tlb_remove_table() to free each table individually.
>
> Shift that loop d

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-17 Thread Sam Ravnborg
Hi Nikita,

How about adding the following to tlb.h:

#ifndef __tlb_remove_tables
static void __tlb_remove_tables(...)
{
}
#endif

And then the few archs that want to override __tlb_remove_tables need to do a
#define __tlb_remove_tables __tlb_remove_tables
static void __tlb_remove_ta
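
The override pattern suggested above can be tried out in user space. What follows is only a minimal sketch, not code from the patch: the leading underscores are dropped because such names are reserved outside the kernel, the (void **tables, int nr) signature is an assumption (the message only shows "(...)"), and the ARCH_HAS_BATCHED_REMOVE switch and the two counters are hypothetical helpers added so the behaviour can be observed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical counters, only here to make the sketch observable. */
int single_calls;
int batch_calls;

/* Stand-in for the existing per-table arch hook. */
static void tlb_remove_table(void *table)
{
    (void)table;
    single_calls++;
}

/*
 * An arch that wants its own batched version defines the function and
 * a macro of the same name. ARCH_HAS_BATCHED_REMOVE is a hypothetical
 * build switch used only to select this branch in the demo.
 */
#ifdef ARCH_HAS_BATCHED_REMOVE
#define tlb_remove_tables tlb_remove_tables
static void tlb_remove_tables(void **tables, int nr)
{
    (void)tables;
    (void)nr;
    batch_calls++;          /* one call covers the whole array */
}
#endif

/* Generic fallback, compiled only if no arch override was defined. */
#ifndef tlb_remove_tables
static void tlb_remove_tables(void **tables, int nr)
{
    assert(nr >= 0);
    for (int i = 0; i < nr; i++)
        tlb_remove_table(tables[i]);
}
#endif
```

With the switch left undefined, a call to tlb_remove_tables() with three entries goes through the fallback and invokes the per-table hook three times. Building with -DARCH_HAS_BATCHED_REMOVE selects the override instead, without touching the generic code.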

Re: [PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-17 Thread Dave Hansen
On 12/17/21 12:19 AM, Nikita Yushchenko wrote:
> When batched page table freeing via struct mmu_table_batch is used, the
> final freeing in __tlb_remove_table_free() executes a loop, calling
> arch hook __tlb_remove_table() to free each table individually.
>
> Shift that loop down to archs. This a

[PATCH/RFC] mm: add and use batched version of __tlb_remove_table()

2021-12-17 Thread Nikita Yushchenko
When batched page table freeing via struct mmu_table_batch is used, the final freeing in __tlb_remove_table_free() executes a loop, calling arch hook __tlb_remove_table() to free each table individually.

Shift that loop down to archs. This allows archs to optimize it, by freeing multiple tables in
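
The difference between the two freeing strategies described above can be modelled in user space. This is a toy sketch under stated assumptions, not kernel code and not a measurement: struct table_batch is a simplified, hypothetical stand-in for struct mmu_table_batch, and put_one() and release_many() are hypothetical stand-ins for put_page() and release_pages() that only count how often the release path is entered.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_TABLES 8            /* hypothetical batch capacity */

/* Simplified, hypothetical stand-in for struct mmu_table_batch. */
struct table_batch {
    unsigned int nr;
    void *tables[MAX_TABLES];
};

/* Counts entries into the release path and tables actually freed. */
struct free_stats {
    unsigned int release_calls;
    unsigned int tables_freed;
};

/* Stand-in for a per-page release: one call per table. */
static void put_one(struct free_stats *st, void *table)
{
    (void)table;
    st->release_calls++;
    st->tables_freed++;
}

/* Stand-in for a batched release: one call for the whole array. */
static void release_many(struct free_stats *st, void **tables,
                         unsigned int nr)
{
    (void)tables;
    st->release_calls++;
    st->tables_freed += nr;
}

/* Loop in generic code, calling the per-table hook each time. */
static void free_batch_per_table(struct free_stats *st,
                                 struct table_batch *b)
{
    assert(b->nr <= MAX_TABLES);
    for (unsigned int i = 0; i < b->nr; i++)
        put_one(st, b->tables[i]);
    b->nr = 0;
}

/* Loop shifted down: the hook receives the whole batch at once. */
static void free_batch_batched(struct free_stats *st,
                               struct table_batch *b)
{
    assert(b->nr <= MAX_TABLES);
    release_many(st, b->tables, b->nr);
    b->nr = 0;
}
```

For a batch of five tables, the per-table variant enters the release path five times and the batched variant once, while both free the same five tables. The model says nothing about how much time this saves on real hardware.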