> I'm wondering, why are inode_sb_list_add()/del() even called for a presumably
> reasonably well cached benchmark running on a system with enough RAM? Are these
> perhaps thousands of temporary files, already deleted, and released when all the
> file descriptors are closed as part of sys_exit()?
>
> If that's the case then I suspect an even bigger win would be not just to batch
> the (sb-)global list fiddling, but to potentially turn the sb list into a
> percpu_alloc() managed set of per CPU lists? It's a bigger change, but it could

We had such a patch in the lock elision patchkit. (It avoided a lot of cache
line bouncing, which was leading to aborts.)

https://git.kernel.org/cgit/linux/kernel/git/ak/linux-misc.git/commit/?h=hle315/combined&id=f1cf9e715a40f44086662ae3b29f123cf059cbf4

-Andi
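
For readers who want a picture of the per-CPU list idea discussed above, here is a rough userspace sketch. It is not the linked patch and not kernel code: pthread mutexes stand in for spinlocks, a fixed NSHARDS stands in for the CPU count, and every name in it (struct sharded_list, sharded_list_add() and so on) is hypothetical. Each node remembers which shard it was added on, so removal takes only that shard's lock instead of a single global one.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NSHARDS 4	/* assumption: stand-in for the number of CPUs */

struct node {
	struct node *prev, *next;
	int shard;		/* shard this node was added on */
};

struct shard {
	pthread_mutex_t lock;	/* protects only this shard's list */
	struct node head;	/* sentinel of a circular doubly linked list */
};

struct sharded_list {
	struct shard shards[NSHARDS];
};

static void sharded_list_init(struct sharded_list *sl)
{
	for (int i = 0; i < NSHARDS; i++) {
		struct shard *s = &sl->shards[i];

		pthread_mutex_init(&s->lock, NULL);
		s->head.prev = s->head.next = &s->head;
		s->head.shard = i;
	}
}

/* Add on the caller's "CPU"; only that shard's lock is taken. */
static void sharded_list_add(struct sharded_list *sl, struct node *n, int cpu)
{
	assert(cpu >= 0);
	n->shard = cpu % NSHARDS;

	struct shard *s = &sl->shards[n->shard];

	pthread_mutex_lock(&s->lock);
	n->next = s->head.next;
	n->prev = &s->head;
	s->head.next->prev = n;
	s->head.next = n;
	pthread_mutex_unlock(&s->lock);
}

/* Remove from whichever shard the node was added on. */
static void sharded_list_del(struct sharded_list *sl, struct node *n)
{
	struct shard *s = &sl->shards[n->shard];

	pthread_mutex_lock(&s->lock);
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = n;
	pthread_mutex_unlock(&s->lock);
}

/* Whole-list walks (the rare case) visit every shard in turn. */
static size_t sharded_list_count(struct sharded_list *sl)
{
	size_t total = 0;

	for (int i = 0; i < NSHARDS; i++) {
		struct shard *s = &sl->shards[i];

		pthread_mutex_lock(&s->lock);
		for (struct node *n = s->head.next; n != &s->head; n = n->next)
			total++;
		pthread_mutex_unlock(&s->lock);
	}
	return total;
}
```

The trade-off in this sketch is that add and delete touch only one shard's lock and cache lines, while anything that needs to see every entry has to iterate over all shards.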

