"Huang, Ying" <ying.hu...@intel.com> writes:

> Hi, Peter,
>
> "Huang, Ying" <ying.hu...@intel.com> writes:
>
>> Peter Zijlstra <pet...@infradead.org> writes:
>>
>>> On Sat, Aug 05, 2017 at 08:47:02AM +0800, Huang, Ying wrote:
>>>> Yes.  That looks good.  So will you prepare the final patch?  Or do
>>>> you want me to do that?
>>>
>>> I was hoping you'd do it ;-)
>>
>> Thanks!  Here is the updated patch.
>>
>> Best Regards,
>> Huang, Ying
>>
>> ---------->8----------
>> From 957735e9ff3922368286540dab852986fc7b23b5 Mon Sep 17 00:00:00 2001
>> From: Huang Ying <ying.hu...@intel.com>
>> Date: Mon, 7 Aug 2017 16:55:33 +0800
>> Subject: [PATCH -v3] IPI: Avoid using 2 cache lines for one
>>  call_single_data
>>
>> struct call_single_data is used in IPIs to transfer information
>> between CPUs.  Its size is bigger than sizeof(unsigned long) but
>> smaller than the cache line size.  It is currently allocated with no
>> explicit alignment requirement, so an allocated call_single_data may
>> cross two cache lines, doubling the number of cache lines that need
>> to be transferred among CPUs.
>>
>> This is resolved by requiring call_single_data to be aligned to the
>> size of call_single_data.  The size of call_single_data is currently
>> a power of 2.  If new fields are added to call_single_data, padding
>> may be needed to keep the size a power of 2.  Fortunately, this is
>> enforced by gcc, which reports an error for a non-power-of-2
>> alignment requirement.
>>
>> To set the alignment requirement of call_single_data to its size, a
>> struct definition and a typedef are used.
>>
>> To measure the effect of the patch, we used the vm-scalability
>> multi-threaded swap test case (swap-w-seq-mt).  The test creates
>> multiple threads, each of which eats memory until all RAM and part of
>> swap are used, so that a huge number of IPIs are triggered when
>> unmapping memory.  In the test, the throughput of memory writing
>> improves ~5% compared with misaligned call_single_data because of
>> faster IPIs.
>
> What do you think about this version?
>
Ping.

Best Regards,
Huang, Ying