RE: [v5 PATCH 1/2] mm: swap: check if swap backing device is congested or not

2019-01-10 Thread Chen, Tim C
> > > > +	if (si->flags & (SWP_BLKDEV | SWP_FS)) {
> >
> > I re-read your discussion with Tim and I must say the reasoning
> > behind this test remains foggy.

I was worried that the dereference

	inode = si->swap_file->f_mapping->host;

is not always safe for corner cases.  So the test makes sure that…
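For context, a minimal sketch of the kind of guarded dereference under
discussion.  The helper name swap_backing_dev_congested() is invented for
illustration, and inode_read_congested() is assumed to be available as it
was in kernels of that era; this is not the exact hunk from the patch.

	#include <linux/swap.h>
	#include <linux/fs.h>
	#include <linux/backing-dev.h>

	/* Illustrative sketch only -- not the patch under review. */
	static bool swap_backing_dev_congested(struct swap_info_struct *si)
	{
		struct inode *inode;

		/*
		 * Only block-device or filesystem backed swap is guaranteed
		 * to have a usable swap_file->f_mapping->host chain, hence
		 * the SWP_BLKDEV | SWP_FS guard before the dereference.
		 */
		if (!(si->flags & (SWP_BLKDEV | SWP_FS)))
			return false;

		inode = si->swap_file->f_mapping->host;
		return inode_read_congested(inode);
	}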

RE: [Update][PATCH v5 7/9] mm/swap: Add cache for swap slots allocation

2017-01-17 Thread Chen, Tim C
> > The cache->slots_ret is protected by cache->free_lock and
> > cache->slots is protected by cache->free_lock.

Typo.  cache->slots is protected by cache->alloc_lock.

Tim
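For reference, a sketch of the per-CPU cache structure and the lock/field
pairing being discussed (reconstructed from memory of the series'
include/linux/swap_slots.h, so treat the exact field layout as
illustrative rather than authoritative):

	struct swap_slots_cache {
		bool		lock_initialized;
		struct mutex	alloc_lock;	/* protects slots, nr, cur     */
		swp_entry_t	*slots;		/* slots cached for allocation */
		int		nr;		/* number of cached slots      */
		int		cur;		/* next slot to hand out       */
		spinlock_t	free_lock;	/* protects slots_ret, n_ret   */
		swp_entry_t	*slots_ret;	/* slots returned for freeing  */
		int		n_ret;		/* number of returned slots    */
	};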

RE: [Update][PATCH v5 7/9] mm/swap: Add cache for swap slots allocation

2017-01-17 Thread Chen, Tim C
> > +	/*
> > +	 * Preemption needs to be turned on here, because we may sleep
> > +	 * in refill_swap_slots_cache().  But it is safe, because
> > +	 * accesses to the per-CPU data structure are protected by a
> > +	 * mutex.
> > +	 */
>
> the comment doesn't really explain why it is safe…
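To illustrate the point, a simplified sketch of the allocation path (not a
verbatim excerpt of mm/swap_slots.c; the per-CPU variable swp_slots and
refill_swap_slots_cache() are the series' names, the simplification is
mine): the per-CPU pointer is fetched with preemption enabled, and
correctness does not depend on staying on that CPU because every access to
the cache is serialized by cache->alloc_lock.

	/* Simplified sketch, not a verbatim excerpt of get_swap_page(). */
	static int get_cached_swap_slot(swp_entry_t *entry)
	{
		struct swap_slots_cache *cache;

		/*
		 * raw_cpu_ptr() is used with preemption enabled: even if
		 * the task migrates afterwards, it keeps operating on the
		 * cache it grabbed, and alloc_lock serializes it against
		 * any other task touching that same cache.
		 */
		cache = raw_cpu_ptr(&swp_slots);

		mutex_lock(&cache->alloc_lock);	/* may sleep: preemption is on */
		if (!cache->nr && !refill_swap_slots_cache(cache)) {
			mutex_unlock(&cache->alloc_lock);
			return 0;	/* cache empty, caller falls back */
		}
		*entry = cache->slots[cache->cur];
		cache->slots[cache->cur++].val = 0;
		cache->nr--;
		mutex_unlock(&cache->alloc_lock);
		return 1;
	}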

RE: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-22 Thread Chen, Tim C
> So this is impossible without THP swapin.  While 2M swapout makes a lot
> of sense, I doubt 2M swapin is really useful.  What kind of application
> is 'optimized' to do sequential memory access?

We waste a lot of cpu cycles to re-compact 4K pages back to a large page
under THP.  Swapping it back…

RE: [PATCH -v3 00/10] THP swap: Delay splitting THP during swapping out

2016-09-13 Thread Chen, Tim C
> > - Avoid CPU time for splitting, collapsing THP across swap out/in.
>
> Yes, if you want, please give us how bad it is.

It could be pretty bad.  In an experiment with THP turned on and the
system entering swap, 50% of the cpu time was spent in the page
compaction path.  So if we could deal with units of large…

RE: performance delta after VFS i_mutex=>i_rwsem conversion

2016-06-09 Thread Chen, Tim C
> > Ok, these enhancements are now in the locking tree and are queued up
> > for v4.8:
> >
> >   git pull git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git locking/core
> >
> > Dave, you might want to check your numbers with these changes: is
> > rwsem performance still significantly worse than…

RE: Regression with SLUB on Netperf and Volanomark

2007-05-03 Thread Chen, Tim C
Christoph Lameter wrote:
> Try to boot with
>
>	slub_max_order=4 slub_min_objects=8
>
> If that does not help increase slub_min_objects to 16.

We are still seeing a 5% regression on TCP streaming with
slub_min_objects set at 16, and a 10% regression for Volanomark, after
increasing slub_min_objects…

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2.2.lock_stat.patch

2007-01-03 Thread Chen, Tim C
Bill Huey (hui) wrote:
> This should have the fix.
>
>	http://mmlinux.sf.net/public/patch-2.6.20-rc2-rt2.3.lock_stat.patch
>
> If you can rerun it and post the results, it'll hopefully show the
> behavior of that lock acquisition better.

Here's the run with the fix to produce correct statistics.

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2.2.lock_stat.patch

2007-01-03 Thread Chen, Tim C
Bill Huey (hui) wrote:
>
> Thanks, the numbers look a bit weird in that the first column should
> have a bigger number of events than the second column, since it is a
> special-case subset.  Looking at the lock_stat_note() code should show
> that to be the case.  Did you make a change to the output…

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2.2.lock_stat.patch

2007-01-03 Thread Chen, Tim C
Bill Huey (hui) wrote:
> Can you sort the output ("sort -n" or whatever) and post it without
> the zeroed entries?
>
> I'm curious about how that statistical spike compares to the rest of
> the system activity.  I'm sure that'll get the attention of Peter as
> well and maybe he'll do something about…

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2.2.lock_stat.patch

2007-01-03 Thread Chen, Tim C
Bill Huey (hui) wrote:
>
> Good to know that.  What did the output reveal?
>
> What's your intended use again, summarized?  futex contention?  I'll
> read the first posting again.

Earlier I used latency_trace and figured that there was read contention
on mm->mmap_sem during calls to _rt_down_read…

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2.2.lock_stat.patch

2007-01-03 Thread Chen, Tim C
Bill Huey (hui) wrote:
>
> Patch here:
>
>	http://mmlinux.sourceforge.net/public/patch-2.6.20-rc2-rt2.2.lock_stat.patch
>
> bill

This version is much better and ran stably.  If I'm reading the output
correctly, the locks are listed by their initialization point (function,
file and line #)…

RE: [PATCH] lock stat for -rt 2.6.20-rc2-rt2 [was Re: 2.6.19-rt14 slowdown compared to 2.6.19]

2007-01-02 Thread Chen, Tim C
Bill Huey (hui) wrote:
> On Tue, Dec 26, 2006 at 04:51:21PM -0800, Chen, Tim C wrote:
> > Ingo Molnar wrote:
> > > If you'd like to profile this yourself then the lowest-cost way of
> > > profiling lock contention on -rt is to use the yum kernel and run
> > > the…

RE: 2.6.19-rt14 slowdown compared to 2.6.19

2007-01-02 Thread Chen, Tim C
Ingo Molnar wrote:
>
> (could you send me the whole trace if you still have it?  It would be
> interesting to see a broader snippet from the life of individual java
> threads.)
>
>	Ingo

Sure, I'll send it to you separately due to the size of the complete
trace.

Tim

RE: 2.6.19-rt14 slowdown compared to 2.6.19

2006-12-29 Thread Chen, Tim C
Ingo Molnar wrote:
>
> If you'd like to profile this yourself then the lowest-cost way of
> profiling lock contention on -rt is to use the yum kernel and run the
> attached trace-it-lock-prof.c code on the box while your workload is
> in 'steady state' (and is showing those extended idle times):
> …

RE: 2.6.19-rt14 slowdown compared to 2.6.19

2006-12-26 Thread Chen, Tim C
Ingo Molnar wrote:
>
> cool - thanks for the feedback!  Running the 64-bit kernel, right?

Yes, the 64-bit kernel was used.

> while some slowdown is to be expected, did in each case idle time
> increase significantly?

Volanomark and Re-Aim7 ran close to 0% idle time for the 2.6.19 kernel.
Idle time…

2.6.19-rt14 slowdown compared to 2.6.19

2006-12-22 Thread Chen, Tim C
Ingo,

We did some benchmarking on 2.6.19-rt14, compared it with the 2.6.19
kernel, and noticed several slowdowns.  The test machine is a 2-socket
Woodcrest machine with your default configuration.

Netperf TCP streaming was slower by 40% (1 server and 1 client, each
bound to separate cpu cores on different…