* Peter Zijlstra <pet...@infradead.org> [2018-06-04 21:28:21]:

> > 	if (time_after(jiffies, pgdat->numabalancing_migrate_next_window)) {
> > -		spin_lock(&pgdat->numabalancing_migrate_lock);
> > -		pgdat->numabalancing_migrate_nr_pages = 0;
> > -		pgdat->numabalancing_migrate_next_window = jiffies +
> > -			msecs_to_jiffies(migrate_interval_millisecs);
> > -		spin_unlock(&pgdat->numabalancing_migrate_lock);
> > +		if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
> > +			pgdat->numabalancing_migrate_next_window = jiffies +
> > +				msecs_to_jiffies(migrate_interval_millisecs);
>
> Note that both are in fact wrong. That wants to be something like:
>
> 	pgdat->numabalancing_migrate_next_window += interval;
>
> Otherwise you stretch every interval by 'jiffies -
> numabalancing_migrate_next_window'.

Okay, I get your point.

>
> Also, that all wants READ_ONCE/WRITE_ONCE, irrespective of the
> spinlock/xchg.
>
> I suppose the problem here is that PPC has a very nasty test-and-set
> spinlock with fwd progress issues while xchg maps to a fairly simple
> ll/sc that (hopefully) has some hardware fairness.
>
> And pgdata being a rather coarse data structure (per node?) there could
> be a lot of CPUs stomping on this here thing.
>
> So simpler not really, but better for PPC.
>

	unsigned long interval = READ_ONCE(pgdat->numabalancing_migrate_next_window);

	if (time_after(jiffies, interval)) {
		interval += msecs_to_jiffies(migrate_interval_millisecs);
		if (xchg(&pgdat->numabalancing_migrate_nr_pages, 0))
			WRITE_ONCE(pgdat->numabalancing_migrate_next_window, interval);
	}

Something like this?
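
To convince myself of the interval-stretching point, here is a rough userspace sketch (not kernel code). The names after() and simulate() are made up for illustration, 'now' stands in for jiffies, and the check is assumed to run every 'step' ticks:

```c
/* Hypothetical userspace model, not kernel code: 'now' stands in for jiffies. */

/* Wrap-safe "a is after b", the same idea as the kernel's time_after(). */
static int after(unsigned long a, unsigned long b)
{
	return (long)(b - a) < 0;
}

/*
 * Run the window check every 'step' ticks up to 'ticks' and return the
 * final value of next_window.
 *
 * rebase != 0: next_window = now + interval   (each window is stretched)
 * rebase == 0: next_window += interval        (windows keep a fixed length)
 */
static unsigned long simulate(unsigned long ticks, unsigned long step,
			      unsigned long interval, int rebase)
{
	unsigned long next_window = interval;
	unsigned long now;

	for (now = 0; now <= ticks; now += step) {
		if (after(now, next_window)) {
			if (rebase)
				next_window = now + interval;
			else
				next_window += interval;
		}
	}
	return next_window;
}
```

With a check every 7 ticks and a 100-tick interval over 1000 ticks, simulate(1000, 7, 100, 0) ends at 1000 while simulate(1000, 7, 100, 1) ends at 1045, so rebasing on the current time has pushed the window out by 45 ticks after nine rollovers.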