On Wed, Jul 08, 2015 at 10:19:19PM +1000, Tim Connors wrote:
>I mentioned at the meeting that sometimes memory has to be shunted between
>NUMA nodes via swapping out to disk because.... brokenness.

AFAIK that hasn't happened for years. that was the first numa migration
implementation that SGI did (IIRC the place I used to work paid for that
work as part of a supercomputer contract). numa migration has since been
refined to not use anything so crude. it all happens in ram now, as it
should.

cgroups have re-raised a whole bunch of these issues though, as they pretend
to be mini-machines using a subset of ram, and they don't yet have all the
sophistication of the real virtual memory system. they're getting there
though...

>Stewart's post just handily came up and reminded me about this, and came
>with Citations Needed[TM]:
>
>https://www.flamingspork.com/blog/2015/07/08/the-sad-state-of-mysql-and-numa/

in 2010 the default distro zone_reclaim_mode could still have been wrong, and
I guess it could cause spurious swapping. setting zone_reclaim_mode=0 fixes
it. that sounds like their problem to me.

zone_reclaim_mode can even cause numa-related deadlocks if it is != 0. BoM
almost kicked out their last supercomputer vendor before I told folks at BoM
to change that setting :)

the default is 0 for modern kernels.

cheers,
robin
_______________________________________________
luv-main mailing list
http://lists.luv.asn.au/listinfo/luv-main
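
for anyone wanting to check the zone_reclaim_mode setting mentioned above, here is a rough sketch in Python. it assumes a Linux box where the sysctl is exposed at /proc/sys/vm/zone_reclaim_mode (it may be missing on kernels built without NUMA support). the function names read_zone_reclaim_mode and describe are made up for this example, and the bit meanings are taken from the kernel's sysctl documentation as I understand it, so treat it as illustrative rather than definitive.

```python
from pathlib import Path

# Usual procfs location of the sysctl on Linux (assumed; absent without NUMA).
ZONE_RECLAIM_PATH = Path("/proc/sys/vm/zone_reclaim_mode")

# Bit meanings of vm.zone_reclaim_mode per the kernel sysctl documentation.
FLAGS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def read_zone_reclaim_mode(path=ZONE_RECLAIM_PATH):
    """Return the current value as an int, or None if it cannot be read."""
    try:
        return int(Path(path).read_text().strip())
    except (OSError, ValueError):
        return None


def describe(mode):
    """Turn the numeric mode into a short human-readable summary."""
    if mode is None:
        return "zone_reclaim_mode not available (non-NUMA kernel or not Linux?)"
    if mode == 0:
        return "zone_reclaim_mode=0: zone reclaim is off"
    active = [text for bit, text in FLAGS.items() if mode & bit]
    return f"zone_reclaim_mode={mode}: " + (", ".join(active) or "unknown bits")


if __name__ == "__main__":
    # To change it (as root): sysctl -w vm.zone_reclaim_mode=0
    print(describe(read_zone_reclaim_mode()))
```

if it reports anything other than 0 on a machine that is swapping for no obvious reason, that setting is the first thing I would look at.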
