On Wed, Jul 08, 2015 at 10:19:19PM +1000, Tim Connors wrote:
>I mentioned at the meeting that sometimes memory has to be shunted between
>NUMA nodes via swapping out to disk because.... brokenness.

AFAIK that hasn't happened for years. that was the first numa
migration implementation that SGI did (IIRC the place I used to work
paid for that work as part of a supercomputer contract).

numa migration has since been refined to not use anything so crude.
all happens in ram now, as it should.

cgroups have re-raised a whole bunch of these issues though, as they
pretend to be mini-machines using a subset of ram, and they don't yet
have all the sophistication of the real virtual memory system. they're
getting there though...

>Stewart's post just handily came up and reminded me about this, and came
>with Citations Needed[TM]:
>
>https://www.flamingspork.com/blog/2015/07/08/the-sad-state-of-mysql-and-numa/

in 2010 the default distro zone_reclaim_mode could still have been
wrong and I guess it could cause spurious swapping. setting it to
zone_reclaim_mode=0 fixes it. that sounds like their problem to me.

zone_reclaim_mode can even cause numa-related deadlocks if it is != 0.
BoM almost kicked out their last supercomputer vendor before I told
folks at BoM to change that setting :)

the default is 0 for modern kernels.
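
if you want to check what a given box is actually running with, here is a
rough sketch in python. it assumes the usual linux procfs path
/proc/sys/vm/zone_reclaim_mode, and the bit meanings are taken from the
kernel's vm sysctl documentation. the function names are just made up for
illustration, not part of any tool.

```python
from pathlib import Path

# Bit meanings as described in the kernel's vm sysctl documentation.
ZONE_RECLAIM_BITS = {
    1: "zone reclaim on",
    2: "zone reclaim writes dirty pages out",
    4: "zone reclaim swaps pages",
}


def describe_zone_reclaim_mode(value):
    """Return the behaviours enabled by a zone_reclaim_mode bitmask."""
    return [name for bit, name in ZONE_RECLAIM_BITS.items() if value & bit]


def read_zone_reclaim_mode(path="/proc/sys/vm/zone_reclaim_mode"):
    """Read the current setting; return None if the file does not exist."""
    p = Path(path)
    if not p.exists():
        return None
    return int(p.read_text().strip())


if __name__ == "__main__":
    mode = read_zone_reclaim_mode()
    if mode is None:
        print("zone_reclaim_mode not available on this system")
    elif mode == 0:
        print("zone_reclaim_mode=0 (zone reclaim off)")
    else:
        enabled = ", ".join(describe_zone_reclaim_mode(mode))
        print(f"zone_reclaim_mode={mode}: {enabled}")
```

if it comes back non-zero, running `sysctl -w vm.zone_reclaim_mode=0` as root
changes it on the running system.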

cheers,
robin
_______________________________________________
luv-main mailing list
[email protected]
http://lists.luv.asn.au/listinfo/luv-main
