On 21/04/13 05:29, David Gerard wrote:
> So where would I start looking to work out what's going on?

If there is any kind of site issue at WMF, I usually start with Ganglia. It does take some practice to be able to read it correctly, but it gives you information far more quickly than just about anything else. My notes on WMF incident response give some hints about how to use it, as well as discussing some other tools:

https://wikitech.wikimedia.org/wiki/Incident_response

If the problem seems to be downstream of MediaWiki, then profiling is usually the next thing to look at. Wikipedia has been using DIY profiling to diagnose site performance issues since it was on a single server.

> * Sometimes it isn't, e.g. this afternoon when the site was running
> like a slug and load average was 0.8 with nothing amiss in top.

Processes in the "S" (interruptible sleep) state do not contribute to the load average, whether or not users are waiting for them. For example, PHP may be waiting for Lucene. Try the section in the incident response notes under "slow backend service".

-- 
Tim Starling
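
To illustrate the point about process states and load average, here is a minimal Python sketch. It is not taken from the incident response notes. It assumes a Linux host, where the load average counts tasks in the "R" (runnable) and "D" (uninterruptible sleep) states, and where the state letter is the field following the parenthesised command name in /proc/<pid>/stat. The function names are made up for the example.

```python
import glob
from collections import Counter

# Assumption: Linux counts only runnable (R) and uninterruptible-sleep (D)
# tasks toward the load average; tasks in S (interruptible sleep) are ignored.
LOAD_STATES = {"R", "D"}


def parse_state(stat_line):
    """Return the one-letter state from the text of a /proc/<pid>/stat file."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so split on the last ')' before reading the state.
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def summarize(stat_lines):
    """Count processes per state and how many of them feed the load average."""
    counts = Counter(parse_state(line) for line in stat_lines)
    contributing = sum(n for state, n in counts.items() if state in LOAD_STATES)
    return counts, contributing


def read_proc_stats():
    """Read every /proc/<pid>/stat on the local machine (Linux only)."""
    lines = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                lines.append(f.read())
        except OSError:
            continue  # the process exited between listing and reading
    return lines
```

Calling summarize(read_proc_stats()) on a web server where most PHP workers are blocked on a backend would be expected to show a large "S" count and a small contributing count, which is consistent with a low load average on a site that feels slow.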
