On Thu, Sep 13, 2012 at 2:28 PM, Neil Yalowitz <[email protected]> wrote:
> This is a great answer, I can see that particular ganglia metric sharply
> increased when the issue began. Thanks much.

Nice!

>
> One followup question:
>
> Can a distressed slave cluster cause performance issues on the master
> cluster? It appears our performance problem was occurring on the slave
> peer, but the master cluster almost crashed as well. I'm trying to
> determine if that was a coincidence or something more...

That's a tougher one, but FWIW the work required on the master cluster
is low compared to what the slave has to do; the master just needs to
read a bunch of edits and send them, whereas the slave has to write them
to the WAL, add them to the MemStore, eventually flush and compact, etc.

Also, if you had a big MR job running on the master that inserted a lot
of data, I would assume it made everything slower. If that is also what
caused the swapping, it would explain a lot.

J-D
