On Fri, Dec 4, 2009 at 10:40 PM, Thorsten von Eicken <[email protected]> wrote:
>>> For the first few hours of my load test, I have enough I/O. The problem
>>> is that Cassandra is spending too much I/O on reads and writes and too
>>> little on compactions to function well in the long term.
>>>
>>
>> If you don't have enough room for both, it doesn't matter how you
>> prioritize.
>>
>
> Mhhh, maybe... You're technically correct. The question here is whether
> cassandra degrades gracefully or not. If I understand correctly, there are
> two ways to look at it:
>
> 1) it's accepting a higher request load than it can actually process and
> builds up an increasing backlog that eventually brings performance down far
> below the level of performance that it could sustain, thus it fails to do
> the type of early admission control or back-pressure that keeps the request
> load close to the sustainable maximum performance.
>
> 2) the compaction backlog size is a primary variable that has to be exposed
> and monitored in any cassandra installation because it's a direct indicator
> for an overload situation, just like hitting 100% cpu or similar would be.
>
> I can buy that (2) is ok, but (1) is certainly nicer.

I agree that it's much nicer, in the sense that it makes it more obvious what the problem is (not enough capacity), but it only helps with diagnosis, not mitigation.
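
To make the distinction concrete, here is a rough toy model of the two behaviours discussed above. It is only a sketch under stated assumptions and is not how Cassandra works internally: the names AdmissionController, try_write, compact and simulate are hypothetical, and it assumes each accepted write adds one unit of compaction work while the node can compact a fixed number of units per tick.

```python
from dataclasses import dataclass


@dataclass
class AdmissionController:
    """Toy model: refuse writes once the compaction backlog hits a limit.

    All names here are hypothetical; this is not a Cassandra API.
    """
    max_backlog: int   # assumed threshold, in units of pending compaction work
    backlog: int = 0   # the metric from option (2): pending compaction work
    rejected: int = 0  # writes refused by back-pressure, as in option (1)

    def try_write(self) -> bool:
        """Accept a write only if the backlog is below the limit."""
        if self.backlog >= self.max_backlog:
            self.rejected += 1
            return False
        # Assumption: every accepted write creates one unit of compaction work.
        self.backlog += 1
        return True

    def compact(self, units: int) -> None:
        """Retire up to `units` of pending compaction work."""
        self.backlog = max(0, self.backlog - units)


def simulate(ticks, writes_per_tick, compactions_per_tick, max_backlog):
    """Offer a fixed write load per tick; return (accepted, rejected, backlog)."""
    ctl = AdmissionController(max_backlog)
    accepted = 0
    for _ in range(ticks):
        for _ in range(writes_per_tick):
            if ctl.try_write():
                accepted += 1
        ctl.compact(compactions_per_tick)
    return accepted, ctl.rejected, ctl.backlog


if __name__ == "__main__":
    # Offered load (5/tick) exceeds compaction capacity (2/tick).
    print("with limit:   ", simulate(10, 5, 2, max_backlog=8))
    print("without limit:", simulate(10, 5, 2, max_backlog=10**9))
```

With the limit in place the backlog stays bounded and the overload shows up as rejected writes. With an effectively unlimited backlog nothing is rejected and the backlog simply keeps growing. In both cases the sustainable rate is still only what compaction can keep up with, which is the point above: the limit makes the overload visible but does not add capacity.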
