On Mon, Feb 13, 2012 at 7:21 PM, Peter Schuller <peter.schul...@infidyne.com> wrote:

> > I actually have the opposite 'problem'. I have a pair of servers that have
> > been static since mid last week, but have seen performance vary
> > significantly (x10) for exactly the same query. I hypothesised it was
> > various caches, so I shut down Cassandra, flushed the O/S buffer cache and
> > then brought it back up. The performance wasn't significantly different to
> > the pre-flush performance.
>
> I don't get this thread at all :)
>
> Why would restarting with clean caches be expected to *improve*
> performance?


I was expecting it to reduce performance, since the key cache and O/S
buffer cache had been cleared, but performance stayed roughly the same.


> And why is key cache loading involved other than to delay
> start-up and hopefully pre-populating caches for better (not worse)
> performance?
>
> If you want to figure out why queries seem to be slow relative to
> normal, you'll need to monitor the behavior of the nodes. Look at disk
> I/O statistics primarily (everyone reading this running Cassandra who
> isn't intimately familiar with "iostat -x -k 1" should go and read up
> on it right away; make sure you understand the utilization and avg
> queue size columns), CPU usage, whether compaction is happening, etc.
>

Yep - I've been looking at these - I don't see anything in iostat/dstat etc
that points strongly to a problem. There is quite a bit of I/O load, but it
looks roughly uniform on slow and fast instances of the queries. The last
compaction ran 4 days ago, which was before I started seeing variable
performance.
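
For anyone following along, the utilization and average queue size figures
that iostat reports can be approximated by diffing two snapshots of
/proc/diskstats. The following is a minimal Python sketch, assuming a Linux
host and the standard /proc/diskstats field layout; the function names
(parse_diskstats, util_and_queue, sample) are made up for illustration and
are not part of any tool mentioned in this thread.

```python
import time


def parse_diskstats(text):
    """Return {device: (io_ticks_ms, weighted_io_ms)} from /proc/diskstats text.

    After splitting a line on whitespace, index 2 is the device name,
    index 12 is the time spent doing I/O (ms) and index 13 is the
    weighted time spent doing I/O (ms).
    """
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 14:
            continue  # skip blank or unexpectedly short lines
        stats[parts[2]] = (int(parts[12]), int(parts[13]))
    return stats


def util_and_queue(before, after, interval_ms):
    """Return {device: (util_percent, avg_queue_size)} between two snapshots."""
    result = {}
    for dev, (ticks1, weighted1) in after.items():
        if dev not in before:
            continue
        ticks0, weighted0 = before[dev]
        util_pct = 100.0 * (ticks1 - ticks0) / interval_ms
        avg_queue = (weighted1 - weighted0) / interval_ms
        result[dev] = (util_pct, avg_queue)
    return result


def sample(interval_s=1.0, path="/proc/diskstats"):
    """Take two snapshots interval_s apart and return per-device figures."""
    with open(path) as f:
        before = parse_diskstats(f.read())
    start = time.monotonic()
    time.sleep(interval_s)
    with open(path) as f:
        after = parse_diskstats(f.read())
    elapsed_ms = (time.monotonic() - start) * 1000.0
    return util_and_queue(before, after, elapsed_ms)
```

Calling sample() in a loop while a slow and a fast run of the same query
are in flight, and logging the result with a timestamp, gives something
that can be lined up against query latency afterwards.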



> One easy way to see sudden bursts of poor behavior is to be heavily
> reliant on cache, and then have sudden decreases in performance due to
> compaction evicting data from page cache while also generating more
> I/O.
>

Unlikely to be a cache issue: in one case an immediate second run of
exactly the same query performed significantly worse.


>
> But that's total speculation. It is also the case that you cannot
> expect consistent performance on EC2 and that might be it.
>

Variable performance from EC2 is my lead theory at the moment.


>
> But my #1 advice: Log into the node while it is being slow, and
> observe. Figure out what the bottleneck is. iostat, top, nodetool
> tpstats, nodetool netstats, nodetool compactionstats.
>

I know why it is slow - it's clearly I/O bound. I am trying to hunt down why
it is sometimes much faster even though I have tried to replicate the
same conditions.
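
One way to put numbers on that variation is to run the identical query
several times back to back and record wall-clock time alongside latency, so
each run can later be matched against iostat output. This is only a rough
Python sketch: run_query stands in for whatever client call issues the
query (it is a hypothetical callable, not a Cassandra API), and time_runs
and summarize are illustrative names.

```python
import statistics
import time


def time_runs(run_query, runs=10, pause_s=0.0):
    """Call run_query() repeatedly; return [(wall_clock_ts, elapsed_s), ...]."""
    samples = []
    for _ in range(runs):
        wall = time.time()            # timestamp for matching against iostat logs
        start = time.perf_counter()   # monotonic clock for the latency itself
        run_query()
        samples.append((wall, time.perf_counter() - start))
        if pause_s:
            time.sleep(pause_s)
    return samples


def summarize(samples):
    """Return min/median/max latency and the slowest-to-fastest ratio."""
    latencies = [elapsed for _, elapsed in samples]
    fastest, slowest = min(latencies), max(latencies)
    return {
        "min_s": fastest,
        "median_s": statistics.median(latencies),
        "max_s": slowest,
        "spread": slowest / fastest if fastest > 0 else float("inf"),
    }
```

A spread close to 10 would correspond to the x10 variation described
earlier in the thread; the timestamps show whether the slow runs cluster in
time, which would be consistent with a noisy neighbour rather than anything
in the query path.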


>
> --
> / Peter Schuller (@scode, http://worldmodscode.wordpress.com)
>



-- 

*Franc Carter* | Systems architect | Sirca Ltd
 <marc.zianideferra...@sirca.org.au>

franc.car...@sirca.org.au | www.sirca.org.au

Tel: +61 2 9236 9118

Level 9, 80 Clarence St, Sydney NSW 2000

PO Box H58, Australia Square, Sydney NSW 1215
