Re: keycache persisted to disk ?

2012-02-13 Thread R. Verlangen
This is because of the warm-up of Cassandra as it starts. On startup it
will begin fetching the rows that were cached: these have to be loaded
from disk, as there is nothing in the cache yet. You can read more
about this at http://wiki.apache.org/cassandra/LargeDataSetConsiderations

2012/2/13 Franc Carter franc.car...@sirca.org.au

 On Mon, Feb 13, 2012 at 5:03 PM, zhangcheng zhangch...@jike.com wrote:

 I think the key caches and row caches are both persisted to disk on
 shutdown, and restored from disk on restart, which improves performance.


 Thanks - that would explain at least some of what I am seeing

 cheers



 2012-02-13
 --
  zhangcheng
 --
 *From:* Franc Carter
 *Sent:* 2012-02-13 13:53:56
 *To:* user
 *Cc:*
 *Subject:* keycache persisted to disk ?

 Hi,

 I am testing Cassandra on Amazon and finding performance can vary fairly
 wildly. I'm leaning towards it being an artifact of the AWS I/O system but
 have one other possibility.

 Are keycaches persisted to disk and restored on a clean shutdown and
 restart ?

 cheers

 --

 *Franc Carter* | Systems architect | Sirca Ltd
 marc.zianideferra...@sirca.org.au

 franc.car...@sirca.org.au | www.sirca.org.au

 Tel: +61 2 9236 9118

 Level 9, 80 Clarence St, Sydney NSW 2000

 PO Box H58, Australia Square, Sydney NSW 1215








Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
2012/2/13 R. Verlangen ro...@us2.nl

 This is because of the warm-up of Cassandra as it starts. On startup it
 will begin fetching the rows that were cached: these have to be loaded
 from disk, as there is nothing in the cache yet. You can read more
 about this at  http://wiki.apache.org/cassandra/LargeDataSetConsiderations



I actually have the opposite 'problem'. I have a pair of servers that have
been static since mid last week, but have seen performance vary
significantly (x10) for exactly the same query. I hypothesised it was
various caches, so I shut down Cassandra, flushed the O/S buffer cache and
then brought it back up. The performance wasn't significantly different from
the pre-flush performance

cheers








Re: keycache persisted to disk ?

2012-02-13 Thread R. Verlangen
I also noticed that: Cassandra appears to perform better under a continuous
load.

Are you sure the rows you're querying are actually in the cache?

2012/2/13 Franc Carter franc.car...@sirca.org.au


 I actually have the opposite 'problem'. I have a pair of servers that have
 been static since mid last week, but have seen performance vary
 significantly (x10) for exactly the same query. I hypothesised it was
 various caches, so I shut down Cassandra, flushed the O/S buffer cache and
 then brought it back up. The performance wasn't significantly different from
 the pre-flush performance

 cheers








Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 I actually have the opposite 'problem'. I have a pair of servers that have
 been static since mid last week, but have seen performance vary
 significantly (x10) for exactly the same query. I hypothesised it was
 various caches, so I shut down Cassandra, flushed the O/S buffer cache and
 then brought it back up. The performance wasn't significantly different from
 the pre-flush performance

I don't get this thread at all :)

Why would restarting with clean caches be expected to *improve*
performance? And why is key cache loading involved other than to delay
start-up and hopefully pre-populating caches for better (not worse)
performance?

If you want to figure out why queries seem to be slow relative to
normal, you'll need to monitor the behavior of the nodes. Look at disk
I/O statistics primarily (everyone reading this who runs Cassandra and
isn't intimately familiar with "iostat -x -k 1" should go and read up
on it right away; make sure you understand the utilization and average
queue size columns), CPU usage, whether compaction is happening, etc.

One easy way to see sudden bursts of poor behavior is to be heavily
reliant on cache, and then have sudden decreases in performance due to
compaction evicting data from page cache while also generating more
I/O.

But that's total speculation. It is also the case that you cannot
expect consistent performance on EC2 and that might be it.

But my #1 advise: Log into the node while it is being slow, and
observe. Figure out what the bottleneck is. iostat, top, nodetool
tpstats, nodetool netstats, nodetool compactionstats.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 7:21 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  I actually have the opposite 'problem'. I have a pair of servers that have
  been static since mid last week, but have seen performance vary
  significantly (x10) for exactly the same query. I hypothesised it was
  various caches, so I shut down Cassandra, flushed the O/S buffer cache and
  then brought it back up. The performance wasn't significantly different from
  the pre-flush performance

 I don't get this thread at all :)

 Why would restarting with clean caches be expected to *improve*
 performance?


I was expecting it to reduce performance, due to the cleared key cache and
O/S buffer cache - performance stayed roughly the same.


 And why is key cache loading involved other than to delay
 start-up and hopefully pre-populating caches for better (not worse)
 performance?

 If you want to figure out why queries seem to be slow relative to
 normal, you'll need to monitor the behavior of the nodes. Look at disk
 I/O statistics primarily (everyone reading this running Cassandra who
 aren't intimately familiar with iostat -x -k 1 should go and read up
 on it right away; make sure you understand the utilization and avg
 queue size columns), CPU usage, weather compaction is happening, etc.


Yep - I've been looking at these - I don't see anything in iostat/dstat etc.
that points strongly to a problem. There is quite a bit of I/O load, but it
looks roughly uniform across slow and fast instances of the queries. The last
compaction ran 4 days ago - which was before I started seeing variable
performance



 One easy way to see sudden bursts of poor behavior is to be heavily
 reliant on cache, and then have sudden decreases in performance due to
 compaction evicting data from page cache while also generating more
 I/O.


Unlikely to be a cache issue - in one case an immediate second run of
exactly the same query performed significantly worse.



 But that's total speculation. It is also the case that you cannot
 expect consistent performance on EC2 and that might be it.


Variable performance from EC2 is my lead theory at the moment.



 But my #1 advise: Log into the node while it is being slow, and
 observe. Figure out what the bottleneck is. iostat, top, nodetool
 tpstats, nodetool netstats, nodetool compactionstats.


I know why it is slow - it's clearly I/O bound. I am trying to hunt down why
it is sometimes much faster even though I have (tried) to replicate the
same conditions





Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
2012/2/13 R. Verlangen ro...@us2.nl

 I also noticed that: Cassandra appears to perform better under a continuous
 load.

 Are you sure the rows you're querying are actually in the cache?


I'm making an assumption . . . I don't yet know enough about Cassandra to
prove they are in the cache. I have my key cache set to 2 million, and am
only querying ~900,000 keys. So after the first time I'm assuming they are
in the cache.

cheers






Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 Yep - I've been looking at these - I don't see anything in iostat/dstat etc.
 that points strongly to a problem. There is quite a bit of I/O load, but it
 looks roughly uniform across slow and fast instances of the queries. The last
 compaction ran 4 days ago - which was before I started seeing variable
 performance

[snip]

 I know why it is slow - it's clearly I/O bound. I am trying to hunt down why
 it is sometimes much faster even though I have (tried) to replicate the
 same conditions

What does "clearly I/O bound" mean, and what is "quite a bit of I/O
load"? In general, if you have queries that come in at some rate that
is determined by outside sources (rather than by the time the last
query took to execute), you will typically either get more queries
than your cluster can take, or fewer. If fewer, there is a
non-trivially sized grey area where the overall I/O throughput needed is
lower than that available, but the closer you are to capacity the more
often requests have to wait for other I/O to complete, for purely
statistical reasons.

If you're running close to maximum capacity, it would be expected that
the variation in query latency is high.

That said, if you're seeing consistently bad latencies for a while
where you sometimes see consistently good latencies, that sounds
different but would hopefully be observable somehow.
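The statistical effect above can be sketched with textbook M/M/1 queueing arithmetic: mean response time is 1/(mu - lambda), so latency grows non-linearly as you approach capacity. The service rate below is an assumed illustration, not a measurement from this cluster:

```python
# Sketch: M/M/1 mean response time as utilization rises.
# The 200 reads/s service rate is assumed, not measured on this cluster.

def mean_response_time_ms(service_rate_per_s, arrival_rate_per_s):
    """M/M/1 mean response time W = 1 / (mu - lambda), in milliseconds."""
    assert arrival_rate_per_s < service_rate_per_s, "unstable at/over capacity"
    return 1000.0 / (service_rate_per_s - arrival_rate_per_s)

service_rate = 200.0  # assumed: disk serves ~200 reads/s
for utilization in (0.5, 0.8, 0.9, 0.95, 0.99):
    w = mean_response_time_ms(service_rate, utilization * service_rate)
    print(f"{utilization:.0%} utilized -> {w:.1f} ms mean response time")
```

Note how latency roughly doubles from 90% to 95% utilization and quintuples again by 99% - that is the "grey area" in numbers.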

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 I'm making an assumption . . . I don't yet know enough about Cassandra to
 prove they are in the cache. I have my key cache set to 2 million, and am
 only querying ~900,000 keys. So after the first time I'm assuming they are
 in the cache.

Note that the key cache only caches the index positions in the data
file, and not the actual data. The key cache will only ever eliminate
the I/O that would have been required to lookup the index entry; it
doesn't help to eliminate seeking to get the data (but as usual, it
may still be in the operating system page cache).
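A toy model of that distinction (purely illustrative - this is not Cassandra's actual implementation): the key cache maps a row key to its position in the data file, so a hit skips the index lookup, but the data read at that offset still happens either way:

```python
# Toy model of a key cache: key -> data-file offset. A cache hit skips the
# index lookup I/O, but the data itself must still be read.
# Illustrative only; not Cassandra's real internals.

index_lookups = 0   # simulated index-file reads
data_reads = 0      # simulated data-file reads

index = {"row1": 0, "row2": 4096, "row3": 8192}  # fake index: key -> offset
key_cache = {}

def read_row(key):
    global index_lookups, data_reads
    if key in key_cache:
        offset = key_cache[key]   # key-cache hit: no index I/O
    else:
        index_lookups += 1        # miss: pay the index lookup
        offset = index[key]
        key_cache[key] = offset
    data_reads += 1               # the data read happens either way
    return offset

read_row("row1")
read_row("row1")
print(index_lookups, data_reads)  # 1 index lookup, 2 data reads
```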

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
For one thing, what does ReadStage's pending column look like if you
repeatedly run "nodetool tpstats" on these nodes? If you're simply
bottlenecking on I/O on reads, that is the easiest and most direct way to
observe this empirically. If you're saturated, you'll see active close
to maximum at all times, and pending racking up consistently. If
you're just close, you'll likely see spikes sometimes.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 7:49 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  I'm making an assumption . . . I don't yet know enough about Cassandra to
  prove they are in the cache. I have my key cache set to 2 million, and am
  only querying ~900,000 keys. So after the first time I'm assuming they are
  in the cache.

 Note that the key cache only caches the index positions in the data
 file, and not the actual data. The key cache will only ever eliminate
 the I/O that would have been required to lookup the index entry; it
 doesn't help to eliminate seeking to get the data (but as usual, it
 may still be in the operating system page cache).


Yep - I haven't enabled row caches; my calculations at the moment indicate
that the hit ratio won't be great - but I'll be testing that later





Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
What is your total data size (nodetool info/nodetool ring) per node,
your heap size, and the amount of memory on the system?


-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 7:48 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  Yep - I've been looking at these - I don't see anything in iostat/dstat etc.
  that points strongly to a problem. There is quite a bit of I/O load, but it
  looks roughly uniform across slow and fast instances of the queries. The last
  compaction ran 4 days ago - which was before I started seeing variable
  performance

 [snip]

  I know why it is slow - it's clearly I/O bound. I am trying to hunt down
  why it is sometimes much faster even though I have (tried) to replicate the
  same conditions

 What does clearly I/O bound mean, and what is quite a bit of I/O
 load?


The servers spend 50% of the time in io-wait.


 In general, if you have queries that come in at some rate that
 is determined by outside sources (rather than by the time the last
 query took to execute),


That's an interesting approach - is that likely to give close to optimal
performance ?


 you will typically either get more queries
 than your cluster can take, or fewer. If fewer, there is a
 non-trivially sized grey area where overall I/O throughput needed is
 lower than that available, but the closer you are to capacity the more
 often requests have to wait for other I/O to complete, for purely
 statistical reasons.

 If you're running close to maximum capacity, it would be expected that
 the variation in query latency is high.


That may well explain it - I'll have to think about what that means for our
use case, as load will be extremely bursty





Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 7:51 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

 For one thing, what does ReadStage's pending look like if you
 repeatedly run nodetool tpstats on these nodes? If you're simply
 bottlenecking on I/O on reads, that is the most easy and direct way to
 observe this empirically. If you're saturated, you'll see active close
 to maximum at all times, and pending racking up consistently. If
 you're just close, you'll likely see spikes sometimes.


Yep, the ReadStage is backlogging consistently - but the thing I am trying
to explain is why it is good sometimes in an environment that is pretty well
controlled - other than being on EC2







Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 the servers spending 50% of the time in io-wait

Note that I/O wait is not necessarily a good indicator, depending on
the situation. In particular, if you have multiple drives, I/O wait can
mostly be ignored. Similarly, if you have non-trivial CPU usage in
addition to disk I/O, it is also not a good indicator. I/O wait is
essentially giving you the amount of time CPUs spend doing nothing
because the only processes that would otherwise be runnable are
waiting on disk I/O. But even a single process waiting on disk I/O can
produce lots of I/O wait, even if you have 24 drives.

The per-disk % utilization is generally a much better indicator
(assuming no hardware raid device, and assuming no SSD), along with
the average queue size.
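Those two columns can be sketched from per-interval counters (a rough model of how "iostat -x" derives them, with assumed sample values rather than measurements from this cluster): average queue size follows Little's law, and utilization is the share of the interval the device was busy.

```python
# Rough model of two iostat -x columns. Inputs are assumed sample values.

def avg_queue_size(iops, await_ms):
    """Average in-flight requests via Little's law: L = lambda * W."""
    return iops * (await_ms / 1000.0)

def utilization_pct(busy_ms, interval_ms):
    """Share of the interval the device had at least one request in flight."""
    return 100.0 * busy_ms / interval_ms

print(avg_queue_size(150, 40.0))       # 150 req/s, 40 ms each -> queue of 6.0
print(utilization_pct(950.0, 1000.0))  # busy 950 ms of a 1 s interval -> 95.0
```

A queue size persistently above 1 per spindle, or utilization pinned near 100%, is the saturation signal the thread is discussing.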

 In general, if you have queries that come in at some rate that
 is determined by outside sources (rather than by the time the last
 query took to execute),

 That's an interesting approach - is that likely to give close to optimal
 performance ?

I just mean that it all depends on the situation. If you have, for
example, some N number of clients that are doing work as fast as they
can, bottlenecking only on Cassandra, you're essentially saturating
the Cassandra cluster no matter what (until the client/network becomes
a bottleneck). Under such conditions (saturation) you generally never
should expect good latencies.

For most non-batch-job production use cases, you tend to have incoming
requests driven by something external, such as user behavior or
automated systems not related to the Cassandra cluster. In these cases,
you tend to have a certain amount of incoming requests at any given
time that you must serve within a reasonable time frame, and that's
where the question comes in of how much I/O you're doing in relation
to maximum. For good latencies, you always want to be significantly
below maximum - particularly when platter-based disk I/O is involved.

 That may well explain it - I'll have to think about what that means for our
 use case as load will be extremely bursty

To be clear though, even your typical un-bursty load is still bursty
once you look at it at sufficient resolution, unless you have
something specifically ensuring that it is entirely smooth. A
completely random distribution over time, for example, would look very
even on almost any graph you can imagine unless you have sub-second
resolution, but would still exhibit unevenness and have an effect on
latency.
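One way to see this: draw arrival times uniformly at random over a minute and count them in 10-second versus 100-millisecond buckets. The coarse buckets look flat while the fine ones swing widely. A sketch with a fixed seed (illustrative only, not cluster data):

```python
import random

# Uniform-random arrivals look smooth at coarse resolution but bursty at
# fine resolution. Illustrative only.
random.seed(42)
arrivals = sorted(random.uniform(0.0, 60.0) for _ in range(6000))  # ~100 req/s

def bucket_counts(times, width_s, total_s=60.0):
    counts = [0] * round(total_s / width_s)
    for t in times:
        counts[min(int(t / width_s), len(counts) - 1)] += 1
    return counts

coarse = bucket_counts(arrivals, 10.0)  # six 10 s buckets, ~1000 each
fine = bucket_counts(arrivals, 0.1)     # six hundred 100 ms buckets

print("10 s buckets:", coarse)
print("100 ms buckets: min", min(fine), "max", max(fine))
```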

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 Yep, the ReadStage is backlogging consistently - but the thing I am trying
 to explain is why it is good sometimes in an environment that is pretty well
 controlled - other than being on EC2

So pending is constantly > 0? What are the clients? Is it batch jobs
or something similar, where there is a feedback mechanism implicit in
that the higher latencies of the cluster are slowing down the clients,
thus reaching an equilibrium? Or are you just teetering on the edge,
dropping requests constantly?

Under typical live-traffic conditions, you never want to be running
with read stage pending backing up constantly. If on the other hand
these are batch jobs where throughput is the concern, it's not
relevant.

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 8:00 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

 What is your total data size (nodetool info/nodetool ring) per node,
 your heap size, and the amount of memory on the system?


2 Node cluster, 7.9GB of ram (ec2 m1.large)
RF=2
11GB per node
Quorum reads
122 million keys
heap size is 1867M (default from the AMI I am running)
I'm reading about 900k keys

As I was just going through cfstats, I noticed something I don't understand:

Key cache capacity: 906897
Key cache size: 906897

I set the key cache to 2 million; it's somehow got to a rather odd number






Re: keycache persisted to disk ?

2012-02-13 Thread Peter Schuller
 2 Node cluster, 7.9GB of ram (ec2 m1.large)
 RF=2
 11GB per node
 Quorum reads
 122 million keys
 heap size is 1867M (default from the AMI I am running)
 I'm reading about 900k keys

Ok, so basically a very significant portion of the data fits in page
cache, but not all.
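Rough arithmetic behind that observation, using the figures from this thread (7.9 GB RAM, 1867 MB heap, 11 GB of data per node); the OS overhead is an assumption:

```python
# Back-of-envelope: how much of the on-disk data can the page cache hold?
# RAM, heap, and data sizes are from the thread; OS overhead is assumed.

ram_gb = 7.9           # m1.large memory (from the thread)
heap_gb = 1867 / 1024  # Cassandra heap, 1867 MB (from the thread)
os_overhead_gb = 0.5   # assumed: kernel + other processes

data_gb = 11.0         # on-disk data per node (from the thread)

page_cache_gb = ram_gb - heap_gb - os_overhead_gb
fraction_cached = page_cache_gb / data_gb
print(f"~{page_cache_gb:.1f} GB of page cache -> "
      f"at most {fraction_cached:.0%} of the data can be cached")
```

So roughly half the data set can be warm at any moment, which is why which half is warm dominates whether a given query is fast or slow.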

 As I was just going through cfstats - I noticed something I don't understand

                 Key cache capacity: 906897
                 Key cache size: 906897

 I set the key cache to 2million, it's somehow got to a rather odd number

You're on 1.0+? Nowadays there is code to actively make caches
smaller if Cassandra detects that you seem to be running low on heap.
Watch cassandra.log for messages to that effect (I don't remember the
exact message right now).

-- 
/ Peter Schuller (@scode, http://worldmodscode.wordpress.com)


Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 8:09 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  the servers spending 50% of the time in io-wait

 Note that I/O wait is not necessarily a good indicator, depending on
 the situation. In particular, if you have multiple drives, I/O wait can
 mostly be ignored. Similarly, if you have non-trivial CPU usage in
 addition to disk I/O, it is also not a good indicator. I/O wait is
 essentially giving you the amount of time CPUs spend doing nothing
 because the only processes that would otherwise be runnable are
 waiting on disk I/O. But even a single process waiting on disk I/O can
 produce lots of I/O wait, even if you have 24 drives.


Yep - user-space CPU is 20% or much worse when the io-wait goes into the
90s - looks a great deal like an I/O bottleneck



 The per-disk % utilization is generally a much better indicator
 (assuming no hardware raid device, and assuming no SSD), along with
 the average queue size.


I doubt that figure is sensibly available in an EC2 instance





Re: keycache persisted to disk ?

2012-02-13 Thread Franc Carter
On Mon, Feb 13, 2012 at 8:15 PM, Peter Schuller peter.schul...@infidyne.com
 wrote:

  2 Node cluster, 7.9GB of ram (ec2 m1.large)
  RF=2
  11GB per node
  Quorum reads
  122 million keys
  heap size is 1867M (default from the AMI I am running)
  I'm reading about 900k keys

 Ok, so basically a very significant portion of the data fits in page
 cache, but not all.


yep



  As I was just going through cfstats - I noticed something I don't
 understand
 
  Key cache capacity: 906897
  Key cache size: 906897
 
  I set the key cache to 2million, it's somehow got to a rather odd number

 You're on 1.0 +?


yep, 1.0.7


 Nowadays there is code to actively make caches
 smaller if Cassandra detects that you seem to be running low on heap.
 Watch cassandra.log for messages to that effect (don't remember the
 exact message right now).


I just grep'd the logs and couldn't see anything that looked like that




Re: keycache persisted to disk ?

2012-02-12 Thread zhangcheng

I think the key caches and row caches are both persisted to disk on shutdown,
and restored from disk on restart, which improves performance.

2012-02-13 



zhangcheng 



From: Franc Carter 
Sent: 2012-02-13 13:53:56 
To: user 
Cc: 
Subject: keycache persisted to disk ? 
 

Hi,

I am testing Cassandra on Amazon and finding performance can vary fairly 
wildly. I'm leaning towards it being an artifact of the AWS I/O system but have 
one other possibility.

Are keycaches persisted to disk and restored on a clean shutdown and restart ?

cheers



Re: keycache persisted to disk ?

2012-02-12 Thread Franc Carter
On Mon, Feb 13, 2012 at 5:03 PM, zhangcheng zhangch...@jike.com wrote:

 I think the key caches and row caches are both persisted to disk on
 shutdown, and restored from disk on restart, which improves performance.


Thanks - that would explain at least some of what I am seeing

cheers


